bin, and rendering order takes into account region density so that important
regions are not hidden by occlusion. Different colors could be assigned to
different timesteps, producing a temporal parallel coordinates plot (see Fig. 7.3c).
This approach is useful in helping to reveal multivariate trends and outliers
across large, time-varying data sets.
7.3.2 Segmenting Query Results
A query describes a binary classification of the data, based on whether
a record satisfies the query condition(s). Typically, however, the features
of interest to end users are not individual data records but regions of
space defined by connected components of the query result, such as ignition
kernels or flame fronts. Besides such physical components, a feature of interest
may also be defined by groups of records (clusters) in high-dimensional variable
space. Combining QDV with methods for the segmentation and classification
of query results can facilitate the analysis of large data sets by supporting
the identification of subfeatures of a query, by suggesting strategies for the
refinement of queries, or by automating the definition of complex queries for
automatic feature detection.
A common approach for enhancing the QDV analysis process consists of
identifying spatially connected components in the query results. This has been
accomplished in the past using a technique known as connected component
labeling [35, 39]. Information from connected component labels provides the
means to enhance the visualization—as shown in Figure 7.4b—and enables
further quantitative analysis. For example, statistical analysis of the number
of physical components in a query result, and of the distribution of their sizes
or volumes, can provide valuable information about the state and evolution of
dynamic physical processes, such as a flame [5].
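The labeling step described above can be illustrated with a minimal sketch. It is not the algorithm of [35, 39]; it is a simple breadth-first flood fill, assuming a small 2D regular mesh, 4-connectivity, and a made-up temperature field with a hypothetical query threshold of 1000. The names label_components, temperature, and mask are introduced here for illustration only.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected components of True cells in a 2D boolean grid.

    Returns (labels, sizes): labels holds one int per cell (0 = not selected
    by the query); sizes maps each component label to its cell count.
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    sizes = {}
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or labels[r][c]:
                continue  # cell not selected, or already labeled
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            count = 0
            while queue:  # breadth-first flood fill of one component
                y, x = queue.popleft()
                count += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            sizes[next_label] = count
    return labels, sizes


# Hypothetical temperature field on a regular 2D mesh (values are made up).
temperature = [
    [300, 310, 1500, 1600, 305],
    [302, 308, 1550, 1580, 301],
    [1700, 300, 299, 300, 1800],
    [1650, 1620, 300, 300, 1900],
]

# The query "temperature > 1000" classifies each cell as selected or not.
mask = [[t > 1000 for t in row] for row in temperature]
labels, sizes = label_components(mask)

# Simple per-query statistics of the kind mentioned in the text.
component_count = len(sizes)
mean_size = sum(sizes.values()) / component_count
```

For this input the query selects nine cells that fall into three components of four, three, and two cells. Tracking the component count and size distribution per timestep is one way such labels could feed the quantitative analysis mentioned above.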
Connected component analysis for QDV has a wide range of applications.
Stockinger et al. [31] applied this approach to combustion simulation data. In
this context, a researcher might be interested in finding ignition kernels or
regions of extinction. When studying the stability of magnetic confinement
for fusion, by contrast, a researcher might be interested in regions with
high electric potential because of their association with zonal flows, which are
critical to the stability of the magnetic confinement, as shown in Figure 7.4a [41].
In practice, connected component analysis is mainly useful for data with
known topology, in particular, data defined on regular meshes. For scattered
data, such as particle data, where no connectivity is given, density-based
clustering approaches may provide a useful alternative for grouping selected
particles based on their spatial distribution. One of the main limitations of
connected component analysis, in the context of QDV, is that the labeling
of components by itself does not yield any direct feedback about possible
strategies for the refinement of data queries.
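For scattered data, the density-based grouping mentioned above could look like the following minimal sketch. The text does not name a specific method; DBSCAN is used here as one well-known density-based clustering algorithm, written with a brute-force neighbor search that is only suitable for small selections. The particle positions and the parameters eps and min_pts are made-up assumptions.

```python
import math


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point; -1 marks noise.

    A point is a core point if at least min_pts points (itself included)
    lie within distance eps of it.
    """
    def neighbors(i):
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)  # None = not yet visited
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue  # already assigned; do not expand again
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core point: keep expanding
    return labels


# Hypothetical 2D positions of particles selected by a query.
particles = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),  # dense group
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),              # second dense group
    (9.0, 0.0),                                      # isolated particle
]
cluster_labels = dbscan(particles, eps=0.2, min_pts=3)
```

With these assumed parameters the eight particles fall into two clusters and one noise point. For large particle selections, a spatial index would normally replace the quadratic neighbor search.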
To address this problem, Gosink et al. [17] proposed the use of multivariate
statistics to support the exploration of the solution space of data queries. To