Query-Driven Visualization and Analysis 137












FIGURE 7.7: QDV of large 3D plasma-based particle acceleration data
containing 90 × 10^6 particles per timestep. On the left (a), parallel coordinates
of timestep t = 12 showing: (1) all particles above the base acceleration level
of the plasma wave (Query: px > 2 × 10^9) (gray) and (2) a set of particles
that form a compact beam in the first wake period following the laser pulse
(Query: (px > 4.856 × 10^10) AND (x > 5.649 × 10^4)) (red). On the right (b),
volume rendering of the plasma density illustrating the 3D structure of the
plasma wave. The selected beam particles are shown in addition in red. Image
source: Rübel et al., 2008 [27].

often results in the selection of multiple beam-like features trapped in different
periods of the plasma wave.
As illustrated in Figure 7.7a (red lines), to separate the different particle
beams and to extract the main particle beam, the selection is then often
further refined through range queries in the longitudinal coordinate x and
transverse coordinates y and z. In the selection process, parallel coordinates
provide interactive feedback about the structure of the selection, allowing for
fast identification of outliers and different substructures of the selection. High
performance scientific visualization methods are then used to validate and
further analyze the selected particles (Figure 7.7b).
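
The refinement described above (a momentum threshold followed by range queries in x, y, and z) can be illustrated with a minimal sketch. The `Particle` record, the `select` helper, and all numeric thresholds are hypothetical and chosen only for illustration; a production system would evaluate such queries through an index such as FastBit rather than a linear scan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Particle:
    pid: int    # unique particle identifier
    x: float    # longitudinal position
    y: float    # transverse position
    z: float    # transverse position
    px: float   # longitudinal momentum


def select(particles, **ranges):
    """Return particles whose attributes lie strictly inside every (lo, hi) range.

    A bound of None leaves that side of the range open.
    """
    def inside(p):
        for name, (lo, hi) in ranges.items():
            value = getattr(p, name)
            if lo is not None and value <= lo:
                return False
            if hi is not None and value >= hi:
                return False
        return True
    return [p for p in particles if inside(p)]


# Tiny synthetic data set (values are illustrative, not from the simulation).
particles = [
    Particle(0, x=1.0, y=0.0, z=0.0, px=1.0e9),    # below the threshold
    Particle(1, x=2.0, y=0.1, z=0.0, px=5.0e10),   # beam in a trailing period
    Particle(2, x=6.0, y=0.0, z=0.1, px=6.0e10),   # main beam
    Particle(3, x=6.1, y=3.0, z=0.0, px=5.5e10),   # transverse outlier
]

# Step 1: a momentum threshold selects every beam-like feature.
candidates = select(particles, px=(4.0e10, None))
# Step 2: range queries in x, y, and z isolate the main beam.
beam = select(candidates, x=(5.0, None), y=(-1.0, 1.0), z=(-1.0, 1.0))
beam_ids = [p.pid for p in beam]
```

Keeping the two steps separate mirrors the interactive workflow: the intermediate `candidates` set is what a user would inspect in parallel coordinates before narrowing the spatial ranges.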
The next step is to trace, or follow, and analyze the high-energy particles
through time. Particle tracing is used to detect the point in time when the
beam reaches its peak energy and to assess the quality of the beam. Tracing
the beam particles further back in time, to the point at which the particles
enter the simulation window, supports analysis of the injection, beam forma-
tion, and beam evolution processes. Figure 7.3c shows an example in which
parallel coordinates are used to compare the acceleration behavior of two
particle beams. Based on the information from different individual timesteps,
a user may refine a query to select, for example, beam substructures that
are visible at different discrete timesteps. Rübel et al. [27] demonstrated that,
using FastBit-accelerated equality queries, the time needed for tracing beam
particles can be reduced from many hours to less than a second.
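
As a rough illustration of ID-based tracing, the sketch below looks up a fixed set of particle identifiers in every timestep and then locates the timestep of peak mean momentum. It assumes each timestep is a list of records with hypothetical `id` and `px` fields, and it uses a plain Python dictionary as a stand-in for an identifier index; it is not the FastBit API.

```python
def build_id_index(timestep):
    """Map particle ID -> record for one timestep (built once, reused per query)."""
    return {rec["id"]: rec for rec in timestep}


def trace(id_indexes, beam_ids):
    """Return, per timestep, the records of the beam particles present in it."""
    return [
        [index[pid] for pid in beam_ids if pid in index]
        for index in id_indexes
    ]


def peak_timestep(traces):
    """Index of the timestep with the highest mean px over the traced particles."""
    def mean_px(records):
        if not records:
            return float("-inf")
        return sum(r["px"] for r in records) / len(records)
    return max(range(len(traces)), key=lambda t: mean_px(traces[t]))


# Three tiny synthetic timesteps; particle 7 enters the window at t = 1.
timesteps = [
    [{"id": 3, "px": 1.0e9}, {"id": 5, "px": 2.0e9}],
    [{"id": 3, "px": 4.0e10}, {"id": 5, "px": 1.0e9}, {"id": 7, "px": 3.0e10}],
    [{"id": 3, "px": 3.0e10}, {"id": 7, "px": 2.0e10}],
]
indexes = [build_id_index(ts) for ts in timesteps]
beam_ids = [3, 7]
traces = trace(indexes, beam_ids)

# First timestep in which each beam particle appears (its entry into the window).
first_seen = {
    pid: next(t for t, index in enumerate(indexes) if pid in index)
    for pid in beam_ids
}
peak = peak_timestep(traces)
```

The point of the index is that each equality query costs a lookup rather than a scan over all particles of a timestep, which is where the reported reduction in tracing time comes from.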
138 High Performance Visualization

FIGURE 7.8: Visualization of the relative traces of a particle beam in a laser
plasma accelerator. The traces show the motion of the beam particles relative
to the laser pulse. The xy-plane shows isocontours of the particle density at
the timepoint when the beam reaches its peak energy, illustrating the location
of the beam within the plasma wave. The positive vertical axis and color of
the particle traces show the particle momentum in the transverse direction,
py. The image shows particles being injected from the sides and oscillating
while being accelerated. Image source: Rübel et al., 2009 [26].

Automatic Query-Based Beam Detection: Interactive query-driven visual
data exploration is effective in that it provides great flexibility and supports
detailed data analysis, allowing scientists to define and validate new hypotheses
about the data and the physical phenomena being modeled. However, manual
detection and extraction of the acceleration features of interest, such as
particle beams, is time consuming. To support the analysis of large collections of
accelerator simulations, efficient methods are needed for automatic detection
and extraction of particle beams.
Rübel et al. [26] described an efficient query-based algorithm for automatic
detection of particle beams. At a single timestep, the analysis proceeds as
follows. First, the algorithm computes a 3D conditional histogram of (x, y, px)
space, restricted to high-energy particles selected by the query px > 1e10.
Using a two-stage segmentation process, the algorithm first identifies the
region in physical space (x, y) containing the highest-energy particles. The
so-defined spatial selection is then refined using a 3D region-growing-based
segmentation centered around the highest-density histogram bin.
Histogram-based bin queries (see Section 7.2.1) are then used to extract the
particles belonging to the so-identified feature. The identified per-timestep
features are then traced over time and refined through additional per-timestep
segmentations.
Finally, the particles of interest are traced over time using a series of
ID-based equality queries, and additional particle-to-feature distance fields
are computed. In this query-based analysis process, the computation of con-
ditional histograms, the evaluation of histogram-based bin queries, and the
identifier-based particle tracing are accelerated through FastBit.
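
The single-timestep detection stage described above (conditional histogram, segmentation, bin queries) can be sketched as follows. This is a simplified illustration, not the algorithm of [26]: it collapses the two-stage segmentation into one region-growing pass seeded at the densest bin, and the bin widths, the `min_count` threshold, and all function names are assumptions made for the example.

```python
from collections import Counter, deque


def bin_of(p, width):
    """3D bin coordinates of a particle in (x, y, px) space."""
    return tuple(int(p[k] // width[k]) for k in ("x", "y", "px"))


def conditional_histogram(particles, width, px_min):
    """Histogram of (x, y, px), restricted to particles with px > px_min."""
    return Counter(bin_of(p, width) for p in particles if p["px"] > px_min)


def grow_region(hist, seed, min_count):
    """Collect bins 6-connected to `seed` whose count is at least `min_count`."""
    region, queue = {seed}, deque([seed])
    while queue:
        i, j, k = queue.popleft()
        for n in ((i + 1, j, k), (i - 1, j, k), (i, j + 1, k),
                  (i, j - 1, k), (i, j, k + 1), (i, j, k - 1)):
            if n not in region and hist.get(n, 0) >= min_count:
                region.add(n)
                queue.append(n)
    return region


def bin_query(particles, width, px_min, region):
    """Sorted IDs of selected particles that fall into any bin of `region`."""
    return sorted(p["id"] for p in particles
                  if p["px"] > px_min and bin_of(p, width) in region)


# Synthetic particles (illustrative values only).
particles = [
    {"id": 0, "x": 0.5, "y": 0.5, "px": 5.0e9},    # fails px > 1e10
    {"id": 1, "x": 5.2, "y": 0.3, "px": 4.2e10},
    {"id": 2, "x": 5.6, "y": 0.4, "px": 4.5e10},
    {"id": 3, "x": 5.8, "y": 0.7, "px": 4.9e10},
    {"id": 4, "x": 6.1, "y": 0.2, "px": 4.4e10},
    {"id": 5, "x": 6.3, "y": 0.6, "px": 4.6e10},
    {"id": 6, "x": 1.5, "y": 0.5, "px": 2.5e10},   # isolated, lower energy
    {"id": 7, "x": 5.5, "y": 2.5, "px": 4.5e10},   # isolated in y
]
width = {"x": 1.0, "y": 1.0, "px": 1.0e10}

hist = conditional_histogram(particles, width, px_min=1.0e10)
seed = hist.most_common(1)[0][0]            # highest-density bin
region = grow_region(hist, seed, min_count=2)
beam_ids = bin_query(particles, width, px_min=1.0e10, region=region)
```

The resulting `beam_ids` would then feed the ID-based equality queries used to trace the feature over time.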
Even for a large 3D data set (approximately 620 GB) consisting of 26
timesteps with approximately 230 × 10^6 particles per timestep, the analysis
requires only about 3 minutes in serial, which is already much less than
what a single histogram-based bin query would require without using
FastBit. As shown in Figure 7.2b, the analysis has to evaluate tens of bin
queries as well as 52 threshold and equality queries. Combining the query-based
beam analysis with advanced visualization supports automated analysis and
comparison of particle beams in complex LPA simulations (see Fig. 7.4c).
Figure 7.8 shows, as an example, a visualization of the injection and
acceleration process of an automatically detected particle beam.
7.5 Conclusion
Query-driven visualization and analysis is a blend of technologies that limit
visualization and analysis processing to the subset of data that is “interesting.”
The premise is that, in any given investigation, only a small fraction of a large
data set is of interest.
QDV is useful for fast visual exploration of extremely large data sets.
It has been shown to be useful to accelerate and automate complex feature
detection tasks, in particular, when combined with other analysis methods,
such as unsupervised learning. As a general concept, QDV is very flexible
and can be easily integrated with a large range of other methods. Advanced
index/query systems (e.g., FastQuery) and the integration of these data
interfaces with advanced visualization systems (such as VisIt, Chap. 16) make
QDV-based analysis much more accessible to the scientific community.
QDV is among the small subset of techniques that can address visualization
and analysis of large and highly complex data. The case studies in this chapter,
taken from forensic cybersecurity and high-energy physics applications, make
use of supercomputing platforms to perform QDV and analysis on data sets
of unprecedented sizes. Other recent work in QDV, not discussed here, applies
the concepts to diverse application areas, like computational finance, climate
modeling and analysis, magnetically confined fusion, and astrophysics. One
theme across all of these studies is that QDV reduces visualization and analysis
processing time from hours or days to seconds or minutes.
References
[1] G. Antoshenkov. Byte-aligned Bitmap Compression. Technical report,
Oracle Corp., 1994. U.S. Patent number 5,363,098.
[2] Rudolf Bayer and Edward McCreight. Organization and Maintenance of
Large Ordered Indexes. Acta Informatica, 3:173–189, 1972.
[3] E. Wes Bethel, Scott Campbell, Eli Dart, Kurt Stockinger, and Kesheng
Wu. Accelerating Network Traffic Analysis Using Query-Driven Visual-
ization. In Proceedings of 2006 IEEE Symposium on Visual Analytics
Science and Technology, pages 115–122. IEEE Computer Society Press,
October 2006. LBNL-59891.
[4] Peter A. Boncz, Marcin Zukowski, and Niels Nes. MonetDB/X100:
Hyper-Pipelining Query Execution. In Proceedings of the Biennial Con-
ference on Innovative Data Systems Research (CIDR), pages 225–237,
January 2005.
[5] Peer-Timo Bremer, Gunther H. Weber, Valerio Pascucci, Marcus S. Day,
and John B. Bell. Analyzing and Tracking Burning Structures in Lean
Premixed Hydrogen Flames. IEEE Transactions on Visualization and
Computer Graphics, 16(2):248–260, Mar/Apr 2010. LBNL-2276E.
[6] Paul G. Brown. Overview of SciDB: Large Scale Array Storage, Processing
and Analysis. In Proceedings of the 2010 ACM SIGMOD International
Conference on Management of Data, SIGMOD ’10, pages 963–968, Indi-
anapolis, IN, USA, 2010.
[7] C. Y. Chan and Y. E. Ioannidis. Bitmap Index Design and Evaluation.
In Proceedings of the 1998 ACM SIGMOD International Conference on
Management of Data, SIGMOD ’98, pages 355–366, New York, NY, USA,
June 1998. ACM.
[8] C. Y. Chan and Y. E. Ioannidis. An Efficient Bitmap Encoding Scheme
for Selection Queries. In SIGMOD, Philadelphia, PA, USA, June 1999.
ACM Press.
[9] Jerry Chou, Mark Howison, Brian Austin, Kesheng Wu, Ji Qiang, E. Wes
Bethel, Arie Shoshani, Oliver Rübel, Prabhat, and Rob D. Ryne. Parallel
Index and Query for Large Scale Data Analysis. In Proceedings of
2011 International Conference for High Performance Computing, Net-
working, Storage and Analysis, SC ’11, pages 30:1–30:11, Seattle, WA,
USA, November 2011.
[10] Douglas Comer. The Ubiquitous B-Tree. Computing Surveys, 11(2):121–
137, 1979.
[11] J. N. Corlett, K. M. Baptiste, J. M. Byrd, P. Denes, R. J. Donahue, L. R.
Doolittle, R. W. Falcone, D. Filippetto, J. Kirz, D. Li, H. A. Padmore,
C. F. Papadopoulos, G. C. Pappas, G. Penn, M. Placidi, S. Prestemon,
J. Qiang, A. Ratti, M. W. Reinsch, F. Sannibale, D. Schlueter, R. W.
Schoenlein, J. W. Staples, T. Vecchione, M. Venturini, R. P. Wells, R. B.
Wilcox, J. S. Wurtele, A. E. Charman, E. Kur, and A. Zholents. A Next
Generation Light Source Facility at LBNL. In Proceedings of PAC 2011,
New York, NY, USA, April 2011.
[12] E. Esarey, C. B. Schroeder, and W. P. Leemans. Physics of Laser-
Driven Plasma-Based Electron Accelerators. Reviews of Modern Physics,
81:1229–1285, 2009.
[13] V. Gaede and O. Günther. Multidimensional Access Methods. ACM
Computing Surveys, 30(2):170–231, 1998.
[14] C. G. R. Geddes, Cs. Toth, J. van Tilborg, E. Esarey, C. B. Schroeder,
D. Bruhwiler, C. Nieter, J. Cary, and W. P. Leemans. High-Quality Elec-
tron Beams from a Laser Wakefield Accelerator Using Plasma-Channel
Guiding. Nature, 431:538–541, 2004. LBNL-55732.
[15] Luke Gosink, John C. Anderson, E. Wes Bethel, and Kenneth I. Joy.
Variable Interactions in Query-Driven Visualization. IEEE Transactions
on Visualization and Computer Graphics (Proceedings of Visualization
2007), 13(6):1400–1407, November/December 2007. LBNL-63524.
[16] Luke Gosink, John Shalf, Kurt Stockinger, Kesheng Wu, and E. Wes
Bethel. HDF5-FastQuery: Accelerating Complex Queries on HDF
Datasets using Fast Bitmap Indices. In Proceedings of the 18th Inter-
national Conference on Scientific and Statistical Database Management,
pages 149–158. IEEE Computer Society Press, July 2006. LBNL-59602.
[17] Luke J. Gosink, Christoph Garth, John C. Anderson, E. Wes Bethel, and
Kenneth I. Joy. An Application of Multivariate Statistical Analysis for
Query-Driven Visualization. IEEE Transactions on Visualization and
Computer Graphics, 17(3):264–275, 2011. LBNL-3536E.
[18] Alfred Inselberg. Parallel Coordinates: Visual Multidimensional Geometry
and Its Applications. Springer-Verlag, Secaucus, NJ, USA, 2008.
[19] W. P. Leemans, B. Nagler, A. J. Gonsalves, Cs. Toth, K. Nakamura,
C. G. R. Geddes, E. Esarey, C. B. Schroeder, and S. M. Hooker. GeV
Electron Beams from a Centimetre-Scale Accelerator. Nature Physics,
2:696–699, 2006.
[20] C. Nieter and J. R. Cary. VORPAL: A versatile plasma simulation code.
Journal of Computational Physics, 196(2):448–473, 2004.