Chapter 6
Parallel Integral Curves
David Pugmire
Oak Ridge National Laboratory
Tom Peterka
Argonne National Laboratory
Christoph Garth
University of Kaiserslautern
6.1 Introduction
6.2 Challenges to Parallelization
    6.2.1 Problem Classification
6.3 Approaches to Parallelization
    6.3.1 Test Data
    6.3.2 Parallelization Over Seed Points
    6.3.3 Parallelization Over Data
    6.3.4 A Hybrid Approach to Parallelization
    6.3.5 Algorithm Analysis
    6.3.6 Hybrid Data Structure and Communication Algorithm
6.4 Conclusion
References
Understanding vector fields resulting from large scientific simulations is an
important and often difficult task. Integral curves (ICs), curves that are tangential
to a vector field at each point, are a powerful visualization method in this context.
Applying integral curve-based visualization to very large vector field data presents
a significant challenge due to the non-local and data-dependent nature of IC
computation, and it requires careful balancing of the computational demands placed
on I/O, memory, communication, and processors. This chapter reviews several
different parallelization approaches, based on established parallelization paradigms
(across particles and across data blocks) as well as current advanced techniques for
achieving scalable parallel performance on very large data sets.
6.1 Introduction
The visualization of vector fields is a challenging area of scientific visualization.
For example, the analysis of fluid flow that governs natural phenomena
on all scales, from the smallest (e.g., Rayleigh–Taylor mixing of fluids) to
the largest (e.g., supernova explosions), relies on visualization to elucidate
the patterns exhibited by flows and the dynamical structures driving them
(see Figs. 6.1 and 6.2). Processes typically depicted by vector fields, such as
transport, circulation, and mixing, are predominantly non-local in nature. Due
to this specific property, methods and techniques that were developed and
have proven successful for scalar data visualization are not readily generalized
for the study of vector fields. Hence, while it is technically feasible to apply
such methods directly to derived scalar quantities, such as vector magnitude,
the resulting visualizations often fall short in explaining the mechanisms un-
derlying the scientific problem.
FIGURE 6.1: A stream surface visualizes recirculating flow in a vortex breakdown
bubble. The surface, computed from high-resolution adaptive, unstructured
flow simulation output, consists of several million triangles and was
rendered using an adaptive approach to transparency, designed to reveal
the flow structure inside the bubble. Two stripes highlight the different
trajectories taken by particles encountering the recirculation. Image courtesy
of Christoph Garth (University of Kaiserslautern), data courtesy of M. Rütten
(German Aerospace Center, Göttingen).
A large majority of approaches for the visualization and analysis of vector
fields are based on the study of integral curves (ICs). Naturally understood as
the trajectories of massless particles, such curves are ideal tools to study
transport, mixing, and other similar processes. These integration-based methods
were originally developed to reproduce physical flow visualization experiments
based on small, neutrally buoyant particles. Mathematically, an IC is tangential
to the vector field at every point along the curve; an individual curve is
determined by selecting an initial condition or seed point, the location from
which the curve begins, and an integration time over which the virtual particle
is traced.

FIGURE 6.2: Streamlines showing outflows in a magnetic flow field computed
by an Active Galactic Nuclei astrophysics simulation (FLASH code). Image
courtesy of David Pugmire (ORNL), data courtesy of Paul Sutter (UIUC).
ICs are applicable in many different settings. For example, the direct vi-
sualization of particles, and particle families and their trajectories, gives rise
to such visualization primitives as streamlines, pathlines, streak lines, or inte-
gral surfaces (e.g., see Krishnan et al. [6]). To specifically study transport and
mixing, ICs also offer an interesting change of perspective. Instead of considering
the so-called Eulerian perspective, which describes the evolution of quantities
at fixed locations in space, the Lagrangian view examines the evolution from
the point of view of an observer attached to a particle moving with the vector
field, offering a natural and intuitive description of transport and mixing pro-
cesses. Consider, for example, the case of combustion: fuel burns while it is
advected by a surrounding flow; hence, the burning process and its governing
equations are primarily Lagrangian in nature.
Further, a family of Lagrangian methods concentrates on deriving structural
analysis from this Lagrangian perspective. For example, Finite-Time Lyapunov
Exponents (FTLE) empirically determine exponential separation rates among
neighboring particles from a dense set of ICs covering a domain of interest
[7, 5]. From ridges in the FTLE field, that is, locally maximal lines and surfaces,
one can identify so-called Lagrangian coherent structures (LCS), which approximate
hyperbolic material lines and represent the dynamic skeleton that drives a vector
field's structure and evolution. For an overview of modern flow visualization
techniques, see McLoughlin et al. [8].
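For reference, a standard formulation (the notation below is one common choice
and may differ in detail from the variants used in the studies cited above)
defines the FTLE at a point x in terms of the flow map \varphi, which maps x to
the endpoint of the IC seeded at x and integrated over a time interval of
length T:

    \Delta(x) = \left( \frac{d\varphi(x)}{dx} \right)^{\top} \frac{d\varphi(x)}{dx},
    \qquad
    \sigma_T(x) = \frac{1}{|T|} \, \ln \sqrt{ \lambda_{\max}\left( \Delta(x) \right) },

where \Delta(x) is the Cauchy–Green deformation tensor and \lambda_{\max} its
largest eigenvalue. In practice, the flow map gradient is approximated by finite
differences between the endpoints of neighboring ICs, which is why a dense set of
curves covering the domain of interest is required.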
Computationally, ICs are approximated using numerical integration
schemes (cf. [4]). These schemes construct a curve in successive pieces: starting
from the seed point, the vector field is sampled in the vicinity of the current
integration point, and an additional curve segment is determined and appended
to the existing curve. This is repeated until the curve has reached its
desired length or leaves the vector field domain. Propagating an IC through
a region of a given vector field requires access to the corresponding source data.
If the source data can remain in main memory, the computation is much faster than
when the data must be paged into main memory from a secondary location.
In this setting, the non-local nature of particle advection implies that the path
taken by a particle largely determines which blocks of data must be loaded.
This information is a priori unknown and depends on the vector field itself.
Thus, general parallelization of IC computation is a difficult problem for large
data sets on distributed-memory architectures.
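As a concrete illustration, the C++ sketch below shows the core of such an
integration loop for a steady vector field, using a classical fourth-order
Runge–Kutta (RK4) scheme. The field sampler, step size, and termination criterion
are placeholders and are not taken from any particular system: a production
integrator would typically add adaptive step-size control and error estimation,
time-dependent data would add the integration time as a sampler argument, and the
sampler hides whatever cell location and interpolation the underlying mesh requires.

#include <array>
#include <functional>
#include <vector>

using Vec3 = std::array<double, 3>;

// Hypothetical sampler: interpolates the vector field at p into v and
// returns false if p lies outside the locally available data.
using FieldSampler = std::function<bool(const Vec3& p, Vec3& v)>;

// Returns a*x + y componentwise.
static Vec3 axpy(double a, const Vec3& x, const Vec3& y)
{
    return { a * x[0] + y[0], a * x[1] + y[1], a * x[2] + y[2] };
}

// One classical RK4 step of size h starting at p; returns false if any
// sample point falls outside the field domain.
bool rk4Step(const FieldSampler& field, const Vec3& p, double h, Vec3& out)
{
    Vec3 k1, k2, k3, k4;
    if (!field(p, k1)) return false;
    if (!field(axpy(0.5 * h, k1, p), k2)) return false;
    if (!field(axpy(0.5 * h, k2, p), k3)) return false;
    if (!field(axpy(h, k3, p), k4)) return false;
    for (int i = 0; i < 3; ++i)
        out[i] = p[i] + h / 6.0 * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]);
    return true;
}

// Trace an integral curve from 'seed' until it has taken 'maxSteps' pieces
// (a stand-in for the desired curve length) or leaves the field domain.
std::vector<Vec3> traceCurve(const FieldSampler& field, const Vec3& seed,
                             double h, int maxSteps)
{
    std::vector<Vec3> curve{ seed };
    Vec3 p = seed, next;
    for (int s = 0; s < maxSteps && rk4Step(field, p, h, next); ++s)
    {
        curve.push_back(next);
        p = next;
    }
    return curve;
}

The sampler is where the data dependence described above enters: each call either
reads from blocks already resident in memory or forces a block to be loaded (or the
particle to be handed to another processor), which is precisely the choice that the
parallelization strategies in 6.3 make differently.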
6.2 Challenges to Parallelization
This chapter aims to both qualify and quantify the performance of three
different parallelization strategies for IC computation. Before discussing
particular parallelization algorithms in 6.3, this section presents the
challenges specific to parallelizing IC computation on distributed-memory
systems.
6.2.1 Problem Classification
The parallel IC problem is complex and challenging. Designing an experimental
methodology that provides robust coverage of the different aspects of algorithmic
behavior, some of which are data set dependent, requires taking the following
factors into account; they influence both the choice of parallelization strategy
and the design of performance tests.
Data Set Size. If the data set is small, in the sense that it fits entirely into
the memory footprint of each task, then it makes the most sense to distribute
the IC computation workload. On the other hand, if the data set is larger than
what will fit in each task's memory, then more complex approaches are required
that involve data distribution and possibly computational distribution as well.
Seed Set Size. If the problem at hand requires only the computation of
a thousand streamlines, parallel computation is secondary to optimal data I/O
and distribution. In this case, the corresponding seed set is
small. Such small seed sets are typically encountered in interactive exploration
scenarios, where relatively few ICs are interactively seeded by a user. A large
seed set may consist of many thousands of seed points. Here, it is desirable
to distribute the IC computational workload, yet the data distribution
scheme must still allow for efficient parallel IC computation.
Seed Set Distribution. When seed points are located densely within the spatial
and temporal domain of the vector field, it is likely that the ICs will traverse
a relatively small amount of the overall data. On the other hand, for some
applications, such as streamline statistics, a sparse seed point set covers the
entire vector field evenly. This distribution results in ICs traversing the
entire data set. Hence, the seed set distribution is a significant factor in
determining whether performance stands to gain the most from parallel
computation, data distribution, or both.
Vector Field Complexity. Depending on the choice of seed points, the structure
of a vector field can have a strong influence on which parts of the data need
to be taken into account in the IC computation process. Critical points or
invariant manifolds of a strongly attracting nature draw streamlines towards
them; ICs seeded in or traversing their vicinity remain closely localized.
In the opposite case, a nearly uniform vector field requires ICs to pass through
large parts of the data. This dependency of IC computation on the underlying
vector field is difficult to anticipate and hard to identify without conducting
a prior analysis of the field structure, as is done in Yu et al.'s study [13],
for example. While such analysis can be useful for specific problems, this
chapter considers a more general setting in which the burden and cost of such
pre-processing are not assumed.
6.3 Approaches to Parallelization
IC calculation places demands on almost all components of a computa-
tional system, including memory, processing, communication, and I/O. Be-
cause of the complex nature of vector fields, seeding scenarios, and the types
of analyses, as outlined in 6.2, there is no single scalable algorithm suitable
for all situations.
Generally, algorithms can be parallelized in two primary ways. The first
way is parallelization across the seed points, where seeds are assigned to pro-
cessors, and data blocks are loaded as needed. The second way is paralleliza-
tion across the data blocks, where processors are assigned a set of data blocks,
and particles are communicated to the processor that owns the data block. In
choosing between these two axes of parallelization, the different assignments
of particles and data blocks to processors will place different demands on the
computing, memory, communication, and I/O subsystems of a cluster.
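To make the first strategy concrete, the following C++ sketch shows one possible
structure for parallelization over seed points, under simple assumptions: a static
round-robin assignment of seeds to tasks and a small first-in-first-out cache of
data blocks loaded on demand. The callbacks (blockContaining, loadBlock,
advectInBlock) are hypothetical stand-ins for a particular domain decomposition,
I/O layer, and integrator; they are not part of any specific system described in
this chapter.

#include <cstddef>
#include <deque>
#include <functional>
#include <map>
#include <memory>
#include <vector>

struct Vec3 { double x, y, z; };
struct Block { /* vector field samples for one spatial block */ };

// Callbacks assumed to be supplied by a domain decomposition and I/O layer.
struct SeedParallelCallbacks
{
    std::function<int(const Vec3&)> blockContaining;         // block id, or -1 if outside
    std::function<std::shared_ptr<Block>(int)> loadBlock;    // read one block from storage
    std::function<bool(Vec3&, const Block&)> advectInBlock;  // advance p until it exits the
                                                             // block; false once the curve ends
};

// Parallelization over seed points: each task owns a fixed subset of the
// seeds (round-robin here) and loads whatever blocks its particles enter.
void traceOwnedSeeds(const std::vector<Vec3>& allSeeds, int rank, int numTasks,
                     const SeedParallelCallbacks& cb, std::size_t cacheCapacity)
{
    std::map<int, std::shared_ptr<Block>> cache;  // blocks currently resident on this task
    std::deque<int> loadOrder;                    // first-in-first-out eviction order

    for (std::size_t i = rank; i < allSeeds.size(); i += numTasks)
    {
        Vec3 p = allSeeds[i];
        for (;;)
        {
            int id = cb.blockContaining(p);
            if (id < 0) break;                    // particle left the domain
            auto it = cache.find(id);
            if (it == cache.end())                // block not resident: load it on demand
            {
                if (cache.size() >= cacheCapacity && !loadOrder.empty())
                {
                    cache.erase(loadOrder.front());
                    loadOrder.pop_front();
                }
                it = cache.emplace(id, cb.loadBlock(id)).first;
                loadOrder.push_back(id);
            }
            if (!cb.advectInBlock(p, *it->second)) // integrate inside the resident block
                break;                             // curve reached its termination criterion
        }
    }
}

In this scheme every task proceeds independently and no communication is needed,
but tasks whose particles wander through many blocks pay for the corresponding
I/O; this is one source of the imbalance discussed next.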
These two classes of algorithms have a tendency to perform poorly due to
workload imbalance, caused either by an uneven particle-to-processor assignment
or by loading too many data blocks, which makes the computation I/O bound.
Recent research blending these two approaches shows promising results.
The subsequent sections outline algorithms that parallelize with respect to
particles (6.3.2) and data blocks (6.3.3), as well as a hybrid approach (6.3.4),
and discuss each algorithm's performance characteristics. These strategies aim
to keep a balanced workload across the set of processors and to design efficient
data structures for handling IC computation and communication.
In the approaches outlined below, the problem mesh is decomposed into a number
of spatially disjoint data blocks.
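Building on this block decomposition, the second strategy can be sketched in a
corresponding way. In the C++ fragment below, blocks stay resident on their owning
rank and particles are exchanged in bulk-synchronous rounds using MPI. The callbacks
and the simple all-to-all exchange are illustrative assumptions only; a real
implementation would typically use asynchronous messaging and more careful
termination detection.

#include <mpi.h>
#include <functional>
#include <vector>

struct Particle { double pos[3]; long id; };  // plain-old-data so it can be shipped as bytes

// Callbacks assumed to be supplied by the application's data layer; recording the
// computed curve geometry is left to them.
struct DataParallelCallbacks
{
    std::function<bool(Particle&)> advectInOwnedBlocks;  // advance p through resident blocks;
                                                         // false once the curve terminates
    std::function<int(const Particle&)> nextOwner;       // rank owning p's next block, -1 if none
};

// Parallelization over data: blocks stay resident on their owning rank and
// particles are exchanged each round. 'active' initially holds the particles
// seeded inside this rank's blocks.
void traceDataParallel(std::vector<Particle> active,
                       const DataParallelCallbacks& cb, MPI_Comm comm)
{
    int numRanks = 0;
    MPI_Comm_size(comm, &numRanks);

    MPI_Datatype ptype;  // treat a Particle as an opaque, contiguous byte sequence
    MPI_Type_contiguous(static_cast<int>(sizeof(Particle)), MPI_BYTE, &ptype);
    MPI_Type_commit(&ptype);

    for (;;)
    {
        // Advect local particles and bin the survivors by destination rank.
        std::vector<std::vector<Particle>> bins(numRanks);
        for (Particle& p : active)
        {
            if (!cb.advectInOwnedBlocks(p)) continue;  // curve finished or left the domain
            int dest = cb.nextOwner(p);
            if (dest >= 0) bins[dest].push_back(p);
        }

        // Exchange per-destination counts, then the particles themselves.
        std::vector<int> sendCounts(numRanks), recvCounts(numRanks);
        for (int r = 0; r < numRanks; ++r) sendCounts[r] = static_cast<int>(bins[r].size());
        MPI_Alltoall(sendCounts.data(), 1, MPI_INT, recvCounts.data(), 1, MPI_INT, comm);

        std::vector<int> sendDispls(numRanks, 0), recvDispls(numRanks, 0);
        for (int r = 1; r < numRanks; ++r)
        {
            sendDispls[r] = sendDispls[r - 1] + sendCounts[r - 1];
            recvDispls[r] = recvDispls[r - 1] + recvCounts[r - 1];
        }
        std::vector<Particle> sendBuf, recvBuf(recvDispls.back() + recvCounts.back());
        for (int r = 0; r < numRanks; ++r)
            sendBuf.insert(sendBuf.end(), bins[r].begin(), bins[r].end());
        MPI_Alltoallv(sendBuf.data(), sendCounts.data(), sendDispls.data(), ptype,
                      recvBuf.data(), recvCounts.data(), recvDispls.data(), ptype, comm);
        active = std::move(recvBuf);

        // Stop once no particles remain on any rank.
        long localLeft = static_cast<long>(active.size()), globalLeft = 0;
        MPI_Allreduce(&localLeft, &globalLeft, 1, MPI_LONG, MPI_SUM, comm);
        if (globalLeft == 0) break;
    }
    MPI_Type_free(&ptype);
}

In such a scheme each block need only be read once, but ranks owning blocks that
few curves visit can sit idle, which motivates the hybrid approach discussed in
6.3.4.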