that the ICs will traverse a relatively small amount of the overall data. On
the other hand, for some applications, like streamline statistics, a sparse seed
point set covers the entire vector field evenly. This distribution results in ICs
traversing the entire data set. Hence, the seed set distribution is a significant factor in determining whether performance stands to gain the most from parallel computation, data distribution, or both.
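To make the two seeding scenarios concrete, the following minimal sketch (hypothetical; the `Vec3` type and `boxSeeds` helper are illustrative, not from this chapter) generates both a dense seed set confined to a small box around a feature of interest and a sparse set spread uniformly over the whole domain:

```cpp
#include <cstddef>
#include <cstdio>
#include <random>
#include <vector>

struct Vec3 { double x, y, z; };

// Uniformly sample n seed points inside an axis-aligned box [lo, hi].
std::vector<Vec3> boxSeeds(std::size_t n, Vec3 lo, Vec3 hi, std::mt19937& rng) {
    std::uniform_real_distribution<double> ux(lo.x, hi.x), uy(lo.y, hi.y), uz(lo.z, hi.z);
    std::vector<Vec3> seeds(n);
    for (auto& s : seeds) s = {ux(rng), uy(rng), uz(rng)};
    return seeds;
}

int main() {
    std::mt19937 rng(42);
    // Dense scenario: 10,000 seeds packed into ~0.1% of a unit-cube domain;
    // the resulting ICs touch only a small portion of the data.
    auto dense  = boxSeeds(10000, {0.45, 0.45, 0.45}, {0.55, 0.55, 0.55}, rng);
    // Sparse scenario (e.g., streamline statistics): 100 seeds covering the
    // entire domain evenly; the resulting ICs traverse the whole data set.
    auto sparse = boxSeeds(100, {0.0, 0.0, 0.0}, {1.0, 1.0, 1.0}, rng);
    std::printf("dense: %zu seeds, sparse: %zu seeds\n", dense.size(), sparse.size());
}
```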
Vector Field Complexity. Depending on the choice of seed points, the struc-
ture of a vector field can have a strong influence on which parts of the data need
to be taken into account in the IC computation process. Critical points or invariant manifolds of a strongly attracting nature draw streamlines towards them; ICs seeded in or traversing their vicinity remain closely localized. In the opposite case, a nearly uniform vector field requires
ICs to pass through large parts of the data. This dependency of IC com-
putation on the underlying vector field is both counterintuitive and hard to
identify without conducting a prior analysis to determine the field structure,
as is done in Yu et al.’s study [13], for example. While such analysis can be
useful for specific problems, this chapter’s contents consider a more general
setting, where the burden and cost of pre-processing is not considered.
6.3 Approaches to Parallelization
IC calculation places demands on almost all components of a computa-
tional system, including memory, processing, communication, and I/O. Be-
cause of the complex nature of vector fields, seeding scenarios, and the types
of analyses, as outlined in 6.2, there is no single scalable algorithm suitable
for all situations.
Generally, algorithms can be parallelized in two primary ways. The first
way is parallelization across the seed points, where seeds are assigned to pro-
cessors, and data blocks are loaded as needed. The second way is paralleliza-
tion across the data blocks, where processors are assigned a set of data blocks,
and particles are communicated to the processor that owns the data block. In
choosing between these two axes of parallelization, the different assignments
of particles and data blocks to processors will place different demands on the
computing, memory, communication, and I/O subsystems of a cluster.
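The following single-process sketch contrasts the two decompositions on a toy 1D domain (hypothetical; the block layout, the `overSeeds` and `overBlocks` functions, and the modulo block-ownership rule are placeholders for illustration, not any particular system's API). The first function parallelizes over seeds and pulls in blocks on demand; the second parallelizes over blocks and hands particles off to the owning processor.

```cpp
#include <cstdio>
#include <map>
#include <queue>
#include <vector>

struct Particle { double pos; };   // position in a 1D toy domain [0, 4)
constexpr int kBlocks = 4;         // four unit-width data blocks

int blockOf(double pos) { return static_cast<int>(pos); }

// Parallelize over seeds: this "rank" owns a fixed subset of seeds and
// loads whatever data blocks its particles enter. Blocks may be loaded
// redundantly by several ranks, so I/O grows with the blocks touched.
void overSeeds(std::vector<Particle> mySeeds) {
    std::map<int, bool> loaded;    // stands in for on-demand block I/O
    for (auto& p : mySeeds) {
        while (p.pos < kBlocks) {  // advect until the particle exits the domain
            int b = blockOf(p.pos);
            if (!loaded[b]) { loaded[b] = true; std::printf("load block %d\n", b); }
            p.pos += 0.5;          // stand-in for one integration step
        }
    }
}

// Parallelize over blocks: this rank owns every block b with
// b % ranks == myRank and advects only inside those blocks; a particle
// crossing into a foreign block is queued for the owning rank (the queue
// stands in for point-to-point message traffic).
void overBlocks(int myRank, int ranks, std::vector<Particle> incoming) {
    std::queue<Particle> outbox;
    for (auto& p : incoming) {
        while (p.pos < kBlocks && blockOf(p.pos) % ranks == myRank)
            p.pos += 0.5;          // advect only within owned blocks
        if (p.pos < kBlocks) {     // left our blocks before exiting the domain
            outbox.push(p);
            std::printf("send particle at %.1f to rank %d\n",
                        p.pos, blockOf(p.pos) % ranks);
        }
    }
}

int main() {
    overSeeds({{0.0}, {1.2}});
    overBlocks(/*myRank=*/0, /*ranks=*/2, {{0.0}});
}
```

Note how the costs differ even in this toy setting: `overSeeds` pays in redundant block loads, while `overBlocks` pays in particle communication; which cost dominates depends on the seeding and field structure discussed above.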
These two classes of algorithms have a tendency to perform poorly due to workload imbalance, arising either from an uneven particle-to-processor assignment or from loading too many data blocks, which leaves the computation I/O bound.
Recent research blending these two approaches shows promising results.
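A common way to quantify the imbalance described above is the ratio of the most-loaded processor's work to the mean work per processor, where a value near 1.0 indicates good balance. A small sketch, illustrative only, using integration steps as the unit of work:

```cpp
#include <algorithm>
#include <cstdio>
#include <numeric>
#include <vector>

// Load-imbalance factor: max per-rank work divided by mean per-rank work.
// Perfect balance gives 1.0; here "work" counts integration steps per rank.
double imbalance(const std::vector<long>& stepsPerRank) {
    long maxWork = *std::max_element(stepsPerRank.begin(), stepsPerRank.end());
    double mean = std::accumulate(stepsPerRank.begin(), stepsPerRank.end(), 0L)
                  / static_cast<double>(stepsPerRank.size());
    return maxWork / mean;
}

int main() {
    // One rank doing nearly all the stepping, e.g. because its seeds sit
    // near a strongly attracting critical point: imbalance = 9000/2500 = 3.60.
    std::printf("imbalance = %.2f\n", imbalance({9000, 300, 400, 300}));
}
```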
The subsequent sections outline algorithms that parallelize with respect to
particles (6.3.2) and data blocks (6.3.3), as well as a hybrid approach (6.3.4)
and discuss each algorithm’s performance characteristics. These strategies aim
to keep a balanced workload across the set of processors and to design efficient
data structures for handling IC computation and communication.
In the approaches outlined below, the problem mesh is decomposed into a set of data blocks.