Chapter 9
In Situ Processing
Hank Childs
Lawrence Berkeley National Laboratory
Kwan-Liu Ma, Hongfeng Yu
University of California, Davis and Sandia National Laboratories
Brad Whitlock, Jeremy Meredith, Jean Favre
Lawrence Livermore National Laboratory, Oak Ridge National Laboratory, and
Swiss Center for Scientific Computing
Scott Klasky, Norbert Podhorszki
Oak Ridge National Laboratory
Karsten Schwan, Matthew Wolf
Georgia Institute of Technology, Atlanta
Manish Parashar
Rutgers University
Fan Zhang
Rutgers University
9.1 Introduction
9.2 Tailored Co-Processing at High Concurrency
9.3 Co-Processing With General Visualization Tools Via Adaptors
    9.3.1 Adaptor Design
    9.3.2 High Level Implementation Issues
    9.3.3 In Practice
    9.3.4 Co-Processing Performance
9.4 Concurrent Processing
    9.4.1 Service Oriented Architecture for Data Management in HPC
    9.4.2 The ADaptable I/O System, ADIOS
    9.4.3 Data Staging for In Situ Processing
    9.4.4 Exploratory Visualization with VisIt and ParaView Using ADIOS
9.5 In Situ Analytics Using Hybrid Staging
9.6 Data Exploration and In Situ Processing
    9.6.1 In Situ Visualization by Proxy
    9.6.2 In Situ Data Triage
9.7 Conclusion
References
Traditionally, visualization is done via post-processing: a simulation produces
data, writes that data to disk, and then, later, a separate visualization program
reads the data from disk and operates on it. In situ processing refers to a
different approach: the data is processed while it is being produced by the
simulation, allowing visualization to occur without involving disk storage. As recent supercomputing trends have simulations producing data at a much faster rate than I/O bandwidth can accommodate, in situ processing will likely play an increasingly large role in visualizing data sets on the world's largest machines.
The push towards commonplace in situ processing has a benefit besides saving on I/O costs. Already, scientists must limit how much data they store for later processing, potentially restricting their discoveries and the value of their simulations. In situ processing, however, enables this otherwise unexplored data to be processed.
This chapter describes the different approaches for in situ processing and
discusses their benefits and limitations.
9.1 Introduction
The terms used for in situ processing have not been used consistently throughout the community. In this book, in situ processing refers to a spectrum of processing techniques. The commonality between these techniques is that they enable visualization and analysis without the significant—and increasingly expensive—cost of I/O. On one end of the in situ spectrum, referred to in this book as co-processing, visualization routines are part of the simulation code, with direct access to the simulation's memory. On the other end of the in situ spectrum, referred to in this book as concurrent processing, the visualization program runs separately on distinct resources, with data transferred from the simulation to the visualization program via the network. Hybrid methods combine these approaches: data is processed and reduced using direct access to the simulation's memory (i.e., co-processing) and then sent to a dedicated visualization resource for further processing (i.e., concurrent processing). Table 9.1 summarizes these methods.
Technique    Co-processing           Concurrent Processing       Hybrid

Aliases      Tightly coupled;        Loosely coupled;            None
             synchronous             asynchronous; staging

Description  Vis routines have       Vis runs on dedicated,      Data is reduced via
             direct access to the    concurrent resources        co-processing and
             memory of the           and accesses data via       sent to a concurrent
             simulation code         the network                 resource

Negative     Memory constraints;     Data movement costs;        Complex; also shares
Aspects      large impact on the     requires separate           the negatives of the
             simulation (crashes,    resources                   other approaches
             performance)

TABLE 9.1: Summary of in situ processing techniques.

For approaches that process data via direct access to the simulation's memory (i.e., co-processing and hybrid approaches), another important consideration is whether to use custom, tailored code, written for a specific simulation, or whether to leverage existing, richly featured visualization software.
Tailored code is most likely to be well-optimized for performance and memory requirements, for example by using a simulation's data structures without copying them into another form. However, tailored code is often not portable or reusable, tying it to a specific simulation code. Existing, richly featured visualization software, by contrast, is more likely to work well with many simulations; but, typically, the price of this generality is increased usage of resources such as memory, network bandwidth, or CPU time.
This chapter discusses case studies for three in situ processing scenarios:

• a tailored, co-processing approach at high levels of concurrency, described in 9.2;

• a co-processing approach via an adaptor layer to a richly featured visualization tool, described in 9.3; and

• a concurrent approach in the context of the ADIOS system, described in 9.4.

Hybrid processing is an emerging idea, with little published work; its concepts and recent results are discussed in 9.5. Finally, 9.6 discusses how to use in situ processing when the visualizations and analyses to be performed are not known a priori.
9.2 Tailored Co-Processing at High Concurrency
Tailored co-processing is the most natural approach when implementing an in situ solution from scratch and focusing on integration with just one simulation code. The techniques and data structures are implemented with a specific target in mind, resulting in high efficiency in performance and memory overhead, since the simulation's data does not need to be copied. Not surprisingly, many of the largest-scale examples of in situ processing to date have been done using tailored co-processing. Co-processing has been practiced by some researchers in the past with the objective of either monitoring or steering simulations [10, 21, 17, 20]. Later results focused on speeding up the simulation by reducing I/O costs [32, 37].
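To make the tight coupling concrete, the sketch below shows how a tailored co-processing hook might sit inside a simulation's main timestep loop. All names (SimulationState, analyze_in_situ, and so on) are hypothetical stand-ins rather than code from any of the works cited above; the key point is that the analysis routine reads the simulation's arrays in place, with no copy and no I/O, and is invoked only every few timesteps to keep its cost a small fraction of the run.

#include <cstddef>
#include <cstdio>
#include <vector>

struct SimulationState {
    std::size_t nx = 64, ny = 64, nz = 64;     // local grid dimensions
    std::vector<double> temperature;           // field data owned by the simulation
};

// Stand-in for one solver timestep.
void advance(SimulationState& s, int step) {
    for (std::size_t i = 0; i < s.temperature.size(); ++i)
        s.temperature[i] = static_cast<double>((i + step) % 1000);
}

// Tailored in situ routine: walks the simulation's own buffer directly,
// with no copy and no file I/O. A real routine would render or extract
// features here instead of simply counting cells.
void analyze_in_situ(const SimulationState& s, double threshold, int step) {
    std::size_t hot = 0;
    for (double t : s.temperature)
        if (t > threshold) ++hot;
    std::printf("step %d: %zu cells above %.1f\n", step, hot, threshold);
}

int main() {
    SimulationState state;
    state.temperature.resize(state.nx * state.ny * state.nz, 0.0);

    const int vis_interval = 10;   // visualize every Nth step to bound the cost
    for (int step = 0; step < 100; ++step) {
        advance(state, step);
        if (step % vis_interval == 0)
            analyze_in_situ(state, 500.0, step);   // synchronous: the simulation waits here
    }
    return 0;
}

In a real tailored solution, the analysis routine would also reuse the simulation's own domain decomposition and parallel communication machinery, which is precisely what makes this approach efficient in both memory and performance.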
Co-processing has many advantages:

• Accessing the data is very efficient, as the data is already in primary memory. In contrast, post-processing accesses data via the file system and concurrent processing accesses data via the network.

• Visualization routines can access more data than is typically possible. Simulation codes limit the number of time slices they output for post-processing, since I/O is so expensive. But, with co-processing, the visualization routines can be applied to every time slice, and at a low cost. This prevents discoveries from being missed simply because the data could never be explored.

• The integrity of the data does not need to be compromised. Some simulations produce so much data that the data must be reduced somehow, for example by subsampling; this is not necessary when co-processing.
Co-processing also has some disadvantages:

• Any resources, such as memory or network bandwidth, consumed by visualization routines will reduce what is available to the simulation. Further, existing visualization algorithms are usually not optimized to use the domain decomposition and data structures designed for the simulation code. To be used in situ, the algorithms may have to be reformulated so as not to duplicate data and incur excessive interprocessor communication for the visualization calculations.

• The visualization calculations should take only a small fraction of the overall simulation time; if they cannot finish quickly enough, they impede the simulation's ability to advance. Although sophisticated visualization methods offer visually compelling results, they are often not acceptable for in situ visualization for this reason.

• If the visualization routines crash or corrupt memory, then the simulation will be affected—possibly crashing itself.

• The visualization routines must run at the same level of concurrency (in terms of numbers of nodes) as the simulation itself, which may require rethinking some visualization algorithm design and implementation.
In short, it is both challenging and beneficial to design scalable parallel in situ visualization algorithms. The resulting visualization should be cost-effective and highlight the best features of interest in the modeled phenomena without constantly acquiring global information about the spatial and temporal domains of the data. A good design must take into account domain knowledge about the modeled phenomena and the simulation model. As such, a co-processing visualization solution should be developed as a collective effort between the visualization specialists and simulation scientists. Other design considerations for co-processing in situ visualization are discussed by Ma [18].
The recent results of Yu et al. [38] are briefly summarized here to illustrate a real-world, tailored co-processing approach. They achieved highly scalable in situ visualization of a turbulent combustion simulation [6] on a Cray XT5 supercomputer, using up to 15,260 cores. Figure 9.1 shows a set of timing test results. This work was novel beyond its extreme scale because it introduced an integrated solution for visualizing both volume data and particle data, allowing the scientists to examine complex particle-turbulence interaction, like the type shown in Figure 9.2. The mixed types of data required considerable coordination for data occurring along the boundaries [38] and highly streamlined routines, like the 2-3 swap compositing algorithm (see 5.3.1). Finally, their efforts had benefits beyond performance. Before employing in situ visualization, the combustion scientists subsampled the data to reduce both data movement and computational requirements for post-processing visualization. This work allowed them to operate on the original full-resolution data.
9.3 Co-Processing With General Visualization Tools Via
Adaptors
As with tailored code, co-processing with a general visualization system incorporates visualization and analysis routines into the simulation code, allowing direct access to the simulation's data and compute resources. Whereas co-processing with tailored code integrates tightly coupled visualization routines adapted to the simulation's own data structures, co-processing with general routines from fully featured visualization systems relies on an adaptor layer to access simulation data. The adaptor layer is simply a set of routines that the simulation developer provides to expose a simulation's data structures in a manner that is compatible with the visualization system's code.
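To make this concrete, the following is a minimal sketch of what such an adaptor might look like. The VisScalarField structure and the expose_pressure routine are hypothetical stand-ins, not the actual interface of any particular system; real systems such as VisIt's libsim or ParaView's coprocessing tools define their own callback signatures and mesh types.

#include <cstddef>
#include <cstdio>
#include <vector>

// Simulation-side data, laid out however the solver prefers.
struct SimData {
    int dims[3];                    // rectilinear grid dimensions
    std::vector<double> pressure;   // one value per grid point
};

// What the (hypothetical) visualization library expects to receive.
struct VisScalarField {
    const double* values;   // pointer into the simulation's memory
    std::size_t count;
    const char* name;
};

// The adaptor: a small routine the simulation developer writes to translate
// between the two views of the data. Here the layouts happen to be compatible,
// so the adaptor only wraps a pointer; if they differed, this is where data
// would be copied and reorganized into the form the library requires.
VisScalarField expose_pressure(const SimData& sim) {
    VisScalarField field;
    field.values = sim.pressure.data();
    field.count = sim.pressure.size();
    field.name = "pressure";
    return field;
}

int main() {
    SimData sim{{16, 16, 16}, std::vector<double>(16 * 16 * 16, 1.0)};
    VisScalarField f = expose_pressure(sim);
    // A visualization library would now consume f's values in place.
    std::printf("exposed %zu values of field '%s'\n", f.count, f.name);
    return 0;
}

How much work an adaptor must do, and whether it can avoid copying as in this sketch, depends largely on how closely the simulation's layout matches what the visualization system expects, as the following discussion describes.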
The complexity of adaptors can vary greatly. If the data structures of the
simulation and the visualization system differ, then the adaptor’s job is to
copy and reorganize data. For example, a simulation may contain particle
data, with spatial coordinates and other attributes for each particle. If the