16. VisIt: An End-User Tool for Visualizing and Analyzing Very Large Data (2/4)


362 High Performance Visualization

• mdserver: A program that browses remote file systems and reads metadata.

• vcl: VisIt Component Launcher, a program whose sole job is to launch other server-side programs. Without this program, the user would have to issue credentials for the launch of each program on the remote machine.

While the configuration in Figure 16.1 is the most common, other variants are also used:

• Data is located on the local machine, so all programs, including the server-side programs, run on the local machine.

• The client-side programs run on a remote machine. This mode occurs most often in conjunction with graphical desktop sharing, such as VNC.

• Multiple servers are run simultaneously to access data on multiple remote machines.

• VisIt is run entirely in “batch mode.” The gui program is not used and the viewer program runs in a windowless mode.

• VisIt’s client-side programs are coupled with a simulation code and data is processed in situ. In this case, the simulation embeds a copy of the engine program.

16.3.2 Parallelism

VisIt has multiple processing modes: multiresolution processing (Chap. 8), in situ processing (Chap. 9), and out-of-core processing (Chap. 10). Its most frequent mode, however, is pure parallelism (Chap. 2), where the data is partitioned over its MPI tasks and is processed at its native resolution. Most visualization and analysis algorithms are embarrassingly parallel, meaning that portions of the data set can be processed in any order and without coordination or communication. For this case, VisIt’s core infrastructure manages the partitioning of the data and all parallelism. For non-embarrassingly parallel cases like streamline calculation or volume rendering, algorithms are able to manage parallelism themselves and can opt to perform collective communication if necessary.
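The embarrassingly parallel case can be illustrated with a minimal sketch. It is not VisIt's implementation: the round-robin policy, the function names, and the toy kernel are all assumptions made for illustration, and the MPI tasks are simulated sequentially so the example runs without an MPI installation.

```python
from typing import Callable, Dict, List


def assign_domains(num_domains: int, num_tasks: int) -> Dict[int, List[int]]:
    """Round-robin assignment of domain ids to task ranks (an assumed policy)."""
    assignment: Dict[int, List[int]] = {rank: [] for rank in range(num_tasks)}
    for domain in range(num_domains):
        assignment[domain % num_tasks].append(domain)
    return assignment


def run_task(domains: List[int], kernel: Callable[[int], float]) -> List[float]:
    """One task processes only its own domains; no communication is required."""
    return [kernel(domain) for domain in domains]


def toy_kernel(domain_id: int) -> float:
    """Hypothetical stand-in for a per-domain operation, e.g. contouring one block."""
    return float(domain_id * domain_id)


if __name__ == "__main__":
    assignment = assign_domains(num_domains=10, num_tasks=4)
    # Ranks are simulated one after another here; under MPI, each rank would
    # execute run_task() on its own list at the same time as the others.
    for rank, domains in assignment.items():
        print(rank, domains, run_task(domains, toy_kernel))
```

Because each domain is handled independently, the order in which the ranks run does not affect the result, which is the property the infrastructure relies on.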

In VisIt’s most typical visualization use case, a user loads a large data set, applies operations to reduce the data size, and then transfers the resulting data set to the local client for interactive rendering, using the local graphics card. However, some data sets are so large that their reduced forms are too large for a desktop machine. This case requires a backup plan. VisIt’s backup plan is to switch to a parallel rendering mode: data is left on the parallel server, each MPI task renders its own piece, and the resulting subimages are composited together. The final image is then brought back to the viewer and placed in the visualization window, as if it were rendered with the graphics card. Although this process sounds cumbersome, the switch to the parallel rendering mode is transparent to end users and frame rates approaching ten frames per second can be achieved.
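The compositing step can be sketched for the simplest case, opaque geometry, where the nearest fragment wins at each pixel. This is an illustrative depth-compositing sketch under assumed data structures (one-dimensional "images" of depth and color pairs), not VisIt's compositor; volume rendering would instead require blending in visibility order.

```python
from typing import List, Tuple

Pixel = Tuple[float, int]  # (depth, color); a smaller depth is closer to the camera
Image = List[Pixel]

BACKGROUND: Pixel = (float("inf"), 0)


def composite(subimages: List[Image]) -> Image:
    """For every pixel, keep the fragment nearest the camera across all subimages."""
    if not subimages:
        return []
    width = len(subimages[0])
    result: Image = [BACKGROUND] * width
    for image in subimages:
        if len(image) != width:
            raise ValueError("all subimages must have the same size")
        # Compare depths only; on a tie the fragment already in `result` is kept.
        result = [min(a, b, key=lambda p: p[0]) for a, b in zip(result, image)]
    return result


if __name__ == "__main__":
    # Two simulated tasks, each having rendered only its own piece of the data.
    task0 = [(1.0, 10), BACKGROUND, (5.0, 12)]
    task1 = [(2.0, 20), (3.0, 21), (4.0, 22)]
    print(composite([task0, task1]))
```

The loop combines the subimages one at a time on a single process; a parallel compositor would distribute this reduction across tasks, but the per-pixel rule is the same.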

VisIt was designed for many scales of concurrency. Many users run serial or modestly parallel versions on their desktop machines. When users utilize parallel resources on a supercomputer, they typically run with 32 to 512 tasks. For the largest data sets, however, VisIt servers with thousands or even tens of thousands of tasks are used (see Section 16.4.1). VisIt demonstrates excellent scalability and performance at each of these scales.

16.3.3 User Interface Concepts and Extensibility

Type        Description                                            # of instances
Database    How to read from a file                                ~115
Operator    How to manipulate data                                 ~60
Plot        How to render data                                     ~20
Expression  How to derive new quantities                           ~190
Query       How to extract quantitative and debugging information  ~90

TABLE 16.1: VisIt’s five primary user interface concepts.

Table 16.1 shows the five primary user interface concepts in VisIt. A strength of these concepts is their interoperability. Each plot can work on data directly from a file (databases) or from derived data (expressions), and can have an arbitrary number of data transformations or subselections applied (operators). Once the key information is isolated, quantitative or debugging information can be extracted (queries) or the data can be rendered (plots). Consider an example: a user reads from a file (database), calculates the λ2 metric for finding high vorticity (expressions), isolates the regions of highest vorticity (operators), renders them (plots), then calculates the number of connected components and statistics about them (queries).
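The way the five concepts chain together can be mimicked with a toy, one-dimensional pipeline. Everything in this sketch is hypothetical: the function names, the made-up field values, and the simple neighbor-difference expression (which stands in for a real metric such as λ2) are assumptions for illustration and are not part of VisIt's interface.

```python
from typing import Dict, List


def read_database() -> Dict[str, List[float]]:
    """Database: stands in for a file reader; the values are made up."""
    return {"velocity": [0.0, 1.0, 4.0, 9.0, 10.0, 10.5, 14.0, 20.0]}


def expression_abs_difference(field: List[float]) -> List[float]:
    """Expression: derive a per-cell quantity from neighboring samples."""
    return [abs(b - a) for a, b in zip(field, field[1:])]


def operator_threshold(values: List[float], minimum: float) -> List[int]:
    """Operator: keep only the cell indices whose value reaches the minimum."""
    return [i for i, v in enumerate(values) if v >= minimum]


def plot_as_text(selected: List[int], num_cells: int) -> str:
    """Plot: a text 'rendering' that marks the selected cells with '#'."""
    chosen = set(selected)
    return "".join("#" if i in chosen else "." for i in range(num_cells))


def query_component_sizes(selected: List[int]) -> List[int]:
    """Query: sizes of runs of adjacent selected cells (1D connected components)."""
    sizes: List[int] = []
    previous = None
    for i in selected:  # indices arrive in ascending order from the operator
        if previous is not None and i == previous + 1:
            sizes[-1] += 1
        else:
            sizes.append(1)
        previous = i
    return sizes


if __name__ == "__main__":
    derived = expression_abs_difference(read_database()["velocity"])
    selected = operator_threshold(derived, minimum=3.0)
    print(plot_as_text(selected, len(derived)))
    print(len(query_component_sizes(selected)), query_component_sizes(selected))
```

Each stage consumes only the output of the previous one, which is what allows any plot or query to be combined with any database, expression, or operator.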

VisIt makes it easy to add new types of databases, operators, and plots. The base infrastructure deals with these concepts as abstract types; it only discovers the concrete database, operator, and plot instances at start-up, by loading them as plug-ins. Thus, adding new functionality to VisIt translates to developing a new plug-in. Further, VisIt facilitates the plug-in development process. It provides an environment for defining a plug-in and also performs code generation. The developer starts by setting up options for the plug-in, and then VisIt generates attributes for storing the options, user interface components (Python, Qt, and Java), the plug-in bindings, and C++ methods with “dummy” implementations. The developer then replaces the dummy implementations with their intended algorithm, file reading code, etc.

16.3.4 The Size and Breadth of VisIt

Although only briefly discussed in this chapter, VisIt has an extensive list of features. Its ~115 file format readers include support for many HDF5- and NetCDF-based formats, CGNS, and others, including generic readers for some types of binary and ASCII files. Its ~60 operators include transformations (such as projections, scaling, rotation, and translation), data subsetting (such as thresholding and contouring), and spatial thresholding (such as limiting to a box or a plane), among many others. Its ~90 queries allow users to get customizable reports about specific cells or points, integrate quantities, calculate surface areas and volumes, insert synthetic diagnostics/virtual detectors, and much more. Its ~190 expressions go well beyond simple math. For example, the user can create derived quantities like, “if the magnitude of the gradient of density is greater than this, then do this, else do that.”
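The conditional expression quoted above can be made concrete with a small sketch. It is written in plain Python rather than in VisIt's expression language, it works on a one-dimensional field for brevity, and the function names, the finite-difference scheme, and the sample values are assumptions made for illustration.

```python
from typing import List


def gradient_1d(values: List[float], spacing: float = 1.0) -> List[float]:
    """Central differences in the interior, one-sided differences at the ends."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to form a gradient")
    grad: List[float] = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        grad.append((values[hi] - values[lo]) / ((hi - lo) * spacing))
    return grad


def conditional(density: List[float], threshold: float,
                then_value: float, else_value: float) -> List[float]:
    """'If |gradient(density)| > threshold then then_value else else_value', per sample."""
    return [then_value if abs(g) > threshold else else_value
            for g in gradient_1d(density)]


if __name__ == "__main__":
    density = [1.0, 1.0, 1.0, 5.0, 9.0, 9.0]
    print(gradient_1d(density))
    print(conditional(density, threshold=1.5, then_value=1.0, else_value=0.0))
```

In one dimension the gradient magnitude is just the absolute value of the derivative; on a real mesh the gradient is a vector and its magnitude would be taken instead.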

Many features do not fit into the five primary user interface concepts. There is support for positioning light sources, making movies (including MPEG encoding), eliminating data based on known categorizations (e.g., “show me only this refinement level” from an AMR mesh), and rendering effects like shadows and specular highlights, to name a few. In total, VisIt is approximately one and a half million lines of code.

Finally, VisIt makes heavy use of the Visualization ToolKit (VTK) [17]. This library contains an execution model, a data model, and many algorithms for transforming data. VisIt implements its own execution model, but the other two pieces form the foundation of VisIt’s data processing. VTK’s data model forms the basis of VisIt’s data model, although VisIt provides support for mixed material cells, metadata for faster processing, and other concepts not natively supported by VTK. Further, VisIt uses the native VTK algorithm for many embarrassingly parallel visualization algorithms. In short, VTK has provided important leverage to the VisIt project, allowing VisIt developers to direct their attention to the project’s three main focal points.

16.4 Successes

The VisIt project has succeeded in multiple ways: by providing a scalable infrastructure for visualization and analysis, by populating that infrastructure with cutting-edge algorithms, by informing the limits of new hardware architectures, and, most importantly, by enabling successes for the tool’s end users. A few noteworthy highlights are summarized in the subsections below.

16.4.1 Scalability Successes

As discussed previously in Chapter 13, a pair of studies were run in 2009 to demonstrate VisIt’s capabilities for scalability and large data (see Fig. 16.2). In the first study, VisIt’s infrastructure and some of its key visualization algorithms were demonstrated to support weak scaling. This demonstration led to VisIt being selected as a “Joule code,” a formal certification process by the US Office of Management and Budget to ensure that programs running on high-end supercomputers are capable of using the machine efficiently. In the second study, VisIt was scaled up to tens of thousands of cores and used to visualize data sets with trillions of cells per time slice. This study found VisIt itself to perform quite well, although overall performance was limited by the supercomputer’s I/O bandwidth. Both studies are further described by Childs et al. [6].

FIGURE 16.2: The two left images show a contouring and a volume rendering from a Denovo radiation transport simulation. They were produced by VisIt using 12,270 cores of JaguarPF as part of the “Joule code” certification, which showed that VisIt is weakly scalable. The two right images show a contouring and a volume rendering of a two trillion cell data set produced by VisIt using 32,000 cores of JaguarPF as part of a study on scalability at high levels of concurrency and on large data sets. The volume rendering was reproduced in 2011 on a one trillion cell version of the data set using only 800 cores of the TACC Longhorn machine. Image source: Childs et al., 2010 [6].
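To put the core counts in the caption into perspective, the per-core load of the two large runs can be worked out directly. The sketch below uses only the cell and core counts quoted above; the function name is hypothetical, and the result describes load per core only, saying nothing about run time or memory use.

```python
def cells_per_core(total_cells: int, cores: int) -> float:
    """Average number of cells each core is responsible for."""
    return total_cells / cores


# Cell and core counts taken from the figure caption.
RUNS = {
    "JaguarPF, two trillion cells, 32,000 cores": (2 * 10**12, 32_000),
    "Longhorn, one trillion cells, 800 cores": (10**12, 800),
}

if __name__ == "__main__":
    for label, (cells, cores) in RUNS.items():
        print(f"{label}: {cells_per_core(cells, cores):,.0f} cells per core")
```

Under these figures, each Longhorn core handled twenty times as many cells as each JaguarPF core.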

16.4.2 A Repository for Large Data Algorithms

Many advanced algorithms for visualizing and analyzing large data have been implemented inside of VisIt, making them directly available to end users. Notable algorithms include:

• A novel streamline algorithm that melds two different parallelization strategies (“over data” and “over seeds”) to retain their positive effects while minimizing their negative ones [13];

• A volume rendering algorithm that handles the compositing complexities inherent to unstructured meshes while still delivering scalable performance [4];

• An algorithm for identifying connected components in unstructured meshes in a distributed-memory parallel setting on very large data sets [9];

• An algorithm for creating crack-free isosurfaces for adaptive mesh refinement data, a common mesh type for very large data [18];

• A well-performing material interface reconstruction algorithm for distributed-memory parallel environments that balances concerns for both visualization and analysis [12]; and

• A method for repeated interpolations of velocity fields in unstructured meshes, to accelerate streamlines [8].

Further, VisIt has been the subject of much systems research, including papers on the base VisIt architecture [5], VisIt’s “contract” system, which allows it to detect the processing requirements for the current operations and adaptively apply the best optimizations [3], and a description of the adapter layer that allows VisIt to couple with a simulation and run in situ [19].

16.4.3 Supercomputing Research Performed with VisIt

As the landscape for parallel computers changes, VisIt has been used to test the benefits of emerging algorithms and hardware features, including:

• Studying modifications to collective communication patterns for ghost data generation, to be suitable for out-of-core processing, thereby improving cache coherency and reducing memory footprint [10];

• Studying the viability of hardware-accelerated volume rendering on distributed-memory parallel visualization clusters powered by GPUs [7];

• Studying the benefits of hybrid parallelism for streamline algorithms [1]; and

• Studying the issues and strategies for porting to new operating systems [14].

16.4.4 User Successes

Of course, the most important measure for the project is helping users better understand their data. Unfortunately, metrics of success in this space are difficult to obtain:

• Some national laboratories keep statistics on their user communities: the United States’ Lawrence Livermore Lab has approximately 300 regular users, the United Kingdom’s Atomic Weapons Establishment (AWE)
