362 High Performance Visualization
mdserver: A program that browses remote file systems and reads meta-
data.
vcl: VisIt Component Launcher, a program whose sole job is to launch
other server-side programs. Without this program, the user would have
to issue credentials for the launch of each program on the remote ma-
chine.
While the configuration in Figure 16.1 is the most common, other variants are
also used:
Data is located on the local machine, so all programs, including the
server-side programs, run on the local machine.
The client-side programs run on a remote machine. This mode occurs
most often in conjunction with graphical desktop sharing, such as VNC.
Multiple servers are run simultaneously to access data on multiple re-
mote machines.
VisIt is run entirely in “batch mode.” The gui program is not used and
the viewer program runs in a windowless mode.
VisIt’s client-side programs are coupled with a simulation code and data
is processed in situ. In this case, the simulation embeds a copy of the
engine program.
16.3.2 Parallelism
VisIt has multiple processing modes—multiresolution processing (Chap. 8),
in situ processing (Chap. 9), and out-of-core processing (Chap. 10)—but its
most frequent mode is pure parallelism (Chap. 2), where the data is partitioned
over its MPI tasks and is processed at its native resolution. Most visualization
and analysis algorithms are embarrassingly parallel, meaning that portions of
the data set can be processed in any order and without coordination and
communication. For this case, VisIt’s core infrastructure manages the parti-
tioning of the data and all parallelism. For non-embarrassingly parallel cases
like streamline calculation or volume rendering, algorithms are able to man-
age parallelism themselves and can opt to perform collective communication
if necessary.
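The division of labor described above can be sketched in a few lines. This is only an illustration under stated assumptions, not VisIt's implementation: the round-robin assignment, the function names, and the sample values are all hypothetical, and the "tasks" are simulated in a single process rather than run under MPI. Each task scans only its own domains (the embarrassingly parallel part), and a single reduction stands in for a collective communication step.

```python
from typing import List, Tuple


def assign_domains(num_domains: int, num_tasks: int) -> List[List[int]]:
    """Deal domain ids out to tasks round-robin (one possible partitioning)."""
    return [list(range(task, num_domains, num_tasks)) for task in range(num_tasks)]


def local_extents(domains: List[List[float]], mine: List[int]) -> Tuple[float, float]:
    """Embarrassingly parallel step: a task looks only at its own domains."""
    values = [v for d in mine for v in domains[d]]
    return (min(values), max(values))


def reduce_extents(partials: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Stand-in for a collective reduction across tasks."""
    return (min(lo for lo, _ in partials), max(hi for _, hi in partials))


# Hypothetical data set: six domains of scalar values, processed by three tasks.
domains = [[0.5, 1.0], [2.0, 3.5], [-1.0, 0.0], [4.0, 4.5], [1.5, 2.5], [0.25, 0.75]]
assignment = assign_domains(len(domains), 3)
partials = [local_extents(domains, mine) for mine in assignment]
global_extents = reduce_extents(partials)
```

The sketch assumes every task receives at least one domain; a task with an empty assignment would need a neutral value before the reduction.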
In VisIt’s most typical visualization use case, a user loads a large data set,
applies operations to reduce the data size, and then transfers the resulting
data set to the local client for interactive rendering, using the local graphics
card. However, some data sets are so large that their reduced forms are too
large for a desktop machine. This case requires a backup plan. VisIt’s backup
plan is to switch to a parallel rendering mode: data is left on the parallel
server, each MPI task renders its own piece, and the resulting subimages are
VisIt: An End-User Tool for Visualizing and Analyzing Very Large Data 363
composited together. The final image is then brought back to the viewer and
placed in the visualization window, as if it were rendered with the graphics
card. Although this process sounds cumbersome, the switch to the parallel
rendering mode is transparent to end users, and frame rates approaching ten
frames per second can be achieved.
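The compositing step can be illustrated with a minimal depth-based sketch. The text does not say which compositing algorithm VisIt uses, so this is an assumption: it shows the simplest case of opaque geometry, where the fragment nearest the camera wins at each pixel. The `Pixel` and `SubImage` types and the sample scanlines are hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[float, str]          # (depth, color); a smaller depth is closer
SubImage = List[Optional[Pixel]]   # None marks a pixel the task did not cover


def composite(subimages: List[SubImage]) -> List[Optional[str]]:
    """Keep, per pixel, the color of the nearest fragment across all tasks."""
    width = len(subimages[0])
    final: List[Optional[str]] = []
    for x in range(width):
        fragments = [img[x] for img in subimages if img[x] is not None]
        if fragments:
            nearest = min(fragments, key=lambda frag: frag[0])
            final.append(nearest[1])
        else:
            final.append(None)  # background: no task drew here
    return final


# Hypothetical four-pixel scanline rendered by three tasks.
task0: SubImage = [(2.0, "red"), None, (5.0, "red"), None]
task1: SubImage = [(1.0, "green"), (3.0, "green"), None, None]
task2: SubImage = [None, (2.5, "blue"), (4.0, "blue"), None]
image = composite([task0, task1, task2])
```

Semi-transparent data such as volume renderings cannot be merged this way, because the fragments must be blended in depth order rather than simply replaced.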
VisIt was designed for many scales of concurrency. Many users run serial or
modestly parallel versions on their desktop machines. When users utilize
parallel resources on a supercomputer, they typically run with 32 to 512 tasks.
For the largest data sets, however, VisIt servers with thousands or even tens
of thousands of tasks are used (see Section 16.4.1). VisIt demonstrates
excellent scalability and performance at each of these scales.
16.3.3 User Interface Concepts and Extensibility
Type         Description                            # of instances
Database     How to read from a file                ~115
Operator     How to manipulate data                 ~60
Plot         How to render data                     ~20
Expression   How to derive new quantities           ~190
Query        How to extract quantitative and        ~90
             debugging information
TABLE 16.1: VisIt’s five primary user interface concepts.
Table 16.1 shows the five primary user interface concepts in VisIt. A
strength of these concepts is their interoperability. Each plot can work on
data directly from a file (databases) or from derived data (expressions), and
can have an arbitrary number of data transformations or subselections applied
(operators). Once the key information is isolated, quantitative or debugging
information can be extracted (queries) or the data can be rendered (plots).
Consider an example: a user reads from a file (database), calculates the λ-2
metric for finding high vorticity (expressions), isolates the regions of
highest vorticity (operators), renders them (plots), and then calculates the
number of connected components and statistics about them (queries).
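The way these concepts compose can be shown with a toy model. This is not VisIt's API: every function name and data value below is hypothetical, and the sketch substitutes a squared-vorticity expression and a simple cell count for the λ-2 metric and the connected components query in the example above.

```python
from typing import Callable, Dict, List

Mesh = Dict[str, List[float]]  # variable name -> one value per cell


def read_database() -> Mesh:
    """Database: stands in for a file reader (the values are made up)."""
    return {"vorticity": [0.1, 0.9, 0.8, 0.2, 0.95, 0.05]}


def add_expression(mesh: Mesh, name: str, source: str,
                   fn: Callable[[float], float]) -> Mesh:
    """Expression: derive a new per-cell variable from an existing one."""
    derived = dict(mesh)
    derived[name] = [fn(v) for v in mesh[source]]
    return derived


def threshold(mesh: Mesh, var: str, lower: float) -> Mesh:
    """Operator: keep only the cells where var >= lower."""
    keep = [i for i, v in enumerate(mesh[var]) if v >= lower]
    return {name: [vals[i] for i in keep] for name, vals in mesh.items()}


def count_cells(mesh: Mesh) -> int:
    """Query: report a number about the data instead of drawing it."""
    return len(next(iter(mesh.values())))


def plot(mesh: Mesh, var: str) -> str:
    """Plot: a text 'rendering' of one variable."""
    return " ".join(f"{v:.2f}" for v in mesh[var])


mesh = add_expression(read_database(), "vort_sq", "vorticity", lambda v: v * v)
selected = threshold(mesh, "vort_sq", 0.5)
picture = plot(selected, "vort_sq")
remaining = count_cells(selected)
```

Because every stage consumes and produces the same `Mesh` type, the plot and the query can each be applied to raw, derived, or subselected data, which is the interoperability the text describes.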
VisIt makes it easy to add new types of databases, operators, and plots.
The base infrastructure deals with these concepts as abstract types; it only
discovers the concrete database, operator, and plot instances at start-up, by
loading them as plug-ins. Thus, adding new functionality to VisIt translates
to developing a new plug-in. Further, VisIt facilitates the plug-in development
process. It provides an environment for defining a plug-in and also performs
code generation. The developer starts by setting up options for the plug-ins,
and then VisIt generates attributes for storing the options, user interface com-
ponents (Python, Qt, and Java), the plug-in bindings, and C++ methods with
“dummy” implementations. The developer then replaces the dummy imple-
mentations with their intended algorithm, file reading code, etc.
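The idea of a core that knows only abstract types and discovers concrete ones at start-up can be sketched with a small registry. This is a generic plug-in pattern, not VisIt's actual plug-in mechanism or generated code: the `register` decorator, the `REGISTRY` table, and both operator classes are hypothetical, and registration at import time stands in for loading plug-ins from disk.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Type


class Operator(ABC):
    """Abstract type: the only thing the core infrastructure knows about."""

    @abstractmethod
    def execute(self, values: List[float]) -> List[float]:
        ...


REGISTRY: Dict[str, Type[Operator]] = {}


def register(name: str) -> Callable[[Type[Operator]], Type[Operator]]:
    """Class decorator that records a concrete plug-in under a name."""
    def wrap(cls: Type[Operator]) -> Type[Operator]:
        REGISTRY[name] = cls
        return cls
    return wrap


@register("Scale")
class ScaleOperator(Operator):
    def execute(self, values: List[float]) -> List[float]:
        return [2.0 * v for v in values]


@register("Clamp")
class ClampOperator(Operator):
    def execute(self, values: List[float]) -> List[float]:
        # "Dummy" body, as a generated stub might look before the
        # developer replaces it with the intended algorithm.
        return values


def run(name: str, values: List[float]) -> List[float]:
    """The core looks a plug-in up by name and uses it via the abstract type."""
    return REGISTRY[name]().execute(values)


doubled = run("Scale", [1.0, 2.5])
```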
16.3.4 The Size and Breadth of VisIt
Although only briefly discussed in this chapter, VisIt has an extensive
list of features. Its ~115 file format readers include support for many HDF5-
and NetCDF-based formats, CGNS, and others, including generic readers for
some types of binary and ASCII files. Its ~60 operators include
transformations (such as projections, scaling, rotation, and translation),
data subsetting (such as thresholding and contouring), and spatial
thresholding (such as limiting to a box or a plane), among many others. Its
~90 queries allow users to get customizable reports about specific cells or
points, integrate quantities, calculate surface areas and volumes, insert
synthetic diagnostics/virtual detectors, and much more. Its ~190 expressions
go well beyond simple math. For example, the user can create derived
quantities like “if the magnitude of the gradient of density is greater than
this, then do this, else do that.”
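A conditional expression of that kind can be sketched on a one-dimensional grid. This is an illustration only and does not use VisIt's expression language: the finite-difference gradient, the cutoff of 0.75, and the density samples are all assumptions chosen for the example.

```python
from typing import List


def gradient_magnitude(values: List[float], spacing: float = 1.0) -> List[float]:
    """|d(values)/dx| on a uniform 1D grid.

    Uses central differences in the interior and one-sided differences at
    the two ends. Requires at least two samples.
    """
    n = len(values)
    grads = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        grads.append(abs(values[hi] - values[lo]) / ((hi - lo) * spacing))
    return grads


def conditional(test: List[bool], if_true: List[float],
                if_false: List[float]) -> List[float]:
    """Per-sample 'if test then if_true else if_false'."""
    return [t if c else f for c, t, f in zip(test, if_true, if_false)]


# Hypothetical density profile with a sharp front in the middle.
density = [1.0, 1.0, 1.0, 3.0, 5.0, 5.0]
grad = gradient_magnitude(density)
steep = [g > 0.75 for g in grad]
# Keep density where the gradient is steep, and zero it elsewhere.
result = conditional(steep, density, [0.0] * len(density))
```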
Many features also do not fit into the five primary user interface concepts.
There is support for positioning light sources, making movies (including
MPEG encoding), eliminating data based on known categorizations (e.g.,
“show me only this refinement level” from an AMR mesh), and rendering effects
like shadows and specular highlights, to name a few. In total, VisIt is
approximately one and a half million lines of code.
Finally, VisIt makes heavy use of the Visualization ToolKit (VTK) [17].
This library contains an execution model, a data model, and many algorithms
for transforming data. VisIt implements its own execution model, but the
other two pieces form the foundation of VisIt’s data processing. VTK’s data
model forms the basis of VisIt’s data model, although VisIt provides support
for mixed material cells, metadata for faster processing, and other concepts
not natively supported by VTK. Further, VisIt uses the native VTK
implementations of many embarrassingly parallel visualization algorithms. In
short, VTK has provided important leverage to the VisIt project, allowing VisIt
developers to direct their attention to the project’s three main focal points.
16.4 Successes
The VisIt project has succeeded in multiple ways: by providing a scalable
infrastructure for visualization and analysis, by populating that infrastruc-
ture with cutting-edge algorithms, by informing the limits of new hardware
architectures, and, most importantly, by enabling successes for the tool’s end
users. A few noteworthy highlights are summarized in the subsections below.
16.4.1 Scalability Successes
As discussed previously in Chapter 13, a pair of studies were run in 2009 to
demonstrate VisIt’s capabilities for scalability and large data (see Fig. 16.2).
In the first study, VisIt’s infrastructure and some of its key visualization
algorithms were demonstrated to support weak scaling. This demonstration led
to VisIt being selected as a “Joule code,” a formal certification process by
the US Office of Management and Budget to ensure that programs running
on high-end supercomputers are capable of using the machine efficiently. In
the second study, VisIt was scaled up to tens of thousands of cores and used
to visualize data sets with trillions of cells per time slice. This study found
VisIt itself to perform quite well, although overall performance was limited
by the supercomputer’s I/O bandwidth. Both studies are further described by
Childs et al. [6].
FIGURE 16.2: The two left images show a contouring and a volume rendering
from a Denovo radiation transport simulation. They were produced by VisIt
using 12,270 cores of JaguarPF as part of the “Joule code” certification, which
showed that VisIt is weakly scalable. The two right images show a contouring
and a volume rendering of a two trillion cell data set produced by VisIt using
32,000 cores of JaguarPF as part of a study on scalability at high levels of
concurrency and on large data sets. The volume rendering was reproduced in
2011 on a one trillion cell version of the data set using only 800 cores of the
TACC Longhorn machine. Image source: Childs et al., 2010 [6].
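Taking the round figures in the caption of Figure 16.2 at face value, a quick calculation shows how differently the two volume rendering runs loaded each core. The cell and core counts come from the caption; the helper function is just arithmetic and assumes the cells were spread evenly, which the text does not state.

```python
def cells_per_core(cells: int, cores: int) -> float:
    """Average number of cells each core is responsible for."""
    return cells / cores


# Two trillion cells on 32,000 cores of JaguarPF.
jaguar_load = cells_per_core(2_000_000_000_000, 32_000)

# One trillion cells on 800 cores of TACC Longhorn (the 2011 reproduction).
longhorn_load = cells_per_core(1_000_000_000_000, 800)

# How many times more cells each Longhorn core handled.
load_ratio = longhorn_load / jaguar_load
```

Under these assumptions each JaguarPF core handled about 62.5 million cells, while each Longhorn core handled about 1.25 billion, a factor of 20 more.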
16.4.2 A Repository for Large Data Algorithms
Many advanced algorithms for visualizing and analyzing large data have
been implemented inside of VisIt, making them directly available to end users.
Notable algorithms include:
A novel streamline algorithm that melds two different parallelization
strategies (“over data” and “over seeds”) to retain their positive effects
while minimizing their negative ones [13];
A volume rendering algorithm that handles the compositing complexi-
ties inherent to unstructured meshes while still delivering scalable per-
formance [4];
An algorithm for identifying connected components in unstructured
meshes in a distributed-memory parallel setting on very large data
sets [9];
An algorithm for creating crack-free isosurfaces for adaptive mesh re-
finement data, a common mesh type for very large data [18];
A well-performing material interface reconstruction algorithm for
distributed-memory parallel environments that balances concerns for
both visualization and analysis [12]; and
A method for repeated interpolations of velocity fields in unstructured
meshes, to accelerate streamlines [8].
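To make the connected components problem in the list above concrete, the following sketch labels components in a single process using a union-find structure. It is a standard serial technique, not the distributed-memory algorithm of [9]: it assumes the whole mesh fits on one task, and the cell adjacency list is hypothetical.

```python
from typing import List, Tuple


def connected_components(num_cells: int,
                         shared_faces: List[Tuple[int, int]]) -> List[int]:
    """Label each cell with the smallest cell id in its component."""
    parent = list(range(num_cells))

    def find(cell: int) -> int:
        while parent[cell] != cell:
            parent[cell] = parent[parent[cell]]  # path halving
            cell = parent[cell]
        return cell

    for a, b in shared_faces:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Always point the larger root at the smaller one, so the
            # final label is the smallest id in the component.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    return [find(cell) for cell in range(num_cells)]


# Hypothetical mesh: six cells, where each pair shares a face.
labels = connected_components(6, [(0, 1), (1, 2), (4, 5)])
num_components = len(set(labels))
```

In a distributed-memory setting the difficulty is that a component may span cells owned by different tasks, so local labels like these would still have to be reconciled across task boundaries.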
Further, VisIt has been the subject of much systems research, including
papers on the base VisIt architecture [5], VisIt’s “contract” system, which
allows it to detect the processing requirements for the current operations and
adaptively apply the best optimizations [3], and a description of the adapter
layer that allows VisIt to couple with a simulation and run in situ [19].
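The general idea of detecting processing requirements before any data is processed can be sketched as follows. This is a loose illustration, not the design described in [3]: the `Contract` fields, the filter classes, and the choice to walk the pipeline from its last stage back to its first are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Contract:
    """Requirements gathered before execution (field names are hypothetical)."""
    variables: Set[str] = field(default_factory=set)
    needs_ghost_data: bool = False


class Filter:
    def update_contract(self, contract: Contract) -> Contract:
        return contract  # default: this stage adds no requirements


class UsesVariable(Filter):
    """Hypothetical stage that needs one named variable to be read."""

    def __init__(self, name: str) -> None:
        self.name = name

    def update_contract(self, contract: Contract) -> Contract:
        contract.variables.add(self.name)
        return contract


class NeedsNeighbors(Filter):
    """Hypothetical stage that needs data from adjacent domains."""

    def update_contract(self, contract: Contract) -> Contract:
        contract.needs_ghost_data = True
        return contract


def negotiate(pipeline: List[Filter]) -> Contract:
    """Let every stage state its needs before any data is read."""
    contract = Contract()
    for stage in reversed(pipeline):
        contract = stage.update_contract(contract)
    return contract


contract = negotiate([UsesVariable("density"), NeedsNeighbors(),
                      UsesVariable("pressure")])
```

With the finished contract in hand, the reading stage could load only the listed variables and generate ghost data only when some stage asked for it.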
16.4.3 Supercomputing Research Performed with VisIt
As the landscape for parallel computers changes, VisIt has been used to
test the benefits of emerging algorithms and hardware features, including:
Studying modifications to collective communication patterns for ghost
data generation, to be suitable for out-of-core processing, thereby im-
proving cache coherency and reducing memory footprint [10];
Studying the viability of hardware-accelerated volume rendering on
distributed-memory parallel visualization clusters powered by GPUs [7];
Studying the benefits of hybrid parallelism for streamline algorithms [1];
and
Studying the issues and strategies for porting to new operating sys-
tems [14].
16.4.4 User Successes
Of course, the most important measure for the project is helping users
better understand their data. Unfortunately, metrics of success in this space
are difficult to come by:
Some national laboratories keep statistics on their user communities: the
United States’ Lawrence Livermore Lab has approximately 300 regular
users, the United Kingdom’s Atomic Weapons Establishment (AWE)