204 High Performance Visualization
space for inter-stage buffering. As illustrated in Figure 10.2, the synchronous
pipeline, on the left, cannot start processing the second portion until the first
is finished. The first module in the asynchronous pipeline, on the right, will
start working on the second portion as soon as it has finished processing the
first one.
10.3 Streamed Visualization Concepts
This section discusses some general aspects of applying the concept of
streaming to scientific visualization. The discussion will be limited to tradi-
tional qualitative scientific visualization in which data sets are processed and
rendered in an effort to gain qualitative insights into the problem at hand.
10.3.1 Data Structures
Ideally, streaming visualization breaks scientific data apart into coherent regions of space and time. The different classes of input data to be visualized have properties that make them more or less well-suited to such partitioning.
A completely structured data set has the property that its topology and geometry are implicit and its data values represent regularly spaced samples. Structured data sets are efficiently stored as arrays of data values. Streaming
with structured data is relatively easy:
- The mapping between a geometric location and a location within a data array is implicit and easily calculated.
- Portions containing spatially coherent regions (slices along the slowest-changing direction) are straightforward [32, 34] without any pre-processing.
- With pre-processing, portions containing contiguous regions [9] and multiresolution representations [26] are also trivial to acquire.
- Strided sampling can produce lower-resolution data [27, 3].
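As a concrete sketch, the implicit index mapping and slab-based streaming of a structured data set can be written in a few lines of Python (the function names here are illustrative, not from any particular library), assuming values are stored with i varying fastest and k slowest:

```python
def flat_index(i, j, k, nx, ny):
    # Implicit mapping from a grid location (i, j, k) to a position in the
    # flat value array: i varies fastest, k slowest.
    return i + nx * (j + ny * k)

def stream_slabs(values, nx, ny, nz, slab_depth):
    # Yield spatially coherent slabs along k, the slowest-changing
    # direction. Each slab is one contiguous run of the value array, so no
    # pre-processing is needed to extract it.
    for k0 in range(0, nz, slab_depth):
        k1 = min(k0 + slab_depth, nz)
        yield values[flat_index(0, 0, k0, nx, ny):flat_index(0, 0, k1, nx, ny)]
```

Because each slab is contiguous, it can be read directly from disk with a single seek and read, which is what makes structured data so convenient to stream.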
Unstructured data are not as well-suited to streaming because of their unconstrained topology and geometry. For example, in the context of the VTK library, to know the location of any given cell, the program must access the cell's vertices by walking the cell's vertex index list. The vertices are stored in their own array, and the vertices for any cell may be scattered arbitrarily within it. Creating coherent portions is not trivial, since the cell and vertex arrays may not fit into memory, and because geometric information is spread across that memory layout. Once partitioned, the fact that a portion's bounding box is not known until after the portion has been entirely read can also be problematic.
Streaming and Out-of-Core Methods 205
Modern external memory and streaming formats for unstructured data solve the first problem by in-lining the vertex data with the cell data (also called normalized or triangle soup representations [6]), by keeping cells and vertices together in portions called meta-cells [5], or by intermixing the vertex and cell lists so that the vertices and the cells that reference them stay close together in the data stream [19]. The second problem is solved by keeping meta-information along with each portion.
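A minimal sketch of the in-lining idea, in Python with hypothetical names: an indexed mesh is "normalized" into a triangle soup, so that any contiguous run of the output stream is self-contained, at the cost of duplicating shared vertices:

```python
def to_triangle_soup(points, triangles):
    # "Normalize" an indexed mesh by in-lining each triangle's vertex
    # coordinates. Every entry of the result carries its own geometry, so
    # a portion of the soup can be processed without the point array.
    soup = []
    for (a, b, c) in triangles:
        soup.append((points[a], points[b], points[c]))
    return soup
```

The level of indirection from the cell list into the point list is gone, which is exactly what makes a portion of the stream independently processable.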
A notable exception to the rule that unstructured data is hard to stream
is the point set. It is an unstructured data type that consists only of a set of
locations in space without any connections between them. The fact that one
level of indirection (from cell list to point list) is removed greatly simplifies
the handling of large point sets. There are also many algorithms that work on
point sets that do not require spatially coherent regions. For those algorithms
that do, standard out-of-core sorting algorithms [35] can be used in a pre-processing step to produce regions that do not overlap.
Composite data sets, which consist of sets of structured or unstructured
data sets (i.e., blocks), can be straightforward to stream, as long as metadata
describing each block is available. In this situation, it is straightforward to process each block in turn.
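A block-by-block traversal of a composite data set might look like the following Python sketch (the metadata record layout and names are assumptions for illustration):

```python
def stream_composite(block_index, process):
    # Visit each block of a composite data set in turn. Only the small
    # metadata records are held in memory at once; each block is loaded
    # on demand through its metadata and released after processing.
    results = []
    for meta in block_index:
        block = meta["load"]()        # read one block using its metadata
        results.append(process(block))
    return results
```

In practice the metadata record would also carry bounds and data ranges, so blocks could be skipped entirely, as discussed under sparse traversal below.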
10.3.2 Repetition
In streaming visualization, the visualization pipeline has to support repe-
tition to iterate through all of the portions that make up the whole data set.
As Figure 10.3 illustrates, both push and pull pipeline models can support
streaming. In the push model, the system reads a portion at a time. The act
of reading each portion causes the pipeline to push it through toward the
rendering module. In the pull model, it is the rendering module that requests
each portion, in turn (see also 2.2.2).
Comparing the two, a push model is better suited to unbounded length
streams since the availability of new data can simply trigger the next cy-
cle. Push models also have less overhead in asynchronous streaming because
downstream modules can begin as soon as results are ready [36]. On the other
hand, push models are not as effective at sparse traversal because downstream modules cannot be as selective about what the pipeline processes [23].
With either approach, streaming visualization systems must aggregate re-
sults over time as each new result is produced. For example, with a scan-
converted rendering of opaque surfaces, the color and depth buffers are not
cleared in between portions, so that each pixel ends up with the color of the
nearest primitive out of the entire data set.
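The aggregation of opaque, scan-converted results described above can be sketched in Python; the buffers persist across calls, one call per portion (names and fragment layout are hypothetical):

```python
def render_portion(depth_buf, color_buf, fragments):
    # Aggregate scan-converted fragments WITHOUT clearing the buffers
    # between portions: each pixel keeps the color of the nearest opaque
    # primitive seen so far, across the entire data set.
    for pixel, depth, color in fragments:
        if depth < depth_buf[pixel]:
            depth_buf[pixel] = depth
            color_buf[pixel] = color
```

After the last portion has been rendered, the color buffer holds the same image a single-pass renderer would have produced, which is why opaque scan conversion streams so easily.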
10.3.3 Algorithms
Some, but not all, visualization algorithms are inherently well-suited for
streaming [32, 34, 11]. The defining factor is how much support an algorithm requires: local support or global support?

push:
  create_pipeline(source, ..., sink)
  while (!endofstream)
    source.readMore()

pull:
  create_pipeline(source, ..., sink)
  while (piece < #pieces)
    sink.render(piece++)

FIGURE 10.3: In push and pull pipelines, the source or sink directs the pipeline to process the next portion.

Localized algorithms, typified by per-pixel fragment programs, are well-suited because they analyze each point or cell independently and are thus insensitive to data partitioning and traversal order. Many extraction algorithms fall into this category.
Algorithms that require information from a local neighborhood of elements,
for example differential operators, are less well-suited as both the primary el-
ement and its neighbors must be resident together. This is typically solved by
using a ghost cell technique, where portions include a layer of cells belonging
to their neighbors.
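A one-dimensional Python sketch of the ghost cell technique (illustrative, not taken from any particular system): each portion carries one extra sample from each neighbor, so a 3-point central difference stencil can be evaluated locally:

```python
def stream_with_ghosts(data, chunk):
    # Yield portions padded with one ghost sample from each neighboring
    # portion, so a 3-point stencil can be evaluated locally. The offset
    # locates the first owned sample within the padded portion.
    n = len(data)
    for start in range(0, n, chunk):
        stop = min(start + chunk, n)
        lo, hi = max(start - 1, 0), min(stop + 1, n)
        yield start, stop, data[lo:hi], start - lo

def central_difference(data, chunk):
    # Apply a central difference per portion; boundary points are left 0.
    out = [0.0] * len(data)
    for start, stop, padded, off in stream_with_ghosts(data, chunk):
        for i in range(start, stop):
            if 0 < i < len(data) - 1:
                p = i - start + off
                out[i] = (padded[p + 1] - padded[p - 1]) / 2.0
    return out
```

Each portion writes results only for the samples it owns; the ghost samples exist solely to feed the stencil, so the streamed result matches an in-core computation regardless of the chunk size.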
Algorithms that need global support must be recast into a local problem to
work efficiently under a streaming paradigm [40]. Problems such as streamline
integration, which can suffer from load imbalances in a data parallel solution,
also tend to cause excessive passes through the data in a streaming solution.
In a distributed-memory data parallel context, in which the data fits in the
parallel machine’s aggregate memory, the data needed by one processor to
continue work is always present somewhere. An algorithm can resort to communication between processors to continue making forward progress.
In a streaming context, data dependencies are more critical, since the data
needed to continue work may not have been loaded yet, or it might have been processed and evicted from memory to make space for the current data.
This problem may require additional passes to solve. The problem of interde-
pendence between portions can sometimes be solved by carefully considering
the order in which portions are processed, such as in the parallel streamed
computation of ghost cells [16].
Rendering algorithms may be local or global in nature. Aggregating scan-converted or splat-rendered results, neglecting translucency, is handled simply, as described earlier. Volume rendering and translucent surface rendering are problematic for two reasons: first, as the interior of the data is visible, a larger fraction of it contributes to the image and fewer portions can be culled; and second, one must maintain a back-to-front (or front-to-back) traversal order [32]. Ray tracing and streamline calculations are problematic because of their unstructured memory access characteristics (see 14.3 and Chap. 6).
FIGURE 10.4: Culling unimportant data leaves only portion 2 when slicing
at X = 0. Likewise, lazy evaluation could operate on only the important cells
within the portion.
10.3.4 Sparse Traversal
In streamed visualization, where each pass through the data takes a signif-
icant amount of time and where many passes may be needed, achieving inter-
activity may require reducing the number of portions processed. Any region of
the data that can be quickly determined as being unimportant to the final vi-
sualization should be culled to avoid time-consuming processing [23, 7, 2]. If a
portion’s extreme data ranges do not encompass a contour module’s isovalue,
the entire portion can be ignored by the whole pipeline. Similarly, as shown
in Figure 10.4, a slice module could reject a portion by testing its bounding
box against a slicing plane. These lightweight metadata operations can be performed quickly and save time by skipping heavyweight module operations. At a finer granularity, expensive derived computations on cells within a passing portion may also be avoided.
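The two lightweight metadata tests just described can be sketched as simple interval checks in Python (names are illustrative):

```python
def survives_contour(scalar_range, isovalue):
    # Lightweight metadata test: a portion can hold part of the
    # isosurface only if its extreme data values bracket the isovalue.
    lo, hi = scalar_range
    return lo <= isovalue <= hi

def survives_slice(x_extent, plane_x):
    # Likewise, a slice at X = plane_x can only intersect a portion whose
    # bounding extent straddles the plane.
    lo, hi = x_extent
    return lo <= plane_x <= hi
```

Applied to the portions of Figure 10.4, with X extents (20, 60), (-20, 20), and (-60, -20), slicing at X = 0 leaves only portion 2; neither test ever touches the heavyweight data itself.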
Other lightweight computations, such as rendering optimization, can save
time as well. For example, portions that lie entirely outside the view frustum,
or that are occluded by other portions, should be skipped. In some situations, back-facing portions can be assumed to be occluded and rejected in the same way that back-facing polygons are rejected in rendering [28]. Level-of-detail techniques are also appropriate for reducing data and processing passes. With large data, there may be many more display primitives (polygons, voxels, etc.) than there are pixels, so only a subset of those primitives actually contributes [29, 37, 3]. During interaction, responsiveness
is more important than accuracy, so the level of detail can be further reduced.
In practice, the appropriate level of detail can be determined by projecting
the portion’s bounding boxes into screen-space; low-resolution data can be
obtained by downsampling either on-the-fly or in a pre-processing step.
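One possible heuristic, sketched here in Python as an assumption rather than a method from the text: choose a sampling stride so that a portion contributes roughly one sample per pixel of its projected screen-space extent:

```python
def choose_stride(samples_along_axis, projected_pixels):
    # Pick a strided-sampling rate so a portion delivers roughly one
    # sample per pixel it covers on screen; more pixels means finer data.
    if projected_pixels <= 0:
        return samples_along_axis   # off-screen or degenerate: coarsest
    return max(1, samples_along_axis // projected_pixels)
```

During interaction, the same function can simply be fed a smaller pixel budget to trade accuracy for responsiveness.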
[Figure 10.5 diagram: top pipeline: read -> slice (x=0) -> render; bottom pipeline: read -> translate (x'=x-40) -> slice (x'=0) -> render. Portion X extents: portion 1: (20, 60); portion 2: (-20, 20); portion 3: (-60, -20).]
FIGURE 10.5: To be usable for skipping portions, metadata must remain accurate without processing the data in the portion. The slice module in the top pipeline can trust the metadata from the source and will ignore portions 1 and 3. The translate module in the bottom pipeline alters the real data, so the slice module should instead ignore portions 2 and 3.
Sparse traversal adds the requirement that accurate metadata accompany the partitioned data. In a few cases, simulation codes store metadata along with the data. If metadata is not present, the data must be pre-processed, or the metadata can be derived during the first pass through the data, during which sparse traversal will not be possible.
Once the metadata is obtained, it must be managed through the pipeline [10]. The pipeline avoids processing chunks based on the contents of their metadata; a portion that is off-screen will be ignored. However, pipeline modules manipulate data and can thereby invalidate a portion's metadata. For example, a module that repositions data may move an off-screen portion back into view. Downstream modules cannot cull anything in that case, unless upstream modules update the metadata before processing the data. Figure 10.5 demonstrates another example. In many cases the metadata cannot be determined a priori. The system should fall back to marking a portion's metadata as "known to be clean" or "potentially dirty," and then cull portions based only on the clean information. Later, once the dirty information is known after data processing, it can be updated and marked clean.
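The clean/dirty bookkeeping can be sketched in Python (class and function names are hypothetical): a translate module updates extents cheaply and keeps them clean, and culling trusts only clean metadata:

```python
class PortionMeta:
    def __init__(self, x_extent):
        self.x_extent = x_extent   # (lo, hi) bounding extent along X
        self.clean = True          # metadata known to match the real data?

def translate_meta(meta, dx):
    # A translate module can update the extent without touching the
    # heavyweight data, so the metadata stays clean. A module that cannot
    # update it cheaply would instead set meta.clean = False.
    lo, hi = meta.x_extent
    meta.x_extent = (lo + dx, hi + dx)
    return meta

def cull_portions(metas, plane_x):
    # Cull on clean metadata only; potentially dirty portions must pass
    # through and be re-examined after their data has been processed.
    keep = []
    for i, m in enumerate(metas):
        if m.clean and not (m.x_extent[0] <= plane_x <= m.x_extent[1]):
            continue
        keep.append(i)
    return keep
```

With the extents of Figure 10.5 and a slice at X = 0, the untranslated portions cull down to portion 2; after translating by -40, portion 1 survives instead, and a dirty portion always passes through.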
Lastly, the output of the visualization pipeline tends to be much smaller
than the whole data set. Although the input data may be very large, the
salient features within it tend to be small. Some salient features are: the part
of the data within a particularly important region; the output of slices, isocontours, and other dimensionality-reducing techniques; descriptive statistical metrics;
locations of extremal values; the presence or absence of particular events; etc.
Because the features are small, it is often possible to cache results [23, 10] in
the available memory, in order to avoid redundant processing while the user
interacts with previously generated results. Figure 10.6 illustrates an example