Chapter 4
Rendering
Charles Hansen
University of Utah
E. Wes Bethel
Lawrence Berkeley National Laboratory
Thiago Ize
University of Utah
Carson Brownlee
University of Utah
4.1 Introduction
4.2 Rendering Taxonomy
4.3 Rendering Geometry
4.4 Volume Rendering
4.5 Real-Time Ray Tracer for Visualization on a Cluster
    4.5.1 Load Balancing
    4.5.2 Display Process
    4.5.3 Distributed Cache Ray Tracing
        4.5.3.1 DC BVH
        4.5.3.2 DC Primitives
    4.5.4 Results
    4.5.5 Maximum Frame Rate
4.6 Conclusion
References
4.1 Introduction
Rendering, the process of creating an image from the data as the last step before
display in the visualization pipeline, is a key aspect of all high performance
visualization. As shown in Figure 1.1, the rendering process takes data resulting
from visualization algorithms and generates an image for display. The figure
highlights that data is typically filtered, transformed, subsetted, and then mapped
to renderable geometry before rendering takes place. There are two forms of
rendering typically used in visualization, based
on the underlying mapped data: rendering of geometry generated through
visualization mapping algorithms (see Fig. 1.1) and direct rendering from the
data. For geometric rendering, typical high performance visualization packages
use the OpenGL library, which converts the geometry generated by visualization
mapping algorithms to colored pixels through rasterization [36]. Another
method for generating images from geometry is ray tracing, where rays from
the viewpoint through the image pixels are intersected with the geometry to
form the colored pixels [35]. Direct rendering does not require an intermediate
mapping of data to geometry; rather, it generates the colored pixels directly
through a mapping that involves sampling the data and, for each sample,
mapping the data value to renderable quantities (color and transparency). Direct
volume rendering is a common technique for rendering scalar fields directly
into images.
The rest of this chapter discusses a rendering taxonomy that is widely
used in high performance visualization systems. Geometric rendering is then
presented, with examples of both rasterization and ray tracing solutions, and
direct volume rendering is introduced. Finally, an example of a geometric
rendering system that uses ray tracing on a commodity cluster is discussed.
4.2 Rendering Taxonomy
In 1994, Molnar et al. described a taxonomy of parallel rasterization graph-
ics hardware that has become the basis for most parallel implementations—
both hardware and software based—and is widely used in high performance
visualization systems [25]. While the taxonomy describes different forms of
graphics hardware, the generalization of the taxonomy provides the basis for
most software-based and GPU-based parallel rendering systems for both ras-
terization rendering and direct volume rendering.
The taxonomy describes three methods for parallelizing graphics hardware
that performs rasterization, based on when the assignment to image space
(sometimes called screen space) takes place. If one considers the rasterization
process as consisting of two basic steps, geometry processing and rasterization
(the generation of pixels from the geometry), then the sort (the assignment of
data, geometry, or pixels to image space) can occur at three points.
As shown on the left of Figure 4.1, geometric primitives can be assigned
to screen space before geometry processing takes place; this is called sort-first.
Since geometry processing involves the transformation of the geometry
to image space, this requires a priori information about the geometry and the
mapping to image space. Such a priori information can be obtained from pre-
viously rendered frames, heuristics, or by simply replicating the data among
all parallel processes. The image space is divided into multiple tiles, each of
which can be processed in parallel. Since the geometry is assigned to the
appropriate image space tile, geometry processing, followed by rasterization,
generates the final colored pixels of the image. This process can be conducted
in parallel by each of the processors responsible for a particular image tile. The
advantage of sort-first methods is that image compositing is not required, since
subregions of the final image are defined uniquely. The disadvantage is that,
without data replication or heuristics, the assignment of data to the appro-
priate image space subregion is difficult. Sort-first methods typically leverage
frame-to-frame coherency, which is often readily available in interactive
applications.
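To make the tile assignment concrete, the following sketch (a minimal, hypothetical example, not taken from any particular system) computes which image-space tiles a triangle overlaps from its estimated screen-space bounding box; the image size, the 2x2 tile layout, and all names are illustrative assumptions.

// Minimal sort-first sketch: assign a triangle to the image-space tiles that its
// estimated screen-space bounding box overlaps. The screen-space positions would
// come from a previous frame, a heuristic, or replicated data; the 2x2 tile
// layout and all names here are illustrative only.
#include <algorithm>
#include <array>
#include <cstdio>
#include <vector>

struct Vec2 { float x, y; };                      // estimated screen-space vertex
struct Triangle { std::array<Vec2, 3> v; };

constexpr int kImageW = 1024, kImageH = 1024;
constexpr int kTilesX = 2, kTilesY = 2;           // one tile per rendering process

// Return the indices of the tiles overlapped by the triangle's screen bounds.
std::vector<int> tilesForTriangle(const Triangle& t) {
    float minX = std::min({t.v[0].x, t.v[1].x, t.v[2].x});
    float maxX = std::max({t.v[0].x, t.v[1].x, t.v[2].x});
    float minY = std::min({t.v[0].y, t.v[1].y, t.v[2].y});
    float maxY = std::max({t.v[0].y, t.v[1].y, t.v[2].y});

    const float tileW = float(kImageW) / kTilesX;
    const float tileH = float(kImageH) / kTilesY;
    int tx0 = std::clamp(int(minX / tileW), 0, kTilesX - 1);
    int tx1 = std::clamp(int(maxX / tileW), 0, kTilesX - 1);
    int ty0 = std::clamp(int(minY / tileH), 0, kTilesY - 1);
    int ty1 = std::clamp(int(maxY / tileH), 0, kTilesY - 1);

    std::vector<int> tiles;
    for (int ty = ty0; ty <= ty1; ++ty)
        for (int tx = tx0; tx <= tx1; ++tx)
            tiles.push_back(ty * kTilesX + tx);   // a triangle may land in several tiles
    return tiles;
}

int main() {
    Triangle t{ { Vec2{100, 100}, Vec2{600, 120}, Vec2{300, 400} } };
    for (int tile : tilesForTriangle(t))
        std::printf("triangle -> tile %d\n", tile);
}

A triangle whose bounds span a tile border is sent to every tile it touches, which is why sort-first needs no compositing: each tile's processor ends up with all the geometry that can affect its pixels.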
FIGURE 4.1: The rendering taxonomy based on Molnar’s description. The
leftmost diagram is sort-first, the center is sort-middle, and the rightmost is sort-last.
Sort-middle architectures perform the sort to image space after geometry
processing, but before rasterization. As in sort-first, the image is divided into
tiles and each rasterization processor is responsible for a particular image tile.
Geometry is processed in parallel by multiple geometry processors, which
obtain geometry in some manner, such as round-robin distribution. One of
the steps of geometry processing is the transformation of the geometry to
image space. The image space geometry is then sorted and sent to the appropriate
rasterization processor responsible for the image space partition covered by the
processed geometry. While sort-middle is common in graphics hardware, high
performance visualization systems typically use either a sort-first or a sort-last
methodology for parallel rendering.
The rightmost image in Figure 4.1 shows the sort-last technique. In this
method, geometry is distributed to geometry processors in some manner, such
as round-robin distribution. Each processor, in parallel, transforms the geometry
to image space and then rasterizes the geometry into a local image. The
rasterizer generates the image pixels. Note that, depending on the geometry,
the pixels may cover the entire image or, more typically, a small portion of the
image. After all the geometry has been processed and rasterized, the resulting
partial images are combined through compositing, as described in Chapter 5.
The advantage of the sort-last method is that renderable entities (geometry
or data) can be partitioned among the parallel processors and the final image
is constructed through parallel compositing. The disadvantage is that the
compositing step can dominate the image generation time. Sort-last methods
are the most common in high performance visualization and are part
of the VisIt and ParaView distributions.
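As a rough illustration of the compositing step, the following minimal, serial sketch merges partial images by keeping, for each pixel, the fragment nearest to the viewer. The names and image size are illustrative assumptions, the example assumes opaque geometry, and a production system performs this reduction in parallel, as described in Chapter 5.

// Minimal sort-last depth-compositing sketch (serial, illustrative only): each
// process rasterizes its share of the geometry into a local color + depth image,
// and the final image keeps, per pixel, the fragment closest to the viewer.
// A real system performs this reduction in parallel (see Chapter 5).
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

struct Pixel {
    std::uint32_t rgba = 0;                                  // packed color, background black
    float depth = std::numeric_limits<float>::max();         // background is infinitely far
};

using Image = std::vector<Pixel>;                            // width * height pixels

// Composite 'src' into 'dst' in place: nearer fragments win (opaque geometry).
void depthComposite(Image& dst, const Image& src) {
    for (std::size_t i = 0; i < dst.size(); ++i)
        if (src[i].depth < dst[i].depth)
            dst[i] = src[i];
}

int main() {
    const std::size_t numPixels = 640 * 480;
    Image composited(numPixels), partialA(numPixels), partialB(numPixels);
    // ... each process would fill its partial image by rasterizing its geometry ...
    depthComposite(composited, partialA);
    depthComposite(composited, partialB);   // order does not matter for opaque z-compositing
}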
4.3 Rendering Geometry
As described above, one method for visualization involves mapping the
data to geometry, such as an isosurface, or to an intermediate representation,
such as spheres, and then rendering that geometry into an image for display.
As described in Chapter 3, the send-geometry method generates such geometry
for rendering on a client. The rendering can be performed either serially
or in parallel, and either in software or, more typically, on GPU hardware.
The most common rasterization library for performing this task is the
OpenGL library. OpenGL is an industry-standard graphics API, with
implementations supporting GPU, or hardware-accelerated, rendering. There
is a long history of research in parallel and distributed geometry rasterization,
focusing on both early massively parallel processors (MPPs) and clusters
of PCs.
Crockett and Orloff [9] describe a sort-middle parallel rendering system
for message-passing machines. They developed a formal model to measure the
performance and they were one of the first to use MPPs for parallel rendering.
In comparison, Ortega et al. [30] introduce a data-parallel geometry rendering
system using sort-last compositing. This implementation allowed integrated
geometry extraction and rendering on the CM5 and was targeted at applications
that required extremely fast rendering of extremely large sets of polygons.
The rendering toolkit enables the display of 3D shaded polygons
directly from a parallel machine, avoiding the transmission of huge amounts
of geometry data to a post-processing rendering system over the then-slow
networks. Krogh et al. [20] present a parallel rendering system for molecular
dynamics applications, where the massive particle system is represented as
spheres. The spheres are rendered with depth-enhanced sprites (screen-aligned
textures) and the results are composited with sort-last compositing. This
system avoids rasterization through the use of sprites.
Correa et al. [7] present a technique for the out-of-core rendering of large
models on cluster-based tiled displays using sort-first. In their study, a hierarchical
model is generated in a pre-process and, at runtime, each node renders an
image tile in a multi-threaded manner, overlapping rendering, visibility
computation, and disk operations. Using 16 nodes, they were able to match
the performance of the high-end graphics workstations of the time.
Samanta et al. [34] introduce k-way replication for sort-first rendering, in which
geometry is replicated on only k out of n nodes, with k much smaller than
n. This small replication factor avoids the full data replication typically
required for sort-first rendering. It also supports rendering large-scale geometry,
where the geometry can be larger than the memory capacity of an individual
node. It simultaneously reduces communication overheads by leveraging
frame-to-frame coherence while allowing a dynamic, view-dependent partitioning
of the data. The study found that the parallel rendering efficiencies achieved
with small replication factors were similar to those measured with full
replication. VTK and VisIt use the Mesa OpenGL library, which implements
the OpenGL API in software, to enable parallel rendering through sort-last
compositing (see Chap. 17 for more information). Rendering with either VisIt
or ParaView scales well on GPU clusters using hardware rendering and sort-last
compositing, but it is bounded by the compositing time for the image
display. When using Mesa, the rendering is neither scalable nor interactive, due to
the rasterization speed; however, replacing rasterization with ray tracing can
improve the scalability of software rendering [4].
An alternative to rasterization for high performance visualization is to
render images using ray tracing. Ray tracing refers to tracing rays from the
viewpoint, through pixels, and intersecting the rays with the geometry to
be rendered. There have been several parallel ray tracing implementations for
high performance visualization. Parker et al. [31] implement a shared-memory
ray-tracer, RTRT, which performs interactive isosurface generation and ren-
dering on large shared-memory computers, such as the SGI Origin. This proves
to be effective for large data sets, as the data and geometry are accessible by
all processors. Bigler et al. [2] demonstrate the effectiveness of this system
on large-scale scientific data. They extend RTRT with various methods to
visualize particle data from simulations using the material point method,
represented as spheres. They also describe two methods for augmenting the
visualization: silhouette edges and advanced illumination, such as ambient
occlusion. Section 4.5 details how to accelerate ray tracing for large amounts
of geometry on distributed-memory platforms, such as commodity clusters.
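The sketch below illustrates the two basic ray tracing operations named above: generating a primary ray through one pixel of a simple pinhole camera and intersecting it with a sphere, the primitive used for particle data in the systems just discussed. It is a minimal, hypothetical example whose camera model, field of view, and names are assumptions; production ray tracers add acceleration structures such as BVHs, shading, and parallelism.

// Minimal ray tracing sketch: generate a primary ray through one pixel of a
// simple pinhole camera and intersect it with a sphere. The camera model and
// all names are illustrative assumptions; production ray tracers add
// acceleration structures, shading, and parallelism.
#include <cmath>
#include <cstdio>

struct Vec3 {
    float x, y, z;
    Vec3 operator-(const Vec3& o) const { return {x - o.x, y - o.y, z - o.z}; }
    Vec3 operator*(float s) const { return {x * s, y * s, z * s}; }
};
float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 normalize(const Vec3& v) { return v * (1.0f / std::sqrt(dot(v, v))); }

struct Ray    { Vec3 origin, dir; };
struct Sphere { Vec3 center; float radius; };

// Primary ray through pixel (px, py) of a width x height image; the camera sits
// at the origin looking down -z with a 90 degree vertical field of view.
Ray primaryRay(int px, int py, int width, int height) {
    float u = (px + 0.5f) / width * 2.0f - 1.0f;
    float v = 1.0f - (py + 0.5f) / height * 2.0f;
    float aspect = float(width) / height;
    return { {0, 0, 0}, normalize({u * aspect, v, -1.0f}) };
}

// Return the nearest positive hit distance, or a negative value on a miss.
float intersect(const Ray& r, const Sphere& s) {
    Vec3 oc = r.origin - s.center;
    float b = dot(oc, r.dir);                    // dir is unit length, so a == 1
    float c = dot(oc, oc) - s.radius * s.radius;
    float disc = b * b - c;
    return disc < 0.0f ? -1.0f : -b - std::sqrt(disc);
}

int main() {
    Sphere s{{0, 0, -5}, 1.0f};
    Ray r = primaryRay(256, 256, 512, 512);      // ray through the image center
    float t = intersect(r, s);
    std::printf(t > 0 ? "hit at t = %f\n" : "miss (t = %f)\n", t);
}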
4.4 Volume Rendering
Direct volume rendering methods generate images of a 3D volumetric data
set, without explicitly extracting geometry from the data [21, 12]. Conceptu-
ally, volume rendering proceeds by casting rays from the viewpoint through
pixel centers as shown in Figure 4.2. These rays are sampled where they inter-
sect the data volume. Volume rendering uses an optical model to map sampled
data values to optical properties, such as color and opacity [24]. During ren-
dering, optical properties are accumulated along each viewing ray to form an
image of the data, as shown in Figure 4.3. Although the data set is interpreted
as a continuous function in space, for practical purposes it is represented as
a discrete 3D scalar field. The samples are typically obtained by trilinearly
interpolating the values of the 3D scalar field. On a GPU, the trilinear interpolation is
performed in hardware and, therefore, is quite fast (see Chap. 11). Samples
along a ray can be determined analytically, as in ray marching [35], or can be
generated through proxy geometry on a GPU [14].
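To connect these pieces, the following sketch marches a single ray through a regular scalar grid, trilinearly interpolating each sample, mapping it through a placeholder transfer function, and accumulating color and opacity front to back with the over operator. The grid layout, transfer function, and step size are illustrative assumptions rather than any particular system's implementation.

// Minimal volume ray casting sketch: step along a ray through a regular scalar
// grid, trilinearly interpolate a sample, map it through a (placeholder) transfer
// function, and accumulate color and opacity front to back. Grid layout, transfer
// function, and ray setup are illustrative assumptions, not a specific system's API.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

struct Vec3 { float x, y, z; };
struct RGBA { float r, g, b, a; };

struct Volume {
    int nx, ny, nz;
    std::vector<float> data;                           // nx*ny*nz scalars
    float at(int i, int j, int k) const {
        i = std::clamp(i, 0, nx - 1); j = std::clamp(j, 0, ny - 1); k = std::clamp(k, 0, nz - 1);
        return data[(k * ny + j) * nx + i];
    }
    // Trilinear interpolation at a continuous grid position p.
    float sample(const Vec3& p) const {
        int i = int(std::floor(p.x)), j = int(std::floor(p.y)), k = int(std::floor(p.z));
        float fx = p.x - i, fy = p.y - j, fz = p.z - k;
        float c00 = at(i, j, k)     * (1 - fx) + at(i + 1, j, k)     * fx;
        float c10 = at(i, j + 1, k) * (1 - fx) + at(i + 1, j + 1, k) * fx;
        float c01 = at(i, j, k + 1) * (1 - fx) + at(i + 1, j, k + 1) * fx;
        float c11 = at(i, j + 1, k + 1) * (1 - fx) + at(i + 1, j + 1, k + 1) * fx;
        float c0 = c00 * (1 - fy) + c10 * fy;
        float c1 = c01 * (1 - fy) + c11 * fy;
        return c0 * (1 - fz) + c1 * fz;
    }
};

// Placeholder transfer function: grayscale color, opacity proportional to value.
RGBA transfer(float v) { return {v, v, v, 0.05f * v}; }

// March a single ray front to back, accumulating color and opacity.
RGBA castRay(const Volume& vol, Vec3 pos, const Vec3& dir, float stepSize, int numSteps) {
    RGBA acc{0, 0, 0, 0};
    for (int s = 0; s < numSteps && acc.a < 0.99f; ++s) {
        RGBA c = transfer(vol.sample(pos));
        float w = (1 - acc.a) * c.a;                   // front-to-back "over" operator
        acc.r += w * c.r; acc.g += w * c.g; acc.b += w * c.b; acc.a += w;
        pos = {pos.x + dir.x * stepSize, pos.y + dir.y * stepSize, pos.z + dir.z * stepSize};
    }
    return acc;
}

int main() {
    Volume vol{32, 32, 32, std::vector<float>(32 * 32 * 32, 0.5f)};
    RGBA p = castRay(vol, {0, 16, 16}, {1, 0, 0}, 0.5f, 64);   // one ray along +x
    std::printf("accumulated rgba = %f %f %f %f\n", p.r, p.g, p.b, p.a);
}

Front-to-back accumulation also allows early ray termination: once the accumulated opacity approaches one, further samples cannot change the pixel, so the loop can stop, as the sketch does.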