counts [10]. This chapter primarily describes the techniques IceT uses to ef-
fectively render to large-format displays.
17.3.1 Theoretical Limitations ... and How to Break Them
There are a number of theoretical metrics with important practical conse-
quences. These include the number of pixel-blending operations performed by
each process (which affects the total time computing), the number of pixels
sent or received by each process (which affects how long it takes to trans-
fer data), the number of messages sent at any one time (which can affect
network congestion), and the number of sequential messages sent (which can
accumulate the effect of the network latency).
With respect to IceT’s performance on large images, the most important of
these metrics are the number of pixel-blending operations and the number of
pixels sent and received. It is easy to show, for example, that the binary-swap
algorithm is optimal on both counts. The binary-swap algorithm is described
in Section 5.2.2 as well as in previous studies [3, 4].
Binary-swap is a divide-and-conquer algorithm that operates in rounds: in
each round, processes pair up, swap halves of their current image partitions,
and recurse on the half they keep. Given p
processes compositing an image of n pixels, there must be at least (p −1) · n
blending operations (because it takes p −1 operations to blend the p versions
of each pixel generated by all the processes). A perfectly balanced parallel
algorithm will blend $\frac{(p-1) \cdot n}{p}$ pixels in each process.
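The halving of image ownership can be sketched as a toy model (illustrative Python, not IceT's actual implementation): with p a power of two, each process ends up owning a disjoint range of n/p contiguous pixels after log2(p) rounds.

```python
def binary_swap_ranges(p, n):
    """Toy model of binary-swap ownership for p power-of-two processes
    compositing an image of n pixels. Returns each rank's final
    (start, end) pixel range and the pixels it blended over all rounds."""
    rounds = p.bit_length() - 1            # log2(p)
    ranges = [(0, n)] * p                  # every rank starts with the whole image
    blended = [0] * p
    for i in range(1, rounds + 1):
        dist = p >> i                      # pairing distance in round i
        nxt = []
        for rank in range(p):
            lo, hi = ranges[rank]
            mid = (lo + hi) // 2
            # Each rank keeps one half, sends the other to its partner,
            # and blends the partner's copy of the kept half: n/2**i ops.
            kept = (mid, hi) if rank & dist else (lo, mid)
            blended[rank] += kept[1] - kept[0]
            nxt.append(kept)
        ranges = nxt
    return ranges, blended

ranges, blended = binary_swap_ranges(8, 1024)
# Final ranges are disjoint n/p-pixel pieces; every rank blended (p-1)*n/p pixels.
```

Running the model for p = 8 and n = 1024 confirms the counts derived below: each rank blends 896 = 7 · 1024/8 pixels.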
The binary-swap algorithm has $\log_2 p$ rounds, with each round blending
$n/2^i$ pixels in each process, where $i$ is the round index starting at 1.
The total number of blending operations performed in each process is therefore
$$
\sum_{i=1}^{\log_2 p} \frac{n}{2^i}
\;=\; n - \frac{n}{2^{\log_2 p}}
\;=\; n - \frac{n}{p}
\;=\; \frac{(p-1) \cdot n}{p},
\tag{17.1}
$$
which is, as previously mentioned, optimal. Likewise, binary-swap transfers
an optimal number of pixels. Radix-k [13], which is also supported in
IceT, is similarly optimal with respect to pixel blending and pixel transfer.
In addition, radix-k can also reduce the number of rounds as well as overlap
computation and communication [2].
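The telescoping sum in Equation 17.1 generalizes to radix-k: in a round of radix k over a span of s pixels, each process keeps s/k pixels and blends the k−1 incoming copies of that piece, so any factoring of p yields the same optimal blend total. A quick numerical check (an illustrative sketch, not IceT code):

```python
from math import prod

def blends_per_process(factoring, n):
    """Pixels blended per process when p = prod(factoring) processes run
    a radix-k composite with the given per-round radices. A round of
    radix k over a span of s pixels costs (k-1)*s/k blend operations."""
    span, total = n, 0
    for k in factoring:
        total += (k - 1) * span // k
        span //= k
    return total

n = 1 << 20
p = 64
# Binary-swap is radix-k with every radix equal to 2 (6 rounds for p = 64);
# radix-8 needs only 2 rounds yet blends the same optimal total.
assert blends_per_process([2] * 6, n) == (p - 1) * n // p
assert blends_per_process([8, 8], n) == (p - 1) * n // p
```

The fewer rounds of a larger radix reduce the accumulated latency term while leaving the blend and transfer totals at the optimum.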
Although this theoretical, optimal solution still grows linearly with respect
to the resolution of the image, it is possible, in practice, to perform much
better. The previous analysis makes a critical assumption: that every pixel
generated by every process contains useful data. Such a worst case is
possible, but in practice, many pixels can be ignored.
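For instance, blank pixels can be skipped with a simple run-length encoding. The sketch below (a hypothetical helper, not IceT's actual compression format) records only the runs of non-blank pixels, so blank spans are neither transferred nor blended:

```python
def rle_active(pixels, blank=0):
    """Compress a pixel list to (offset, values) runs of non-blank pixels."""
    runs, i, n = [], 0, len(pixels)
    while i < n:
        while i < n and pixels[i] == blank:   # skip a blank span
            i += 1
        start = i
        while i < n and pixels[i] != blank:   # collect an active run
            i += 1
        if i > start:
            runs.append((start, pixels[start:i]))
    return runs

img = [0, 0, 5, 7, 0, 0, 0, 9, 0]
runs = rle_active(img)
# Only the 3 active pixels (of 9) need to be transferred or blended.
active = sum(len(values) for _, values in runs)
```

When most of a process's viewport is empty, as in the rendering example discussed next, the active-pixel count, and hence the real cost, can be far below the theoretical bound.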
Consider the example of parallel rendering shown in Figure 17.1. Each
process renders a localized cube of data surrounded by an abundance of blank
space. Although the example in Figure 17.1 may seem artificial, this case is
actually quite common. Sort-last volume rendering requires the geometry to