18.3. Processing Geometry into Pixels 455
you is that you essentially have a multi-processor machine. This turns out to be
a good way to think about your graphics hardware, since it means that you may
be able to use the graphics hardware processor to relieve the load on the CPU in
some of your applications. The graphics hardware processors are often referred
to as GPUs. GPU stands for graphics processing unit and highlights the fact
that graphics hardware components now contain a separate processor dedicated
to graphics-related computations.
Historical: Programming the pipeline is not entirely new. Among the first
graphics hardware architectures designed for programming flexibility were the
PixelFlow architectures and shading languages from UNC (Molnar et al., 1992;
Lastra et al., 1995; Olano & Lastra, 1998). Additional efforts to provide custom
shading techniques have included shade trees (Cook, 1984), RenderMan (Pixar,
2000), accelerated multi-pass rendering using OpenGL™ (Peercy et al., 2000),
and other real-time shading languages (Proudfoot et al., 2001; McCool et al.,
2004).
Interestingly, modern GPUs contain more transistors than modern CPUs. For
the time being, GPUs use most of these transistors for computation and
fewer for memory or cache management operations. However, this will not
always be the case as graphics hardware continues to advance. And just because
the computations are geared towards 3D graphics does not mean that you cannot
perform computations unrelated to computer graphics on the GPU. The manner in
which the GPU is programmed differs from that of your general-purpose CPU and
will require a slightly modified way of thinking about how to solve problems
and program the graphics hardware.
The GPU is a stream processor that excels at 3D vector operations such as
vector multiplication, vector addition, dot products, and other operations neces-
sary for basic lighting of surfaces and texture mapping. As stream processors,
both the vertex and fragment processing components include the ability to pro-
cess multiple primitives at the same time. In this regard, the GPU acts as a SIMD
(Single Instruction, Multiple Data) processor, and in certain hardware implemen-
tations of the fragment processor, up to 16 pixels can be processed at a time.
When you write programs for these processing components, it will be helpful, at
least conceptually, to think of the computations being performed concurrently on
your data. In other words, the vertex shader program will run for all vertices at
the same time. The vertex computations will then be followed by a stage in which
your fragment shader program will execute simultaneously on all fragments. It
is important to note that while the computations on vertices or fragments occur
concurrently, the pipeline stages still execute in the same order.
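To make the staged, data-parallel model concrete, here is a minimal CPU-side sketch in Python. It is not shader code and does not run on the GPU. The names `vertex_program`, `rasterize_points`, `fragment_program`, and `run_pipeline` are hypothetical, and the "rasterizer" simply emits one fragment per point. The sketch only shows that each stage applies one program independently to every element, and that the stages run in a fixed order.

```python
# CPU-side emulation only: real shader programs run on the GPU.

def vertex_program(position):
    """Hypothetical vertex program: translate a 2D point by (1, 1)."""
    x, y = position
    return (x + 1.0, y + 1.0)


def rasterize_points(positions):
    """Toy rasterizer: emit one full-intensity fragment per transformed point."""
    return [{"pixel": (int(x), int(y)), "intensity": 1.0} for x, y in positions]


def fragment_program(fragment):
    """Hypothetical fragment program: map intensity to an 8-bit gray color."""
    level = int(fragment["intensity"] * 255)
    return (level, level, level)


def run_pipeline(vertices):
    # Stage 1: the same vertex program is applied to every vertex. No vertex
    # depends on another, which is what lets hardware process them in parallel.
    transformed = [vertex_program(v) for v in vertices]
    # Stage 2 consumes the complete output of stage 1.
    fragments = rasterize_points(transformed)
    # Stage 3: the same fragment program is applied to every fragment.
    colors = [fragment_program(f) for f in fragments]
    return transformed, colors
```

Because neither program reads any element other than its own input, the list comprehensions could be replaced by any parallel map without changing the result.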
The manner in which vertex and fragment shaders work is simple. You write
a vertex shader program and a fragment shader program and send them to the
graphics hardware. These programs can be used on specific geometry, and when
your geometry is processed, the vertex shader is used to transform and light
the vertices, while the fragment shader performs the final shading of the
geometry on a per-pixel basis. Just as you can texture map different images
onto different pieces of geometry, you can also write different shader programs
to act upon different objects in your application. Shader programs are a part
of the graphics state, so you do need to be concerned with how your shader
programs might get swapped in and out based on the geometry being rendered.
456 18. Using Graphics Hardware
The details tend to be a bit more complicated, however. Vertex shaders usually
perform two basic actions: set the color at the vertex and transform the vertex into
screen coordinates by multiplying the vertex by the modelview and projection
matrices. The perspective divide and clipping steps are not performed in a vertex
program. Vertex shaders are also often used to set the stage for a fragment shader.
In particular, you may have vertex attributes, such as texture coordinates or
other application-dependent data, that the vertex shader calculates or modifies
and then sends to the fragment processing stage for use in your fragment
shader. It may seem strange at first, but vertex shaders can also be used to
manipulate the positions of the vertices. This is useful, for example, for
generating simulated ocean wave motion entirely on the GPU.
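The two basic vertex-shader actions described above, plus a position displacement, can be sketched on the CPU as follows. This is Python emulation rather than shader code. The function names, the sine-wave form, and the `amplitude` and `frequency` parameters are illustrative assumptions, not something the text prescribes.

```python
import math


def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def wave_vertex_program(position, modelview, projection, time,
                        amplitude=0.1, frequency=2.0):
    """Hypothetical vertex program: displace the height, then transform."""
    x, y, z, w = position
    # Manipulate the vertex position: a sine wave travelling along x
    # (an assumed, very simple wave model).
    y = y + amplitude * math.sin(frequency * x + time)
    # Transform by the modelview matrix, then by the projection matrix.
    eye = mat_vec(modelview, [x, y, z, w])
    clip = mat_vec(projection, eye)
    # The perspective divide and clipping are NOT done here; they happen
    # later in the pipeline, outside the vertex program.
    return clip
```

For example, with identity matrices, a vertex at the origin, and `time = math.pi / 2`, the returned position has its height raised by approximately the full amplitude.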
A fragment shader is required to output the fragment color. This may involve
looking up texture values and combining them in some manner with values
obtained by performing a lighting calculation at each pixel, or it may involve
discarding the fragment so that it is not drawn at all. Because the fragment
shader operates at the fragment level, the real power of the programmable
graphics hardware is in the fragment shader. This added processing power
represents one of the key differences between the fixed function pipeline and
the programmable pipeline. In the fixed pipeline, fragment processing used
illumination values interpolated between the vertices of the triangle to
compute the fragment color. With the programmable pipeline, the color at each
fragment can be computed independently. For instance, in the example situation
posed in Figure 18.4, Gouraud shading of a triangle face fails to produce a
reasonable solution because lighting is evaluated only at the vertices, which
are farther away from the light than the center of the triangle. In a fragment
shader, the lighting equation can be evaluated at each fragment, rather than at
each vertex, resulting in a more accurate rendering of the face.
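The difference can be checked numerically with a small CPU-side Python sketch. Figure 18.4 is not reproduced here, so the triangle, the light position, and the purely diffuse lighting term below are assumptions chosen to mimic the situation: a light hovering just above the center of a large triangle.

```python
import math


def diffuse(point, normal, light_pos):
    """Diffuse term max(0, N.L) for a point light; normal must be unit length."""
    to_light = [l - p for l, p in zip(light_pos, point)]
    length = math.sqrt(sum(c * c for c in to_light))
    n_dot_l = sum(n * c / length for n, c in zip(normal, to_light))
    return max(0.0, n_dot_l)


# Hypothetical scene: a triangle in the z = 0 plane, light one unit above
# its centroid.
vertices = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (0.0, 6.0, 0.0)]
normal = (0.0, 0.0, 1.0)
centroid = tuple(sum(v[i] for v in vertices) / 3.0 for i in range(3))
light = (centroid[0], centroid[1], 1.0)

# Gouraud-style: light the vertices, then interpolate to the centroid
# (barycentric weights are 1/3 each at the centroid).
vertex_intensities = [diffuse(v, normal, light) for v in vertices]
gouraud_at_center = sum(vertex_intensities) / 3.0

# Per-fragment: evaluate the lighting equation at the centroid itself.
per_fragment_at_center = diffuse(centroid, normal, light)
```

In this setup the interpolated value at the center is well under half of the per-fragment value, because every vertex sees the light at a grazing angle while the center faces it directly.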
18.3.2 Basic Execution Model
When writing vertex or fragment shaders, there are a few important things to
understand in terms of how vertex and fragment programs execute and access data
on the GPU. Because these programs run entirely on the GPU, the first details
you will need to figure out are which data your shaders will use and how to get
that data to them. There are several characteristics associated with the data
types used in shader programs. The following terms, which come primarily from
the OpenGL™ Shading Language framework, are used to describe the conceptual
aspects of these data characteristics. The concepts are the same across
different shading language frameworks. In the shaders you write, variables are
characterized using one of the following terms:
attributes. Attribute variables represent data that changes frequently, often
on a per-vertex basis. Attribute variables are often tied to the changing
graphics state associated with each vertex. For instance, normal vectors or
texture coordinates are considered to be attribute data since they are part of
the graphics state associated with each vertex.
uniforms. Uniform variables represent data that cannot change during the
execution of a shader program. However, uniform variables can be modified
by your application between executions of a shader. This provides
another way for your application to communicate data to a shader. Uniform
data often represent the graphics state associated with an application. For
instance, the modelview and projection matrices can be accessed through
uniform variables. Information about light sources in your application can
also be obtained through uniform variables. In these examples, the data
does not change while the shader is executing, but could change (e.g., the
light could move) prior to the next iteration of the application.
varying. Varying data is used to pass data between a vertex shader and
a fragment shader. The data is considered varying because it is written by
vertex shaders on a per-vertex basis, but read by fragment shaders as values
interpolated across the face of the primitive between neighboring vertices.
Variables defined using one of these three characteristics can either be
built-in variables or user-defined variables. In addition to accessing the
built-in graphics state, attribute and uniform variables are one of the ways to
communicate user-defined data to your vertex and fragment programs. Varying
data is the only means to pass data from a vertex shader to a fragment shader.
Figure 18.6 illustrates the basic execution of the vertex and fragment
processors in terms of the inputs and outputs used by the shaders.
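The roles of the three kinds of variables can be sketched with a CPU-side Python emulation. This is not GLSL, and the function names, the brightness uniform, and the use of barycentric weights to stand in for the hardware interpolator are all illustrative assumptions.

```python
def vertex_program(attribute_color, uniform_brightness):
    """Reads a per-vertex attribute and a uniform; writes a varying."""
    return [uniform_brightness * c for c in attribute_color]


def interpolate_varying(varyings, weights):
    """Stand-in for the rasterizer: blend the three per-vertex varyings
    using the fragment's barycentric weights (which sum to 1)."""
    return [sum(w * v[i] for w, v in zip(weights, varyings))
            for i in range(len(varyings[0]))]


def fragment_program(varying_color):
    """Reads the interpolated varying and outputs a clamped fragment color."""
    return [min(1.0, max(0.0, c)) for c in varying_color]


# Attributes change per vertex; the uniform is fixed for the whole draw.
vertex_colors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
brightness = 0.5

varyings = [vertex_program(c, brightness) for c in vertex_colors]
fragment_color = fragment_program(
    interpolate_varying(varyings, (0.5, 0.25, 0.25)))
```

The fragment program never sees the original attributes, only the interpolated varying, which mirrors the statement that varying data is the only path from the vertex shader to the fragment shader.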
Another way to pass data to vertex and fragment shaders is by using texture
maps as sources and sinks of data. This may come as a surprise if you have been
thinking of texture maps solely as images that are applied to the outside surface of
geometry. Texture maps are important because they give you access
to the memory on the graphics hardware. When you write applications that run
on the CPU, you control the memory your application requires and have direct
access to it when necessary. On graphics hardware, memory is not accessed in
the same manner. In fact, you cannot directly allocate and deallocate
general-purpose chunks of memory, and this particular aspect usually requires
a slight change in thinking.
[Figure 18.6 diagram labels. Vertex processor (vertex shader: vertex
transformation, per-vertex lighting, computation): per-vertex attributes,
uniform graphics state, texture data; special: vertex position, vertex color;
varying per-pixel data. Fragment processor (fragment shader: per-pixel
lighting, texture map generation, computation): varying per-pixel data, uniform
graphics state, texture data; special: fragment color or other attributes;
texture data.]
Figure 18.6. The execution model for shader programs. Inputs, such as
per-vertex attributes, graphics state-related uniform variables, varying data,
and texture maps, are provided to vertex and fragment programs within the
shader processor. Shaders output special variables used in later parts of the
graphics pipeline.
Note: The shader language examples used in this chapter are presented using
GLSL (OpenGL™ Shading Language). This language was chosen since it is being
developed by the OpenGL™ Architecture Review Board and will likely become a
standard shading language for OpenGL™ with the release of OpenGL™ 2.0. As of
this writing, GLSL can be used on most modern graphics cards with updated
graphics hardware drivers.
Texture maps on graphics hardware, however, can be created, deleted, and
controlled through the graphics API you use. In other words, for general data
used by your shader, you will create texture maps that contain that data and
then use texture access functions to look up the data in the texture map.
Technically, textures can be accessed by both vertex and fragment shaders.
However, in practice, texture lookups from the vertex shader are not currently
supported on all graphics cards. An example that utilizes a texture map as a
data source is bump mapping. Bump mapping uses a normal map, which defines how
the normal vectors change across a triangle face. A bump mapping fragment
shader would look up the normal vector in the normal map “texture data” and use
it in the shading calculations at that particular fragment.
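The bump-mapping lookup can be sketched on the CPU in Python, treating the texture as a plain 2D array of data. This is an emulation, not shader code. The function names, the nearest-neighbor lookup, and the convention that normal components are stored remapped from [-1, 1] to [0, 1] are assumptions made for illustration.

```python
def texture_lookup(texture, u, v):
    """Nearest-neighbor lookup with texture coordinates u, v in [0, 1]."""
    height = len(texture)
    width = len(texture[0])
    col = min(int(u * width), width - 1)
    row = min(int(v * height), height - 1)
    return texture[row][col]


def bump_fragment_program(normal_map, u, v, light_dir):
    """Hypothetical fragment program: diffuse shading with a looked-up normal."""
    encoded = texture_lookup(normal_map, u, v)
    # Assumed encoding: each stored component c represents 2c - 1.
    normal = [2.0 * c - 1.0 for c in encoded]
    n_dot_l = sum(n * l for n, l in zip(normal, light_dir))
    return max(0.0, n_dot_l)


# A 1x2 normal map: the left texel faces +z, the right texel faces +x.
normal_map = [[(0.5, 0.5, 1.0), (1.0, 0.5, 0.5)]]
light_dir = (0.0, 0.0, 1.0)  # unit vector pointing toward the light

left = bump_fragment_program(normal_map, 0.25, 0.5, light_dir)
right = bump_fragment_program(normal_map, 0.75, 0.5, light_dir)
```

Two fragments on the same flat triangle receive different intensities purely because of the data stored in the texture.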
You need to be concerned about the types of data you put into your texture
maps. Not all numerical data types are well supported, and only recently
has graphics hardware included floating point textures with 16-bit components.
Moreover, none of the computation being performed on your GPU is done with
double-precision math! If numerical precision is important for your application,
you will need to think through these issues very carefully to determine if using
the graphics hardware for computation is useful.
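To get a feel for what reduced precision means, the following Python sketch rounds double-precision values to 16-bit and 32-bit IEEE floating point using the standard `struct` module. It runs on the CPU and only illustrates the number formats; the helper names `to_half` and `to_single` are hypothetical, and actual GPU formats and rounding behavior may differ.

```python
import struct


def to_half(value):
    """Round a double to 16-bit floating point and convert back."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def to_single(value):
    """Round a double to 32-bit floating point and convert back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


# A 16-bit float cannot represent every integer above 2048.
half_of_2049 = to_half(2049.0)

# A 32-bit float cannot represent every integer above 16777216 (2**24).
single_of_big = to_single(16777217.0)

# Even a simple decimal such as 0.1 picks up a visible error at 16 bits.
half_error = abs(to_half(0.1) - 0.1)
```

If a computation stores intermediate results in such textures, errors of this size accumulate at every pass, which is why precision deserves attention before moving a numerical algorithm to the GPU.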
So what do these shader programs look like? One way to write vertex and
fragment shaders is through assembly language instructions. For instance, per-
forming a matrix multiplication in shader assembly language looks something
like this:
DP4 p[0].x, M[0], v[0];
DP4 p[0].y, M[1], v[0];
DP4 p[0].z, M[2], v[0];
DP4 p[0].w, M[3], v[0];
In this example, the DP4 instruction is a 4-component dot product function. It
computes the dot product of the last two registers and stores the result in the
first register. In shader programming, registers hold four components
corresponding to the x, y, z, and w components of a homogeneous coordinate, or
the r, g, b, and a components of an RGBA tuple. So, in this example,
a simple matrix multiplication,
p = Mv
is computed by four DP4 instructions. Each instruction computes one element of
the final result.
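The same computation can be mirrored on the CPU in Python to confirm that four 4-component dot products amount to one matrix-vector product. The helper names `dp4` and `transform` are hypothetical and merely imitate the assembly listing; the translation matrix is an arbitrary test input.

```python
def dp4(a, b):
    """4-component dot product, mirroring the DP4 instruction."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]


def transform(M, v):
    """Compute p = Mv one component at a time, one dp4 per matrix row."""
    p = [0.0, 0.0, 0.0, 0.0]
    p[0] = dp4(M[0], v)  # DP4 p.x, M[0], v
    p[1] = dp4(M[1], v)  # DP4 p.y, M[1], v
    p[2] = dp4(M[2], v)  # DP4 p.z, M[2], v
    p[3] = dp4(M[3], v)  # DP4 p.w, M[3], v
    return p


# Example input: a translation by (1, 2, 3) applied to the point (1, 1, 1).
M = [[1.0, 0.0, 0.0, 1.0],
     [0.0, 1.0, 0.0, 2.0],
     [0.0, 0.0, 1.0, 3.0],
     [0.0, 0.0, 0.0, 1.0]]
p = transform(M, [1.0, 1.0, 1.0, 1.0])
```

Each row of M is paired with the full vector, so row i of the matrix determines component i of the result.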
Fortunately, though, you are not forced to program in assembly language. The
good news is that higher-level languages are available to write vertex and
fragment shaders. NVIDIA's Cg, the OpenGL™ Shading Language (GLSL), and
Microsoft's High Level Shading Language (HLSL) all provide similar interfaces
to the programmable aspects of graphics hardware. Using the notation of GLSL,
the same matrix multiplication performed above looks like this:
p = M * v;
where p and v are vector data types and M is a matrix data type. As evidenced
here, one advantage of using a higher-level language over assembly language is
that various data types are available to the programmer. In all of these
languages, there are built-in data types for storing vectors and matrices, as
well as arrays and constructs for creating structures. Many different functions
are also built in to these languages to help compute trigonometric values (sin,
cos, etc.), minimum and maximum values, exponential functions (log2, sqrt, pow,
etc.), and other math or geometric-based functions.
18.3.3 Vertex Shader Example
Vertex shaders give you control over how your vertices are lit and transformed.
They are also used to set the stage for fragment shaders. An interesting aspect
of vertex shaders is that you are still able to use geometry-caching mechanisms,
such as display lists or VBOs, and thus benefit from their performance gains
while using vertex shaders to do computation on the GPU. For instance, if the
vertices represent particles and you can model the movement of the particles
using a vertex shader, you have nearly eliminated the CPU from these
computations. Any bottleneck in performance that may have occurred due to data
being passed between the CPU and the GPU will be minimized. Prior to the
introduction of vertex shaders, the computation of the particle movement would
have been performed