80 5.DelayingOpenGLCalls
their Clean() method. The constructor is responsible for creating, compiling,
and linking shader objects, as well as iterating over the program’s active
uniforms, to populate m_uniforms.

A user accesses a particular uniform by calling GetUniformByName(), which has a
straightforward implementation that uses the find() method of std::map to look
up the uniform. This method should not be called every time the uniform is
updated because of the string-based map search. Instead, the method should be
called once, and the returned Uniform object should be reused every frame to
modify the uniform, similar to what is done in Listings 5.1 and 5.2.

The most important methods in the ShaderProgram class are NotifyDirty() and
Clean(). As we will see when we look at the implementation for the Uniform
class, NotifyDirty() is called when a uniform wants to notify the program that
it is dirty. In response, the program adds the uniform to the dirty list. It is
the uniform’s responsibility to make sure it doesn’t redundantly notify the
program and be put on the dirty list multiple times. Finally, before a draw
call, the shader’s Clean() method needs to be called. The method iterates over
each dirty uniform, which in turn makes the actual OpenGL call to modify the
uniform’s value. The dirty list is then cleared since no uniforms are dirty.
#include <algorithm>
#include <functional>
#include <map>
#include <string>
#include <vector>

class ShaderProgram : public ICleanableObserver
{
public:
    ShaderProgram(const std::string& vertexSource,
        const std::string& fragmentSource)
    {
        m_handle = glCreateProgram();
        // ... Create, compile, and link shader objects.
        // Populate m_uniforms with program's active uniforms by
        // calling glGetActiveUniform to get the name and location
        // for each uniform.
    }

    virtual ~ShaderProgram()
    {
        // Delete shader objects, program, and m_uniforms.
    }

    // String-based lookup; call once and cache the returned pointer.
    Uniform *GetUniformByName(const std::string& name)
    {
        std::map<std::string, Uniform *>::iterator i =
            m_uniforms.find(name);
        return ((i != m_uniforms.end()) ? i->second : 0);
    }

    // Flushes every dirty uniform to OpenGL, then empties the dirty list.
    void Clean()
    {
        // std::mem_fn replaces std::mem_fun, which was removed in C++17.
        std::for_each(m_dirtyUniforms.begin(),
            m_dirtyUniforms.end(), std::mem_fn(&ICleanable::Clean));
        m_dirtyUniforms.clear();
    }

    // ICleanableObserver implementation.
    virtual void NotifyDirty(ICleanable *value)
    {
        m_dirtyUniforms.push_back(value);
    }

private:
    GLuint m_handle;
    std::vector<ICleanable *> m_dirtyUniforms;
    std::map<std::string, Uniform *> m_uniforms;
};
Listing 5.6. Partial implementation for a shader program abstraction.
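The advice to call GetUniformByName() once and reuse the result can be illustrated with a small sketch that runs without OpenGL. SketchProgram, SketchUniform, RunFrames(), and the uniform name "u_time" are hypothetical stand-ins rather than part of the chapter's code. The sketch only counts how many string-based searches take place.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the chapter's Uniform: it only stores a value.
struct SketchUniform
{
    SketchUniform() : value(0.0F) {}
    float value;
};

// Hypothetical stand-in for ShaderProgram that models only the name lookup.
class SketchProgram
{
public:
    SketchProgram() : m_lookupCount(0)
    {
        // Assumption: a real program would discover its uniforms with
        // glGetActiveUniform; "u_time" is made up for this sketch.
        m_uniforms["u_time"] = SketchUniform();
    }

    // String-based search: fine at load time, wasteful once per frame.
    SketchUniform *GetUniformByName(const std::string& name)
    {
        ++m_lookupCount;
        std::map<std::string, SketchUniform>::iterator i =
            m_uniforms.find(name);
        return (i != m_uniforms.end()) ? &i->second : nullptr;
    }

    int LookupCount() const { return m_lookupCount; }

private:
    int m_lookupCount;
    std::map<std::string, SketchUniform> m_uniforms;
};

// Looks the uniform up once, then reuses the pointer for every frame.
// Returns the total number of map searches performed on the program.
int RunFrames(SketchProgram& program, int frameCount)
{
    SketchUniform *timeUniform = program.GetUniformByName("u_time");
    assert(timeUniform != nullptr);
    for (int frame = 0; frame < frameCount; ++frame)
    {
        timeUniform->value = static_cast<float>(frame);
    }
    return program.LookupCount();
}
```

Caching the pointer is safe here because std::map does not move its elements when other entries are added or removed.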
The other half of the implementation is the code for Uniform, which is shown in
Listing 5.7. A uniform needs to know its OpenGL location (m_location), its
current value (m_currentValue), whether it is dirty (m_dirty), and the program
that is observing it (m_observer). When a program creates a uniform, the
program passes the uniform’s location and a pointer to itself to the uniform’s
constructor. The constructor initializes the uniform’s value to zero and then
notifies the shader program that it is dirty. This has the effect of
initializing all uniforms to zero. Alternatively, the uniform’s value can be
queried with glGetUniform*(), but this has been found to be problematic on
various drivers.

The bulk of the work for this class is done in SetValue() and Clean(). When the
user provides a clean uniform with a new value, the uniform marks itself as
dirty and notifies the program that it is now dirty. If the uniform is already
dirty or the user-provided value is no different from the current value, the
program is not notified, which avoids adding duplicate uniforms to the dirty
list. The Clean() function synchronizes the uniform’s value with OpenGL by
calling glUniform1f() and then marks the uniform as clean.
class Uniform : public ICleanable
{
public:
    Uniform(GLint location, ICleanableObserver *observer) :
        m_location(location), m_currentValue(0.0F), m_dirty(true),
        m_observer(observer)
    {
        // A new uniform starts dirty so that its zero value is uploaded.
        m_observer->NotifyDirty(this);
    }

    float GetValue() const
    {
        return (m_currentValue);
    }

    void SetValue(float value)
    {
        // Notify only on a clean-to-dirty transition with a changed value.
        if ((!m_dirty) && (m_currentValue != value))
        {
            m_dirty = true;
            m_observer->NotifyDirty(this);
        }
        m_currentValue = value;
    }

    // ICleanable implementation.
    virtual void Clean()
    {
        glUniform1f(m_location, m_currentValue);
        m_dirty = false;
    }

private:
    GLint m_location;
    GLfloat m_currentValue;
    bool m_dirty;
    ICleanableObserver *m_observer;
};
Listing 5.7. Implementation for a scalar floating-point uniform abstraction.
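To see the dirty list at work without an OpenGL context, the following sketch traces how many uploads actually happen. It follows the same logic as Listings 5.6 and 5.7, but FakeUniform1f(), SketchProgram, SketchUniform, and CountUploads() are hypothetical names introduced for illustration, and the fake function only counts calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for glUniform1f(): counts calls, touches no GL state.
static int g_fakeGLCalls = 0;
static void FakeUniform1f(int /*location*/, float /*value*/)
{
    ++g_fakeGLCalls;
}

class SketchUniform;

class SketchProgram
{
public:
    void NotifyDirty(SketchUniform *uniform) { m_dirty.push_back(uniform); }
    void Clean();
    std::size_t DirtyCount() const { return m_dirty.size(); }

private:
    std::vector<SketchUniform *> m_dirty;
};

class SketchUniform
{
public:
    SketchUniform(int location, SketchProgram *program) :
        m_location(location), m_value(0.0F), m_dirty(true), m_program(program)
    {
        m_program->NotifyDirty(this);
    }

    void SetValue(float value)
    {
        if (!m_dirty && (m_value != value))
        {
            m_dirty = true;
            m_program->NotifyDirty(this);
        }
        m_value = value;
    }

    void Clean()
    {
        FakeUniform1f(m_location, m_value);
        m_dirty = false;
    }

private:
    int m_location;
    float m_value;
    bool m_dirty;
    SketchProgram *m_program;
};

void SketchProgram::Clean()
{
    for (std::size_t i = 0; i < m_dirty.size(); ++i)
    {
        m_dirty[i]->Clean();
    }
    m_dirty.clear();
}

// Simulates three draws and returns the number of fake uploads performed.
int CountUploads()
{
    g_fakeGLCalls = 0;
    SketchProgram program;
    SketchUniform alpha(0, &program);  // dirty from construction
    program.Clean();                   // draw 1: uploads the initial zero
    alpha.SetValue(0.25F);
    alpha.SetValue(0.5F);
    alpha.SetValue(0.75F);             // still a single dirty-list entry
    assert(program.DirtyCount() == 1);
    program.Clean();                   // draw 2: uploads 0.75 only
    alpha.SetValue(0.75F);             // unchanged value, stays clean
    program.Clean();                   // draw 3: nothing to upload
    return g_fakeGLCalls;
}
```

Three SetValue() calls between draws collapse into a single upload, and setting an unchanged value produces none.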
The final piece of the puzzle is implementing a draw call that cleans a shader
program. This is as simple as requiring the user to pass a ShaderProgram
instance to every draw call in your OpenGL abstraction (you’re not exposing a
separate method to bind a program, right?), then calling glUseProgram(),
followed by the program’s Clean() method, and finally calling the OpenGL draw
function. If the draw calls are part of a class that represents an OpenGL
context, it is also straightforward to factor out redundant glUseProgram()
calls.
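A draw call along these lines might be sketched as follows. This is one possible arrangement, not the chapter's own code: SketchContext, SketchProgram, FakeUseProgram(), and FakeDrawArrays() are hypothetical, and the fake functions stand in for glUseProgram() and an OpenGL draw function so the sketch runs without a context.

```cpp
#include <cassert>

// Hypothetical stand-ins for glUseProgram() and a draw function.
static int g_useProgramCalls = 0;
static int g_drawCalls = 0;
static void FakeUseProgram(unsigned int /*handle*/) { ++g_useProgramCalls; }
static void FakeDrawArrays(int /*first*/, int /*count*/) { ++g_drawCalls; }

class SketchProgram
{
public:
    explicit SketchProgram(unsigned int handle) :
        m_handle(handle), m_cleanCount(0)
    {
    }

    unsigned int Handle() const { return m_handle; }

    // A real program would flush its dirty uniforms here.
    void Clean() { ++m_cleanCount; }

    int CleanCount() const { return m_cleanCount; }

private:
    unsigned int m_handle;
    int m_cleanCount;
};

class SketchContext
{
public:
    SketchContext() : m_boundProgram(nullptr) {}

    void Draw(SketchProgram *program, int first, int count)
    {
        assert(program != nullptr);

        // Bind only when the program differs from the one already bound.
        if (m_boundProgram != program)
        {
            FakeUseProgram(program->Handle());
            m_boundProgram = program;
        }

        // Uniform uploads affect the bound program, so clean after binding
        // and before drawing.
        program->Clean();
        FakeDrawArrays(first, count);
    }

private:
    SketchProgram *m_boundProgram;
};
```

Comparing raw pointers is the simplest way to detect a redundant bind. A real context would also need to forget the cached pointer when that program is destroyed.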
5.5 Implementation Notes
Our implementation is efficient in that it avoids redundant OpenGL calls and
uses very little CPU. Once the std::vector has been “primed,” that is, once its
capacity has grown large enough to hold the dirty uniforms without
reallocating, adding a uniform to the dirty list is a constant time operation.
Likewise, iterating over it is efficient because only dirty uniforms are
touched. If no uniforms changed between one draw call and the next, then no
uniforms are touched. If the common case in your engine is that most or all
uniforms change from draw call to draw call, consider removing the dirty list
and just iterating over all uniforms before each draw.
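The effect of a primed vector can be checked with a short sketch. RefillReallocates() is a hypothetical helper written for this illustration. It relies on std::vector::clear() leaving the capacity unchanged, so refilling the list to its earlier size needs no new allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fills a dirty list, clears it as Clean() would, and refills it.
// Returns true if the refill changed the capacity (a reallocation).
bool RefillReallocates(std::size_t dirtyCount)
{
    std::vector<int *> dirtyList;
    int dummy = 0;

    // First frame: the vector grows as uniforms are added ("priming").
    for (std::size_t i = 0; i < dirtyCount; ++i)
    {
        dirtyList.push_back(&dummy);
    }
    const std::size_t primedCapacity = dirtyList.capacity();

    // clear() resets the size to zero but keeps the allocated storage.
    dirtyList.clear();
    assert(dirtyList.capacity() == primedCapacity);

    // Later frames: push_back() reuses the existing storage.
    for (std::size_t i = 0; i < dirtyCount; ++i)
    {
        dirtyList.push_back(&dummy);
    }
    return dirtyList.capacity() != primedCapacity;
}
```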
If you are using reference counting when implementing this technique, keep
in mind that a uniform should keep a weak reference to its program. This is not a
problem in garbage-collected languages.
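With C++11 smart pointers, the ownership described above could look like the following sketch. The choice of std::shared_ptr and std::weak_ptr is an assumption, since the chapter does not name a reference-counting scheme, and SketchProgram, SketchUniform, and ProgramIsReleased() are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <vector>

class SketchUniform;

class SketchProgram
{
public:
    void AddUniform(const std::shared_ptr<SketchUniform>& uniform)
    {
        m_uniforms.push_back(uniform);
    }

private:
    // Strong references: the program keeps its uniforms alive.
    std::vector<std::shared_ptr<SketchUniform> > m_uniforms;
};

class SketchUniform
{
public:
    explicit SketchUniform(const std::shared_ptr<SketchProgram>& program) :
        m_program(program)
    {
    }

    bool ProgramAlive() const { return !m_program.expired(); }

private:
    // Weak reference: observes the program without keeping it alive,
    // which avoids a program <-> uniform reference cycle.
    std::weak_ptr<SketchProgram> m_program;
};

// Returns true if dropping the last external reference to the program
// destroys it even though a uniform still points back at it.
bool ProgramIsReleased()
{
    std::shared_ptr<SketchProgram> program = std::make_shared<SketchProgram>();
    std::shared_ptr<SketchUniform> uniform =
        std::make_shared<SketchUniform>(program);
    program->AddUniform(uniform);

    assert(program.use_count() == 1);  // the uniform adds no strong count
    program.reset();
    return !uniform->ProgramAlive();
}
```

If the uniform held a std::shared_ptr to its program instead, the two objects would keep each other alive and neither would be destroyed.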
Also, some methods, including ShaderProgram::Clean(),
ShaderProgram::NotifyDirty(), and Uniform::Clean(), should not be publicly
accessible. In C++, this can be done by making them private or protected and
using the somewhat obscure friend keyword. A more low-tech option is to use a
naming convention so clients know not to call them directly.
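A minimal sketch of the friend approach, assuming a context class issues the draw calls, is shown below. SketchProgram and SketchContext are hypothetical names, and the OpenGL draw call itself is omitted.

```cpp
#include <cassert>

class SketchContext;

class SketchProgram
{
public:
    SketchProgram() : m_cleanCount(0) {}

    int CleanCount() const { return m_cleanCount; }

private:
    // Only the context may flush the program; clients cannot call Clean().
    friend class SketchContext;

    void Clean() { ++m_cleanCount; }  // would upload dirty uniforms

    int m_cleanCount;
};

class SketchContext
{
public:
    void Draw(SketchProgram& program)
    {
        program.Clean();  // allowed because SketchContext is a friend
        // ... Issue the OpenGL draw call here.
    }
};
```

Client code that tries to call program.Clean() directly fails to compile, so the restriction is enforced by the compiler and not by convention.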
5.6 Improved Flexibility
By delaying OpenGL calls until draw time, we gain a great deal of flexibility.
For starters, calling Uniform::GetValue() or Uniform::SetValue() does not
require a current OpenGL context. For games with multiple contexts, this can
minimize bugs caused by incorrect management of the current context. Likewise,
if you are developing an engine that needs to play nice with other libraries
using their own OpenGL context, Uniform::SetValue() has no context side effects
and can be called anytime, not just when your context is current.
Our technique can also be extended to minimize managed-to-native code
round-trip overhead when using OpenGL with languages like Java or C#. Instead
of making fine-grained glUniform1f() calls for each dirty uniform, the list of
dirty uniforms can be passed to native C++ code in a single coarse-grained
call. On the C++ side, glUniform1f() is called for each uniform, thus
eliminating the per-uniform round trip. This can be taken a step further by
making all the required OpenGL calls for a draw in a single round trip.
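The native side of such a coarse-grained call could be as small as the sketch below. SetUniformsBatch() and its parameter layout (parallel arrays of locations and values) are assumptions made for illustration. How the function is exported to Java or C# is left out, and FakeUniform1f() records calls in place of glUniform1f() so the sketch runs without a context.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for glUniform1f(): records each call for inspection.
struct FakeUniformCall
{
    int location;
    float value;
};

static std::vector<FakeUniformCall> g_fakeCalls;

static void FakeUniform1f(int location, float value)
{
    FakeUniformCall call = { location, value };
    g_fakeCalls.push_back(call);
}

// Hypothetical native entry point. The managed side gathers the location
// and value of every dirty uniform into two parallel arrays and crosses
// the managed-to-native boundary once per draw, not once per uniform.
void SetUniformsBatch(const int *locations, const float *values, int count)
{
    assert(count == 0 || (locations != nullptr && values != nullptr));
    for (int i = 0; i < count; ++i)
    {
        FakeUniform1f(locations[i], values[i]);
    }
}
```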
5.7 Concluding Remarks
An alternative to our technique is to use direct state access (DSA) [Kilgard
2009], an OpenGL extension that allows updating OpenGL state without first
setting global state. For example, the following two lines,

    glUseProgram(m_handle);
    glUniform1f(m_location, m_currentValue);

can be combined into one:

    glProgramUniform1fEXT(m_handle, m_location, m_currentValue);

As of this writing, DSA is not a core feature of OpenGL 3.3 and, as such, is
not available on all platforms, although glProgramUniform*() calls are mirrored
in the separate shader objects extension [Kilgard et al. 2010], which has
become core functionality in OpenGL 4.1.
Delaying selector-based OpenGL calls until draw time has a lot of benefits,
although there are some OpenGL calls that you do not want to delay. It is
important to allow the CPU and GPU to work together in parallel. As such, you
would not want to delay updating a large vertex buffer or texture until draw
time because this could cause the GPU to wait, assuming it is not rendering one
or more frames behind the CPU.
Finally, I’ve had great success using this technique in both commercial and
open source software. I’ve found it quick to implement and easy to debug. An
excellent next step for you is to generalize the code in this chapter to support all