In this chapter, we will learn how to implement a hierarchical scene representation and a corresponding rendering pipeline. This will help us combine the geometry and material rendering techniques we explored in the previous chapters. Instead of implementing a naive object-oriented scene graph, where each node is represented by an object allocated on the heap, we will learn how to apply the data-oriented design approach to simplify the memory layout of our scene. This makes modifications to the scene graph significantly faster and also acts as a basis for learning data-oriented design principles and applying them in practice. The scene graph and materials representation presented here is compatible with glTF2.
This chapter will cover how to organize the overall rendering process of complex scenes with multiple materials. We will be covering the following recipes:
To run the recipes in this chapter, you must have a computer with a video card that supports OpenGL 4.6, with ARB_bindless_texture, and Vulkan 1.2, with nonuniform indexing for sampled image arrays. Read Chapter 1, Establishing a Build Environment, if you want to learn how to build the demo applications shown in this book.
The source code for this chapter can be found on GitHub at https://github.com/PacktPublishing/3D-Graphics-Rendering-Cookbook.
Numerous hobby 3D engines use a straightforward and naive class-based approach to implement a scene graph. It is always tempting to define a structure similar to the following code, but please do not do this:
struct SceneNode {
SceneNode* parent_;
vector<SceneNode*> children_;
mat4 localTransform_;
mat4 globalTransform_;
Mesh* mesh_;
Material* material_;
void render();
};
On top of this structure, you can define numerous recursive traversal methods, such as the dreaded render() operation. Let's say we have the following root object:
SceneNode* root;
Here, rendering a scene graph can be as simple as doing the following:
root->render();
The rendering routine in this case does multiple things. Most importantly, the render() method calculates the global transform for the current node. After that, depending on the rendering API being used, mesh geometry and material information is sent to the rendering pipeline. At the end, a recursive call is made to render all the children:
void SceneNode::render() {
globalTransform_ = (parent_ ? parent_->globalTransform_ : mat4(1)) * localTransform_;
// ... API-specific rendering calls ...
for (auto& c: this->children_)
c->render();
}
While being a simple and "canonical" object-oriented implementation, it has multiple serious drawbacks:
As the 3D engine grows and the scene graph requirements become more numerous, new fields, arrays, callbacks, and pointers must be added to and handled in the SceneNode structure, making this approach fragile and hard to maintain.
Let us step back and rethink how to keep the relative scene structure without using large monolithic classes with heavyweight dynamic containers inside.
To represent complex nested visual objects such as robotic arms, planetary systems, or deeply branched animated trees, you can split the object into parts and keep track of the hierarchical relationships between them. A directed graph of parent-child relationships between different objects in a scene is called a scene graph. We are deliberately avoiding using the words "acyclic graph" here because, for convenience, you may decide to use circular references between nodes in a controlled way. Most 3D graphics tutorials aimed at hobbyists lead directly down the simple but non-optimal path we identified in the previous recipe, How not to do a scene graph. Let's go a bit deeper into the rabbit hole and learn how to apply data-oriented design to implement a faster scene graph.
In this recipe, we will learn how to get started with a decently performant scene graph design. Our focus will be on scene graphs with fixed hierarchies. In Chapter 9, Working with Scene Graphs, we will elaborate on this topic more and explore how to deal with runtime topology changes and other scene graph operations.
The source code for this recipe is split between the scene geometry conversion tool, Chapter7/SceneConverter/src/main.cpp, and the rendering code, which can be found in the Chapter7/GL01_LargeScene and Chapter7/VK01_SceneGraph demos.
It seems logical to store a linear array of individual nodes and to replace all the "external" pointers, such as Mesh* and Material*, with suitably sized integer handles, which are just indices into other arrays. The arrays of child nodes and the references to parent nodes are moved out of the node structure itself.
The local and global transforms are also stored in separate arrays and can be easily mapped to a GPU buffer without conversion, making them directly accessible from GLSL shaders. Let's look at the implementation:
struct SceneNode {
int mesh_;
int material_;
};
struct Scene {
vector<SceneNode> nodes_;
vector<mat4> local_;
vector<mat4> global_;
};
One question remains: how can we store the hierarchy? The solution is well-known and is called the Left Child - Right Sibling tree representation. Since a scene graph is really a tree (at least in theory, when no optimization-related circular references are introduced), we may convert any tree whose nodes have arbitrarily many children into a binary tree by "tilting" the branches, as shown in the following diagram:
Figure 7.1 – Tree representations
The image on the left-hand side shows a standard tree with a variable number of children for each node, while the image on the right-hand side shows a new structure that only stores a single reference to the first child and another reference to the next "sibling." Here, "being a sibling node" means "to be a child node of the same parent node." This transformation removes the need to store std::vector in each scene node. Finally, if we "tilt" the right image, we get a familiar binary tree structure where the left arrows are solid and represent a "first child" reference and the right arrows are dashed and represent the "next sibling" reference:
Figure 7.2 – Tilted tree
struct SceneNode {
int mesh_, material_;
int parent_;
int firstChild_;
int rightSibling_;
};
What we have now is a compact linear list of constant-sized, plain-old-data objects. Yes, the tree traversal and modification routines may seem unusual, but they are just linked-list iterations. It would be unfair not to mention a rather serious disadvantage, though: random access to a child node is now slower on average because we must traverse the sibling list. For our purposes, this is not fatal, since we will either touch all the children of a node or none of them.
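For example, visiting all the direct children of a node is a short loop over the sibling indices. Here is a minimal sketch, assuming the nodes are stored in an array called nodes and using the SceneNode fields defined above:

for (int c = nodes[parent].firstChild_; c != -1; c = nodes[c].rightSibling_)
  process(c); // visit each direct child of 'parent'

In the final layout, the hierarchy-related fields are moved into a dedicated structure: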
struct Hierarchy {
  int parent_;
  int firstChild_;
  int nextSibling_;
  // cached index of the last added child node; speeds up addNode()
  int lastSibling_;
  int level_;
};
We have changed "left" to "first" and "right" to "next" since tree node geometry does not matter here. The level_ field stores the cached depth of the node from the top of the scene graph. The root node is at level zero; all the children have a level that's greater by one with respect to their parents. The extra lastSibling_ field caches the index of the last child that was added, so that the addNode() routine shown below can append new children without walking the entire sibling list.
Also, the Mesh and Material objects for each node can be stored in separate arrays. However, if not all the nodes are equipped with a mesh or material, we can use a hash table to store node-to-mesh and node-to-material mappings. Absence of such mappings simply indicates that a node is only being used for transformations or hierarchical relation storage. The hash tables are not as linear as arrays, but they can be trivially converted to and from arrays of {key, value} pairs.
struct Scene {
  std::vector<mat4> localTransform_;
  std::vector<mat4> globalTransform_;
  std::vector<Hierarchy> hierarchy_;
  // Meshes for nodes (Node -> Mesh)
  std::unordered_map<uint32_t, uint32_t> meshes_;
  // Materials for nodes (Node -> Material)
  std::unordered_map<uint32_t, uint32_t> materialForNode_;
  // Node names: which name is assigned to the node
  std::unordered_map<uint32_t, uint32_t> nameForNode_;
  // Collection of scene node names
  std::vector<std::string> names_;
  // Collection of debug material names
  std::vector<std::string> materialNames_;
};
One thing that is missing is the SceneNode structure itself, which is now represented by integer indices in the arrays of the Scene structure. It is rather amusing and unusual for an object-oriented mind to speak about SceneNode while not needing or having the scene node class itself.
The conversion routine for Assimp's aiScene into our format is implemented in the Chapter7/SceneConverter project. It is a form of top-down recursive traversal where we create our implicit SceneNode objects in the Scene structure. Let's go through the steps for traversing a scene stored in the aforementioned format:
void traverse(const aiScene* sourceScene, Scene& scene, aiNode* node, int parent, int atLevel)
{
  int newNodeID = addNode(scene, parent, atLevel);
  if (node->mName.C_Str()) {
    uint32_t stringID = (uint32_t)scene.names_.size();
    scene.names_.push_back( std::string(node->mName.C_Str()) );
    scene.nameForNode_[newNodeID] = stringID;
  }
  for (size_t i = 0; i < node->mNumMeshes; i++) {
    int newSubNodeID = addNode(scene, newNodeID, atLevel + 1);
    uint32_t stringID = (uint32_t)scene.names_.size();
    scene.names_.push_back( std::string(node->mName.C_Str()) + "_Mesh_" + std::to_string(i));
    scene.nameForNode_[newSubNodeID] = stringID;
    int mesh = (int)node->mMeshes[i];
    scene.meshes_[newSubNodeID] = mesh;
    scene.materialForNode_[newSubNodeID] = sourceScene->mMeshes[mesh]->mMaterialIndex;
    scene.globalTransform_[newSubNodeID] = glm::mat4(1.0f);
    scene.localTransform_[newSubNodeID] = glm::mat4(1.0f);
  }
  scene.globalTransform_[newNodeID] = glm::mat4(1.0f);
  scene.localTransform_[newNodeID] = toMat4(node->mTransformation);
  for (unsigned int n = 0; n < node->mNumChildren; n++)
    traverse(sourceScene, scene, node->mChildren[n], newNodeID, atLevel + 1);
}
glm::mat4 toMat4(const aiMatrix4x4& m) {
glm::mat4 mm;
for (int i = 0; i < 4; i++)
for (int j = 0; j < 4; j++)
mm[i][j] = m[i][j];
return mm;
}
The most complex part of the code for dealing with the Scene data structure is the addNode() routine, which allocates a new scene node and adds it to the scene hierarchy. Let's check out how to implement it:
int addNode(Scene& scene, int parent, int level)
{
int node = (int)scene.hierarchy_.size();
scene.localTransform_.push_back(glm::mat4(1.0f));
scene.globalTransform_.push_back(glm::mat4(1.0f));
scene.hierarchy_.push_back({ .parent_ = parent });
if (parent > -1) {
int s = scene.hierarchy_[parent].firstChild_;
if (s == -1) {
scene.hierarchy_[parent].firstChild_ = node;
scene.hierarchy_[node].lastSibling_ = node;
} else {
int dest = scene.hierarchy_[s].lastSibling_;
if (dest <= -1) {
// iterate nextSibling_ indices
for (dest = s; scene.hierarchy_[dest].nextSibling_ != -1; dest = scene.hierarchy_[dest].nextSibling_);
}
scene.hierarchy_[dest].nextSibling_ = node;
scene.hierarchy_[s].lastSibling_ = node;
}
}
After the for loop, we assign our new node as the next sibling of the last child. Note that this linear run over the siblings is not really necessary if we store the index of the last child node that was added. Later, in the Implementing transformations recipe, we will show you how to modify addNode() and remove the preceding loop.
scene.hierarchy_[node].level_ = level;
scene.hierarchy_[node].nextSibling_ = -1;
scene.hierarchy_[node].firstChild_ = -1;
return node;
}
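To see how the sibling links are wired, here is a small usage sketch that builds a root node with two children using addNode():

Scene scene;
int root = addNode(scene, -1, 0);
int child1 = addNode(scene, root, 1);
int child2 = addNode(scene, root, 1);
// now scene.hierarchy_[root].firstChild_ == child1
// and scene.hierarchy_[child1].nextSibling_ == child2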
Once we have the material system in place, we can use the traverse() routine in our new SceneConverter tool.
Data-oriented design (DOD) is a vast domain, and we just used a few techniques from it. We recommend reading the online book Data-Oriented Design, by Richard Fabian, to get yourself familiar with more DOD concepts: https://www.dataorienteddesign.com/dodbook.
The Chapter7/VK01_SceneGraph demo application contains some basic scene graph editing capabilities built with ImGui. These can help you get started with integrating scene graphs into your own tools. Check out shared/vkFramework/GuiRenderer.cpp for more details. The following recursive function, called renderSceneTree(), is responsible for rendering the scene graph tree hierarchy in the UI and selecting a node for editing:
int renderSceneTree(const Scene& scene, int node) {
int selected = -1;
std::string name = getNodeName(scene, node);
std::string label = name.empty() ? (std::string("Node") + std::to_string(node)) : name;
int flags = (scene.hierarchy_[node].firstChild_ < 0) ? ImGuiTreeNodeFlags_Leaf|ImGuiTreeNodeFlags_Bullet : 0;
const bool opened = ImGui::TreeNodeEx( &scene.hierarchy_[node], flags, "%s", label.c_str());
ImGui::PushID(node);
if (ImGui::IsItemClicked(0)) selected = node;
if (opened) {
for (int ch = scene.hierarchy_[node].firstChild_; ch != -1; ch = scene.hierarchy_[ch].nextSibling_)
{
int subNode = renderSceneTree(scene, ch);
if (subNode > -1) selected = subNode;
}
ImGui::TreePop();
}
ImGui::PopID();
return selected;
}
The editNode() function can be used as a basis for building editing functionality for nodes, materials, and other scene graph content.
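As a hypothetical minimal sketch of such functionality, we could expose the translation part of a node's local transform to ImGui and propagate the change through the scene graph. The helper below is illustrative and not taken from the book's code; markAsChanged() is described in the Implementing transformations recipe:

void editNodePosition(Scene& scene, int node) {
  glm::mat4& m = scene.localTransform_[node];
  glm::vec3 pos = glm::vec3(m[3]); // translation lives in the 4th column
  if (ImGui::DragFloat3("Position", &pos.x, 0.01f)) {
    m[3] = glm::vec4(pos, 1.0f);
    markAsChanged(scene, node);
  }
}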
To quote Frederick Brooks, "Show me your data structures and I do not need to see your code." Hopefully, it is already more or less clear how to implement basic operations on a scene graph, but the remaining recipes in this chapter will explicitly describe all the required routines. Here, we will provide an overview of the loading and saving operations for our scene graph structure.
Make sure you have read the previous recipe, Using data-oriented design for a scene graph, before proceeding any further.
The loading procedure is a sequence of fread() calls, followed by a pair of loadMap() operations. As usual, we will be omitting any error handling code in this book's text; however, the accompanying source code bundle contains many necessary checks to see if the file was actually opened and so on. Let's get started:
void loadScene(const char* fileName, Scene& scene)
{
  FILE* f = fopen(fileName, "rb");
  uint32_t sz;
  fread(&sz, sizeof(sz), 1, f);
  scene.hierarchy_.resize(sz);
  scene.globalTransform_.resize(sz);
  scene.localTransform_.resize(sz);
  fread(scene.localTransform_.data(), sizeof(glm::mat4), sz, f);
  fread(scene.globalTransform_.data(), sizeof(glm::mat4), sz, f);
  fread(scene.hierarchy_.data(), sizeof(Hierarchy), sz, f);
  loadMap(f, scene.materialForNode_);
  loadMap(f, scene.meshes_);
  if (!feof(f)) {
    loadMap(f, scene.nameForNode_);
    loadStringList(f, scene.names_);
    loadStringList(f, scene.materialNames_);
  }
  fclose(f);
}
Saving the scene reverses the loadScene() routine. Let's take a look:
void saveScene(const char* fileName, const Scene& scene)
{
  FILE* f = fopen(fileName, "wb");
  uint32_t sz = (uint32_t)scene.hierarchy_.size();
  fwrite(&sz, sizeof(sz), 1, f);
  fwrite(scene.localTransform_.data(), sizeof(glm::mat4), sz, f);
  fwrite(scene.globalTransform_.data(), sizeof(glm::mat4), sz, f);
  fwrite(scene.hierarchy_.data(), sizeof(Hierarchy), sz, f);
  saveMap(f, scene.materialForNode_);
  saveMap(f, scene.meshes_);
If the scene node and material names are not empty, we must also store these maps:
if (!scene.names_.empty() && !scene.nameForNode_.empty()) {
saveMap(f, scene.nameForNode_);
saveStringList(f, scene.names_);
saveStringList(f, scene.materialNames_);
}
fclose(f);
}
Now, let's briefly describe the helper routines for loading and saving unordered maps. std::unordered_map is loaded in three steps:
void loadMap(FILE* f, std::unordered_map<uint32_t, uint32_t>& map)
{
std::vector<uint32_t> ms;
uint32_t sz;
fread(&sz, sizeof(sz), 1, f);
ms.resize(sz);
fread(ms.data(), sizeof(uint32_t), sz, f);
for (size_t i = 0; i < (sz / 2) ; i++)
map[ms[i * 2 + 0]] = ms[i * 2 + 1];
}
The saving routine for std::unordered_map is created by reversing loadMap() line by line:
void saveMap(FILE* f, const std::unordered_map<uint32_t, uint32_t>& map)
{
std::vector<uint32_t> ms;
ms.reserve(map.size() * 2);
for (const auto& m : map) {
ms.push_back(m.first);
ms.push_back(m.second);
}
uint32_t sz = (uint32_t)ms.size();
fwrite(&sz, sizeof(sz), 1, f);
fwrite(ms.data(), sizeof(uint32_t), ms.size(), f);
}
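The loadStringList() and saveStringList() helpers used by loadScene() and saveScene() are not shown in the text; a minimal sketch that serializes length-prefixed strings, mirroring the map format above, could look like this:

void saveStringList(FILE* f, const std::vector<std::string>& lines) {
  uint32_t sz = (uint32_t)lines.size();
  fwrite(&sz, sizeof(uint32_t), 1, f);
  for (const auto& s : lines) {
    sz = (uint32_t)s.length();
    fwrite(&sz, sizeof(uint32_t), 1, f);
    fwrite(s.c_str(), sz + 1, 1, f); // store the terminating zero, too
  }
}
void loadStringList(FILE* f, std::vector<std::string>& lines) {
  uint32_t sz = 0;
  fread(&sz, sizeof(uint32_t), 1, f);
  lines.resize(sz);
  std::vector<char> inBytes;
  for (auto& s : lines) {
    fread(&sz, sizeof(uint32_t), 1, f);
    inBytes.resize(sz + 1);
    fread(inBytes.data(), sz + 1, 1, f);
    s = std::string(inBytes.data(), sz);
  }
}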
Topology changes for the nodes in our scene graph pose a solvable, though nontrivial, problem. The corresponding source code is discussed in the Deleting nodes and merging scene graphs recipe of Chapter 9, Working with Scene Graphs. We just have to keep all the mesh geometries in a single GPU buffer. We will show you how to implement this later in this chapter in MultiRenderer, which is a refactoring of the MultiMeshRenderer class from Chapter5/VK01_MultiMeshDraw.
The material conversion routines will be implemented in the Implementing a material system recipe. Together with scene loading and saving, they complete the SceneConvert tool.
A scene graph is typically used to represent spatial relationships. For the purpose of rendering, we must calculate a global affine 3D transformation for each of the scene graph nodes. This recipe will show you how to correctly calculate global transformations from local transformations without making any redundant calculations.
Using the previously defined Scene structure, we will show you how to correctly recalculate global transformations. Please revisit the Using data-oriented design for a scene graph recipe before proceeding. To start this recipe, recall that we had the dangerous but tempting idea of using a recursive global transform calculator in the non-existent SceneNode::render() method:
void SceneNode::render() {
  mat4 parentTransform = parent_ ? parent_->globalTransform_ : mat4(1);
  globalTransform_ = parentTransform * localTransform_;
  // ... rendering and recursion ...
}
It is always better to separate operations such as rendering, scene traversal, and transform calculation, while at the same time executing similar operations in large batches. This separation becomes even more important when the number of nodes becomes large.
We have already learned how to render several meshes with a single GPU draw call by using a combination of indirect rendering and programmable vertex pulling. Here, we will show you how to perform the minimum amount of global transform recalculations.
It is always good to avoid unnecessary calculations. In the case of global transformations of the scene nodes, we need a way to mark certain nodes whose transforms have changed in this frame. Since changed nodes may have children, we must also mark those children as changed. Let's take a look:
struct Scene {
  // ... somewhere in the transform component ...
  std::vector<int> changedAtThisFrame_[MAX_NODE_LEVEL];
};
void markAsChanged(Scene& scene, int node) {
  int level = scene.hierarchy_[node].level_;
  scene.changedAtThisFrame_[level].push_back(node);
  for (int s = scene.hierarchy_[node].firstChild_; s != -1; s = scene.hierarchy_[s].nextSibling_)
    markAsChanged(scene, s);
}
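A typical usage pattern is a small helper (hypothetical, for illustration) that updates a node's local transform and immediately marks the affected subtree, so that the recalculation routine below picks it up:

void setLocalTransform(Scene& scene, int node, const glm::mat4& m) {
  scene.localTransform_[node] = m;
  markAsChanged(scene, node);
}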
To recalculate all the global transformations for changed nodes, the following function must be implemented. No work is done if no local transformations were updated, and the scene is essentially static. Let's take a look:
void recalculateGlobalTransforms(Scene& scene)
{
if (!scene.changedAtThisFrame_[0].empty()) {
int c = scene.changedAtThisFrame_[0][0];
scene.globalTransform_[c] = scene.localTransform_[c];
scene.changedAtThisFrame_[0].clear();
}
for (int i = 1 ; i < MAX_NODE_LEVEL && !scene.changedAtThisFrame_[i].empty(); i++ )
{
for (int c : scene.changedAtThisFrame_[i]) {
int p = scene.hierarchy_[c].parent_;
scene.globalTransform_[c] = scene.globalTransform_[p] * scene.localTransform_[c];
}
scene.changedAtThisFrame_[i].clear();
}
}
The essence of this implementation is the fact that we do not recalculate any of the global transformations multiple times. Since we start from the root layer of the scene graph tree, all the changed layers below the root acquire a valid global transformation for their parents.
Note
Depending on how frequently local transformations are updated, it may be more performant to eliminate the list of recently updated nodes and always perform a full update. Profile your real code before making a decision.
As an advanced exercise, transfer the computation of changed node transformations to your GPU. This is relatively easy to implement, considering that we have compute shaders and buffer management in place.
Chapter 6, Physically Based Rendering Using the glTF2 Shading Model, provided a description of the PBR shading model and presented all the required GLSL shaders for rendering a single 3D object using multiple textures. Here, we will show you how to organize scene rendering with multiple objects with different materials and properties. Our material system is compatible with the glTF2 material format and easily extensible for incorporating many existing glTF2 extensions.
The previous chapters dealt with rendering individual objects and applying a PBR model to light them. In the Using data-oriented design for a scene graph recipe, we learned the general structure for scene organization and used opaque integers as material handles. Here, we will define a structure for storing material parameters and show you how this structure can be used in GLSL shaders. The routine to convert material parameters from the ones loaded by Assimp will be described later in this chapter, in the Importing materials with Assimp recipe.
We need a structure to represent our PBR material, both in CPU memory to load it from a file and in a GPU buffer. Let's get started:
#ifdef __GNUC__
# define PACKED_STRUCT __attribute__((packed,aligned(1)))
#else
# define PACKED_STRUCT
#endif
struct PACKED_STRUCT MaterialDescription final
{
gpuvec4 emissiveColor_ = { 0.0f, 0.0f, 0.0f, 0.0f};
gpuvec4 albedoColor_ = { 1.0f, 1.0f, 1.0f, 1.0f };
gpuvec4 roughness_ = { 1.0f, 1.0f, 0.0f, 0.0f };
float transparencyFactor_ = 1.0f;
float alphaTest_ = 0.0f;
float metallicFactor_ = 1.0f;
uint32_t flags_ = sMaterialFlags_CastShadow | sMaterialFlags_ReceiveShadow;
uint64_t ambientOcclusionMap_ = 0xFFFFFFFF;
uint64_t emissiveMap_ = 0xFFFFFFFF;
uint64_t albedoMap_ = 0xFFFFFFFF;
uint64_t metallicRoughnessMap_ = 0xFFFFFFFF;
uint64_t normalMap_ = 0xFFFFFFFF;
uint64_t opacityMap_ = 0xFFFFFFFF;
};
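Since this structure is uploaded verbatim into a GPU buffer, it is worth a compile-time sanity check that the packing worked as intended. A small sanity-check sketch, assuming the compiler honors the packing attribute: the expected size follows from the fields above (three 16-byte gpuvec4 values, three floats, one uint32_t, and six 64-bit handles, for 112 bytes in total):

static_assert(sizeof(MaterialDescription) == 3 * 16 + 3 * 4 + 4 + 6 * 8, "MaterialDescription is not tightly packed");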
See the Implementing a scene conversion tool recipe for further details on how to convert and pack material textures.
struct PACKED_STRUCT gpuvec4 {
float x, y, z, w;
gpuvec4() = default;
gpuvec4(float a, float b, float c, float d)
: x(a), y(b), z(c), w(d) {}
gpuvec4(const vec4& v) : x(v.x), y(v.y), z(v.z),
w(v.w) {}
};
Now that we have the data structures in place, let's take a look at the loading and saving code:
void loadMaterials(const char* fileName, std::vector<MaterialDescription>& materials, std::vector<std::string>& files)
{
FILE* f = fopen(fileName, "rb");
if (!f) return;
uint32_t sz;
fread(&sz, sizeof(uint32_t), 1, f);
materials.resize(sz);
fread(materials.data(), sizeof(MaterialDescription), materials.size(), f);
loadStringList(f, files);
fclose(f);
}
void saveMaterials(const char* fileName, const std::vector<MaterialDescription>& materials, const std::vector<std::string>& files)
{
FILE* f = fopen(fileName, "wb");
if (!f) return;
uint32_t sz = (uint32_t)materials.size();
fwrite(&sz, sizeof(uint32_t), 1, f);
fwrite(materials.data(), sizeof(MaterialDescription), sz, f);
saveStringList(f, files);
fclose(f);
}
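As a usage sketch, loading the materials file back and creating GPU textures could look like this (assuming a VulkanResources instance named resources, described later in the Managing Vulkan resources recipe; the file name is the one produced by our converter configuration):

std::vector<MaterialDescription> materials;
std::vector<std::string> textureFiles;
loadMaterials("data/meshes/test.materials", materials, textureFiles);
std::vector<VulkanTexture> textures;
for (const auto& f : textureFiles)
  textures.push_back(resources.loadTexture2D(f.c_str()));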
At program start, we load the list of materials and all the texture files into GPU textures. Now, we are ready to learn how the MaterialDescription structure is used in GLSL shaders:
layout(location = 0) out vec3 uvw;
layout(location = 1) out vec3 v_worldNormal;
layout(location = 2) out vec4 v_worldPos;
layout(location = 3) out flat uint matIdx;
#include <data/shaders/chapter07/VK01.h>
struct ImDrawVert {
float x, y, z; float u, v; float nx, ny, nz;
};
struct DrawData {
uint mesh;
uint material;
uint lod;
uint indexOffset;
uint vertexOffset;
uint transformIndex;
};
#include <data/shaders/chapter07/VK01_VertCommon.h>
The VK01_VertCommon.h file contains all the buffer attachments for the vertex shader. The first buffer contains two per-frame uniforms – the model-view-projection matrix and the camera position in the world space:
layout(binding = 0) uniform UniformBuffer
{ mat4 inMtx; vec4 cameraPos; } ubo;
layout(binding = 1) readonly buffer SBO { ImDrawVert data[]; } sbo;
layout(binding = 2) readonly buffer IBO { uint data[]; } ibo;
layout(binding = 3) readonly buffer DrawBO { DrawData data[]; } drawDataBuffer;
layout(binding = 5) readonly buffer XfrmBO { mat4 data[]; } transformBuffer;
The rest of the VK01.h file refers to yet another file, called data/shaders/chapter07/MaterialData.h, that defines a GLSL structure equivalent to the MaterialDescription structure described at the beginning of this recipe.
Now, let's return to the main vertex shader:
void main() {
DrawData dd = drawDataBuffer.data[gl_BaseInstance];
uint refIdx = dd.indexOffset + gl_VertexIndex;
ImDrawVert v = sbo.data[ibo.data[refIdx] +
dd.vertexOffset];
mat4 model = transformBuffer.data[gl_BaseInstance];
v_worldPos = model * vec4(v.x, -v.y, v.z, 1.0);
v_worldNormal = transpose(inverse(mat3(model))) * vec3(v.nx, -v.ny, v.nz);
gl_Position = ubo.inMtx * v_worldPos;
matIdx = dd.material;
uvw = vec3(v.u, v.v, 1.0);
}
Now, let's take a look at the fragment shader:
layout(binding = 4) readonly buffer MatBO { MaterialData data[]; } mat_bo;
layout(binding = 9) uniform sampler2D textures[];
void main() {
MaterialData md = mat_bo.data[matIdx];
vec4 emission = md.emissiveColor_;
vec4 albedo = vec4(1.0, 0.0, 0.0, 1.0);
vec3 normalSample = vec3(0.0, 0.0, 1.0);
{
uint texIdx = uint(md.albedoMap_);
albedo = texture( textures[nonuniformEXT(texIdx)], uvw.xy);
}
{
uint texIdx = uint(md.normalMap_);
normalSample = texture( textures[nonuniformEXT(texIdx)], uvw.xy).xyz;
}
runAlphaTest(albedo.a, md.alphaTest_);
To avoid dealing with any kind of scene sorting at this point, alpha transparency is simulated using dithering and punch-through transparency. You can find some useful insights at http://alex-charlton.com/posts/Dithering_on_the_GPU. The following is the final solution:
void runAlphaTest(float alpha, float alphaThreshold) {
if (alphaThreshold == 0.0) return;
mat4 thresholdMatrix = mat4(
   1.0/17.0,  9.0/17.0,  3.0/17.0, 11.0/17.0,
  13.0/17.0,  5.0/17.0, 15.0/17.0,  7.0/17.0,
   4.0/17.0, 12.0/17.0,  2.0/17.0, 10.0/17.0,
  16.0/17.0,  8.0/17.0, 14.0/17.0,  6.0/17.0);
int x = int(mod(gl_FragCoord.x, 4.0));
int y = int(mod(gl_FragCoord.y, 4.0));
alpha = clamp( alpha - 0.5 * thresholdMatrix[x][y], 0.0, 1.0);
if (alpha < alphaThreshold) discard;
}
vec3 n = normalize(v_worldNormal);
if (length(normalSample) > 0.5)
n = perturbNormal(n, normalize(ubo.cameraPos.xyz - v_worldPos.xyz), normalSample, uvw.xy);
vec3 lightDir = normalize(vec3(-1.0, -1.0, 0.1));
float NdotL = clamp( dot(n, lightDir), 0.3, 1.0 );
outColor = vec4( albedo.rgb * NdotL + emission.rgb, 1.0 );
}
The next recipe will show you how to extract and pack the values from the Assimp library's aiMaterial structure into our MaterialData structure.
In Chapter 5, Working with Geometry Data, we learned how to define a runtime data storage format for mesh geometry. This recipe will show you how to use the Assimp library to extract material properties from Assimp data structures. Combined with the next recipe, which will cover our SceneConverter tool, this concludes the process of describing our data content exporting pipeline.
In the previous recipe, we learned how to render multiple meshes with different materials. Now, it is time to learn how to import the material data from popular 3D asset formats.
Let's take a look at the convertAIMaterialToDescription() function that's used in the SceneConverter tool. It retrieves all the required parameters from the aiMaterial structure and returns a MaterialDescription object that can be used with our GLSL shaders:
MaterialDescription convertAIMaterialToDescription( const aiMaterial* M, std::vector<std::string>& files, std::vector<std::string>& opacityMaps)
{
MaterialDescription D;
aiColor4D Color;
if ( aiGetMaterialColor(M, AI_MATKEY_COLOR_AMBIENT, &Color) == AI_SUCCESS ) {
D.emissiveColor_ = { Color.r, Color.g, Color.b, Color.a };
if ( D.emissiveColor_.w > 1.0f )
D.emissiveColor_.w = 1.0f;
}
The first parameter we are trying to extract is the "ambient" color, which is stored in the emissiveColor_ field of MaterialDescription. The alpha value is clamped to 1.0.
if ( aiGetMaterialColor(M, AI_MATKEY_COLOR_DIFFUSE, &Color) == AI_SUCCESS ) {
D.albedoColor_ = { Color.r, Color.g, Color.b, Color.a };
if ( D.albedoColor_.w > 1.0f ) D.albedoColor_.w = 1.0f;
}
if (aiGetMaterialColor(M, AI_MATKEY_COLOR_EMISSIVE, &Color) == AI_SUCCESS ) {
D.emissiveColor_.x += Color.r; D.emissiveColor_.y += Color.g; D.emissiveColor_.z += Color.b; D.emissiveColor_.w += Color.a;
if ( D.emissiveColor_.w > 1.0f )
D.albedoColor_.w = 1.0f;
}
const float opaquenessThreshold = 0.05f;
float Opacity = 1.0f;
In our conversion routine, we are using one simple optimization trick for transparent materials: anything with an opaqueness of 95% or more is considered opaque and avoids any blending.
if ( aiGetMaterialFloat(M, AI_MATKEY_OPACITY, &Opacity) == AI_SUCCESS ) {
D.transparencyFactor_ = glm::clamp(1.0f-Opacity, 0.0f, 1.0f);
if ( D.transparencyFactor_ >= 1.0f - opaquenessThreshold )
D.transparencyFactor_ = 0.0f;
}
if ( aiGetMaterialColor(M, AI_MATKEY_COLOR_TRANSPARENT, &Color) == AI_SUCCESS ) {
const float Opacity = std::max(std::max(Color.r, Color.g), Color.b);
D.transparencyFactor_ = glm::clamp( Opacity, 0.0f, 1.0f );
if ( D.transparencyFactor_ >= 1.0f - opaquenessThreshold )
D.transparencyFactor_ = 0.0f;
D.alphaTest_ = 0.5f;
}
float tmp = 1.0f;
if (aiGetMaterialFloat(M, AI_MATKEY_GLTF_PBRMETALLICROUGHNESS_METALLIC_FACTOR, &tmp) == AI_SUCCESS)
D.metallicFactor_ = tmp;
if (aiGetMaterialFloat(M, AI_MATKEY_GLTF_PBRMETALLICROUGHNESS_ROUGHNESS_FACTOR, &tmp) == AI_SUCCESS)
D.roughness_ = { tmp, tmp, tmp, tmp };
aiString Path;
aiTextureMapping Mapping;
unsigned int UVIndex = 0;
float Blend = 1.0f;
aiTextureOp TextureOp = aiTextureOp_Add;
const aiTextureMapMode TextureMapMode[2] = { aiTextureMapMode_Wrap, aiTextureMapMode_Wrap };
unsigned int TextureFlags = 0;
This function requires several parameters, most of which we will ignore in our converter for the sake of simplicity.
if (aiGetMaterialTexture( M, aiTextureType_EMISSIVE, 0, &Path,&Mapping, &UVIndex, &Blend, &TextureOp, TextureMapMode,&TextureFlags ) == AI_SUCCESS)
D.emissiveMap_ = addUnique(files, Path.C_Str());
if (aiGetMaterialTexture( M, aiTextureType_DIFFUSE, 0, &Path,&Mapping, &UVIndex, &Blend, &TextureOp, TextureMapMode, &TextureFlags ) == AI_SUCCESS)
D.albedoMap_ = addUnique(files, Path.C_Str());
if (aiGetMaterialTexture( M, aiTextureType_NORMALS, 0, &Path,&Mapping, &UVIndex, &Blend, &TextureOp, TextureMapMode, &TextureFlags) == AI_SUCCESS)
D.normalMap_ = addUnique(files, Path.C_Str());
if (D.normalMap_ == 0xFFFFFFFF)
if (aiGetMaterialTexture( M, aiTextureType_HEIGHT, 0, &Path,&Mapping, &UVIndex, &Blend, &TextureOp, TextureMapMode,
&TextureFlags ) == AI_SUCCESS)
D.normalMap_ = addUnique(files, Path.C_Str());
if (aiGetMaterialTexture( M, aiTextureType_OPACITY, 0, &Path,&Mapping, &UVIndex, &Blend, &TextureOp, TextureMapMode, &TextureFlags ) == AI_SUCCESS) {
D.opacityMap_ = addUnique(opacityMaps, Path.C_Str());
D.alphaTest_ = 0.5f;
}
aiString Name;
std::string materialName;
if (aiGetMaterialString(M, AI_MATKEY_NAME, &Name) == AI_SUCCESS)
materialName = Name.C_Str();
if (materialName.find("Glass") != std::string::npos)
D.alphaTest_ = 0.75f;
if (materialName.find("Bottle") != std::string::npos)
D.alphaTest_ = 0.54f;
return D;
}
int addUnique(std::vector<std::string>& files, const std::string& file)
{
if (file.empty()) return -1;
auto i = std::find(std::begin(files), std::end(files), file);
if (i != files.end())
return (int)std::distance(files.begin(), i);
files.push_back(file);
return (int)files.size() - 1;
}
Before we move on, let's take a look at how to implement all the helper routines necessary for our scene converter tool, which will be described in the next recipe. The convertAndDownscaleAllTextures() function is used to generate the internal filenames for each of the textures and convert the contents of each texture into a GPU-compatible format. Let's take a look:
void convertAndDownscaleAllTextures( const std::vector<MaterialDescription>& materials, const std::string& basePath, std::vector<std::string>& files, std::vector<std::string>& opacityMaps)
{
std::unordered_map<std::string, uint32_t> opacityMapIndices(files.size());
for (const auto& m : materials)
if (m.opacityMap_ != 0xFFFFFFFF && m.albedoMap_ != 0xFFFFFFFF)
opacityMapIndices[files[m.albedoMap_]] = m.opacityMap_;
auto converter = [&](const std::string& s) -> std::string {
return convertTexture( s, basePath, opacityMapIndices, opacityMaps);
};
std::transform(std::execution::par, std::begin(files), std::end(files), std::begin(files), converter);
}
The std::execution::par policy, declared in the <execution> header, is a C++17 feature that allows us to process the array in parallel. Since converting the texture data is a rather lengthy process, this straightforward parallelization reduces our processing time significantly.
A single texture map is converted into our runtime data format with the following routine:
std::string convertTexture(const std::string& file, const std::string& basePath, std::unordered_map<std::string, uint32_t>& opacityMapIndices, const std::vector<std::string>& opacityMaps)
{
const int maxNewWidth = 512;
const int maxNewHeight = 512;
std::vector<uint8_t> tmpImage( maxNewWidth * maxNewHeight * 4);
const auto srcFile = replaceAll(basePath + file, "\\", "/");
const auto newFile = std::string("data/out_textures/") + lowercaseString(replaceAll(replaceAll( srcFile, "..", "__"), "/", "__") + std::string("__rescaled")) + std::string(".png");
int texWidth, texHeight, texChannels;
stbi_uc* pixels = stbi_load(fixTextureFile(srcFile).c_str(), &texWidth, &texHeight, &texChannels, STBI_rgb_alpha);
uint8_t* src = pixels;
texChannels = STBI_rgb_alpha;
Note
The fixTextureFile() function fixes situations where 3D model material data references texture files with inappropriate case in filenames. For example, the .mtl file may contain map_Ka Texture01.png, while the actual filename on the file system is called texture01.png. This way, we can fix naming inconsistencies in the Bistro scene on Linux.
if (!src) {
printf("Failed to load [%s] texture ", srcFile.c_str());
texWidth = maxNewWidth;
texHeight = maxNewHeight;
src = tmpImage.data();
}
if (opacityMapIndices.count(file) > 0) {
const auto opacityMapFile = replaceAll(basePath + opacityMaps[opacityMapIndices[file]], "\\", "/");
int opacityWidth, opacityHeight;
stbi_uc* opacityPixels = stbi_load(opacityMapFile.c_str(), &opacityWidth, &opacityHeight, nullptr, 1);
if (!opacityPixels) {
printf("Failed to load opacity mask [%s] ", opacityMapFile.c_str());
}
assert(opacityPixels);
assert(texWidth == opacityWidth);
assert(texHeight == opacityHeight);
for (int y = 0; y != opacityHeight; y++)
for (int x = 0; x != opacityWidth; x++)
src[(y * opacityWidth + x) * texChannels + 3] = opacityPixels[y * opacityWidth + x];
stbi_image_free(opacityPixels);
}
const uint32_t imgSize = texWidth * texHeight * texChannels;
std::vector<uint8_t> mipData(imgSize);
uint8_t* dst = mipData.data();
const int newW = std::min(texWidth, maxNewWidth);
const int newH = std::min(texHeight, maxNewHeight);
stbir_resize_uint8(src, texWidth, texHeight, 0, dst, newW, newH, 0, texChannels);
stbi_write_png( newFile.c_str(), newW, newH, texChannels, dst, 0);
if (pixels) stbi_image_free(pixels);
return newFile;
}
This way, we ensure that if the conversion tool has completed without errors, the converted dataset is always valid and requires significantly fewer runtime checks.
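The fixTextureFile() function mentioned in the note above can be implemented in several ways; a hypothetical sketch using std::filesystem (from the <filesystem> header), with lowercaseString() reused from the code above, might look like this:

std::string fixTextureFile(const std::string& file) {
  namespace fs = std::filesystem;
  if (fs::exists(file)) return file;
  const fs::path p(file);
  if (p.parent_path().empty() || !fs::exists(p.parent_path()))
    return file;
  for (const auto& e : fs::directory_iterator(p.parent_path()))
    if (lowercaseString(e.path().filename().string()) ==
        lowercaseString(p.filename().string()))
      return e.path().string(); // same name, different case
  return file;
}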
This relatively long recipe has shown all the necessary routines for retrieving material and texture data from external 3D assets. To learn how these functions are used in real code, let's jump to the next recipe, Implementing a scene conversion tool. The previous recipe, Implementing a material system, showed you how to use the imported materials with GLSL shaders.
In Chapter 5, Working with Geometry Data, we implemented a geometry conversion tool capable of loading meshes in various formats supported by the Assimp library, such as .gltf or .obj, and storing them in our runtime format, which is suitable for fast loading and rendering. In this recipe, we will extend this tool into a full scene converter that will handle all our materials and textures. Let's get started and learn how to do this.
The source code for the scene conversion tool described in this chapter can be found in the Chapter7/SceneConverter folder. The entire project is covered in this recipe. If you want to start with a simpler version of the tool that only deals with geometry data, take a look at the Implementing a geometry conversion tool recipe in Chapter 5, Working with Geometry Data.
Before we look at this recipe, make sure you're familiar with the Implementing a material system and Importing materials with Assimp recipes of this chapter.
Our geometry conversion tool takes its configuration from a .json file that, for the Lumberyard Bistro mesh used in this book, looks like this:
[{ "input_scene": "deps/src/bistro/Exterior/exterior.obj", "output_mesh": "data/meshes/test.meshes", "output_scene": "data/meshes/test.scene", "output_materials": "data/meshes/test.materials", "output_boxes": "data/meshes/test.boxes", "scale": 0.01, "calculate_LODs": false, "merge_instances": true },
{ "input_scene": "deps/src/bistro/Interior/interior.obj", "output_mesh": "data/meshes/test2.meshes", "output_scene": "data/meshes/test2.scene", "output_materials": "data/meshes/test2.materials", "output_boxes": "data/meshes/test2.boxes", "scale": 0.01, "calculate_LODs": false, "merge_instances": true }]
To parse this configuration file, we are going to use the RapidJSON library, which can be found on GitHub at https://github.com/Tencent/rapidjson.
First, we should take a look at how to implement the .json parsing step:
struct SceneConfig {
std::string fileName;
std::string outputMesh;
std::string outputScene;
std::string outputMaterials;
std::string outputBoxes;
float scale;
bool calculateLODs;
bool mergeInstances;
};
std::vector<SceneConfig> readConfigFile( const char* cfgFileName) {
std::ifstream ifs(cfgFileName);
rapidjson::IStreamWrapper isw(ifs);
rapidjson::Document document;
const rapidjson::ParseResult parseResult = document.ParseStream(isw);
assert(!parseResult.IsError());
std::vector<SceneConfig> configList;
for (rapidjson::SizeType i = 0; i < document.Size(); i++) {
configList.emplace_back(SceneConfig {
  .fileName = document[i]["input_scene"].GetString(),
  .outputMesh = document[i]["output_mesh"].GetString(),
  .outputScene = document[i]["output_scene"].GetString(),
  .outputMaterials = document[i]["output_materials"].GetString(),
  .outputBoxes = document[i].HasMember("output_boxes") ? document[i]["output_boxes"].GetString() : std::string(),
  .scale = (float)document[i]["scale"].GetDouble(),
  .calculateLODs = document[i]["calculate_LODs"].GetBool(),
  .mergeInstances = document[i]["merge_instances"].GetBool()
});
}
return configList;
}
int main() {
fs::create_directory("data/out_textures");
const auto configs = readConfigFile("data/sceneconverter.json");
for (const auto& cfg: configs)
processScene(cfg);
return 0;
}
The actual heavy lifting is done inside processScene(). It loads a single scene file using Assimp and converts all the data into formats suitable for rendering. Let's look deeper to see how this is done:
std::vector<Mesh> g_meshes;
std::vector<BoundingBox> g_boxes;
std::vector<uint32_t> g_indexData;
std::vector<float> g_vertexData;
uint32_t g_indexOffset = 0;
uint32_t g_vertexOffset = 0;
void processScene(const SceneConfig& cfg) {
  g_meshes.clear();
  g_boxes.clear();
  g_indexData.clear();
  g_vertexData.clear();
  g_indexOffset = 0;
  g_vertexOffset = 0;
  Scene ourScene; // the scene graph assembled by traverse() below
const size_t pathSeparator = cfg.fileName.find_last_of("/\\");
const string basePath = (pathSeparator != string::npos) ? cfg.fileName.substr(0, pathSeparator + 1) : "";
The texture files referenced by the materials are stored relative to the scene file's folder, which is why we extract the base path here. We are going to need it later, when we deal with the textures.
const unsigned int flags = 0 | aiProcess_JoinIdenticalVertices | aiProcess_Triangulate | aiProcess_GenSmoothNormals | aiProcess_LimitBoneWeights | aiProcess_SplitLargeMeshes | aiProcess_ImproveCacheLocality | aiProcess_RemoveRedundantMaterials | aiProcess_FindDegenerates | aiProcess_FindInvalidData | aiProcess_GenUVCoords;
const aiScene* scene = aiImportFile(cfg.fileName.c_str(), flags);
g_meshes.reserve(scene->mNumMeshes);
for (unsigned int i = 0; i != scene->mNumMeshes; i++) {
Mesh mesh = convertAIMesh(scene->mMeshes[i], cfg);
g_meshes.push_back(mesh);
if (!cfg.outputBoxes.empty()) {
BoundingBox box = calculateBoundingBox( g_vertexData.data()+mesh.vertexOffset, mesh.vertexCount);
g_boxes.push_back(box);
}
}
saveMeshesToFile(cfg.outputMesh.c_str());
if (!cfg.outputBoxes.empty())
saveBoundingBoxes( cfg.outputBoxes.c_str(), g_boxes);
std::vector<MaterialDescription> materials;
std::vector<std::string>& materialNames = ourScene.materialNames_;
std::vector<std::string> files;
std::vector<std::string> opacityMaps;
for (unsigned int m = 0; m < scene->mNumMaterials; m++) {
aiMaterial* mm = scene->mMaterials[m];
materialNames.push_back( std::string(mm->GetName().C_Str()));
MaterialDescription matDescription = convertAIMaterialToDescription( mm, files, opacityMaps);
materials.push_back(matDescription);
}
convertAndDownscaleAllTextures( materials, basePath, files, opacityMaps);
saveMaterials( cfg.outputMaterials.c_str(), materials, files);
traverse(scene, ourScene, scene->mRootNode, -1, 0);
saveScene(cfg.outputScene.c_str(), ourScene);
}
At this point, the data is ready for rendering. The output from running the conversion tool should look as follows:
Loading scene from 'deps/src/bistro/Exterior/exterior.obj'...
Converting meshes 1/22388...
... skipped ...
Loading scene from 'deps/src/bistro/Interior/interior.obj'...
Converting meshes 1/2381...
... skipped ...
If everything works as planned, the tool will output the converted mesh data to data/meshes and the packed textures to data/out_textures.
Our texture conversion code goes through all the textures, downscales them to 512x512 where necessary, and saves them as RGBA .png files. In a real-world content pipeline, this conversion process may include a texture compression phase. We recommend that you implement this as an exercise using the ETC2Comp library described in Chapter 2, Using Essential Libraries. Adding texture compression code directly to the convertTexture() function in Chapter7/SceneConverter/src/main.cpp should be the easiest way to go about this.
In the previous chapters, we implemented individual manual management for Vulkan resources in all our rendering classes. This recipe describes the system that manages all Vulkan-related objects and provides utility functions to create entities such as offscreen framebuffers, render passes, pipelines, textures, and storage buffers. All the functions described here will be used in the subsequent recipes.
The largest part of our resource management scheme, which includes the descriptor set creation and update routines, is not included in this recipe. See the Unifying descriptor set creation routines recipe for additional implementation details.
The VulkanResources class contains a list of all the Vulkan objects. Its private part, along with a reference to VulkanRenderDevice, contains various std::vector members for storing our whole safari park of Vulkan objects. Let's take a look:
struct VulkanResources {
private:
VulkanRenderDevice& vkDev;
std::vector<VulkanTexture> allTextures;
std::vector<VulkanBuffer> allBuffers;
std::vector<VkFramebuffer> allFramebuffers;
std::vector<VkRenderPass> allRenderPasses;
std::vector<VkPipelineLayout> allPipelineLayouts;
std::vector<VkPipeline> allPipelines;
std::vector<VkDescriptorSetLayout> allDSLayouts;
std::vector<VkDescriptorPool> allDPools;
public:
explicit VulkanResources(VulkanRenderDevice& vkDev)
: vkDev(vkDev) {}
~VulkanResources() {
for (auto& t: allTextures)
destroyVulkanTexture(vkDev.device, t);
for (auto& b: allBuffers) {
vkDestroyBuffer( vkDev.device, b.buffer, nullptr);
vkFreeMemory(vkDev.device, b.memory, nullptr);
}
for (auto& fb: allFramebuffers)
vkDestroyFramebuffer(vkDev.device, fb, nullptr);
for (auto& rp: allRenderPasses)
vkDestroyRenderPass(vkDev.device, rp, nullptr);
for (auto& ds: allDSLayouts)
vkDestroyDescriptorSetLayout( vkDev.device, ds, nullptr);
for (auto& pl: allPipelineLayouts)
vkDestroyPipelineLayout( vkDev.device, pl, nullptr);
for (auto& p: allPipelines)
vkDestroyPipeline(vkDev.device, p, nullptr);
for (auto& dpool: allDPools)
vkDestroyDescriptorPool( vkDev.device, dpool, nullptr);
}
inline void registerFramebuffer(VkFramebuffer fb) {
allFramebuffers.push_back(fb);
}
inline void registerRenderPass(VkRenderPass rp) {
allRenderPasses.push_back(rp);
}
In our previous examples, we loaded the textures in an ad hoc fashion, as well as created the image and sampler. Here, we will wrap the texture file loading code in a single method:
VulkanTexture loadTexture2D(const char* filename) {
VulkanTexture tex;
if (!createTextureImage(vkDev, filename, tex.image.image, tex.image.imageMemory)) {
printf("Cannot load %s 2D texture file ", filename);
exit(EXIT_FAILURE);
}
VkFormat format = VK_FORMAT_R8G8B8A8_UNORM;
transitionImageLayout(vkDev, tex.image.image, format, VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL);
if (!createImageView(vkDev.device, tex.image.image, format, VK_IMAGE_ASPECT_COLOR_BIT, &tex.image.imageView)) {
printf("Cannot create image view for 2d texture (%s) ", filename);
exit(EXIT_FAILURE);
}
createTextureSampler(vkDev.device, &tex.sampler);
allTextures.push_back(tex);
return tex;
}
Along with loadTexture2D(), three other loading methods are provided for different types of textures:
VulkanTexture loadCubeMap( const char* fileName, uint32_t mipLevels);
VulkanTexture loadKTX(const char* fileName);
VulkanTexture createFontTexture(const char* fontFile);
The source code for loadCubeMap() is located in the UtilsVulkanPBRModelRenderer.cpp file. The only difference, as with all the loading routines, is that we are adding the created VulkanTexture to our allTextures container so that it will be deleted at the end of our program. The loadKTX() function is similar to the KTX file loading process that's described in the constructor of PBRModelRenderer.
After loading the texture data, we must create an image view in the VK_FORMAT_R16G16_SFLOAT format and add the created VulkanTexture to our allTextures array. The code for the createFontTexture() method can be found in the UtilsVulkanImGui.cpp file.
Let's look at some other helper functions that will make dealing with Vulkan objects somewhat easier:
VulkanBuffer addBuffer(VkDeviceSize size, VkBufferUsageFlags usage, VkMemoryPropertyFlags properties)
{
VulkanBuffer buffer = { .buffer = VK_NULL_HANDLE, .size = 0, .memory = VK_NULL_HANDLE };
if (!createSharedBuffer(vkDev, size, usage, properties, buffer.buffer, buffer.memory)) {
printf("Cannot allocate buffer ");
exit(EXIT_FAILURE);
} else {
buffer.size = size;
allBuffers.push_back(buffer);
}
return buffer;
}
VulkanTexture addColorTexture( int texWidth, int texHeight, VkFormat colorFormat)
{
const uint32_t w = (texWidth > 0) ? texWidth : vkDev.framebufferWidth;
const uint32_t h = (texHeight > 0) ? texHeight : vkDev.framebufferHeight;
VulkanTexture res = { .width = w, .height = h, .depth = 1, .format = colorFormat };
if (!createOffscreenImage(vkDev, res.image.image, res.image.imageMemory, w, h, colorFormat, 1, 0)) {
printf("Cannot create color texture ");
exit(EXIT_FAILURE);
}
createImageView(vkDev.device, res.image.image, colorFormat, VK_IMAGE_ASPECT_COLOR_BIT, &res.image.imageView);
createTextureSampler(vkDev.device, &res.sampler);
transitionImageLayout(vkDev, res.image.image, colorFormat, VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL);
allTextures.push_back(res);
return res;
}
Rendering to an offscreen depth texture is used for shadow mapping and approximating ambient occlusion. The routine is almost the same as addColorTexture(), but depthFormat and image usage flags must be different. We must also explicitly specify the image layout to avoid performance warnings from validation layers. Let's take a look:
VulkanTexture addDepthTexture(int texWidth, int texHeight, VkImageLayout layout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL)
{
const uint32_t w = (texWidth > 0) ? texWidth : vkDev.framebufferWidth;
const uint32_t h = (texHeight > 0) ? texHeight : vkDev.framebufferHeight;
const VkFormat depthFormat = findDepthFormat(vkDev.physicalDevice);
VulkanTexture depth = {
.width = w, .height = h, .depth = 1, .format =
depthFormat
};
if (!createImage(vkDev.device, vkDev.physicalDevice, w, h, depthFormat, VK_IMAGE_TILING_OPTIMAL, VK_IMAGE_USAGE_SAMPLED_BIT | VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, depth.image.image, depth.image.imageMemory)) {
printf("Cannot create depth texture ");
exit(EXIT_FAILURE);
}
createImageView(vkDev.device, depth.image.image, depthFormat, VK_IMAGE_ASPECT_DEPTH_BIT, &depth.image.imageView);
transitionImageLayout(vkDev, depth.image.image, depthFormat, VK_IMAGE_LAYOUT_UNDEFINED, layout);
if (!createDepthSampler( vkDev.device, &depth.sampler)) {
printf("Cannot create a depth sampler");
exit(EXIT_FAILURE);
}
allTextures.push_back(depth);
return depth;
}
struct RenderPass {
RenderPass() = default;
explicit RenderPass(VulkanRenderDevice& vkDev, bool useDepth = true, const RenderPassCreateInfo& ci = RenderPassCreateInfo()): info(ci)
{
if (!createColorAndDepthRenderPass( vkDev, useDepth, &handle, ci)) {
printf("Failed to create render pass\n");
exit(EXIT_FAILURE);
}
}
RenderPassCreateInfo info;
VkRenderPass handle = VK_NULL_HANDLE;
};
Creating the framebuffer is a frequent operation, so to make our rendering initialization code shorter, we must implement the addFramebuffer() function, which takes a render pass object and a list of attachments to create a framebuffer:
VkFramebuffer addFramebuffer(RenderPass renderPass, const std::vector<VulkanTexture>& images)
{
VkFramebuffer framebuffer;
std::vector<VkImageView> attachments;
for (const auto& i: images)
attachments.push_back(i.image.imageView);
VkFramebufferCreateInfo fbInfo = {
  .sType = VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO,
  .pNext = nullptr,
  .flags = 0,
  .renderPass = renderPass.handle,
  .attachmentCount = (uint32_t)attachments.size(),
  .pAttachments = attachments.data(),
  .width = images[0].width,
  .height = images[0].height,
  .layers = 1
};
if (vkCreateFramebuffer( vkDev.device, &fbInfo, nullptr, &framebuffer) != VK_SUCCESS) {
printf("Unable to create offscreen framebuffer ");
exit(EXIT_FAILURE);
}
allFramebuffers.push_back(framebuffer);
return framebuffer;
}
RenderPass addRenderPass( const std::vector<VulkanTexture>& outputs, const RenderPassCreateInfo ci = { .clearColor_ = true, .clearDepth_ = true, .flags_ = eRenderPassBit_Offscreen | eRenderPassBit_First }, bool useDepth = true)
{
VkRenderPass renderPass;
if (outputs.empty()) {
printf("Empty list of output attachments for RenderPass ");
exit(EXIT_FAILURE);
}
if (outputs.size() == 1) {
if (!createColorOnlyRenderPass( vkDev, &renderPass, ci, outputs[0].format)) {
printf("Unable to create offscreen color-only pass ");
exit(EXIT_FAILURE);
}
} else {
if (!createColorAndDepthRenderPass( vkDev, useDepth && (outputs.size() > 1), &renderPass, ci, outputs[0].format)) {
printf("Unable to create offscreen render pass ");
exit(EXIT_FAILURE);
}
}
allRenderPasses.push_back(renderPass);
RenderPass rp;
rp.info = ci;
rp.handle = renderPass;
return rp;
}
RenderPass addDepthRenderPass( const std::vector<VulkanTexture>& outputs, const RenderPassCreateInfo ci = { .clearColor_ = false, .clearDepth_ = true, .flags_ = eRenderPassBit_Offscreen | eRenderPassBit_First })
{
VkRenderPass renderPass;
if (!createDepthOnlyRenderPass( vkDev, &renderPass, ci)) {
printf("Unable to create offscreen render pass ");
exit(EXIT_FAILURE);
}
allRenderPasses.push_back(renderPass);
RenderPass rp;
rp.info = ci;
rp.handle = renderPass;
return rp;
}
std::vector<VkFramebuffer> addFramebuffers( VkRenderPass renderPass, VkImageView depthView = VK_NULL_HANDLE)
{
std::vector<VkFramebuffer> framebuffers;
createColorAndDepthFramebuffers(vkDev, renderPass, depthView, framebuffers);
for (auto f : framebuffers)
allFramebuffers.push_back(f);
return framebuffers;
}
RenderPass addFullScreenPass( bool useDepth = true, const RenderPassCreateInfo& ci = RenderPassCreateInfo())
{
RenderPass result(vkDev, useDepth, ci);
allRenderPasses.push_back(result.handle);
return result;
}
struct PipelineInfo {
uint32_t width = 0;
uint32_t height = 0;
VkPrimitiveTopology topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST;
bool useDepth = true;
bool useBlending = true;
bool dynamicScissorState = false;
};
VkPipelineLayout addPipelineLayout( VkDescriptorSetLayout dsLayout, uint32_t vtxConstSize = 0, uint32_t fragConstSize = 0)
{
VkPipelineLayout pipelineLayout;
if (!createPipelineLayoutWithConstants( vkDev.device, dsLayout, &pipelineLayout, vtxConstSize, fragConstSize)) {
printf("Cannot create pipeline layout ");
exit(EXIT_FAILURE);
}
allPipelineLayouts.push_back(pipelineLayout);
return pipelineLayout;
}
VkPipeline addPipeline(
VkRenderPass renderPass,
VkPipelineLayout pipelineLayout,
const std::vector<const char*>& shaderFiles,
const PipelineInfo& pipelineParams = PipelineInfo { .width = 0, .height = 0, .topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST, .useDepth = true, .useBlending = false, .dynamicScissorState = false })
{
VkPipeline pipeline;
if (!createGraphicsPipeline(vkDev, renderPass, pipelineLayout, shaderFiles, &pipeline, pipelineParams.topology, pipelineParams.useDepth, pipelineParams.useBlending, pipelineParams.dynamicScissorState, pipelineParams.width, pipelineParams.height)) {
printf("Cannot create graphics pipeline\n");
exit(EXIT_FAILURE);
}
allPipelines.push_back(pipeline);
return pipeline;
}
The only instance of the VulkanResources class resides in the VulkanRenderContext structure, which will be described in the next recipe. All the resources are deleted strictly before the global VkDevice object encapsulated in VulkanRenderDevice is destroyed.
The examples in this chapter heavily rely on indirect rendering, so individual mesh rendering is hidden within our scene graph. However, if you wish to update the sample code from Chapter 3, Getting Started with OpenGL and Vulkan, and use it for direct mesh geometry manipulation, the addVertexBuffer() method has been provided. The mesh geometry uploading code is similar to the createTexturedVertexBuffer() and createPBRVertexBuffer() functions we described in previous chapters:
VulkanBuffer addVertexBuffer(uint32_t indexBufferSize, const void* indexData, uint32_t vertexBufferSize, const void* vertexData)
{
VulkanBuffer result;
result.size = allocateVertexBuffer(vkDev, &result.buffer, &result.memory, vertexBufferSize, vertexData, indexBufferSize, indexData);
allBuffers.push_back(result);
return result;
}
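A quick usage sketch with a single triangle (the data here is illustrative):

const std::vector<float> vertices = { 0.0f, 0.0f, 0.0f,  1.0f, 0.0f, 0.0f,  0.0f, 1.0f, 0.0f };
const std::vector<uint32_t> indices = { 0, 1, 2 };
VulkanBuffer buf = resources.addVertexBuffer(
  (uint32_t)(indices.size() * sizeof(uint32_t)), indices.data(),
  (uint32_t)(vertices.size() * sizeof(float)), vertices.data());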
The last important issue in terms of resource management is the descriptor set creation routines. This will be covered in the Unifying descriptor set creation routines recipe of this chapter.
We typically use VulkanResources in the constructors of different Renderer classes. The Putting it all together into a Vulkan application recipe will show you how our resource management fits the general application code.
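As a hedged sketch of that pattern (the renderer name and its members are illustrative, not taken from the book's code):

struct PostprocessRenderer {
  explicit PostprocessRenderer(VulkanResources& resources)
  : texture_(resources.loadTexture2D("data/input.png"))
  , uniforms_(resources.addBuffer(sizeof(glm::mat4),
      VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT,
      VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT))
  {} // all cleanup is handled by ~VulkanResources()
  VulkanTexture texture_;
  VulkanBuffer uniforms_;
};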
Starting from Chapter 3, Getting Started with OpenGL and Vulkan, we introduced an ad hoc rendering loop for each demo application, which resulted in significant code duplication. Let's revisit this topic and learn how to create multiple rendering passes for Vulkan without too much boilerplate code.
Before completing this recipe, make sure to revisit the Putting it all together into a Vulkan application recipe of Chapter 3, Getting Started with OpenGL and Vulkan, as well as all the related recipes.
The goal of this recipe is to improve the rendering framework to avoid code repetition in renderers, as well as to simplify our rendering setup. As a useful side effect, we will end up with a system capable of setting up and composing multiple rendering passes without too much hassle, which we will use in the next recipe.
The main function for all our upcoming demos should consist of just three lines:
int main() {
MyApp app;
app.mainLoop();
return 0;
}
Let's take a look at how to organize the MyApp class for this purpose:
class MyApp: public VulkanApp {
public:
MyApp()
// ... field initializer list ...
{
  // ... rendering sequence setup ...
}
The base class constructor creates a GLFW window and initializes a Vulkan rendering surface, just like we did previously throughout Chapter 3, Getting Started with OpenGL and Vulkan, to Chapter 6, Physically Based Rendering Using the glTF2 Shading Model.
void draw3D() override {
  // ... whatever render-control commands are required ...
}
void drawUI() override {
  // ... ImGui commands ...
}
void update(float deltaSeconds) override {
  // ... update whatever needs to be updated ...
}
private:
... Vulkan buffers, scene geometry, textures etc....
... e.g., some texture: VulkanTexture envMap ...
... whatever renderers an app needs ...
MultiRenderer multiRenderer;
GuiRenderer imgui;
};
Having said this, let's see how the VulkanApp class wraps all the initialization and uses the previously defined VulkanResources.
The application class relies on previously developed functions and some new items that we must describe before implementing VulkanApp itself. In the previous chapters, we used VulkanRenderDevice as a simple C structure and called all the initialization routines explicitly in every sample. Following the C++ resource acquisition is initialization (RAII) paradigm, we must wrap these calls with the constructors and destructors of the helper class:
struct VulkanContextCreator {
VulkanInstance& instance;
VulkanRenderDevice& vkDev;
VulkanContextCreator(VulkanInstance& vk, VulkanRenderDevice& dev, void* window, int screenWidth, int screenHeight):instance(vk), vkDev(dev)
{
createInstance(&vk.instance);
if (!setupDebugCallbacks(vk.instance, &vk.messenger, vk.reportCallback) ||
    glfwCreateWindowSurface(vk.instance, (GLFWwindow*)window, nullptr, &vk.surface) ||
    !initVulkanRenderDevice3(vk, dev, screenWidth, screenHeight))
exit(EXIT_FAILURE);
}
~VulkanContextCreator() {
destroyVulkanRenderDevice(vkDev);
destroyVulkanInstance(instance);
}
};
The Vulkan instance and device alone are not enough to render anything: we must declare a basic rendering interface and combine multiple renderers in one frame.
In the previous chapter, we figured out one way to implement a generic interface for a Vulkan renderer. Let's take a look once more:
struct Renderer {
Renderer(VulkanRenderContext& c);
virtual void fillCommandBuffer( VkCommandBuffer cmdBuffer, size_t currentImage, VkFramebuffer fb = VK_NULL_HANDLE, VkRenderPass rp = VK_NULL_HANDLE) = 0;
virtual void updateBuffers(size_t currentImage) {}
};
The details of our implementation are provided in the subsequent recipe, Working with rendering passes.
Each renderer in a frame is wrapped in a RenderItem structure, which tells the frame composer whether the pass is enabled and whether it uses the depth buffer:
struct RenderItem {
Renderer& renderer_;
bool enabled_ = true;
bool useDepth_ = true;
explicit RenderItem( Renderer& r, bool useDepth = true)
: renderer_(r)
, useDepth_(useDepth)
{}
};
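Because RenderItem stores a reference, registering a renderer amounts to emplacing it into the onScreenRenderers_ list of the render context shown next. For example, a UI overlay that does not need the depth buffer could be registered like this (imgui here is the GuiRenderer member from the MyApp sketch above):
onScreenRenderers_.emplace_back(imgui, false);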
struct VulkanRenderContext {
VulkanInstance vk;
VulkanRenderDevice vkDev;
VulkanContextCreator ctxCreator;
VulkanResources resources;
std::vector<RenderItem> onScreenRenderers_;
VulkanTexture depthTexture;
RenderPass screenRenderPass;
RenderPass screenRenderPass_NoDepth;
RenderPass clearRenderPass, finalRenderPass;
std::vector<VkFramebuffer> swapchainFramebuffers;
std::vector<VkFramebuffer> swapchainFramebuffers_NoDepth;
VulkanRenderContext(void* window, uint32_t screenWidth, uint32_t screenHeight)
: ctxCreator(vk, vkDev, window, screenWidth, screenHeight)
, resources(vkDev)
, depthTexture(resources.addDepthTexture(0, 0, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL))
, screenRenderPass(resources.addFullScreenPass())
, screenRenderPass_NoDepth( resources.addFullScreenPass(false))
, finalRenderPass(resources.addFullScreenPass( true, RenderPassCreateInfo { .clearColor_ = false, .clearDepth_ = false, .flags_ = eRenderPassBit_Last }))
, clearRenderPass(resources.addFullScreenPass( true, RenderPassCreateInfo { .clearColor_ = true, .clearDepth_ = true, .flags_ = eRenderPassBit_First }))
, swapchainFramebuffers( resources.addFramebuffers( screenRenderPass.handle, depthTexture.image.imageView))
, swapchainFramebuffers_NoDepth( resources.addFramebuffers( screenRenderPass_NoDepth.handle))
{}
void updateBuffers(uint32_t imageIndex) {
for (auto& r : onScreenRenderers_)
if (r.enabled_)
r.renderer_.updateBuffers(imageIndex);
}
void beginRenderPass( VkCommandBuffer cmdBuffer, VkRenderPass pass, size_t currentImage, const VkRect2D area, VkFramebuffer fb = VK_NULL_HANDLE, uint32_t clearValueCount = 0, const VkClearValue* clearValues = nullptr)
{
const VkRenderPassBeginInfo renderPassInfo = { .sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO, .renderPass = pass, .framebuffer = (fb != VK_NULL_HANDLE) ? fb : swapchainFramebuffers[currentImage], .renderArea = area, .clearValueCount = clearValueCount, .pClearValues = clearValues };
vkCmdBeginRenderPass( cmdBuffer, &renderPassInfo, VK_SUBPASS_CONTENTS_INLINE);
}
};
Now, let's see how our frame composition works. It is similar to what we did in Chapter 5, Working with Geometry Data, where we had multiple renderers. The following code should be considered purely a refactoring; the added complexity is due to the offscreen rendering support that we will need in the remaining chapters:
void VulkanRenderContext::composeFrame( VkCommandBuffer commandBuffer, uint32_t imageIndex)
{
const VkRect2D defaultScreenRect { .offset = { 0, 0 }, .extent = { .width = vkDev.framebufferWidth, .height = vkDev.framebufferHeight } };
static const VkClearValue defaultClearValues[] = { VkClearValue { .color = {1.f, 1.f, 1.f, 1.f} }, VkClearValue { .depthStencil = {1.f, 0} } };
beginRenderPass(commandBuffer, clearRenderPass.handle, imageIndex, defaultScreenRect, VK_NULL_HANDLE, 2u, defaultClearValues);
vkCmdEndRenderPass( commandBuffer );
for (auto& r : onScreenRenderers_)
if (r.enabled_) {
RenderPass rp = r.useDepth_ ? screenRenderPass : screenRenderPass_NoDepth;
VkFramebuffer fb = (r.useDepth_ ? swapchainFramebuffers: swapchainFramebuffers_NoDepth)[imageIndex];
if (r.renderer_.renderPass_.handle != VK_NULL_HANDLE)
rp = r.renderer_.renderPass_;
if (r.renderer_.framebuffer_ != VK_NULL_HANDLE)
fb = r.renderer_.framebuffer_;
r.renderer_.fillCommandBuffer( commandBuffer, imageIndex, fb, rp.handle);
}
beginRenderPass(commandBuffer, finalRenderPass.handle, imageIndex, defaultScreenRect);
vkCmdEndRenderPass(commandBuffer);
}
This concludes the definition of our helper classes for the new frame composition framework. Now, we have everything in place to define the application structure:
class VulkanApp {
protected:
struct MouseState {
glm::vec2 pos = glm::vec2(0.0f);
bool pressedLeft = false;
} mouseState_;
Resolution resolution_;
GLFWwindow* window_ = nullptr;
VulkanRenderContext ctx_;
std::vector<RenderItem>& onScreenRenderers_;
public:
VulkanApp(int screenWidth, int screenHeight)
: window_(initVulkanApp( screenWidth, screenHeight, &resolution_))
, ctx_(window_, resolution_.width, resolution_.height)
, onScreenRenderers_(ctx_.onScreenRenderers_)
{
glfwSetWindowUserPointer(window_, this);
assignCallbacks();
}
~VulkanApp() {
glslang_finalize_process();
glfwTerminate();
}
virtual void drawUI() {}
virtual void draw3D() = 0;
virtual void update(float deltaSeconds) = 0;
For example, the CameraApp class, described later in this recipe, calls a 3D camera position update routine, while the physics simulation recipe calls physics simulation routines.
void mainLoop() {
double timeStamp = glfwGetTime();
float deltaSeconds = 0.0f;
do {
update(deltaSeconds);
const double newTimeStamp = glfwGetTime();
deltaSeconds = newTimeStamp - timeStamp;
timeStamp = newTimeStamp;
drawFrame(ctx_.vkDev, [this](uint32_t img) { this->updateBuffers(img); },
  [this](VkCommandBuffer cb, uint32_t img) { ctx_.composeFrame(cb, img); });
glfwPollEvents();
vkDeviceWaitIdle(ctx_.vkDev.device);
} while (!glfwWindowShouldClose(window_));
}
The final part of the public interface of VulkanApp is related to UI event handling:
inline bool shouldHandleMouse() const
{ return !ImGui::GetIO().WantCaptureMouse; }
virtual void handleKey(int key, bool pressed) = 0;
virtual void handleMouseClick( int button, bool pressed) {
if (button == GLFW_MOUSE_BUTTON_LEFT)
mouseState_.pressedLeft = pressed;
}
virtual void handleMouseMove(float mx, float my) {
mouseState_.pos = glm::vec2(mx, my);
}
To complete the description of the new VulkanApp class, let's look at its implementation details:
private:
void assignCallbacks() {
… set mouse callbacks (not shown here) …
glfwSetKeyCallback(window_, [](GLFWwindow* window, int key, int scancode, int action, int mods) {
const bool pressed = action != GLFW_RELEASE;
if (key == GLFW_KEY_ESCAPE && pressed)
glfwSetWindowShouldClose(window, GLFW_TRUE);
The only modification we've made to the handler's code is that the custom user pointer is extracted from the GLFW window and cast back to our VulkanApp instance. The only predefined key is Esc: when the user presses it, we close the application.
void* ptr = glfwGetWindowUserPointer(window);
reinterpret_cast<VulkanApp*>( ptr)->handleKey(key, pressed);
});
}
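The elided mouse callbacks follow the same user-pointer pattern as the key callback. A minimal sketch might look like this (the real implementation also forwards these events to ImGui so that shouldHandleMouse() can do its job):
glfwSetCursorPosCallback(window_, [](GLFWwindow* window, double x, double y) {
  void* ptr = glfwGetWindowUserPointer(window);
  reinterpret_cast<VulkanApp*>(ptr)->handleMouseMove((float)x, (float)y);
});
glfwSetMouseButtonCallback(window_, [](GLFWwindow* window, int button, int action, int mods) {
  void* ptr = glfwGetWindowUserPointer(window);
  reinterpret_cast<VulkanApp*>(ptr)->handleMouseClick(button, action != GLFW_RELEASE);
});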
void updateBuffers(uint32_t imageIndex) {
ImGuiIO& io = ImGui::GetIO();
io.DisplaySize = ImVec2((float)ctx_.vkDev.framebufferWidth, (float)ctx_.vkDev.framebufferHeight);
ImGui::NewFrame();
drawUI();
ImGui::Render();
draw3D();
ctx_.updateBuffers(imageIndex);
}
};
The following recipes contain numerous examples of drawUI() and draw3D() implementations. The call to the previously described VulkanRenderContext::updateBuffers() concludes this function.
Our VulkanApp class is now complete, but there are still some pure virtual methods that prevent us from using it directly. A derived CameraApp class will be used as a base for all the future examples in this book:
struct CameraApp: public VulkanApp {
CameraApp(int screenWidth, int screenHeight)
: VulkanApp(screenWidth, screenHeight)
, positioner(vec3(0.0f, 5.0f, 10.0f)
, vec3(0.0f, 0.0f, -1.0f), vec3(0.0f, -1.0f, 0.0f))
, camera(positioner)
{}
virtual void update(float deltaSeconds) override {
positioner.update(deltaSeconds, mouseState_.pos, shouldHandleMouse() && mouseState_.pressedLeft);
}
glm::mat4 getDefaultProjection() const {
const float ratio = ctx_.vkDev.framebufferWidth / (float)ctx_.vkDev.framebufferHeight;
return glm::perspective( glm::pi<float>() / 4.0f, ratio, 0.1f, 1000.0f);
}
virtual void handleKey(int key, bool pressed) override {
if (key == GLFW_KEY_W)
positioner.movement_.forward_ = pressed;
… handle the rest of camera keys similarly …
}
All the keys are handled just as in the recipes from Chapter 3, Getting Started with OpenGL and Vulkan, through Chapter 6, Physically Based Rendering Using the glTF2 Shading Model.
protected:
CameraPositioner_FirstPerson positioner;
Camera camera;
};
The next recipe concentrates on implementing the Renderer interface based on the samples from previous chapters.
In Chapter 4, Adding User Interaction and Productivity Tools, we introduced our "layered" frame composition, which we will now refine and extend. These modifications will allow us to do offscreen rendering and significantly simplify initialization and Vulkan object management.
This recipe will describe the rendering interface that's used by the VulkanApp class. At the end of this recipe, a few concrete classes will be presented that can render quadrilaterals and the UI. Please revisit the previous two recipes to see how the Renderer interface fits in the new framework.
Check out the Putting it all together into a Vulkan application recipe of Chapter 4, Adding User Interaction and Productivity Tools, to refresh your memory on how our "layered" frame composition works.
Each of the frame rendering passes is represented by an instance of the Renderer class. The list of references to these instances is stored in VulkanRenderContext. The usage of these instances was thoroughly discussed in the previous recipe:
struct Renderer {
Renderer(VulkanRenderContext& c)
: processingWidth(c.vkDev.framebufferWidth)
, processingHeight(c.vkDev.framebufferHeight)
, ctx_(c)
{}
virtual void fillCommandBuffer( VkCommandBuffer cmdBuffer, size_t currentImage, VkFramebuffer fb = VK_NULL_HANDLE, VkRenderPass rp = VK_NULL_HANDLE) = 0;
virtual void updateBuffers(size_t currentImage) {}
inline void updateUniformBuffer( uint32_t currentImage, const uint32_t offset, const uint32_t size, const void* data)
{
uploadBufferData(ctx_.vkDev, uniforms_[currentImage].memory, offset, data, size);
}
void initPipeline( const std::vector<const char*>& shaders, const PipelineInfo& pInfo, uint32_t vtxConstSize = 0, uint32_t fragConstSize = 0)
{
pipelineLayout_ = ctx_.resources.addPipelineLayout( descriptorSetLayout_, vtxConstSize, fragConstSize);
graphicsPipeline_ = ctx_.resources.addPipeline( renderPass_.handle, pipelineLayout_, shaders, pInfo);
}
PipelineInfo initRenderPass( const PipelineInfo& pInfo, const std::vector<VulkanTexture>& outputs, RenderPass renderPass = RenderPass(), RenderPass fallbackPass = RenderPass())
{
PipelineInfo outInfo = pInfo;
const bool hasHandle = renderPass.handle != VK_NULL_HANDLE;
if (!outputs.empty()) {
processingWidth = outputs[0].width;
processingHeight = outputs[0].height;
outInfo.width = processingWidth;
outInfo.height = processingHeight;
const bool hasDepth = (outputs.size() == 1) && isDepthFormat(outputs[0].format);
renderPass_ = hasHandle ? renderPass : (hasDepth ? ctx_.resources.addDepthRenderPass(outputs) : ctx_.resources.addRenderPass(outputs));
framebuffer_ = ctx_.resources.addFramebuffer(renderPass_, outputs);
} else {
renderPass_ = hasHandle ? renderPass : fallbackPass;
}
return outInfo;
}
The last helper function we need in all our renderers is the beginRenderPass() function, which adds the appropriate commands to start a rendering pass.
void beginRenderPass( VkRenderPass rp, VkFramebuffer fb, VkCommandBuffer commandBuffer, size_t currentImage)
{
const VkClearValue clearValues[2] = { VkClearValue { .color = {1.f, 1.f, 1.f, 1.f} }, VkClearValue { .depthStencil = {1.f, 0} } };
const VkRect2D rect { .offset = { 0, 0 }, .extent = { .width = processingWidth, .height = processingHeight } };
ctx_.beginRenderPass( commandBuffer, rp, currentImage, rect, fb, (renderPass_.info.clearColor_ ? 1u : 0u) + (renderPass_.info.clearDepth_ ? 1u : 0u), renderPass_.info.clearColor_ ? &clearValues[0] : (renderPass_.info.clearDepth_ ? &clearValues[1] : nullptr));
vkCmdBindPipeline(commandBuffer, VK_PIPELINE_BIND_POINT_GRAPHICS, graphicsPipeline_);
vkCmdBindDescriptorSets(commandBuffer, VK_PIPELINE_BIND_POINT_GRAPHICS, pipelineLayout_, 0, 1, &descriptorSets_[currentImage], 0, nullptr);
}
VkFramebuffer framebuffer_ = nullptr;
RenderPass renderPass_;
uint32_t processingWidth;
uint32_t processingHeight;
protected:
VulkanRenderContext& ctx_;
VkDescriptorSetLayout descriptorSetLayout_ = nullptr;
VkDescriptorPool descriptorPool_ = nullptr;
std::vector<VkDescriptorSet> descriptorSets_;
VkPipelineLayout pipelineLayout_ = nullptr;
VkPipeline graphicsPipeline_ = nullptr;
std::vector<VulkanBuffer> uniforms_;
};
With all the components in place, we are now ready to begin implementing renderers.
In the next chapter, we will use offscreen rendering. For debugging, we could output the contents of a texture to some part of the screen. The QuadRenderer class, which is derived from the base Renderer, provides a way to output textured quads. In Chapter 8, Image-Based Techniques, we will use this class to output postprocessed frames:
struct QuadRenderer: public Renderer {
QuadRenderer(VulkanRenderContext& ctx, const std::vector<VulkanTexture>& textures, const std::vector<VulkanTexture>& outputs = {}, RenderPass screenRenderPass = RenderPass())
: Renderer(ctx)
{
const PipelineInfo pInfo = initRenderPass( PipelineInfo {}, outputs, screenRenderPass, ctx.screenRenderPass_NoDepth);
uint32_t vertexBufferSize = MAX_QUADS * 6 * sizeof(VertexData);
const size_t imgCount = ctx.vkDev.swapchainImages.size();
descriptorSets_.resize(imgCount);
storages_.resize(imgCount);
DescriptorSetInfo dsInfo = { .buffers = { storageBufferAttachment(VulkanBuffer {}, 0, vertexBufferSize, VK_SHADER_STAGE_VERTEX_BIT) }, .textureArrays = { fsTextureArrayAttachment(textures) } };
For a complete explanation of the descriptor set creation process, see the Unifying descriptor set creation routines recipe.
descriptorSetLayout_ = ctx.resources.addDescriptorSetLayout(dsInfo);
descriptorPool_ = ctx.resources.addDescriptorPool(dsInfo, imgCount);
for (size_t i = 0 ; i < imgCount ; i++) {
storages_[i] = ctx.resources.addStorageBuffer( vertexBufferSize);
dsInfo.buffers[0].buffer = storages_[i];
descriptorSets_[i] = ctx.resources.addDescriptorSet( descriptorPool_, descriptorSetLayout_);
ctx.resources.updateDescriptorSet( descriptorSets_[i], dsInfo);
}
initPipeline({ "data/shaders/chapter08/VK02_QuadRenderer.vert", "data/shaders/chapter08/VK02_QuadRenderer.frag"}, pInfo);
}
void fillCommandBuffer( VkCommandBuffer cmdBuffer, size_t currentImage, VkFramebuffer fb = VK_NULL_HANDLE, VkRenderPass rp = VK_NULL_HANDLE) override
{
if (quads_.empty()) return;
bool hasRP = rp != VK_NULL_HANDLE;
bool hasFB = fb != VK_NULL_HANDLE;
beginRenderPass(hasRP ? rp : renderPass_.handle, hasFB ? fb : framebuffer_, cmdBuffer, currentImage);
vkCmdDraw(cmdBuffer, static_cast<uint32_t>(quads_.size()), 1, 0, 0);
vkCmdEndRenderPass(cmdBuffer);
}
void updateBuffers(size_t currentImage) override {
if (quads_.empty()) return;
uploadBufferData(ctx_.vkDev, storages_[currentImage].memory, 0, quads_.data(), quads_.size() * sizeof(VertexData));
}
void quad( float x1, float y1, float x2, float y2, int texIdx) {
VertexData v1 { { x1, y1, 0 }, { 0, 0 }, texIdx };
VertexData v2 { { x2, y1, 0 }, { 1, 0 }, texIdx };
VertexData v3 { { x2, y2, 0 }, { 1, 1 }, texIdx };
VertexData v4 { { x1, y2, 0 }, { 0, 1 }, texIdx };
quads_.push_back(v1); quads_.push_back(v2); quads_.push_back(v3);
quads_.push_back(v1); quads_.push_back(v3); quads_.push_back(v4);
}
void clear() { quads_.clear(); }
private:
struct VertexData {
glm::vec3 pos;
glm::vec2 tc;
int texIdx;
};
std::vector<VertexData> quads_;
std::vector<VulkanBuffer> storages_;
};
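Using QuadRenderer is straightforward. As an illustrative sketch, an application could construct it with a list of textures to inspect and emit a single full-screen quad; the vertex shader below passes the coordinates straight through in clip space, so -1..1 covers the whole screen (someTexture is a hypothetical VulkanTexture):
QuadRenderer quads(ctx_, { someTexture });
quads.quad(-1.0f, -1.0f, 1.0f, 1.0f, 0);
onScreenRenderers_.emplace_back(quads, false);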
Now that we have finished the C++ part of the code, let's take a look at the GLSL part: the VK02_QuadRenderer.vert and VK02_QuadRenderer.frag shaders we mentioned in the constructor of the QuadRenderer class:
#version 460
layout(location = 0) out vec2 out_uv;
layout(location = 1) flat out uint out_texIndex;
struct DrawVert {
float x, y, z, u, v;
uint texIdx;
};
layout(binding = 0) readonly buffer SBO { DrawVert data[]; } sbo;
void main() {
uint idx = gl_VertexIndex;
DrawVert v = sbo.data[idx];
out_uv = vec2(v.u, v.v);
out_texIndex = v.texIdx;
gl_Position = vec4(vec2(v.x, v.y), 0.0, 1.0);
}
The fragment shader uses an array of textures to color the pixel:
#version 460
#extension GL_EXT_nonuniform_qualifier : require
layout (binding = 1) uniform sampler2D textures[];
layout (location = 0) in vec2 in_uv;
layout (location = 1) flat in uint in_texIndex;
layout (location = 0) out vec4 outFragColor;
const uint depthTextureMask = 0xFFFF;
float linearizeDepth(float d, float zNear, float zFar) {
return zNear * zFar / (zFar + d * (zNear - zFar));
}
void main() {
uint tex = in_texIndex & depthTextureMask;
uint texType = (in_texIndex >> 16) & depthTextureMask;
vec4 value = texture( textures[nonuniformEXT(tex)], in_uv);
outFragColor = (texType == 0 ? value : vec4( vec3(linearizeDepth(value.r, 0.01, 100.0)), 1.0));
}
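This masking convention means the value passed as texIdx on the C++ side carries a type tag in its upper 16 bits. For example, to display a depth attachment stored at index texIdx in the texture array, a caller could set bit 16 so that the shader linearizes the sampled value (an illustrative sketch):
quads.quad(-1.0f, -1.0f, 1.0f, 1.0f, texIdx | (1 << 16));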
The next recipe will conclude our review of the new rendering framework by describing the routines for creating the descriptor set.
Similar to QuadRenderer, the new LineCanvas class is (re)implemented by following the code of VulkanCanvas from the Implementing an immediate mode drawing canvas recipe of Chapter 4, Adding User Interaction and Productivity Tools. The only thing that has changed is the simplified constructor code, which is now using the new resource management scheme and descriptor set initialization.
The GuiRenderer class also (re)implements ImGuiRenderer from the Rendering the Dear ImGui user interface with Vulkan recipe of Chapter 4, Adding User Interaction and Productivity Tools, and adds support for multiple textures. The usage of this class will be shown in most of the upcoming examples in this book.
Before we can complete our material system implementation using the Vulkan API, we must reconsider the descriptor set creation routines used in all the previous recipes. The reason we didn't implement the most generic routines right away is simple: as with all the examples, we decided to follow the "natural" evolution of the code instead of providing all the solutions at the beginning. The routines presented in this recipe complete the resource management system for this book.
The source code for this recipe can be found in the shared/vkFramework/VulkanResources.h and shared/vkFramework/VulkanResources.cpp files.
The Managing Vulkan resources recipe from this chapter introduced the VulkanResources class, which contains all our allocated Vulkan objects; the descriptor sets and descriptor pools are also allocated by the methods of this class. Our descriptor set management should let us declare buffer, texture, and texture array attachments in a single structure, derive descriptor set layouts and appropriately sized pools from that declaration, and attach the actual resources to the logical slots afterward.
Let's construct a system that addresses all these requirements:
struct DescriptorSetInfo {
std::vector<BufferAttachment> buffers;
std::vector<TextureAttachment> textures;
std::vector<TextureArrayAttachment> textureArrays;
};
struct DescriptorInfo {
VkDescriptorType type;
VkShaderStageFlags shaderStageFlags;
};
struct BufferAttachment {
DescriptorInfo dInfo;
VulkanBuffer buffer;
uint32_t offset;
uint32_t size;
};
struct TextureAttachment {
DescriptorInfo dInfo;
VulkanTexture texture;
};
struct TextureArrayAttachment {
DescriptorInfo dInfo;
std::vector<VulkanTexture> textures;
};
Buffers are represented by the VulkanBuffer structure, which aggregates a buffer handle with its size and the backing device memory:
struct VulkanBuffer {
VkBuffer buffer;
VkDeviceSize size;
VkDeviceMemory memory;
};
For the purposes of this book, such aggregation is more than enough, but in a multi-GPU configuration, separate device memory and buffer handles may be needed for each device.
Next, let's look at VulkanTexture, a helper for aggregating VulkanImage and VkSampler so that we don't have to pass multiple objects as function parameters:
struct VulkanTexture final {
uint32_t width;
uint32_t height;
uint32_t depth;
VkFormat format;
VulkanImage image;
VkSampler sampler;
VkImageLayout desiredLayout;
};
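The attachment helpers used earlier, such as storageBufferAttachment() and fsTextureArrayAttachment(), are thin constructors for these structures. Assuming they do nothing beyond filling in the fields, they might be sketched as follows:
inline BufferAttachment storageBufferAttachment( VulkanBuffer buffer, uint32_t offset, uint32_t size, VkShaderStageFlags stages) {
  return BufferAttachment { DescriptorInfo { VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, stages }, buffer, offset, size };
}
inline TextureArrayAttachment fsTextureArrayAttachment( const std::vector<VulkanTexture>& textures) {
  return TextureArrayAttachment { DescriptorInfo { VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, VK_SHADER_STAGE_FRAGMENT_BIT }, textures };
}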
A descriptor set needs a layout, so the following steps create VkDescriptorSetLayout from our DescriptorSetInfo structure. Notice that at this point, we are omitting the actual buffer handles from attachment descriptions:
VkDescriptorSetLayout VulkanResources::addDescriptorSetLayout( const DescriptorSetInfo& dsInfo)
{
VkDescriptorSetLayout descriptorSetLayout;
std::vector<VkDescriptorBindingFlagsEXT> descriptorBindingFlags;
uint32_t bindingIdx = 0;
std::vector<VkDescriptorSetLayoutBinding> bindings;
for (const auto& b: dsInfo.buffers) {
descriptorBindingFlags.push_back(0u);
bindings.push_back(descriptorSetLayoutBinding( bindingIdx++, b.dInfo.type, b.dInfo.shaderStageFlags));
}
for (const auto& i: dsInfo.textures) {
descriptorBindingFlags.push_back(0u);
bindings.push_back(descriptorSetLayoutBinding( bindingIdx++, i.dInfo.type, i.dInfo.shaderStageFlags));
}
for (const auto& t: dsInfo.textureArrays) {
// Keep the flags array in sync with the bindings; partially bound
// descriptors allow texture arrays with unused slots
descriptorBindingFlags.push_back( VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT_EXT);
bindings.push_back( descriptorSetLayoutBinding(bindingIdx++, VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, t.dInfo.shaderStageFlags, static_cast<uint32_t>(t.textures.size())));
}
const VkDescriptorSetLayoutBindingFlagsCreateInfoEXT setLayoutBindingFlags = { .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_BINDING_FLAGS_CREATE_INFO_EXT, .bindingCount = static_cast<uint32_t>( descriptorBindingFlags.size()), .pBindingFlags = descriptorBindingFlags.data() };
const VkDescriptorSetLayoutCreateInfo layoutInfo = { .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO, .pNext = dsInfo.textureArrays.empty() ? nullptr : &setLayoutBindingFlags, .flags = 0, .bindingCount = static_cast<uint32_t>(bindings.size()), .pBindings = bindings.size() > 0 ? bindings.data() : nullptr };
if (vkCreateDescriptorSetLayout(vkDev.device, &layoutInfo, nullptr, &descriptorSetLayout) != VK_SUCCESS) {
printf("Failed to create descriptor set layout ");
exit(EXIT_FAILURE);
}
allDSLayouts.push_back(descriptorSetLayout);
return descriptorSetLayout;
}
To allocate a descriptor set, we need a descriptor pool with enough handles for all its buffers and textures:
VkDescriptorPool VulkanResources::addDescriptorPool( const DescriptorSetInfo& dsInfo, uint32_t dSetCount)
{
uint32_t uniformBufferCount = 0;
uint32_t storageBufferCount = 0;
uint32_t samplerCount = static_cast<uint32_t>(dsInfo.textures.size());
for (const auto& ta : dsInfo.textureArrays)
samplerCount += static_cast<uint32_t>(ta.textures.size());
for (const auto& b: dsInfo.buffers) {
if (b.dInfo.type == VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER)
uniformBufferCount++;
if (b.dInfo.type == VK_DESCRIPTOR_TYPE_STORAGE_BUFFER)
storageBufferCount++;
}
std::vector<VkDescriptorPoolSize> poolSizes;
if (uniformBufferCount)
poolSizes.push_back(VkDescriptorPoolSize{ .type = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER, .descriptorCount = dSetCount * uniformBufferCount });
if (storageBufferCount)
poolSizes.push_back( VkDescriptorPoolSize{ .type = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, .descriptorCount = dSetCount * storageBufferCount });
if (samplerCount)
poolSizes.push_back( VkDescriptorPoolSize{ .type = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, .descriptorCount = dSetCount * samplerCount });
const VkDescriptorPoolCreateInfo poolInfo = { .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO, .pNext = nullptr, .flags = 0, .maxSets = static_cast<uint32_t>(dSetCount), .poolSizeCount = static_cast<uint32_t>(poolSizes.size()), .pPoolSizes = poolSizes.empty() ? nullptr : poolSizes.data() };
VkDescriptorPool descriptorPool = VK_NULL_HANDLE;
if (vkCreateDescriptorPool(vkDev.device, &poolInfo, nullptr, &descriptorPool) != VK_SUCCESS) {
printf("Cannot allocate descriptor pool ");
exit(EXIT_FAILURE);
}
allDPools.push_back(descriptorPool);
return descriptorPool;
}
Once the layout and the pool are ready, allocating the descriptor set itself is a thin wrapper on top of vkAllocateDescriptorSets():
VkDescriptorSet VulkanResources::addDescriptorSet( VkDescriptorPool descriptorPool, VkDescriptorSetLayout dsLayout)
{
VkDescriptorSet descriptorSet;
const VkDescriptorSetAllocateInfo allocInfo = { .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO, .pNext = nullptr, .descriptorPool = descriptorPool, .descriptorSetCount = 1, .pSetLayouts = &dsLayout };
if (vkAllocateDescriptorSets( vkDev.device, &allocInfo, &descriptorSet) != VK_SUCCESS) {
printf("Cannot allocate descriptor set ");
exit(EXIT_FAILURE);
}
return descriptorSet;
}
The most important function for us is updateDescriptorSet(), which attaches the actual buffers and texture samplers to the descriptor set's logical slots. Let's take a look:
void VulkanResources::updateDescriptorSet( VkDescriptorSet ds, const DescriptorSetInfo& dsInfo)
{
uint32_t bindingIdx = 0;
std::vector<VkWriteDescriptorSet> descriptorWrites;
std::vector<VkDescriptorBufferInfo> bufferDescriptors(dsInfo.buffers.size());
std::vector<VkDescriptorImageInfo> imageDescriptors(dsInfo.textures.size());
std::vector<VkDescriptorImageInfo> imageArrayDescriptors;
for (size_t i = 0 ; i < dsInfo.buffers.size() ; i++)
{
BufferAttachment b = dsInfo.buffers[i];
bufferDescriptors[i] = VkDescriptorBufferInfo { .buffer = b.buffer.buffer, .offset = b.offset, .range = (b.size > 0) ? b.size : VK_WHOLE_SIZE };
descriptorWrites.push_back( bufferWriteDescriptorSet(ds, &bufferDescriptors[i], bindingIdx++, b.dInfo.type));
}
for(size_t i = 0 ; i < dsInfo.textures.size() ; i++)
{
VulkanTexture t = dsInfo.textures[i].texture;
imageDescriptors[i] = VkDescriptorImageInfo { .sampler = t.sampler, .imageView = t.image.imageView, .imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL };
descriptorWrites.push_back( imageWriteDescriptorSet(ds, &imageDescriptors[i], bindingIdx++));
}
uint32_t taOffset = 0;
std::vector<uint32_t> taOffsets( dsInfo.textureArrays.size());
for (size_t ta = 0 ; ta < dsInfo.textureArrays.size() ; ta++) {
taOffsets[ta] = taOffset;
for (size_t j = 0; j<dsInfo.textureArrays[ta].textures.size(); j++) {
VulkanTexture t = dsInfo.textureArrays[ta].textures[j];
VkDescriptorImageInfo imageInfo = { .sampler = t.sampler, .imageView = t.image.imageView, .imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL };
imageArrayDescriptors.push_back(imageInfo);
}
taOffset += static_cast<uint32_t>( dsInfo.textureArrays[ta].textures.size());
}
for (size_t ta = 0 ; ta < dsInfo.textureArrays.size() ; ta++) {
VkWriteDescriptorSet writeSet = { .sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET, .dstSet = ds, .dstBinding = bindingIdx++, .dstArrayElement = 0, .descriptorCount = static_cast<uint32_t>( dsInfo.textureArrays[ta].textures.size()), .descriptorType = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, .pImageInfo = imageArrayDescriptors.data() + taOffsets[ta] };
descriptorWrites.push_back(writeSet);
}
vkUpdateDescriptorSets(vkDev.device, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr);
}
In all this book's examples, the descriptor sets are created early at runtime, usually in the constructor of some renderer class. All we need to do is fill the DescriptorSetInfo structure with references to our loaded texture and buffer attachments. Check out the vkFramework/VulkanResources.h file for multiple examples of how to implement various attachments using this mechanism.
Our new set of Vulkan renderers, located in the shared/vkFramework folder, uses the unified descriptor set creators we described in this recipe. Make sure you check it out.
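As a final illustration, a renderer with one uniform buffer and a single texture per swapchain image might declare and create its descriptor sets as follows. This is a sketch only: uniformBufferAttachment() and fsTextureAttachment() are assumed to be helpers analogous to the ones sketched earlier, UniformBuffer and albedoTexture are hypothetical, and uniforms_ is assumed to hold one buffer per swapchain image:
DescriptorSetInfo dsInfo = {
  .buffers = { uniformBufferAttachment(uniforms_[0], 0, sizeof(UniformBuffer), VK_SHADER_STAGE_VERTEX_BIT) },
  .textures = { fsTextureAttachment(albedoTexture) }
};
descriptorSetLayout_ = ctx_.resources.addDescriptorSetLayout(dsInfo);
descriptorPool_ = ctx_.resources.addDescriptorPool(dsInfo, imgCount);
descriptorSets_.resize(imgCount);
for (size_t i = 0; i < imgCount; i++) {
  // rebind the per-frame uniform buffer before filling each set
  dsInfo.buffers[0].buffer = uniforms_[i];
  descriptorSets_[i] = ctx_.resources.addDescriptorSet( descriptorPool_, descriptorSetLayout_);
  ctx_.resources.updateDescriptorSet(descriptorSets_[i], dsInfo);
}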
Now, let's conclude this chapter by putting all the recipes we have just learned together into a single demo application. The app will render the Lumberyard Bistro scene using the meshes and materials converted from the original .obj files.
The Chapter7/VK03_LargeScene demo application combines the code from all the recipes of this chapter, so it will be helpful to skim through the entire chapter before proceeding.
To execute the demo application correctly, the Scene Converter tool from the Implementing a scene conversion tool recipe should be compiled and run with the default configuration prior to running this demo.
Despite being able to render a fairly large scene, the main application, which can be found in the Chapter7/VK03_LargeScene folder, is surprisingly simple. All we must do here is define a MyApp class containing the scene data, textures, and all the renderer instances. The code is almost purely declarative; the only exception is when we must pass 3D camera parameters in the draw3D() function, which can also be wrapped in the VulkanBuffer interface. However, this would require some more framework code as we would have to synchronize the camera data and this new GPU buffer. Anyway, let's get started:
const char* envMapFile = "data/piazza_bologni_1k.hdr";
const char* irrMapFile = "data/piazza_bologni_1k_irradience.hdr";
struct MyApp: public CameraApp {
MyApp(): CameraApp(-80, -80)
, envMap(ctx_.resources.loadCubeMap(envMapFile))
, irrMap(ctx_.resources.loadCubeMap(irrMapFile))
, sceneData(ctx_, "data/meshes/test.meshes", "data/meshes/test.scene", "data/meshes/test.materials", envMap, irrMap)
, sceneData2(ctx_, "data/meshes/test2.meshes", "data/meshes/test2.scene", "data/meshes/test2.materials", envMap, irrMap)
, multiRenderer(ctx_, sceneData)
, multiRenderer2(ctx_, sceneData2)
, imgui(ctx_)
{
onScreenRenderers_.emplace_back(multiRenderer);
onScreenRenderers_.emplace_back(multiRenderer2);
}
void draw3D() override {
const mat4 p = getDefaultProjection();
const mat4 view = camera.getViewMatrix();
const mat4 model = glm::rotate( mat4(1.f), glm::pi<float>(), vec3(1, 0, 0));
multiRenderer.setMatrices(p, view, model);
multiRenderer2.setMatrices(p, view, model);
multiRenderer.setCameraPosition( positioner.getPosition());
multiRenderer2.setCameraPosition( positioner.getPosition());
}
private:
VulkanTexture envMap, irrMap;
VKSceneData sceneData, sceneData2;
MultiRenderer multiRenderer, multiRenderer2;
GuiRenderer imgui;
};
The main() function contains only three lines, all of which were explained in the Refactoring Vulkan initialization and the main loop recipe.
The main workhorse for this demo application is the VulkanApp class and two MultiRenderer instances, both of which are responsible for rendering scene objects that are loaded into VKSceneData objects. For a quick recap on the GPU data storage scheme of our application, look at the following diagram:
Figure 7.3 – Scene data scheme
The VKSceneData class loads the geometry data for all the scene objects, a list of material parameters, and an array of textures, referenced by individual materials. All the loaded data is transferred into the appropriate GPU buffers. The MultiRenderer class maintains the Shape and Transform lists in dedicated GPU buffers. Internally, the Shape List points to individual items in the Material and Transform lists, and it also holds offsets to the index and vertex data in the Mesh geometry buffer. At each frame, the VulkanApp class asks MultiRenderer to fill the command buffer with indirect draw commands to render the shapes of the scene. The parameters of the indirect draw command are taken directly from the Shape list. The running demo application should render the Lumberyard Bistro scene with materials, as shown in the following screenshot:
Figure 7.4 – Rendering the Lumberyard Bistro scene with materials
In Chapter 8, Image-Based Techniques, we will use the aforementioned MultiRenderer class to implement a few screen space effects, while in Chapter 10, Advanced Rendering Techniques and Optimizations, we will optimize the internal indirect draw commands by using frustum culling techniques. We will also implement a simple shadow mapping technique in Vulkan for this scene.
We have also implemented an OpenGL version of this app. Check out the Chapter7/GL01_LargeScene project in the source code's bundle for more information.