Chapter 7: Graphics Rendering Pipeline

In this chapter, we will learn how to implement a hierarchical scene representation and a corresponding rendering pipeline, bringing together the geometry and material handling we developed in the previous chapters. Instead of implementing a naive object-oriented scene graph, where each node is an object allocated on the heap, we will apply a data-oriented design approach to simplify the memory layout of our scene. This makes modifications to the scene graph significantly faster and serves as a practical introduction to data-oriented design principles. The scene graph and material representation presented here are compatible with glTF2.

This chapter will cover how to organize the overall rendering process of complex scenes with multiple materials. We will be covering the following recipes:

  • How not to do a scene graph
  • Using data-oriented design for a scene graph
  • Loading and saving a scene graph
  • Implementing transformation trees
  • Implementing a material system
  • Importing materials from Assimp
  • Implementing a scene conversion tool
  • Managing Vulkan resources
  • Refactoring Vulkan initialization and the main loop
  • Working with rendering passes
  • Unifying descriptor set creation routines
  • Putting it all together into a Vulkan application

Technical requirements

To run the recipes in this chapter, you must have a computer with a video card that supports OpenGL 4.6, with ARB_bindless_texture, and Vulkan 1.2, with nonuniform indexing for sampled image arrays. Read Chapter 1, Establishing a Build Environment, if you want to learn how to build the demo applications shown in this book.

The source code for this chapter can be found on GitHub at https://github.com/PacktPublishing/3D-Graphics-Rendering-Cookbook.

How not to do a scene graph

Numerous hobby 3D engines use a straightforward and naive class-based approach to implement a scene graph. It is always tempting to define a structure similar to the following code, but please do not do this:

struct SceneNode {

  SceneNode* parent_;

  vector<SceneNode*> children_;

  mat4 localTransform_;

  mat4 globalTransform_;

  Mesh* mesh_;

  Material* material_;

  void render();

};

On top of this structure, you can define numerous recursive traversal methods, such as the dreaded render() operation. Let's say we have the following root object:

SceneNode* root;

Here, rendering a scene graph can be as simple as doing the following:

root->render();

The rendering routine in this case does multiple things. Most importantly, the render() method calculates the global transform for the current node. After that, depending on the rendering API being used, mesh geometry and material information is sent to the rendering pipeline. At the end, a recursive call is made to render all the children:

void SceneNode::render() {

  globalTransform_ = (parent_ ?    parent_->globalTransform_ : mat4(1)) * localTransform_;

  // ... API-specific rendering calls ...

  for (auto& c: this->children_)

    c->render();

}

While being a simple and "canonical" object-oriented implementation, it has multiple serious drawbacks:

  • Non-locality of data due to the use of pointers, unless a custom memory allocator is used.
  • Performance issues due to explicit recursion.
  • Potential memory leaks and crashes while using raw pointers, unless you're using smart pointers. This will slow down the performance even further due to atomic operations.
  • Difficulties with circular references and the need to employ weak pointers and similar tricks while using smart pointers to solve memory leak problems.
  • Cumbersome and error-prone recursive loading and saving of the structure.
  • Difficulties with implementing extensions (having to add more and more fields to SceneNode).

As the 3D engine grows and the scene graph requirements become more numerous, new fields, arrays, callbacks, and pointers must be added to and handled by the SceneNode structure, making this approach fragile and hard to maintain.

Let us step back and rethink how to keep the relative scene structure without using large monolithic classes with heavyweight dynamic containers inside.

Using data-oriented design for a scene graph

To represent complex nested visual objects such as robotic arms, planetary systems, or deeply branched animated trees, you can split the object into parts and keep track of the hierarchical relationships between them. A directed graph of parent-child relationships between different objects in a scene is called a scene graph. We are deliberately avoiding using the words "acyclic graph" here because, for convenience, you may decide to use circular references between nodes in a controlled way. Most 3D graphics tutorials aimed at hobbyists lead directly down the simple but non-optimal path we identified in the previous recipe, How not to do a scene graph. Let's go a bit deeper into the rabbit hole and learn how to apply data-oriented design to implement a faster scene graph.

In this recipe, we will learn how to get started with a decently performant scene graph design. Our focus will be on scene graphs with fixed hierarchies. In Chapter 9, Working with Scene Graphs, we will elaborate on this topic more and explore how to deal with runtime topology changes and other scene graph operations.

Getting ready

The source code for this recipe is split between the scene geometry conversion tool, Chapter7/SceneConverter/src/main.cpp, and the rendering code, which can be found in the Chapter7/GL01_LargeScene and Chapter7/VK01_SceneGraph demos.

How to do it...

It seems logical to store the nodes in a linear array and to replace all the "external" pointers, such as Mesh* and Material*, with suitably sized integer handles that index into other arrays. The list of child nodes and the reference to the parent node are also moved out of the node itself.

The local and global transforms are also stored in separate arrays and can be easily mapped to a GPU buffer without conversion, making them directly accessible from GLSL shaders. Let's look at the implementation:

  1. Here, we have a new simplified scene node declaration. Our new scene is a composite of arrays:

    struct SceneNode {

      int mesh_;

      int material_;

    };

    struct Scene {

      vector<SceneNode> nodes_;

      vector<mat4> local_;

      vector<mat4> global_;

    };

    One question remains: how can we store the hierarchy? The solution is well known and is called the Left Child – Right Sibling tree representation. Since a scene graph is really a tree, at least in theory, where no optimization-related circular references are introduced, we can convert any tree whose nodes have an arbitrary number of children into a binary tree by "tilting" the branches, as shown in the following diagram:

    Figure 7.1 – Tree representations

    The image on the left-hand side shows a standard tree with a variable number of children for each node, while the image on the right-hand side shows a new structure that only stores a single reference to the first child and another reference to the next "sibling." Here, "being a sibling node" means "to be a child node of the same parent node." This transformation removes the need to store std::vector in each scene node. Finally, if we "tilt" the right image, we get a familiar binary tree structure where the left arrows are solid and represent a "first child" reference and the right arrows are dashed and represent the "next sibling" reference:

    Figure 7.2 – Tilted tree

  2. Let's add indices to the SceneNode structure to represent the aforementioned storage schema. Along with the mesh and material indices for each node, we will store a reference to the parent, an index for the first child (or a negative value if there are no child nodes), and an index for the next sibling scene node:

    struct SceneNode {

      int mesh_, material_;

      int parent_;

      int firstChild_;

      int rightSibling_;

    };

    What we have now is a compact linear list of constant-sized, plain-old-data objects. Yes, the tree traversal and modification routines may look unusual, but they are just linked-list iterations (see the short traversal sketch right after this list). It would be unfair not to mention a rather serious disadvantage, though: random access to a child node is now slower on average because we must walk the sibling list. For our purposes, this is not fatal, since we will usually touch either all the children of a node or none of them.

  3. Before turning to the implementation, let's perform another unconventional transformation of the new SceneNode structure. It contains the indices of the mesh and material, along with hierarchical information, but the local and global transformations are stored outside. This suggests that we may need to define the following structure to store our scene hierarchy:

    struct Hierarchy {

      int parent_;

      int firstChild_;

      int nextSibling_;

      // cached index of the last added child (used by addNode() below)
      int lastSibling_;

      int level_;

    };

    We have changed "left" to "first" and "right" to "next" since the geometric layout of the tree nodes does not matter here. The level_ field stores the cached depth of the node from the top of the scene graph: the root node is at level zero, and each child's level is greater by one than its parent's. The lastSibling_ field caches the index of the most recently added node in a chain of siblings; the addNode() routine shown later uses it to avoid walking the whole sibling list.

    Also, the Mesh and Material objects for each node can be stored in separate arrays. However, if not all the nodes are equipped with a mesh or material, we can use a hash table to store node-to-mesh and node-to-material mappings. Absence of such mappings simply indicates that a node is only being used for transformations or hierarchical relation storage. The hash tables are not as linear as arrays, but they can be trivially converted to and from arrays of {key, value} pairs.

  4. Finally, we can declare a new Scene structure with logical "compartments" that we will later call components:

    struct Scene {

      vector<mat4> localTransform_;

      vector<mat4> globalTransform_;

      vector<Hierarchy> hierarchy_;

      // Meshes for nodes (Node -> Mesh)
      unordered_map<uint32_t, uint32_t> meshes_;

      // Materials for nodes (Node -> Material)
      unordered_map<uint32_t, uint32_t> materialForNode_;

  5. The following components are not strictly necessary, but they help a lot while debugging scene graph manipulation routines or while implementing an interactive scene editor, where the ability to see some human-readable node identifiers is crucial:

      // Node names: which name is assigned to the node
      std::unordered_map<uint32_t, uint32_t> nameForNode_;

      // Collection of scene node names
      std::vector<std::string> names_;

      // Collection of debug material names
      std::vector<std::string> materialNames_;

    };

    One thing that is missing is the SceneNode structure itself, which is now represented by integer indices into the arrays of the Scene structure. It is rather amusing and unusual, for an object-oriented mind, to speak about scene nodes without needing or having a scene node class at all.
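As promised above, here is a short sketch of how this layout is traversed. The helper below is ours rather than part of the book's source code; it simply collects the direct children of a node by following the firstChild_ reference and then the nextSibling_ chain:

// Gather the direct children of 'parent' by walking the sibling list.
// This is a plain linked-list iteration, O(number of children).
std::vector<int> gatherChildren(const Scene& scene, int parent)
{
  std::vector<int> children;
  for (int c = scene.hierarchy_[parent].firstChild_; c != -1;
       c = scene.hierarchy_[c].nextSibling_)
    children.push_back(c);
  return children;
}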

The conversion routine for Assimp's aiScene into our format is implemented in the Chapter7/SceneConverter project. It is a form of top-down recursive traversal where we create our implicit SceneNode objects in the Scene structure. Let's go through the steps for traversing a scene stored in the aforementioned format:

  1. Traversal starts from some node with a given parent that's passed as a parameter. A new node identifier is returned by the addNode() routine shown here. If aiNode contains a name, we store it in the Scene::names_ array:

    void traverse(const aiScene* sourceScene, Scene& scene,
      aiNode* node, int parent, int atLevel)

    {

      int newNodeID = addNode(scene, parent, atLevel);

      if (node->mName.C_Str()) {

        uint32_t stringID = (uint32_t)scene.names_.size();

        scene.names_.push_back(      std::string( node->mName.C_Str()) );

        scene.nameForNode_[newNodeID] = stringID;

      }

  2. If this aiNode object has meshes attached to it, we must create one subnode for each of the meshes. For easier debugging, we will add a name for each new mesh subnode:

      for (size_t i = 0; i < node->mNumMeshes ; i++) {

        int newSubNodeID = addNode(scene, newNodeID, atLevel + 1);

        uint32_t stringID = (uint32_t)scene.names_.size();

        scene.names_.push_back(      std::string(node->mName.C_Str()) +      "_Mesh_" + std::to_string(i));

        scene.nameForNode_[newSubNodeID] = stringID;

  3. Each of the meshes is assigned to the newly created subnode. Assimp ensures that a mesh has a material assigned to it, so we will assign that material to our node:

        int mesh = (int)node->mMeshes[i];

        scene.meshes_[newSubNodeID] = mesh;

        scene.materialForNode_[newSubNodeID] =      sourceScene->mMeshes[mesh]->mMaterialIndex;

  4. Since we only use these subnodes to attach meshes, we will set their local and global transformations to identity matrices:

        scene.globalTransform_[newSubNodeID] = glm::mat4(1.0f);

        scene.localTransform_[newSubNodeID] = glm::mat4(1.0f);

      }

  5. The global transformation is set to identity at the beginning of node conversion. It will be recalculated during the first frame or whenever the node is marked as changed. See the Implementing transformation trees recipe in this chapter for the implementation details. The local transformation is fetched from aiNode and converted into a glm::mat4 object:

      scene.globalTransform_[newNodeID] = glm::mat4(1.0f);

      scene.localTransform_[newNodeID] =    toMat4(node->mTransformation);

  6. At the end, we recursively traverse the children of this aiNode object:

      for (unsigned int n = 0 ; n < node->mNumChildren ; n++)

        traverse(sourceScene, scene, node->mChildren[n],      newNodeID, atLevel+1);

    }

  7. The toMat4() helper function performs a per-component conversion of its aiMatrix4x4 parameter into a GLM matrix:

    glm::mat4 toMat4(const aiMatrix4x4& m) {

      glm::mat4 mm;

      for (int i = 0; i < 4; i++)

        for (int j = 0; j < 4; j++)

          mm[i][j] = m[i][j];

      return mm;

    }

The most complex part of the code for dealing with the Scene data structure is the addNode() routine, which allocates a new scene node and adds it to the scene hierarchy. Let's check out how to implement it:

  1. First, the addition process acquires a new node identifier, which is the current size of the hierarchy array. New identity transforms are added to the local and global transform arrays. The hierarchy for the newly added node only consists of the parent reference:

    int addNode(Scene& scene, int parent, int level)

    {

      int node = (int)scene.hierarchy_.size();

      scene.localTransform_.push_back(glm::mat4(1.0f));

      scene.globalTransform_.push_back(glm::mat4(1.0f));

      scene.hierarchy_.push_back({ .parent_ = parent });

  2. If we have a parent, we must fix its first child reference and, potentially, the next sibling reference of some other node. If a parent node has no children, we must directly set its firstChild_ field; otherwise, we should run over the siblings of this child to find out where to add the next sibling:

      if (parent > -1) {

        int s = scene.hierarchy_[parent].firstChild_;

        if (s == -1) {

          scene.hierarchy_[parent].firstChild_ = node;

          scene.hierarchy_[node].lastSibling_ = node;

        } else {

          int dest = scene.hierarchy_[s].lastSibling_;

          if (dest <= -1) {

            // iterate nextSibling_ indices
            for (dest = s;
                 scene.hierarchy_[dest].nextSibling_ != -1;
                 dest = scene.hierarchy_[dest].nextSibling_);

          }

          scene.hierarchy_[dest].nextSibling_ = node;

          scene.hierarchy_[s].lastSibling_ = node;

        }

      }

    After the loop, we assign our new node as the next sibling of the last child. Note that the linear run over the siblings is only a fallback for the case when the cached lastSibling_ index is not set; storing the index of the last child that was added is exactly what allows addNode() to skip this loop. We will revisit scene graph modification routines in Chapter 9, Working with Scene Graphs.

  3. The level of this node is stored for correct global transformation updating. To keep the structure valid, we will store negative indices (-1) in the new node's firstChild_ and nextSibling_ references:

        scene.hierarchy_[node].level_ = level;

        scene.hierarchy_[node].nextSibling_ = -1;

        scene.hierarchy_[node].firstChild_  = -1;

        return node;

    }

Once we have the material system in place, we can use the traverse() routine in our new SceneConverter tool.
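For reference, the whole conversion is kicked off from the root aiNode of the imported scene. The following sketch shows one possible call site; the import flags and the fileName variable are illustrative and not copied verbatim from the SceneConverter sources:

// Import the source asset with Assimp and convert it into our Scene format.
const aiScene* aiscene = aiImportFile(fileName,
  aiProcess_Triangulate | aiProcess_GenSmoothNormals);

Scene ourScene;
// -1 means "no parent"; the root node starts at level 0.
traverse(aiscene, ourScene, aiscene->mRootNode, -1, 0);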

There's more...

Data-oriented design (DOD) is a vast domain, and we just used a few techniques from it. We recommend reading the online book Data-Oriented Design, by Richard Fabian, to get yourself familiar with more DOD concepts: https://www.dataorienteddesign.com/dodbook.

The Chapter7/VK01_SceneGraph demo application contains some basic scene graph editing capabilities built with ImGui. These can help you get started with integrating scene graphs into your productivity tools. Check out shared/vkFramework/GuiRenderer.cpp for more details. The following recursive function, renderSceneTree(), is responsible for rendering the scene graph tree hierarchy in the UI and selecting a node for editing:

int renderSceneTree(const Scene& scene, int node) {

  int selected = -1;

  std::string name = getNodeName(scene, node);

  std::string label = name.empty() ?    (std::string("Node") + std::to_string(node)) : name;

  int flags = (scene.hierarchy_[node].firstChild_ < 0) ?    ImGuiTreeNodeFlags_Leaf|ImGuiTreeNodeFlags_Bullet : 0;

  const bool opened = ImGui::TreeNodeEx(    &scene.hierarchy_[node], flags, "%s", label.c_str());

  ImGui::PushID(node);

  if (ImGui::IsItemClicked(0)) selected = node;

  if (opened) {

    for (int ch = scene.hierarchy_[node].firstChild_;         ch != -1; ch = scene.hierarchy_[ch].nextSibling_)

    {

      int subNode = renderSceneTree(scene, ch);

      if (subNode > -1) selected = subNode;

    }

    ImGui::TreePop();

  }

  ImGui::PopID();

  return selected;

}
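A minimal usage sketch inside an ImGui frame might look like the following; the window title is arbitrary, and the editing call is only indicated in a comment:

ImGui::Begin("Scene graph");
// Start rendering the tree from the root node (index 0).
int selectedNode = renderSceneTree(scene, 0);
ImGui::End();

if (selectedNode > -1) {
  // Hand the selected node over to the editing UI,
  // for example, editNode(scene, selectedNode);
}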

The editNode() function can be used as a basis for building editing functionality for nodes, materials, and other scene graph content.

Loading and saving a scene graph

To paraphrase Frederick Brooks, "Show me your data structures and I do not need to see your code." Hopefully, it is already more or less clear how to implement the basic operations on a scene graph, but the remaining recipes in this chapter will explicitly describe all the required routines. Here, we will provide an overview of the loading and saving operations for our scene graph structure.

Getting ready

Make sure you have read the previous recipe, Using data-oriented design for a scene graph, before proceeding any further.

How to do it...

The loading procedure is a sequence of fread() calls, followed by a pair of loadMap() operations. As usual, we will be omitting any error handling code in this book's text; however, the accompanying source code bundle contains many necessary checks to see if the file was actually opened and so on. Let's get started:

  1. After opening the file, we can read the count of stored scene nodes. The data arrays are resized accordingly:

    void loadScene(const char* fileName, Scene& scene)

    {

      FILE* f = fopen(fileName, "rb");

      uint32_t sz;

      fread(&sz, sizeof(sz), 1, f);

      scene.hierarchy_.resize(sz);

      scene.globalTransform_.resize(sz);

      scene.localTransform_.resize(sz);

  2. fread() reads the transformations and hierarchical data for all the scene nodes:

      fread(scene.localTransform_.data(),     sizeof(glm::mat4), sz, f);

      fread(scene.globalTransform_.data(),    sizeof(glm::mat4), sz, f);

      fread(    scene.hierarchy_.data(), sizeof(Hierarchy), sz, f);

  3. Node-to-material and node-to-mesh mappings are loaded with the calls to the loadMap() helper routine:

      loadMap(f, scene.materialForNode_);

      loadMap(f, scene.meshes_);

  4. If there is still some data left in the file, we read the scene node names and material names (a minimal sketch of the string-list helpers appears at the end of this recipe):

      if (!feof(f)) {

        loadMap(f, scene.nameForNode_);

        loadStringList(f, scene.names_);

        loadStringList(f, scene.materialNames_);

      }

      fclose(f);

    }

Saving the scene reverses the loadScene() routine. Let's take a look:

  1. At the beginning of the file, we must write the count of scene nodes:

    void saveScene(const char* fileName,  const Scene& scene)

    {

      FILE* f = fopen(fileName, "wb");

      uint32_t sz = (uint32_t)scene.hierarchy_.size();

      fwrite(&sz, sizeof(sz), 1, f);

  2. Three fwrite() calls save the local and global transformations, followed by the hierarchical information:

      fwrite(scene.localTransform_.data(),    sizeof(glm::mat4), sz, f);

      fwrite(scene.globalTransform_.data(),    sizeof(glm::mat4), sz, f);

      fwrite(scene.hierarchy_.data(), sizeof(Hierarchy),    sz, f);

  3. Two saveMap() calls store the node-to-materials and node-to-mesh mappings:

      saveMap(f, scene.materialForNode_);

      saveMap(f, scene.meshes_);

    If the scene node and material names are not empty, we must also store these maps:

      if (!scene.names_.empty() &&      !scene.nameForNode_.empty()) {

        saveMap(f, scene.nameForNode_);

        saveStringList(f, scene.names_);

        saveStringList(f, scene.materialNames_);

      }

      fclose(f);

    }
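As a quick usage sketch, the converter writes the scene once and the renderer reads it back at startup; the file name here is a placeholder:

// In the SceneConverter tool, after the Assimp scene has been traversed:
saveScene("data/out.scene", ourScene);

// In the rendering application, at startup:
Scene scene;
loadScene("data/out.scene", scene);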

Now, let's briefly describe the helper routines for loading and saving unordered maps. std::unordered_map is loaded in three steps:

  1. First, the number of stored 32-bit values is read from the file. Note that this is twice the number of {key, value} pairs because keys and values are stored in one flat array:

    void loadMap(FILE* f,  std::unordered_map<uint32_t, uint32_t>& map)

    {

      std::vector<uint32_t> ms;

      uint32_t sz;

      fread(&sz, 1, sizeof(sz), f);

  2. Then, all the key-value pairs are loaded with a single fread call:

      ms.resize(sz);

      fread(ms.data(), sizeof(int), sz, f);

  3. Finally, the array is converted into a hash table:

      for (size_t i = 0; i < (sz / 2) ; i++)

        map[ms[i * 2 + 0]] = ms[i * 2 + 1];

    }

The saving routine for std::unordered_map is created by reversing loadMap() line by line:

  1. A temporary {key, value} pair array is allocated:

    void saveMap(FILE* f,  const std::unordered_map<uint32_t, uint32_t>& map)

    {

      std::vector<uint32_t> ms;

      ms.reserve(map.size() * 2);

  2. All the keys and values from std::unordered_map are copied into the array, pair by pair:

      for (const auto& m : map) {

        ms.push_back(m.first);

        ms.push_back(m.second);

      }

  3. The number of stored values, which is twice the number of {key, value} pairs, is written to the file:

      uint32_t sz = (uint32_t)ms.size();

      fwrite(&sz, sizeof(sz), 1, f);

  4. Finally, the {key, value} pairs are written with one fwrite() call:

      fwrite(ms.data(), sizeof(int), ms.size(), f);

    }
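The loadStringList() and saveStringList() helpers used by loadScene() and saveScene() live in the shared framework code. A minimal sketch, assuming a simple length-prefixed layout that stores each string together with its terminating zero, could look like this:

void saveStringList(FILE* f, const std::vector<std::string>& lines)
{
  uint32_t sz = (uint32_t)lines.size();
  fwrite(&sz, sizeof(uint32_t), 1, f);
  for (const auto& s : lines) {
    sz = (uint32_t)s.length();
    fwrite(&sz, sizeof(uint32_t), 1, f);
    fwrite(s.c_str(), sz + 1, 1, f);    // write the trailing '\0' as well
  }
}

void loadStringList(FILE* f, std::vector<std::string>& lines)
{
  uint32_t sz = 0;
  fread(&sz, sizeof(uint32_t), 1, f);
  lines.resize(sz);
  std::vector<char> inBytes;
  for (auto& s : lines) {
    fread(&sz, sizeof(uint32_t), 1, f);
    inBytes.resize(sz + 1);
    fread(inBytes.data(), sz + 1, 1, f);
    s = std::string(inBytes.data(), sz);
  }
}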

There's more...

Topology changes for the nodes in our scene graph pose a solvable, if somewhat fiddly, problem; the corresponding source code is discussed in the Deleting nodes and merging scene graphs recipe of Chapter 9, Working with Scene Graphs. For rendering, we just have to keep all the mesh geometry in a single GPU buffer. We will show you how to implement this later in this chapter in MultiRenderer, which is a refactoring of the MultiMeshRenderer class from Chapter5/VK01_MultiMeshDraw.

The material conversion routines will be implemented in the Implementing a material system recipe. Together with scene loading and saving, they complete the SceneConverter tool.

Implementing transformation trees

A scene graph is typically used to represent spatial relationships. For the purpose of rendering, we must calculate a global affine 3D transformation for each of the scene graph nodes. This recipe will show you how to correctly calculate global transformations from local transformations without making any redundant calculations.

Getting ready

Using the previously defined Scene structure, we will show you how to correctly recalculate global transformations. Please revisit the Using data-oriented design for a scene graph recipe before proceeding. To start this recipe, recall that we had the dangerous but tempting idea of using a recursive global transform calculator in the non-existent SceneNode::render() method:

void SceneNode::render() {

  mat4 parentTransform = parent_ ?    parent_->globalTransform_ : identity();

  this->globalTransform_ = parentTransform * localTransform_;

  ... rendering and recursion

}

It is always better to separate operations such as rendering, scene traversal, and transform calculation, while at the same time executing similar operations in large batches. This separation becomes even more important when the number of nodes becomes large.

We have already learned how to render several meshes with a single GPU draw call by using a combination of indirect rendering and programmable vertex pulling. Here, we will show you how to perform the minimum amount of global transform recalculations.

How to do it...

It is always good to avoid unnecessary calculations. In the case of global transformations of the scene nodes, we need a way to mark certain nodes whose transforms have changed in this frame. Since changed nodes may have children, we must also mark those children as changed. Let's take a look:

  1. In the Scene structure, we should declare a collection of changedAtThisFrame_ lists, one per scene graph level, so that any changed node can be quickly added to the list for its level:

    struct Scene {

      … somewhere in transform component …

      std::vector<int> changedAtThisFrame_[MAX_NODE_LEVEL];

    };

  2. The markAsChanged() routine starts with a given node and recursively descends to each and every child node, adding it to the changedAtLevel_ arrays. First, the node itself is marked as changed:

    void markAsChanged(Scene& scene, int node) {

      int level = scene.hierarchy_[node].level_;

      scene.changedAtThisFrame_[level].push_back(node);

  3. We start from the first child and advance to the next sibling, descending through the hierarchy:

      for (int s = scene.hierarchy_[node].firstChild_;
           s != -1; s = scene.hierarchy_[s].nextSibling_)

        markAsChanged(scene, s);

    }

To recalculate all the global transformations for changed nodes, the following function must be implemented. No work is done if no local transformations were updated, and the scene is essentially static. Let's take a look:

  1. We start with the root level of the changed-nodes lists, assuming there is only one root node. Its global transform coincides with its local transform, so we can copy it directly. The list of changed nodes for this level is then cleared:

    void recalculateGlobalTransforms(Scene& scene)

    {

      if (!scene.changedAtThisFrame_[0].empty()) {

        int c = scene.changedAtThisFrame_[0][0];

        scene.globalTransform_[c] =      scene.localTransform_[c];

        scene.changedAtThisFrame_[0].clear();

      }

  2. For all the lower levels, every node is guaranteed to have a parent, so the loops stay linear, with no conditions inside. We start from level 1 because the root level has already been handled. The exit condition is an empty list of changed nodes at the current level; we also avoid descending deeper than MAX_NODE_LEVEL allows:

      for (int i = 1 ; i < MAX_NODE_LEVEL &&       !scene.changedAtThisFrame_[i].empty(); i++ )

      {

  3. Now, we must iterate all the changed nodes at this level. For each of the iterated nodes, we fetch the parent transform and multiply it by the local node transform:

        for (int c : scene.changedAtThisFrame_[i]) {

          int p = scene.hierarchy_[c].parent_;

          scene.globalTransform_[c] =        scene.globalTransform_[p] *        scene.localTransform_[c];

        }

  4. At the end of the node iteration process, we should clear the list of changed nodes for this level:

        scene.changedAtThisFrame_[i].clear();

      }

    }

The essence of this implementation is the fact that we do not recalculate any of the global transformations multiple times. Since we start from the root layer of the scene graph tree, all the changed layers below the root acquire a valid global transformation for their parents.
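To see how the two routines fit together, here is a sketch of a typical per-frame update; nodeToAnimate and angle are placeholder variables:

// Update the local transform of one node and mark its subtree as changed.
scene.localTransform_[nodeToAnimate] =
  glm::rotate(glm::mat4(1.0f), angle, glm::vec3(0.0f, 1.0f, 0.0f));
markAsChanged(scene, nodeToAnimate);

// ...more local-transform updates may follow during this frame...

// A single batched pass recalculates only the affected global transforms.
recalculateGlobalTransforms(scene);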

Note

Depending on how frequently local transformations are updated, it may be more performant to eliminate the list of recently updated nodes and always perform a full update. Profile your real code before making a decision.
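For comparison, a brute-force full update is a single linear pass over all the nodes. The sketch below assumes that every parent has a smaller index than its children, which holds for scenes produced by the traverse() and addNode() routines shown earlier:

void recalculateAllTransforms(Scene& scene)
{
  for (size_t i = 0; i < scene.hierarchy_.size(); i++) {
    const int p = scene.hierarchy_[i].parent_;
    // Root nodes copy their local transform; all others multiply by the parent.
    scene.globalTransform_[i] = (p == -1) ?
      scene.localTransform_[i] :
      scene.globalTransform_[p] * scene.localTransform_[i];
  }
}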

There's more...

As an advanced exercise, transfer the computation of changed node transformations to your GPU. This is relatively easy to implement, considering that we have compute shaders and buffer management in place.

Implementing a material system

Chapter 6, Physically Based Rendering Using the glTF2 Shading Model, provided a description of the PBR shading model and presented all the required GLSL shaders for rendering a single 3D object using multiple textures. Here, we will show you how to organize scene rendering with multiple objects with different materials and properties. Our material system is compatible with the glTF2 material format and easily extensible for incorporating many existing glTF2 extensions.

Getting ready

The previous chapters dealt with rendering individual objects and applying a PBR model to light them. In the Using data-oriented design for a scene graph recipe, we learned the general structure for scene organization and used opaque integers as material handles. Here, we will define a structure for storing material parameters and show you how this structure can be used in GLSL shaders. The routine to convert material parameters from the ones loaded by Assimp will be described later in this chapter, in the Importing materials from Assimp recipe.

How to do it...

We need a structure to represent our PBR material, both in CPU memory to load it from a file and in a GPU buffer. Let's get started:

  1. The structure contains both the numeric values that define the lighting properties of the material and the set of texture indices. At the beginning of the definition, we use a custom macro, PACKED_STRUCT,  that hides the details of structure member alignment. This is necessary to make sure the structure layout in memory matches the corresponding structure declaration in the GLSL shader. The first two fields store the emissive and ambient color constants of our shading model:

    #ifdef __GNUC__

    #  define PACKED_STRUCT     __attribute__((packed,aligned(1)))

    #else

    #  define PACKED_STRUCT

    #endif

    struct PACKED_STRUCT MaterialDescription final

    {

      gpuvec4 emissiveColor_ = { 0.0f, 0.0f, 0.0f, 0.0f};

      gpuvec4 albedoColor_   = { 1.0f, 1.0f, 1.0f, 1.0f };

  2. The roughness_ field contains the surface's roughness. Two components, .x and .y, can be used to represent anisotropic roughness when necessary:

      gpuvec4 roughness_     = { 1.0f, 1.0f, 0.0f, 0.0f };

  3. We can describe the transparent materials by using a transparency factor, which is used to render with alpha-blended materials, or by using an alpha test threshold, which is used for the simple punch-through transparency rendering we have implemented in the demo applications for this chapter. Besides that, we must store the metallic factor for our PBR rendering:

      float transparencyFactor_ = 1.0f;

      float alphaTest_          = 0.0f;

      float metallicFactor_     = 1.0f;

  4. To customize our rendering pipeline, we may want to use some flags that differ from material to material or from object to object. In the demos for this book, we do not need such flexibility, so we will render all the objects with a single shader where only the texture inputs change. However, there is a placeholder for storing these flags:

      uint32_t flags_ = sMaterialFlags_CastShadow |                    sMaterialFlags_ReceiveShadow;

  5. The second part of the structure contains indices into the list of all textures. In Vulkan, the texture is addressed by a 32-bit integer in a big texture array, while in OpenGL, the texture is an opaque 64-bit handle provided by the OpenGL implementation via the ARB_bindless_texture extension. To make sure the same GLSL material declaration can be shared between OpenGL and Vulkan, we will use 64-bit values here. For empty textures, we will use a special guard value. In both cases, this is a 32-bit integer with all its bits set to one:

      uint64_t ambientOcclusionMap_ = 0xFFFFFFFF;

      uint64_t emissiveMap_ = 0xFFFFFFFF;

      uint64_t albedoMap_ = 0xFFFFFFFF;

      uint64_t metallicRoughnessMap_ = 0xFFFFFFFF;

      uint64_t normalMap_ = 0xFFFFFFFF;

  6. The opacity map is only used during conversion and has only been included because we are using the same structure in our SceneConverter tool:

      uint64_t opacityMap_ = 0xFFFFFFFF;

    };

    See the Implementing a scene conversion tool recipe for further details on how to convert and pack material textures.

  7. Each vector value is packed into four floats. The gpuvec4 structure is tightly packed and occupies exactly 16 bytes; the PACKED_STRUCT macro instructs the GCC compiler to avoid inserting any padding:

    struct PACKED_STRUCT gpuvec4 {

      float x, y, z, w;

      gpuvec4() = default;

      gpuvec4(float a, float b, float c, float d)

      : x(a), y(b), z(c), w(d) {}

      gpuvec4(const vec4& v) : x(v.x), y(v.y), z(v.z),

      w(v.w) {}

    };
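Since this structure is copied verbatim into a GPU buffer, it is worth verifying the packed layout at compile time. The following checks are a suggestion rather than part of the book's code; the 112-byte figure assumes exactly the fields listed above with no padding (3 * 16 + 3 * 4 + 4 + 6 * 8 bytes):

// gpuvec4 must map exactly onto a 16-byte GLSL vec4.
static_assert(sizeof(gpuvec4) == 4 * sizeof(float),
              "gpuvec4 must be tightly packed");
// Keep the CPU-side layout in sync with the GLSL MaterialData declaration.
static_assert(sizeof(MaterialDescription) == 112,
              "MaterialDescription layout is out of sync with the shaders");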

Now that we have the data structures in place, let's take a look at the loading and saving code:

  1. The following function reads a list of materials from a file. The loadStringList() function loads the texture filenames into the files container:

    void loadMaterials(const char* fileName,  std::vector<MaterialDescription>& materials,  std::vector<std::string>& files)

    {

      FILE* f = fopen(fileName, "rb");

      if (!f) return;

      uint32_t sz;

      fread(&sz, 1, sizeof(uint32_t), f);

      materials.resize(sz);

      fread(materials.data(), sizeof(MaterialDescription),    materials.size(), f);

      loadStringList(f, files);

      fclose(f);

    }

  2. Our SceneConverter tool needs the saveMaterials() function, which writes the converted material data to a file. The code uses a helper function called saveStringList(), which appends a list of strings to an opened binary file:

    void saveMaterials(const char* fileName,  const std::vector<MaterialDescription>& materials,  const std::vector<std::string>& files)

    {

      FILE* f = fopen(fileName, "wb");

      if (!f) return;

      uint32_t sz = (uint32_t)materials.size();

      fwrite(&sz, 1, sizeof(uint32_t), f);

      fwrite(materials.data(), sizeof(MaterialDescription),    sz, f);

      saveStringList(f, files);

      fclose(f);

    }
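Here is a usage sketch; the file name is a placeholder rather than a path from the book's data set:

std::vector<MaterialDescription> materials;
std::vector<std::string> textureFiles;

// In the SceneConverter tool, after all aiMaterials have been converted:
saveMaterials("data/out.materials", materials, textureFiles);

// In the rendering application, at startup:
loadMaterials("data/out.materials", materials, textureFiles);
// ...then upload 'materials' into a GPU buffer and load every entry of
// 'textureFiles' into the global texture array...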

At program start, we load the list of materials and all the texture files into GPU textures. Now, we are ready to learn how the MaterialDescription structure is used in GLSL shaders:

  1. The vertex shader is similar to the GLSL shader from Chapter 5, Working with Geometry Data. The output values for each vertex are the texture coordinates uvw, the normal vectors v_worldNormal, and the positions in world space coordinates v_worldPos:

    layout(location = 0) out vec3 uvw;

    layout(location = 1) out vec3 v_worldNormal;

    layout(location = 2) out vec4 v_worldPos;

  2. The matIdx output attribute contains the index of the material to be used in the fragment shader. The flat attribute instructs the GPU to avoid interpolating this value:

    layout(location = 3) out flat uint matIdx;

  3. The per-vertex attributes are used in all the subsequent shaders, so they are declared in separate reusable files. The VK01.h file contains the memory layout of the per-vertex attributes in the ImDrawVert structure:

    #include <data/shaders/chapter07/VK01.h>

    struct ImDrawVert {

      float x, y, z; float u, v; float nx, ny, nz;

    };

  4. The DrawData structure contains the information needed to render a mesh instance with a specific material. The mesh and material indices represent offsets into GPU buffers; these will be discussed shortly. The lod field indicates the level of detail, that is, the relative offset into the vertex data. The indexOffset and vertexOffset fields contain offsets into the mesh index and geometry buffers. The transformIndex field stores the index of the global object-to-world-space transformation that's calculated by the scene graph routines. We showed you how the transformation data is packed in the previous recipe, Implementing transformation trees:

    struct DrawData {

      uint mesh;

      uint material;

      uint lod;

      uint indexOffset;

      uint vertexOffset;

      uint transformIndex;

    };

  5. After including VK01.h, we have another #include statement:

    #include <data/shaders/chapter07/VK01_VertCommon.h>

    The VK01_VertCommon.h file contains all the buffer attachments for the vertex shader. The first buffer contains two per-frame uniforms – the model-view-projection matrix and the camera position in the world space:

    layout(binding = 0) uniform  UniformBuffer

      { mat4 inMtx; vec4 cameraPos; } ubo;

  6. As usual, we must employ the programmable vertex pulling technique, so we need the indices and vertices to be in separate buffers:

    layout(binding = 1) readonly buffer SBO   { ImDrawVert data[]; } sbo;

    layout(binding = 2) readonly buffer IBO   { uint data[]; } ibo;

    layout(binding = 3) readonly buffer DrawBO   { DrawData data[]; } drawDataBuffer;

    layout(binding = 5) readonly buffer XfrmBO   { mat4 data[]; } transformBuffer;

The rest of the VK01.h file refers to yet another file, data/shaders/chapter07/MaterialData.h, which defines a GLSL structure, MaterialData, equivalent to the MaterialDescription structure described at the beginning of this recipe.

Now, let's return to the main vertex shader:

  1. First, we fetch the per-instance DrawData structure from the buffer using gl_BaseInstance. Its indexOffset field, combined with the local gl_VertexIndex, gives us the position of this vertex's index in the index buffer:

    void main() {

      DrawData dd = drawDataBuffer.data[gl_BaseInstance];

      uint refIdx = dd.indexOffset + gl_VertexIndex;

  2. The vertex index is calculated by adding the global vertex offset for this mesh to the vertex index fetched from the ibo buffer:

      ImDrawVert v = sbo.data[ibo.data[refIdx] +

      dd.vertexOffset];

  3. The object-to-world transformation is read directly from transformBuffer using the instance index:

      mat4 model = transformBuffer.data[gl_BaseInstance];

  4. At the end of the vertex shader, we calculate the fragment's world space position and normal vector. Since the code needs to be compatible with our OpenGL demos, we will flip the y coordinate to convert OpenGL's coordinates into Vulkan's inverted coordinate system:

      v_worldPos   = model * vec4(v.x, -v.y, v.z, 1.0);

      v_worldNormal = transpose(inverse(mat3(model))) *    vec3(v.nx, -v.ny, v.nz);

  5. For rasterization purposes, we will multiply the world space position by the aggregate camera view and projection matrix. This gives us the clip space coordinates of the fragment:

      gl_Position = ubo.inMtx * v_worldPos;

  6. The only difference from the shader shown in Chapter 5, Working with Geometry Data, is the matIdx output value's assignment. This index is used in the fragment shader to read the appropriate material parameters. The texture coordinates are passed into the fragment shader without any conversions needing to take place:

      matIdx = dd.material;

      uvw = vec3(v.u, v.v, 1.0);

    }

Now, let's take a look at the fragment shader:

  1. The fragment shader uses a single buffer that contains the material data we defined previously. Also, a single array of textures is used for all the maps:

    layout(binding = 4) readonly   buffer MatBO { MaterialData data[]; } mat_bo;

    layout(binding = 9) uniform sampler2D textures[];

  2. The main function looks up the material data using the material index that was passed from the vertex shader. For demonstration purposes, we will read the emissive color value, which is added to the output color later:

    void main() {

      MaterialData md = mat_bo.data[matIdx];

      vec4 emission = md.emissiveColor_;

  3. Default values for the albedo color and the normal map sample are assigned:

      vec4 albedo = vec4(1.0, 0.0, 0.0, 1.0);

      vec3 normalSample = vec3(0.0, 0.0, 1.0);

  4. The albedo color value is read from the appropriate texture by non-uniformly addressing the global texture array:

      {

        uint texIdx = uint(md.albedoMap_);

        albedo = texture(      textures[nonuniformEXT(texIdx)], uvw.xy);

      }

  5. The normal map is read in the same way:

      {

        uint texIdx = uint(md.normalMap_);

        normalSample = texture(      textures[nonuniformEXT(texIdx)], uvw.xy).xyz;

      }

  6. Just as we did with the PBR shader from Chapter 6, Physically Based Rendering Using the glTF2 Shading Model, an alpha test is performed for objects with transparency masks:

      runAlphaTest(albedo.a, md.alphaTest_);

    To avoid dealing with any kind of scene sorting at this point, alpha transparency is simulated using dithering and punch-through transparency. You can find some useful insights at http://alex-charlton.com/posts/Dithering_on_the_GPU. The following is the final solution:

    void runAlphaTest(float alpha, float alphaThreshold) {

      if (alphaThreshold == 0.0) return;

      mat4 thresholdMatrix = mat4(
         1.0/17.0,  9.0/17.0,  3.0/17.0, 11.0/17.0,
        13.0/17.0,  5.0/17.0, 15.0/17.0,  7.0/17.0,
         4.0/17.0, 12.0/17.0,  2.0/17.0, 10.0/17.0,
        16.0/17.0,  8.0/17.0, 14.0/17.0,  6.0/17.0);

      int x = int(mod(gl_FragCoord.x, 4.0));

      int y = int(mod(gl_FragCoord.y, 4.0));

      alpha = clamp(    alpha - 0.5 * thresholdMatrix[x][y], 0.0, 1.0);

      if (alpha < alphaThreshold) discard;

    }

  7. The world normal is normalized to compensate for the interpolation that occurred while rasterizing the triangle:

      vec3 n = normalize(v_worldNormal);

  8. If the length of the sampled normal map value is above a threshold, we use it to perturb the world space normal:

      if (length(normalSample) > 0.5)

        n = perturbNormal(n,
              normalize(ubo.cameraPos.xyz - v_worldPos.xyz),
              normalSample, uvw.xy);

  9. The rest of the fragment shader applies a simplified lighting model:

      vec3 lightDir = normalize(vec3(-1.0, -1.0, 0.1));

      float NdotL = clamp( dot(n, lightDir), 0.3, 1.0 );

      outColor = vec4(    albedo.rgb * NdotL + emission.rgb, 1.0 );

    }

The next recipe will show you how to extract and pack the values from the Assimp library's aiMaterial structure into our MaterialData structure.

Importing materials from Assimp

In Chapter 5, Working with Geometry Data, we learned how to define a runtime data storage format for mesh geometry. This recipe will show you how to use the Assimp library to extract material properties from Assimp data structures. Combined with the next recipe, which will cover our SceneConverter tool, this concludes the process of describing our data content exporting pipeline.

Getting ready

In the previous recipe, we learned how to render multiple meshes with different materials. Now, it is time to learn how to import the material data from popular 3D asset formats.

How to do it...

Let's take a look at the convertAIMaterialToDescription() function that's used in the SceneConverter tool. It retrieves all the required parameters from the aiMaterial structure and returns a MaterialDescription object that can be used with our GLSL shaders. Let's take a look:

  1. Each texture is addressed by an integer identifier. We store the list of texture filenames in the files parameter. The opacityMaps parameter collects the list of textures that need to be combined with transparency maps:

    MaterialDescription convertAIMaterialToDescription(  const aiMaterial* M,  std::vector<std::string>& files,  std::vector<std::string>& opacityMaps)

    {

      MaterialDescription D;

      aiColor4D Color;

  2. The Assimp API provides getter functions to extract individual color parameters. We will use some of these here:

      if ( aiGetMaterialColor(M, AI_MATKEY_COLOR_AMBIENT,       &Color) == AI_SUCCESS ) {

        D.emissiveColor_ =      { Color.r, Color.g, Color.b, Color.a };

        if ( D.emissiveColor_.w > 1.0f )

          D.emissiveColor_.w = 1.0f;

      }

    The first parameter we are trying to extract is the "ambient" color, which is stored in the emissiveColor_ field of MaterialDescription. The alpha value is clamped to 1.0.

  3. In the same way, the diffuse color is stored in the albedoColor_ field of MaterialDescription with a clamped alpha channel:

      if ( aiGetMaterialColor(M, AI_MATKEY_COLOR_DIFFUSE,       &Color) == AI_SUCCESS ) {

        D.albedoColor_ =      { Color.r, Color.g, Color.b, Color.a };

        if ( D.albedoColor_.w > 1.0f )      D.albedoColor_.w = 1.0f;

      }

  4. If aiMaterial contains an emissive color value, we will add it to the emissiveColor_ property we loaded previously. The per-component color addition is necessary here because this is the only place where we will use color addition. Due to this, we did not define the addition operator for gpuvec4:

      if (aiGetMaterialColor(M, AI_MATKEY_COLOR_EMISSIVE,      &Color) == AI_SUCCESS ) {

        D.emissiveColor_.x += Color.r;
        D.emissiveColor_.y += Color.g;
        D.emissiveColor_.z += Color.b;
        D.emissiveColor_.w += Color.a;

        if ( D.emissiveColor_.w > 1.0f )

          D.emissiveColor_.w = 1.0f;

      }

  5. The following constant sets the opaqueness threshold value to 5%:

      const float opaquenessThreshold = 0.05f;

      float Opacity = 1.0f;

    In our conversion routine, we are using one simple optimization trick for transparent materials: anything with an opaqueness of 95% or more is considered opaque and avoids any blending.

  6. The material opacity is converted into transparencyFactor and then clamped against the threshold value:

      if ( aiGetMaterialFloat(M, AI_MATKEY_OPACITY,       &Opacity) == AI_SUCCESS ) {

        D.transparencyFactor_ =      glm::clamp(1.0f-Opacity, 0.0f, 1.0f);

        if ( D.transparencyFactor_ >=         1.0f - opaquenessThreshold )

          D.transparencyFactor_ = 0.0f;

      }

  7. If the material contains a transparency factor as an RGB value, we use the maximum component value to calculate our transparency factor. As we did previously, we will clamp the transparency factor against a threshold:

      if ( aiGetMaterialColor(M,       AI_MATKEY_COLOR_TRANSPARENT,       &Color) == AI_SUCCESS ) {

        const float Opacity =      std::max(std::max(Color.r, Color.g), Color.b);

        D.transparencyFactor_ =      glm::clamp( Opacity, 0.0f, 1.0f );

        if ( D.transparencyFactor_ >=         1.0f - opaquenessThreshold )

          D.transparencyFactor_ = 0.0f;

        D.alphaTest_ = 0.5f;

      }

  8. Once we've finished reading the colors and transparency factors, we must fetch scalar properties of the material with the help of the aiGetMaterialFloat() function. All the values are loaded into a temporary variable. The PBR metallic and roughness factors are loaded into the appropriate MaterialDescription fields:

      float tmp = 1.0f;

      if (aiGetMaterialFloat(M,
          AI_MATKEY_GLTF_PBRMETALLICROUGHNESS_METALLIC_FACTOR,
          &tmp) == AI_SUCCESS)

        D.metallicFactor_ = tmp;

      if (aiGetMaterialFloat(M,
          AI_MATKEY_GLTF_PBRMETALLICROUGHNESS_ROUGHNESS_FACTOR,
          &tmp) == AI_SUCCESS)

        D.roughness_ = { tmp, tmp, tmp, tmp };

  9. All the textures for our materials are stored in external files. The names of these files can be extracted by using the aiGetMaterialTexture() function:

      aiString Path;

      aiTextureMapping Mapping;

      unsigned int UVIndex = 0;

      float Blend = 1.0f;

      aiTextureOp TextureOp = aiTextureOp_Add;

      const aiTextureMapMode TextureMapMode[2] =    { aiTextureMapMode_Wrap, aiTextureMapMode_Wrap };

      unsigned int TextureFlags = 0;

    This function requires several parameters, most of which we will ignore in our converter for the sake of simplicity.

  10. The first texture is an emissive map. We will use the addUnique() function to add the texture file to our textures list:

      if (aiGetMaterialTexture( M, aiTextureType_EMISSIVE,      0, &Path,&Mapping, &UVIndex, &Blend, &TextureOp,      TextureMapMode,&TextureFlags ) == AI_SUCCESS)

        D.emissiveMap_ = addUnique(files, Path.C_Str());

  11. The diffuse map is stored as the albedoMap_ field in our material structure:

      if (aiGetMaterialTexture( M, aiTextureType_DIFFUSE,      0, &Path,&Mapping, &UVIndex, &Blend, &TextureOp,      TextureMapMode, &TextureFlags ) == AI_SUCCESS)

        D.albedoMap_ = addUnique(files, Path.C_Str());

  12. The normal map can be extracted from either the aiTextureType_NORMALS property or aiTextureType_HEIGHT in aiMaterial. We must check for the presence of an aiTextureType_NORMALS texture map and store the texture index in the normalMap_ field:

      if (aiGetMaterialTexture( M, aiTextureType_NORMALS,      0, &Path,&Mapping, &UVIndex, &Blend, &TextureOp,      TextureMapMode, &TextureFlags) == AI_SUCCESS)

        D.normalMap_ = addUnique(files, Path.C_Str());

  13. If there is no classic normal map, we should check if a heightmap texture is present. This can be converted into a normal map at a later stage of the conversion process:

      if (D.normalMap_ == 0xFFFFFFFF)

        if (aiGetMaterialTexture( M, aiTextureType_HEIGHT,        0, &Path,&Mapping, &UVIndex, &Blend,        &TextureOp, TextureMapMode,

            &TextureFlags ) == AI_SUCCESS)

        D.normalMap_ = addUnique(files, Path.C_Str());

  14. The last map we will be using is the opacity map, which is stored in a separate opacityMaps array. We will pack the opacity maps into the alpha channel of our albedo textures:

      if (aiGetMaterialTexture( M, aiTextureType_OPACITY,      0, &Path,&Mapping, &UVIndex, &Blend, &TextureOp,      TextureMapMode, &TextureFlags ) == AI_SUCCESS) {

        D.opacityMap_ =      addUnique(opacityMaps, Path.C_Str());

        D.alphaTest_ = 0.5f;

      }

  15. The final part of the material conversion routine applies some heuristics for guessing the material's properties, just by looking at the material's name. Here, we are only checking for glass-like materials in our largest test scene, but some common names, such as "gold," "silver," and so on can also be used to assign metallic coefficients and albedo colors. Essentially, this is an easy trick to make our test scene look better. At the end, the MaterialDescription instance is returned for further processing:

      aiString Name;

      std::string materialName;

      if (aiGetMaterialString(M, AI_MATKEY_NAME, &Name)      == AI_SUCCESS)

        materialName = Name.C_Str();

      if (materialName.find("Glass") != std::string::npos)

        D.alphaTest_ = 0.75f;

      if (materialName.find("Bottle") !=      std::string::npos)

        D.alphaTest_ = 0.54f;

      return D;

    }

  16. The only thing we need to mention here is the addUnique() function, which populates the list of texture files. We must check if this filename is already in the collection. If the file is not there, we must add it and return its index. Otherwise, the index of a previously added texture file is returned:

    int addUnique(std::vector<std::string>& files,  const std::string& file)

    {

      if (file.empty()) return -1;

      auto i = std::find(std::begin(files),    std::end(files), file);

      if (i != files.end())

        return (int)std::distance(files.begin(), i);

      files.push_back(file);

      return (int)files.size() - 1;

    }
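Putting these pieces together, the converter can walk all the materials of an imported scene with a loop similar to the following sketch; aiscene and the container names are ours:

std::vector<MaterialDescription> materials;
std::vector<std::string> files;        // global list of texture files
std::vector<std::string> opacityMaps;  // opacity maps to merge into albedo later

for (unsigned int m = 0; m < aiscene->mNumMaterials; m++)
  materials.push_back(convertAIMaterialToDescription(
    aiscene->mMaterials[m], files, opacityMaps));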

Before we move on, let's take a look at how to implement all the helper routines necessary for our scene converter tool, which will be described in the next recipe. The convertAndDownscaleAllTextures() function is used to generate the internal filenames for each of the textures and convert the contents of each texture into a GPU-compatible format. Let's take a look:

  1. As parameters, this routine accepts a list of material descriptions, an output directory for texture data, and the containers for all the texture files and opacity maps:

    void convertAndDownscaleAllTextures(  const std::vector<MaterialDescription>& materials,  const std::string& basePath,  std::vector<std::string>& files,  std::vector<std::string>& opacityMaps)

    {

  2. Each of the opacity maps is combined with the albedo map. To keep the correspondence between the opacity map list and the global texture indices, we will use a standard C++ hash table:

      std::unordered_map<std::string, uint32_t>    opacityMapIndices(files.size());

  3. We must iterate over all the materials and check if they have both an opacity and albedo map. If the opacity and albedo maps are present, we must associate this opacity map with the albedo map:

      for (const auto& m : materials)

        if (m.opacityMap_ != 0xFFFFFFFF &&        m.albedoMap_ != 0xFFFFFFFF)

          opacityMapIndices[files[m.albedoMap_]] =        m.opacityMap_;

  4. The following lambda takes a source texture filename and returns a modified texture filename. Internally, the texture data is converted here:

      auto converter = [&](const std::string& s) ->    std::string {

        return convertTexture(      s, basePath, opacityMapIndices, opacityMaps);

      };

  5. We use the std::transform() algorithm to convert all of the texture files:

      std::transform(std::execution::par,    std::begin(files), std::end(files),    std::begin(files), converter);

    }

    The std::execution::par execution policy comes from the C++17 parallel algorithms (available via the <execution> header) and allows std::transform() to process the array in parallel. Since converting the texture data is a rather lengthy process, this straightforward parallelization reduces our processing time significantly.

A single texture map is converted into our runtime data format with the following routine:

  1. All our output textures will have no more than 512x512 pixels:

    std::string convertTexture(const std::string& file,  const std::string& basePath,  std::unordered_map<std::string, uint32_t>&    opacityMapIndices,  const std::vector<std::string>& opacityMaps)

    {

      const int maxNewWidth = 512;

      const int maxNewHeight = 512;

  2. A temporary dynamic array will contain a combined albedo and opacity map. To run this on Windows, Linux, and macOS, we should replace all the path separators with the "/" symbol:

      std::vector<uint8_t> tmpImage(    maxNewWidth * maxNewHeight * 4);

      const auto srcFile =    replaceAll(basePath + file, "\\", "/");

  3. The new filename is a concatenation of a fixed output directory and a source filename, with all path separators replaced by double underscores:

      const auto newFile = std::string("data/out_textures/") +
        lowercaseString(replaceAll(replaceAll(srcFile, "..", "__"), "/", "__") +
          std::string("__rescaled")) + std::string(".png");

  4. Just as we did in the previous chapters, we will use the stb_image library to load the textures. We must force the loaded image to be in RGBA format, even if there is no opacity information. This is a shortcut that we can take here to make our texture handling code significantly simpler:

      int texWidth, texHeight, texChannels;

      stbi_uc* pixels =    stbi_load(fixTextureFile(srcFile).c_str(),    &texWidth, &texHeight, &texChannels,    STBI_rgb_alpha);

      uint8_t* src = pixels;

      texChannels = STBI_rgb_alpha;

    Note

    The fixTextureFile() function fixes situations where 3D model material data references texture files with inappropriate case in filenames. For example, the .mtl file may contain map_Ka Texture01.png, while the actual filename on the file system is called texture01.png. This way, we can fix naming inconsistencies in the Bistro scene on Linux.

  5. If the texture failed to load, we must set our temporary array as input data to avoid having to exit here:

      if (!src) {

        printf("Failed to load [%s] texture\n", srcFile.c_str());

        texWidth = maxNewWidth;

        texHeight = maxNewHeight;

        src = tmpImage.data();

      }

  6. If this texture has an associated opacity map stored in the hash table, we must load that opacity map and add its contents to the albedo map. As with the source texture file, we must replace the path separators for cross-platform operations. The opacity map is loaded as a simple grayscale image:

      if (opacityMapIndices.count(file) > 0) {

        const auto opacityMapFile = replaceAll(basePath +      opacityMaps[opacityMapIndices[file]], "\\", "/");

        int opacityWidth, opacityHeight;

        stbi_uc* opacityPixels =      stbi_load(opacityMapFile.c_str(),      &opacityWidth, &opacityHeight, nullptr, 1);

  7. After signaling a possible loading error, we must check the loaded image's validity:

        if (!opacityPixels) {

          printf("Failed to load opacity mask [%s] ",        opacityMapFile.c_str());

        }

        assert(opacityPixels);

        assert(texWidth == opacityWidth);

        assert(texHeight == opacityHeight);

  8. After successfully loading the opacity map with the correct dimensions, we must store the opacity values in the alpha component of this albedo texture:

        for (int y = 0; y != opacityHeight; y++)

          for (int x = 0; x != opacityWidth; x++)

            src[(y * opacityWidth + x) * texChannels + 3]          = opacityPixels[y * opacityWidth + x];

  9. The stb_image library uses explicit memory management, so we must free the loaded opacity map manually:

        stbi_image_free(opacityPixels);

      }

  10. Now, the loaded texture is ready to be downscaled. We must allocate enough bytes to hold the output image; since the output texture is never bigger than the constants we defined at the start of this function, a buffer of the source image's size is always sufficient:

      const uint32_t imgSize =    texWidth * texHeight * texChannels;

      std::vector<uint8_t> mipData(imgSize);

      uint8_t* dst = mipData.data();

      const int newW = std::min(texWidth, maxNewWidth);

      const int newH = std::min(texHeight, maxNewHeight);

  11. The stb_image_resize library provides a simple function for rescaling an image without losing too much quality. Finally, let's write the output texture in PNG format using the stb_image_write library:

      stbir_resize_uint8(src, texWidth, texHeight, 0, dst,    newW, newH, 0, texChannels);

      stbi_write_png(    newFile.c_str(), newW, newH, texChannels, dst, 0);

  12. If the source texture was loaded in the first place, we must free it manually. No matter what the result of the conversion is, we must return the new texture's filename:

      if (pixels) stbi_image_free(pixels);

      return newFile;

    }

    This way, we ensure that if the conversion tool has completed without errors, the converted dataset is always valid and requires significantly fewer runtime checks.
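    The listing above relies on a few small string helpers from our shared utility code: replaceAll(), lowercaseString(), and fixTextureFile(). Their exact implementations live in the book's source bundle; the following is only a minimal sketch of what they might look like, with the case-fixing logic reduced to a single lowercase fallback:

      #include <algorithm>
      #include <cctype>
      #include <filesystem>
      #include <string>

      // Replace every occurrence of 'from' in 'str' with 'to'
      std::string replaceAll(const std::string& str,
        const std::string& from, const std::string& to)
      {
        std::string result = str;
        for (size_t p = result.find(from); p != std::string::npos;
             p = result.find(from, p + to.length()))
          result.replace(p, from.length(), to);
        return result;
      }

      // Convert an ASCII string to lowercase
      std::string lowercaseString(const std::string& s)
      {
        std::string out(s.length(), ' ');
        std::transform(s.begin(), s.end(), out.begin(),
          [](unsigned char c) { return (char)std::tolower(c); });
        return out;
      }

      // If the referenced file does not exist, fall back to a lowercased
      // filename in the same directory (a simplified take on the idea)
      std::string fixTextureFile(const std::string& file)
      {
        namespace fs = std::filesystem;
        const fs::path p(file);
        const fs::path lowercased =
          p.parent_path() / lowercaseString(p.filename().string());
        return fs::exists(p) ? file : lowercased.string();
      }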

There's more...

This relatively long recipe has shown all the necessary routines for retrieving material and texture data from external 3D assets. To learn how these functions are used in real code, let's jump to the next recipe, Implementing a scene conversion tool. The previous recipe, Implementing a material system, showed you how to use the imported materials with GLSL shaders.

Implementing a scene conversion tool

In Chapter 5, Working with Geometry Data, we implemented a geometry conversion tool capable of loading meshes in various formats supported by the Assimp library, such as .gltf or .obj, and storing them in our runtime format, which is suitable for fast loading and rendering. In this recipe, we will extend this tool into a full scene converter that will handle all our materials and textures. Let's get started and learn how to do this.

Getting ready

The source code for the scene conversion tool described in this chapter can be found in the Chapter7/SceneConverter folder. The entire project is covered in this recipe. If you want to start with a simpler version of the tool that only deals with geometry data, take a look at the Implementing a geometry conversion tool recipe in Chapter 5, Working with Geometry Data.

Before we look at this recipe, make sure you're familiar with the Implementing a material system and Importing materials from Assimp recipes of this chapter.

Our geometry conversion tool takes its configuration from a .json file that, for the Lumberyard Bistro mesh used in this book, looks like this:

[{ "input_scene": "deps/src/bistro/Exterior/exterior.obj",   "output_mesh": "data/meshes/test.meshes",   "output_scene": "data/meshes/test.scene",   "output_materials": "data/meshes/test.materials",   "output_boxes": "data/meshes/test.boxes",   "scale": 0.01,   "calculate_LODs": false,   "merge_instances": true },

{ "input_scene": "deps/src/bistro/Interior/interior.obj",   "output_mesh": "data/meshes/test2.meshes",   "output_scene": "data/meshes/test2.scene",   "output_materials": "data/meshes/test2.materials",   "output_boxes": "data/meshes/test2.boxes",   "scale": 0.01,   "calculate_LODs": false,   "merge_instances": true }]

To parse this configuration file, we are going to use the RapidJSON library, which can be found on GitHub at https://github.com/Tencent/rapidjson.

How to do it...

First, we should take a look at how to implement the .json parsing step:

  1. There's a single function we can use for this that returns a container of SceneConfig structures describing where to load a mesh file from, as well as where to save the converted data:

    struct SceneConfig {

      std::string fileName;

      std::string outputMesh;

      std::string outputScene;

      std::string outputMaterials;

      std::string outputBoxes;

      float scale;

      bool calculateLODs;

      bool mergeInstances;

    };

  2. The JSON parsing code for using RapidJSON is straightforward. We will omit all error checking in this book's text, but the actual code in this book's code bundle contains some useful asserts:

    std::vector<SceneConfig> readConfigFile(    const char* cfgFileName) {

      std::ifstream ifs(cfgFileName);

      rapidjson::IStreamWrapper isw(ifs);

      rapidjson::Document document;

      const rapidjson::ParseResult parseResult =    document.ParseStream(isw);

      std::vector<SceneConfig> configList;

      for (rapidjson::SizeType i = 0; i < document.Size();       i++) {

        configList.emplace_back(SceneConfig {      .fileName =        document[i]["input_scene"].GetString(),      .outputMesh =         document[i]["output_mesh"].GetString(),      .outputScene =        document[i]["output_scene"].GetString(),      .outputMaterials =        document[i]["output_materials"].GetString(),      .outputBoxes =        document[i].HasMember("output_boxes") ?        document[i]["output_boxes"].GetString() :        std::string(),      .scale =        (float)document[i]["scale"].GetDouble(),      .calculateLODs =         document[i]["calculate_LODs"].GetBool(),      .mergeInstances =        document[i]["merge_instances"].GetBool()    });

      }

      return configList;

    }

  3. Now, let's take a look at the converter's main() function. We will read the configuration settings for all the scenes and invoke the conversion process for each one:

    int main() {

      fs::create_directory("data/out_textures");

      const auto configs =    readConfigFile("data/sceneconverter.json");

      for (const auto& cfg: configs)

        processScene(cfg);

      return 0;

    }

The actual heavy lifting is done inside processScene(). It loads a single scene file using Assimp and converts all the data into formats suitable for rendering. Let's look deeper to see how this is done:

  1. First, we will introduce a global state to simplify our implementation. Other functions, besides processScene(), will access this data; we don't want to overcomplicate the design:

    std::vector<Mesh> g_meshes;

    std::vector<BoundingBox> g_boxes;

    std::vector<uint32_t> g_indexData;

    std::vector<float> g_vertexData;

    uint32_t g_indexOffset = 0;

    uint32_t g_vertexOffset = 0;

  2. The processing functions start by clearing all the global mesh data from where it was used previously:

    void processScene(const SceneConfig& cfg) {

      g_meshes.clear();

      g_indexData.clear();

      g_vertexData.clear();

      g_indexOffset = 0;

      g_vertexOffset = 0;

  3. To load a mesh using Assimp, we must extract the base path from the filename:

      const size_t pathSeparator =    cfg.fileName.find_last_of("/\\");

      const string basePath =    (pathSeparator != string::npos) ?      cfg.fileName.substr(0, pathSeparator + 1) : "";

    The texture files referenced by the materials are located relative to this folder, so we are going to need the base path later, when we deal with the textures.

  4. Import a scene file using the following Assimp flags. We want to apply most of the optimizations and convert all the polygons into triangles. Normal vectors should be generated for those meshes that do not contain them. Error checking has been skipped here so that we can focus on the code's flow:

      const unsigned int flags = 0 |    aiProcess_JoinIdenticalVertices |    aiProcess_Triangulate |    aiProcess_GenSmoothNormals |    aiProcess_LimitBoneWeights |    aiProcess_SplitLargeMeshes |    aiProcess_ImproveCacheLocality |    aiProcess_RemoveRedundantMaterials |    aiProcess_FindDegenerates |    aiProcess_FindInvalidData |    aiProcess_GenUVCoords;

      const aiScene* scene =    aiImportFile(cfg.fileName.c_str(), flags);

  5. Once the mesh file has been loaded, we should convert the Assimp meshes into our representation. We will do this in the same way we did it in the Implementing a geometry conversion tool recipe of Chapter 5, Working with Geometry Data. Additionally, we will generate a bounding box for each mesh. Bounding boxes will be used in the next chapter to implement frustum culling:

      g_meshes.reserve(scene->mNumMeshes);

      for (unsigned int i = 0; i != scene->mNumMeshes;             i++) {

        Mesh mesh = convertAIMesh(scene->mMeshes[i], cfg);

        g_meshes.push_back(mesh);

        if (!cfg.outputBoxes.empty()) {

          BoundingBox box = calculateBoundingBox(        g_vertexData.data()+mesh.vertexOffset,        mesh.vertexCount);

          g_boxes.push_back(box);

        }

      }

      saveMeshesToFile(cfg.outputMesh.c_str());

      if (!cfg.outputBoxes.empty())

        saveBoundingBoxes(      cfg.outputBoxes.c_str(), g_boxes);

  6. The next step of the conversion process is to convert all the Assimp materials that were loaded from the file into our runtime material representation, which is suitable for rendering. All the texture filenames will be saved in the files container. The opacity maps will be packed into the alpha channels of the corresponding textures:

      std::vector<MaterialDescription> materials;

      std::vector<std::string>& materialNames =    ourScene.materialNames_;

      std::vector<std::string> files;

      std::vector<std::string> opacityMaps;

      for (unsigned int m = 0; m < scene->mNumMaterials;       m++) {

        aiMaterial* mm = scene->mMaterials[m];

        materialNames.push_back(      std::string(mm->GetName().C_Str()));

        MaterialDescription matDescription =      convertAIMaterialToDescription(        mm, files, opacityMaps);

        materials.push_back(matDescription);

      }

  7. The textures are converted, rescaled, and packed into the output folder. The basePath folder's name is needed to extract plain filenames:

      convertAndDownscaleAllTextures(    materials, basePath, files, opacityMaps);

      saveMaterials(    cfg.outputMaterials.c_str(), materials, files);

  8. Now, the scene is converted into the first-child-next-sibling form and saved:

      traverse(scene, ourScene, scene->mRootNode, -1, 0);

      saveScene(cfg.outputScene.c_str(), ourScene);

    }

    At this point, the data is ready for rendering. The output from running the conversion tool should look as follows:

    Loading scene from 'deps/src/bistro/Exterior/exterior.obj'...

    Converting meshes 1/22388...

    ... skipped ...

    Loading scene from 'deps/src/bistro/Interior/interior.obj'...

    Converting meshes 1/2381...

    ... skipped ...

If everything works as planned, the tool will output the converted mesh data to data/meshes and the packed textures to data/out_textures.

There's more...

Our texture conversion code goes through all the textures, downscales them to 512x512 where necessary, and saves them in RGBA .png files. In a real-world content pipeline, this conversion process may include a texture compression phase. We recommend that you implement this as an exercise using the ETC2Comp library described in Chapter 2, Using Essential Libraries. Adding texture compression code directly to the convertTexture() function in Chapter7/SceneConverter/src/main.cpp should be the easiest way to go about this.

Managing Vulkan resources

In the previous chapters, we implemented individual manual management for Vulkan resources in all our rendering classes. This recipe describes the system that manages all Vulkan-related objects and provides utility functions to create entities such as offscreen framebuffers, render passes, pipelines, textures, and storage buffers. All the functions described here will be used in the subsequent recipes.

Getting ready

The largest part of our resource management scheme, the descriptor set creation and update routines, is not included in this recipe. See the Unifying descriptor set creation routines recipe for additional implementation details.

How to do it...

The VulkanResources class contains a list of all the Vulkan objects. Its private part, along with a reference to VulkanRenderDevice, contains various std::vector members for storing our whole safari park of Vulkan objects. Let's take a look:

  1. First, we must store all the loaded textures:

    struct VulkanResources {

    private:

      VulkanRenderDevice& vkDev;

      std::vector<VulkanTexture> allTextures;

  2. Vulkan buffers are used for storing geometry, uniform parameters, and indirect draw commands. The Unifying descriptor set creation routines recipe contains descriptions of certain helper routines for creating different types of buffers:

      std::vector<VulkanBuffer> allBuffers;

  3. The framebuffers and renderpasses will be created and used in the next recipe, Refactoring Vulkan initialization and the main loop. Graphical pipelines are used in all the Vulkan renderers and postprocessors:

      std::vector<VkFramebuffer> allFramebuffers;

      std::vector<VkRenderPass> allRenderPasses;

      std::vector<VkPipelineLayout> allPipelineLayouts;

      std::vector<VkPipeline> allPipelines;

  4. Descriptor set layouts and pools must be created using the routines described in the Unifying descriptor set creation routines recipe:

      std::vector<VkDescriptorSetLayout> allDSLayouts;

      std::vector<VkDescriptorPool>      allDPools;

  5. The class constructor simply stores the reference in an externally passed Vulkan device object:

      explicit VulkanResources(VulkanRenderDevice& vkDev)

      : vkDev(vkDev) {}

  6. The only place where this destructor gets called implicitly is in the VulkanRenderContext class, which will be described in the following recipe. The destructor iterates over all the Vulkan objects and calls the appropriate destruction functions:

      ~VulkanResources() {

        for (auto& t: allTextures)

          destroyVulkanTexture(vkDev.device, t);

        for (auto& b: allBuffers) {

          vkDestroyBuffer(        vkDev.device, b.buffer, nullptr);

          vkFreeMemory(vkDev.device, b.memory, nullptr);

        }

  7. The framebuffers and renderpasses from all our renderers are also destroyed here:

        for (auto& fb: allFramebuffers)

          vkDestroyFramebuffer(vkDev.device, fb, nullptr);

        for (auto& rp: allRenderPasses)

          vkDestroyRenderPass(vkDev.device, rp, nullptr);

  8. The descriptor pools and pipelines are destroyed at the end of this process:

        for (auto& ds: allDSLayouts)

          vkDestroyDescriptorSetLayout(        vkDev.device, ds, nullptr);

        for (auto& pl: allPipelineLayouts)

          vkDestroyPipelineLayout(        vkDev.device, pl, nullptr);

        for (auto& p: allPipelines)

          vkDestroyPipeline(vkDev.device, p, nullptr);

        for (auto& dpool: allDPools)

          vkDestroyDescriptorPool(        vkDev.device, dpool, nullptr);

      }

  9. In the next recipe, the full screen renderpasses and framebuffers are allocated externally and passed to the VulkanResources class for deallocation at the end of the runtime. Here are two routines that register framebuffer and renderpass instances for deallocation:

      inline void registerFramebuffer(VkFramebuffer fb) {

        allFramebuffers.push_back(fb);

      }

      inline void registerRenderPass(VkRenderPass rp) {

        allRenderPasses.push_back(rp);

      }

In our previous examples, we loaded the textures in an ad hoc fashion, as well as created the image and sampler. Here, we will wrap the texture file loading code in a single method:

  1. The createTextureImage() function, from the Using texture data in Vulkan recipe of Chapter 3, Getting Started with OpenGL and Vulkan, loads the image:

    VulkanTexture loadTexture2D(const char* filename) {

      VulkanTexture tex;

      if (!createTextureImage(vkDev, filename,      tex.image.image, tex.image.imageMemory)) {

        printf("Cannot load %s 2D texture file ",      filename);

        exit(EXIT_FAILURE);

      }

  2. Let's assume that all the loaded images are in the RGBA 8-bit per-channel format. This is enforced by the scene converter tool described in this chapter. We will be using the VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL layout since loaded images are intended to be used as inputs for fragment shaders:

      VkFormat format = VK_FORMAT_R8G8B8A8_UNORM;

      transitionImageLayout(vkDev, tex.image.image,    format,    VK_IMAGE_LAYOUT_UNDEFINED,    VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL);

  3. Next, an image view is created for the new texture, using the default layer count parameter of createImageView(). A major improvement we could make to this routine would be to calculate the MIP levels for the loaded image:

      if (!createImageView(vkDev.device, tex.image.image,      format, VK_IMAGE_ASPECT_COLOR_BIT,      &tex.image.imageView)) {

        printf("Cannot create image view for 2d texture      (%s) ", filename);

        exit(EXIT_FAILURE);

      }

  4. After creating the texture sampler, we must store our newly created VulkanTexture instance in the allTextures array:

      createTextureSampler(vkDev.device, &tex.sampler);

      allTextures.push_back(tex);

      return tex;

    }

Along with loadTexture2D(), three other loading methods are provided for different types of textures:

  VulkanTexture loadCubeMap(    const char* fileName, uint32_t mipLevels);

  VulkanTexture loadKTX(const char* fileName);

  VulkanTexture createFontTexture(const char* fontFile);

The source code for loadCubeMap() is located in the UtilsVulkanPBRModelRenderer.cpp file. The only difference, as with all the loading routines, is that we are adding the created VulkanTexture to our allTextures container so that it will be deleted at the end of our program. The loadKTX() function is similar to the KTX file loading process that's described in the constructor of PBRModelRenderer.

After loading the texture data, we must create an image view in the VK_FORMAT_R16G16_SFLOAT format and add the created VulkanTexture to our allTextures array. The code for the createFontTexture() method can be found in the UtilsVulkanImGui.cpp file.

Let's look at some other helper functions that will make dealing with Vulkan objects somewhat easier:

  1. All the buffers in our renderers are created by calling the addBuffer() routine, either directly or indirectly (a short usage sketch follows this list):

    VulkanBuffer addBuffer(VkDeviceSize size,  VkBufferUsageFlags usage,  VkMemoryPropertyFlags properties)

    {

      VulkanBuffer buffer = {    .buffer = VK_NULL_HANDLE,    .size = 0,    .memory = VK_NULL_HANDLE   };

  2. The createSharedBuffer() method is called so that we can use the buffer in compute shaders. If successful, the buffer is added to the allBuffers container:

      if (!createSharedBuffer(vkDev, size, usage,      properties, buffer.buffer, buffer.memory)) {

        printf("Cannot allocate buffer ");

        exit(EXIT_FAILURE);

      } else {

        buffer.size = size;

        allBuffers.push_back(buffer);

      }

      return buffer;

    }

  3. We haven't used offscreen rendering in the previous chapters, but pretty much every rendering and composition technique requires offscreen framebuffers. By default, a new texture is the size of the output framebuffer. The new texture contains dimensions and format information that will be passed to the framebuffer creation routine:

    VulkanTexture addColorTexture(  int texWidth, int texHeight, VkFormat colorFormat)

    {

      const uint32_t w = (texWidth > 0) ?    texWidth  : vkDev.framebufferWidth;

      const uint32_t h = (texHeight> 0) ?    texHeight : vkDev.framebufferHeight;

      VulkanTexture res = {    .width = w,  .height = h,  .depth = 1,    .format = colorFormat   };

  4. The createOffscreenImage() function sets the appropriate usage flags for the image. Should the creation process fail, we must terminate the program after issuing an error message. An image view and a texture sampler can be created in the standard way, as follows:

      if (!createOffscreenImage(vkDev,      res.image.image, res.image.imageMemory,      w, h, colorFormat, 1, 0)) {

        printf("Cannot create color texture ");

        exit(EXIT_FAILURE);

      }

      createImageView(vkDev.device, res.image.image,    colorFormat, VK_IMAGE_ASPECT_COLOR_BIT,    &res.image.imageView);

      createTextureSampler(vkDev.device, &res.sampler);

  5. To keep the code size minimal, we will set a fixed layout for our texture at creation time and we won't change the layout each frame. Just like the other textures, we will store this one in the allTextures container:

      transitionImageLayout(vkDev, res.image.image,    colorFormat, VK_IMAGE_LAYOUT_UNDEFINED,    VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL);

      allTextures.push_back(res);

      return res;

    }
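To give an idea of how these helpers are used, here is a hypothetical fragment that allocates a small uniform buffer and a framebuffer-sized offscreen color target. The variable names are illustrative, and the fragment assumes a VulkanResources instance named resources (in the application code it is reachable as ctx_.resources):

  // A host-visible uniform buffer large enough for a single 4x4 matrix
  VulkanBuffer uniformBuffer = resources.addBuffer(
    sizeof(glm::mat4),
    VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT,
    VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
    VK_MEMORY_PROPERTY_HOST_COHERENT_BIT);

  // Passing 0 for the dimensions yields a framebuffer-sized color attachment
  VulkanTexture offscreenColor =
    resources.addColorTexture(0, 0, VK_FORMAT_R8G8B8A8_UNORM);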

Rendering to an offscreen depth texture is used for shadow mapping and approximating ambient occlusion. The routine is almost the same as addColorTexture(), but depthFormat and image usage flags must be different. We must also explicitly specify the image layout to avoid performance warnings from validation layers. Let's take a look:

  1. Using findDepthFormat() from Chapter 3, Getting Started with OpenGL and Vulkan, we will set the format of our new texture:

    VulkanTexture addDepthTexture(int texWidth,  int texHeight, VkImageLayout layout =  VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL)

    {

      const uint32_t w = (texWidth  > 0) ?    texWidth : vkDev.framebufferWidth;

      const uint32_t h = (texHeight > 0) ?    texHeight : vkDev.framebufferHeight;

      const VkFormat depthFormat =    findDepthFormat(vkDev.physicalDevice);

  2. After storing the texture dimensions, we must call createImage() with the necessary flags:

      VulkanTexture depth = {

        .width = w,  .height = h,  .depth = 1,  .format =

        depthFormat

      };

      if (!createImage(vkDev.device, vkDev.physicalDevice,      w, h, depthFormat, VK_IMAGE_TILING_OPTIMAL,       VK_IMAGE_USAGE_SAMPLED_BIT |       VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT,      VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT,      depth.image.image, depth.image.imageMemory)) {

        printf("Cannot create depth texture ");

        exit(EXIT_FAILURE);

      }

  3. An image view is created and its layout is set:

      createImageView(vkDev.device, depth.image.image,    depthFormat, VK_IMAGE_ASPECT_DEPTH_BIT,    &depth.image.imageView);

      transitionImageLayout(vkDev, depth.image.image,    depthFormat, VK_IMAGE_LAYOUT_UNDEFINED, layout);

  4. The sampler for the depth textures uses different flags, so it is created with a dedicated function:

      if (!createDepthSampler(      vkDev.device, &depth.sampler)) {

        printf("Cannot create a depth sampler");

        exit(EXIT_FAILURE);

      }

      allTextures.push_back(depth);

      return depth;

    }

  5. The rest of the resource management is dedicated to framebuffers, renderpasses, and pipelines. Internally, we refer to render passes using the RenderPass structure, which holds a Vulkan handle, along with the list of parameters that were used to create this render pass:

    struct RenderPass {

      RenderPass() = default;

      explicit RenderPass(VulkanRenderDevice& device,    bool useDepth = true,    const RenderPassCreateInfo& ci =      RenderPassCreateInfo()): info(ci)

      {

        if (!createColorAndDepthRenderPass(
          device, useDepth, &handle, ci)) {

          printf("Failed to create render pass\n");

          exit(EXIT_FAILURE);

        }

      }

      RenderPassCreateInfo info;

      VkRenderPass handle = VK_NULL_HANDLE;

    };

Creating the framebuffer is a frequent operation, so to make our rendering initialization code shorter, we must implement the addFramebuffer() function, which takes a render pass object and a list of attachments to create a framebuffer:

  1. First, we must extract individual image view objects from a container of VulkanTexture objects:

    VkFramebuffer addFramebuffer(  RenderPass renderPass,  const std::vector<VulkanTexture>& images)

    {

      VkFramebuffer framebuffer;

      std::vector<VkImageView> attachments;

      for (const auto& i: images)

        attachments.push_back(i.image.imageView);

  2. Just as we did in the Initializing the Vulkan pipeline recipe of Chapter 3, Getting Started with OpenGL and Vulkan, we will pass a list of attachments to the creation structure. It's assumed that all the images are the same size, so we will use the size of the first one here:

      VkFramebufferCreateInfo fbInfo = {    .sType =       VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO,    .pNext = nullptr,    .flags = 0,    .renderPass = renderPass.handle,    .attachmentCount = (uint32_t)attachments.size(),    .pAttachments = attachments.data(),    .width = images[0].width,    .height = images[0].height,    .layers = 1   };

  3. After completing the vkCreateFramebuffer() call, we should store the newly created framebuffer in the allFramebuffers container:

      if (vkCreateFramebuffer(      vkDev.device, &fbInfo, nullptr, &framebuffer)      != VK_SUCCESS) {

        printf("Unable to create offscreen       framebuffer ");

        exit(EXIT_FAILURE);

      }

      allFramebuffers.push_back(framebuffer);

      return framebuffer;

    }

  4. Our renderers from the following samples require different kinds of rendering passes. The most generic addRenderPass() function assumes that there is at least one attachment. Render passes with empty attachment lists are not supported:

    RenderPass addRenderPass(  const std::vector<VulkanTexture>& outputs,  const RenderPassCreateInfo ci = {    .clearColor_ = true, .clearDepth_ = true,    .flags_ = eRenderPassBit_Offscreen |              eRenderPassBit_First },  bool useDepth = true)

    {

      VkRenderPass renderPass;

      if (outputs.empty()) {

        printf("Empty list of output attachments for       RenderPass ");

        exit(EXIT_FAILURE);

      }

  5. A render pass with one color attachment is a special case:

      if (outputs.size() == 1) {

        if (!createColorOnlyRenderPass(        vkDev, &renderPass, ci, outputs[0].format)) {

          printf("Unable to create offscreen color-only         pass ");

          exit(EXIT_FAILURE);

        }

  6. For more than one attachment, we should call the general render pass creation routine from the Initializing the Vulkan pipeline recipe of Chapter 3, Getting Started with OpenGL and Vulkan:

      } else {

        if (!createColorAndDepthRenderPass(        vkDev, useDepth && (outputs.size() > 1),        &renderPass, ci, outputs[0].format)) {

          printf("Unable to create offscreen render         pass ");

          exit(EXIT_FAILURE);

        }

      }

  7. Finally, our new render pass should be stored in an appropriate container:

      allRenderPasses.push_back(renderPass);

      RenderPass rp;

      rp.info = ci;

      rp.handle = renderPass;

      return rp;

    }

  8. Creating a depth-only render pass, which is used for shadow mapping, requires less logic and simply redirects to the createDepthOnlyRenderPass() function:

    RenderPass addDepthRenderPass(  const std::vector<VulkanTexture>& outputs,  const RenderPassCreateInfo ci = {    .clearColor_ = false, .clearDepth_ = true,    .flags_ = eRenderPassBit_Offscreen |               eRenderPassBit_First   })

    {

      VkRenderPass renderPass;

      if (!createDepthOnlyRenderPass(      vkDev, &renderPass, ci)) {

        printf("Unable to create offscreen render       pass ");

        exit(EXIT_FAILURE);

      }

      allRenderPasses.push_back(renderPass);

      RenderPass rp;

      rp.info = ci;

      rp.handle = renderPass;

      return rp;

    }

  9. Two helper methods shorten the initialization code for swapchain framebuffers and full screen renderpasses. All the framebuffers associated with a swapchain are added to the framebuffer list:

      std::vector<VkFramebuffer> addFramebuffers(    VkRenderPass renderPass,    VkImageView depthView = VK_NULL_HANDLE)

      {

        std::vector<VkFramebuffer> framebuffers;

        createColorAndDepthFramebuffers(vkDev,      renderPass, depthView, framebuffers);

        for (auto f : framebuffers)

          allFramebuffers.push_back(f);

        return framebuffers;

      }

  10. Once it's been created, the render pass is added to our local repository, in the allRenderPasses container:

      RenderPass addFullScreenPass(    bool useDepth = true,    const RenderPassCreateInfo& ci =      RenderPassCreateInfo())

      {

        RenderPass result(vkDev, useDepth, ci);

        allRenderPasses.push_back(result.handle);

        return result;

      }

  11. Along with descriptor sets, which refer to individual buffers and textures, pipelines define the rendering process. Pipeline creation parameters are passed around in a structure:

    struct PipelineInfo {

      uint32_t width  = 0;

      uint32_t height = 0;

      VkPrimitiveTopology topology =    VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST;

      bool useDepth = true;

      bool useBlending = true;

      bool dynamicScissorState = false;

    };

  12. The pipeline layouts are created with a function createPipelineLayoutWithConstants(), which is similar to createPipelineLayout() from Chapter 3, Getting Started with OpenGL and Vulkan, but adds push constants to the Vulkan pipeline. The newly created pipeline layout is stored in a container:

    VkPipelineLayout addPipelineLayout(  VkDescriptorSetLayout dsLayout,  uint32_t vtxConstSize = 0,  uint32_t fragConstSize = 0)

    {

      VkPipelineLayout pipelineLayout;

      if (!createPipelineLayoutWithConstants(      vkDev.device, dsLayout, &pipelineLayout,      vtxConstSize, fragConstSize))  {

        printf("Cannot create pipeline layout ");

        exit(EXIT_FAILURE);

      }

      allPipelineLayouts.push_back(pipelineLayout);

      return pipelineLayout;

    }

  13. The addPipeline() method wraps the createGraphicsPipeline() function. Once it's been created, the pipeline is put into yet another container of Vulkan objects:

    VkPipeline addPipeline(

      VkRenderPass renderPass,

      VkPipelineLayout pipelineLayout,

      const std::vector<const char*>& shaderFiles,

      const PipelineInfo& pipelineParams = PipelineInfo {    .width = 0, .height = 0,    .topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST,    .useDepth = true, .useBlending = false,    .dynamicScissorState = false })

    {

      VkPipeline pipeline;

      if (!createGraphicsPipeline(vkDev, renderPass,
          pipelineLayout, shaderFiles, &pipeline,
          pipelineParams.topology, pipelineParams.useDepth,
          pipelineParams.useBlending,
          pipelineParams.dynamicScissorState,
          pipelineParams.width, pipelineParams.height)) {

        printf("Cannot create graphics pipeline\n");

        exit(EXIT_FAILURE);

      }

      allPipelines.push_back(pipeline);

      return pipeline;

    }

The only instance of the VulkanResources class resides in the VulkanRenderContext structure, which will be described in the next recipe. All the resources are deleted strictly before the global VkDevice object encapsulated in VulkanRenderDevice is destroyed.
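To illustrate how the helpers above combine, here is a hypothetical initialization fragment that sets up a complete offscreen rendering target. The shader paths and the dsLayout descriptor set layout (created with the routines from the Unifying descriptor set creation routines recipe) are assumptions made for the sake of the example:

  VulkanTexture color = resources.addColorTexture(0, 0, VK_FORMAT_R8G8B8A8_UNORM);
  VulkanTexture depth = resources.addDepthTexture(0, 0);
  // One color plus one depth attachment, so the generic render pass path is taken
  RenderPass rp = resources.addRenderPass({ color, depth });
  VkFramebuffer fb = resources.addFramebuffer(rp, { color, depth });
  VkPipelineLayout layout = resources.addPipelineLayout(dsLayout);
  VkPipeline pipeline = resources.addPipeline(rp.handle, layout,
    { "data/shaders/offscreen.vert", "data/shaders/offscreen.frag" });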

There's more...

The examples in this chapter heavily rely on indirect rendering, so individual mesh rendering is hidden within our scene graph. However, if you wish to update the sample code from Chapter 3, Getting Started with OpenGL and Vulkan, and use it for direct mesh geometry manipulation, the addVertexBuffer() method has been provided. The mesh geometry uploading code is similar to the createTexturedVertexBuffer() and createPBRVertexBuffer() functions we described in previous chapters:

VulkanBuffer addVertexBuffer(uint32_t indexBufferSize,  const void* indexData,  uint32_t vertexBufferSize,  const void* vertexData)

{

  VulkanBuffer result;

  result.size = allocateVertexBuffer(vkDev, &result.buffer,    &result.memory, vertexBufferSize, vertexData,    indexBufferSize, indexData);

  allBuffers.push_back(result);

  return result;

}

The last important issue in terms of resource management is the descriptor set creation routines. This will be covered in the Unifying descriptor set creation routines recipe of this chapter.

We typically use VulkanResources in the constructors of different Renderer classes. The Putting it all together into a Vulkan application recipe will show you how our resource management fits the general application code.

Refactoring Vulkan initialization and the main loop

Starting from Chapter 3, Getting Started with OpenGL and Vulkan, we introduced an ad hoc rendering loop for each demo application, which resulted in significant code duplication. Let's revisit this topic and learn how to create multiple rendering passes for Vulkan without too much boilerplate code.

Getting ready

Before completing this recipe, make sure to revisit the Putting it all together into a Vulkan application recipe of Chapter 3, Getting Started with OpenGL and Vulkan, as well as all the related recipes.

How to do it...

The goal of this recipe is to improve the rendering framework to avoid code repetition in renderers, as well as to simplify our rendering setup. As a useful side effect, we will end up with a system capable of setting up and composing multiple rendering passes without too much hassle; we will put it to use in the next recipe.

The main function for all our upcoming demos should consist of just three lines:

int main() {

  MyApp app;

  app.mainLoop();

  return 0;

}

Let's take a look at how to organize the MyApp class for this purpose:

  1. The MyApp class is derived from our base VulkanApp. Its constructor initializes all the resources needed for scene and UI rendering:

    class MyApp: public VulkanApp {

    public:

      MyApp()

       ... field initializers list ...

       ... rendering sequence setup ...

    The base class constructor creates a GLFW window and initializes a Vulkan rendering surface, just like we did previously throughout Chapter 3, Getting Started with OpenGL and Vulkan, to Chapter 6, Physically Based Rendering Using the glTF2 Shading Model.

  2. There is no need to override the destructor because all our Vulkan objects are destroyed by the resource management system we discussed in the previous recipe, Managing Vulkan resources. The default destructor also takes care of GLFW windows and Vulkan device instances. The rendering method is called once per frame internally, in VulkanApp::mainLoop():

      void draw3D() override {

        ... whatever render control commands required ...

      }

  3. The drawUI() method may contain arbitrary ImGui library calls that are internally converted into a list of Vulkan commands:

      void drawUI() override {

        ... ImGUI commands …

      }

  4. The overridden VulkanApp::update() method gets called once per fixed time interval. In the Adding Bullet physics to a graphics application recipe of Chapter 9, Working with Scene Graph, we will initiate the physical world update process. Camera control may also be added here:

      void update(float deltaSeconds) override {

        ... update whatever needs to be updated ...

      }

  5. The private section contains references to VulkanTexture, VulkanBuffer, and the renderer classes we discussed earlier:

    private:

      ... Vulkan buffers, scene geometry, textures etc....

      ... e.g., some texture:  VulkanTexture envMap ...

      ... whatever renderers an app needs ...

      MultiRenderer multiRenderer;

      GuiRenderer imgui;

    };

Having said this, let's see how the VulkanApp class wraps all the initialization and uses the previously defined VulkanResources.

The application class relies on previously developed functions and some new items that we must describe before implementing VulkanApp itself. In the previous chapters, we used VulkanRenderDevice as a simple C structure and called all the initialization routines explicitly in every sample. Following the C++ resource acquisition is initialization (RAII) paradigm, we must wrap these calls with the constructors and destructors of the helper class:

  1. The VulkanContextCreator class holds references to VulkanInstance and VulkanRenderDevice, which are stored in VulkanApp:

    struct VulkanContextCreator {

      VulkanInstance& instance;

      VulkanRenderDevice& vkDev;

  2. The constructor of the class performs familiar initialization for Vulkan instances and logical devices. If anything fails, we must terminate the program:

      VulkanContextCreator(VulkanInstance& vk,    VulkanRenderDevice& dev, void* window,    int screenWidth, int screenHeight):instance(vk),    vkDev(dev)

      {

        createInstance(&vk.instance);

        if (!setupDebugCallbacks(vk.instance,          &vk.messenger, vk.reportCallback) ||        glfwCreateWindowSurface(vk.instance,        (GLFWwindow *)window, nullptr, &vk.surface) ||        !initVulkanRenderDevice3(          vk, dev, screenWidth, screenHeight))

          exit(EXIT_FAILURE);

      }

  3. The destructor performs trivial deinitialization of the Vulkan instance and render device:

      ~VulkanContextCreator() {

        destroyVulkanRenderDevice(vkDev);

        destroyVulkanInstance(instance);

      }

    };

The Vulkan instance and device alone are not enough to render anything: we must declare a basic rendering interface and combine multiple renderers in one frame.

In the previous chapter, we figured out one way to implement a generic interface for a Vulkan renderer. Let's take a look once more:

  1. Here, we will only present the interface, which consists of a function to fill command buffers and a function to update all the current auxiliary buffers containing uniforms or geometry data. This is because the interface is what matters for implementing the application's main loop:

    struct Renderer {

      Renderer(VulkanRenderContext& c);

      virtual void fillCommandBuffer(    VkCommandBuffer cmdBuffer,    size_t currentImage,    VkFramebuffer fb = VK_NULL_HANDLE,    VkRenderPass rp = VK_NULL_HANDLE) = 0;

      virtual void updateBuffers(size_t currentImage) {}

    };

    The details of our implementation are provided in the subsequent recipe, Working with rendering passes.

  2. Since the construction of individual Renderer instances is rather expensive, we define a lightweight wrapper structure that stores a reference to a Renderer. An std::vector of RenderItem instances can then be filled with emplace_back(), which constructs the items in place without triggering copies or reinitialization (a registration sketch is shown after this list):

    struct RenderItem {

      Renderer& renderer_;

      bool enabled_ = true;

      bool useDepth_ = true;

      explicit RenderItem(    Renderer& r, bool useDepth = true)

      : renderer_(r)

      , useDepth_(useDepth)

      {}

    };

  3. The VulkanRenderContext class holds all basic Vulkan objects (instance and device), along with a list of on-screen renderers. This class will be used later in VulkanApp to compose a frame. VulkanContextCreator helps initialize both the instance and logical Vulkan device. The resource management system described in the previous recipe, Managing Vulkan resources, is also initialized here:

    struct VulkanRenderContext {

      VulkanInstance vk;

      VulkanRenderDevice vkDev;

      VulkanContextCreator ctxCreator;

      VulkanResources resources;

  4. In essence, this class contains a list of on-screen renderers, declared as a dynamic array. Along with these composite subsystems, a set of renderpass and framebuffer handles is declared for use in the Renderer instances. All the framebuffers share a single depth buffer. The render passes for on-screen rendering are also declared here:

      std::vector<RenderItem> onScreenRenderers_;

      VulkanTexture depthTexture;

      RenderPass screenRenderPass;

      RenderPass screenRenderPass_NoDepth;

  5. Two special render passes are used for clearing and finalizing the frame, just like the VulkanClear and VulkanFinish classes in the demo application from Chapter 5, Working with Geometry Data. The framebuffers for depth-buffered and 2D rendering are also declared here, along with the corresponding render passes:

      RenderPass clearRenderPass, finalRenderPass;

      std::vector<VkFramebuffer> swapchainFramebuffers;

      std::vector<VkFramebuffer>

      swapchainFramebuffers_NoDepth;

  6. The constructor of the class is empty; only the initializers' list sets up all the fields:

      VulkanRenderContext(void* window,    uint32_t screenWidth, uint32_t screenHeight)

      : ctxCreator(vk, vkDev, window, screenWidth,      screenHeight)

      , resources(vkDev)

      , depthTexture(resources.addDepthTexture(0, 0,      VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL))

      , screenRenderPass(resources.addFullScreenPass())

      , screenRenderPass_NoDepth(      resources.addFullScreenPass(false))

  7. The finalization and screen clearing render passes are initialized with a special set of creation parameters:

      , finalRenderPass(resources.addFullScreenPass(      true, RenderPassCreateInfo {        .clearColor_ = false, .clearDepth_ = false,        .flags_ = eRenderPassBit_Last  }))

      , clearRenderPass(resources.addFullScreenPass(      true, RenderPassCreateInfo {        .clearColor_ =  true, .clearDepth_ =  true,        .flags_ = eRenderPassBit_First }))

      , swapchainFramebuffers(      resources.addFramebuffers(        screenRenderPass.handle,        depthTexture.image.imageView))

      , swapchainFramebuffers_NoDepth(      resources.addFramebuffers(        screenRenderPass_NoDepth.handle))

      {}

  8. The updateBuffers() method iterates over all the enabled renderers and updates their internal buffers:

      void updateBuffers(uint32_t imageIndex) {

        for (auto& r : onScreenRenderers_)

          if (r.enabled_)

            r.renderer_.updateBuffers(imageIndex);

      }

  9. All the renderers in our framework use custom rendering passes. Starting a new rendering pass can be implemented with the following routine:

      void beginRenderPass(    VkCommandBuffer cmdBuffer, VkRenderPass pass,    size_t currentImage, const VkRect2D area,    VkFramebuffer fb = VK_NULL_HANDLE,    uint32_t clearValueCount = 0,    const VkClearValue* clearValues = nullptr)

      {

  10. As we saw in Chapter 3, Getting Started with OpenGL and Vulkan, the vkCmdBeginRenderPass() API call takes a structure as its parameter. If an external framebuffer is unspecified, we use our local full screen framebuffer. Optional clearing values are also passed as parameters:

        const VkRenderPassBeginInfo renderPassInfo = {      .sType =         VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,      .renderPass = pass,      .framebuffer = (fb != VK_NULL_HANDLE) ?        fb : swapchainFramebuffers[currentImage],      .renderArea = area,      .clearValueCount = clearValueCount,      .pClearValues = clearValues     };

        vkCmdBeginRenderPass(      cmdBuffer, &renderPassInfo,      VK_SUBPASS_CONTENTS_INLINE);

      }

    };
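As mentioned in step 2, a derived application fills this list to define the frame composition order. A hypothetical registration fragment, with illustrative field names, might look like this:

  // 3D scene pass, rendered with depth testing
  onScreenRenderers_.emplace_back(multiRenderer);
  // UI overlay, rendered on top without a depth buffer
  onScreenRenderers_.emplace_back(imgui, false);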

Now, let's see how our frame composition works. It is similar to what we did in Chapter 5, Working with Geometry Data, where we had multiple renderers. The following code can and should be considered only as a refactoring. The added complexity is due to the offscreen rendering support that we will need in the remaining chapters:

  1. To specify the output region for our renderers, we must declare a rectangle variable:

    void VulkanRenderContext::composeFrame(  VkCommandBuffer commandBuffer, uint32_t imageIndex)

    {

      const VkRect2D defaultScreenRect {    .offset = { 0, 0 },    .extent = { .width  = vkDev.framebufferWidth,                .height = vkDev.framebufferHeight }  };

  2. Clearing the screen requires values for both the color buffer and the depth buffer. If any custom user-specified clearing value is required, this is the place in our framework to add modifications:

      static const VkClearValue defaultClearValues[] =  {    VkClearValue { .color = {1.f, 1.f, 1.f, 1.f} },    VkClearValue { .depthStencil = {1.f, 0} }  };

  3. The special screen clearing render pass is executed first:

      beginRenderPass(commandBuffer,    clearRenderPass.handle,    imageIndex, defaultScreenRect, VK_NULL_HANDLE,    2u, defaultClearValues);

      vkCmdEndRenderPass( commandBuffer );

  4. When the screen is ready, we iterate over the list of renderers and fill the command buffer sequentially. We skip inactive renderers while iterating. This is mostly a debugging feature for manually controlling the output. An appropriate full screen rendering pass is selected for each renderer instance:

      for (auto& r : onScreenRenderers_)

        if (r.enabled_) {

          RenderPass rp = r.useDepth_ ?        screenRenderPass : screenRenderPass_NoDepth;

  5. The framebuffer is also selected according to the useDepth flag in a renderer:

        VkFramebuffer fb =      (r.useDepth_ ? swapchainFramebuffers:         swapchainFramebuffers_NoDepth)[imageIndex];

  6. If this renderer outputs to some offscreen buffer with a custom rendering pass, we replace both the rp and fb pointers accordingly:

          if (r.renderer_.renderPass_.handle !=          VK_NULL_HANDLE)

            rp = r.renderer_.renderPass_;

          if (r.renderer_.framebuffer_ != VK_NULL_HANDLE)

            fb = r.renderer_.framebuffer_;

  7. Finally, we ask the renderer to fill the current command buffer. At the end, the framebuffer is converted into a presentation-optimal format using a special render pass:

          r.renderer_.fillCommandBuffer(        commandBuffer, imageIndex, fb, rp.handle);

        }

      beginRenderPass(commandBuffer,    finalRenderPass.handle, imageIndex,    defaultScreenRect);

      vkCmdEndRenderPass(commandBuffer);

    }

This concludes the definition of our helper classes for the new frame composition framework. Now, we have everything in place to define the application structure:

  1. The protected section of the class contains the mouse state for GUI handling, the screen resolution for correct aspect ratio calculation, and the GLFW window pointer. We also have our one and only VulkanRendererContext instance. To save some typing, we will refer to ctx_.onScreenRenderers_ by defining a local reference field:

    class VulkanApp {

    protected:

      struct MouseState {

        glm::vec2 pos = glm::vec2(0.0f);

        bool pressedLeft = false;

      } mouseState_;

      Resolution resolution_;

      GLFWwindow* window_ = nullptr;

      VulkanRenderContext ctx_;

      std::vector<RenderItem>& onScreenRenderers_;

  2. The public section of the class provides initialization, deinitialization, and miscellaneous event handlers:

    public:

      VulkanApp(int screenWidth, int screenHeight)

      : window_(initVulkanApp(      screenWidth, screenHeight, &resolution_))

      , ctx_(window_, resolution_.width,      resolution_.height)

      , onScreenRenderers_(ctx_.onScreenRenderers_)

      {

        glfwSetWindowUserPointer(window_, this);

        assignCallbacks();

      }

  3. The destructor contains explicit GLSL compiler library deinitialization features, as well as the GLFW termination call:

      ~VulkanApp() {

        glslang_finalize_process();

        glfwTerminate();

      }

  4. As we mentioned at the beginning of this recipe, the user provides two overridden methods for UI and 3D rendering. The update() routine performs whatever actions necessary to calculate the new application state:

      virtual void drawUI() {}

      virtual void draw3D() = 0;

      virtual void update(float deltaSeconds) = 0;

    For example, the CameraApp class, described later in this recipe, calls a 3D camera position update routine, while the physics simulation recipe calls physics simulation routines.

  5. The mainLoop() method is called from the main() function of our application. One thing to note is that the method implementation in the source code bundle for this book includes a frames per second counter. It has been omitted here to keep the source code shorter:

      void mainLoop() {

        double timeStamp = glfwGetTime();

        float deltaSeconds = 0.0f;

        do {

          update(deltaSeconds);

  6. Usual time counting is performed to calculate deltaSeconds for the next frame. Note that here, we are processing the frames as fast as possible, but internally, the overridden update() function may quantize time into fixed intervals. This will be used in our physics example in Chapter 9:

          const double newTimeStamp = glfwGetTime();

          deltaSeconds = newTimeStamp - timeStamp;

          timeStamp = newTimeStamp;

  7. The drawFrame() method from Chapter 6, Physically Based Rendering Using the glTF2 Shading Model, takes our new updateBuffers() and VulkanRenderContext::composeFrame() functions to perform frame composition:

          drawFrame(ctx_.vkDev,        [this](uint32_t img)        { this->updateBuffers(img); },

            [this](VkCommandBuffer b, uint32_t img)        { ctx_.composeFrame(b, img); }      );

  8. After polling for system events, we wait for all the graphics operations to complete:

          glfwPollEvents();

          vkDeviceWaitIdle(ctx_.vkDev.device);

        } while (!glfwWindowShouldClose(window_));

      }

The final part of the public interface of VulkanApp is related to UI event handling:

  1. The shouldHandleMouse() function checks whether ImGui has already consumed the incoming event; if it has not, we can handle mouse movements and clicks ourselves. This is used in most of our demos to control the camera in the main view, but only if the user is not interacting with UI widgets:

      inline bool shouldHandleMouse() const

      { return !ImGui::GetIO().WantCaptureMouse; }

  2. The handleKey() method processes incoming key presses. One useful override is done in the CameraApp class:

      virtual void handleKey(int key, bool pressed) = 0;

  3. handleMouseClick() and handleMouseMove() just save the parameters of incoming mouse events:

      virtual void handleMouseClick(    int button, bool pressed) {

        if (button == GLFW_MOUSE_BUTTON_LEFT)

          mouseState_.pressedLeft = pressed;

      }

      virtual void handleMouseMove(float mx, float my) {

        mouseState_.pos = glm::vec2(mx, my);

      }

To complete the description of the new VulkanApp class, let's look at its implementation details:

  1. The assignCallbacks() method uses glfwSetCursorPosCallback(), glfwSetMouseButtonCallback(), and glfwSetKeyCallback() to forward mouse and keyboard events from GLFW to the handleMouseMove(), handleMouseClick(), and handleKey() methods of the VulkanApp class, respectively. The event handlers repeat the code from the previous recipes, so only the key press handler is shown here; a sketch of the omitted mouse callbacks follows this list:

    private:

      void assignCallbacks() {

        … set mouse callbacks (not shown here) …

        glfwSetKeyCallback(window_,      [](GLFWwindow* window, int key, int scancode,      int action, int mods) {

          const bool pressed = action != GLFW_RELEASE;

          if (key == GLFW_KEY_ESCAPE && pressed)

            glfwSetWindowShouldClose(window, GLFW_TRUE);

    The only modification we've made to the handler's code is that the custom user pointer is extracted from GLFW's window. The only predefined key is the Esc key. When the user presses it, we exit the application.

  2. The constructor of VulkanApp associates the this pointer with GLFW's window_ object by calling glfwSetWindowUserPointer(). Here, inside the callback, we extract that pointer and, after casting it to VulkanApp*, call the handleKey() method to process keypresses:

          void* ptr = glfwGetWindowUserPointer(window);   

          reinterpret_cast<VulkanApp*>(        ptr)->handleKey(key, pressed);

        });

      }

  3. The updateBuffers() method updates the ImGui display dimensions and resets any internal draw lists. The user-provided drawUI() function is called to render the app-specific UI. Then, draw3D() updates internal scene descriptions and whatever else is necessary to render the frame:

      void updateBuffers(uint32_t imageIndex) {

        ImGuiIO& io = ImGui::GetIO();

        io.DisplaySize =      ImVec2((float)ctx_.vkDev.framebufferWidth,      (float)ctx_.vkDev.framebufferHeight);

        ImGui::NewFrame();

        drawUI();

        ImGui::Render();

        draw3D();

        ctx_.updateBuffers(imageIndex);

      }

    };

    The following recipes contain numerous examples of this function's implementations. The call to the previously described VulkanRenderContext::updateBuffers() concludes this function.
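    For completeness, here is a minimal sketch of the mouse callbacks that were omitted from assignCallbacks() above, assuming the same pattern of extracting the user pointer and forwarding the event to a virtual method (the actual code may additionally normalize the cursor position by the framebuffer size):

      glfwSetCursorPosCallback(window_,
        [](GLFWwindow* window, double x, double y) {
          void* ptr = glfwGetWindowUserPointer(window);
          reinterpret_cast<VulkanApp*>(ptr)->handleMouseMove(
            (float)x, (float)y);
        });

      glfwSetMouseButtonCallback(window_,
        [](GLFWwindow* window, int button, int action, int mods) {
          void* ptr = glfwGetWindowUserPointer(window);
          reinterpret_cast<VulkanApp*>(ptr)->handleMouseClick(
            button, action == GLFW_PRESS);
        });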

Our VulkanApp class is now complete, but there are still some pure virtual methods that prevent us from using it directly. A derived CameraApp class will be used as a base for all the future examples in this book:

  1. The constructor performs the usual VulkanApp initialization, along with setting up the 3D camera:

    struct CameraApp: public VulkanApp {

      CameraApp(int screenWidth, int screenHeight)

      : VulkanApp(screenWidth, screenHeight)

      , positioner(vec3(0.0f, 5.0f, 10.0f)

      , vec3(0.0f, 0.0f, -1.0f), vec3(0.0f, -1.0f, 0.0f))

      , camera(positioner)

      {}

  2. The overridden update() method sends mouse event parameters to the 3D camera positioner:

      virtual void update(float deltaSeconds) override {

        positioner.update(deltaSeconds, mouseState_.pos,      shouldHandleMouse() && mouseState_.pressedLeft);

      }

  3. The default camera projection calculator uses the screen aspect ratio:

      glm::mat4 getDefaultProjection() const {

        const float ratio = ctx_.vkDev.framebufferWidth /      (float)ctx_.vkDev.framebufferHeight;

        return glm::perspective(      glm::pi<float>() / 4.0f, ratio, 0.1f, 1000.0f);

      }

  4. The handleKey() method redirects key press events to the Boolean fields of the camera positioner:

      virtual void handleKey(int key, bool pressed)    override {

        if (key == GLFW_KEY_W)

          positioner.movement_.forward_ = pressed;

        … handle the rest of camera keys similarly …

      }

    All the keys are handled just as in the recipes from Chapter 3, Getting Started with OpenGL and Vulkan, through Chapter 6, Physically Based Rendering Using the glTF2 Shading Model; a sketch of the remaining bindings follows this list.

  5. The protected section of the class defines the camera positioner and the camera itself. Subclasses may use the current camera position for lighting calculations by passing it to uniform buffers:

    protected:

      CameraPositioner_FirstPerson positioner;

      Camera camera;

    };
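A possible completion of the handleKey() method, assuming the Movement flags of CameraPositioner_FirstPerson from Chapter 4, Adding User Interaction and Productivity Tools (the exact key bindings shown here are illustrative):

  if (key == GLFW_KEY_S)
    positioner.movement_.backward_ = pressed;
  if (key == GLFW_KEY_A)
    positioner.movement_.left_ = pressed;
  if (key == GLFW_KEY_D)
    positioner.movement_.right_ = pressed;
  if (key == GLFW_KEY_1)
    positioner.movement_.up_ = pressed;
  if (key == GLFW_KEY_2)
    positioner.movement_.down_ = pressed;
  if (key == GLFW_KEY_LEFT_SHIFT)
    positioner.movement_.fastSpeed_ = pressed;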

The next recipe concentrates on implementing the Renderer interface based on the samples from previous chapters.

Working with rendering passes

In Chapter 4, Adding User Interaction and Productivity Tools, we introduced our "layered" frame composition, which we will now refine and extend. These modifications will allow us to do offscreen rendering and significantly simplify initialization and Vulkan object management.

This recipe will describe the rendering interface that's used by the VulkanApp class. At the end of this recipe, a few concrete classes will be presented that can render quadrilaterals and the UI. Please revisit the previous two recipes to see how the Renderer interface fits in the new framework.

Getting ready

Check out the Putting it all together into a Vulkan application recipe of Chapter 4, Adding User Interaction and Productivity Tools, to refresh your memory on how our "layered" frame composition works.

How to do it...

Each of the frame rendering passes is represented by an instance of the Renderer class. The list of references to these instances is stored in VulkanRenderContext. The usage of these instances was thoroughly discussed in the previous recipe:

  1. The Renderer class contains a public constructor with an empty body; its initializer list stores a reference to VulkanRenderContext and the default size of the output framebuffer:

    struct Renderer {

      Renderer(VulkanRenderContext& c)

      : processingWidth(c.vkDev.framebufferWidth)

      , processingHeight(c.vkDev.framebufferHeight)

      , ctx_(c)

      {}

  2. To produce anything on the screen, we need to record a command buffer for our graphics queue. A pure virtual fillCommandBuffer() method is overridden in subclasses to record rendering commands. Each frame can be rendered to a different framebuffer, so we will pass the image index as a parameter. A frame can be rendered to an onscreen framebuffer. In this case, we pass null handles as the output framebuffer and render pass:

      virtual void fillCommandBuffer(    VkCommandBuffer cmdBuffer,    size_t currentImage,    VkFramebuffer fb = VK_NULL_HANDLE,    VkRenderPass rp = VK_NULL_HANDLE) = 0;

  3. At each frame, we may need to update the contents of some of the buffers. By default, the respective method is empty:

      virtual void updateBuffers(size_t currentImage) {}

  4. One frequent operation we must perform is updating the uniform buffer that corresponds to the current frame:

      inline void updateUniformBuffer(    uint32_t currentImage,    const uint32_t offset,    const uint32_t size, const void* data)

      {

        uploadBufferData(ctx_.vkDev,      uniforms_[currentImage].memory, offset, data,      size);

      }

  5. Each of our renderers uses a dedicated Vulkan graphics pipeline. This pipeline is determined by the list of used shaders, the sizes of push constant buffers for the vertex and fragment stages, and a PipelineInfo structure with additional parameters. The initPipeline() function creates a pipeline layout and then immediately uses this layout to create the Vulkan pipeline itself. Just like with any of the Vulkan objects, the pipeline handle is stored in the ctx_.resources object, so we do not have to worry about its destruction:

      void initPipeline(    const std::vector<const char*>& shaders,    const PipelineInfo& pInfo,    uint32_t vtxConstSize = 0,    uint32_t fragConstSize = 0)

      {

        pipelineLayout_ =      ctx_.resources.addPipelineLayout(        descriptorSetLayout_, vtxConstSize,        fragConstSize);

        graphicsPipeline_ = ctx_.resources.addPipeline(      renderPass_.handle, pipelineLayout_, shaders,      pInfo);

      }

  6. Each renderer defines a dedicated render pass that is compatible with the set of input textures. The initRenderPass() function contains the logic that's used in most of the renderer classes. The input pipeline parameters can be changed if offscreen rendering is performed (non-empty list of output textures). If we pass in a valid renderPass object, then it is directly assigned to the internal renderPass_ field:

      PipelineInfo initRenderPass(    const PipelineInfo& pInfo,    const std::vector<VulkanTexture>& outputs,    RenderPass renderPass = RenderPass(),    RenderPass fallbackPass = RenderPass())

      {

  7. If the output list is empty, which means we are rendering to the screen, and the renderPass parameter is not valid, then we take fallbackPass as the rendering pass. Usually, it is taken from one of the class fields of VulkanRenderContext – screenRenderPass or screenRenderPass_NoDepth – depending on whether we need depth buffering or not. We may need to modify the input pipeline description, so we will declare a new PipelineInfo variable. If we are rendering to an offscreen buffer, we must store the buffer dimensions in the rendering area. The output pipeline information structure also contains the actual rendering area's size:

          PipelineInfo outInfo = pInfo;

          if (!outputs.empty()) {

            processingWidth = outputs[0].width;

            processingHeight = outputs[0].height;

            outInfo.width = processingWidth;

            outInfo.height = processingHeight;

  8. If no external renderpass is provided, we allocate a new one that's compatible with the output framebuffer. If we have only one depth attachment, then we must use a special rendering pass. The isDepthFormat() function is a one-liner that compares the VkFormat parameter with the predefined Vulkan depth buffer formats; see the UtilsVulkan.h file for details, and the short sketch after this list. To render to the screen framebuffer, we will use one of the renderpasses from our parameters:

          bool hasHandle =        renderPass.handle != VK_NULL_HANDLE;

          bool hasDepth = (outputs.size() == 1) &&        isDepthFormat(outputs[0].format);

          renderPass_ = hasHandle ? renderPass :        (hasDepth ?          ctx_.resources.addDepthRenderPass(outputs) :            ctx_.resources.addRenderPass(outputs));

          framebuffer_ = ctx_.resources.addFramebuffer(        renderPass_, outputs);

        } else {

          renderPass_ =        hasHandle ? renderPass : fallbackPass;

        }

        return outInfo;

      }

    The last helper function we need in all our renderers is the beginRenderPass() function, which adds the appropriate commands to start a rendering pass.

  9. Just as we did in VulkanRenderContext::beginRenderPass(), we will declare some buffer clearing values and the output area:

      void beginRenderPass(    VkRenderPass rp, VkFramebuffer fb,    VkCommandBuffer commandBuffer,    size_t currentImage)

      {

        const VkClearValue clearValues[2] = {      VkClearValue { .color = {1.f, 1.f, 1.f, 1.f} },      VkClearValue { .depthStencil = {1.f, 0} }    };

        const VkRect2D rect {      .offset = { 0, 0 },      .extent = { .width  = processingWidth,                  .height = processingHeight }    };

  10. To avoid calling the Vulkan API directly, we will pass a complete set of parameters to the VulkanRenderContext::beginRenderPass() function we implemented in the previous recipe. Some arithmetic is required to calculate the number of clear values. If we don't need to clear the color, depth, or both buffers, we will change the offset in the clearValues array:

        ctx_.beginRenderPass(      commandBuffer, rp, currentImage, rect, fb,      (renderPass_.info.clearColor_ ? 1u : 0u) +      (renderPass_.info.clearDepth_ ? 1u : 0u),      renderPass_.info.clearColor_ ? &clearValues[0] :        (renderPass_.info.clearDepth_ ?          &clearValues[1] : nullptr));

  11. After starting a renderpass, we must bind our local graphics pipeline and descriptor set for this frame:

        vkCmdBindPipeline(commandBuffer,      VK_PIPELINE_BIND_POINT_GRAPHICS,      graphicsPipeline_);

        vkCmdBindDescriptorSets(commandBuffer,      VK_PIPELINE_BIND_POINT_GRAPHICS,      pipelineLayout_, 0, 1,      &descriptorSets_[currentImage], 0, nullptr);

      }

  12. The public part of the Renderer class contains cached instances of the framebuffer and renderpass, as well as the output framebuffer dimensions:

      VkFramebuffer framebuffer_ = nullptr;

      RenderPass renderPass_;

      uint32_t processingWidth;

      uint32_t processingHeight;

  13. The protected part of the renderer is very similar to our RendererBase from Chapter 4, Adding User Interaction and Productivity Tools, but here, we will use the VulkanRenderContext reference to cleanly manage Vulkan objects. Each renderer contains a list of descriptor sets, along with a pool and a layout for all the sets. The pipeline layout and the pipeline itself are also present in every renderer. An array of uniform buffers, one for each of the frames in a swapchain, is the last field of our Renderer:

    protected:

      VulkanRenderContext& ctx_;

      VkDescriptorSetLayout descriptorSetLayout_ =    nullptr;

      VkDescriptorPool descriptorPool_ = nullptr;

      std::vector<VkDescriptorSet> descriptorSets_;

      VkPipelineLayout pipelineLayout_ = nullptr;

      VkPipeline graphicsPipeline_ = nullptr;

      std::vector<VulkanBuffer> uniforms_;

    };
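    As promised in step 8, here is a minimal sketch of the isDepthFormat() helper. The actual version lives in UtilsVulkan.h and may cover a slightly different set of formats; a plausible equivalent that checks the common Vulkan depth formats looks like this:

      inline bool isDepthFormat(VkFormat fmt) {
        // true for the standard depth (and depth-stencil) formats
        return (fmt == VK_FORMAT_D16_UNORM) ||
               (fmt == VK_FORMAT_X8_D24_UNORM_PACK32) ||
               (fmt == VK_FORMAT_D32_SFLOAT) ||
               (fmt == VK_FORMAT_D16_UNORM_S8_UINT) ||
               (fmt == VK_FORMAT_D24_UNORM_S8_UINT) ||
               (fmt == VK_FORMAT_D32_SFLOAT_S8_UINT);
      }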

With all the components in place, we are now ready to begin implementing renderers.

In the next chapter, we will use offscreen rendering. For debugging, we could output the contents of a texture to some part of the screen. The QuadRenderer class, which is derived from the base Renderer, provides a way to output textured quads. In Chapter 8, Image-Based Techniques, we will use this class to output postprocessed frames:

  1. The constructor of this class takes a list of textures, which can be mapped to quadrangles later, and an array of output textures. If the list of output textures is empty, this class renders directly to the screen. A render pass that's compatible with the output textures can be passed from the outside context as an optional parameter:

    struct QuadRenderer: public Renderer {

      QuadRenderer(VulkanRenderContext& ctx,    const std::vector<VulkanTexture>& textures,    const std::vector<VulkanTexture>& outputs = {},    RenderPass screenRenderPass = RenderPass())

      : Renderer(ctx)

    {

  2. The initialization begins by creating an appropriate framebuffer and a render pass. The QuadRenderer class does not use the depth buffer, so a depthless framebuffer is passed as a parameter:

        const PipelineInfo pInfo = initRenderPass(      PipelineInfo {}, outputs, screenRenderPass,      ctx.screenRenderPass_NoDepth);

  3. The vertex buffer must contain six vertices for each item on the screen since we are using two triangles to represent a single quadrangle. MAX_QUADS is just an integer constant containing the largest number of quadrangles in a buffer. The number of geometry buffers and descriptor sets equals the number of swapchain images:

        uint32_t vertexBufferSize =      MAX_QUADS * 6 * sizeof(VertexData);

        const size_t imgCount =       ctx.vkDev.swapchainImages.size();

        descriptorSets_.resize(imgCount);

        storages_.resize(imgCount);

  4. Each of the descriptor sets for this renderer contains a reference to all the textures and uses a single geometry buffer. Here, we will only fill in a helper structure that is then passed to the construction routine:

        DescriptorSetInfo dsInfo = {      .buffers = {        storageBufferAttachment(VulkanBuffer {}, 0,        vertexBufferSize, VK_SHADER_STAGE_VERTEX_BIT)      },      .textureArrays = {        fsTextureArrayAttachment(textures)      }    };

    For a complete explanation of the descriptor set creation process, see the Unifying descriptor set creation routines recipe.

  5. Once we have a structure that describes the layout of the descriptor set, we will call the VulkanResources object to create Vulkan's descriptor set layout and a descriptor pool:

      descriptorSetLayout_ =    ctx.resources.addDescriptorSetLayout(dsInfo);

      descriptorPool_ =     ctx.resources.addDescriptorPool(dsInfo, imgCount);

  6. For each of the swapchain images, we allocate a GPU geometry buffer and put it in the first slot of the descriptor set:

        for (size_t i = 0 ; i < imgCount ; i++) {

          storages_[i] = ctx.resources.addStorageBuffer(        vertexBufferSize);

          dsInfo.buffers[0].buffer = storages_[i];

          descriptorSets_[i] =        ctx.resources.addDescriptorSet(          descriptorPool_, descriptorSetLayout_);

          ctx.resources.updateDescriptorSet(        descriptorSets_[i], dsInfo);

        }

  7. At the end of the constructor, we should create a pipeline using the following shaders:

        initPipeline({      "data/shaders/chapter08/VK02_QuadRenderer.vert",      "data/shaders/chapter08/VK02_QuadRenderer.frag"},      pInfo);

      }

  8. Filling in the command buffer consists of starting a render pass and issuing a single vkCmdDraw() command to render the quadrangles:

      void fillCommandBuffer(    VkCommandBuffer cmdBuffer,    size_t currentImage,    VkFramebuffer fb = VK_NULL_HANDLE,    VkRenderPass rp = VK_NULL_HANDLE) override

      {

  9. If the quads list is empty, no commands need to be issued:

        if (quads_.empty()) return;

        bool hasRP = rp != VK_NULL_HANDLE;

        bool hasFB = fb != VK_NULL_HANDLE;

        beginRenderPass(hasRP ? rp : renderPass_.handle,      hasFB ? fb : framebuffer_, cmdBuffer,      currentImage);

        vkCmdDraw(cmdBuffer,      static_cast<uint32_t>(quads_.size()), 1, 0, 0);

        vkCmdEndRenderPass(cmdBuffer);

      }

  10. If the quad geometry or the number of quads changes, the geometry buffer is re-uploaded to the GPU by the overridden updateBuffers() method:

      void updateBuffers(size_t currentImage) override {

        if (quads_.empty()) return;

        uploadBufferData(ctx_.vkDev,      storages_[currentImage].memory, 0,      quads_.data(),      quads_.size() * sizeof(VertexData));

      }

  11. The most important method for us is, of course, quad(), which adds a new quadrangle to the buffer. A quadrangle is defined by four corner points whose coordinates are calculated from the parameters. Each vertex is marked with a texture index that is passed to the fragment shader. Since Vulkan does not support quadrangles as a rendering primitive, we split each quadrangle into two adjacent triangles, each specified by three vertices:

      void quad(    float x1, float y1, float x2, float y2, int texIdx)  {

        VertexData v1 { { x1, y1, 0 }, { 0, 0 }, texIdx };

        VertexData v2 { { x2, y1, 0 }, { 1, 0 }, texIdx };

        VertexData v3 { { x2, y2, 0 }, { 1, 1 }, texIdx };

        VertexData v4 { { x1, y2, 0 }, { 0, 1 }, texIdx };

        quads_.push_back(v1);

        quads_.push_back(v2);

        quads_.push_back(v3);

        quads_.push_back(v1);

        quads_.push_back(v3);

        quads_.push_back(v4);

      }

  12. The following method allows us to clear the entire list of quadrangles:

      void clear() { quads_.clear(); }

  13. In the private: part of the class, we store a structured buffer of vertex data. Note that no index buffer is being used, so there are additional possibilities for optimizations. At the end, we store a list of buffers for storing geometry data on our GPU – one buffer per swapchain image:

    private:

      struct VertexData {

        glm::vec3 pos;

        glm::vec2 tc;

        int texIdx;

      };

      std::vector<VertexData> quads_;

      std::vector<VulkanBuffer> storages_;

    };
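Before moving on to the shaders, here is a hedged sketch of how QuadRenderer might be wired up to display a debug texture. The debugTexture variable and the clip-space coordinates are assumptions for illustration only, and the renderer is registered in the application constructor, as the examples later in this chapter do:

    // Illustrative only: in a real application, 'quads' is a class member,
    // not a local variable, so the reference stored below stays valid.
    QuadRenderer quads(ctx_, { debugTexture });   // empty outputs: render to the screen
    quads.quad(-1.0f, -1.0f, 0.0f, 0.0f, 0);      // one quarter of the screen, texture 0
    onScreenRenderers_.emplace_back(quads);       // register it for frame composition

Calling clear() and adding new quads each frame, before updateBuffers() runs, lets the debug overlay change dynamically.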

Now that we have finished the C++ part of the code, let's take a look at the GLSL part – the VK02_QuadRenderer.vert and VK02_QuadRenderer.frag shaders we mentioned in the constructor of the QuadRenderer class:

  1. As output, the vertex shader produces interpolated texture coordinates and a constant texture index. We will use a vertex layout similar to the one we used for ImGui widget rendering. The SBO buffer, which we will be using as input, contains packed vertex data:

    layout(location = 0) out vec2 out_uv;

    layout(location = 1) flat out uint out_texIndex;

    struct DrawVert {

      float x, y, z, u, v;

      uint texIdx;

    };

    layout(binding = 0)  readonly buffer SBO { DrawVert data[]; } sbo;

  2. In the main() function, we fetch the vertex data into a local variable. Then, the vertex data is unpacked into output variables. The mandatory gl_Position variable is filled from vertex data without any transformations:

    void main() {

      uint idx = gl_VertexIndex;

      DrawVert v = sbo.data[idx];

      out_uv  = vec2(v.u, v.v);

      out_texIndex = v.texIdx;

      gl_Position = vec4(vec2(v.x, v.y), 0.0, 1.0);

    }

The fragment shader uses an array of textures to color the pixel:

  1. First, we should enable the GL_EXT_nonuniform_qualifier extension to address the array using values that have been read from a buffer:

    #extension GL_EXT_nonuniform_qualifier : require

    layout (binding = 1) uniform sampler2D textures[];

  2. The inputs of the pixel shader are the output of the previous vertex shader. The only output is the pixel's color:

    layout (location = 0) in vec2 in_uv;

    layout (location = 1) flat in uint in_texIndex;

    layout (location = 0) out vec4 outFragColor;

  3. The following constant allows us to mark some textures as containing "depth" values:

    const uint depthTextureMask = 0xFFFF;

  4. The following function converts a non-linear depth buffer value into an intensity value:

    float linearizeDepth(float d, float zNear, float zFar) {

      return zNear * zFar / (zFar + d * (zNear - zFar));

    }

  5. The main() function renders the quadrangle's fragments. First, we extract the texture index. Then, we look at the higher bits of the texture index to determine the texture type. The texture value at the given texture coordinates is fetched from the appropriate texture. Depending on the texture type, the value is either output directly or linearized and converted into grayscale (a sketch of the C++-side index packing follows this list):

    void main() {

      uint tex = in_texIndex & depthTextureMask;

      uint texType =    (in_texIndex >> 16) & depthTextureMask;

      vec4 value = texture(    textures[nonuniformEXT(tex)], in_uv);

      outFragColor = (texType == 0 ? value : vec4(    vec3(linearizeDepth(value.r, 0.01, 100.0)), 1.0));

    }
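    The texture type check above implies a simple packing convention on the C++ side: the low 16 bits select the texture, while the upper bits carry the type flag. A hedged helper, written here as an assumption rather than code taken verbatim from the book's sources, could look like this:

      // low 16 bits: texture index; bit 16: 1 = depth texture, 0 = color texture
      inline int packTextureIndex(int textureIndex, bool isDepth) {
        return (textureIndex & 0xFFFF) | ((isDepth ? 1 : 0) << 16);
      }

      // e.g., display texture #2 as a linearized depth map:
      // quadRenderer.quad(-1.f, -1.f, 0.f, 0.f, packTextureIndex(2, true));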

The next recipe will conclude our review of the new rendering framework by describing the routines for creating the descriptor set.

There's more...

Similar to QuadRenderer, the new LineCanvas class is (re)implemented by following the code of VulkanCanvas from the Implementing an immediate mode drawing canvas recipe of Chapter 4, Adding User Interaction and Productivity Tools. The only thing that has changed is the simplified constructor code, which is now using the new resource management scheme and descriptor set initialization.

The GuiRenderer class also (re)implements ImGuiRenderer from the Rendering the Dear ImGui user interface with Vulkan recipe of Chapter 4, Adding User Interaction and Productivity Tools, and adds support for multiple textures. The usage of this class will be shown in most of the upcoming examples in this book.

Unifying descriptor set creation routines

Before we can complete our material system implementation using the Vulkan API, we must reconsider the descriptor set creation routines from all the previous recipes. The reason we didn't implement the most generic routines right away is simple: as with all the examples, we decided to follow the "natural" evolution of the code instead of providing all the solutions at the beginning. The routines presented in this recipe complete the resource management system for this book.

Getting ready

The source code for this recipe can be found in the shared/vkFramework/VulkanResources.h and shared/vkFramework/VulkanResources.cpp files.

How to do it...

The Managing Vulkan resources recipe from this chapter introduced the VulkanResources class, which contains all our allocated Vulkan objects. The descriptor sets and descriptor pools are also allocated by the methods of this class. Here is a list of requirements for descriptor set management:

  • There are three types of objects to support: buffers, textures, and arrays of textures (or indexed textures). For the purposes of this book, we will not use indexed buffers, though they might be useful in general.
  • We should be able to specify the layout of a descriptor set.
  • Once the descriptor set has been created, we should be able to perform "update" operations to fill the descriptor set with concrete resource handles.

Let's construct a system that addresses all these requirements:

  1. The first and second requirements immediately tell us to create a structure that we can use as input for our descriptor set creation routine:

    struct DescriptorSetInfo {

      std::vector<BufferAttachment>       buffers;

      std::vector<TextureAttachment>      textures;

      std::vector<TextureArrayAttachment> textureArrays;

    };

  2. Individual attachments are described in a similar fashion. The attachment contains a DescriptorInfo structure that tells us about the usage type and exact shader stages where the attachment is used:

    struct DescriptorInfo {

      VkDescriptorType type;

      VkShaderStageFlags shaderStageFlags;

    };

  3. A buffer attachment contains a VulkanBuffer structure, along with an offset into the buffer and the size of the attached data. The size field may seem redundant because the buffer size is already present in the VulkanBuffer structure. However, we frequently use parts of a single buffer as two logical attachments – for example, the index and vertex data stored in one buffer (see the sketch after this list):

    struct BufferAttachment {

      DescriptorInfo dInfo;

      VulkanBuffer   buffer;

      uint32_t       offset;

      uint32_t       size;

    };

  4. The texture attachment is similar to BufferAttachment and contains a VulkanTexture object:

    struct TextureAttachment {

      DescriptorInfo dInfo;

      VulkanTexture  texture;

    };

  5. The only thing that's different in an attachment of texture arrays is the std::vector field, which contains a textures list:

    struct TextureArrayAttachment {

      DescriptorInfo dInfo;

      std::vector<VulkanTexture> textures;

    };

  6. The VulkanBuffer object contains the VkDeviceMemory and VkBuffer handles, along with information about the buffer size:

    struct VulkanBuffer {

      VkBuffer       buffer;

      VkDeviceSize   size;

      VkDeviceMemory memory;

    };

    For the purposes of this book, such an aggregation is more than enough, but in a multi-GPU configuration, device memory and buffer handles may need to be managed separately.
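To illustrate the buffer-sharing point from step 3, here is a hedged sketch of exposing a single GPU buffer, holding vertex data followed by index data, as two logical attachments. It assumes a VulkanRenderContext& ctx is available, as in the renderer constructors; the sizes are placeholders, and storageBufferAttachment() is the helper used earlier in this chapter:

    // hypothetical sizes; real values come from the loaded mesh data
    const uint32_t vertexDataSize = 65536;
    const uint32_t indexDataSize  = 16384;

    // one GPU buffer: vertices first, indices right after them
    VulkanBuffer meshBuffer =
      ctx.resources.addStorageBuffer(vertexDataSize + indexDataSize);

    // ...exposed to the shaders as two logical attachments
    const DescriptorSetInfo dsInfo = {
      .buffers = {
        storageBufferAttachment(meshBuffer, 0, vertexDataSize,
          VK_SHADER_STAGE_VERTEX_BIT),
        storageBufferAttachment(meshBuffer, vertexDataSize, indexDataSize,
          VK_SHADER_STAGE_VERTEX_BIT)
      }
    };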

Next, let's look at VulkanTexture, a helper for aggregating VulkanImage and VkSampler so that we don't have to pass multiple objects as function parameters:

  1. At the start of VulkanTexture, we will store the texture dimensions and pixel format. This is because, later in the code, we will deduce rendering pass parameters for offscreen buffers from this data:

    struct VulkanTexture final {

      uint32_t width;

      uint32_t height;

      uint32_t depth;

      VkFormat format;

  2. VulkanImage is an aggregate structure we used in Chapter 6, Physically Based Rendering Using the glTF2 Shading Model, to pass texture image data around. As a reminder, it contains the VkImage and VkImageView handles:

      VulkanImage image;

      VkSampler sampler;

  3. The last field of VulkanTexture keeps track of the layout of this texture at creation time. This is important so that we correctly use the texture as an offscreen buffer or as a texture source for shaders:

      VkImageLayout desiredLayout;

    };

A descriptor set needs a layout, so the following steps create VkDescriptorSetLayout from our DescriptorSetInfo structure. Notice that at this point, we are omitting the actual buffer handles from attachment descriptions:

  1. As we described in the Using descriptor indexing and texture arrays in Vulkan recipe of Chapter 6, Physically Based Rendering Using the glTF2 Shading Model, we need to specify some extra flags for our indexed texture arrays:

    VkDescriptorSetLayout   VulkanResources::addDescriptorSetLayout(    const DescriptorSetInfo& dsInfo)

    {

      VkDescriptorSetLayout descriptorSetLayout;

      std::vector<VkDescriptorBindingFlagsEXT>    descriptorBindingFlags;

  2. For each type of resource, we collect the appropriate bindings. The loops over our buffers and textures push attachments to the bindings array; a sketch of the descriptorSetLayoutBinding() helper they use appears after this list:

      uint32_t bindingIdx = 0;

      std::vector<VkDescriptorSetLayoutBinding> bindings;

      for (const auto& b: dsInfo.buffers) {

        descriptorBindingFlags.push_back(0u);

        bindings.push_back(descriptorSetLayoutBinding(      bindingIdx++, b.dInfo.type,      b.dInfo.shaderStageFlags));

      }

      for (const auto& i: dsInfo.textures) {

        descriptorBindingFlags.push_back(0u);

        bindings.push_back(descriptorSetLayoutBinding(      bindingIdx++, i.dInfo.type,       i.dInfo.shaderStageFlags));

      }

  3. For the texture array attachments, we must also store the non-zero binding flag:

      for (const auto& t: dsInfo.textureArrays) {

        // the indexed texture array needs a non-zero binding flag
        // (the exact flag used in the book's sources may differ)
        descriptorBindingFlags.push_back(      VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT_EXT);

        bindings.push_back(      descriptorSetLayoutBinding(bindingIdx++,      VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER,      t.dInfo.shaderStageFlags,      static_cast<uint32_t>(t.textures.size())));

      }

  4. Our VkDevice object was created with the shaderSampledImageArrayDynamicIndexing feature enabled, so we must pass the descriptor binding flags to the descriptor set layout creation structure. As a side note, the sType field probably uses the longest constant name in this entire book as its value:

      const VkDescriptorSetLayoutBindingFlagsCreateInfoEXT    setLayoutBindingFlags = {    .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_BINDING_FLAGS_CREATE_INFO_EXT,    .bindingCount = static_cast<uint32_t>(      descriptorBindingFlags.size()),    .pBindingFlags = descriptorBindingFlags.data()  };

  5. The usual creation structure for the layout is defined with references to the layout binding flags and binding descriptions:

      const VkDescriptorSetLayoutCreateInfo layoutInfo = {    .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO,    .pNext = dsInfo.textureArrays.empty() ?      nullptr : &setLayoutBindingFlags,    .flags = 0,    .bindingCount =       static_cast<uint32_t>(bindings.size()),    .pBindings = bindings.size() > 0 ?      bindings.data() : nullptr

      };

  6. As with all the errors, we exit if anything goes wrong:

      if (vkCreateDescriptorSetLayout(vkDev.device,      &layoutInfo, nullptr,      &descriptorSetLayout) != VK_SUCCESS) {

        printf("Failed to create descriptor set       layout ");

        exit(EXIT_FAILURE);

      }

  7. As with the rest of the Vulkan objects, the descriptor set layout is stored in an internal array to be destroyed at the end:

      allDSLayouts.push_back(descriptorSetLayout);

      return descriptorSetLayout;

    }
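    The descriptorSetLayoutBinding() helper used above is a thin wrapper around the corresponding Vulkan structure. A plausible implementation, roughly equivalent to the one shipped with the book's utility code, is shown below:

      inline VkDescriptorSetLayoutBinding descriptorSetLayoutBinding(
        uint32_t binding, VkDescriptorType descriptorType,
        VkShaderStageFlags stageFlags, uint32_t descriptorCount = 1)
      {
        return VkDescriptorSetLayoutBinding {
          .binding = binding,
          .descriptorType = descriptorType,
          .descriptorCount = descriptorCount,
          .stageFlags = stageFlags,
          .pImmutableSamplers = nullptr
        };
      }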

To allocate a descriptor set, we need a descriptor pool with enough handles for all the buffers and textures:

  1. The following routine creates such a pool. We must count each type of buffer and all the samplers:

    VkDescriptorPool VulkanResources::addDescriptorPool(  const DescriptorSetInfo& dsInfo, uint32_t dSetCount)

    {

      uint32_t uniformBufferCount = 0;

      uint32_t storageBufferCount = 0;

      uint32_t samplerCount =    static_cast<uint32_t>(dsInfo.textures.size());

  2. Each texture array contributes all of its textures to the sampler count:

      for (const auto& ta : dsInfo.textureArrays)

        samplerCount +=      static_cast<uint32_t>(ta.textures.size());

  3. The buffers we will use should be of the storage or uniform type:

      for (const auto& b: dsInfo.buffers) {

        if (b.dInfo.type ==        VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER)

          uniformBufferCount++;

        if (b.dInfo.type ==        VK_DESCRIPTOR_TYPE_STORAGE_BUFFER)

          storageBufferCount++;

      }

  4. In our GLSL shaders, we use uniform buffers, storage buffers, and texture samplers. The poolSizes array contains at most three items – one for each resource type that is actually present:

      std::vector<VkDescriptorPoolSize> poolSizes;

  5. For each buffer type, we add an item to poolSizes:

      if (uniformBufferCount)

        poolSizes.push_back(VkDescriptorPoolSize{      .type = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER,      .descriptorCount = dSetCount *         uniformBufferCount });

  6. The storage buffer differs by a single constant in the type field:

      if (storageBufferCount)

        poolSizes.push_back( VkDescriptorPoolSize{      .type = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,      .descriptorCount = dSetCount *        storageBufferCount });

  7. Texture samplers also generate an item in the poolSizes array:

      if (samplerCount)

        poolSizes.push_back( VkDescriptorPoolSize{      .type =         VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER,      .descriptorCount = dSetCount * samplerCount });

  8. Having counted everything, we must declare the descriptor pool creation structure:

      const VkDescriptorPoolCreateInfo poolInfo = {    .sType =       VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO,    .pNext = nullptr,    .flags = 0,    .maxSets = static_cast<uint32_t>(dSetCount),    .poolSizeCount =      static_cast<uint32_t>(poolSizes.size()),    .pPoolSizes = poolSizes.empty() ?      nullptr : poolSizes.data()  };

  9. If Vulkan's vkCreateDescriptorPool() function fails, we should exit. The descriptor pool must be destroyed at the end, so we will keep the handle in an array:

      VkDescriptorPool descriptorPool = VK_NULL_HANDLE;

      if (vkCreateDescriptorPool(vkDev.device,      &poolInfo, nullptr, &descriptorPool) !=      VK_SUCCESS) {

        printf("Cannot allocate descriptor pool ");

        exit(EXIT_FAILURE);

      }

      allDPools.push_back(descriptorPool);

      return descriptorPool;

    }

  10. The addDescriptorSet() function is nothing more than a wrapper on top of vkAllocateDescriptorSets():

    VkDescriptorSet   VulkanResources::addDescriptorSet(    VkDescriptorPool descriptorPool,    VkDescriptorSetLayout dsLayout)

    {

      VkDescriptorSet descriptorSet;

      const VkDescriptorSetAllocateInfo allocInfo = {    .sType =      VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO,    .pNext = nullptr,    .descriptorPool = descriptorPool,    .descriptorSetCount = 1,    .pSetLayouts = &dsLayout   };

  11. Error checking terminates the program in case of a failure:

      if (vkAllocateDescriptorSets(        vkDev.device, &allocInfo, &descriptorSet) !=      VK_SUCCESS) {

        printf("Cannot allocate descriptor set ");

        exit(EXIT_FAILURE);

      }

      return descriptorSet;

    }

The most important function for us is updateDescriptorSet(), which attaches the actual buffers and texture samplers to the descriptor set's logical slots. Let's take a look:

  1. This function prepares a list of descriptor write operations for all the buffers and textures. Note that we do not provide a way to update an individual descriptor set item – only the entire set. Individual buffer and image descriptors are stored in separate arrays:

    void VulkanResources::updateDescriptorSet(  VkDescriptorSet ds, const DescriptorSetInfo& dsInfo)

    {

      uint32_t bindingIdx = 0;

      std::vector<VkWriteDescriptorSet> descriptorWrites;

      std::vector<VkDescriptorBufferInfo>    bufferDescriptors(dsInfo.buffers.size());

      std::vector<VkDescriptorImageInfo>    imageDescriptors(dsInfo.textures.size());

      std::vector<VkDescriptorImageInfo>    imageArrayDescriptors;

  2. The first array is used to convert buffer descriptions into VkDescriptorBufferInfo structures:

      for (size_t i = 0 ; i < dsInfo.buffers.size() ; i++)

      {

        BufferAttachment b = dsInfo.buffers[i];

        bufferDescriptors[i] = VkDescriptorBufferInfo {      .buffer = b.buffer.buffer,      .offset = b.offset,      .range  = (b.size > 0) ? b.size : VK_WHOLE_SIZE     };

        descriptorWrites.push_back(      bufferWriteDescriptorSet(ds,        &bufferDescriptors[i],        bindingIdx++, b.dInfo.type));

      }

  3. The second one is used to convert all individual textures into VkDescriptorImageInfo structures:

      for(size_t i = 0 ; i < dsInfo.textures.size() ; i++)

      {

        VulkanTexture t = dsInfo.textures[i].texture;

        imageDescriptors[i] = VkDescriptorImageInfo {      .sampler = t.sampler,      .imageView = t.image.imageView,      .imageLayout =        VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL     };

        descriptorWrites.push_back(      imageWriteDescriptorSet(ds,        &imageDescriptors[i], bindingIdx++));

      }

  4. Finally, the trickiest pair of loops is required to process a list of texture arrays. The first loop collects all the individual image descriptions in a single array and stores the offsets of these images for each individual texture array. To keep track of the offset in the global list of textures, we will use the taOffset variable:

      uint32_t taOffset = 0;

      std::vector<uint32_t> taOffsets(    dsInfo.textureArrays.size());

      for (size_t ta = 0 ; ta <       dsInfo.textureArrays.size() ; ta++) {

        taOffsets[ta] = taOffset;

  5. Inside each texture array, we must convert each texture handle, just as we did with single texture attachments:

        for (size_t j = 0;         j<dsInfo.textureArrays[ta].textures.size();         j++) {

          VulkanTexture t =         dsInfo.textureArrays[ta].textures[j];

          VkDescriptorImageInfo imageInfo = {        .sampler = t.sampler,        .imageView = t.image.imageView,        .imageLayout =           VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL       };

          imageArrayDescriptors.push_back(imageInfo);

        }

  6. The offset of the global texture array is updated with the size of the current texture array:

        taOffset += static_cast<uint32_t>(      dsInfo.textureArrays[ta].textures.size());

      }

  7. The second loop over the textureArrays field fills Vulkan's write descriptor set operation structure with the appropriate pointers inside the imageArrayDescriptors array:

      for (size_t ta = 0 ; ta <       dsInfo.textureArrays.size() ; ta++) {

        VkWriteDescriptorSet writeSet = {      .sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,      .dstSet = ds,      .dstBinding = bindingIdx++,      .dstArrayElement = 0,      .descriptorCount =        static_cast<uint32_t>(          dsInfo.textureArrays[ta].textures.size()),      .descriptorType =        VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER,      .pImageInfo = imageArrayDescriptors.data() +        taOffsets[ta]    };

        descriptorWrites.push_back(writeSet);

      }

  8. Once we have all the descriptor write operations filled and packed, we can pass them to the vkUpdateDescriptorSets() routine:

      vkUpdateDescriptorSets(vkDev.device,    static_cast<uint32_t>(descriptorWrites.size()),    descriptorWrites.data(), 0, nullptr);

    }

In all this book's examples, the descriptor sets are created early at runtime, usually in the constructor of some renderer class. All we need to do is fill the DescriptorSetInfo structure with references to our loaded texture and buffer attachments. Check out the vkFramework/VulkanResources.h file for multiple examples of how to implement various attachments using this mechanism.
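Putting the routines together, a typical renderer constructor performs roughly the following sequence. This is a hedged sketch: textures, imgCount, and the storage buffer size stand in for whatever the concrete renderer already has, and the member names match the Renderer fields shown earlier in this chapter:

    DescriptorSetInfo dsInfo = {
      .buffers = {
        storageBufferAttachment(
          ctx.resources.addStorageBuffer(1024), 0, 1024,
          VK_SHADER_STAGE_VERTEX_BIT)
      },
      .textureArrays = { fsTextureArrayAttachment(textures) }
    };

    descriptorSetLayout_ = ctx.resources.addDescriptorSetLayout(dsInfo);
    descriptorPool_      = ctx.resources.addDescriptorPool(dsInfo, imgCount);

    descriptorSets_.resize(imgCount);

    for (size_t i = 0; i != imgCount; i++) {
      descriptorSets_[i] =
        ctx.resources.addDescriptorSet(descriptorPool_, descriptorSetLayout_);
      ctx.resources.updateDescriptorSet(descriptorSets_[i], dsInfo);
    }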

There's more...

Our new set of Vulkan renderers, located in the shared/vkFramework folder, uses the unified descriptor set creators we described in this recipe. Make sure you check it out.

Putting it all together into a Vulkan application

Now, let's conclude this chapter by putting all the recipes we have just learned together, into a single demo application. The app will render the Lumberyard Bistro scene using meshes and materials from .obj files.

Getting ready

The Chapter7/VK03_LargeScene demo application combines the code from all the recipes of this chapter, so it will be helpful to skim through the entire chapter before proceeding.

To correctly execute the demo application, the Scene Converter tool from the Implementing a scene conversion tool recipe should be compiled and executed with all the default configuration, prior to running this demo.

How to do it...

Despite being able to render a fairly large scene, the main application, which can be found in the Chapter7/VK03_LargeScene folder, is surprisingly simple. All we must do here is define a MyApp class containing the scene data, textures, and all the renderer instances. The code is almost purely declarative; the only exception is passing the 3D camera parameters in the draw3D() function. These, too, could be wrapped in the VulkanBuffer interface, but that would require some more framework code to synchronize the camera data with the new GPU buffer. Anyway, let's get started:

  1. First, we have two constants with environment map texture filenames:

    const char* envMapFile = "data/piazza_bologni_1k.hdr";

    const char* irrMapFile =  "data/piazza_bologni_1k_irradience.hdr";

  2. The initialization part of the MyApp class sets up scene rendering and frame composition, which, in this example, consists of rendering two scene parts – the exterior and interior objects of the Bistro dataset. The application is instructed to create a window that takes up 80% of our screen space:

    struct MyApp: public CameraApp {

      MyApp(): CameraApp(-80, -80)

  3. Two environment maps are loaded for PBR lighting:

      , envMap(ctx_.resources.loadCubeMap(envMapFile))

      , irrMap(ctx_.resources.loadCubeMap(irrMapFile))

  4. Two parts of the scene are loaded from the files that were generated by the Scene Converter tool:

      , sceneData(ctx_, "data/meshes/test.meshes",      "data/meshes/test.scene",      "data/meshes/test.materials", envMap, irrMap)

      , sceneData2(ctx_, "data/meshes/test2.meshes",      "data/meshes/test2.scene",      "data/meshes/test2.materials", envMap, irrMap)

  5. Each part of the scene is rendered by a dedicated MultiRenderer instance. A mandatory ImGui renderer instance is initialized at the end:

      , multiRenderer(ctx_, sceneData)

      , multiRenderer2(ctx_, sceneData2)

      , imgui(ctx_)

  6. Here, the frame composition process is rendering two scene parts directly to the screen's framebuffer:

      {

        onScreenRenderers_.emplace_back(multiRenderer);

        onScreenRenderers_.emplace_back(multiRenderer2);

      }

  7. The draw3D() function passes the current camera parameters to both scene renderers:

      void draw3D() override {

        const mat4 p = getDefaultProjection();

        const mat4 view = camera.getViewMatrix();

        const mat4 model = glm::rotate(      mat4(1.f), glm::pi<float>(), vec3(1, 0, 0));

        multiRenderer.setMatrices(p, view, model);

        multiRenderer2.setMatrices(p, view, model);

        multiRenderer.setCameraPosition(      positioner.getPosition());

        multiRenderer2.setCameraPosition(      positioner.getPosition());

      }

  8. The private members of the class contain two environment maps for the PBR model and two VKSceneData instances. These contain the interior and exterior geometry of the test scene. Besides that, we must declare two MultiRenderer instances to display two parts of the Lumberyard Bistro scene. Though we will not be directly using the ImGui renderer in this example, we are doing this because its constructor contains the ImGui context initialization routine, which is required in VulkanApp:

    private:

      VulkanTexture envMap, irrMap;

      VKSceneData sceneData, sceneData2;

      MultiRenderer multiRenderer, multiRenderer2;

      GuiRenderer imgui;

    };

The main() function contains only three lines, all of which were explained in the Refactoring Vulkan initialization and the main loop recipe.
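For reference, those three lines boil down to something like the following sketch; the exact entry point is described in that recipe, and mainLoop() is assumed here to be its name:

    int main() {
      MyApp app;        // creates the window, the Vulkan context, and all renderers
      app.mainLoop();   // runs until the window is closed
      return 0;
    }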

How it works…

The main workhorse for this demo application is the VulkanApp class and two MultiRenderer instances, both of which are responsible for rendering scene objects that are loaded into VKSceneData objects. For a quick recap on the GPU data storage scheme of our application, look at the following diagram:

Figure 7.3 – Scene data scheme

The VKSceneData class loads the geometry data for all the scene objects, a list of material parameters, and an array of textures, referenced by individual materials. All the loaded data is transferred into the appropriate GPU buffers. The MultiRenderer class maintains the Shape and Transform lists in dedicated GPU buffers. Internally, the Shape List points to individual items in the Material and Transform lists, and it also holds offsets to the index and vertex data in the Mesh geometry buffer. At each frame, the VulkanApp class asks MultiRenderer to fill the command buffer with indirect draw commands to render the shapes of the scene. The parameters of the indirect draw command are taken directly from the Shape list. The running demo application should render the Lumberyard Bistro scene with materials, as shown in the following screenshot:

Figure 7.4 – Rendering the Lumberyard Bistro scene with materials

In Chapter 8, Image-Based Techniques, we will use the aforementioned MultiRenderer class to implement a few screen space effects, while in Chapter 10, Advanced Rendering Techniques and Optimizations, we will optimize the internal indirect draw commands by using frustum culling techniques. We will also implement a simple shadow mapping technique in Vulkan for this scene.

There's more...

We have also implemented an OpenGL version of this app. Check out the Chapter7/GL01_LargeScene project in the source code's bundle for more information.
