Chapter 13. Multipass Rendering


What You’ll Learn in This Chapter

• How to use renderpass objects to accelerate multipass rendering

• How to fold clears and barriers into renderpass objects

• How to control how and when data gets saved to memory


Many graphics applications make multiple passes over each frame or are otherwise able to subdivide rendering into multiple logical phases. Vulkan brings this into the core of its operation, exposing the concept of multipass rendering within a single object. This object was briefly introduced in Chapter 7, “Graphics Pipelines,” but we skimmed over many of the details, instead going only into enough depth to enable basic single-pass rendering to be achieved. In this chapter, we dig deeper into the topic to explain how multipass rendering algorithms can be implemented in a few renderpass objects or even a single one.

That earlier coverage went only far enough to explain how a framebuffer can be attached to a command buffer at the beginning of a renderpass and how the renderpass could be configured to allow drawing into a single set of color attachments. A renderpass object, however, can contain many subpasses, each performing some of the operations required to render the final scene. Dependency information can be introduced, allowing a Vulkan implementation to build a directed acyclic graph (DAG) and determine where data flows, which subpass produces it and which consumes it, what needs to be ready by when, and so on.

Input Attachments

Recall the VkRenderPassCreateInfo structure, the definition of which is

typedef struct VkRenderPassCreateInfo {
    VkStructureType                   sType;
    const void*                       pNext;
    VkRenderPassCreateFlags           flags;
    uint32_t                          attachmentCount;
    const VkAttachmentDescription*    pAttachments;
    uint32_t                          subpassCount;
    const VkSubpassDescription*       pSubpasses;
    uint32_t                          dependencyCount;
    const VkSubpassDependency*        pDependencies;
} VkRenderPassCreateInfo;

Within this structure, we have pointers to arrays of attachments, subpasses, and dependency information. Each subpass is defined by a VkSubpassDescription structure, the definition of which is

typedef struct VkSubpassDescription {
    VkSubpassDescriptionFlags        flags;
    VkPipelineBindPoint              pipelineBindPoint;
    uint32_t                         inputAttachmentCount;
    const VkAttachmentReference *    pInputAttachments;
    uint32_t                         colorAttachmentCount;
    const VkAttachmentReference *    pColorAttachments;
    const VkAttachmentReference *    pResolveAttachments;
    const VkAttachmentReference *    pDepthStencilAttachment;
    uint32_t                         preserveAttachmentCount;
    const uint32_t*                  pPreserveAttachments;
} VkSubpassDescription;

In the example renderpass we set up in Chapter 7, we used a single subpass with no dependencies and a single set of outputs. However, each subpass can have one or more input attachments, which are attachments from which you can read in your fragment shaders. The primary difference between an input attachment and a normal texture bound into a descriptor set is that when you read from an input attachment, you can read only the value at the current fragment’s own location, not from arbitrary coordinates.

Each subpass may write to one or more output attachments. These are either the color attachments or the depth-stencil attachment (of which there is only one). By inspecting the subpasses, which output attachments they write to and input attachments they read from, Vulkan can build up a graph of where data flows within a renderpass.

In order to demonstrate this, we will construct a simple three-pass renderpass object that performs deferred shading. In the first pass, we render only to a depth attachment in order to produce what is known as a depth prepass.

In a second pass, we render all the geometry with a special shader that produces a g-buffer, which is a color attachment (or set of color attachments) that stores normals, diffuse color, specular power, and other parameters needed for shading. In this second pass, we test against the depth buffer we just generated, so we do not write out large amounts of data for geometry that will not be visible in the final output. Even when the geometry is visible, we do not run complex lighting shaders; we simply write out data.

In our third pass, we perform all of our shading calculations. We read from the depth buffer in order to reconstruct the view-space position, which allows us to create our eye and view vectors. We also read from our normal, specular, and diffuse buffers, which supply parameters for our lighting computation. Note that in the third pass, we don’t actually need the real geometry, and we instead render a single triangle that, after clipping, covers the entire viewport.
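One common way to produce the single viewport-covering triangle mentioned above is to derive its corners from the vertex index alone, so that no vertex buffer is needed. The sketch below mirrors that arithmetic on the CPU; in a real renderer the same expression would normally live in the vertex shader. The Vec2 type and the fullscreenTriangleVertex() name are illustrative and not part of the original listing.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 2D vector type for this sketch only.
struct Vec2
{
    float x;
    float y;
};

// Returns the clip-space position of vertex 0, 1, or 2 of a triangle
// whose corners are (-1,-1), (3,-1), and (-1,3). After clipping, the
// triangle covers the whole [-1,1] x [-1,1] viewport.
inline Vec2 fullscreenTriangleVertex(uint32_t vertexIndex)
{
    assert(vertexIndex < 3);
    // (index << 1) & 2 yields 0, 2, 0 and index & 2 yields 0, 0, 2.
    const float u = static_cast<float>((vertexIndex << 1) & 2u);
    const float v = static_cast<float>(vertexIndex & 2u);
    return Vec2{u * 2.0f - 1.0f, v * 2.0f - 1.0f};
}
```

Drawing three vertices with this mapping gives every pixel exactly one fragment, which is what the lighting pass needs in order to read its input attachments once per pixel.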

Figure 13.1 shows this schematically. As you can see from the figure, the first subpass has no inputs and only a depth attachment. The second subpass uses the same depth attachment for testing but also has no inputs, producing only outputs. The third and final pass uses the depth buffer produced by the first pass and the g-buffer attachments produced by the second pass as input attachments. It can do this because the lighting calculations at each pixel require only the data computed by previous shader invocations at the same location.

Figure 13.1: Data Flow for a Simple Deferred Renderer

Listing 13.1 shows the code required to construct a renderpass that represents these three subpasses and their attachments.

Listing 13.1: Deferred Shading Renderpass Setup

enum
{
    kAttachment_BACK          = 0,
    kAttachment_DEPTH         = 1,
    kAttachment_GBUFFER       = 2
};
enum
{
    kSubpass_DEPTH            = 0,
    kSubpass_GBUFFER          = 1,
    kSubpass_LIGHTING         = 2
};

static const VkAttachmentDescription attachments[] =
{
    // Back buffer
    {
        0,                                 // flags
        VK_FORMAT_R8G8B8A8_UNORM,          // format
        VK_SAMPLE_COUNT_1_BIT,             // samples
        VK_ATTACHMENT_LOAD_OP_DONT_CARE,   // loadOp
        VK_ATTACHMENT_STORE_OP_STORE,      // storeOp
        VK_ATTACHMENT_LOAD_OP_DONT_CARE,   // stencilLoadOp
        VK_ATTACHMENT_STORE_OP_DONT_CARE,  // stencilStoreOp
        VK_IMAGE_LAYOUT_UNDEFINED,         // initialLayout
        VK_IMAGE_LAYOUT_PRESENT_SRC_KHR    // finalLayout
    },
    // Depth buffer
    {
        0,                                 // flags
        VK_FORMAT_D32_SFLOAT,              // format
        VK_SAMPLE_COUNT_1_BIT,             // samples
        VK_ATTACHMENT_LOAD_OP_DONT_CARE,   // loadOp
        VK_ATTACHMENT_STORE_OP_DONT_CARE,  // storeOp
        VK_ATTACHMENT_LOAD_OP_DONT_CARE,   // stencilLoadOp
        VK_ATTACHMENT_STORE_OP_DONT_CARE,  // stencilStoreOp
        VK_IMAGE_LAYOUT_UNDEFINED,         // initialLayout
        // finalLayout must not be UNDEFINED; use the layout of the last use.
        VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL // finalLayout
    },
    // G-buffer 1
    {
        0,                                 // flags
        VK_FORMAT_R32G32B32A32_UINT,       // format
        VK_SAMPLE_COUNT_1_BIT,             // samples
        VK_ATTACHMENT_LOAD_OP_DONT_CARE,   // loadOp
        VK_ATTACHMENT_STORE_OP_DONT_CARE,  // storeOp
        VK_ATTACHMENT_LOAD_OP_DONT_CARE,   // stencilLoadOp
        VK_ATTACHMENT_STORE_OP_DONT_CARE,  // stencilStoreOp
        VK_IMAGE_LAYOUT_UNDEFINED,         // initialLayout
        // finalLayout must not be UNDEFINED; use the layout of the last use.
        VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL // finalLayout
    }
};
// Depth prepass depth buffer reference (read/write)
static const VkAttachmentReference depthAttachmentReference =
{
    kAttachment_DEPTH,                                  // attachment
    VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL    // layout
};

// G-buffer attachment references (render)
static const VkAttachmentReference gBufferOutputs[] =
{
    {
        kAttachment_GBUFFER,                            // attachment
        VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL        // layout
    }
};

// Lighting input attachment references
static const VkAttachmentReference gBufferReadRef[] =
{
    // Read from g-buffer.
    {
        kAttachment_GBUFFER,                            // attachment
        VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL        // layout
    },
    // Read depth as texture.
    {
        kAttachment_DEPTH,                              // attachment
        VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL // layout
    }
};

// Final pass - back buffer render reference
static const VkAttachmentReference backBufferRenderRef[] =
{
    {
        kAttachment_BACK,                               // attachment
        VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL        // layout
    }
};

static const VkSubpassDescription subpasses[] =
{
    // Subpass 1 - depth prepass
    {
        0,                                  // flags
        VK_PIPELINE_BIND_POINT_GRAPHICS,    // pipelineBindPoint
        0,                                  // inputAttachmentCount
        nullptr,                            // pInputAttachments
        0,                                  // colorAttachmentCount
        nullptr,                            // pColorAttachments
        nullptr,                            // pResolveAttachments
        &depthAttachmentReference,          // pDepthStencilAttachment
        0,                                  // preserveAttachmentCount
        nullptr                             // pPreserveAttachments
    },
    // Subpass 2 - g-buffer generation
    {
        0,                                  // flags
        VK_PIPELINE_BIND_POINT_GRAPHICS,    // pipelineBindPoint
        0,                                  // inputAttachmentCount
        nullptr,                            // pInputAttachments
        vkcore::utils::arraysize(gBufferOutputs), // colorAttachmentCount
        gBufferOutputs,                     // pColorAttachments
        nullptr,                            // pResolveAttachments
        &depthAttachmentReference,          // pDepthStencilAttachment
        0,                                  // preserveAttachmentCount
        nullptr                             // pPreserveAttachments
    },
    // Subpass 3 - lighting
    {
        0,                                 // flags
        VK_PIPELINE_BIND_POINT_GRAPHICS,   // pipelineBindPoint
        vkcore::utils::arraysize(gBufferReadRef),  // inputAttachmentCount
        gBufferReadRef,                    // pInputAttachments
        vkcore::utils::arraysize(backBufferRenderRef),// colorAttachmentCount
        backBufferRenderRef,               // pColorAttachments
        nullptr,                           // pResolveAttachments
        nullptr,                           // pDepthStencilAttachment
        0,                                 // preserveAttachmentCount
        nullptr                            // pPreserveAttachments
    },
};

static const VkSubpassDependency dependencies[] =
{
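    // G-buffer pass depends on depth prepass. The prepass writes only
    // depth, so the dependency covers the fragment-test stages and
    // depth-stencil accesses rather than color output.
    {
        kSubpass_DEPTH,                                 // srcSubpass
        kSubpass_GBUFFER,                               // dstSubpass
        VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT |
        VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT,      // srcStageMask
        VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT |
        VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT,      // dstStageMask
        VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,   // srcAccessMask
        VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT |
        VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,   // dstAccessMask
        VK_DEPENDENCY_BY_REGION_BIT                     // dependencyFlags
    },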
    // Lighting pass depends on g-buffer.
    {
        kSubpass_GBUFFER,                               // srcSubpass
        kSubpass_LIGHTING,                              // dstSubpass
        VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,  // srcStageMask
        VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,          // dstStageMask
        VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,           // srcAccessMask
        VK_ACCESS_INPUT_ATTACHMENT_READ_BIT,            // dstAccessMask
        VK_DEPENDENCY_BY_REGION_BIT                     // dependencyFlags
    },
};

static const VkRenderPassCreateInfo renderPassCreateInfo =
{
    VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO, nullptr,
    0,                                                  // flags
    vkcore::utils::arraysize(attachments),              // attachmentCount
    attachments,                                        // pAttachments
    vkcore::utils::arraysize(subpasses),                // subpassCount
    subpasses,                                          // pSubpasses
    vkcore::utils::arraysize(dependencies),             // dependencyCount
    dependencies                                        // pDependencies
};

result = vkCreateRenderPass(device,
                            &renderPassCreateInfo,
                            nullptr,
                            &m_renderPass);

As you can see, Listing 13.1 is quite long. However, the code complexity is relatively low; most of the listing is simply definitions of static data structures describing the renderpass.

The attachments[] array contains a list of all of the attachments used in the renderpass. This is referenced by index by the arrays of VkAttachmentReference structures, depthAttachmentReference, gBufferOutputs, gBufferReadRef, and backBufferRenderRef. These reference the depth buffer, the g-buffer as an output, the g-buffer as an input, and the back buffer, respectively.

The subpasses[] array is a description of the subpasses contained in the renderpass. Each is described by an instance of the VkSubpassDescription structure, and you can see there is one for each of the depth prepass, g-buffer generation, and lighting passes.

Note that for the lighting pass, we include the g-buffer read references and the depth buffer as input attachments to the pass. This is so that the lighting computations performed in the shader can read the g-buffer content and the pixels’ depth value.

Finally, we see the dependencies between the passes described in the dependencies[] array. There are two entries in the array, the first describing the dependency of the g-buffer pass on the depth prepass, and the second describing the dependency of the lighting pass on the g-buffer pass. Note that there is no reason to have a dependency between the lighting pass and the depth prepass even though one technically exists, because there is already an implicit dependency through the g-buffer generation pass.
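The idea of an implicit dependency through an intermediate subpass can be illustrated with a small reachability check over the (srcSubpass, dstSubpass) pairs. This is only a sketch of the graph reasoning: SubpassEdge and dependsOn() are hypothetical names, and whether a real chain of dependencies synchronizes correctly also depends on the stage and access masks, which this model ignores.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for the (srcSubpass, dstSubpass) pair of a
// VkSubpassDependency; only the graph edges matter for this sketch.
using SubpassEdge = std::pair<uint32_t, uint32_t>;

// Returns true if 'dst' can be reached from 'src' by following edges,
// that is, if 'dst' depends on 'src' directly or transitively.
bool dependsOn(const std::vector<SubpassEdge>& edges,
               uint32_t subpassCount,
               uint32_t src,
               uint32_t dst)
{
    assert(src < subpassCount && dst < subpassCount);
    std::vector<bool> visited(subpassCount, false);
    std::vector<uint32_t> stack{src};
    visited[src] = true;

    while (!stack.empty())
    {
        const uint32_t current = stack.back();
        stack.pop_back();
        for (const SubpassEdge& edge : edges)
        {
            // Skip edges that do not start here, are out of range,
            // or lead to a subpass we have already explored.
            if (edge.first != current || edge.second >= subpassCount ||
                visited[edge.second])
            {
                continue;
            }
            if (edge.second == dst)
            {
                return true;
            }
            visited[edge.second] = true;
            stack.push_back(edge.second);
        }
    }
    return false;
}
```

With the two edges from Listing 13.1 (depth prepass to g-buffer, g-buffer to lighting), the lighting subpass is reachable from the depth prepass even though no edge connects them directly.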

The subpasses inside a renderpass are logically executed in the order in which they are declared in the array of subpasses referenced by the VkRenderPassCreateInfo structure used to create the renderpass object. When vkCmdBeginRenderPass() is called, the first subpass in the array is automatically begun. In simple renderpasses with a single subpass, this is sufficient to execute the entire renderpass. However, once we have multiple subpasses, we need to be able to tell Vulkan when to move from subpass to subpass.

To do this, we call vkCmdNextSubpass(), the prototype of which is

void vkCmdNextSubpass (
    VkCommandBuffer                         commandBuffer,
    VkSubpassContents                       contents);

The command buffer in which to place the command is specified in commandBuffer. The contents parameter specifies where the commands for the subpass will come from. For now, we’ll set this to VK_SUBPASS_CONTENTS_INLINE, which indicates that you will continue to add commands to the same command buffer. It’s also possible to call other command buffers, in which case we’d set this to VK_SUBPASS_CONTENTS_SECONDARY_COMMAND_BUFFERS. We’ll cover this scenario later in this chapter.

When vkCmdNextSubpass() is called, the current command buffer moves to the next subpass in the current renderpass. Consequently, you can call vkCmdNextSubpass() only between calls to vkCmdBeginRenderPass() and vkCmdEndRenderPass(), and only until you have exhausted the subpasses within the renderpass.

With renderpasses containing multiple subpasses, we still must call vkCmdEndRenderPass() to terminate the current renderpass and finalize rendering.
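The ordering rules above can be summarized in a small model that logs the commands instead of issuing them. CommandLog and recordRenderPass() are hypothetical stand-ins so that the sketch runs without a Vulkan device; in real code each logged name would be the corresponding call on a VkCommandBuffer, with VK_SUBPASS_CONTENTS_INLINE passed as the contents parameter.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a command buffer: it only logs command names.
struct CommandLog
{
    std::vector<std::string> calls;
};

// Records one renderpass instance with 'subpassCount' subpasses.
// Beginning the renderpass starts subpass 0, every later subpass needs
// exactly one vkCmdNextSubpass(), and vkCmdEndRenderPass() is always
// required, so a renderpass with N subpasses uses N - 1 "next" calls.
void recordRenderPass(CommandLog& log, uint32_t subpassCount)
{
    assert(subpassCount >= 1);
    log.calls.push_back("vkCmdBeginRenderPass");
    for (uint32_t subpass = 0; subpass < subpassCount; ++subpass)
    {
        if (subpass > 0)
        {
            log.calls.push_back("vkCmdNextSubpass");
        }
        // Placeholder for the draws that belong to this subpass.
        log.calls.push_back("draw subpass " + std::to_string(subpass));
    }
    log.calls.push_back("vkCmdEndRenderPass");
}
```

For the three-subpass deferred renderer, this produces one begin, two next-subpass transitions, and one end, with the draws for each subpass in between.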

Attachment Contents

Each color and depth-stencil attachment associated with a renderpass has a load operation and a store operation that determine how its contents are loaded from and stored to memory as the renderpass is begun and ended.

Attachment Initialization

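When the renderpass is begun, the operation that should be performed on each attachment is specified in the loadOp field of the VkAttachmentDescription structure describing the attachment. There are three possible values for this field. The first, VK_ATTACHMENT_LOAD_OP_LOAD, preserves the existing contents of the attachment so that rendering continues on top of whatever the image already holds. The other two, which let the implementation skip reading the old contents, are used throughout this chapter’s examples.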

VK_ATTACHMENT_LOAD_OP_DONT_CARE means that you don’t care about the initial contents of the attachment. This means that Vulkan can do whatever it needs to do to get the attachment ready to render into (including doing nothing), without worrying about the actual values in the attachment. For example, if it has a super-fast clear that clears only to purple, then purple it shall be.

Setting loadOp for the attachment to VK_ATTACHMENT_LOAD_OP_CLEAR means that the attachment will be cleared to a value you specify at vkCmdBeginRenderPass() time. While logically, this operation happens at the very start of the renderpass, in practice, implementations may delay the actual clear operation to the beginning of the first pass that uses the attachment. This is the preferred method of clearing color attachments.

It’s also possible to explicitly clear one or more color or depth-stencil attachments inside a renderpass by calling vkCmdClearAttachments(), the prototype of which is

void vkCmdClearAttachments (
    VkCommandBuffer                        commandBuffer,
    uint32_t                               attachmentCount,
    const VkClearAttachment*               pAttachments,
    uint32_t                               rectCount,
    const VkClearRect*                     pRects);

The command buffer that will execute the command is specified in commandBuffer. A single call to vkCmdClearAttachments() can clear the contents of one or more attachments. The number of attachments to clear is specified in attachmentCount, and pAttachments should be a pointer to an array of attachmentCount VkClearAttachment structures, each defining one of the attachments to clear. The definition of VkClearAttachment is

typedef struct VkClearAttachment {
    VkImageAspectFlags    aspectMask;
    uint32_t              colorAttachment;
    VkClearValue          clearValue;
} VkClearAttachment;

The aspectMask field of VkClearAttachment specifies the aspect or aspects of the attachment to be cleared. If aspectMask contains VK_IMAGE_ASPECT_DEPTH_BIT, VK_IMAGE_ASPECT_STENCIL_BIT, or both, then the clear operation will be applied to the depth-stencil attachment for the current subpass. Each subpass can have at most one depth-stencil attachment. If aspectMask contains VK_IMAGE_ASPECT_COLOR_BIT, then the clear operation will be applied to the color attachment at index colorAttachment in the current subpass. It is not possible to clear a color attachment and a depth-stencil attachment with a single VkClearAttachment structure, so aspectMask should not contain VK_IMAGE_ASPECT_COLOR_BIT along with VK_IMAGE_ASPECT_DEPTH_BIT or VK_IMAGE_ASPECT_STENCIL_BIT.
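The aspectMask rule can be expressed as a small classification function. The sketch below uses stand-in constants carrying the same values as the Vulkan VK_IMAGE_ASPECT_* bits so that it compiles without the Vulkan headers; ClearTarget and classifyClearAspect() are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in constants with the same values as the corresponding
// Vulkan aspect bits, so the sketch does not need vulkan.h.
const uint32_t kAspectColor   = 0x1; // VK_IMAGE_ASPECT_COLOR_BIT
const uint32_t kAspectDepth   = 0x2; // VK_IMAGE_ASPECT_DEPTH_BIT
const uint32_t kAspectStencil = 0x4; // VK_IMAGE_ASPECT_STENCIL_BIT

// Which attachment a VkClearAttachment with this aspectMask targets.
enum class ClearTarget
{
    Invalid,      // no aspect set, or color mixed with depth/stencil
    Color,        // clears the color attachment at index colorAttachment
    DepthStencil  // clears the subpass's depth-stencil attachment
};

inline ClearTarget classifyClearAspect(uint32_t aspectMask)
{
    const bool color = (aspectMask & kAspectColor) != 0;
    const bool depthStencil =
        (aspectMask & (kAspectDepth | kAspectStencil)) != 0;
    if (color == depthStencil)
    {
        // Either nothing was requested or color was combined with
        // depth/stencil; neither is a usable clear.
        return ClearTarget::Invalid;
    }
    return color ? ClearTarget::Color : ClearTarget::DepthStencil;
}
```

A check like this can be useful in debug builds before filling in an array of VkClearAttachment structures.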

The values to clear the attachment with are specified in the clearValue field, which is an instance of the VkClearValue union. This was introduced in Chapter 8, “Drawing,” and its definition is

typedef union VkClearValue {
    VkClearColorValue            color;
    VkClearDepthStencilValue     depthStencil;
} VkClearValue;

If the referenced attachment is a color attachment, then the values from the color field of the VkClearValue union will be used. Otherwise, the values contained in the depthStencil field of the union will be used.

In addition to clearing multiple attachments, a single call to vkCmdClearAttachments() can clear rectangular regions of each attachment. This provides additional functionality over setting the loadOp for the attachment to VK_ATTACHMENT_LOAD_OP_CLEAR. If an attachment is cleared with VK_ATTACHMENT_LOAD_OP_CLEAR (which is what you want in the majority of cases), the whole attachment is cleared, and there is no opportunity to clear only part of the attachment. However, when you use vkCmdClearAttachments(), multiple smaller regions can be cleared.

The number of regions to clear is specified in the rectCount parameter to vkCmdClearAttachments(), and the pRects parameter is a pointer to an array of rectCount VkClearRect structures, each defining one of the rectangles to clear. The definition of VkClearRect is

typedef struct VkClearRect {
    VkRect2D    rect;
    uint32_t    baseArrayLayer;
    uint32_t    layerCount;
} VkClearRect;

The VkClearRect structure defines more than a rectangle. The rect field contains the actual rectangle to clear. If the attachment is an array image, then some or all of its layers can be cleared by specifying the range of layers in baseArrayLayer and layerCount, which contain, respectively, the index of the first layer to clear and the number of layers to clear.

In addition to containing more information and providing more functionality than the attachment’s load operation, vkCmdClearAttachments() also potentially provides more convenience than vkCmdClearColorImage() or vkCmdClearDepthStencilImage(). Both of these commands require that the image be in VK_IMAGE_LAYOUT_GENERAL or VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL layout and therefore may require a pipeline barrier around them to ensure this. Also, these commands cannot be called inside a renderpass. On the other hand, vkCmdClearAttachments() takes advantage of the fact that the attachments are already bound for rendering and are considered to be part of the renderpass content, almost like a special kind of draw. Therefore, no barrier or special handling is needed beyond ensuring that the command is executed inside a renderpass.

That said, it is still recommended that you use the VK_ATTACHMENT_LOAD_OP_CLEAR load operation when you need the entirety of an attachment to be cleared as part of a renderpass, and the VK_ATTACHMENT_LOAD_OP_DONT_CARE operation when you will guarantee that you will overwrite every pixel in the attachment by the time the renderpass has completed.

Render Areas

When a renderpass instance is executed, it is possible to tell Vulkan that you’re going to update only a small area of the attachments. This area is known as the render area and can be specified when vkCmdBeginRenderPass() is called. We introduced this briefly in Chapter 8, “Drawing”: the render area is the renderArea member of the VkRenderPassBeginInfo structure passed to vkCmdBeginRenderPass().

If you are rendering to the entire framebuffer, set the renderArea field to cover the entire area of the framebuffer. However, if you want to update only a small part of the framebuffer, you can set the renderArea rectangle accordingly. Any part of the renderpass’s attachments that is not contained inside this render area is not affected by any of the operations in the renderpass, including the renderpass’s load and store operations for those attachments.

When you are using a render area smaller than the entire attachment, it is the application’s responsibility to ensure that it doesn’t render outside this area. Some implementations may ignore the render area entirely and trust your application to stick within it; some may round the render area up to some multiple of an internal rectangular region; and some may strictly adhere to the area you’ve specified. The only way to get well-defined behavior is to make sure you render only inside this area, using a scissor test if needed.

Rendering to a smaller render area than the entire attachment may also come at some performance cost unless the area matches the granularity of the supported render areas for the implementation. Consider the framebuffer to be tiles in a grid that are potentially rendered one at a time. Completely covering and redefining the content of a single tile should be fast, but updating only part of a tile may cause Vulkan to do extra work to keep the untouched parts of the tile well defined.

To achieve maximum performance, you should ensure that the render areas you use match the render area granularity supported by the implementation. This can be queried using vkGetRenderAreaGranularity(), the prototype of which is

void vkGetRenderAreaGranularity (
    VkDevice                            device,
    VkRenderPass                        renderPass,
    VkExtent2D*                         pGranularity);

For a renderpass specified in renderPass, vkGetRenderAreaGranularity() returns, in the variable pointed to by pGranularity, the dimensions of a tile used for rendering. The device that owns the renderpass should be passed in device.

To ensure that the render area you pass to vkCmdBeginRenderPass() performs optimally, you should ensure two things: First, that the x and y components of its origin are integer multiples of the width and height of the render-area granularity; and second, that the width and height of the render area are either integer multiples of that granularity or extend to the edge of the framebuffer. Obviously, a render area that completely covers the attachments trivially meets these requirements.
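The two alignment conditions can be applied mechanically by growing a requested rectangle outward until it meets them. The following is a minimal sketch under some assumptions: Extent2D and Rect2D are simplified stand-ins for VkExtent2D and VkRect2D (the real VkRect2D uses a signed offset), and alignRenderArea() is a hypothetical helper, not a Vulkan function. The granularity is assumed to be the value returned by vkGetRenderAreaGranularity().

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Simplified stand-ins for VkExtent2D and VkRect2D.
struct Extent2D
{
    uint32_t width;
    uint32_t height;
};

struct Rect2D
{
    uint32_t x;
    uint32_t y;
    uint32_t width;
    uint32_t height;
};

// Round 'value' down or up to a multiple of 'step' (step must be nonzero).
inline uint32_t alignDown(uint32_t value, uint32_t step)
{
    return (value / step) * step;
}

inline uint32_t alignUp(uint32_t value, uint32_t step)
{
    return alignDown(value + step - 1, step);
}

// Expands 'area' so that its origin is a multiple of the granularity and
// its far edges are either multiples of the granularity or clamped to the
// edge of the framebuffer.
Rect2D alignRenderArea(const Rect2D& area,
                       const Extent2D& granularity,
                       const Extent2D& framebuffer)
{
    assert(granularity.width > 0 && granularity.height > 0);
    assert(area.x + area.width <= framebuffer.width);
    assert(area.y + area.height <= framebuffer.height);

    const uint32_t x0 = alignDown(area.x, granularity.width);
    const uint32_t y0 = alignDown(area.y, granularity.height);
    const uint32_t x1 = std::min(alignUp(area.x + area.width, granularity.width),
                                 framebuffer.width);
    const uint32_t y1 = std::min(alignUp(area.y + area.height, granularity.height),
                                 framebuffer.height);
    return Rect2D{x0, y0, x1 - x0, y1 - y0};
}
```

Keep in mind that enlarging the render area also enlarges the region touched by the load and store operations, so this trade-off is worthwhile only when the extra pixels can safely be cleared or overwritten.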

Preserving Attachment Content

In order to preserve the contents of the attachment, we need to set the attachment’s store operation (contained in the storeOp field of the VkAttachmentDescription structure used to create the renderpass) to VK_ATTACHMENT_STORE_OP_STORE. This causes Vulkan to ensure that after the renderpass has completed, the contents of the image used as the attachment accurately reflect what was rendered during the renderpass.

In core Vulkan 1.0, the only other choice for this field is VK_ATTACHMENT_STORE_OP_DONT_CARE, which tells Vulkan that you don’t need the content of the attachment after the renderpass has completed. This is used, for example, when an attachment is used to store intermediate data that will be consumed by some later subpass in the same renderpass. In this case, the content doesn’t need to live longer than the renderpass itself.

In some cases, you need to produce content in one subpass, execute an unrelated subpass, and then consume the content created more than one subpass ago. In this case, you should tell Vulkan that it cannot discard the content of an attachment over the course of rendering another subpass. In practice, a Vulkan implementation should be able to tell by inspecting the input and output attachments for the subpasses in a renderpass which ones produce and which ones consume data and will do the right thing. However, to be fully correct, every live attachment should appear as an input, an output, or a preserve attachment in each subpass. Furthermore, by including an attachment in the preserve attachment array for a subpass, you are telling Vulkan that you are about to use the attachment content in an upcoming subpass. This may enable it to keep some of the data in cache or some other high-speed memory.

The list of attachments to preserve across a subpass is specified using the pPreserveAttachments field of the VkSubpassDescription structure describing each subpass. This is a pointer to an array of uint32_t indices into the renderpass’s attachment list, and the number of integers in this array is contained in the preserveAttachmentCount field of VkSubpassDescription.
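One way to derive the preserve lists is to note, for each attachment, the first and last subpass that reference it, and then mark it as preserved in every subpass in between that does not reference it. The sketch below is an assumption-laden model rather than part of any Vulkan API: computePreserveLists() is a hypothetical helper, and it treats input, color, and depth-stencil references alike as a "use."

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// usedBy[s] lists the attachment indices referenced by subpass s in any
// role. The result holds, for each subpass, the attachment indices that
// would go in its pPreserveAttachments array.
std::vector<std::vector<uint32_t>> computePreserveLists(
    const std::vector<std::vector<uint32_t>>& usedBy,
    uint32_t attachmentCount)
{
    const std::size_t subpassCount = usedBy.size();
    std::vector<std::vector<uint32_t>> preserve(subpassCount);

    for (uint32_t attachment = 0; attachment < attachmentCount; ++attachment)
    {
        // Find the first and last subpass that reference this attachment.
        bool used = false;
        std::size_t first = 0;
        std::size_t last = 0;
        std::vector<bool> usedHere(subpassCount, false);
        for (std::size_t s = 0; s < subpassCount; ++s)
        {
            const std::vector<uint32_t>& refs = usedBy[s];
            if (std::find(refs.begin(), refs.end(), attachment) != refs.end())
            {
                usedHere[s] = true;
                if (!used)
                {
                    first = s;
                    used = true;
                }
                last = s;
            }
        }
        if (!used)
        {
            continue;
        }
        // The attachment is live between its first and last use; preserve
        // it in every subpass in that range that does not touch it.
        for (std::size_t s = first + 1; s < last; ++s)
        {
            if (!usedHere[s])
            {
                preserve[s].push_back(attachment);
            }
        }
    }
    return preserve;
}
```

For example, if a translucency buffer is written in one subpass and consumed three subpasses later, the two subpasses in between each receive that attachment's index in their preserve list.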

To demonstrate this, we extend our example further to render transparent objects. So far, we render a depth buffer, then render a g-buffer containing per-pixel information, and finally render a shading pass that calculates lighting information. Because the g-buffer holds only one surface’s worth of information at each pixel, this deferred shading approach cannot render transparent or translucent objects. Therefore, these objects must be rendered separately. The traditional approach is to simply render all the opaque geometry in one pass (or passes) and then composite the translucent geometry on top at the end. This introduces a serial dependency, causing rendering of the translucent geometry to wait for opaque geometry to complete rendering.

This serial dependency is shown in the DAG illustrated in Figure 13.2.

Figure 13.2: Serial Dependency of Translucent on Opaque Geometry

Rather than introduce a serial dependency, which would preclude Vulkan from rendering any of the translucent geometry in parallel with opaque geometry, we take another approach: Render the translucent geometry to another color attachment (using the same depth prepass information for depth rejection) and the opaque geometry into a second temporary attachment. After both the opaque and transparent geometry have been rendered, we perform a composition pass that blends the translucent geometry on top of the opaque geometry. This pass can also perform other per-pixel operations, such as color grading, vignetting, film-grain application, and so on. The new DAG for this is shown in Figure 13.3.

Figure 13.3: Parallel Rendering of Translucent and Opaque Geometry

As you can see from the updated DAG in Figure 13.3, first the depth information is rendered and then the g-buffer generation pass executes, followed by the lighting pass. The translucency buffer generation pass has no dependency on the g-buffer or the result of the lighting pass, so it is able to run in parallel, depending only on the depth information from the depth prepass. The new composite pass now depends on the result of the lighting pass and the translucent pass.

Because the subpasses in a renderpass are expressed serially, one of the opaque g-buffer pass and the translucency pass must be declared before the other. Whichever comes first, we need to preserve the content of its outputs until the shading pass is able to execute. Because there is less data to store, we render the translucent objects first and preserve the translucency buffer across the g-buffer generation pass. The g-buffer and translucency buffer are then used as input attachments to the shading pass.

The code to set all this up is shown in Listing 13.2.

Listing 13.2: Translucency and Deferred Shading Setup

enum
{
    kAttachment_BACK         = 0,
    kAttachment_DEPTH        = 1,
    kAttachment_GBUFFER      = 2,
    kAttachment_TRANSLUCENCY = 3,
    kAttachment_OPAQUE       = 4
};

enum
{
    kSubpass_DEPTH           = 0,
    kSubpass_GBUFFER         = 1,
    kSubpass_LIGHTING        = 2,
    kSubpass_TRANSLUCENTS    = 3,
    kSubpass_COMPOSITE       = 4
};

static const VkAttachmentDescription attachments[] =
{
    // Back buffer
    {
        0,                                  // flags
        VK_FORMAT_R8G8B8A8_UNORM,           // format
        VK_SAMPLE_COUNT_1_BIT,              // samples
        VK_ATTACHMENT_LOAD_OP_DONT_CARE,    // loadOp
        VK_ATTACHMENT_STORE_OP_STORE,       // storeOp
        VK_ATTACHMENT_LOAD_OP_DONT_CARE,    // stencilLoadOp
        VK_ATTACHMENT_STORE_OP_DONT_CARE,   // stencilStoreOp
        VK_IMAGE_LAYOUT_UNDEFINED,          // initialLayout
        VK_IMAGE_LAYOUT_PRESENT_SRC_KHR     // finalLayout
    },
    // Depth buffer
    {
        0,                                  // flags
        VK_FORMAT_D32_SFLOAT,               // format
        VK_SAMPLE_COUNT_1_BIT,              // samples
        VK_ATTACHMENT_LOAD_OP_DONT_CARE,    // loadOp
        VK_ATTACHMENT_STORE_OP_DONT_CARE,   // storeOp
        VK_ATTACHMENT_LOAD_OP_DONT_CARE,    // stencilLoadOp
        VK_ATTACHMENT_STORE_OP_DONT_CARE,   // stencilStoreOp
        VK_IMAGE_LAYOUT_UNDEFINED,          // initialLayout
        // finalLayout must not be UNDEFINED; use the layout of the last use.
        VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL // finalLayout
    },
    // G-buffer 1
    {
        0,                                  // flags
        VK_FORMAT_R32G32B32A32_UINT,        // format
        VK_SAMPLE_COUNT_1_BIT,              // samples
        VK_ATTACHMENT_LOAD_OP_DONT_CARE,    // loadOp
        VK_ATTACHMENT_STORE_OP_DONT_CARE,   // storeOp
        VK_ATTACHMENT_LOAD_OP_DONT_CARE,    // stencilLoadOp
        VK_ATTACHMENT_STORE_OP_DONT_CARE,   // stencilStoreOp
        VK_IMAGE_LAYOUT_UNDEFINED,          // initialLayout
        // finalLayout must not be UNDEFINED; use the last layout in the pass.
        VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL // finalLayout
    },
    // Translucency buffer
    {
        0,                                  // flags
        VK_FORMAT_R8G8B8A8_UNORM,           // format
        VK_SAMPLE_COUNT_1_BIT,              // samples
        VK_ATTACHMENT_LOAD_OP_DONT_CARE,    // loadOp
        VK_ATTACHMENT_STORE_OP_DONT_CARE,   // storeOp
        VK_ATTACHMENT_LOAD_OP_DONT_CARE,    // stencilLoadOp
        VK_ATTACHMENT_STORE_OP_DONT_CARE,   // stencilStoreOp
        VK_IMAGE_LAYOUT_UNDEFINED,          // initialLayout
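        // finalLayout must not be UNDEFINED; use the last layout in the pass.
        VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL // finalLayout
    },
    // Opaque buffer (kAttachment_OPAQUE). The format here is an assumption;
    // a higher-precision format may be preferable for HDR rendering.
    {
        0,                                  // flags
        VK_FORMAT_R8G8B8A8_UNORM,           // format
        VK_SAMPLE_COUNT_1_BIT,              // samples
        VK_ATTACHMENT_LOAD_OP_DONT_CARE,    // loadOp
        VK_ATTACHMENT_STORE_OP_DONT_CARE,   // storeOp
        VK_ATTACHMENT_LOAD_OP_DONT_CARE,    // stencilLoadOp
        VK_ATTACHMENT_STORE_OP_DONT_CARE,   // stencilStoreOp
        VK_IMAGE_LAYOUT_UNDEFINED,          // initialLayout
        VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL // finalLayout
    }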
};

// Depth prepass depth buffer reference (read/write)
static const VkAttachmentReference depthAttachmentReference =
{
    kAttachment_DEPTH,                                  // attachment
    VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL    // layout
};

// G-buffer attachment references (render)
static const VkAttachmentReference gBufferOutputs[] =
{
    {
        kAttachment_GBUFFER,                            // attachment
        VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL        // layout
    }
};

// Lighting input attachment references
static const VkAttachmentReference gBufferReadRef[] =
{
    // Read from g-buffer.
    {
        kAttachment_GBUFFER,                            // attachment
        VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL        // layout
    },
    // Read depth as texture.
    {
        kAttachment_DEPTH,                              // attachment
        VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL // layout
    }
};

// Lighting pass - write to opaque buffer.
static const VkAttachmentReference opaqueWrite[] =
{
    // Write to opaque buffer.
    {
        kAttachment_OPAQUE,                             // attachment
        VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL        // layout
    }
};

// Translucency rendering pass - translucency buffer write
static const VkAttachmentReference translucentWrite[] =
{
    // Write to translucency buffer.
    {
        kAttachment_TRANSLUCENCY,                       // attachment
        VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL        // layout
    }
};

static const VkAttachmentReference compositeInputs[] =
{
    // Read from translucency buffer.
    {
        kAttachment_TRANSLUCENCY,                       // attachment
        VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL        // layout
    },
    // Read from opaque buffer.
    {
        kAttachment_OPAQUE,                             // attachment
        VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL        // layout
    }
};

// Final pass - back buffer render reference
static const VkAttachmentReference backBufferRenderRef[] =
{
    {
        kAttachment_BACK,                               // attachment
        VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL        // layout
    }
};

static const VkSubpassDescription subpasses[] =
{
    // Subpass 1 - depth prepass
    {
        0,                                  // flags
        VK_PIPELINE_BIND_POINT_GRAPHICS,    // pipelineBindPoint
        0,                                  // inputAttachmentCount
        nullptr,                            // pInputAttachments
        0,                                  // colorAttachmentCount
        nullptr,                            // pColorAttachments
        nullptr,                            // pResolveAttachments
        &depthAttachmentReference,          // pDepthStencilAttachment
        0,                                  // preserveAttachmentCount
        nullptr                             // pPreserveAttachments
    },
    // Subpass 2 - g-buffer generation
    {
        0,                                  // flags
        VK_PIPELINE_BIND_POINT_GRAPHICS,    // pipelineBindPoint
        0,                                  // inputAttachmentCount
        nullptr,                            // pInputAttachments
        vkcore::utils::arraysize(gBufferOutputs),  // colorAttachmentCount
        gBufferOutputs,                     // pColorAttachments
        nullptr,                            // pResolveAttachments
        &depthAttachmentReference,          // pDepthStencilAttachment
        0,                                  // preserveAttachmentCount
        nullptr                             // pPreserveAttachments
    },
    // Subpass 3 - lighting
    {
        0,                                  // flags
        VK_PIPELINE_BIND_POINT_GRAPHICS,    // pipelineBindPoint
        vkcore::utils::arraysize(gBufferReadRef), // inputAttachmentCount
        gBufferReadRef,                     // pInputAttachments
        vkcore::utils::arraysize(opaqueWrite), // colorAttachmentCount
        opaqueWrite,                        // pColorAttachments
        nullptr,                            // pResolveAttachments
        nullptr,                            // pDepthStencilAttachment
        0,                                  // preserveAttachmentCount
        nullptr                             // pPreserveAttachments
    },
    // Subpass 4 - translucent objects
    {
        0,                                  // flags
        VK_PIPELINE_BIND_POINT_GRAPHICS,    // pipelineBindPoint
        0,                                  // inputAttachmentCount
        nullptr,                            // pInputAttachments
        vkcore::utils::arraysize(translucentWrite), // colorAttachmentCount
        translucentWrite,                   // pColorAttachments
        nullptr,                            // pResolveAttachments
        nullptr,                            // pDepthStencilAttachment
        0,                                  // preserveAttachmentCount
        nullptr                             // pPreserveAttachments
    },
    // Subpass 5 - composite
    {
        0,                                  // flags
        VK_PIPELINE_BIND_POINT_GRAPHICS,    // pipelineBindPoint
        vkcore::utils::arraysize(compositeInputs), // inputAttachmentCount
        compositeInputs,                    // pInputAttachments
        vkcore::utils::arraysize(backBufferRenderRef), // colorAttachmentCount
        backBufferRenderRef,                // pColorAttachments
        nullptr,                            // pResolveAttachments
        nullptr,                            // pDepthStencilAttachment
        0,                                  // preserveAttachmentCount
        nullptr                             // pPreserveAttachments
    }
};

static const VkSubpassDependency dependencies[] =
{
    // G-buffer pass depends on depth prepass.
    {
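        // The prepass writes only depth, so this dependency synchronizes
        // the depth tests rather than color attachment output.
        kSubpass_DEPTH,                                 // srcSubpass
        kSubpass_GBUFFER,                               // dstSubpass
        VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT |
        VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT,      // srcStageMask
        VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT |
        VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT,      // dstStageMask
        VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,   // srcAccessMask
        VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT |
        VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,   // dstAccessMask
        VK_DEPENDENCY_BY_REGION_BIT                     // dependencyFlags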
    },
    // Lighting pass depends on g-buffer.
    {
        kSubpass_GBUFFER,                               // srcSubpass
        kSubpass_LIGHTING,                              // dstSubpass
        VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,  // srcStageMask
        VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,          // dstStageMask
        VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,           // srcAccessMask
        VK_ACCESS_SHADER_READ_BIT,                      // dstAccessMask
        VK_DEPENDENCY_BY_REGION_BIT                     // dependencyFlags
    },
    // Composite pass depends on translucent pass.
    {
        kSubpass_TRANSLUCENTS,                          // srcSubpass
        kSubpass_COMPOSITE,                             // dstSubpass
        VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,  // srcStageMask
        VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,          // dstStageMask
        VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,           // srcAccessMask
        VK_ACCESS_SHADER_READ_BIT,                      // dstAccessMask
        VK_DEPENDENCY_BY_REGION_BIT                     // dependencyFlags
    },
    // Composite pass also depends on lighting.
    {
        kSubpass_LIGHTING,                              // srcSubpass
        kSubpass_COMPOSITE,                             // dstSubpass
        VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,  // srcStageMask
        VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,          // dstStageMask
        VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,           // srcAccessMask
        VK_ACCESS_SHADER_READ_BIT,                      // dstAccessMask
        VK_DEPENDENCY_BY_REGION_BIT                     // dependencyFlags
    }
};

static const VkRenderPassCreateInfo renderPassCreateInfo =
{
    VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO, nullptr,
    0,                                                  // flags
    vkcore::utils::arraysize(attachments),              // attachmentCount
    attachments,                                        // pAttachments
    vkcore::utils::arraysize(subpasses),                // subpassCount
    subpasses,                                          // pSubpasses
    vkcore::utils::arraysize(dependencies),             // dependencyCount
    dependencies                                        // pDependencies
};

result = vkCreateRenderPass(device,
                            &renderPassCreateInfo,
                            nullptr,
                            &m_renderPass);

Again, the code in Listing 13.2 is extremely long but is mostly a set of constant data structures. Here, we’ve added the translucent pass and the composite pass to the list of passes in subpasses[]. The final pass is now the composite pass, so it is the one that references the back buffer in its pColorAttachments array. The result of the lighting pass is now written to the temporary opaque buffer, indexed by kAttachment_OPAQUE.
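
Listing 13.2 calls vkcore::utils::arraysize(), which is not shown in this section. The following is a minimal sketch of what such a helper might look like, assuming it simply returns the element count of a statically sized array as the uint32_t that Vulkan count fields expect. The namespace matches the listing, but the body is an assumption, not the book's actual implementation.

```cpp
#include <cstddef>
#include <cstdint>

namespace vkcore {
namespace utils {

// Assumed behavior: element count of a statically sized array. Passing a
// pointer instead of an array fails to compile, which catches a common
// mistake when filling in count fields.
template <typename T, std::size_t N>
constexpr uint32_t arraysize(const T (&)[N]) noexcept
{
    return static_cast<uint32_t>(N);
}

} // namespace utils
} // namespace vkcore

// Example: the count is derived from the array, so the two cannot drift apart.
static const int kExampleValues[] = { 10, 20, 30 };
static_assert(vkcore::utils::arraysize(kExampleValues) == 3,
              "arraysize should report three elements");
```

Deriving every count from its array in this way means that adding an entry to attachments[], subpasses[], or dependencies[] automatically updates the corresponding count in VkRenderPassCreateInfo.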

Although this appears to consume a significant amount of memory, we can note several redeeming points about this configuration:

• There are likely to be at least two or three back buffers, while you can use the same buffer every frame for intermediate results. The additional overhead of one extra buffer is not that large.

• The lighting pass consumes the g-buffer, after which it is no longer needed. You can either write the opaque result back into the g-buffer or mark the attachment as transient and hope that the Vulkan implementation does this for you.

• If you are rendering high dynamic range, then you may want your rendering results to be in a higher-precision format than the back buffer anyway and then perform tone mapping or other processing during the composition pass. In this case, you’ll need the intermediate buffers.

Secondary Command Buffers

Secondary command buffers are command buffers that can be called from primary command buffers. Although not directly related to multipass rendering, they are used primarily to allow the commands contributing to a render consisting of many subpasses to be built up in multiple command buffers. As you know, a renderpass must begin and end in the same command buffer. That is, the call to vkCmdEndRenderPass() must appear in the same command buffer as its corresponding vkCmdBeginRenderPass().

Given this requirement, it’s very difficult to render a large amount of the scene in a single renderpass and still build command buffers in parallel. In the ideal case (from an implementation point of view), the entire scene will be rendered in a single large renderpass, with potentially many subpasses. Without secondary command buffers, this would require most, if not all, of a scene to be rendered using a single long command buffer, precluding parallelized command generation.
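
To picture the parallel command generation that this enables, consider dividing the draws of one subpass among several worker threads, each of which records its share into its own secondary command buffer. The sketch below shows only the partitioning step. DrawRange and splitDraws are hypothetical names, and the even split is an assumption; a real application might weight the split by the expected cost of each draw.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical description of the draws assigned to one worker.
struct DrawRange
{
    uint32_t first; // index of the first draw for this worker
    uint32_t count; // number of consecutive draws to record
};

// Splits drawCount draws as evenly as possible across at most workerCount
// workers. Any remainder is handed out one draw at a time to the first
// workers, so range sizes differ by at most one.
std::vector<DrawRange> splitDraws(uint32_t drawCount, uint32_t workerCount)
{
    std::vector<DrawRange> ranges;
    if (drawCount == 0 || workerCount == 0)
        return ranges;

    const uint32_t workers = std::min(drawCount, workerCount);
    const uint32_t base    = drawCount / workers;
    const uint32_t extra   = drawCount % workers;

    uint32_t first = 0;
    for (uint32_t i = 0; i < workers; ++i)
    {
        const uint32_t count = base + (i < extra ? 1u : 0u);
        ranges.push_back({ first, count });
        first += count;
    }
    return ranges;
}
```

Each worker would then record its range into a secondary command buffer allocated from its own command pool, because a command pool must not be used from more than one thread at a time.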

To create a secondary command buffer, create a command pool and then from it allocate one or more command buffers. In the VkCommandBufferAllocateInfo structure passed to vkAllocateCommandBuffers(), set the level field to VK_COMMAND_BUFFER_LEVEL_SECONDARY. We then record commands into the command buffer as usual, but with certain restrictions as to which commands can be executed. A table listing which commands may and may not be recorded in secondary command buffers is shown in the Appendix, “Vulkan Functions.”

When the secondary command buffer is ready to be executed from a primary command buffer, call vkCmdExecuteCommands(), the prototype of which is

void vkCmdExecuteCommands (
    VkCommandBuffer                    commandBuffer,
    uint32_t                           commandBufferCount,
    const VkCommandBuffer*             pCommandBuffers);

The command buffer from which to call the secondary command buffers is passed in commandBuffer. A single call to vkCmdExecuteCommands() can execute many secondary-level command buffers. The number of command buffers to execute is passed in commandBufferCount, and pCommandBuffers should point to an array of this many VkCommandBuffer handles to the command buffers to execute.

Vulkan command buffers contain a certain amount of state. In particular, the currently bound pipeline, the various dynamic states, and the currently bound descriptor sets are effectively properties of each command buffer. When multiple command buffers are executed back to back, even when sent to the same call to vkQueueSubmit(), no state is inherited from one to the next. That is, the initial state of each command buffer is undefined, even if the previously executed command buffer left everything as it needs to be for the next.

When you are executing a large, complex command buffer, it’s probably fine to begin with an undefined state because the first thing you’ll do in the command buffer is set up everything required for the first few drawing commands. That cost is likely to be small relative to the cost of the whole command buffer, even if it’s partially redundant with respect to previously executed command buffers.

When a primary command buffer calls a secondary command buffer, and especially when a primary command buffer calls many short secondary command buffers back to back, it can be costly to reset the complete state of the pipeline in each and every secondary command buffer. To compensate for this, some state can be inherited from primary to secondary command buffers. This is done using the VkCommandBufferInheritanceInfo structure, which is passed to vkBeginCommandBuffer(). The definition of this structure is

typedef struct VkCommandBufferInheritanceInfo {
    VkStructureType                  sType;
    const void*                      pNext;
    VkRenderPass                     renderPass;
    uint32_t                         subpass;
    VkFramebuffer                    framebuffer;
    VkBool32                         occlusionQueryEnable;
    VkQueryControlFlags              queryFlags;
    VkQueryPipelineStatisticFlags    pipelineStatistics;
} VkCommandBufferInheritanceInfo;

The VkCommandBufferInheritanceInfo provides a mechanism for your application to tell Vulkan that you know what the state will be when the secondary command buffer is executed. This allows the properties of the command buffer to begin in a well-defined state.

The sType field of VkCommandBufferInheritanceInfo should be set to VK_STRUCTURE_TYPE_COMMAND_BUFFER_INHERITANCE_INFO, and pNext should be set to nullptr.

The renderPass and subpass fields specify the renderpass and the subpass within it that the command buffer will be called inside, respectively. If the framebuffer to which the renderpass will be rendering is known, then it can be specified in the framebuffer field. This can sometimes result in better performance when the command buffer is executed. However, if you don’t know which framebuffer will be used, then you should set this field to VK_NULL_HANDLE.

The occlusionQueryEnable field should be set to VK_TRUE if the secondary command buffer will be executed while the primary command buffer is executing an occlusion query. This tells Vulkan to keep any counters associated with occlusion queries consistent during execution of the secondary command buffer. If this flag is VK_FALSE, then the secondary command buffer should not be executed while occlusion queries are active in the calling command buffer. While the behavior is technically undefined, the most likely outcome is that the results of the occlusion queries are garbage.

You can execute occlusion queries inside secondary command buffers regardless of the value of occlusionQueryEnable. You will need to begin and end the query inside the same secondary command buffer if you don’t inherit the state from a calling primary.

If occlusion query inheritance is enabled, then the queryFlags field contains additional flags that control the behavior of occlusion queries. The only flag defined for use here is VK_QUERY_CONTROL_PRECISE_BIT, which, if set, indicates that precise occlusion-query results are needed.

The pipelineStatistics field includes flags that tell Vulkan which pipeline statistics are being gathered by the calling primary command buffer. Again, you can gather pipeline statistics during the execution of a secondary command buffer, but if you want the operation of the pipeline invoked by a secondary command buffer to contribute to the counters accumulated by the primary command buffer, you need to set these flags accurately. The available bits are members of the VkQueryPipelineStatisticFlagBits enumeration.

Summary

This chapter delved deeper into the renderpass, a fundamental feature of Vulkan that enables efficient multipass rendering. You saw how to construct a nontrivial renderpass that contains many subpasses and how to build the contents of those subpasses as separate command buffers that can be called from your main command buffer. We discussed some potential optimizations that Vulkan implementations could make to improve the performance of rendering when they are given all the information about what will come in the frame. You also saw how many of the functions performed by barriers and clears can be folded into the renderpass, sometimes making them close to free. The renderpass is a powerful feature and, if possible, you should endeavor to make use of it in your applications.
