The depth image surface plays an important role in 3D graphics applications. It brings the perception of depth to a rendered scene through depth testing. In depth testing, each fragment's depth is stored in a special buffer called a depth image. Unlike the color image, which stores color information, the depth image stores the depth of each primitive's fragment as seen from the camera view. The depth image's dimensions are usually the same as the color image's. It is not a hard-and-fast rule, but in general the depth image stores depth information as 16- or 24-bit unsigned normalized values or as 32-bit float values.
The creation of a depth image is different from the color image. You must have noticed that we did not use the vkCreateImage()
API to obtain color image objects while retrieving swapchain images. These images were directly returned from the fpGetSwapchainImagesKHR()
extension API. In this section, we will go through a step-by-step process to create the depth image.
Image data is stored in a contiguous type of memory and is mapped to the 2D image memory where it is stored in a linear fashion. In the linear arrangement, texels are laid out in contiguous row-by-row memory locations, as shown in the following diagram:
The pitch is the number of bytes between the start of one row and the start of the next. It can be larger than the image width multiplied by the texel size, because padding bytes are generally added to each row in order to meet alignment requirements. The position offset of a given texel can be calculated using its row and column position along with the given pitch, as shown in the preceding image.
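The offset calculation for a linear layout can be sketched as follows. This is a minimal illustration, not Vulkan API code: the helper names linearTexelOffset() and paddedRowPitch() and the alignment value used in the note below are assumptions made for this example. In a real application, the row pitch of a linearly tiled image is reported by the implementation (for instance, through the rowPitch member of VkSubresourceLayout) rather than computed by hand.

```cpp
#include <cassert>
#include <cstddef>

// Byte offset of the texel at (row, col) in a linearly tiled image.
// rowPitch is the number of bytes between the starts of two consecutive
// rows; it may exceed width * texelSize because of padding bytes.
inline std::size_t linearTexelOffset(std::size_t row, std::size_t col,
                                     std::size_t rowPitch, std::size_t texelSize)
{
    // The whole texel must fit inside its row
    assert((col + 1) * texelSize <= rowPitch);
    return row * rowPitch + col * texelSize;
}

// Hypothetical helper: rounds a tightly packed row size up to the next
// multiple of 'alignment' to model the padding added to each row.
inline std::size_t paddedRowPitch(std::size_t width, std::size_t texelSize,
                                  std::size_t alignment)
{
    assert(alignment != 0);
    const std::size_t packed = width * texelSize;
    return (packed + alignment - 1) / alignment * alignment;
}
```

For example, a 5-texel-wide image with 4-byte texels and an assumed 16-byte row alignment has a pitch of 32 bytes, so the texel at row 2, column 3 starts at byte offset 2 * 32 + 3 * 4 = 76.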
This linear layout is neat as long as texels are accessed in places along the row where no neighboring texel information is required. However, in general, many applications require image information to be fetched along multiple rows. When the image is large in dimension, the pitch length increases and stretches across multiple rows in this linear layout. In a multicache-level system, this leads to a situation where the performance can drop due to slower address translation caused by translation lookaside buffer (TLB) and cache misses.
On most GPUs, this slower translation of the address is fixed by storing the texels in a swizzle format. This way of storing image texels is called Optimal tiling, where the image texels are stored in a tiled fashion representing multiple columns and rows in a continuous memory chunk. For example, in the following diagram, there are four tiles represented with different colors, where each tile has 2 x 2 rows (pitch) and columns:
Clearly, in the linear layout, blocks of the same color are set apart by the other blocks that come in between; however, in the optimal layout, blocks of the same color are held together, providing a much more efficient way to access neighboring texels without incurring performance loss. Note that this illustration of optimal tiling just mimics how the principle works; under the hood, there exist highly complex swizzling algorithms that help achieve optimal tiling.
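The difference between the two arrangements can be made concrete with a toy model. The sketch below is only an illustration of the principle under simplifying assumptions (square tiles, image width a multiple of the tile size); the function names are hypothetical, and real GPUs use implementation-specific swizzling that is not exposed to the application.

```cpp
#include <cassert>
#include <cstddef>

// Index of texel (x, y) in a row-major linear layout.
inline std::size_t linearIndex(std::size_t x, std::size_t y, std::size_t width)
{
    return y * width + x;
}

// Index of texel (x, y) when the image is split into square tiles of
// tileSize x tileSize texels and each tile is stored contiguously.
// Toy model: assumes width is a multiple of tileSize.
inline std::size_t tiledIndex(std::size_t x, std::size_t y,
                              std::size_t width, std::size_t tileSize)
{
    assert(tileSize != 0 && width % tileSize == 0);
    const std::size_t tilesPerRow = width / tileSize;
    // Which tile the texel falls into, counted row by row
    const std::size_t tile = (y / tileSize) * tilesPerRow + (x / tileSize);
    // Position of the texel inside its tile, again row by row
    const std::size_t inTile = (y % tileSize) * tileSize + (x % tileSize);
    return tile * tileSize * tileSize + inTile;
}
```

In a 4 x 4 image with 2 x 2 tiles, the texel directly below (0, 0) sits at index 4 in the linear layout but at index 2 in the tiled one, so the four texels of the first tile occupy the contiguous indices 0 to 3.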
In Vulkan, tiling is defined by VkImageTiling
, and it represents linear tiling (VK_IMAGE_TILING_LINEAR
) and optimal tiling (VK_IMAGE_TILING_OPTIMAL
). The following is the syntax for this:
typedef enum VkImageTiling {
    VK_IMAGE_TILING_OPTIMAL = 0,
    VK_IMAGE_TILING_LINEAR = 1,
} VkImageTiling;
Let's take a look at the tiling types and their respective definitions:
Tiling type | Description |
VK_IMAGE_TILING_OPTIMAL | These are opaquely tiled and provide optimal access to the underlying memory by laying out the texels in an implementation-dependent arrangement. |
VK_IMAGE_TILING_LINEAR | As the name suggests, texels here are arranged in row-major order in a linear fashion. Alignment requirements may cause some padding in each row. |
Initialize the depth format with a 16-bit unsigned normalized value (VK_FORMAT_D16_UNORM) and query the format properties supported by the physical device specified by deviceObj->gpu
. The retrieved properties are used to choose the optimal tiling/swizzling (VK_IMAGE_TILING_OPTIMAL
) layout for the image in the memory.
The depth-related member variables are packed in a user-defined structure called Depth
in the Renderer
class. Here's the code that illustrates this:
struct{
    VkFormat format;
    VkImage image;
    VkDeviceMemory mem;
    VkImageView view;
} Depth;
The various fields of this structure are defined in the following table:
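Parameters | Description |
format | This refers to the depth image format, namely a VkFormat value. |
image | This refers to the VkImage depth image object. |
mem | This is the allocated memory associated with the depth image object. |
view | This is the VkImageView object through which the depth image is used. |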
The depth format, tiling information, and other parameters--such as image size and image type--are used to create the VkImageCreateInfo
control structure. Since we are creating a depth buffer, we need to specify the usage as VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT
in the usage field of the same structure. Use it to create the VkImage
image object with the vkCreateImage()
API. For more information on VkImageCreateInfo
and vkCreateImage()
, refer to the Creating images subsection of the Understanding image resources section in this chapter:
VkResult result;
VkImageCreateInfo imageInfo = {};

// If the depth format is undefined,
// fall back to a 16-bit value
if (Depth.format == VK_FORMAT_UNDEFINED) {
    Depth.format = VK_FORMAT_D16_UNORM;
}
const VkFormat depthFormat = Depth.format;

// Prefer optimal tiling; fall back to linear tiling if needed
VkFormatProperties props;
vkGetPhysicalDeviceFormatProperties(*deviceObj->gpu, depthFormat, &props);
if (props.optimalTilingFeatures & VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT) {
    imageInfo.tiling = VK_IMAGE_TILING_OPTIMAL;
}
else if (props.linearTilingFeatures & VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT) {
    imageInfo.tiling = VK_IMAGE_TILING_LINEAR;
}
else {
    std::cout << "Unsupported Depth Format, try other Depth formats. ";
    exit(-1);
}

imageInfo.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
imageInfo.pNext = NULL;
imageInfo.imageType = VK_IMAGE_TYPE_2D;
imageInfo.format = depthFormat;
imageInfo.extent.width = width;
imageInfo.extent.height = height;
imageInfo.extent.depth = 1;
imageInfo.mipLevels = 1;
imageInfo.arrayLayers = 1;
imageInfo.samples = NUM_SAMPLES;
imageInfo.queueFamilyIndexCount = 0;
imageInfo.pQueueFamilyIndices = NULL;
imageInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
imageInfo.usage = VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT;
imageInfo.flags = 0;

// Use the create info to create the image object
result = vkCreateImage(deviceObj->device, &imageInfo, NULL, &Depth.image);
assert(result == VK_SUCCESS);
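The tiling choice made from the format properties can also be looked at in isolation. The following is a minimal sketch that uses stand-in types instead of the Vulkan headers: FormatFeatures models the two feature masks of VkFormatProperties, kDepthStencilAttachmentBit mirrors VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT, and both function names are hypothetical. It also shows one possible way to act on the "try other Depth formats" message by walking a list of candidate formats instead of exiting.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class Tiling { Optimal, Linear };

// Stand-in for VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT
constexpr std::uint32_t kDepthStencilAttachmentBit = 0x00000200;

// Stand-in for the two feature masks of VkFormatProperties
struct FormatFeatures {
    std::uint32_t linearTilingFeatures;
    std::uint32_t optimalTilingFeatures;
};

// Prefer optimal tiling, fall back to linear tiling.
// Returns false if the format cannot be used as a depth attachment.
inline bool chooseDepthTiling(const FormatFeatures& features, Tiling* tiling)
{
    assert(tiling != nullptr);
    if (features.optimalTilingFeatures & kDepthStencilAttachmentBit) {
        *tiling = Tiling::Optimal;
        return true;
    }
    if (features.linearTilingFeatures & kDepthStencilAttachmentBit) {
        *tiling = Tiling::Linear;
        return true;
    }
    return false;
}

// Walks the candidates in order of preference and returns the index of
// the first usable one, or -1 if none can be used.
inline int firstUsableDepthFormat(const FormatFeatures* candidates,
                                  std::size_t count, Tiling* tiling)
{
    for (std::size_t i = 0; i < count; ++i) {
        if (chooseDepthTiling(candidates[i], tiling)) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

In an application, each FormatFeatures entry would be filled from a vkGetPhysicalDeviceFormatProperties() query for one candidate depth format, and the returned index would select the format passed to VkImageCreateInfo.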
Query the depth image's memory requirements using the vkGetImageMemoryRequirements()
API. This will retrieve the total size required for allocating the depth image object's physical memory backing. For more information on API usage, refer to the Gathering memory allocation requirements subsection in this chapter:
// Get the image memory requirements
VkMemoryRequirements memRqrmnt;
vkGetImageMemoryRequirements(deviceObj->device, Depth.image, &memRqrmnt);
Use the memoryTypeBits
field from the queried memory requirements, memRqrmnt
, and determine the type of memory suitable for allocating the memory of the depth image using VulkanDevice::memoryTypeFromProperties()
:
VkMemoryAllocateInfo memAlloc = {};
memAlloc.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
memAlloc.pNext = NULL;
memAlloc.allocationSize = memRqrmnt.size;
memAlloc.memoryTypeIndex = 0;

// Determine the type of memory required
// with memory properties (0 = no extra property flags required)
bool pass = deviceObj->memoryTypeFromProperties(memRqrmnt.memoryTypeBits, 0, &memAlloc.memoryTypeIndex);
assert(pass);
The VulkanDevice::memoryTypeFromProperties() function takes three parameters. The first one (typeBits) is a bitmask in which each set bit marks a memory type that the resource can use, the second parameter (requirementsMask) specifies the property flags that the user requires of the memory type, and the last one (typeIndex) returns the index of the selected memory type.
This function iterates and checks whether the requested memory type is present. Next, it checks whether the found memory satisfies the user requirements. If successful, it returns Boolean true
and the index of the memory type; upon failure, it returns Boolean false
:
bool VulkanDevice::memoryTypeFromProperties(uint32_t typeBits, VkFlags requirementsMask, uint32_t *typeIndex)
{
    // Search memtypes to find first index with those properties
    for (uint32_t i = 0; i < 32; i++) {
        if ((typeBits & 1) == 1) {
            // Type is available, does it match user properties?
            if ((memoryProperties.memoryTypes[i].propertyFlags & requirementsMask) == requirementsMask) {
                *typeIndex = i;
                return true;
            }
        }
        typeBits >>= 1;
    }
    // No memory types matched, return failure
    return false;
}
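To see how the search behaves on concrete inputs, the same logic can be exercised without a Vulkan device. The sketch below is an assumption-laden stand-in: MemoryProperties and MemoryType model only the parts of VkPhysicalDeviceMemoryProperties that the search reads, the flag constants mirror the corresponding VK_MEMORY_PROPERTY_* bits, and findMemoryType() is a hypothetical free-function variant that tests bit i directly instead of shifting typeBits.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins mirroring VK_MEMORY_PROPERTY_* flag bits
constexpr std::uint32_t kDeviceLocalBit  = 0x1;
constexpr std::uint32_t kHostVisibleBit  = 0x2;
constexpr std::uint32_t kHostCoherentBit = 0x4;

constexpr std::uint32_t kMaxMemoryTypes = 32; // mirrors VK_MAX_MEMORY_TYPES

// Stand-ins for the fields of VkPhysicalDeviceMemoryProperties used here
struct MemoryType { std::uint32_t propertyFlags; };
struct MemoryProperties {
    std::uint32_t memoryTypeCount;
    MemoryType memoryTypes[kMaxMemoryTypes];
};

// Returns true and writes the index of the first memory type that is
// allowed by typeBits and has every flag in requirementsMask set.
inline bool findMemoryType(const MemoryProperties& props, std::uint32_t typeBits,
                           std::uint32_t requirementsMask, std::uint32_t* typeIndex)
{
    assert(typeIndex != nullptr);
    for (std::uint32_t i = 0; i < props.memoryTypeCount; ++i) {
        const bool allowed = (typeBits & (1u << i)) != 0;
        const bool matches =
            (props.memoryTypes[i].propertyFlags & requirementsMask) == requirementsMask;
        if (allowed && matches) {
            *typeIndex = i;
            return true;
        }
    }
    return false; // no memory type matched
}
```

With three memory types whose flags are host-visible plus host-coherent, device-local, and device-local plus host-visible, a request with typeBits 0b110 and kDeviceLocalBit selects index 1. Passing 0 as the requirement mask, as the depth image code does, simply selects the first type allowed by typeBits.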
The memory requirement guides the application to allocate a specified amount of memory for the depth image. Once the memory is allocated successfully using vkAllocateMemory()
, it needs to be bound to the depth image (Depth.image
), making the image the owner of the allocated memory:
// Allocate the physical backing for the depth image
result = vkAllocateMemory(deviceObj->device, &memAlloc, NULL, &Depth.mem);
assert(result == VK_SUCCESS);

// Bind the allocated memory to the depth image
result = vkBindImageMemory(deviceObj->device, Depth.image, Depth.mem, 0);
assert(result == VK_SUCCESS);
GPU hardware that is capable of supporting optimal layouts requires transitioning from the optimal layout to the linear layout and vice versa. Optimal layouts are not directly accessible by the consumer components for read and write purposes. The opaque nature of an optimal layout requires a layout transition, which is the process of converting one type (old type) of layout into another type (new type).
GPU hardware that supports the optimal layout allows you to store the data either in a linear or optimal layout through layout transitioning. The layout transition process can be applied using memory barriers. The memory barriers inspect the specified old and new image layouts and execute the layout transition. Not every layout transition necessarily triggers an actual layout conversion operation on the GPU. For instance, when an image object is created for the first time, its initial layout may be undefined; in such a case, there is no existing content to convert, and the GPU may only need to access memory in the optimal pattern. For more information on memory barriers, continue with the next section.
A memory barrier is an instruction that helps synchronize data reads and writes. It guarantees that the operation specified before and after the memory barrier will be synchronized. When this instruction is inserted, it ensures that the memory operation issued before this instruction is completed prior to executing the memory instruction issued after the barrier instruction.
There are three types of memory barrier:
Global memory barrier: This type of memory barrier applies to all kinds of memory objects and is represented by a VkMemoryBarrier structure's instance.
Buffer memory barrier: This type of memory barrier applies to a specific range of a specified buffer object and is represented by a VkBufferMemoryBarrier structure's instance.
Image memory barrier: This type of memory barrier is represented by a VkImageMemoryBarrier instance and is applicable to the different memory access types via a specific image sub-resource range of the specified image object.
The allocated image memory needs to be laid out according to its usage. The image layout helps the memory contents become accessible in an implementation-specific way, given the nature of its usage. There is a general layout (VK_IMAGE_LAYOUT_GENERAL) that can be used for anything, but it may not be the most appropriate one for a given usage. In Vulkan, image layouts are represented using VkImageLayout. The following are the fields defined for this enumeration:
VkImageLayout fields | Description |
VK_IMAGE_LAYOUT_UNDEFINED | The image content in this layout and its subrange are in an undefined state; an image is assumed to be in this state right after it is created. |
VK_IMAGE_LAYOUT_GENERAL | This layout permits all operations on the image or its subrange that are otherwise allowed by the usage flags the image was created with. |
VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL | The image in this layout can only be used as a framebuffer color attachment. It can be accessed via framebuffer color reads and can be written using draw commands. |
VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL | The image in this layout can only be used as a framebuffer depth/stencil attachment. It can be accessed via framebuffer depth/stencil reads and can be written using draw commands. |
VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL | This layout uses the image as a read-only shader resource. So it can only be accessed by shader reads done via a sampled image descriptor, combined image sampler descriptor, or input attachment descriptor. |
VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL | An image (or a subrange of it) in this layout can only be used as the source operand of transfer commands such as vkCmdCopyImage() and vkCmdBlitImage(). |
VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL | An image (or a subrange of it) in this layout can only be used as the destination operand of transfer commands such as vkCmdCopyImage() and vkCmdBlitImage(). |
Image layouts are applied through image memory barriers, which are described by the VkImageMemoryBarrier structure. The memory barriers are inserted with the help of the vkCmdPipelineBarrier() API. The syntax of this API is as follows:
void vkCmdPipelineBarrier(
    VkCommandBuffer commandBuffer,
    VkPipelineStageFlags srcStageMask,
    VkPipelineStageFlags dstStageMask,
    VkDependencyFlags dependencyFlags,
    uint32_t memoryBarrierCount,
    const VkMemoryBarrier* pMemoryBarriers,
    uint32_t bufferMemoryBarrierCount,
    const VkBufferMemoryBarrier* pBufferMemoryBarriers,
    uint32_t imageMemoryBarrierCount,
    const VkImageMemoryBarrier* pImageMemoryBarriers);
Let's see the specification of all the fields:
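Parameters | Description |
commandBuffer | This is the command buffer in which the memory barrier is specified. |
srcStageMask | This is the bitwise mask field specifying the pipeline stages that must complete their execution before the barrier takes effect. |
dstStageMask | This is the bitwise mask field specifying the pipeline stages that should not start their execution until the barrier is completed. |
dependencyFlags | This refers to the VkDependencyFlags bitmask. |
memoryBarrierCount | This refers to the number of memory barriers. |
pMemoryBarriers | This is the array of VkMemoryBarrier structures. |
bufferMemoryBarrierCount | This refers to the number of buffer memory barriers. |
pBufferMemoryBarriers | This refers to the array of VkBufferMemoryBarrier structures. |
imageMemoryBarrierCount | This refers to the number of image type memory barriers. |
pImageMemoryBarriers | This refers to the array of VkImageMemoryBarrier structures. |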
The following code makes use of an image barrier and sets the appropriate image layout information in the VkImageMemoryBarrier
control structure (imgMemoryBarrier
). This control structure is passed to the vkCmdPipelineBarrier()
API, which sets the execution and applies the memory barriers. The created depth image (Depth.image
) is set as a framebuffer depth/stencil attachment layout by specifying the VkImageMemoryBarrier's newLayout
field as VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL
.
Using the created command pool, allocate the cmdDepthImage
command buffer. This command buffer will be used to record the image layout transition, as mentioned here:
/****** void VulkanRenderer::createDepthImage()******/
// Use command buffer to create the depth image. This includes -
// Command buffer allocation, recording with begin/end
// scope and submission.
CommandBufferMgr::allocCommandBuffer(&deviceObj->device, cmdPool, &cmdDepthImage);
CommandBufferMgr::beginCommandBuffer(cmdDepthImage);
{
    // Set the image layout to depth stencil optimal
    setImageLayout(Depth.image,
                   VK_IMAGE_ASPECT_DEPTH_BIT,
                   VK_IMAGE_LAYOUT_UNDEFINED,
                   VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL,
                   (VkAccessFlagBits)0,
                   cmdDepthImage);
}
CommandBufferMgr::endCommandBuffer(cmdDepthImage);
CommandBufferMgr::submitCommandBuffer(deviceObj->queue, &cmdDepthImage);
The image layout is set using the setImageLayout()
function. This is a helper function that records memory barriers using the vkCmdPipelineBarrier()
command.
This command is recorded in the cmdDepthImage
command buffer and guarantees that it will meet the requirement of proper image layouts before it allows the dependent resources to access it.
The setImageLayout() helper function transitions the image from its existing (old) layout to the specified new layout. In the present example, the old image layout is specified as VK_IMAGE_LAYOUT_UNDEFINED because the image object is created for the first time and has no predefined layout applied. Since we are preparing the image for depth/stencil testing, the new image layout must be specified as VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL:
void VulkanRenderer::setImageLayout(VkImage image,
        VkImageAspectFlags aspectMask,
        VkImageLayout oldImageLayout,
        VkImageLayout newImageLayout,
        VkAccessFlagBits srcAccessMask,
        const VkCommandBuffer& cmd)
{
    // Dependency on cmd
    assert(cmd != VK_NULL_HANDLE);

    // The deviceObj->queue must be initialized
    assert(deviceObj->queue != VK_NULL_HANDLE);

    VkImageMemoryBarrier imgMemoryBarrier = {};
    imgMemoryBarrier.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
    imgMemoryBarrier.pNext = NULL;
    imgMemoryBarrier.srcAccessMask = srcAccessMask;
    imgMemoryBarrier.dstAccessMask = 0;
    imgMemoryBarrier.oldLayout = oldImageLayout;
    imgMemoryBarrier.newLayout = newImageLayout;
    imgMemoryBarrier.image = image;
    imgMemoryBarrier.subresourceRange.aspectMask = aspectMask;
    imgMemoryBarrier.subresourceRange.baseMipLevel = 0;
    imgMemoryBarrier.subresourceRange.levelCount = 1;
    imgMemoryBarrier.subresourceRange.layerCount = 1;

    if (oldImageLayout == VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL) {
        imgMemoryBarrier.srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
    }

    switch (newImageLayout) {
    // Ensure that anything that was copying from this image
    // has completed. An image in this layout can only be
    // used as the destination operand of transfer commands
    case VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL:
    case VK_IMAGE_LAYOUT_PRESENT_SRC_KHR:
        imgMemoryBarrier.dstAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
        break;

    // Ensure any Copy or CPU writes to image are flushed. An image
    // in this layout can only be used as a read-only shader resource
    case VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL:
        imgMemoryBarrier.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
        imgMemoryBarrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
        break;

    // An image in this layout can only be used as a
    // framebuffer color attachment
    case VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL:
        imgMemoryBarrier.dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_READ_BIT;
        break;

    // An image in this layout can only be used as a
    // framebuffer depth/stencil attachment
    case VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL:
        imgMemoryBarrier.dstAccessMask = VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT;
        break;
    }

    VkPipelineStageFlags srcStages = VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT;
    VkPipelineStageFlags destStages = VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT;

    vkCmdPipelineBarrier(cmd, srcStages, destStages, 0, 0, NULL, 0, NULL, 1, &imgMemoryBarrier);
}
Finally, we'll let the application use the depth image by means of an image view. We know very well that images cannot be used directly in a Vulkan application. They are used in the form of image views. The following code implements the creation of the image view using the vkCreateImageView()
API. For more information on the API, refer to the Creating the image view subsection under the Understanding image resources section in this chapter:
/****** void VulkanRenderer::createDepthImage()******/
VkImageViewCreateInfo imgViewInfo = {};
imgViewInfo.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO;
imgViewInfo.pNext = NULL;
imgViewInfo.image = VK_NULL_HANDLE;
imgViewInfo.format = depthFormat;
imgViewInfo.components = { VK_COMPONENT_SWIZZLE_IDENTITY };
imgViewInfo.subresourceRange.aspectMask = VK_IMAGE_ASPECT_DEPTH_BIT;
imgViewInfo.subresourceRange.baseMipLevel = 0;
imgViewInfo.subresourceRange.levelCount = 1;
imgViewInfo.subresourceRange.baseArrayLayer = 0;
imgViewInfo.subresourceRange.layerCount = 1;
imgViewInfo.viewType = VK_IMAGE_VIEW_TYPE_2D;
imgViewInfo.flags = 0;

// Formats with a stencil component also need the stencil aspect
if (depthFormat == VK_FORMAT_D16_UNORM_S8_UINT ||
    depthFormat == VK_FORMAT_D24_UNORM_S8_UINT ||
    depthFormat == VK_FORMAT_D32_SFLOAT_S8_UINT) {
    imgViewInfo.subresourceRange.aspectMask |= VK_IMAGE_ASPECT_STENCIL_BIT;
}

// Create the image view and allow the application to
// use the images.
imgViewInfo.image = Depth.image;
result = vkCreateImageView(deviceObj->device, &imgViewInfo, NULL, &Depth.view);
assert(result == VK_SUCCESS);
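The aspect mask selection in the preceding code can be factored into a small helper. The following is a sketch with stand-in types so that it compiles without the Vulkan headers: DepthFormat lists only the formats mentioned in this section, the two constants mirror VK_IMAGE_ASPECT_DEPTH_BIT and VK_IMAGE_ASPECT_STENCIL_BIT, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for the VkFormat depth formats discussed in this section
enum class DepthFormat {
    D16Unorm,
    D32Sfloat,
    D16UnormS8Uint,
    D24UnormS8Uint,
    D32SfloatS8Uint
};

constexpr std::uint32_t kAspectDepthBit   = 0x2; // mirrors VK_IMAGE_ASPECT_DEPTH_BIT
constexpr std::uint32_t kAspectStencilBit = 0x4; // mirrors VK_IMAGE_ASPECT_STENCIL_BIT

// True for the combined depth/stencil formats
inline bool hasStencilComponent(DepthFormat format)
{
    return format == DepthFormat::D16UnormS8Uint ||
           format == DepthFormat::D24UnormS8Uint ||
           format == DepthFormat::D32SfloatS8Uint;
}

// Depth aspect always; stencil aspect only for combined formats
inline std::uint32_t depthAspectMask(DepthFormat format)
{
    std::uint32_t mask = kAspectDepthBit;
    if (hasStencilComponent(format)) {
        mask |= kAspectStencilBit;
    }
    return mask;
}
```

A helper of this kind keeps the format check in one place, which is convenient if the same aspect mask is also needed elsewhere, for example when recording the layout transition for a combined depth/stencil format.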