Vulkan isn’t newbie friendly.
But something I get asked a lot are questions that aren't answered anywhere or are hard to find (typically in the form of "why are things done this way?"), unless you are familiar with previous APIs (e.g. D3D11 & Metal) and existing HW limitations.
Note that for brevity and clarity I’ll shorten the following:
- RGBA8_UNORM is VK_FORMAT_R8G8B8A8_UNORM
- RGBA8_SRGB is VK_FORMAT_R8G8B8A8_SRGB
- PSO (Pipeline State Object) is a VkPipeline
- USAGE_STORAGE_BIT is VK_IMAGE_USAGE_STORAGE_BIT
Why does Vulkan hardcode the viewport in the PSO?
VkPipeline (from now on, PSO) asks for viewport & scissor values. That means targeting a 1440×900 viewport and a 1920×1080 one requires two different PSOs just because the resolution is different. As you can guess, this is insane.
The solution is to do what everyone does: use dynamic states for these two settings. Add VK_DYNAMIC_STATE_VIEWPORT & VK_DYNAMIC_STATE_SCISSOR and now the same PSO can be used for both resolutions. Just set them via vkCmdSetViewport and vkCmdSetScissor.
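In practice it looks roughly like this (a minimal sketch; pipelineCi and cmdBuffer stand in for whatever VkGraphicsPipelineCreateInfo and command buffer you already have):

// Declare viewport & scissor as dynamic when creating the PSO.
const VkDynamicState dynamicStates[] = { VK_DYNAMIC_STATE_VIEWPORT,
                                         VK_DYNAMIC_STATE_SCISSOR };

VkPipelineDynamicStateCreateInfo dynamicStateCi = {};
dynamicStateCi.sType = VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO;
dynamicStateCi.dynamicStateCount = 2u;
dynamicStateCi.pDynamicStates = dynamicStates;

// pViewportState must still be valid with viewportCount/scissorCount = 1,
// but its pViewports/pScissors pointers are now ignored.
pipelineCi.pDynamicState = &dynamicStateCi;

// Later, when recording commands (any resolution, same PSO):
VkViewport viewport = { 0.0f, 0.0f, 1920.0f, 1080.0f, 0.0f, 1.0f };
VkRect2D scissor = { { 0, 0 }, { 1920u, 1080u } };
vkCmdSetViewport( cmdBuffer, 0u, 1u, &viewport );
vkCmdSetScissor( cmdBuffer, 0u, 1u, &scissor );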
Now you may be wondering, WHY? Well, the answer is simple: because one mobile vendor ferociously requested the vp/scissor values to be hardcoded into the PSO because it was faster on their HW. They claimed it would save a few multiplications (thus saving perf and battery).
That’s it, that’s the reason. It was to appease one vendor.
Pretty pointless if you ask me, given that AFAIK nobody, and I mean nobody, hardcodes the viewport into the PSO; other than perhaps a tech demo.
Everybody complained about this, and everyone agreed it was unreasonable to demand viewport settings to be hardcoded, so dynamic state was created for this and shipped with Vulkan 1.0. And everyone secretly knew that everyone would ignore the hardcoded path.
Why does the PSO ask for a VkRenderPass? Doesn't this tie a PSO to a specific resolution???
This is the ugliest part of Vulkan to explain, and it ties back to its obsession with using subpasses for TBDR (aka mobile) GPUs; instead of much friendlier approaches like Metal giving read/write access to gl_FragColor (something that sadly was only added at the end of 2021 via VK_ARM_rasterization_order_attachment_access).
If you use VK_KHR_dynamic_rendering you can skip this question. If not, keep reading.
Indeed, a PSO needs a VkRenderPass, and in turn it needs a VkImage. Therefore, each PSO is still tied to a specific resolution. But here's the thing: you don't really need the VkImage!
Vulkan Specs have a hard-to-read “compatibility” concept:
VUID-VkRenderPassBeginInfo-renderPass-00904
renderPass must be compatible with the renderPass member of the VkFramebufferCreateInfo structure specified when creating framebuffer
The rough TL;DR is that two VkRenderPass are “compatible” if they have the exact same pixel format, same MSAA settings and same number of attachments in the same order with the same settings. Their resolutions don’t matter! The load/store actions don’t matter either.
Ironically, the easiest way to understand what makes two VkRenderPass compatible is to look at what Metal asks for in MTLRenderPipelineColorAttachmentDescriptor, MTLRenderPassDepthAttachmentDescriptor, and MTLRenderPassStencilAttachmentDescriptor.
In OgreNext, we use VulkanRenderSystem::getVkRenderPass to build a dummy VkRenderPass that is compatible (based on our own abstraction called HlmsPassPso), providing pixel format and MSAA settings without ever providing a VkImage. And we use a cache via VulkanCache::getRenderPass to ensure all PSOs share the same VkRenderPass pointer.
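To give a rough idea of what such a dummy render pass amounts to, here's a minimal sketch (not OgreNext's actual code; a single colour attachment, no depth, and makeCompatibleRenderPass is a made-up helper name). The only inputs that matter for compatibility are the format and the sample count:

VkRenderPass makeCompatibleRenderPass( VkDevice device, VkFormat colourFormat,
                                       VkSampleCountFlagBits samples )
{
    VkAttachmentDescription colourAttachment = {};
    colourAttachment.format = colourFormat;
    colourAttachment.samples = samples;
    // Load/store actions don't affect compatibility, so anything goes here.
    colourAttachment.loadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
    colourAttachment.storeOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
    colourAttachment.stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
    colourAttachment.stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
    colourAttachment.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
    colourAttachment.finalLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;

    VkAttachmentReference colourRef = {};
    colourRef.attachment = 0u;
    colourRef.layout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;

    VkSubpassDescription subpass = {};
    subpass.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS;
    subpass.colorAttachmentCount = 1u;
    subpass.pColorAttachments = &colourRef;

    // Note: no VkImage, no VkFramebuffer, no resolution anywhere in here.
    VkRenderPassCreateInfo renderPassCi = {};
    renderPassCi.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO;
    renderPassCi.attachmentCount = 1u;
    renderPassCi.pAttachments = &colourAttachment;
    renderPassCi.subpassCount = 1u;
    renderPassCi.pSubpasses = &subpass;

    VkRenderPass renderPass = 0;
    vkCreateRenderPass( device, &renderPassCi, 0, &renderPass );
    return renderPass;
}

Cache the result keyed on (formats, MSAA, number & order of attachments) and every PSO targeting the same kind of attachments gets the exact same VkRenderPass pointer.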
Wait, but that means I need to track pass format and MSAA? That's a lot of work!
Yes, you need to track that. If you design your engine properly around PSOs, it isn’t that much work. Note that you need to do this not just for Vulkan, but also for D3D12 and Metal.
If you already have an existing engine that isn’t designed around PSOs, you can use a caching system instead to bind current state to an abstracted pass entry.
This is what modern drivers do internally for D3D11 and OpenGL.
Why can’t I reinterpret RGBA8_UNORM into RGBA8_SRGB if I use USAGE_STORAGE_BIT?
You can. But there was an API oversight.
See… on a lot of HW, USAGE_STORAGE_BIT is not supported with RGBA8_UNORM_SRGB due to HW limitations.
Thus a very simple & common solution is to create an RGBA8_UNORM texture with USAGE_STORAGE_BIT, write to it in a compute shader (doing the sRGB conversion by hand in the shader), and then reinterpret the texture as RGBA8_SRGB for sampling like a regular texture.
But validation layers will complain if you do this.
You need the VK_KHR_maintenance2 extension to tell Vulkan drivers you intend to do this. Simply add VkImageViewUsageCreateInfo to VkImageViewCreateInfo::pNext:
VkImageViewCreateInfo imageViewCi;
VkImageViewUsageCreateInfo flagRestriction;
memset( &flagRestriction, 0, sizeof( flagRestriction ) );
flagRestriction.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_USAGE_CREATE_INFO;
imageViewCi.pNext = &flagRestriction;
// Deliberately leave out VK_IMAGE_USAGE_STORAGE_BIT: this view only gets
// the usages we actually need.
flagRestriction.usage = VK_IMAGE_USAGE_TRANSFER_SRC_BIT | VK_IMAGE_USAGE_TRANSFER_DST_BIT;
if( isTexture() )
    flagRestriction.usage |= VK_IMAGE_USAGE_SAMPLED_BIT;
if( isRenderToTexture() )
{
    flagRestriction.usage |= PixelFormatGpuUtils::isDepth( pixelFormat )
                                 ? VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT
                                 : VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT;
}
This signals that you want to reinterpret the regular non-sRGB texture as SRGB without USAGE_STORAGE_BIT, because you don't care about it for that view. The texture must have been created with VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT. See OgreNext code for reference.
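For the other half of the trick, here's a minimal sketch (again, not OgreNext's actual code; sizes, usage flags and the device handle are placeholder assumptions) of creating the image as UNORM with storage usage plus the mutable-format flag, and then an sRGB view of it for sampling:

// The image itself: storage-capable UNORM, created mutable so it can be
// reinterpreted later.
VkImageCreateInfo imageCi = {};
imageCi.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
imageCi.flags = VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT;
imageCi.imageType = VK_IMAGE_TYPE_2D;
imageCi.format = VK_FORMAT_R8G8B8A8_UNORM;
imageCi.extent = { 1024u, 1024u, 1u };
imageCi.mipLevels = 1u;
imageCi.arrayLayers = 1u;
imageCi.samples = VK_SAMPLE_COUNT_1_BIT;
imageCi.tiling = VK_IMAGE_TILING_OPTIMAL;
imageCi.usage = VK_IMAGE_USAGE_STORAGE_BIT | VK_IMAGE_USAGE_SAMPLED_BIT;
imageCi.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;

VkImage image = 0;
vkCreateImage( device, &imageCi, 0, &image );
// ...allocate & bind memory as usual...

// The sRGB view used for sampling: its usage is restricted so it does NOT
// include VK_IMAGE_USAGE_STORAGE_BIT.
VkImageViewUsageCreateInfo viewUsage = {};
viewUsage.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_USAGE_CREATE_INFO;
viewUsage.usage = VK_IMAGE_USAGE_SAMPLED_BIT;

VkImageViewCreateInfo srgbViewCi = {};
srgbViewCi.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO;
srgbViewCi.pNext = &viewUsage;
srgbViewCi.image = image;
srgbViewCi.viewType = VK_IMAGE_VIEW_TYPE_2D;
srgbViewCi.format = VK_FORMAT_R8G8B8A8_SRGB;  // reinterpretation happens here
srgbViewCi.subresourceRange = { VK_IMAGE_ASPECT_COLOR_BIT, 0u, 1u, 0u, 1u };

VkImageView srgbView = 0;
vkCreateImageView( device, &srgbViewCi, 0, &srgbView );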
Why can’t RGBA8_SRGB use USAGE_STORAGE_BIT?
This is a HW limitation. Internally, texture formats are rearranged in different patterns: not just Morton order to make them cache friendly, but also at the bit level.
Whereas we think of an RGBA8_UNORM texture as being 8 bits of red, followed by 8 bits of green, i.e. RRRRRRRRGGGGGGGGBBBBBBBBAAAAAAAA, internally it may actually be interleaved in weird ways, e.g. RRGGBBRRRRBBBG..., with Morton order applied on top of it.
They may even choose to take a 2×2 pixel block and put all R bits first (i.e. 64 bits of red), then all G next and so on. This arrangement is better if you intend to use textureGather for example.
How bits and pixels are interleaved is opaque and up to each HW. This arrangement is supposedly done so to simplify bilinear filtering. GPUs might use lossless compression if they can.
The exact details are unknown to me. However, the point is that the GPU does not do what you think it is doing.
sRGB complicates things because data must be converted from sRGB to linear on demand during sampling (and before filtering). And it seems that giving write access to sRGB (which is what the USAGE_STORAGE_BIT flag is for) makes things too complicated.
So the easy path for HW vendors is to just forbid this combination and force the developer to make the sRGB conversion before writing the data as raw RGBA8_UNORM.
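If you're wondering what "doing the sRGB conversion by hand" actually means, it's the standard piecewise sRGB encode applied per colour channel right before the image write. A reference version in C++ (in practice this lives in your compute shader, not on the CPU):

#include <cmath>

// Standard linear -> sRGB encode (per channel, input and output in [0, 1]).
static float linearToSrgb( float linear )
{
    if( linear <= 0.0031308f )
        return 12.92f * linear;
    else
        return 1.055f * std::pow( linear, 1.0f / 2.4f ) - 0.055f;
}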
Also note that adding the flag USAGE_STORAGE_BIT to RGBA8_UNORM can disable some of these optimizations I mention. So don't add it if you don't have to.
Why is resource binding done through VkPipelineLayout & VkDescriptorSetLayout and all that madness?
Faith Ekstrand answers this question much better than I ever could.
The short version is that the HW supported by Vulkan can be divided into 4 binding models:
- Direct Access
- Descriptor buffers
- Descriptor heaps
- Fixed HW bindings
And the descriptor set is the only "reasonable" abstraction very clever people found to cover all 4 of those models.
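To make that concrete, the smallest possible instance of the abstraction looks like this (a minimal sketch; the device handle is assumed, and real engines have many bindings per set and several set layouts per pipeline layout). You declare what the shaders will bind in a VkDescriptorSetLayout, then aggregate set layouts into the VkPipelineLayout that the PSO is created against:

// One uniform buffer at binding 0, visible to vertex & fragment stages.
VkDescriptorSetLayoutBinding binding = {};
binding.binding = 0u;
binding.descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
binding.descriptorCount = 1u;
binding.stageFlags = VK_SHADER_STAGE_VERTEX_BIT | VK_SHADER_STAGE_FRAGMENT_BIT;

VkDescriptorSetLayoutCreateInfo setLayoutCi = {};
setLayoutCi.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
setLayoutCi.bindingCount = 1u;
setLayoutCi.pBindings = &binding;

VkDescriptorSetLayout setLayout = 0;
vkCreateDescriptorSetLayout( device, &setLayoutCi, 0, &setLayout );

// The pipeline layout aggregates the set layouts (and push constant ranges).
VkPipelineLayoutCreateInfo pipelineLayoutCi = {};
pipelineLayoutCi.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO;
pipelineLayoutCi.setLayoutCount = 1u;
pipelineLayoutCi.pSetLayouts = &setLayout;

VkPipelineLayout pipelineLayout = 0;
vkCreatePipelineLayout( device, &pipelineLayoutCi, 0, &pipelineLayout );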
Particularly if you can ditch certain HW, you can go full bindless and make your life easier. But if you want the broadest support, you need to go in deep.
See Faith Ekstrand’s post for in-depth explanation.