Pyrogenesis trunk
Recommended usage patterns

Vulkan gives great flexibility in memory allocation.

This chapter shows the most common patterns.

See also slides from talk: Sawicki, Adam. Advanced Graphics Techniques Tutorial: Memory management in Vulkan and DX12. Game Developers Conference, 2018

GPU-only resource

When: Any resources that you frequently write and read on GPU, e.g. images used as color attachments (aka "render targets"), depth-stencil attachments, images/buffers used as storage image/buffer (aka "Unordered Access View (UAV)").

What to do: Let the library select the optimal memory type, which will likely have VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT.

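VkImageCreateInfo imgCreateInfo = { VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO };
imgCreateInfo.imageType = VK_IMAGE_TYPE_2D;
imgCreateInfo.extent.width = 3840;
imgCreateInfo.extent.height = 2160;
imgCreateInfo.extent.depth = 1;
imgCreateInfo.mipLevels = 1;
imgCreateInfo.arrayLayers = 1;
imgCreateInfo.format = VK_FORMAT_R8G8B8A8_UNORM;
imgCreateInfo.tiling = VK_IMAGE_TILING_OPTIMAL;
imgCreateInfo.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
imgCreateInfo.usage = VK_IMAGE_USAGE_SAMPLED_BIT | VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT;
imgCreateInfo.samples = VK_SAMPLE_COUNT_1_BIT;
VmaAllocationCreateInfo allocCreateInfo = {};
allocCreateInfo.usage = VMA_MEMORY_USAGE_AUTO;
allocCreateInfo.flags = VMA_ALLOCATION_CREATE_DEDICATED_MEMORY_BIT;
allocCreateInfo.priority = 1.0f;
VkImage img;
VmaAllocation alloc;
vmaCreateImage(allocator, &imgCreateInfo, &allocCreateInfo, &img, &alloc, nullptr);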

Also consider: Create them as dedicated allocations using VMA_ALLOCATION_CREATE_DEDICATED_MEMORY_BIT, especially if they are large or if you plan to destroy and recreate them with different sizes, e.g. when the display resolution changes. Prefer to create such resources first and all other GPU resources (like textures and vertex buffers) later. When the VK_EXT_memory_priority extension is enabled, it is also worth setting a high priority on such allocations to decrease the chance that the operating system evicts them to system memory.
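The advice about large or frequently recreated resources can be turned into a small decision helper. The following is only a sketch: the names ShouldRequestDedicated, ImageSizeBytes and kDedicatedSizeThreshold are hypothetical, and the 16 MiB threshold is an arbitrary assumption rather than a value recommended by the library. The size estimate is also rough; the real requirement is whatever vkGetImageMemoryRequirements reports.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical threshold, picked only for illustration; tune it per application.
constexpr std::uint64_t kDedicatedSizeThreshold = 16ull * 1024 * 1024; // 16 MiB

// Rough size of an uncompressed 2D image with one mip level and one array layer.
// The driver may require more memory because of alignment and tiling.
inline std::uint64_t ImageSizeBytes(std::uint32_t width, std::uint32_t height,
                                    std::uint32_t bytesPerPixel)
{
    assert(bytesPerPixel > 0);
    return static_cast<std::uint64_t>(width) * height * bytesPerPixel;
}

// Returns true when a resource looks like a candidate for
// VMA_ALLOCATION_CREATE_DEDICATED_MEMORY_BIT: it is large, or it is
// destroyed and recreated whenever the display resolution changes.
inline bool ShouldRequestDedicated(std::uint64_t sizeBytes, bool recreatedOnResize)
{
    return recreatedOnResize || sizeBytes >= kDedicatedSizeThreshold;
}
```

With these assumptions, a 3840x2160 R8G8B8A8 render target (about 33 MB) qualifies by size alone, while a small buffer qualifies only if it is tied to the display resolution.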

Staging copy for upload

When: A "staging" buffer that you want to map and fill from CPU code, then use as a source of transfer to some GPU resource.

What to do: Use flag VMA_ALLOCATION_CREATE_HOST_ACCESS_SEQUENTIAL_WRITE_BIT. Let the library select the optimal memory type, which will always have VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT.

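VkBufferCreateInfo bufCreateInfo = { VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO };
bufCreateInfo.size = 65536;
bufCreateInfo.usage = VK_BUFFER_USAGE_TRANSFER_SRC_BIT;
VmaAllocationCreateInfo allocCreateInfo = {};
allocCreateInfo.usage = VMA_MEMORY_USAGE_AUTO;
allocCreateInfo.flags = VMA_ALLOCATION_CREATE_HOST_ACCESS_SEQUENTIAL_WRITE_BIT |
    VMA_ALLOCATION_CREATE_MAPPED_BIT;
VkBuffer buf;
VmaAllocation alloc;
VmaAllocationInfo allocInfo;
vmaCreateBuffer(allocator, &bufCreateInfo, &allocCreateInfo, &buf, &alloc, &allocInfo);
...
memcpy(allocInfo.pMappedData, myData, myDataSize);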

Also consider: You can map the allocation using vmaMapMemory(), or you can create it as persistently mapped using VMA_ALLOCATION_CREATE_MAPPED_BIT, as in the example above.
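Memory selected for VMA_ALLOCATION_CREATE_HOST_ACCESS_SEQUENTIAL_WRITE_BIT may be uncached and write-combined, so the mapped pointer should only be written, front to back, and never read. The sketch below illustrates that discipline without depending on the Vulkan headers: the parameter mapped stands in for VmaAllocationInfo::pMappedData, and the names Particle and OffsetAndUpload are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

struct Particle
{
    float x, y, z;
};

// Moves every particle by `dx` and uploads the result.
// `mapped` stands in for VmaAllocationInfo::pMappedData and `mappedSize`
// for the size of the staging buffer (both are assumptions of this sketch).
inline void OffsetAndUpload(void* mapped, std::size_t mappedSize,
                            std::vector<Particle>& cpuCopy, float dx)
{
    assert(mapped != nullptr);
    assert(cpuCopy.size() * sizeof(Particle) <= mappedSize);

    // Read-modify-write happens on the CPU-side copy in ordinary cached RAM.
    for (Particle& p : cpuCopy)
        p.x += dx;

    // The mapped pointer is only written, sequentially, in a single pass.
    // Avoid patterns such as `static_cast<Particle*>(mapped)[i].x += dx;`,
    // which read from the mapped memory.
    if (!cpuCopy.empty())
        std::memcpy(mapped, cpuCopy.data(), cpuCopy.size() * sizeof(Particle));
}
```

Keeping a CPU-side copy costs extra RAM, but it lets all reads hit cached memory while the staging buffer sees nothing except one linear write.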

Readback

When: Buffers for data written by or transferred from the GPU that you want to read back on the CPU, e.g. results of some computations.

What to do: Use flag VMA_ALLOCATION_CREATE_HOST_ACCESS_RANDOM_BIT. Let the library select the optimal memory type, which will always have VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT and VK_MEMORY_PROPERTY_HOST_CACHED_BIT.

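VkBufferCreateInfo bufCreateInfo = { VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO };
bufCreateInfo.size = 65536;
bufCreateInfo.usage = VK_BUFFER_USAGE_TRANSFER_DST_BIT;
VmaAllocationCreateInfo allocCreateInfo = {};
allocCreateInfo.usage = VMA_MEMORY_USAGE_AUTO;
allocCreateInfo.flags = VMA_ALLOCATION_CREATE_HOST_ACCESS_RANDOM_BIT |
    VMA_ALLOCATION_CREATE_MAPPED_BIT;
VkBuffer buf;
VmaAllocation alloc;
VmaAllocationInfo allocInfo;
vmaCreateBuffer(allocator, &bufCreateInfo, &allocCreateInfo, &buf, &alloc, &allocInfo);
...
const float* downloadedData = (const float*)allocInfo.pMappedData;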
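Once the GPU work that fills the buffer has completed, the results can be copied out of the mapped pointer into CPU-owned storage. The helper below is a minimal sketch of that step; the name CopyReadback is hypothetical, and the parameter mapped stands in for VmaAllocationInfo::pMappedData so that the example compiles without the Vulkan headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Copies `count` floats out of a mapped readback allocation.
// The caller is assumed to have already waited for the GPU to finish
// writing the buffer (for example by waiting on a fence).
inline std::vector<float> CopyReadback(const void* mapped, std::size_t count)
{
    assert(mapped != nullptr || count == 0);
    std::vector<float> result(count);
    if (count != 0)
        std::memcpy(result.data(), mapped, count * sizeof(float));
    return result;
}
```

Copying with memcpy rather than dereferencing a cast pointer keeps the code independent of the alignment of the mapped address, and the returned vector stays valid after the buffer is destroyed.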

Advanced data uploading

For resources that you frequently write on the CPU via a mapped pointer and frequently read on the GPU, e.g. as a uniform buffer (also called "dynamic"), multiple options are possible:

  1. Easiest solution is to have one copy of the resource in HOST_VISIBLE memory, even if it means system RAM (not DEVICE_LOCAL) on systems with a discrete graphics card, and make the device reach out to that resource directly.
    • Reads performed by the device will then go through the PCI Express bus. The performance of this access may be limited, but it may be fine depending on the size of this resource (whether it is small enough to quickly end up in GPU cache) and the sparsity of access.
  2. On systems with unified memory (e.g. AMD APU or Intel integrated graphics, mobile chips), a memory type may be available that is both HOST_VISIBLE (available for mapping) and DEVICE_LOCAL (fast to access from the GPU). Then, it is likely the best choice for such type of resource.
  3. Systems with a discrete graphics card and separate video memory may or may not expose a memory type that is both HOST_VISIBLE and DEVICE_LOCAL, also known as Base Address Register (BAR). If they do, it represents a piece of VRAM (or entire VRAM, if ReBAR is enabled in the motherboard BIOS) that is available to CPU for mapping.
    • Writes performed by the host to that memory go through PCI Express bus. The performance of these writes may be limited, but it may be fine, especially on PCIe 4.0, as long as rules of using uncached and write-combined memory are followed - only sequential writes and no reads.
  4. Finally, you may need or prefer to create a separate copy of the resource in DEVICE_LOCAL memory, a separate "staging" copy in HOST_VISIBLE memory and perform an explicit transfer command between them.
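Whether options 2 and 3 are available on a given device can be checked by scanning its memory types for one that is both DEVICE_LOCAL and HOST_VISIBLE. The sketch below shows that scan in isolation. To stay compilable without the Vulkan headers it uses stand-in constants that carry the same values as VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT and VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT; the function name and the plain vector of flags are assumptions, and in real code the flags would come from VkPhysicalDeviceMemoryProperties::memoryTypes[i].propertyFlags.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-ins with the same values as the constants in vulkan.h.
constexpr std::uint32_t kDeviceLocal = 0x1; // VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
constexpr std::uint32_t kHostVisible = 0x2; // VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT

// Returns the index of the first memory type that is both DEVICE_LOCAL and
// HOST_VISIBLE (unified memory or BAR), or -1 if the device exposes none.
inline int FindHostVisibleDeviceLocalType(const std::vector<std::uint32_t>& typeFlags)
{
    assert(typeFlags.size() <= 32); // Vulkan exposes at most 32 memory types.
    const std::uint32_t wanted = kDeviceLocal | kHostVisible;
    for (std::size_t i = 0; i < typeFlags.size(); ++i)
    {
        if ((typeFlags[i] & wanted) == wanted)
            return static_cast<int>(i);
    }
    return -1;
}
```

A result of -1 means only options 1 and 4 remain. A valid index does not guarantee that an allocation from that type will succeed, because the corresponding heap may be small.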

Thankfully, VMA offers an aid to create and use such resources in the way optimal for the current Vulkan device. To help the library make the best choice, use flag VMA_ALLOCATION_CREATE_HOST_ACCESS_SEQUENTIAL_WRITE_BIT together with VMA_ALLOCATION_CREATE_HOST_ACCESS_ALLOW_TRANSFER_INSTEAD_BIT. It will then prefer a memory type that is both DEVICE_LOCAL and HOST_VISIBLE (integrated memory or BAR), but if no such memory type is available or allocation from it fails (PC graphics cards have only 256 MB of BAR by default, unless ReBAR is supported and enabled in BIOS), it will fall back to DEVICE_LOCAL memory for fast GPU access. It is then up to you to detect that the allocation ended up in a memory type that is not HOST_VISIBLE, in which case you need to create another "staging" allocation and perform explicit transfers.

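VkBufferCreateInfo bufCreateInfo = { VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO };
bufCreateInfo.size = 65536;
bufCreateInfo.usage = VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT;
VmaAllocationCreateInfo allocCreateInfo = {};
allocCreateInfo.usage = VMA_MEMORY_USAGE_AUTO;
allocCreateInfo.flags = VMA_ALLOCATION_CREATE_HOST_ACCESS_SEQUENTIAL_WRITE_BIT |
    VMA_ALLOCATION_CREATE_HOST_ACCESS_ALLOW_TRANSFER_INSTEAD_BIT |
    VMA_ALLOCATION_CREATE_MAPPED_BIT;
VkBuffer buf;
VmaAllocation alloc;
VmaAllocationInfo allocInfo;
vmaCreateBuffer(allocator, &bufCreateInfo, &allocCreateInfo, &buf, &alloc, &allocInfo);
VkMemoryPropertyFlags memPropFlags;
vmaGetAllocationMemoryProperties(allocator, alloc, &memPropFlags);
if(memPropFlags & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT)
{
    // Allocation ended up in a mappable memory and is already mapped - write to it directly.
    // [Executed in runtime]:
    memcpy(allocInfo.pMappedData, myData, myDataSize);
}
else
{
    // Allocation ended up in a non-mappable memory - need to transfer.
    VkBufferCreateInfo stagingBufCreateInfo = { VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO };
    stagingBufCreateInfo.size = 65536;
    stagingBufCreateInfo.usage = VK_BUFFER_USAGE_TRANSFER_SRC_BIT;
    VmaAllocationCreateInfo stagingAllocCreateInfo = {};
    stagingAllocCreateInfo.usage = VMA_MEMORY_USAGE_AUTO;
    stagingAllocCreateInfo.flags = VMA_ALLOCATION_CREATE_HOST_ACCESS_SEQUENTIAL_WRITE_BIT |
        VMA_ALLOCATION_CREATE_MAPPED_BIT;
    VkBuffer stagingBuf;
    VmaAllocation stagingAlloc;
    VmaAllocationInfo stagingAllocInfo;
    vmaCreateBuffer(allocator, &stagingBufCreateInfo, &stagingAllocCreateInfo,
        &stagingBuf, &stagingAlloc, &stagingAllocInfo);
    // [Executed in runtime]:
    memcpy(stagingAllocInfo.pMappedData, myData, myDataSize);
    //vkCmdPipelineBarrier: VK_ACCESS_HOST_WRITE_BIT --> VK_ACCESS_TRANSFER_READ_BIT
    VkBufferCopy bufCopy = {
        0, // srcOffset
        0, // dstOffset
        myDataSize }; // size
    vkCmdCopyBuffer(cmdBuf, stagingBuf, buf, 1, &bufCopy);
}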

Other use cases

Here are some other, less obvious use cases and their recommended settings: