| ================= |
| DirectX Container |
| ================= |
| |
| .. contents:: |
| :local: |
| |
| .. toctree:: |
| :hidden: |
| |
| Overview |
| ======== |
| |
| The DirectX Container (DXContainer) file format is the binary file format for |
| compiled shaders targeting the DirectX runtime. The file format is also called |
| the DXIL Container or DXBC file format. Because the file format can be used to |
| include either DXIL or DXBC compiled shaders, the nomenclature in LLVM is simply |
| DirectX Container. |
| |
| DirectX Container files are read by the compiler and associated tools as well as |
| the DirectX runtime, profiling tools and other users. This document serves as a |
| companion to the implementation in LLVM to more completely document the file |
| format for its many users. |
| |
| Basic Structure |
| =============== |
| |
| A DXContainer file begins with a header, and is then followed by a sequence of |
| "parts", which are analogous to object file sections. Each part contains a part |
| header, and some number of bytes of data after the header in a defined format. |
| |
| DXContainer data structures are encoded little-endian in the binary file. |
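| |
| Because every multi-byte field is little-endian, a portable reader can assemble |
| values one byte at a time instead of casting pointers. The helpers below are a |
| minimal sketch; the names ``read_u16le`` and ``read_u32le`` are illustrative |
| and are not LLVM APIs. |
| |
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers (not LLVM APIs). Each value is assembled one byte at
 * a time, so the result does not depend on the host's endianness or on the
 * alignment of `p`. */
static uint16_t read_u16le(const uint8_t *p) {
  assert(p != NULL);
  return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32le(const uint8_t *p) {
  assert(p != NULL);
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) |
         ((uint32_t)p[3] << 24);
}
```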
| |
| The LLVM versions of all data structures described and/or referenced in this |
| file are defined in |
| `llvm/include/llvm/BinaryFormat/DXContainer.h |
| <https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/BinaryFormat/DXContainer.h>`_. |
| Some pseudo code is provided in blocks below to ease understanding of this |
| document, but reading it with the header available will provide the most |
| clarity. |
| |
| File Header |
| ----------- |
| |
| .. code-block:: c |
| |
| struct Header { |
| uint8_t Magic[4]; |
| uint8_t Digest[16]; |
| uint16_t MajorVersion; |
| uint16_t MinorVersion; |
| uint32_t FileSize; |
| uint32_t PartCount; |
| }; |
| |
| The DXContainer header matches the pseudo-definition above. It begins with a |
| four character code (magic number) with the value ``DXBC`` to denote the file |
| format. |
| |
| The ``Digest`` is a 128-bit hash digest computed with a proprietary algorithm and |
| encoded in the binary by the bytecode validator. |
| |
| The ``MajorVersion`` and ``MinorVersion`` encode the file format version |
| ``1.0``. |
| |
| The remaining fields encode 32-bit unsigned integers for the file size and |
| number of parts. |
| |
| Following the file header is an array of ``PartCount`` 32-bit unsigned integers |
| specifying the offsets of each part header. |
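| |
| The header layout and the offset table can be checked with a few bounds tests. |
| The sketch below is one possible reader, not the LLVM implementation. The |
| helper name ``part_offset`` is hypothetical, and the code assumes that part |
| offsets are measured from the start of the file. |
| |
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { HEADER_SIZE = 32 }; /* 4 + 16 + 2 + 2 + 4 + 4 bytes */

static uint32_t read_u32le(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) |
         ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns the offset of part `index`, or 0 if the
 * header, the index, or the stored offset is invalid. Offset 0 is never a
 * valid part offset because the file header lives there.
 * Assumption: offsets are relative to the start of the file. */
static uint32_t part_offset(const uint8_t *file, size_t size, uint32_t index) {
  assert(file != NULL);
  if (size < HEADER_SIZE || memcmp(file, "DXBC", 4) != 0)
    return 0;
  uint32_t part_count = read_u32le(file + 28); /* PartCount field */
  /* The table of PartCount offsets starts right after the header. */
  if (index >= part_count || part_count > (size - HEADER_SIZE) / 4)
    return 0;
  uint32_t offset = read_u32le(file + HEADER_SIZE + 4 * (size_t)index);
  /* An 8-byte part header must fit at the stored offset. */
  if (offset > size || size - offset < 8)
    return 0;
  return offset;
}
```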
| |
| Part Data |
| --------- |
| |
| .. code-block:: c |
| |
| struct PartHeader { |
| uint8_t Name[4]; |
| uint32_t Size; |
| }; |
| |
| Each part begins with a part header. A part header includes the 4-character part |
| name, and a 32-bit unsigned integer specifying the size of the part data. The |
| part header is followed by ``Size`` bytes of data comprising the part. The |
| format does not explicitly require 32-bit alignment of parts, although LLVM does |
| implement this restriction in the writer code (because it's a good idea). The |
| LLVM object reader code does not assume inputs are correctly aligned to avoid |
| undefined behavior caused by misaligned inputs generated by other compilers. |
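| |
| One way to tolerate misaligned parts is to copy or decode each field from the |
| byte buffer rather than dereferencing a ``struct PartHeader *``. The following |
| is a minimal sketch of that approach; ``read_part_header`` is a hypothetical |
| name and this is not the LLVM reader. |
| |
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct PartHeader {
  uint8_t Name[4];
  uint32_t Size;
};

/* Hypothetical helper: decode a part header found at any byte offset. The
 * fields are copied or assembled byte by byte, which avoids the undefined
 * behavior of a misaligned pointer dereference. Returns 1 on success, 0 if
 * the header or its `Size` bytes of data would run past the buffer. */
static int read_part_header(const uint8_t *file, size_t size, size_t offset,
                            struct PartHeader *out) {
  assert(file != NULL && out != NULL);
  if (offset > size || size - offset < 8)
    return 0;
  memcpy(out->Name, file + offset, 4);
  const uint8_t *s = file + offset + 4;
  out->Size = (uint32_t)s[0] | ((uint32_t)s[1] << 8) |
              ((uint32_t)s[2] << 16) | ((uint32_t)s[3] << 24);
  return out->Size <= size - offset - 8;
}
```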
| |
| Part Formats |
| ============ |
| |
| The part name indicates the format of the part data. There are 24 part names |
| used by DXC and FXC. Not all compiled shaders contain all parts. In the list |
| below, parts generated only by DXC are marked with †, and parts generated only by |
| FXC are marked with \*. |
| |
| #. `DXIL`_† - Stores the DXIL bytecode. |
| #. `HASH`_† - Stores the shader MD5 hash. |
| #. ILDB† - Stores the DXIL bytecode with LLVM Debug Information embedded in the module. |
| #. ILDN† - Stores shader debug name for external debug information. |
| #. `ISG1`_ - Stores the input signature for Shader Model 5.1+. |
| #. ISGN\* - Stores the input signature for Shader Model 4 and earlier. |
| #. `OSG1`_ - Stores the output signature for Shader Model 5.1+. |
| #. OSG5\* - Stores the output signature for Shader Model 5. |
| #. OSGN\* - Stores the output signature for Shader Model 4 and earlier. |
| #. PCSG\* - Stores the patch constant signature for Shader Model 5.1 and earlier. |
| #. PDBI† - Stores PDB information. |
| #. PRIV - Stores arbitrary private data (Not encoded by either FXC or DXC). |
| #. `PSG1`_ - Stores the patch constant signature for Shader Model 6+. |
| #. `PSV0`_ - Stores Pipeline State Validation data. |
| #. RDAT† - Stores Runtime Data. |
| #. RDEF\* - Stores resource definitions. |
| #. `RTS0`_ - Stores compiled root signature. |
| #. `SFI0`_ - Stores shader feature flags. |
| #. SHDR\* - Stores compiled DXBC bytecode. |
| #. SHEX\* - Stores compiled DXBC bytecode. |
| #. DXBC\* - Stores compiled DXBC bytecode. |
| #. SRCI† - Stores shader source information. |
| #. STAT† - Stores shader statistics. |
| #. VERS† - Stores shader compiler version information. |
| |
| DXIL Part |
| --------- |
| .. _DXIL: |
| |
| The DXIL part comprises three data structures: the ``ProgramHeader``, the |
| ``BitcodeHeader`` and the bitcode serialized LLVM 3.7 IR Module. |
| |
| The ``ProgramHeader`` contains the shader model version and pipeline stage |
| enumeration value. This identifies the target profile of the contained shader |
| bitcode. |
| |
| The ``BitcodeHeader`` contains the DXIL version information and refers to the |
| start of the bitcode data. |
| |
| HASH Part |
| --------- |
| .. _HASH: |
| |
| The HASH part contains a 32-bit unsigned integer with the shader hash flags, and |
| a 128-bit MD5 hash digest. The flags field can either have the value ``0`` to |
| indicate no flags, or ``1`` to indicate that the file hash was computed |
| including the source code that produced the binary. |
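| |
| As an illustration, the 20-byte payload described above could be decoded as |
| follows. The struct, the constant, and the function name are hypothetical; only |
| the field widths and the two flag values come from the description above. |
| |
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative name; the value 1 is the flag described in the text. */
enum { HASH_INCLUDES_SOURCE = 1 };

struct ShaderHash {
  uint32_t Flags;
  uint8_t Digest[16];
};

/* Decodes the HASH payload: a 32-bit flags field followed by a 128-bit
 * digest. Returns 1 on success, 0 if the payload is too short or the flags
 * field holds a value other than 0 or 1. */
static int parse_hash_part(const uint8_t *data, size_t size,
                           struct ShaderHash *out) {
  assert(data != NULL && out != NULL);
  if (size < 20)
    return 0;
  out->Flags = (uint32_t)data[0] | ((uint32_t)data[1] << 8) |
               ((uint32_t)data[2] << 16) | ((uint32_t)data[3] << 24);
  memcpy(out->Digest, data + 4, sizeof out->Digest);
  return out->Flags <= HASH_INCLUDES_SOURCE;
}
```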
| |
| Program Signature (SG1) Parts |
| ----------------------------- |
| .. _ISG1: |
| .. _OSG1: |
| .. _PSG1: |
| |
| .. code-block:: c |
| |
| struct ProgramSignatureHeader { |
| uint32_t ParamCount; |
| uint32_t FirstParamOffset; |
| }; |
| |
| The program signature parts (ISG1, OSG1, & PSG1) all use the same data |
| structures to encode inputs, outputs and patch information. The |
| ``ProgramSignatureHeader`` includes two 32-bit unsigned integers to specify the |
| number of signature parameters and the offset of the first parameter. |
| |
| Beginning at ``FirstParamOffset`` bytes from the start of the |
| ``ProgramSignatureHeader``, ``ParamCount`` ``ProgramSignatureElement`` |
| structures are written. Following the ``ProgramSignatureElements`` is a string |
| table of null-terminated strings padded to 32-bit (4-byte) alignment. This string table |
| matches the DWARF string table format as implemented by LLVM. |
| |
| Each ``ProgramSignatureElement`` encodes a ``NameOffset`` value which specifies |
| the offset into the string table. A value of ``0`` denotes no name. The offsets |
| encoded here are from the beginning of the ``ProgramSignatureHeader``, not the |
| beginning of the string table. |
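| |
| Since ``NameOffset`` is relative to the ``ProgramSignatureHeader``, a name can |
| be resolved directly against the start of the part data. The sketch below shows |
| one way to do this with bounds checks; ``signature_name`` is a hypothetical |
| helper, not an LLVM function. |
| |
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper. `part` points at the ProgramSignatureHeader and
 * `size` is the size of the part data. Returns the name stored at
 * `name_offset`, or NULL when the offset is 0 (no name), lies outside the
 * part, or the string is not null terminated within the part. */
static const char *signature_name(const uint8_t *part, size_t size,
                                  uint32_t name_offset) {
  assert(part != NULL);
  if (name_offset == 0 || name_offset >= size)
    return NULL;
  if (memchr(part + name_offset, '\0', size - name_offset) == NULL)
    return NULL;
  return (const char *)(part + name_offset);
}
```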
| |
| The ``ProgramSignatureElement`` contains several enumeration fields which are |
| defined in `llvm/include/llvm/BinaryFormat/DXContainerConstants.def <https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/BinaryFormat/DXContainerConstants.def>`_. |
| These fields encode the D3D system value, the type of data and its precision |
| requirements. |
| |
| PSV0 Part |
| --------- |
| .. _PSV0: |
| |
| The Pipeline State Validation data encodes versioned runtime information |
| structures. These structures use a scheme where in lieu of encoding a version |
| number, they encode the size of the structure and each new version of the |
| structure is additive. This allows readers to infer the version of the structure |
| by comparing the encoded size with the size of known structures. If the encoded |
| size is larger than any known structure, the largest known structure can validly |
| parse the data represented in the known structure. |
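| |
| The size-based versioning rule can be sketched as a lookup over the sizes a |
| reader knows about. The function below is illustrative only: ``infer_version`` |
| is a hypothetical name, and the caller supplies the list of known sizes. |
| |
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper. `known_sizes` lists the structure size of each known
 * version in ascending order (index 0 is version 0). Returns the index of
 * the newest known version that fits in `encoded_size`, or -1 if even
 * version 0 does not fit. */
static int infer_version(uint32_t encoded_size, const uint32_t *known_sizes,
                         size_t count) {
  assert(known_sizes != NULL);
  int version = -1;
  for (size_t i = 0; i < count && known_sizes[i] <= encoded_size; ++i)
    version = (int)i;
  return version;
}
```
| |
| An encoded size larger than every known entry selects the newest known |
| version, whose fields form a valid prefix of the encoded data. |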
| |
| In LLVM we represent the versions of the associated data structures with |
| versioned namespaces under the ``llvm::dxbc::PSV`` namespace (e.g. ``v0``, |
| ``v1``). Each structure in the ``v0`` namespace is the base version, the |
| structures in the ``v1`` namespace inherit from the ``v0`` namespace, and the |
| ``v2`` structures inherit from the ``v1`` structures, and so on. |
| |
| The high-level structure of the PSV data is: |
| |
| #. ``RuntimeInfo`` structure |
| #. Resource bindings |
| #. Signature elements |
| #. Mask Vectors (Output, Input, InputPatch, PatchOutput) |
| |
| Immediately following the part header for the PSV0 part is a 32-bit unsigned |
| integer specifying the size of the ``RuntimeInfo`` structure that follows. |
| |
| Immediately following the ``RuntimeInfo`` structure is a 32-bit unsigned integer |
| specifying the number of resource bindings. If the number of resources is |
| greater than zero, another unsigned 32-bit integer follows to specify the size |
| of the ``ResourceBindInfo`` structure. This is followed by the specified number |
| of structures of the specified size (from which the version of the structure |
| can be inferred). |
| |
| For version 0 of the data this ends the part data. |
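| |
| The layout described so far (size, ``RuntimeInfo``, resource count, optional |
| element size, resource bindings) can be walked with a small cursor. This is a |
| sketch under the stated layout, not the LLVM parser; ``psv_resources_end`` is a |
| hypothetical name. |
| |
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t read_u32le(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) |
         ((uint32_t)p[3] << 24);
}

/* Hypothetical helper. `psv` points at the PSV0 part data. Returns the
 * offset just past the resource bindings, or 0 if the data is truncated. */
static size_t psv_resources_end(const uint8_t *psv, size_t size) {
  assert(psv != NULL);
  if (size < 4)
    return 0;
  /* Size of RuntimeInfo, then the RuntimeInfo bytes themselves. */
  uint64_t pos = 4 + (uint64_t)read_u32le(psv);
  if (pos + 4 > size)
    return 0;
  uint32_t resource_count = read_u32le(psv + pos);
  pos += 4;
  if (resource_count > 0) {
    /* The element size is only present when there are resources. */
    if (pos + 4 > size)
      return 0;
    uint32_t bind_size = read_u32le(psv + pos);
    pos += 4;
    uint64_t bytes = (uint64_t)resource_count * bind_size;
    if (bytes > size - pos)
      return 0;
    pos += bytes;
  }
  return (size_t)pos;
}
```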
| |
| PSV0 Signature Elements |
| ~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| The signature elements are conceptually a single unit, but the data is encoded |
| in three different blocks. The first block is a string table, the second block |
| is an index table, and the third block is the elements themselves, which in turn |
| are separated by input, output and patch constant or primitive elements. |
| |
| Signature elements capture much of the same data captured in the :ref:`SG1 |
| <ISG1>` parts. The use of an index table allows de-duplication of data for a more |
| compact final representation. |
| |
| The string table begins with a 32-bit unsigned integer specifying the table |
| size. This string table uses the DXContainer format as implemented in LLVM. This |
| format prefixes the string table with a null byte so that offset ``0`` is a null |
| string, and pads to 32-bit (4-byte) alignment. |
| |
| The index table begins with a 32-bit unsigned integer specifying the size of the |
| table, and is followed by that many 32-bit unsigned integers representing the |
| table. The index table may or may not deduplicate repeated sequences (both DXC |
| and Clang do). The indices signify the indices in the flattened aggregate |
| representation which the signature element describes. A single semantic may have |
| more than one entry in this table to denote the different attributes of its |
| members. |
| |
| For example given the following code: |
| |
| .. code-block:: c |
| |
| struct VSOut_1 |
| { |
| float4 f3 : VOUT2; |
| float3 f4 : VOUT3; |
| }; |
| |
| |
| struct VSOut |
| { |
| float4 f1 : VOUT0; |
| float2 f2[4] : VOUT1; |
| VSOut_1 s; |
| int4 f5 : VOUT4; |
| }; |
| |
| void main(out VSOut o1 : A) { |
| } |
| |
| The semantic ``A`` gets expanded into 5 output signature elements. Those |
| elements are: |
| |
| .. note:: |
| |
| In the example below, it is a coincidence that the rows match the indices. In |
| more complicated examples with multiple semantics this is not the case. |
| |
| #. Index 0 starts at row 0, contains 4 columns, and is float32. This represents |
| ``f1`` in the source. |
| #. Indices 1, 2, 3, and 4 start at row 1, contain two columns, and are float32. |
| This represents ``f2`` in the source, and it spreads across rows 1 through 4. |
| #. Index 5 starts at row 5, contains 4 columns, and is float32. This represents |
| ``f3`` in the source. |
| #. Index 6 starts at row 6, contains 3 columns, and is float32. This represents |
| ``f4``. |
| #. Index 7 starts at row 7, contains 4 columns, and is a signed 32-bit integer. |
| This represents ``f5`` in the source. |
| |
| The LLVM ``obj2yaml`` tool can parse this data out of the PSV and present it in |
| human readable YAML. For the example above it produces the output: |
| |
| .. code-block:: YAML |
| |
| SigOutputElements: |
| - Name: A |
| Indices: [ 0 ] |
| StartRow: 0 |
| Cols: 4 |
| StartCol: 0 |
| Allocated: true |
| Kind: Arbitrary |
| ComponentType: Float32 |
| Interpolation: Linear |
| DynamicMask: 0x0 |
| Stream: 0 |
| - Name: A |
| Indices: [ 1, 2, 3, 4 ] |
| StartRow: 1 |
| Cols: 2 |
| StartCol: 0 |
| Allocated: true |
| Kind: Arbitrary |
| ComponentType: Float32 |
| Interpolation: Linear |
| DynamicMask: 0x0 |
| Stream: 0 |
| - Name: A |
| Indices: [ 5 ] |
| StartRow: 5 |
| Cols: 4 |
| StartCol: 0 |
| Allocated: true |
| Kind: Arbitrary |
| ComponentType: Float32 |
| Interpolation: Linear |
| DynamicMask: 0x0 |
| Stream: 0 |
| - Name: A |
| Indices: [ 6 ] |
| StartRow: 6 |
| Cols: 3 |
| StartCol: 0 |
| Allocated: true |
| Kind: Arbitrary |
| ComponentType: Float32 |
| Interpolation: Linear |
| DynamicMask: 0x0 |
| Stream: 0 |
| - Name: A |
| Indices: [ 7 ] |
| StartRow: 7 |
| Cols: 4 |
| StartCol: 0 |
| Allocated: true |
| Kind: Arbitrary |
| ComponentType: SInt32 |
| Interpolation: Constant |
| DynamicMask: 0x0 |
| Stream: 0 |
| |
| The number of signature elements of each type is encoded in the |
| ``llvm::dxbc::PSV::v1::RuntimeInfo`` structure. If any of the element count |
| values are non-zero, the size of the ``ProgramSignatureElement`` structure is |
| encoded next to allow versioning of that structure. Today there is only one |
| version. Following the size field is the specified number of signature elements |
| in the order input, output, then patch constant or primitive. |
| |
| Following the signature elements is a sequence of mask vectors encoded as a |
| series of 32-bit integers. Each 32-bit integer in the mask encodes values for 8 |
| input/output/patch or primitive elements. The mask vector is filled from least |
| significant bit to most significant bit with each added element shifting the |
| previous elements left. A reader needs to consult the total number of vectors |
| encoded in the ``RuntimeInfo`` structure to know how to read the mask vector. |
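| |
| The mask sizing rule (eight elements per 32-bit integer) can be sketched as |
| shown below. The function names are hypothetical, and the bit numbering in |
| ``mask_bit`` (four consecutive bits per element, one per column) is an |
| assumption inferred from the eight-elements-per-integer packing. |
| |
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of 32-bit integers needed for a mask covering `vectors` elements:
 * 8 elements per integer, rounded up. */
static uint32_t mask_dwords(uint32_t vectors) {
  return vectors / 8u + (vectors % 8u != 0u);
}

/* Assumed bit layout: element `vector`, column `column` (0..3) maps to bit
 * vector * 4 + column, counted from the least significant bit of the first
 * integer. Returns 1 if that bit is set, 0 otherwise. */
static int mask_bit(const uint32_t *mask, uint32_t vector, uint32_t column) {
  assert(mask != NULL && column < 4u);
  uint32_t bit = vector * 4u + column;
  return (int)((mask[bit / 32u] >> (bit % 32u)) & 1u);
}
```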
| |
| If the shader has ``UsesViewID`` enabled in the ``RuntimeInfo`` an output mask |
| vector will be included. The output mask vector is four arrays of 32-bit |
| unsigned integers. Each of the four arrays corresponds to an output stream. |
| Geometry shaders have a maximum of four output streams, all other shader stages |
| only support one output stream. Each bit in the mask vector identifies one |
| column of an output from the output signature that depends on the ViewID. |
| |
| If the shader has ``UsesViewID`` enabled, it is a hull shader, and it has patch |
| constant or primitive vector elements, a patch constant or primitive vector mask |
| will be included. It is identical in structure to the output mask vector. Each |
| bit in the mask vector identifies one column of a patch constant output which |
| depends on the ViewID. |
| |
| The next series of mask vectors are similar in structure to the output mask |
| vector, but they contain an extra dimension. |
| |
| The output/input map is encoded next if the shader has inputs and outputs. The |
| output/input mask encodes which outputs are impacted by each column of each |
| input. The size for each mask vector is the size of the output mask vector * the |
| number of inputs * 4 (for each component). Each bit in the mask vector |
| identifies one column of an output and a column of an input. A value of 1 means |
| the output is impacted by the input. |
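| |
| The size formula for the output/input map can be written out as below. This is |
| a sketch: the function names are hypothetical, and the storage order assumed by |
| ``outputs_for_input`` (masks stored back to back, ordered by input and then by |
| column) is an assumption rather than something stated in this document. |
| |
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit integers per mask: 8 elements per integer, rounded up. */
static uint32_t mask_dwords(uint32_t vectors) {
  return vectors / 8u + (vectors % 8u != 0u);
}

/* Total 32-bit integers in the output/input map: one output mask for each
 * of the four columns of every input. */
static uint32_t io_map_dwords(uint32_t input_vectors,
                              uint32_t output_vectors) {
  return mask_dwords(output_vectors) * input_vectors * 4u;
}

/* Assumption: per-column masks are stored back to back, ordered by input
 * index and then by column. Returns the output mask for one input column. */
static const uint32_t *outputs_for_input(const uint32_t *map,
                                         uint32_t output_vectors,
                                         uint32_t input, uint32_t column) {
  assert(map != NULL && column < 4u);
  return map + (size_t)mask_dwords(output_vectors) * (input * 4u + column);
}
```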
| |
| If the shader is a hull shader, and it has inputs and patch outputs, an input to |
| patch map will be included next. This is identical in structure to the |
| output/input map. The dimensions are defined by the size of the patch constant |
| or primitive vector mask * the number of inputs * 4 (for each component). Each |
| bit in the mask vector identifies one column of a patch constant output and a |
| column of an input. A value of 1 means the output is impacted by the input. |
| |
| If the shader is a domain shader, and it has outputs and patch outputs, an |
| output patch map will be included next. This is identical in structure to the |
| output/input map. The dimensions are defined by the size of the patch constant |
| or primitive vector mask * the number of outputs * 4 (for each component). Each |
| bit in the mask vector identifies one column of a patch constant input and a |
| column of an output. A value of 1 means the output is impacted by the primitive |
| input. |
| |
| Root Signature (RTS0) Part |
| -------------------------- |
| .. _RTS0: |
| |
| The Root Signature data defines the shader's resource interface with Direct3D |
| 12, specifying what resources the shader needs to access and how they're |
| organized and bound to the pipeline. |
| |
| The RTS0 part comprises three data structures: ``RootSignatureHeader``, |
| ``RootParameters`` and ``StaticSamplers``. The details of each will be described |
| in the following sections. All ``RootParameters`` will be serialized following |
| the order they were defined in the metadata representation. |
| |
| The table below summarizes the data being serialized as well as its size. The |
| details of each part are discussed in the following sections of this document. |
| |
| ================================================== ============= =========================== |
| Part Name                                          Size In Bytes Maximum number of Instances |
| ================================================== ============= =========================== |
| Root Signature Header                              24            1 |
| Root Parameter Headers                             12            Many |
| Root Parameter: Root Constants                     12            Many |
| Root Parameter: Root Descriptor Version 1.0        8             Many |
| Root Parameter: Root Descriptor Version 1.1        12            Many |
| Root Parameter: Descriptor Table Range Version 1.0 20            Many |
| Root Parameter: Descriptor Table Range Version 1.1 24            Many |
| Static Samplers                                    52            Many |
| ================================================== ============= =========================== |
| |
| |
| Root Signature Header |
| ~~~~~~~~~~~~~~~~~~~~~ |
| |
| The root signature header is 24 bytes long, consisting of six 32-bit values |
| representing the version, number and offset of parameters, number and offset |
| of static samplers, and a flags field for global behaviours: |
| |
| .. code-block:: c |
| |
| struct RootSignatureHeader { |
| uint32_t Version; |
| uint32_t NumParameters; |
| uint32_t ParametersOffset; |
| uint32_t NumStaticSamplers; |
| uint32_t StaticSamplerOffset; |
| uint32_t Flags; |
| }; |
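| |
| A reader might decode and sanity-check the header as sketched below. The |
| function names are hypothetical. The code assumes both offsets are measured |
| from the start of the RTS0 part data, and it uses the 12-byte parameter header |
| and 52-byte static sampler sizes given in this document. |
| |
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct RootSignatureHeader {
  uint32_t Version;
  uint32_t NumParameters;
  uint32_t ParametersOffset;
  uint32_t NumStaticSamplers;
  uint32_t StaticSamplerOffset;
  uint32_t Flags;
};

static uint32_t read_u32le(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) |
         ((uint32_t)p[3] << 24);
}

/* True if `count` elements of `elem` bytes starting at `offset` fit in
 * `size` bytes. The product is computed in 64 bits to avoid overflow. */
static int range_fits(uint32_t offset, uint32_t count, uint32_t elem,
                      size_t size) {
  return offset <= size && (uint64_t)count * elem <= size - offset;
}

/* Hypothetical helper: decodes the 24-byte header and checks that both
 * arrays it points at lie inside the part. Returns 1 on success. */
static int parse_root_signature_header(const uint8_t *rts0, size_t size,
                                       struct RootSignatureHeader *out) {
  assert(rts0 != NULL && out != NULL);
  if (size < 24)
    return 0;
  out->Version = read_u32le(rts0);
  out->NumParameters = read_u32le(rts0 + 4);
  out->ParametersOffset = read_u32le(rts0 + 8);
  out->NumStaticSamplers = read_u32le(rts0 + 12);
  out->StaticSamplerOffset = read_u32le(rts0 + 16);
  out->Flags = read_u32le(rts0 + 20);
  return range_fits(out->ParametersOffset, out->NumParameters, 12, size) &&
         range_fits(out->StaticSamplerOffset, out->NumStaticSamplers, 52,
                    size);
}
```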
| |
| |
| Root Parameters |
| ~~~~~~~~~~~~~~~ |
| |
| Root parameters define how resources are bound to the shader pipeline. Each |
| parameter type has a different size and set of fields. |
| |
| The root parameter data is preceded by a variable-size section containing one |
| header per parameter. Each header is 12 bytes long and is composed of three |
| 32-bit values: the parameter type, a flag encoding the pipeline stages where |
| the data is visible, and the offset of the parameter data, calculated from the |
| start of the RTS0 part. |
| |
| .. code-block:: c |
| |
| struct RootParameterHeader { |
| uint32_t ParameterType; |
| uint32_t ShaderVisibility; |
| uint32_t ParameterOffset; |
| }; |
| |
| After the headers have been serialized, the data for each root parameter is |
| laid out in a single contiguous blob. Each parameter can be fetched from the |
| blob using the offset stored in its header. |
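| |
| Locating one parameter's data from its header could look like the sketch |
| below. ``root_parameter_data`` is a hypothetical helper; the caller is expected |
| to have checked ``index`` against ``NumParameters`` beforehand. |
| |
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct RootParameterHeader {
  uint32_t ParameterType;
  uint32_t ShaderVisibility;
  uint32_t ParameterOffset;
};

static uint32_t read_u32le(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) |
         ((uint32_t)p[3] << 24);
}

/* Hypothetical helper. Reads header `index` from the header array located
 * `headers_offset` bytes into the RTS0 part, stores it in `out`, and
 * returns a pointer to that parameter's data. Returns NULL if the header or
 * the stored offset falls outside the part. */
static const uint8_t *root_parameter_data(const uint8_t *rts0, size_t size,
                                          uint32_t headers_offset,
                                          uint32_t index,
                                          struct RootParameterHeader *out) {
  assert(rts0 != NULL && out != NULL);
  uint64_t pos = (uint64_t)headers_offset + (uint64_t)index * 12u;
  if (pos > size || size - pos < 12)
    return NULL;
  const uint8_t *h = rts0 + pos;
  out->ParameterType = read_u32le(h);
  out->ShaderVisibility = read_u32le(h + 4);
  out->ParameterOffset = read_u32le(h + 8);
  /* ParameterOffset is relative to the start of the RTS0 part. */
  if (out->ParameterOffset >= size)
    return NULL;
  return rts0 + out->ParameterOffset;
}
```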
| |
| The following sections describe each of the root parameter types and their |
| encodings. |
| |
| Root Constants |
| '''''''''''''' |
| |
| The root constants are inline 32-bit values that show up in the shader |
| as a constant buffer. The structure is 12 bytes long: two 32-bit values |
| encode the register and space the constants are assigned to, and |
| the last 32 bits encode the number of constants defined in the buffer. |
| |
| .. code-block:: c |
| |
| struct RootConstants { |
| uint32_t Register; |
| uint32_t Space; |
| uint32_t NumOfConstants; |
| }; |
| |
| Root Descriptor |
| ''''''''''''''' |
| |
| Root descriptors provide direct GPU memory addresses to resources. |
| |
| In version 1.0, the root descriptor is 8 bytes. It encodes the register and |
| space as two 32-bit values. |
| |
| In version 1.1, the root descriptor is 12 bytes. It matches the 1.0 descriptor |
| but adds a 32-bit access flag. |
| |
| .. code-block:: c |
| |
| struct RootDescriptor_V1_0 { |
| uint32_t ShaderRegister; |
| uint32_t RegisterSpace; |
| }; |
| |
| struct RootDescriptor_V1_1 { |
| uint32_t ShaderRegister; |
| uint32_t RegisterSpace; |
| uint32_t Flags; |
| }; |
| |
| Root Descriptor Table |
| ''''''''''''''''''''' |
| |
| Descriptor tables let shaders access multiple resources through a single pointer |
| to a descriptor heap. |
| |
| The tables are made of a collection of descriptor ranges. In version 1.0, the |
| descriptor range is 20 bytes, containing five 32-bit values. It encodes a range |
| of registers, including the register type, range length, base register number and |
| register space, and the offset locating the range inside the table. |
| |
| In version 1.1, the descriptor range is 24 bytes. It matches the 1.0 descriptor |
| range but adds a 32-bit access flag. |
| |
| .. code-block:: c |
| |
| struct DescriptorRange_V1_0 { |
| uint32_t RangeType; |
| uint32_t NumDescriptors; |
| uint32_t BaseShaderRegister; |
| uint32_t RegisterSpace; |
| uint32_t OffsetInDescriptorsFromTableStart; |
| }; |
| |
| struct DescriptorRange_V1_1 { |
| uint32_t RangeType; |
| uint32_t NumDescriptors; |
| uint32_t BaseShaderRegister; |
| uint32_t RegisterSpace; |
| uint32_t OffsetInDescriptorsFromTableStart; |
| uint32_t Flags; |
| }; |
| |
| Static Samplers |
| ~~~~~~~~~~~~~~~ |
| |
| Static samplers are predefined filtering settings built into the root signature, |
| avoiding descriptor heap lookups. |
| |
| This section also has a variable size, since it can contain multiple static |
| sampler definitions. However, each definition is a fixed-size struct |
| containing 13 32-bit fields of various enum, float, and integer values. |
| |
| .. code-block:: c |
| |
| struct StaticSamplerDesc { |
| FilterMode Filter; |
| TextureAddressMode AddressU; |
| TextureAddressMode AddressV; |
| TextureAddressMode AddressW; |
| float MipLODBias; |
| uint32_t MaxAnisotropy; |
| ComparisonFunc ComparisonFunc; |
| StaticBorderColor BorderColor; |
| float MinLOD; |
| float MaxLOD; |
| uint32_t ShaderRegister; |
| uint32_t RegisterSpace; |
| ShaderVisibility ShaderVisibility; |
| }; |
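| |
| With 13 four-byte fields, each sampler occupies 52 bytes, so individual fields |
| sit at fixed offsets. The sketch below indexes into a sampler array and decodes |
| two fields. All names are hypothetical, and the float decoding assumes the host |
| ``float`` is a 32-bit IEEE-754 value. |
| |
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
  STATIC_SAMPLER_SIZE = 52,   /* 13 fields * 4 bytes */
  MIP_LOD_BIAS_OFFSET = 16,   /* fifth field */
  SHADER_REGISTER_OFFSET = 40 /* eleventh field */
};

_Static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");

static uint32_t read_u32le(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) |
         ((uint32_t)p[3] << 24);
}

/* Decodes a little-endian single-precision value by copying its bits into
 * a host float (assumes the host float is IEEE-754 binary32). */
static float read_f32le(const uint8_t *p) {
  uint32_t bits = read_u32le(p);
  float value;
  memcpy(&value, &bits, sizeof value);
  return value;
}

/* Hypothetical helper: returns sampler `index` from an array occupying
 * `size` bytes, or NULL if the index is out of range. */
static const uint8_t *static_sampler_at(const uint8_t *samplers, size_t size,
                                        uint32_t index) {
  assert(samplers != NULL);
  if (index >= size / STATIC_SAMPLER_SIZE)
    return NULL;
  return samplers + (size_t)index * STATIC_SAMPLER_SIZE;
}
```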
| |
| SFI0 Part |
| --------- |
| .. _SFI0: |
| |
| The SFI0 part encodes a 64-bit unsigned integer bitmask of the feature flags. |
| This denotes which optional features the shader requires. The flag values are |
| defined in `llvm/include/llvm/BinaryFormat/DXContainerConstants.def <https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/BinaryFormat/DXContainerConstants.def>`_. |
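| |
| Reading the bitmask is a matter of decoding one little-endian 64-bit value and |
| testing bits. The sketch below uses hypothetical helper names and takes a bit |
| position as a parameter; the actual flag assignments are those in the ``.def`` |
| file referenced above. |
| |
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decodes a little-endian 64-bit value one byte at a time, starting from
 * the most significant byte at p[7]. */
static uint64_t read_u64le(const uint8_t *p) {
  assert(p != NULL);
  uint64_t value = 0;
  for (int i = 7; i >= 0; --i)
    value = (value << 8) | p[i];
  return value;
}

/* Hypothetical helper: returns 1 if bit `bit` (0 = least significant) of
 * the feature mask is set, 0 otherwise. */
static int has_feature(uint64_t flags, unsigned bit) {
  assert(bit < 64u);
  return (int)((flags >> bit) & 1u);
}
```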