| ========================= |
| AMDGPU Instruction Syntax |
| ========================= |
| |
| .. contents:: |
| :local: |
| |
| .. _amdgpu_syn_instructions: |
| |
| Instructions |
| ============ |
| |
| Syntax |
| ~~~~~~ |
| |
| An instruction has the following syntax: |
| |
| ``<``\ *opcode mnemonic*\ ``> <``\ *operand0*\ ``>, <``\ *operand1*\ ``>,... <``\ *modifier0*\ ``> <``\ *modifier1*\ ``>...`` |
| |
| :doc:`Operands<AMDGPUOperandSyntax>` are normally comma-separated while |
| :doc:`modifiers<AMDGPUModifierSyntax>` are space-separated. |
| |
| The order of *operands* and *modifiers* is fixed. |
| Most *modifiers* are optional and may be omitted. |
| |
| .. _amdgpu_syn_instruction_mnemo: |
| |
| Opcode Mnemonic |
| ~~~~~~~~~~~~~~~ |
| |
| Opcode mnemonic describes opcode semantics and may include one or more suffices in this order: |
| |
| * :ref:`Packing suffix<amdgpu_syn_instruction_pk>`. |
| * :ref:`Destination operand type suffix<amdgpu_syn_instruction_type>`. |
| * :ref:`Source operand type suffix<amdgpu_syn_instruction_type>`. |
| * :ref:`Encoding suffix<amdgpu_syn_instruction_enc>`. |
| |
| .. _amdgpu_syn_instruction_pk: |
| |
| Packing Suffix |
| ~~~~~~~~~~~~~~ |
| |
| Most instructions which operate on packed data have a *_pk* suffix. |
| Unless otherwise :ref:`noted<amdgpu_syn_instruction_operand_tags>`, |
| these instructions operate on and produce packed data composed of |
| two values. The type of values is indicated by |
| :ref:`type suffices<amdgpu_syn_instruction_type>`. |
| |
| For example, the following instruction sums up two pairs of f16 values |
| and produces a pair of f16 values: |
| |
| .. parsed-literal:: |
| |
| v_pk_add_f16 v1, v2, v3 // Each operand has f16x2 type |
| |
| .. _amdgpu_syn_instruction_type: |
| |
| Type and Size Suffices |
| ~~~~~~~~~~~~~~~~~~~~~~ |
| |
| Instructions which operate with data have an implied type of *data* operands. |
| This data type is specified as a suffix of instruction mnemonic. |
| |
| There are instructions which have 2 type suffices: |
| the first is the data type of the destination operand, |
| the second is the data type of source *data* operand(s). |
| |
| Note that data type specified by an instruction does not apply |
| to other kinds of operands such as *addresses*, *offsets* and so on. |
| |
| The following table enumerates the most frequently used type suffices. |
| |
| ============================================ ======================= ============================ |
| Type Suffices Packed instruction? Data Type |
| ============================================ ======================= ============================ |
| _b512, _b256, _b128, _b64, _b32, _b16, _b8 No Bits. |
| _u64, _u32, _u16, _u8 No Unsigned integer. |
| _i64, _i32, _i16, _i8 No Signed integer. |
| _f64, _f32, _f16 No Floating-point. |
| _b16, _u16, _i16, _f16 Yes Packed (b16x2, u16x2, etc). |
| ============================================ ======================= ============================ |
| |
| Instructions which have no type suffices are assumed to operate with typeless data. |
| The size of data is specified by size suffices: |
| |
| ================= =================== ===================================== |
| Size Suffix Implied data type Required register size in dwords |
| ================= =================== ===================================== |
| \- b32 1 |
| x2 b64 2 |
| x3 b96 3 |
| x4 b128 4 |
| x8 b256 8 |
| x16 b512 16 |
| x b32 1 |
| xy b64 2 |
| xyz b96 3 |
| xyzw b128 4 |
| d16_x b16 1 |
| d16_xy b16x2 2 for GFX8.0, 1 for GFX8.1 and GFX9+ |
| d16_xyz b16x3 3 for GFX8.0, 2 for GFX8.1 and GFX9+ |
| d16_xyzw b16x4 4 for GFX8.0, 2 for GFX8.1 and GFX9+ |
| ================= =================== ===================================== |
| |
| .. WARNING:: |
| There are exceptions from rules described above. |
| Operands which have type different from type specified by the opcode are |
| :ref:`tagged<amdgpu_syn_instruction_operand_tags>` in the description. |
| |
| Examples of instructions with different types of source and destination operands: |
| |
| .. parsed-literal:: |
| |
| s_bcnt0_i32_b64 |
| v_cvt_f32_u32 |
| |
| Examples of instructions with one data type: |
| |
| .. parsed-literal:: |
| |
| v_max3_f32 |
| v_max3_i16 |
| |
| Examples of instructions which operate with packed data: |
| |
| .. parsed-literal:: |
| |
| v_pk_add_u16 |
| v_pk_add_i16 |
| v_pk_add_f16 |
| |
| Examples of typeless instructions which operate on b128 data: |
| |
| .. parsed-literal:: |
| |
| buffer_store_dwordx4 |
| flat_load_dwordx4 |
| |
| .. _amdgpu_syn_instruction_enc: |
| |
| Encoding Suffices |
| ~~~~~~~~~~~~~~~~~ |
| |
| Most *VOP1*, *VOP2* and *VOPC* instructions have several variants: |
| they may also be encoded in *VOP3*, *DPP* and *SDWA* formats. |
| |
| The assembler will automatically use optimal encoding based on instruction operands. |
| To force specific encoding, one can add a suffix to the opcode of the instruction: |
| |
| =================================================== ================= |
| Encoding Encoding Suffix |
| =================================================== ================= |
| *VOP1*, *VOP2* and *VOPC* (32-bit) encoding _e32 |
| *VOP3* (64-bit) encoding _e64 |
| *DPP* encoding _dpp |
| *SDWA* encoding _sdwa |
| =================================================== ================= |
| |
| These suffices are used in this reference to indicate the assumed encoding. |
| When no suffix is specified, native instruction encoding is implied. |
| |
| Operands |
| ======== |
| |
| Syntax |
| ~~~~~~ |
| |
| Syntax of generic operands is described :doc:`in this document<AMDGPUOperandSyntax>`. |
| |
| For detailed information about operands follow *operand links* in GPU-specific documents: |
| |
| * :doc:`GFX7<AMDGPU/AMDGPUAsmGFX7>` |
| * :doc:`GFX8<AMDGPU/AMDGPUAsmGFX8>` |
| * :doc:`GFX9<AMDGPU/AMDGPUAsmGFX9>` |
| * :doc:`GFX10<AMDGPU/AMDGPUAsmGFX10>` |
| |
| Modifiers |
| ========= |
| |
| Syntax |
| ~~~~~~ |
| |
| Syntax of modifiers is described :doc:`in this document<AMDGPUModifierSyntax>`. |
| |
| Information about modifiers supported for individual instructions may be found in GPU-specific documents: |
| |
| * :doc:`GFX7<AMDGPU/AMDGPUAsmGFX7>` |
| * :doc:`GFX8<AMDGPU/AMDGPUAsmGFX8>` |
| * :doc:`GFX9<AMDGPU/AMDGPUAsmGFX9>` |
| * :doc:`GFX10<AMDGPU/AMDGPUAsmGFX10>` |
| |