| # SPIR-V Dialect to LLVM Dialect conversion manual |
| |
| This manual describes the conversion from [SPIR-V Dialect](Dialects/SPIR-V.md) |
| to [LLVM Dialect](Dialects/LLVM.md). It assumes familiarity with both, and |
| describes the design choices behind the modelling of SPIR-V concepts in LLVM |
| Dialect. The conversion is a work in progress and is expected to grow as more |
| features are implemented. |
| |
| Conversion can be performed by invoking an appropriate conversion pass: |
| |
| ```shell |
| mlir-opt -convert-spirv-to-llvm <filename.mlir> |
| ``` |
| |
| This pass performs type and operation conversions for SPIR-V operations as |
| described in this document. |
| |
| [TOC] |
| |
| ## Type Conversion |
| |
| This section describes how SPIR-V Dialect types are mapped to LLVM Dialect. |
| |
| ### Scalar types |
| |
| SPIR-V Dialect | LLVM Dialect |
| :------------: | :-----------------: |
| `i<bitwidth>` | `i<bitwidth>` |
| `si<bitwidth>` | `i<bitwidth>` |
| `ui<bitwidth>` | `i<bitwidth>` |
| `f16` | `f16` |
| `f32` | `f32` |
| `f64` | `f64` |
| |
| ### Vector types |
| |
| SPIR-V Dialect | LLVM Dialect |
| :-------------------------------: | :-------------------------------: |
| `vector<<count> x <scalar-type>>` | `vector<<count> x <scalar-type>>` |
| |
| ### Pointer types |
| |
| A SPIR-V pointer also takes a Storage Class. At the moment, conversion does |
| **not** take it into account. |
| |
| SPIR-V Dialect | LLVM Dialect |
| :-------------------------------------------: | :-------------------------: |
| `!spv.ptr< <element-type>, <storage-class> >` | `!llvm.ptr<<element-type>>` |
| |
| ### Array types |
| |
| SPIR-V distinguishes between the array type and the runtime array type, whose |
| length is not known at compile time. In LLVM, it is possible to index beyond |
| the end of an array. Therefore, a runtime array can be implemented as a |
| zero-length array type. |
| |
| Moreover, SPIR-V supports the notion of array stride. Currently only natural |
| strides (based on [`VulkanLayoutUtils`][VulkanLayoutUtils]) are supported. They |
| are also mapped to LLVM array. |
| |
| SPIR-V Dialect | LLVM Dialect |
| :------------------------------------: | :-------------------------------------: |
| `!spv.array<<count> x <element-type>>` | `!llvm.array<<count> x <element-type>>` |
| `!spv.rtarray< <element-type> >` | `!llvm.array<0 x <element-type>>` |
| |
| ### Struct types |
| |
| Members of SPIR-V struct types may have decorations and offset information. |
| Currently, there is **no** support for converting struct member decorations. |
| For more information see the section on [Decorations](#decorations-conversion). |
| |
| Usually we expect that each struct member has a natural size and alignment. |
| However, there are cases (*e.g.* in graphics) where one would place struct |
| members explicitly at particular offsets. This case is **not** supported at the |
| moment. Hence, we adhere to the following mapping: |
| |
| * Structs with no offset are modelled as LLVM packed structures. |
| |
| * Structs with natural offset (*i.e.* offset that equals the cumulative size of |
| the previous struct elements or is a natural alignment) are mapped to |
| naturally padded structs. |
| |
| * Structs with unnatural offset (*i.e.* offset that is not equal to cumulative |
| size of the previous struct elements) are **not** supported. In this case, |
| offsets can be emulated with padding fields (*e.g.* integers). However, such |
| a design would require index recalculation in the conversion of ops that |
| involve memory addressing. |
| |
| Examples of SPIR-V struct conversion are: |
| |
| ```mlir |
| !spv.struct<i8, i32> => !llvm.struct<packed (i8, i32)> |
| !spv.struct<i8 [0], i32 [4]> => !llvm.struct<(i8, i32)> |
| |
| // error |
| !spv.struct<i8 [0], i32 [8]> |
| ``` |
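
The three struct cases above can be sketched as a small classifier. This is only an illustration, not the pass's implementation: the names `round_up` and `classify_struct` are hypothetical, sizes and offsets are in bytes, and it assumes every member is a scalar whose natural alignment equals its size.

```python
from typing import Optional, Sequence


def round_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def classify_struct(sizes: Sequence[int],
                    offsets: Optional[Sequence[int]]) -> str:
    """Classify a struct by member sizes (bytes) and optional offsets.

    Assumption: each member is a scalar whose natural alignment
    equals its size.
    """
    if offsets is None:
        return "packed"  # no offsets: LLVM packed struct
    end = 0  # cumulative size of the members seen so far
    for size, offset in zip(sizes, offsets):
        # Natural offset: the cumulative size, or that size rounded up
        # to the member's natural alignment.
        if offset not in (end, round_up(end, size)):
            return "unsupported"  # explicit, unnatural placement
        end = offset + size
    return "natural"  # naturally padded LLVM struct


print(classify_struct([1, 4], None))    # !spv.struct<i8, i32>
print(classify_struct([1, 4], [0, 4]))  # !spv.struct<i8 [0], i32 [4]>
print(classify_struct([1, 4], [0, 8]))  # !spv.struct<i8 [0], i32 [8]>
```

The three calls mirror the three struct examples in this section.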
| |
| ### Not implemented types |
| |
| The rest of the types not mentioned explicitly above are not supported by the |
| conversion. This includes `ImageType` and `MatrixType`. |
| |
| ## Operation Conversion |
| |
| This section describes how SPIR-V Dialect operations are converted to LLVM |
| Dialect. It lists the conversion patterns that already work, as well as those |
| that are still in progress. |
| |
| There are also multiple ops for which there is no clear mapping in LLVM. |
| Conversion for those has to be discussed within the community on a |
| case-by-case basis. |
| |
| ### Arithmetic ops |
| |
| Most SPIR-V arithmetic ops have a direct equivalent in LLVM Dialect. The rare |
| exceptions, such as `spv.SMod` and `spv.FMod`, are currently not converted. |
| |
| SPIR-V Dialect op | LLVM Dialect op |
| :---------------: | :-------------: |
| `spv.FAdd` | `llvm.fadd` |
| `spv.FDiv` | `llvm.fdiv` |
| `spv.FNegate` | `llvm.fneg` |
| `spv.FMul` | `llvm.fmul` |
| `spv.FRem` | `llvm.frem` |
| `spv.FSub` | `llvm.fsub` |
| `spv.IAdd` | `llvm.add` |
| `spv.IMul` | `llvm.mul` |
| `spv.ISub` | `llvm.sub` |
| `spv.SDiv` | `llvm.sdiv` |
| `spv.SRem` | `llvm.srem` |
| `spv.UDiv` | `llvm.udiv` |
| `spv.UMod` | `llvm.urem` |
| |
| ### Bitwise ops |
| |
| SPIR-V has a range of bit ops that are mapped to LLVM dialect ops or intrinsics, |
| or that have a specific conversion pattern. |
| |
| #### Direct conversion |
| |
| As with arithmetic ops, most bitwise ops have a semantically equivalent op in |
| LLVM: |
| |
| SPIR-V Dialect op | LLVM Dialect op |
| :---------------: | :-------------: |
| `spv.BitwiseAnd` | `llvm.and` |
| `spv.BitwiseOr` | `llvm.or` |
| `spv.BitwiseXor` | `llvm.xor` |
| |
| Also, some bitwise ops can be modelled with LLVM intrinsics: |
| |
| SPIR-V Dialect op | LLVM Dialect intrinsic |
| :---------------: | :--------------------: |
| `spv.BitCount` | `llvm.intr.ctpop` |
| `spv.BitReverse` | `llvm.intr.bitreverse` |
| |
| #### `spv.Not` |
| |
| `spv.Not` is modelled with an `xor` operation with a mask with all bits set. |
| |
| ```mlir |
| %mask = llvm.mlir.constant(-1 : i32) : i32 |
| %0 = spv.Not %op : i32 => %0 = llvm.xor %op, %mask : i32 |
| ``` |
| |
| #### Bitfield ops |
| |
| SPIR-V dialect has three bitfield ops: `spv.BitFieldInsert`, |
| `spv.BitFieldSExtract` and `spv.BitFieldUExtract`. This section first |
| outlines the general design of conversion patterns for these ops, and then |
| describes each of them. |
| |
| All of these ops take `base`, `offset` and `count` as arguments |
| (`spv.BitFieldInsert` additionally takes `insert`). There are two important |
| things to note: |
| |
| * `offset` and `count` are always scalar. This means that we can have the |
| following case: |
| |
| ```mlir |
| %0 = spv.BitFieldSExtract %base, %offset, %count : vector<2xi32>, i8, i8 |
| ``` |
| |
| To be able to proceed with conversion algorithms described below, all |
| operands have to be of the same type and bitwidth. This requires |
| broadcasting of `offset` and `count` to vectors, for example for the case |
| above it gives: |
| |
| ```mlir |
| // Broadcasting offset |
| %offset0 = llvm.mlir.undef : vector<2xi8> |
| %zero = llvm.mlir.constant(0 : i32) : i32 |
| %offset1 = llvm.insertelement %offset, %offset0[%zero : i32] : vector<2xi8> |
| %one = llvm.mlir.constant(1 : i32) : i32 |
| %vec_offset = llvm.insertelement %offset, %offset1[%one : i32] : vector<2xi8> |
| |
| // Broadcasting count |
| // ... |
| ``` |
| |
| * `offset` and `count` may have different bitwidths from `base`. In this case, |
| both of these operands have to be zero extended (since they are treated as |
| unsigned by the specification) or truncated. For the above example it would |
| be: |
| |
| ```mlir |
| // Zero extending offset after broadcasting |
| %res_offset = llvm.zext %vec_offset: vector<2xi8> to vector<2xi32> |
| ``` |
| |
| Also, note that if the bitwidth of `offset` or `count` is greater than the |
| bitwidth of `base`, truncation is still permitted. This is because the ops |
| have defined behaviour only when `offset` and `count` are less than the size |
| of `base`. This creates a natural upper bound on the values `offset` and |
| `count` can take, which is 64, and that can be expressed in fewer than 8 bits. |
| |
| Now, having these two cases in mind, we can proceed with conversion for the ops |
| and their operands. |
| |
| ##### `spv.BitFieldInsert` |
| |
| This operation is implemented as a series of LLVM Dialect operations. The first |
| step is to create a mask with bits set outside [`offset`, `offset` + `count` - |
| 1]. Then the mask is applied to `base`, which keeps the bits outside |
| [`offset`, `offset` + `count` - 1] unchanged and clears the rest. The result is |
| `or`ed with the shifted `insert`. |
| |
| ```mlir |
| // Create mask |
| // %minus_one = llvm.mlir.constant(-1 : i32) : i32 |
| // %t0 = llvm.shl %minus_one, %count : i32 |
| // %t1 = llvm.xor %t0, %minus_one : i32 |
| // %t2 = llvm.shl %t1, %offset : i32 |
| // %mask = llvm.xor %t2, %minus_one : i32 |
| |
| // Extract unchanged bits from the Base |
| // %new_base = llvm.and %base, %mask : i32 |
| |
| // Insert new bits |
| // %sh_insert = llvm.shl %insert, %offset : i32 |
| // %res = llvm.or %new_base, %sh_insert : i32 |
| %res = spv.BitFieldInsert %base, %insert, %offset, %count : i32, i32, i32 |
| ``` |
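
As a sanity check of the mask arithmetic, the same sequence can be replayed on plain integers. This is a minimal sketch, not part of the conversion: `bitfield_insert` is a hypothetical helper, the variable names follow the MLIR comments above, and it assumes `count` is less than `width` and that `insert` has no bits set above `count`. Python integers are unbounded, so every left shift is masked back to `width` bits.

```python
def bitfield_insert(base: int, insert: int, offset: int, count: int,
                    width: int = 32) -> int:
    """Replay the BitFieldInsert lowering on width-bit unsigned integers."""
    minus_one = (1 << width) - 1         # all bits set, like constant(-1)

    # Create mask: bits set outside [offset, offset + count - 1].
    t0 = (minus_one << count) & minus_one
    t1 = t0 ^ minus_one                  # low `count` bits set
    t2 = (t1 << offset) & minus_one      # field bits set
    mask = t2 ^ minus_one                # everything except the field

    # Extract unchanged bits from the base.
    new_base = base & mask

    # Insert new bits.
    sh_insert = (insert << offset) & minus_one
    return new_base | sh_insert


# Clear an 8-bit field at offset 8, then write 0xAB into a field at offset 4.
print(hex(bitfield_insert(0xFFFFFFFF, 0x00, 8, 8)))
print(hex(bitfield_insert(0x00000000, 0xAB, 4, 8)))
```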
| |
| ##### `spv.BitFieldSExtract` |
| |
| To implement `spv.BitFieldSExtract`, `base` is shifted left by [sizeof(`base`) - |
| (`count` + `offset`)], so that the bit at `offset` + `count` - 1 is the most |
| significant bit. Then the result is arithmetically shifted right by |
| [sizeof(`base`) - `count`], filling the vacated bits with the sign bit. |
| |
| ```mlir |
| // Calculate the amount to shift left. |
| // %size = llvm.mlir.constant(32 : i32) : i32 |
| // %t0 = llvm.add %count, %offset : i32 |
| // %t1 = llvm.sub %size, %t0 : i32 |
| |
| // Shift left and then right to extract the bits |
| // %sh_left = llvm.shl %base, %t1 : i32 |
| // %t2 = llvm.add %offset, %t1 : i32 |
| // %res = llvm.ashr %sh_left, %t2 : i32 |
| %res = spv.BitFieldSExtract %base, %offset, %count : i32, i32, i32 |
| ``` |
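
The two shifts can be checked on plain integers. This sketch is illustrative only: `bitfield_sextract` is a hypothetical helper, the variable names follow the MLIR comments above, and it assumes `count` is at least 1 and `offset + count` does not exceed `width`. The arithmetic right shift is emulated by reinterpreting the value as signed first, since Python's `>>` on a negative integer is an arithmetic shift.

```python
def bitfield_sextract(base: int, offset: int, count: int,
                      width: int = 32) -> int:
    """Replay the BitFieldSExtract lowering on width-bit integers.

    Returns the result as an unsigned width-bit pattern.
    """
    full = (1 << width) - 1

    # Calculate the amount to shift left.
    t0 = count + offset
    t1 = width - t0

    # Shift left so the top bit of the field becomes the sign bit.
    sh_left = (base << t1) & full

    # Shift right arithmetically by width - count.
    t2 = offset + t1
    signed = sh_left - (1 << width) if sh_left >> (width - 1) else sh_left
    return (signed >> t2) & full


# A 4-bit field holding 0xF reads back as -1; one holding 0x7 reads back as 7.
print(hex(bitfield_sextract(0x00000F00, 8, 4)))
print(hex(bitfield_sextract(0x00000700, 8, 4)))
```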
| |
| ##### `spv.BitFieldUExtract` |
| |
| This op uses a pattern similar to that of `spv.BitFieldInsert`. First, a mask |
| with bits set at [0, `count` - 1] is created. Then `base` is shifted right by |
| `offset` and the mask is applied. |
| |
| ```mlir |
| // Create a mask |
| // %minus_one = llvm.mlir.constant(-1 : i32) : i32 |
| // %t0 = llvm.shl %minus_one, %count : i32 |
| // %mask = llvm.xor %t0, %minus_one : i32 |
| |
| // Shift Base and apply mask |
| // %sh_base = llvm.lshr %base, %offset : i32 |
| // %res = llvm.and %sh_base, %mask : i32 |
| %res = spv.BitFieldUExtract %base, %offset, %count : i32, i32, i32 |
| ``` |
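
The unsigned variant can be replayed the same way. Again this is only a sketch under stated assumptions: `bitfield_uextract` is a hypothetical helper, the names follow the MLIR comments above, and `count` is assumed to be less than `width`.

```python
def bitfield_uextract(base: int, offset: int, count: int,
                      width: int = 32) -> int:
    """Replay the BitFieldUExtract lowering on width-bit unsigned integers."""
    minus_one = (1 << width) - 1

    # Create a mask with bits set at [0, count - 1].
    t0 = (minus_one << count) & minus_one
    mask = t0 ^ minus_one

    # Shift base (logical shift right) and apply the mask.
    sh_base = (base & minus_one) >> offset
    return sh_base & mask


print(hex(bitfield_uextract(0x00000F00, 8, 4)))
print(hex(bitfield_uextract(0xABCD1234, 16, 8)))
```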
| |
| ### Cast ops |
| |
| #### Direct conversions |
| |
| SPIR-V Dialect op | LLVM Dialect op |
| :---------------: | :-------------: |
| `spv.ConvertFToS` | `llvm.fptosi` |
| `spv.ConvertFToU` | `llvm.fptoui` |
| `spv.ConvertSToF` | `llvm.sitofp` |
| `spv.ConvertUToF` | `llvm.uitofp` |
| |
| #### `spv.Bitcast` |
| |
| This operation has a direct counterpart in LLVM: `llvm.bitcast`. It is treated |
| separately since, in addition to regular scalars or vectors of numerical type, |
| it also supports bit-pattern-preserving pointer-to-pointer conversion. |
| |
| #### Special cases |
| |
| Special cases include `spv.FConvert`, `spv.SConvert` and `spv.UConvert`. These |
| operations are either a truncation or an extension. Let's denote the operand |
| component width as A, and the result component width as R. Then, the following |
| mappings are used: |
| |
| ##### `spv.FConvert` |
| |
| Case | LLVM Dialect op |
| :---: | :-------------: |
| A < R | `llvm.fpext` |
| A > R | `llvm.fptrunc` |
| |
| ##### `spv.SConvert` |
| |
| Case | LLVM Dialect op |
| :---: | :-------------: |
| A < R | `llvm.sext` |
| A > R | `llvm.trunc` |
| |
| ##### `spv.UConvert` |
| |
| Case | LLVM Dialect op |
| :---: | :-------------: |
| A < R | `llvm.zext` |
| A > R | `llvm.trunc` |
| |
| The case when A = R is not possible, according to the SPIR-V Dialect specification: |
| |
| > The component width cannot equal the component width in Result Type. |
| |
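The three tables above reduce to one lookup keyed on the op and on whether the width grows or shrinks. The sketch below is illustrative only; `select_convert_op` is a hypothetical name, not a function of the conversion pass.

```python
EXTEND_OPS = {
    "spv.FConvert": "llvm.fpext",
    "spv.SConvert": "llvm.sext",
    "spv.UConvert": "llvm.zext",
}
TRUNCATE_OPS = {
    "spv.FConvert": "llvm.fptrunc",
    "spv.SConvert": "llvm.trunc",
    "spv.UConvert": "llvm.trunc",
}


def select_convert_op(spirv_op: str, operand_width: int,
                      result_width: int) -> str:
    """Pick the LLVM op for a width-changing SPIR-V conversion (A vs. R)."""
    if operand_width == result_width:
        # Ruled out by the SPIR-V dialect specification.
        raise ValueError("operand and result widths must differ")
    table = EXTEND_OPS if operand_width < result_width else TRUNCATE_OPS
    return table[spirv_op]


print(select_convert_op("spv.SConvert", 16, 32))
print(select_convert_op("spv.FConvert", 64, 32))
```
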
| ### Comparison ops |
| |
| SPIR-V comparison ops are mapped to LLVM `icmp` and `fcmp` operations. |
| |
| SPIR-V Dialect op | LLVM Dialect op |
| :--------------------------: | :---------------: |
| `spv.IEqual` | `llvm.icmp "eq"` |
| `spv.INotEqual` | `llvm.icmp "ne"` |
| `spv.FOrdEqual` | `llvm.fcmp "oeq"` |
| `spv.FOrdGreaterThan` | `llvm.fcmp "ogt"` |
| `spv.FOrdGreaterThanEqual` | `llvm.fcmp "oge"` |
| `spv.FOrdLessThan` | `llvm.fcmp "olt"` |
| `spv.FOrdLessThanEqual` | `llvm.fcmp "ole"` |
| `spv.FOrdNotEqual` | `llvm.fcmp "one"` |
| `spv.FUnordEqual` | `llvm.fcmp "ueq"` |
| `spv.FUnordGreaterThan` | `llvm.fcmp "ugt"` |
| `spv.FUnordGreaterThanEqual` | `llvm.fcmp "uge"` |
| `spv.FUnordLessThan` | `llvm.fcmp "ult"` |
| `spv.FUnordLessThanEqual` | `llvm.fcmp "ule"` |
| `spv.FUnordNotEqual` | `llvm.fcmp "une"` |
| `spv.SGreaterThan` | `llvm.icmp "sgt"` |
| `spv.SGreaterThanEqual` | `llvm.icmp "sge"` |
| `spv.SLessThan` | `llvm.icmp "slt"` |
| `spv.SLessThanEqual` | `llvm.icmp "sle"` |
| `spv.UGreaterThan` | `llvm.icmp "ugt"` |
| `spv.UGreaterThanEqual` | `llvm.icmp "uge"` |
| `spv.ULessThan` | `llvm.icmp "ult"` |
| `spv.ULessThanEqual` | `llvm.icmp "ule"` |
| |
| ### Composite ops |
| |
| Currently, conversion supports rewrite patterns for `spv.CompositeExtract` and |
| `spv.CompositeInsert`. We distinguish two cases for these operations: when the |
| composite object is a vector, and when the composite object is of a non-vector |
| type (*i.e.* struct, array or runtime array). |
| |
| Composite type | SPIR-V Dialect op | LLVM Dialect op |
| :------------: | :--------------------: | :-------------------: |
| vector | `spv.CompositeExtract` | `llvm.extractelement` |
| vector | `spv.CompositeInsert` | `llvm.insertelement` |
| non-vector | `spv.CompositeExtract` | `llvm.extractvalue` |
| non-vector | `spv.CompositeInsert` | `llvm.insertvalue` |
| |
| ### `spv.EntryPoint` and `spv.ExecutionMode` |
| |
| First of all, it is important to note that there is no direct representation of |
| entry points in LLVM. At the moment, we use the following approach: |
| |
| * `spv.EntryPoint` is simply removed. |
| |
| * In contrast, `spv.ExecutionMode` may contain important information about the |
| entry point. For example, `LocalSize` provides information about the |
| work-group size that can be reused. |
| |
| In order to preserve this information, `spv.ExecutionMode` is converted to a |
| struct global variable that stores the execution mode id and any variables |
| associated with it. In C, the struct would look as shown below. |
| |
| ```c |
| // No values are associated with this entry point. |
| struct { |
|   int32_t executionMode; |
| }; |
| |
| // There are values that are associated with this entry point. |
| struct { |
|   int32_t executionMode; |
|   int32_t values[]; |
| }; |
| ``` |
| |
| ```mlir |
| // spv.ExecutionMode @empty "ContractionOff" |
| llvm.mlir.global external constant @{{.*}}() : !llvm.struct<(i32)> { |
| %0 = llvm.mlir.undef : !llvm.struct<(i32)> |
| %1 = llvm.mlir.constant(31 : i32) : i32 |
| %ret = llvm.insertvalue %1, %0[0 : i32] : !llvm.struct<(i32)> |
| llvm.return %ret : !llvm.struct<(i32)> |
| } |
| ``` |
| |
| ### Logical ops |
| |
| Logical ops follow a pattern similar to that of bitwise ops, with the difference |
| that they operate on `i1` or vector of `i1` values. The following mapping is |
| used to emulate the behaviour of SPIR-V ops: |
| |
| SPIR-V Dialect op | LLVM Dialect op |
| :-------------------: | :--------------: |
| `spv.LogicalAnd` | `llvm.and` |
| `spv.LogicalOr` | `llvm.or` |
| `spv.LogicalEqual` | `llvm.icmp "eq"` |
| `spv.LogicalNotEqual` | `llvm.icmp "ne"` |
| |
| `spv.LogicalNot` has the same conversion pattern as bitwise `spv.Not`. It is |
| modelled with an `xor` operation with a mask with all bits set. |
| |
| ```mlir |
| %mask = llvm.mlir.constant(-1 : i1) : i1 |
| %0 = spv.LogicalNot %op : i1 => %0 = llvm.xor %op, %mask : i1 |
| ``` |
| |
| ### Memory ops |
| |
| This section describes the conversion patterns for SPIR-V dialect operations |
| that concern memory. |
| |
| #### `spv.AccessChain` |
| |
| `spv.AccessChain` is mapped to the `llvm.getelementptr` op. To create a valid |
| LLVM op, a 0 index is prepended to the `spv.AccessChain` indices list in order |
| to go through the pointer. |
| |
| ```mlir |
| // Access the 1st element of the array |
| %i = spv.Constant 1: i32 |
| %var = spv.Variable : !spv.ptr<!spv.struct<f32, !spv.array<4xf32>>, Function> |
| %el = spv.AccessChain %var[%i, %i] : !spv.ptr<!spv.struct<f32, !spv.array<4xf32>>, Function>, i32, i32 |
| |
| // Corresponding LLVM dialect code |
| %i = ... |
| %var = ... |
| %0 = llvm.mlir.constant(0 : i32) : i32 |
| %el = llvm.getelementptr %var[%0, %i, %i] : (!llvm.ptr<struct<packed (f32, array<4 x f32>)>>, i32, i32, i32) |
| ``` |
| |
| #### `spv.Load` and `spv.Store` |
| |
| These ops are converted to their LLVM counterparts: `llvm.load` and |
| `llvm.store`. If the op has a memory access attribute, then there are the |
| following cases, based on the value of the attribute: |
| |
| * **Aligned**: alignment is passed on to LLVM op builder, for example: |
| |
|     ```mlir |
|     // llvm.store %ptr, %val {alignment = 4 : i64} : !llvm.ptr<f32> |
|     spv.Store "Function" %ptr, %val ["Aligned", 4] : f32 |
|     ``` |
| |
| * **None**: same case as if there is no memory access attribute. |
| |
| * **Nontemporal**: set `nontemporal` flag, for example: |
| |
|     ```mlir |
|     // %res = llvm.load %ptr {nontemporal} : !llvm.ptr<f32> |
|     %res = spv.Load "Function" %ptr ["Nontemporal"] : f32 |
|     ``` |
| |
| * **Volatile**: mark the op as `volatile`, for example: |
| |
|     ```mlir |
|     // %res = llvm.load volatile %ptr : !llvm.ptr<f32> |
|     %res = spv.Load "Function" %ptr ["Volatile"] : f32 |
|     ``` |
| |
| Otherwise the conversion fails as other cases |
| (`MakePointerAvailable`, `MakePointerVisible`, `NonPrivatePointer`) are not |
| supported yet. |
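
The case analysis on the memory access attribute can be summarised as a small dispatch function. This is a sketch only: `lower_memory_access` and the returned dictionary keys are hypothetical and merely name the LLVM-side properties mentioned in the list above.

```python
from typing import Optional


def lower_memory_access(access: Optional[str] = None,
                        alignment: Optional[int] = None) -> dict:
    """Map a SPIR-V memory access attribute to LLVM load/store properties."""
    if access is None or access == "None":
        return {}                        # plain llvm.load / llvm.store
    if access == "Aligned":
        return {"alignment": alignment}  # forwarded to the op builder
    if access == "Nontemporal":
        return {"nontemporal": True}
    if access == "Volatile":
        return {"volatile": True}
    # MakePointerAvailable, MakePointerVisible, NonPrivatePointer, ...
    raise NotImplementedError(f"unsupported memory access: {access}")


print(lower_memory_access("Aligned", 4))
print(lower_memory_access("Volatile"))
```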
| |
| #### `spv.GlobalVariable` and `spv.mlir.addressof` |
| |
| `spv.GlobalVariable` is modelled with `llvm.mlir.global` op. However, there is a |
| difference that has to be pointed out. |
| |
| In SPIR-V dialect, the global variable returns a pointer, whereas in LLVM |
| dialect the global holds an actual value. This difference is handled by |
| `spv.mlir.addressof` and `llvm.mlir.addressof` ops that both return a pointer |
| and are used to reference the global. |
| |
| ```mlir |
| // Original SPIR-V module |
| spv.module Logical GLSL450 { |
| spv.GlobalVariable @struct : !spv.ptr<!spv.struct<f32, !spv.array<10xf32>>, Private> |
| spv.func @func() -> () "None" { |
| %0 = spv.mlir.addressof @struct : !spv.ptr<!spv.struct<f32, !spv.array<10xf32>>, Private> |
| spv.Return |
| } |
| } |
| |
| // Converted result |
| module { |
| llvm.mlir.global private @struct() : !llvm.struct<packed (f32, array<10 x f32>)> |
| llvm.func @func() { |
| %0 = llvm.mlir.addressof @struct : !llvm.ptr<struct<packed (f32, array<10 x f32>)>> |
| llvm.return |
| } |
| } |
| ``` |
| |
| The SPIR-V to LLVM conversion does not model workgroups. Hence, only the |
| current invocation is in the conversion's scope. This means that global |
| variables with pointers of `Input`, `Output`, and `Private` storage classes are |
| supported. In addition, the `StorageBuffer` storage class is allowed for |
| executing [`mlir-spirv-cpu-runner`](#mlir-spirv-cpu-runner). |
| |
| Moreover, `bind` that specifies the descriptor set and the binding number and |
| `built_in` that specifies SPIR-V `BuiltIn` decoration have no conversion into |
| LLVM dialect. |
| |
| Currently `llvm.mlir.global`s are created with `private` linkage for the |
| `Private` storage class and `external` linkage for other storage classes, based |
| on the SPIR-V spec: |
| |
| > By default, functions and global variables are private to a module and cannot |
| > be accessed by other modules. However, a module may be written to export or |
| > import functions and global (module scope) variables. |
| |
| If the global variable's pointer has `Input` storage class, then a `constant` |
| flag is added to LLVM op: |
| |
| ```mlir |
| spv.GlobalVariable @var : !spv.ptr<f32, Input> => llvm.mlir.global external constant @var() : f32 |
| ``` |
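
The storage class rules in this section can be condensed into one function. The sketch below is an illustration under stated assumptions, not the pass's code: `global_attributes` is a hypothetical helper that returns the linkage and the `constant` flag for a supported storage class.

```python
SUPPORTED_STORAGE_CLASSES = {"Input", "Output", "Private", "StorageBuffer"}


def global_attributes(storage_class: str) -> tuple:
    """Return (linkage, is_constant) for a converted global variable."""
    if storage_class not in SUPPORTED_STORAGE_CLASSES:
        raise ValueError(f"unsupported storage class: {storage_class}")
    # Private globals get private linkage, everything else is external.
    linkage = "private" if storage_class == "Private" else "external"
    # Input globals are additionally marked constant.
    is_constant = storage_class == "Input"
    return linkage, is_constant


print(global_attributes("Input"))
print(global_attributes("Private"))
```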
| |
| #### `spv.Variable` |
| |
| Per SPIR-V dialect spec, `spv.Variable` allocates an object in memory, resulting |
| in a pointer to it, which can be used with `spv.Load` and `spv.Store`. It is |
| also a function-level variable. |
| |
| `spv.Variable` is modelled as `llvm.alloca` op. If initialized, an additional |
| store instruction is used. Note that there is no initialization for arrays and |
| structs since constants of these types are not supported in LLVM dialect (TODO). |
| Also, at the moment initialization is only possible via `spv.Constant`. |
| |
| ```mlir |
| // Conversion of VariableOp without initialization |
| %res = spv.Variable : !spv.ptr<vector<3xf32>, Function> |
| // => |
| %size = llvm.mlir.constant(1 : i32) : i32 |
| %res = llvm.alloca %size x vector<3xf32> : (i32) -> !llvm.ptr<vector<3xf32>> |
| |
| // Conversion of VariableOp with initialization |
| %c = spv.Constant 0 : i64 |
| %res = spv.Variable init(%c) : !spv.ptr<i64, Function> |
| // => |
| %c = llvm.mlir.constant(0 : i64) : i64 |
| %size = llvm.mlir.constant(1 : i32) : i32 |
| %res = llvm.alloca %size x i64 : (i32) -> !llvm.ptr<i64> |
| llvm.store %c, %res : !llvm.ptr<i64> |
| ``` |
| |
| Note that simple conversion to `alloca` may not be sufficient if the code has |
| some scoping. For example, if converting ops executed in a loop into `alloca`s, |
| a stack overflow may occur. For this case, `stacksave`/`stackrestore` pair can |
| be used (TODO). |
| |
| ### Miscellaneous ops with direct conversions |
| |
| There are multiple SPIR-V ops that do not fit in a particular group but can be |
| converted directly to LLVM dialect. Their conversion is addressed in this |
| section. |
| |
| SPIR-V Dialect op | LLVM Dialect op |
| :---------------: | :---------------: |
| `spv.Select` | `llvm.select` |
| `spv.Undef` | `llvm.mlir.undef` |
| |
| ### Shift ops |
| |
| A shift op operates on two operands: `shift` and `base`. |
| |
| In SPIR-V dialect, `shift` and `base` may have different bitwidths. In |
| contrast, in LLVM Dialect both `base` and `shift` have to be of the same |
| bitwidth. This leads to the following conversions: |
| |
| * if `base` has the same bitwidth as `shift`, the conversion is |
| straightforward. |
| |
| * if `base` has a greater bitwidth than `shift`, `shift` is sign- or |
| zero-extended first. Then the extended value is passed to the shift. |
| |
| * otherwise, the conversion is considered to be illegal. |
| |
| ```mlir |
| // Shift without extension |
| %res0 = spv.ShiftRightArithmetic %0, %2 : i32, i32 => %res0 = llvm.ashr %0, %2 : i32 |
| |
| // Shift with extension |
| %ext = llvm.sext %1 : i16 to i32 |
| %res1 = spv.ShiftRightArithmetic %0, %1 : i32, i16 => %res1 = llvm.ashr %0, %ext: i32 |
| ``` |
| |
| ### `spv.Constant` |
| |
| At the moment `spv.Constant` conversion supports scalar and vector constants |
| **only**. |
| |
| #### Mapping |
| |
| `spv.Constant` is mapped to `llvm.mlir.constant`. This is a straightforward |
| conversion pattern with a special case when the argument is signed or unsigned. |
| |
| #### Special case |
| |
| SPIR-V constant can be a signed or unsigned integer. Since LLVM Dialect does not |
| have signedness semantics, this case should be handled separately. |
| |
| The conversion casts constant value attribute to a signless integer or a vector |
| of signless integers. This is correct because in SPIR-V, like in LLVM, how to |
| interpret an integer number is also dictated by the opcode. However, in reality |
| hardware implementation might show unexpected behavior. Therefore, it is better |
| to handle it case-by-case, given that the purpose of the conversion is not to |
| cover all possible corner cases. |
| |
| ```mlir |
| // %0 = llvm.mlir.constant(0 : i8) : i8 |
| %0 = spv.Constant 0 : i8 |
| |
| // %1 = llvm.mlir.constant(dense<[2, 3, 4]> : vector<3xi32>) : vector<3xi32> |
| %1 = spv.Constant dense<[2, 3, 4]> : vector<3xui32> |
| ``` |
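
The cast to a signless integer keeps the bit pattern and drops only the interpretation. The following sketch illustrates this on plain integers; `to_signless` is a hypothetical helper, not part of the conversion.

```python
def to_signless(value: int, width: int) -> int:
    """Return the width-bit pattern of a signed or unsigned integer."""
    return value & ((1 << width) - 1)


# The si8 value -1 and the ui8 value 255 share one signless bit pattern.
print(hex(to_signless(-1, 8)))
print(hex(to_signless(255, 8)))

# Vector constants are converted element by element.
print([to_signless(v, 32) for v in [2, 3, 4]])
```

Whether the resulting bits are read as signed or unsigned is then decided by the consuming op, for example `llvm.sdiv` versus `llvm.udiv`.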
| |
| ### Not implemented ops |
| |
| There is no support for the following ops: |
| |
| * All atomic ops |
| * All group ops |
| * All matrix ops |
| * All OCL ops |
| |
| As well as: |
| |
| * spv.CompositeConstruct |
| * spv.ControlBarrier |
| * spv.CopyMemory |
| * spv.FMod |
| * spv.GLSL.Acos |
| * spv.GLSL.Asin |
| * spv.GLSL.Atan |
| * spv.GLSL.Cosh |
| * spv.GLSL.FSign |
| * spv.GLSL.SAbs |
| * spv.GLSL.Sinh |
| * spv.GLSL.SSign |
| * spv.MemoryBarrier |
| * spv.mlir.referenceof |
| * spv.SMod |
| * spv.SpecConstant |
| * spv.Unreachable |
| * spv.VectorExtractDynamic |
| |
| ## Control flow conversion |
| |
| ### Branch ops |
| |
| `spv.Branch` and `spv.BranchConditional` are mapped to `llvm.br` and |
| `llvm.cond_br`. Branch weights for `spv.BranchConditional` are mapped to |
| corresponding `branch_weights` attribute of `llvm.cond_br`. When translated to |
| proper LLVM, `branch_weights` are converted into LLVM metadata associated with |
| the conditional branch. |
| |
| ### `spv.FunctionCall` |
| |
| `spv.FunctionCall` maps to `llvm.call`. For example: |
| |
| ```mlir |
| %0 = spv.FunctionCall @foo() : () -> i32 => %0 = llvm.call @foo() : () -> i32 |
| spv.FunctionCall @bar(%0) : (i32) -> () => llvm.call @bar(%0) : (i32) -> () |
| ``` |
| |
| ### `spv.mlir.selection` and `spv.mlir.loop` |
| |
| Control flow within `spv.mlir.selection` and `spv.mlir.loop` is lowered directly |
| to LLVM via branch ops. The conversion can only be applied to selection or loop |
| with all blocks being reachable. Moreover, selection and loop control attributes |
| (such as `Flatten` or `Unroll`) are not supported at the moment. |
| |
| ```mlir |
| // Conversion of selection: SPIR-V dialect input |
| %cond = spv.Constant true |
| spv.mlir.selection { |
|   spv.BranchConditional %cond, ^true, ^false |
| |
| ^true: |
|   // True block code |
|   spv.Branch ^merge |
| |
| ^false: |
|   // False block code |
|   spv.Branch ^merge |
| |
| ^merge: |
|   spv.mlir.merge |
| } |
| // Remaining code |
| |
| // => Converted LLVM dialect output |
| %cond = llvm.mlir.constant(true) : i1 |
| llvm.cond_br %cond, ^true, ^false |
| |
| ^true: |
|   // True block code |
|   llvm.br ^merge |
| |
| ^false: |
|   // False block code |
|   llvm.br ^merge |
| |
| ^merge: |
|   llvm.br ^continue |
| |
| ^continue: |
|   // Remaining code |
| ``` |
| |
| ```mlir |
| // Conversion of loop: SPIR-V dialect input |
| %cond = spv.Constant true |
| spv.mlir.loop { |
|   spv.Branch ^header |
| |
| ^header: |
|   // Header code |
|   spv.BranchConditional %cond, ^body, ^merge |
| |
| ^body: |
|   // Body code |
|   spv.Branch ^continue |
| |
| ^continue: |
|   // Continue code |
|   spv.Branch ^header |
| |
| ^merge: |
|   spv.mlir.merge |
| } |
| // Remaining code |
| |
| // => Converted LLVM dialect output |
| %cond = llvm.mlir.constant(true) : i1 |
| llvm.br ^header |
| |
| ^header: |
|   // Header code |
|   llvm.cond_br %cond, ^body, ^merge |
| |
| ^body: |
|   // Body code |
|   llvm.br ^continue |
| |
| ^continue: |
|   // Continue code |
|   llvm.br ^header |
| |
| ^merge: |
|   llvm.br ^remaining |
| |
| ^remaining: |
|   // Remaining code |
| ``` |
| |
| ## Decorations conversion |
| |
| **Note: these conversions have not been implemented yet** |
| |
| ## GLSL extended instruction set |
| |
| This section describes how SPIR-V ops from the GLSL extended instruction set |
| are mapped to LLVM Dialect. |
| |
| ### Direct conversions |
| |
| SPIR-V Dialect op | LLVM Dialect op |
| :---------------: | :----------------: |
| `spv.GLSL.Ceil` | `llvm.intr.ceil` |
| `spv.GLSL.Cos` | `llvm.intr.cos` |
| `spv.GLSL.Exp` | `llvm.intr.exp` |
| `spv.GLSL.FAbs` | `llvm.intr.fabs` |
| `spv.GLSL.Floor` | `llvm.intr.floor` |
| `spv.GLSL.FMax` | `llvm.intr.maxnum` |
| `spv.GLSL.FMin` | `llvm.intr.minnum` |
| `spv.GLSL.Log` | `llvm.intr.log` |
| `spv.GLSL.Sin` | `llvm.intr.sin` |
| `spv.GLSL.Sqrt` | `llvm.intr.sqrt` |
| `spv.GLSL.SMax` | `llvm.intr.smax` |
| `spv.GLSL.SMin` | `llvm.intr.smin` |
| |
| ### Special cases |
| |
| `spv.GLSL.InverseSqrt` is mapped to: |
| |
| ```mlir |
| %res = spv.GLSL.InverseSqrt %arg : f32 |
| // => |
| %one = llvm.mlir.constant(1.0 : f32) : f32 |
| %sqrt = "llvm.intr.sqrt"(%arg) : (f32) -> f32 |
| %res = llvm.fdiv %one, %sqrt : f32 |
| ``` |
| |
| `spv.GLSL.Tan` is mapped to: |
| |
| ```mlir |
| %res = spv.GLSL.Tan %arg : f32 |
| // => |
| %sin = "llvm.intr.sin"(%arg) : (f32) -> f32 |
| %cos = "llvm.intr.cos"(%arg) : (f32) -> f32 |
| %res = llvm.fdiv %sin, %cos : f32 |
| ``` |
| |
| `spv.GLSL.Tanh` is modelled using the equality |
| `tanh(x) = (exp(2x) - 1) / (exp(2x) + 1)`: |
| |
| ```mlir |
| %res = spv.GLSL.Tanh %arg : f32 |
| // => |
| %two = llvm.mlir.constant(2.0: f32) : f32 |
| %2xArg = llvm.fmul %two, %arg : f32 |
| %exp = "llvm.intr.exp"(%2xArg) : (f32) -> f32 |
| %one = llvm.mlir.constant(1.0 : f32) : f32 |
| %num = llvm.fsub %exp, %one : f32 |
| %den = llvm.fadd %exp, %one : f32 |
| %res = llvm.fdiv %num, %den : f32 |
| ``` |
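
The three expansions above can be checked numerically against a reference math library. This is only a sketch in double precision (the examples above use `f32`), and the function names are hypothetical.

```python
import math


def inverse_sqrt_expanded(x: float) -> float:
    """1 / sqrt(x), as in the InverseSqrt expansion."""
    return 1.0 / math.sqrt(x)


def tan_expanded(x: float) -> float:
    """sin(x) / cos(x), as in the Tan expansion."""
    return math.sin(x) / math.cos(x)


def tanh_expanded(x: float) -> float:
    """(exp(2x) - 1) / (exp(2x) + 1), as in the Tanh expansion."""
    exp = math.exp(2.0 * x)
    return (exp - 1.0) / (exp + 1.0)


print(inverse_sqrt_expanded(4.0))
print(abs(tan_expanded(0.3) - math.tan(0.3)))
print(abs(tanh_expanded(0.5) - math.tanh(0.5)))
```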
| |
| ## Function conversion and related ops |
| |
| This section describes the conversion of function-related operations from SPIR-V |
| to LLVM dialect. |
| |
| ### `spv.func` |
| |
| This op declares or defines a SPIR-V function and it is converted to |
| `llvm.func`. This conversion handles signature conversion, and function control |
| attributes remapping to LLVM dialect function |
| [`passthrough` attribute](Dialects/LLVM.md/#attribute-pass-through). |
| |
| The following mapping is used to map |
| [SPIR-V function control][SPIRVFunctionAttributes] to |
| [LLVM function attributes][LLVMFunctionAttributes]: |
| |
| SPIR-V Function Control Attributes | LLVM Function Attributes |
| :--------------------------------: | :---------------------------: |
| None | No function attributes passed |
| Inline | `alwaysinline` |
| DontInline | `noinline` |
| Pure | `readonly` |
| Const | `readnone` |
| |
| ### `spv.Return` and `spv.ReturnValue` |
| |
| In LLVM IR, functions may return either zero or one value. Hence, we map both |
| ops to `llvm.return` with or without a return value. |
| |
| ## Module ops |
| |
| Module in SPIR-V has one region that contains one block. It is defined via |
| `spv.module` op that also takes a range of attributes: |
| |
| * Addressing model |
| * Memory model |
| * Version-Capability-Extension attribute |
| |
| `spv.module` is converted into `ModuleOp`. This plays the role of an enclosing |
| scope for LLVM ops. At the moment, SPIR-V module attributes are ignored. |
| |
| ## `mlir-spirv-cpu-runner` |
| |
| `mlir-spirv-cpu-runner` allows executing a `gpu` dialect kernel on the CPU via |
| SPIR-V to LLVM dialect conversion. Currently, only single-threaded kernels are |
| supported. |
| |
| To build the runner, add the following option to `cmake`: |
| |
| ```bash |
| -DMLIR_ENABLE_SPIRV_CPU_RUNNER=1 |
| ``` |
| |
| ### Pipeline |
| |
| The `gpu` module with the kernel and the host code undergo the following |
| transformations: |
| |
| * Convert the `gpu` module into SPIR-V dialect, lower ABI attributes and |
| update version, capability and extension. |
| |
| * Emulate the kernel call by converting the launching operation into a normal |
| function call. The data from the host side to the device is passed via |
| copying to global variables. These are created in both the host and the |
| kernel code and later linked when nested modules are folded. |
| |
| * Convert SPIR-V dialect kernel to LLVM dialect via the new conversion path. |
| |
| After these passes, the IR is transformed into a nested LLVM module: a main |
| module representing the host code and a kernel module. These modules are linked |
| and executed using `ExecutionEngine`. |
| |
| ### Walk-through |
| |
| This section gives a detailed overview of the IR changes while running |
| `mlir-spirv-cpu-runner`. First, consider that we have the following IR (for |
| simplicity, some type annotations and function implementations have been |
| omitted). |
| |
| ```mlir |
| gpu.module @foo { |
| gpu.func @bar(%arg: memref<8xi32>) { |
| // Kernel code. |
| gpu.return |
| } |
| } |
| |
| func @main() { |
| // Fill the buffer with some data |
| %buffer = memref.alloc : memref<8xi32> |
| %data = ... |
| call fillBuffer(%buffer, %data) |
| |
| "gpu.launch_func"(/*grid dimensions*/, %buffer) { |
| kernel = @foo::bar |
| } |
| } |
| ``` |
| |
| Lowering `gpu` dialect to SPIR-V dialect results in |
| |
| ```mlir |
| spv.module @__spv__foo /*VCE triple and other metadata here*/ { |
| spv.GlobalVariable @__spv__foo_arg bind(0,0) : ... |
| spv.func @bar() { |
| // Kernel code. |
| } |
| spv.EntryPoint @bar, ... |
| } |
| |
| func @main() { |
| // Fill the buffer with some data. |
| %buffer = memref.alloc : memref<8xi32> |
| %data = ... |
| call fillBuffer(%buffer, %data) |
| |
| "gpu.launch_func"(/*grid dimensions*/, %buffer) { |
| kernel = @foo::bar |
| } |
| } |
| ``` |
| |
| Then, the lowering from standard dialect to LLVM dialect is applied to the host |
| code. |
| |
| ```mlir |
| spv.module @__spv__foo /*VCE triple and other metadata here*/ { |
| spv.GlobalVariable @__spv__foo_arg bind(0,0) : ... |
| spv.func @bar() { |
| // Kernel code. |
| } |
| spv.EntryPoint @bar, ... |
| } |
| |
| // Kernel function declaration. |
| llvm.func @__spv__foo_bar() : ... |
| |
| llvm.func @main() { |
| // Fill the buffer with some data. |
| llvm.call fillBuffer(%buffer, %data) |
| |
| // Copy data to the global variable, call kernel, and copy the data back. |
| %addr = llvm.mlir.addressof @__spv__foo_arg_descriptor_set0_binding0 : ... |
| "llvm.intr.memcpy"(%addr, %buffer) : ... |
| llvm.call @__spv__foo_bar() |
| "llvm.intr.memcpy"(%buffer, %addr) : ... |
| |
| llvm.return |
| } |
| ``` |
| |
| Finally, the SPIR-V module is converted to LLVM dialect and the symbol names |
| are resolved for linking. |
| |
| ```mlir |
| module @__spv__foo { |
| llvm.mlir.global @__spv__foo_arg_descriptor_set0_binding0 : ... |
| llvm.func @__spv__foo_bar() { |
| // Kernel code. |
| } |
| } |
| |
| // Kernel function declaration. |
| llvm.func @__spv__foo_bar() : ... |
| |
| llvm.func @main() { |
| // Fill the buffer with some data. |
| llvm.call fillBuffer(%buffer, %data) |
| |
| // Copy data to the global variable, call kernel, and copy the data back. |
| %addr = llvm.mlir.addressof @__spv__foo_arg_descriptor_set0_binding0 : ... |
| "llvm.intr.memcpy"(%addr, %buffer) : ... |
| llvm.call @__spv__foo_bar() |
| "llvm.intr.memcpy"(%buffer, %addr) : ... |
| |
| llvm.return |
| } |
| ``` |
| |
| [LLVMFunctionAttributes]: https://llvm.org/docs/LangRef.html#function-attributes |
| [SPIRVFunctionAttributes]: https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#_a_id_function_control_a_function_control |
| [VulkanLayoutUtils]: https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Dialect/SPIRV/LayoutUtils.h |