# SPIR-V Dialect to LLVM Dialect conversion manual

This manual describes the conversion from [SPIR-V Dialect](Dialects/SPIR-V.md)
to [LLVM Dialect](Dialects/LLVM.md). It assumes familiarity with both, and
describes the design choices behind the modelling of SPIR-V concepts in LLVM
Dialect. The conversion is ongoing work, and is expected to grow as more
features are implemented.

Conversion can be performed by invoking an appropriate conversion pass:

```shell
mlir-opt -convert-spirv-to-llvm <filename.mlir>
```

This pass performs type and operation conversions for SPIR-V operations as
described in this document.

[TOC]

## Type Conversion

This section describes how SPIR-V Dialect types are mapped to LLVM Dialect.

### Scalar types

SPIR-V Dialect | LLVM Dialect
:------------: | :-----------:
`i<bitwidth>`  | `i<bitwidth>`
`si<bitwidth>` | `i<bitwidth>`
`ui<bitwidth>` | `i<bitwidth>`
`f16`          | `f16`
`f32`          | `f32`
`f64`          | `f64`

### Vector types

SPIR-V Dialect                    | LLVM Dialect
:-------------------------------: | :-------------------------------:
`vector<<count> x <scalar-type>>` | `vector<<count> x <scalar-type>>`

### Pointer types

A SPIR-V pointer also takes a Storage Class. At the moment, conversion does
**not** take it into account.

SPIR-V Dialect                                  | LLVM Dialect
:---------------------------------------------: | :----------:
`!spirv.ptr< <element-type>, <storage-class> >` | `!llvm.ptr`

### Array types

SPIR-V distinguishes between array type and run-time array type, the length of
which is not known at compile time. In LLVM, it is possible to index beyond the
end of the array. Therefore, a runtime array can be implemented as a zero-length
array type.

Moreover, SPIR-V supports the notion of array stride. Currently only natural
strides (based on [`VulkanLayoutUtils`][VulkanLayoutUtils]) are supported.
They are also mapped to LLVM array.

SPIR-V Dialect                           | LLVM Dialect
:--------------------------------------: | :-------------------------------------:
`!spirv.array<<count> x <element-type>>` | `!llvm.array<<count> x <element-type>>`
`!spirv.rtarray< <element-type> >`       | `!llvm.array<0 x <element-type>>`

### Struct types

Members of SPIR-V struct types may have decorations and offset information.
Currently, there is **no** support for converting member decorations of structs.
For more information see the section on
[Decorations](#decorations-conversion).

Usually we expect that each struct member has a natural size and alignment.
However, there are cases (*e.g.* in graphics) where one would place struct
members explicitly at particular offsets. This case is **not** supported at the
moment. Hence, we adhere to the following mapping:

*   Structs with no offset are modelled as LLVM packed structures.

*   Structs with natural offset (*i.e.* an offset that equals the cumulative
    size of the previous struct elements or is a natural alignment) are mapped
    to naturally padded structs.

*   Structs with unnatural offset (*i.e.* an offset that is not equal to the
    cumulative size of the previous struct elements) are **not** supported. In
    this case, offsets can be emulated with padding fields (*e.g.* integers).
    However, such a design would require index recalculation in the conversion
    of ops that involve memory addressing.

Examples of SPIR-V struct conversion are:

```mlir
!spirv.struct<i8, i32>          =>  !llvm.struct<packed (i8, i32)>
!spirv.struct<i8 [0], i32 [4]>  =>  !llvm.struct<(i8, i32)>

// error
!spirv.struct<i8 [0], i32 [8]>
```

### Not implemented types

The rest of the types not mentioned explicitly above are not supported by the
conversion. This includes `ImageType` and `MatrixType`.

## Operation Conversion

This section describes how SPIR-V Dialect operations are converted to LLVM
Dialect.
It lists conversion patterns that already work, as well as those that are
still in progress.

There are also multiple ops for which there is no clear mapping in LLVM.
Conversion for those has to be discussed within the community on a
case-by-case basis.

### Arithmetic ops

SPIR-V arithmetic ops mostly have a direct equivalent in LLVM Dialect.
Exceptions such as `spirv.SMod` and `spirv.FMod` are rare.

SPIR-V Dialect op | LLVM Dialect op
:---------------: | :-------------:
`spirv.FAdd`      | `llvm.fadd`
`spirv.FDiv`      | `llvm.fdiv`
`spirv.FNegate`   | `llvm.fneg`
`spirv.FMul`      | `llvm.fmul`
`spirv.FRem`      | `llvm.frem`
`spirv.FSub`      | `llvm.fsub`
`spirv.IAdd`      | `llvm.add`
`spirv.IMul`      | `llvm.mul`
`spirv.ISub`      | `llvm.sub`
`spirv.SDiv`      | `llvm.sdiv`
`spirv.SRem`      | `llvm.srem`
`spirv.UDiv`      | `llvm.udiv`
`spirv.UMod`      | `llvm.urem`

### Bitwise ops

SPIR-V has a range of bit ops that are mapped to LLVM dialect ops or
intrinsics, or that have a specific conversion pattern.

#### Direct conversion

As with arithmetic ops, most bitwise ops have a semantically equivalent op in
LLVM:

SPIR-V Dialect op  | LLVM Dialect op
:----------------: | :-------------:
`spirv.BitwiseAnd` | `llvm.and`
`spirv.BitwiseOr`  | `llvm.or`
`spirv.BitwiseXor` | `llvm.xor`

Also, some bitwise ops can be modelled with LLVM intrinsics:

SPIR-V Dialect op  | LLVM Dialect intrinsic
:----------------: | :--------------------:
`spirv.BitCount`   | `llvm.intr.ctpop`
`spirv.BitReverse` | `llvm.intr.bitreverse`

#### `spirv.Not`

`spirv.Not` is modelled with a `xor` operation with a mask with all bits set.

```mlir
                              %mask = llvm.mlir.constant(-1 : i32) : i32
%0 = spirv.Not %op : i32  =>  %0    = llvm.xor %op, %mask : i32
```

#### Bitfield ops

SPIR-V dialect has three bitfield ops: `spirv.BitFieldInsert`,
`spirv.BitFieldSExtract` and `spirv.BitFieldUExtract`. This section first
outlines the general design of the conversion patterns for these ops, and then
describes each of them.

All of these ops take `base`, `offset` and `count` (`insert` for
`spirv.BitFieldInsert`) as arguments. There are two important things to note:

*   `offset` and `count` are always scalar. This means that we can have the
    following case:

    ```mlir
    %0 = spirv.BitFieldSExtract %base, %offset, %count : vector<2xi32>, i8, i8
    ```

    To be able to proceed with the conversion algorithms described below, all
    operands have to be of the same type and bitwidth. This requires
    broadcasting `offset` and `count` to vectors. For the case above this
    gives:

    ```mlir
    // Broadcasting offset
    %offset0 = llvm.mlir.undef : vector<2xi8>
    %zero = llvm.mlir.constant(0 : i32) : i32
    %offset1 = llvm.insertelement %offset, %offset0[%zero : i32] : vector<2xi8>
    %one = llvm.mlir.constant(1 : i32) : i32
    %vec_offset = llvm.insertelement %offset, %offset1[%one : i32] : vector<2xi8>

    // Broadcasting count
    // ...
    ```

*   `offset` and `count` may have different bitwidths from `base`. In this
    case, both of these operands have to be zero extended (since they are
    treated as unsigned by the specification) or truncated. For the above
    example it would be:

    ```mlir
    // Zero extending offset after broadcasting
    %res_offset = llvm.zext %vec_offset : vector<2xi8> to vector<2xi32>
    ```

    Also, note that if the bitwidth of `offset` or `count` is greater than the
    bitwidth of `base`, truncation is still permitted.
    This is because the ops have defined behaviour only when `offset` and
    `count` are less than the size of `base`. This creates a natural upper
    bound on the values `offset` and `count` can take, which is 64. This can be
    expressed in fewer than 8 bits.

Now, having these two cases in mind, we can proceed with conversion for the ops
and their operands.

##### `spirv.BitFieldInsert`

This operation is implemented as a series of LLVM Dialect operations. The first
step is to create a mask with bits set outside [`offset`, `offset` + `count` -
1]. Then the bits of `base` that lie outside [`offset`, `offset` + `count` - 1]
are extracted unchanged. The result is `or`ed with the shifted `insert`.

```mlir
// Create mask
// %minus_one = llvm.mlir.constant(-1 : i32) : i32
// %t0        = llvm.shl %minus_one, %count : i32
// %t1        = llvm.xor %t0, %minus_one : i32
// %t2        = llvm.shl %t1, %offset : i32
// %mask      = llvm.xor %t2, %minus_one : i32

// Extract unchanged bits from the Base
// %new_base  = llvm.and %base, %mask : i32

// Insert new bits
// %sh_insert = llvm.shl %insert, %offset : i32
// %res       = llvm.or %new_base, %sh_insert : i32
%res = spirv.BitFieldInsert %base, %insert, %offset, %count : i32, i32, i32
```

##### `spirv.BitFieldSExtract`

To implement `spirv.BitFieldSExtract`, `base` is shifted left by [sizeof(`base`) -
(`count` + `offset`)], so that the bit at `offset` + `count` - 1 is the most
significant bit. Afterwards, the result is shifted right, filling the vacated
bits with the sign bit.

```mlir
// Calculate the amount to shift left.
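// Worked example (illustrative values, not from the conversion itself): for a
// 32-bit base with offset = 4 and count = 8, the left shift is
// 32 - (8 + 4) = 20, which moves bit 11 (the top bit of the field) to bit 31.
// The arithmetic right shift is then 4 + 20 = 24 = 32 - count, leaving the
// sign-extended field in bits [0, 7].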
// %size    = llvm.mlir.constant(32 : i32) : i32
// %t0      = llvm.add %count, %offset : i32
// %t1      = llvm.sub %size, %t0 : i32

// Shift left and then right to extract the bits
// %sh_left = llvm.shl %base, %t1 : i32
// %t2      = llvm.add %offset, %t1 : i32
// %res     = llvm.ashr %sh_left, %t2 : i32
%res = spirv.BitFieldSExtract %base, %offset, %count : i32, i32, i32
```

##### `spirv.BitFieldUExtract`

This op uses a pattern similar to that of `spirv.BitFieldInsert`. First, a mask
with bits set at [0, `count` - 1] is created. Then `base` is shifted right by
`offset` and the mask is applied.

```mlir
// Create a mask
// %minus_one = llvm.mlir.constant(-1 : i32) : i32
// %t0        = llvm.shl %minus_one, %count : i32
// %mask      = llvm.xor %t0, %minus_one : i32

// Shift Base and apply mask
// %sh_base   = llvm.lshr %base, %offset : i32
// %res       = llvm.and %sh_base, %mask : i32
%res = spirv.BitFieldUExtract %base, %offset, %count : i32, i32, i32
```

### Cast ops

#### Direct conversions

SPIR-V Dialect op   | LLVM Dialect op
:-----------------: | :-------------:
`spirv.ConvertFToS` | `llvm.fptosi`
`spirv.ConvertFToU` | `llvm.fptoui`
`spirv.ConvertSToF` | `llvm.sitofp`
`spirv.ConvertUToF` | `llvm.uitofp`

#### `spirv.Bitcast`

This operation has a direct counterpart in LLVM: `llvm.bitcast`. It is treated
separately since, apart from regular scalars and vectors of numerical type, it
also supports bit pattern-preserving pointer to pointer conversion.

#### Special cases

Special cases include `spirv.FConvert`, `spirv.SConvert` and `spirv.UConvert`. These
operations are either a truncate or an extend. Let's denote the operand
component width as A, and result component width as R.
Then, the following mappings are used:

##### `spirv.FConvert`

Case  | LLVM Dialect op
:---: | :-------------:
A < R | `llvm.fpext`
A > R | `llvm.fptrunc`

##### `spirv.SConvert`

Case  | LLVM Dialect op
:---: | :-------------:
A < R | `llvm.sext`
A > R | `llvm.trunc`

##### `spirv.UConvert`

Case  | LLVM Dialect op
:---: | :-------------:
A < R | `llvm.zext`
A > R | `llvm.trunc`

The case when A = R is not possible, based on the SPIR-V Dialect specification:

> The component width cannot equal the component width in Result Type.

### Comparison ops

SPIR-V comparison ops are mapped to LLVM `icmp` and `fcmp` operations.

SPIR-V Dialect op              | LLVM Dialect op
:----------------------------: | :---------------:
`spirv.IEqual`                 | `llvm.icmp "eq"`
`spirv.INotEqual`              | `llvm.icmp "ne"`
`spirv.FOrdEqual`              | `llvm.fcmp "oeq"`
`spirv.FOrdGreaterThan`        | `llvm.fcmp "ogt"`
`spirv.FOrdGreaterThanEqual`   | `llvm.fcmp "oge"`
`spirv.FOrdLessThan`           | `llvm.fcmp "olt"`
`spirv.FOrdLessThanEqual`      | `llvm.fcmp "ole"`
`spirv.FOrdNotEqual`           | `llvm.fcmp "one"`
`spirv.FUnordEqual`            | `llvm.fcmp "ueq"`
`spirv.FUnordGreaterThan`      | `llvm.fcmp "ugt"`
`spirv.FUnordGreaterThanEqual` | `llvm.fcmp "uge"`
`spirv.FUnordLessThan`         | `llvm.fcmp "ult"`
`spirv.FUnordLessThanEqual`    | `llvm.fcmp "ule"`
`spirv.FUnordNotEqual`         | `llvm.fcmp "une"`
`spirv.SGreaterThan`           | `llvm.icmp "sgt"`
`spirv.SGreaterThanEqual`      | `llvm.icmp "sge"`
`spirv.SLessThan`              | `llvm.icmp "slt"`
`spirv.SLessThanEqual`         | `llvm.icmp "sle"`
`spirv.UGreaterThan`           | `llvm.icmp "ugt"`
`spirv.UGreaterThanEqual`      | `llvm.icmp "uge"`
`spirv.ULessThan`              | `llvm.icmp "ult"`
`spirv.ULessThanEqual`         | `llvm.icmp "ule"`

### Composite ops

Currently, conversion supports rewrite patterns for `spirv.CompositeExtract` and
`spirv.CompositeInsert`.
We distinguish two cases for these operations: when the composite object is a
vector, and when the composite object is of a non-vector type (*i.e.* struct,
array or runtime array).

Composite type | SPIR-V Dialect op        | LLVM Dialect op
:------------: | :----------------------: | :-------------------:
vector         | `spirv.CompositeExtract` | `llvm.extractelement`
vector         | `spirv.CompositeInsert`  | `llvm.insertelement`
non-vector     | `spirv.CompositeExtract` | `llvm.extractvalue`
non-vector     | `spirv.CompositeInsert`  | `llvm.insertvalue`

### `spirv.EntryPoint` and `spirv.ExecutionMode`

First of all, it is important to note that there is no direct representation of
entry points in LLVM. At the moment, we use the following approach:

* `spirv.EntryPoint` is simply removed.

* In contrast, `spirv.ExecutionMode` may contain important information about the
  entry point. For example, `LocalSize` provides information about the
  work-group size that can be reused.

  In order to preserve this information, `spirv.ExecutionMode` is converted to a
  struct global variable that stores the execution mode id and any variables
  associated with it. In C, the struct has the structure shown below.

  ```c
  // No values are associated      // There are values that are associated
  // with this entry point.        // with this entry point.
  struct {                         struct {
    int32_t executionMode;           int32_t executionMode;
  };                                 int32_t values[];
                                   };
  ```

  ```mlir
  // spirv.ExecutionMode @empty "ContractionOff"
  llvm.mlir.global external constant @{{.*}}() : !llvm.struct<(i32)> {
    %0   = llvm.mlir.undef : !llvm.struct<(i32)>
    %1   = llvm.mlir.constant(31 : i32) : i32
    %ret = llvm.insertvalue %1, %0[0] : !llvm.struct<(i32)>
    llvm.return %ret : !llvm.struct<(i32)>
  }
  ```

### Logical ops

Logical ops follow a pattern similar to that of bitwise ops, with the
difference that they operate on `i1` or vector of `i1` values. The following
mapping is used to emulate the behaviour of the SPIR-V ops:

SPIR-V Dialect op       | LLVM Dialect op
:---------------------: | :--------------:
`spirv.LogicalAnd`      | `llvm.and`
`spirv.LogicalOr`       | `llvm.or`
`spirv.LogicalEqual`    | `llvm.icmp "eq"`
`spirv.LogicalNotEqual` | `llvm.icmp "ne"`

`spirv.LogicalNot` has the same conversion pattern as bitwise `spirv.Not`. It is
modelled with a `xor` operation with a mask with all bits set.

```mlir
                                    %mask = llvm.mlir.constant(-1 : i1) : i1
%0 = spirv.LogicalNot %op : i1  =>  %0    = llvm.xor %op, %mask : i1
```

### Memory ops

This section describes the conversion patterns for SPIR-V dialect operations
that concern memory.

#### `spirv.AccessChain`

`spirv.AccessChain` is mapped to the `llvm.getelementptr` op. To create a valid
LLVM op, a 0 index is also prepended to the `spirv.AccessChain`'s indices list
so that the access goes through the pointer.

```mlir
// Access the 1st element of the array
%i   = spirv.Constant 1: i32
%var = spirv.Variable : !spirv.ptr<!spirv.struct<f32, !spirv.array<4xf32>>, Function>
%el  = spirv.AccessChain %var[%i, %i] : !spirv.ptr<!spirv.struct<f32, !spirv.array<4xf32>>, Function>, i32, i32

// Corresponding LLVM dialect code
%i   = ...
%var = ...
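// Reading the indices: the leading 0 index (%0) steps through the pointer
// itself; the two %i indices (both equal to 1) then select struct member 1
// (the array) and element 1 of that array.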
%0   = llvm.mlir.constant(0 : i32) : i32
%el  = llvm.getelementptr %var[%0, %i, %i] : (!llvm.ptr, i32, i32, i32) -> !llvm.ptr, !llvm.struct<packed (f32, array<4 x f32>)>
```

#### `spirv.Load` and `spirv.Store`

These ops are converted to their LLVM counterparts: `llvm.load` and
`llvm.store`. If the op has a memory access attribute, then there are the
following cases, based on the value of the attribute:

*   **Aligned**: alignment is passed on to the LLVM op builder. For example,
    `spirv.Store "Function" %ptr, %val ["Aligned", 4] : f32` is converted to
    `llvm.store %val, %ptr {alignment = 4 : i64} : f32, !llvm.ptr`.

*   **None**: same case as if there is no memory access attribute.

*   **Nontemporal**: the `nontemporal` flag is set. For example,
    `%res = spirv.Load "Function" %ptr ["Nontemporal"] : f32` is converted to
    `%res = llvm.load %ptr {nontemporal} : !llvm.ptr -> f32`.

*   **Volatile**: the op is marked as `volatile`. For example,
    `%res = spirv.Load "Function" %ptr ["Volatile"] : f32` is converted to
    `%res = llvm.load volatile %ptr : !llvm.ptr -> f32`.

Otherwise the conversion fails, as other cases (`MakePointerAvailable`,
`MakePointerVisible`, `NonPrivatePointer`) are not supported yet.

#### `spirv.GlobalVariable` and `spirv.mlir.addressof`

`spirv.GlobalVariable` is modelled with the `llvm.mlir.global` op. However,
there is a difference that has to be pointed out.

In SPIR-V dialect, the global variable returns a pointer, whereas in LLVM
dialect the global holds an actual value. This difference is handled by the
`spirv.mlir.addressof` and `llvm.mlir.addressof` ops, which both return a
pointer and are used to reference the global.

```mlir
// Original SPIR-V module
spirv.module Logical GLSL450 {
  spirv.GlobalVariable @struct : !spirv.ptr<!spirv.struct<f32, !spirv.array<10xf32>>, Private>
  spirv.func @func() -> () "None" {
    %0 = spirv.mlir.addressof @struct : !spirv.ptr<!spirv.struct<f32, !spirv.array<10xf32>>, Private>
    spirv.Return
  }
}

// Converted result
module {
  llvm.mlir.global private @struct() : !llvm.struct<packed (f32, array<10 x f32>)>
  llvm.func @func() {
    %0 = llvm.mlir.addressof @struct : !llvm.ptr
    llvm.return
  }
}
```

The SPIR-V to LLVM conversion does not involve modelling of workgroups. Hence,
only the current invocation is in the conversion's scope. This means that
global variables with pointers of `Input`, `Output`, and `Private` storage
classes are supported. Also, the `StorageBuffer` storage class is allowed for
executing [SPIR-V CPU Runner tests](#spir-v-cpu-runner-tests).

Moreover, `bind`, which specifies the descriptor set and the binding number,
and `built_in`, which specifies the SPIR-V `BuiltIn` decoration, have no
conversion into LLVM dialect.

Currently `llvm.mlir.global`s are created with `private` linkage for the
`Private` storage class and `external` linkage for other storage classes, based
on the SPIR-V spec:

> By default, functions and global variables are private to a module and cannot
> be accessed by other modules. However, a module may be written to export or
> import functions and global (module scope) variables.

If the global variable's pointer has the `Input` storage class, then a
`constant` flag is added to the LLVM op:

```mlir
spirv.GlobalVariable @var : !spirv.ptr<f32, Input>  =>  llvm.mlir.global external constant @var() : f32
```

#### `spirv.Variable`

Per the SPIR-V dialect spec, `spirv.Variable` allocates an object in memory,
resulting in a pointer to it, which can be used with `spirv.Load` and `spirv.Store`.
It is also a function-level variable.

`spirv.Variable` is modelled as an `llvm.alloca` op. If initialized, an
additional store instruction is used. Note that there is no initialization for
arrays and structs since constants of these types are not supported in LLVM
dialect (TODO). Also, at the moment initialization is only possible via
`spirv.Constant`.

```mlir
// Conversion of VariableOp without initialization
                                                                %size = llvm.mlir.constant(1 : i32) : i32
%res = spirv.Variable : !spirv.ptr<vector<3xf32>, Function> =>  %res = llvm.alloca %size x vector<3xf32> : (i32) -> !llvm.ptr

// Conversion of VariableOp with initialization
                                                                %c = llvm.mlir.constant(0 : i64) : i64
%c = spirv.Constant 0 : i64                                     %size = llvm.mlir.constant(1 : i32) : i32
%res = spirv.Variable init(%c) : !spirv.ptr<i64, Function>  =>  %res = llvm.alloca %size x i64 : (i32) -> !llvm.ptr
                                                                llvm.store %c, %res : i64, !llvm.ptr
```

Note that simple conversion to `alloca` may not be sufficient if the code has
some scoping. For example, if ops executed in a loop are converted into
`alloca`s, a stack overflow may occur. For this case, a
`stacksave`/`stackrestore` pair can be used (TODO).

### Miscellaneous ops with direct conversions

There are multiple SPIR-V ops that do not fit in a particular group but can be
converted directly to LLVM dialect. Their conversion is addressed in this
section.

SPIR-V Dialect op | LLVM Dialect op
:---------------: | :---------------:
`spirv.Select`    | `llvm.select`
`spirv.Undef`     | `llvm.mlir.undef`

### Shift ops

Shift operates on two operands: `shift` and `base`.

In SPIR-V dialect, `shift` and `base` may have different bitwidths. In
contrast, in LLVM Dialect both `base` and `shift` have to be of the same
bitwidth. This leads to the following conversions:

*   if `base` has the same bitwidth as `shift`, the conversion is
    straightforward.

*   if `base` has a greater bitwidth than `shift`, `shift` is sign or zero
    extended first. Then the extended value is passed to the shift.

*   otherwise, the conversion is considered to be illegal.

```mlir
// Shift without extension
%res0 = spirv.ShiftRightArithmetic %0, %2 : i32, i32  =>  %res0 = llvm.ashr %0, %2 : i32

// Shift with extension
                                                          %ext  = llvm.sext %1 : i16 to i32
%res1 = spirv.ShiftRightArithmetic %0, %1 : i32, i16  =>  %res1 = llvm.ashr %0, %ext : i32
```

### `spirv.Constant`

At the moment `spirv.Constant` conversion supports scalar and vector constants
**only**.

#### Mapping

`spirv.Constant` is mapped to `llvm.mlir.constant`. This is a straightforward
conversion pattern with a special case when the argument is signed or unsigned.

#### Special case

A SPIR-V constant can be a signed or unsigned integer. Since LLVM Dialect does
not have signedness semantics, this case has to be handled separately.

The conversion casts the constant value attribute to a signless integer or a
vector of signless integers. This is correct because in SPIR-V, like in LLVM,
how to interpret an integer number is also dictated by the opcode. However, in
reality a hardware implementation might show unexpected behavior. Therefore, it
is better to handle it case-by-case, given that the purpose of the conversion is
not to cover all possible corner cases.
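
To illustrate why dropping the signedness of a constant is sound, here is a
minimal Python sketch. It is not part of the conversion, and the helper names
`as_unsigned` and `as_signed` are made up for this illustration. It shows that
one stored bit pattern can be read either way, so the interpretation is left to
the op that consumes the value.

```python
def as_unsigned(bits: int, width: int) -> int:
    """Read the low `width` bits as an unsigned integer."""
    return bits & ((1 << width) - 1)


def as_signed(bits: int, width: int) -> int:
    """Read the low `width` bits as a two's-complement signed integer."""
    value = as_unsigned(bits, width)
    sign_bit = 1 << (width - 1)
    return value - (1 << width) if value & sign_bit else value


# One 8-bit pattern, stored once as a signless constant.
pattern = 0b1111_1110

# A signed op reads the pattern one way, an unsigned op the other way;
# the stored bits are identical in both cases.
signed_view = as_signed(pattern, 8)
unsigned_view = as_unsigned(pattern, 8)
print(signed_view, unsigned_view)
```

Both views come from the same eight bits, which is why a single signless
constant is enough on the LLVM side.
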

```mlir
// %0 = llvm.mlir.constant(0 : i8) : i8
%0 = spirv.Constant 0 : i8

// %1 = llvm.mlir.constant(dense<[2, 3, 4]> : vector<3xi32>) : vector<3xi32>
%1 = spirv.Constant dense<[2, 3, 4]> : vector<3xui32>
```

### Not implemented ops

There is no support for the following ops:

* All atomic ops
* All group ops
* All matrix ops
* All CL ops

As well as:

* spirv.CompositeConstruct
* spirv.ControlBarrier
* spirv.CopyMemory
* spirv.FMod
* spirv.GL.Acos
* spirv.GL.Asin
* spirv.GL.Atan
* spirv.GL.Cosh
* spirv.GL.FSign
* spirv.GL.SAbs
* spirv.GL.Sinh
* spirv.GL.SSign
* spirv.MemoryBarrier
* spirv.mlir.referenceof
* spirv.SMod
* spirv.SpecConstant
* spirv.Unreachable
* spirv.VectorExtractDynamic

## Control flow conversion

### Branch ops

`spirv.Branch` and `spirv.BranchConditional` are mapped to `llvm.br` and
`llvm.cond_br`. Branch weights for `spirv.BranchConditional` are mapped to the
corresponding `branch_weights` attribute of `llvm.cond_br`. When translated to
proper LLVM, `branch_weights` are converted into LLVM metadata associated with
the conditional branch.

### `spirv.FunctionCall`

`spirv.FunctionCall` maps to `llvm.call`. For example:

```mlir
%0 = spirv.FunctionCall @foo() : () -> i32    =>    %0 = llvm.call @foo() : () -> i32
spirv.FunctionCall @bar(%0) : (i32) -> ()     =>    llvm.call @bar(%0) : (i32) -> ()
```

### `spirv.mlir.selection` and `spirv.mlir.loop`

Control flow within `spirv.mlir.selection` and `spirv.mlir.loop` is lowered directly
to LLVM via branch ops. The conversion can only be applied to a selection or
loop in which all blocks are reachable. Moreover, selection and loop control
attributes (such as `Flatten` or `Unroll`) are not supported at the moment.

```mlir
// Conversion of selection
%cond = spirv.Constant true                           %cond = llvm.mlir.constant(true) : i1
spirv.mlir.selection {
  spirv.BranchConditional %cond, ^true, ^false          llvm.cond_br %cond, ^true, ^false

^true:                                                ^true:
  // True block code                                    // True block code
  spirv.Branch ^merge                           =>      llvm.br ^merge

^false:                                               ^false:
  // False block code                                   // False block code
  spirv.Branch ^merge                                   llvm.br ^merge

^merge:                                               ^merge:
  spirv.mlir.merge                                      llvm.br ^continue
}
// Remaining code                                     ^continue:
                                                        // Remaining code
```

```mlir
// Conversion of loop
%cond = spirv.Constant true                           %cond = llvm.mlir.constant(true) : i1
spirv.mlir.loop {
  spirv.Branch ^header                                  llvm.br ^header

^header:                                              ^header:
  // Header code                                        // Header code
  spirv.BranchConditional %cond, ^body, ^merge  =>      llvm.cond_br %cond, ^body, ^merge

^body:                                                ^body:
  // Body code                                          // Body code
  spirv.Branch ^continue                                llvm.br ^continue

^continue:                                            ^continue:
  // Continue code                                      // Continue code
  spirv.Branch ^header                                  llvm.br ^header

^merge:                                               ^merge:
  spirv.mlir.merge                                      llvm.br ^remaining
}
// Remaining code                                     ^remaining:
                                                        // Remaining code
```

## Decorations conversion

**Note: these conversions have not been implemented yet**

## GLSL extended instruction set

This section describes how SPIR-V ops from the GLSL extended instruction set
are mapped to LLVM Dialect.
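
Besides one-to-one intrinsic mappings, this section lowers a few ops through
standard identities (inverse square root, tangent and hyperbolic tangent, see
"Special cases" in this section). The following Python sketch checks those
identities numerically. It is only an illustration: the function names are
made up here, and Python's `math` module stands in for the LLVM intrinsics.

```python
import math


def inverse_sqrt(x: float) -> float:
    """1 / sqrt(x): a square root followed by a division."""
    return 1.0 / math.sqrt(x)


def tan_via_sin_cos(x: float) -> float:
    """tan(x) computed as sin(x) / cos(x)."""
    return math.sin(x) / math.cos(x)


def tanh_via_exp(x: float) -> float:
    """tanh(x) computed as (exp(2x) - 1) / (exp(2x) + 1)."""
    e = math.exp(2.0 * x)
    return (e - 1.0) / (e + 1.0)


# Compare each decomposition against the library function it emulates.
for x in (0.1, 0.5, 1.0):
    assert math.isclose(tan_via_sin_cos(x), math.tan(x))
    assert math.isclose(tanh_via_exp(x), math.tanh(x))
    assert math.isclose(inverse_sqrt(x) ** 2, 1.0 / x)
```

In this sketch, `tanh_via_exp` raises `OverflowError` for large `x` because
`math.exp(2.0 * x)` exceeds the double-precision range, so the identity is only
exercised on moderate inputs.
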

### Direct conversions

SPIR-V Dialect op | LLVM Dialect op
:---------------: | :----------------:
`spirv.GL.Ceil`   | `llvm.intr.ceil`
`spirv.GL.Cos`    | `llvm.intr.cos`
`spirv.GL.Exp`    | `llvm.intr.exp`
`spirv.GL.FAbs`   | `llvm.intr.fabs`
`spirv.GL.Floor`  | `llvm.intr.floor`
`spirv.GL.FMax`   | `llvm.intr.maxnum`
`spirv.GL.FMin`   | `llvm.intr.minnum`
`spirv.GL.Log`    | `llvm.intr.log`
`spirv.GL.Sin`    | `llvm.intr.sin`
`spirv.GL.Sqrt`   | `llvm.intr.sqrt`
`spirv.GL.SMax`   | `llvm.intr.smax`
`spirv.GL.SMin`   | `llvm.intr.smin`

### Special cases

`spirv.GL.InverseSqrt` is mapped to:

```mlir
                                            %one  = llvm.mlir.constant(1.0 : f32) : f32
%res = spirv.GL.InverseSqrt %arg : f32  =>  %sqrt = "llvm.intr.sqrt"(%arg) : (f32) -> f32
                                            %res  = llvm.fdiv %one, %sqrt : f32
```

`spirv.GL.Tan` is mapped to:

```mlir
                                    %sin = "llvm.intr.sin"(%arg) : (f32) -> f32
%res = spirv.GL.Tan %arg : f32  =>  %cos = "llvm.intr.cos"(%arg) : (f32) -> f32
                                    %res = llvm.fdiv %sin, %cos : f32
```

`spirv.GL.Tanh` is modelled using the equality `tanh(x) = {exp(2x) - 1}/{exp(2x) +
1}`:

```mlir
                                     %two   = llvm.mlir.constant(2.0 : f32) : f32
                                     %2xArg = llvm.fmul %two, %arg : f32
                                     %exp   = "llvm.intr.exp"(%2xArg) : (f32) -> f32
%res = spirv.GL.Tanh %arg : f32  =>  %one   = llvm.mlir.constant(1.0 : f32) : f32
                                     %num   = llvm.fsub %exp, %one : f32
                                     %den   = llvm.fadd %exp, %one : f32
                                     %res   = llvm.fdiv %num, %den : f32
```

## Function conversion and related ops

This section describes the conversion of function-related operations from SPIR-V
to LLVM dialect.

### `spirv.func`

This op declares or defines a SPIR-V function and it is converted to
`llvm.func`. This conversion handles signature conversion, and remaps function
control attributes to the LLVM dialect function
[`passthrough` attribute](Dialects/LLVM.md/#attribute-pass-through).

The following mapping is used to map
[SPIR-V function control][SPIRVFunctionAttributes] to
[LLVM function attributes][LLVMFunctionAttributes]:

SPIR-V Function Control Attributes | LLVM Function Attributes
:--------------------------------: | :---------------------------:
None                               | No function attributes passed
Inline                             | `alwaysinline`
DontInline                         | `noinline`
Pure                               | `readonly`
Const                              | `readnone`

### `spirv.Return` and `spirv.ReturnValue`

In LLVM IR, functions may return either zero or one value. Hence, we map both
ops to `llvm.return` with or without a return value.

## Module ops

A module in SPIR-V has one region that contains one block. It is defined via
the `spirv.module` op, which also takes a range of attributes:

* Addressing model
* Memory model
* Version-Capability-Extension attribute

`spirv.module` is converted into `ModuleOp`. This plays the role of an enclosing
scope for LLVM ops. At the moment, SPIR-V module attributes are ignored.

## SPIR-V CPU Runner Tests

The `mlir-runner` has support for executing a `gpu` dialect kernel on the
CPU via SPIR-V to LLVM dialect conversion. This is referred to as the "SPIR-V
CPU Runner". The `--link-nested-modules` flag needs to be passed for this.
Currently, only single-threaded kernels are supported.

To build the required runtime libraries, add the following option to `cmake`:
`-DMLIR_ENABLE_SPIRV_CPU_RUNNER=1`

### Pipeline

The `gpu` module with the kernel and the host code undergo the following
transformations:

*   Convert the `gpu` module into SPIR-V dialect, lower ABI attributes and
    update version, capability and extension.

*   Emulate the kernel call by converting the launching operation into a normal
    function call. Data is passed from the host side to the device by copying
    it to global variables.
    These are created in both the host and the kernel code and later linked
    when nested modules are folded.

*   Convert SPIR-V dialect kernel to LLVM dialect via the new conversion path.

After these passes, the IR transforms into a nested LLVM module - a main module
representing the host code and a kernel module. These modules are linked and
executed using `ExecutionEngine`.

### Walk-through

This section gives a detailed overview of the IR changes while running
SPIR-V CPU Runner tests. First, consider that we have the following IR. (For
simplicity some type annotations and function implementations have been
omitted).

```mlir
gpu.module @foo {
  gpu.func @bar(%arg: memref<8xi32>) {
    // Kernel code.
    gpu.return
  }
}

func.func @main() {
  // Fill the buffer with some data
  %buffer = memref.alloc : memref<8xi32>
  %data = ...
  call fillBuffer(%buffer, %data)

  "gpu.launch_func"(/*grid dimensions*/, %buffer) {
    kernel = @foo::bar
  }
}
```

Lowering `gpu` dialect to SPIR-V dialect results in

```mlir
spirv.module @__spv__foo /*VCE triple and other metadata here*/ {
  spirv.GlobalVariable @__spv__foo_arg bind(0,0) : ...
  spirv.func @bar() {
    // Kernel code.
  }
  spirv.EntryPoint @bar, ...
}

func.func @main() {
  // Fill the buffer with some data.
  %buffer = memref.alloc : memref<8xi32>
  %data = ...
  call fillBuffer(%buffer, %data)

  "gpu.launch_func"(/*grid dimensions*/, %buffer) {
    kernel = @foo::bar
  }
}
```

Then, the lowering from standard dialect to LLVM dialect is applied to the host
code.

```mlir
spirv.module @__spv__foo /*VCE triple and other metadata here*/ {
  spirv.GlobalVariable @__spv__foo_arg bind(0,0) : ...
  spirv.func @bar() {
    // Kernel code.
  }
  spirv.EntryPoint @bar, ...
}

// Kernel function declaration.
llvm.func @__spv__foo_bar() : ...

llvm.func @main() {
  // Fill the buffer with some data.
  llvm.call fillBuffer(%buffer, %data)

  // Copy data to the global variable, call kernel, and copy the data back.
  %addr = llvm.mlir.addressof @__spv__foo_arg_descriptor_set0_binding0 : ...
  "llvm.intr.memcpy"(%addr, %buffer) : ...
  llvm.call @__spv__foo_bar()
  "llvm.intr.memcpy"(%buffer, %addr) : ...

  llvm.return
}
```

Finally, the SPIR-V module is converted to LLVM and the symbol names are
resolved for the linkage.

```mlir
module @__spv__foo {
  llvm.mlir.global @__spv__foo_arg_descriptor_set0_binding0 : ...
  llvm.func @__spv__foo_bar() {
    // Kernel code.
  }
}

// Kernel function declaration.
llvm.func @__spv__foo_bar() : ...

llvm.func @main() {
  // Fill the buffer with some data.
  llvm.call fillBuffer(%buffer, %data)

  // Copy data to the global variable, call kernel, and copy the data back.
  %addr = llvm.mlir.addressof @__spv__foo_arg_descriptor_set0_binding0 : ...
  "llvm.intr.memcpy"(%addr, %buffer) : ...
  llvm.call @__spv__foo_bar()
  "llvm.intr.memcpy"(%buffer, %addr) : ...

  llvm.return
}
```

[LLVMFunctionAttributes]: https://llvm.org/docs/LangRef.html#function-attributes
[SPIRVFunctionAttributes]: https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#_a_id_function_control_a_function_control
[VulkanLayoutUtils]: https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Dialect/SPIRV/LayoutUtils.h