# 'llvm' Dialect

This dialect maps [LLVM IR](https://llvm.org/docs/LangRef.html) into MLIR by
defining the corresponding operations and types. LLVM IR metadata is usually
represented as MLIR attributes, which offer additional structure verification.

We use "LLVM IR" to designate the
[intermediate representation of LLVM](https://llvm.org/docs/LangRef.html) and
"LLVM _dialect_" or "LLVM IR _dialect_" to refer to this MLIR dialect.

Unless explicitly stated otherwise, the semantics of the LLVM dialect operations
must correspond to the semantics of LLVM IR instructions and any divergence is
considered a bug. The dialect also contains auxiliary operations that smooth
over the differences in the IR structure, e.g., MLIR does not have `phi`
operations and LLVM IR does not have a `constant` operation. These auxiliary
operations are systematically prefixed with `mlir`, e.g. `llvm.mlir.constant`
where `llvm.` is the dialect namespace prefix.

[TOC]

## Dependency on LLVM IR

LLVM dialect is not expected to depend on any object that requires an
`LLVMContext`, such as an LLVM IR instruction or type. Instead, MLIR provides
thread-safe alternatives compatible with the rest of the infrastructure. The
dialect is allowed to depend on the LLVM IR objects that don't require a
context, such as data layout and triple description.

## Module Structure

IR modules use the built-in MLIR `ModuleOp` and support all its features. In
particular, modules can be named, nested and are subject to symbol visibility.
Modules can contain any operations, including LLVM functions and globals.

### Data Layout and Triple

An IR module may have an optional data layout and triple information attached
using MLIR attributes `llvm.data_layout` and `llvm.target_triple`, respectively.
Both are string attributes with the
[same syntax](https://llvm.org/docs/LangRef.html#data-layout) as in LLVM IR and
are verified to be correct.
They can be defined as follows.

```mlir
module attributes {llvm.data_layout = "e",
                   llvm.target_triple = "aarch64-linux-android"} {
  // module contents
}
```

### Functions

LLVM functions are represented by a special operation, `llvm.func`, that has
syntax similar to that of the built-in function operation but supports
LLVM-related features such as linkage and variadic argument lists. See the
detailed description in the operation list [below](#llvmfunc-llvmllvmfuncop).

### PHI Nodes and Block Arguments

MLIR uses block arguments instead of PHI nodes to communicate values between
blocks. Therefore, the LLVM dialect has no operation directly equivalent to
`phi` in LLVM IR. Instead, all terminators can pass values as successor
operands, and these values are forwarded as block arguments when the control
flow is transferred.

For example:

```mlir
^bb1:
  %0 = llvm.add %arg0, %cst : i32
  llvm.br ^bb2(%0 : i32)

// If the control flow comes from ^bb1, %arg1 == %0.
^bb2(%arg1: i32):
  // ...
```

is equivalent to LLVM IR

```llvm
%0:
  %1 = add i32 %arg0, %cst
  br %3

%3:
  %arg1 = phi [%1, %0], //...
```

Since there is no need to use the block identifier to differentiate the source
of different values, the LLVM dialect supports terminators that transfer the
control flow to the same block with different arguments. For example:

```mlir
^bb1:
  llvm.cond_br %cond, ^bb2(%0 : i32), ^bb2(%1 : i32)

^bb2(%arg0: i32):
  // ...
```

### Context-Level Values

Some value kinds in LLVM IR, such as constants and undefs, are uniqued in
context and used directly in relevant operations. MLIR does not support such
values for thread-safety and concept parsimony reasons.
Instead, regular values
are produced by dedicated operations that have the corresponding semantics:
[`llvm.mlir.constant`](#llvmmlirconstant-llvmconstantop),
[`llvm.mlir.undef`](#llvmmlirundef-llvmundefop),
[`llvm.mlir.poison`](#llvmmlirpoison-llvmpoisonop),
[`llvm.mlir.zero`](#llvmmlirzero-llvmzeroop). Note how these operations are
prefixed with `mlir.` to indicate that they don't belong to LLVM IR but are only
necessary to model it in MLIR. The values produced by these operations are
usable just like any other value.

Examples:

```mlir
// Create an undefined value of structure type with a 32-bit integer followed
// by a float.
%0 = llvm.mlir.undef : !llvm.struct<(i32, f32)>

// Null pointer.
%1 = llvm.mlir.zero : !llvm.ptr

// Create a zero-initialized value of structure type with a 32-bit integer
// followed by a float.
%2 = llvm.mlir.zero : !llvm.struct<(i32, f32)>

// Constant 42 as i32.
%3 = llvm.mlir.constant(42 : i32) : i32

// Splat dense vector constant.
%4 = llvm.mlir.constant(dense<1.0> : vector<4xf32>) : vector<4xf32>
```

Note that constants list the type twice. This is an artifact of the LLVM dialect
not using built-in types, which are used for typed MLIR attributes. The syntax
will be reevaluated after considering composite constants.

### Globals

Global variables are also defined using a special operation,
[`llvm.mlir.global`](#llvmmlirglobal-llvmglobalop), located at the module
level. Globals are MLIR symbols and are identified by their name.

Since functions need to be isolated-from-above, i.e. values defined outside the
function cannot be directly used inside the function, an additional operation,
[`llvm.mlir.addressof`](#llvmmliraddressof-llvmaddressofop), is provided to
locally define a value containing the _address_ of a global.
The actual value
can then be loaded from that pointer, or a new value can be stored into it if
the global is not declared constant. This is similar to LLVM IR where globals
are accessed through name and have a pointer type.

### Linkage

Module-level named objects in the LLVM dialect, namely functions and globals,
have an optional _linkage_ attribute derived from LLVM IR
[linkage types](https://llvm.org/docs/LangRef.html#linkage-types). Linkage is
specified by the same keyword as in LLVM IR and is located between the operation
name (`llvm.func` or `llvm.mlir.global`) and the symbol name. If no linkage
keyword is present, `external` linkage is assumed by default. Linkage is
_distinct_ from MLIR symbol visibility.

### Attribute Pass-Through

**WARNING:** this feature MUST NOT be used for any real workload. It is
exclusively intended for quick prototyping. After that, attributes must be
introduced as proper first-class concepts in the dialect.

The LLVM dialect provides a mechanism to forward function-level attributes to
LLVM IR using the `passthrough` attribute. This is an array attribute containing
either string attributes or array attributes. In the former case, the value of
the string is interpreted as the name of an LLVM IR function attribute. In the
latter case, the array is expected to contain exactly two string attributes, the
first corresponding to the name of an LLVM IR function attribute, and the second
corresponding to its value. Note that even integer LLVM IR function attributes
have their value represented in the string form.

Example:

```mlir
llvm.func @func() attributes {
  passthrough = ["readonly",           // value-less attribute
                 ["alignstack", "4"],  // integer attribute with value
                 ["other", "attr"]]    // attribute unknown to LLVM
} {
  llvm.return
}
```

If the attribute is not known to LLVM IR, it will be attached as a string
attribute.
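
To tie together the Globals and Linkage sections above, the following is a
minimal sketch of a global with `internal` linkage that is read through
`llvm.mlir.addressof`. The symbol names `@counter` and `@read_counter` are
hypothetical, and the syntax assumes a recent MLIR version with opaque pointers.

```mlir
// A mutable global with internal linkage, initialized to 42. The linkage
// keyword sits between the operation name and the symbol name.
llvm.mlir.global internal @counter(42 : i32) : i32

llvm.func @read_counter() -> i32 {
  // Functions are isolated from above, so the address of the global is
  // materialized locally as a regular SSA value.
  %addr = llvm.mlir.addressof @counter : !llvm.ptr
  // Load the current value through the pointer.
  %val = llvm.load %addr : !llvm.ptr -> i32
  llvm.return %val : i32
}
```

Because `@counter` is not declared constant, a store through `%addr` would also
be permitted.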

## Types

LLVM dialect uses built-in types whenever possible and defines a set of
complementary types, which correspond to the LLVM IR types that cannot be
directly represented with built-in types. Similarly to other MLIR context-owned
objects, the creation and manipulation of LLVM dialect types is thread-safe.

MLIR does not support module-scoped named type declarations, e.g. `%s = type
{i32, i32}` in LLVM IR. Instead, types must be fully specified at each use,
except for recursive types where only the first reference to a named type needs
to be fully specified. MLIR [type aliases](../LangRef.md/#type-aliases) can be
used to achieve more compact syntax.

The general syntax of LLVM dialect types is `!llvm.`, followed by a type kind
identifier (e.g., `ptr` for pointer or `struct` for structure) and by an
optional list of type parameters in angle brackets. The dialect follows MLIR
style for types with nested angle brackets and keyword specifiers rather than
using different bracket styles to differentiate types. Types inside the angle
brackets may omit the `!llvm.` prefix for brevity: the parser first attempts to
find a type (starting with `!` or a built-in type) and falls back to accepting a
keyword. For example, `!llvm.struct<(!llvm.ptr, f32)>` and
`!llvm.struct<(ptr, f32)>` are equivalent, with the latter being the canonical
form, and denote a struct containing a pointer and a float.

### Built-in Type Compatibility

LLVM dialect accepts a subset of built-in types that are referred to as _LLVM
dialect-compatible types_. The following types are compatible:

-   Signless integers - `iN` (`IntegerType`).
-   Floating point types - `bf16`, `f16`, `f32`, `f64`, `f80`, `f128`
    (`FloatType`).
-   1D vectors of signless integers or floating point types - `vector<NxT>`
    (`VectorType`).
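
As a small illustration of the list above, the following sketch declares a
function whose arguments all use LLVM dialect-compatible built-in types. The
function name `@takes_compatible` is hypothetical.

```mlir
// Each argument type comes from the list above: a signless integer, a
// floating point type and a 1D vector of floating point values.
llvm.func @takes_compatible(%a: i32, %b: f64, %c: vector<4xf32>) {
  llvm.return
}

// By contrast, a multi-dimensional built-in vector such as vector<2x2xf32>
// is a valid MLIR type but falls outside the list above, so it is expected
// to be rejected in this position.
```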

Note that only a subset of types that can be represented by a given class is
compatible. For example, signed and unsigned integers are not compatible. The
LLVM dialect provides a function, `bool LLVM::isCompatibleType(Type)`, that can
be used as a compatibility check.

Each LLVM IR type corresponds to *exactly one* MLIR type, either built-in or
LLVM dialect type. For example, because `i32` is LLVM-compatible, there is no
`!llvm.i32` type. However, `!llvm.struct<(T, ...)>` is defined in the LLVM
dialect as there is no corresponding built-in type.

### Additional Simple Types

The following non-parametric types derived from the LLVM IR are available in the
LLVM dialect:

-   `!llvm.ppc_fp128` (`LLVMPPCFP128Type`) - 128-bit floating-point value (two
    64-bit values).
-   `!llvm.token` (`LLVMTokenType`) - a non-inspectable value associated with an
    operation.
-   `!llvm.metadata` (`LLVMMetadataType`) - LLVM IR metadata, to be used only if
    the metadata cannot be represented as structured MLIR attributes.
-   `!llvm.void` (`LLVMVoidType`) - does not represent any value; can only
    appear in function results.

These types represent a single value (or an absence thereof in case of `void`)
and correspond to their LLVM IR counterparts.

### Additional Parametric Types

These types are parameterized by the types they contain, e.g., the pointee or
the element type, which can be either compatible built-in or LLVM dialect types.

#### Pointer Types

Pointer types specify an address in memory.

Pointers are [opaque](https://llvm.org/docs/OpaquePointers.html), i.e., do not
indicate the type of the data pointed to, and are intended to simplify LLVM IR
by encoding behavior relevant to the pointee type into operations rather than
into types. Pointers can optionally be parametrized with an address space.
The
address space is an integer, but this choice may be reconsidered if MLIR
implements named address spaces. The syntax of pointer types is as follows:

```
  llvm-ptr-type ::= `!llvm.ptr` (`<` integer-literal `>`)?
```

where the optional group containing the integer literal corresponds to the
address space. All cases are represented by `LLVMPointerType` internally.

#### Array Types

Array types represent sequences of elements in memory. Array elements can be
addressed with a value unknown at compile time. Each array type is 1D, but
arrays can be nested to model multiple dimensions.

Array types are parameterized by the fixed size and the element type.
Syntactically, their representation is the following:

```
  llvm-array-type ::= `!llvm.array<` integer-literal `x` type `>`
```

and they are internally represented as `LLVMArrayType`.

#### Function Types

Function types represent the type of a function, i.e. its signature.

Function types are parameterized by the result type, the list of argument types
and by an optional "variadic" flag. Unlike built-in `FunctionType`, LLVM dialect
functions (`LLVMFunctionType`) always have a single result, which may be
`!llvm.void` if the function does not return anything. The syntax is as follows:

```
  llvm-func-type ::= `!llvm.func<` type `(` type-list (`,` `...`)? `)` `>`
```

For example,

```mlir
!llvm.func<void ()>           // a function with no arguments;
!llvm.func<i32 (f32, i32)>    // a function with two arguments and a result;
!llvm.func<void (i32, ...)>   // a variadic function with at least one argument.
```

In the LLVM dialect, functions are not first-class objects and one cannot have a
value of function type. Instead, one can take the address of a function and
operate on pointers to functions.
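
The last point can be sketched as follows: the address of a function is taken
with `llvm.mlir.addressof` and then called indirectly through the resulting
pointer. The symbol names `@callee` and `@caller` are hypothetical, and the
indirect call syntax assumes a recent MLIR version with opaque pointers.

```mlir
// External declaration of the function whose address is taken.
llvm.func @callee(i32) -> i32

llvm.func @caller(%arg0: i32) -> i32 {
  // The result is a pointer value, not a value of function type.
  %fptr = llvm.mlir.addressof @callee : !llvm.ptr
  // Indirect call: the pointer is the first operand and the function
  // signature is spelled out after it.
  %res = llvm.call %fptr(%arg0) : !llvm.ptr, (i32) -> i32
  llvm.return %res : i32
}
```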

### Vector Types

Vector types represent sequences of elements, typically when multiple data
elements are processed by a single instruction (SIMD). Vectors are thought of as
stored in registers and therefore vector elements can only be addressed through
constant indices.

Vector types are parameterized by the size, which may be either _fixed_ or a
multiple of some fixed size in case of _scalable_ vectors, and the element type.
Vectors cannot be nested and only 1D vectors are supported. Scalable vectors are
still considered 1D.

LLVM dialect uses built-in vector types for _fixed_-size vectors of built-in
types, and provides additional types for fixed-sized vectors of LLVM dialect
types (`LLVMFixedVectorType`) and scalable vectors of any types
(`LLVMScalableVectorType`). These two additional types share the following
syntax:

```
  llvm-vec-type ::= `!llvm.vec<` (`?` `x`)? integer-literal `x` type `>`
```

Note that the sets of element types supported by built-in and LLVM dialect
vector types are mutually exclusive, e.g., the built-in vector type does not
accept `!llvm.ptr` and the LLVM dialect fixed-width vector type does not
accept `i32`.

The following functions are provided to operate on any kind of the vector types
compatible with the LLVM dialect:

-   `bool LLVM::isCompatibleVectorType(Type)` - checks whether a type is a
    vector type compatible with the LLVM dialect;
-   `Type LLVM::getVectorElementType(Type)` - returns the element type of any
    vector type compatible with the LLVM dialect;
-   `llvm::ElementCount LLVM::getVectorNumElements(Type)` - returns the number
    of elements in any vector type compatible with the LLVM dialect;
-   `Type LLVM::getFixedVectorType(Type, unsigned)` - gets a fixed vector type
    with the given element type and size; the resulting type is either a
    built-in or an LLVM dialect vector type depending on which one supports the
    given element type.

#### Examples of Compatible Vector Types

```mlir
vector<42 x i32>                   // Vector of 42 32-bit integers.
!llvm.vec<42 x ptr>                // Vector of 42 pointers.
!llvm.vec<? x 4 x i32>             // Scalable vector of 32-bit integers with
                                   // size divisible by 4.
!llvm.array<2 x vector<2 x i32>>   // Array of 2 vectors of 2 32-bit integers.
!llvm.array<2 x vec<2 x ptr>>      // Array of 2 vectors of 2 pointers.
```

### Structure Types

The structure type is used to represent a collection of data members together in
memory. The elements of a structure may be any type that has a size.

Structure types are represented in a single dedicated class
`mlir::LLVM::LLVMStructType`. Internally, the struct type stores a (potentially
empty) name, a (potentially empty) list of contained types and a bitmask
indicating whether the struct is named, opaque, packed or uninitialized.
Structure types that don't have a name are referred to as _literal_ structs.
Such structures are uniquely identified by their contents. _Identified_ structs
on the other hand are uniquely identified by the name.

#### Identified Structure Types

Identified structure types are uniqued using their name in a given context.
Attempting to construct an identified structure with the same name as a
structure that already exists in the context *will result in the existing
structure being returned*. **MLIR does not auto-rename identified structs in
case of name conflicts** because there is no naming scope equivalent to a module
in LLVM IR since MLIR modules can be arbitrarily nested.

Programmatically, identified structures can be constructed in an _uninitialized_
state. In this case, they are given a name but the body must be set up by a
later call, using MLIR's type mutation mechanism. Such uninitialized types can
be used in type construction, but must be eventually initialized for IR to be
valid. This mechanism allows for constructing _recursive_ or mutually referring
structure types: an uninitialized type can be used in its own initialization.

Once the type is initialized, its body cannot be changed anymore. Any further
attempts to modify the body will fail and return failure to the caller _unless
the type is initialized with the exact same body_. Type initialization is
thread-safe; however, if a concurrent thread initializes the type before the
current thread, the initialization may return failure.

The syntax for identified structure types is as follows.

```
llvm-ident-struct-type ::= `!llvm.struct<` string-literal, `opaque` `>`
                         | `!llvm.struct<` string-literal, `packed`?
                            `(` type-or-ref-list `)` `>`
type-or-ref-list ::= <maybe empty comma-separated list of type-or-ref>
type-or-ref ::= <any compatible type with optional !llvm.>
              | `!llvm.`? `struct<` string-literal `>`
```

#### Literal Structure Types

Literal structures are uniqued according to the list of elements they contain,
and can optionally be packed. The syntax for such structs is as follows.

```
llvm-literal-struct-type ::= `!llvm.struct<` `packed`? `(` type-list `)` `>`
type-list ::= <maybe empty comma-separated list of types with optional !llvm.>
```

Literal structs cannot be recursive, but can contain other structs. Therefore,
they must be constructed in a single step with the entire list of contained
elements provided.

#### Examples of Structure Types

```mlir
!llvm.struct<>                      // NOT allowed
!llvm.struct<()>                    // empty, literal
!llvm.struct<(i32)>                 // literal
!llvm.struct<(struct<(i32)>)>       // struct containing a struct
!llvm.struct<packed (i8, i32)>      // packed struct
!llvm.struct<"a">                   // recursive reference, only allowed within
                                    // another struct, NOT allowed at top level
!llvm.struct<"a", ()>               // empty, named (necessary to differentiate from
                                    // recursive reference)
!llvm.struct<"a", opaque>           // opaque, named
!llvm.struct<"a", (i32, ptr)>       // named
!llvm.struct<"a", packed (i8, i32)> // named, packed
```

### Unsupported Types

LLVM IR `label` type does not have a counterpart in the LLVM dialect since, in
MLIR, blocks are not values and don't need a type.

## Operations

All operations in the LLVM IR dialect have a custom form in MLIR. The mnemonic
of an operation is that used in LLVM IR prefixed with "`llvm.`".

[include "Dialects/LLVMOps.md"]

## Operations for LLVM IR Intrinsics

The MLIR operation system is open, making it unnecessary to introduce a hard
boundary between "core" operations and "intrinsics". General LLVM IR intrinsics
are modeled as first-class operations in the LLVM dialect. Target-specific LLVM
IR intrinsics, e.g., NVVM or ROCDL, are modeled as separate dialects.
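
As a hedged illustration of a general intrinsic modeled as a first-class
operation, the sketch below uses `llvm.intr.fabs`, assuming the `llvm.intr.`
naming and custom syntax found in recent MLIR versions. The function name
`@abs` is hypothetical.

```mlir
llvm.func @abs(%x: f32) -> f32 {
  // Corresponds to a call to the `llvm.fabs.f32` intrinsic in LLVM IR, but
  // is written like any other operation of the dialect.
  %r = llvm.intr.fabs(%x) : (f32) -> f32
  llvm.return %r : f32
}
```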

[include "Dialects/LLVMIntrinsicOps.md"]

### Debug Info

Debug information within the LLVM dialect is represented using locations in
combination with a set of attributes that mirror the `DINode` structure defined
by the debug info metadata within LLVM IR. Debug scoping information is attached
to LLVM IR dialect operations using a fused location (`FusedLoc`) whose metadata
holds the `DIScopeAttr` representing the debug scope. Similarly, the subprogram
of LLVM IR dialect `FuncOp` operations is attached using a fused location whose
metadata is a `DISubprogramAttr`.