# MLIR Language Reference

MLIR (Multi-Level IR) is a compiler intermediate representation with
similarities to traditional three-address SSA representations (like
[LLVM IR](http://llvm.org/docs/LangRef.html) or
[SIL](https://github.com/apple/swift/blob/main/docs/SIL.rst)), but which
introduces notions from polyhedral loop optimization as first-class concepts.
This hybrid design is optimized to represent, analyze, and transform high level
dataflow graphs as well as target-specific code generated for high performance
data parallel systems. Beyond its representational capabilities, its single
continuous design provides a framework to lower from dataflow graphs to
high-performance target-specific code.

This document defines and describes the key concepts in MLIR, and is intended to
be a dry reference document - the
[rationale documentation](Rationale/Rationale.md),
[glossary](../getting_started/Glossary.md), and other content are hosted
elsewhere.

MLIR is designed to be used in three different forms: a human-readable textual
form suitable for debugging, an in-memory form suitable for programmatic
transformations and analysis, and a compact serialized form suitable for storage
and transport. The different forms all describe the same semantic content. This
document describes the human-readable textual form.

[TOC]

## High-Level Structure

MLIR is fundamentally based on a graph-like data structure of nodes, called
*Operations*, and edges, called *Values*. Each Value is the result of exactly
one Operation or Block Argument, and has a *Value Type* defined by the
[type system](#type-system). [Operations](#operations) are contained in
[Blocks](#blocks) and Blocks are contained in [Regions](#regions).
Operations are also ordered within their containing block and Blocks are
ordered in their containing region, although this order may or may not be
semantically meaningful in a given
[kind of region](Interfaces.md/#regionkindinterfaces). Operations
may also contain regions, enabling hierarchical structures to be represented.

Operations can represent many different concepts, from higher-level concepts
like function definitions, function calls, buffer allocations, view or slices of
buffers, and process creation, to lower-level concepts like target-independent
arithmetic, target-specific instructions, configuration registers, and logic
gates. These different concepts are represented by different operations in MLIR
and the set of operations usable in MLIR can be arbitrarily extended.

MLIR also provides an extensible framework for transformations on operations,
using familiar concepts of compiler [Passes](Passes.md). Enabling an arbitrary
set of passes on an arbitrary set of operations results in a significant scaling
challenge, since each transformation must potentially take into account the
semantics of any operation. MLIR addresses this complexity by allowing operation
semantics to be described abstractly using [Traits](Traits) and
[Interfaces](Interfaces.md), enabling transformations to operate on operations
more generically. Traits often describe verification constraints on valid IR,
enabling complex invariants to be captured and checked. (See
[Op vs Operation](Tutorials/Toy/Ch-2.md/#op-vs-operation-using-mlir-operations).)

One obvious application of MLIR is to represent an
[SSA-based](https://en.wikipedia.org/wiki/Static_single_assignment_form) IR,
like the LLVM core IR, with appropriate choice of operation types to define
Modules, Functions, Branches, Memory Allocation, and verification constraints to
ensure the SSA Dominance property.
MLIR includes a collection of dialects which
define just such structures. However, MLIR is intended to be general enough to
represent other compiler-like data structures, such as Abstract Syntax Trees in
a language frontend, generated instructions in a target-specific backend, or
circuits in a High-Level Synthesis tool.

Here's an example of an MLIR module:

```mlir
// Compute A*B using an implementation of multiply kernel and print the
// result using a TensorFlow op. The dimensions of A and B are partially
// known. The shapes are assumed to match.
func.func @mul(%A: tensor<100x?xf32>, %B: tensor<?x50xf32>) -> (tensor<100x50xf32>) {
  // Compute the inner dimension of %A using the dim operation.
  %c1 = arith.constant 1 : index
  %n = tensor.dim %A, %c1 : tensor<100x?xf32>

  // Allocate addressable "buffers" and copy tensors %A and %B into them.
  %A_m = memref.alloc(%n) : memref<100x?xf32>
  bufferization.materialize_in_destination %A in writable %A_m
      : (tensor<100x?xf32>, memref<100x?xf32>) -> ()

  %B_m = memref.alloc(%n) : memref<?x50xf32>
  bufferization.materialize_in_destination %B in writable %B_m
      : (tensor<?x50xf32>, memref<?x50xf32>) -> ()

  // Call function @multiply passing memrefs as arguments,
  // and getting returned the result of the multiplication.
  %C_m = call @multiply(%A_m, %B_m)
          : (memref<100x?xf32>, memref<?x50xf32>) -> (memref<100x50xf32>)

  memref.dealloc %A_m : memref<100x?xf32>
  memref.dealloc %B_m : memref<?x50xf32>

  // Load the buffer data into a higher level "tensor" value. (The assembly
  // format of this op has varied across MLIR versions.)
  %C = bufferization.to_tensor %C_m : memref<100x50xf32> to tensor<100x50xf32>
  memref.dealloc %C_m : memref<100x50xf32>

  // Call TensorFlow built-in function to print the result tensor.
  "tf.Print"(%C) {message = "mul result"} : (tensor<100x50xf32>) -> (tensor<100x50xf32>)

  return %C : tensor<100x50xf32>
}

// A function that multiplies two memrefs and returns the result.
func.func @multiply(%A: memref<100x?xf32>, %B: memref<?x50xf32>)
          -> (memref<100x50xf32>) {
  // Compute the inner dimension of %A.
  %c1 = arith.constant 1 : index
  %n = memref.dim %A, %c1 : memref<100x?xf32>

  // Allocate memory for the multiplication result.
  %C = memref.alloc() : memref<100x50xf32>

  // Multiplication loop nest.
  %zero = arith.constant 0.0 : f32
  affine.for %i = 0 to 100 {
     affine.for %j = 0 to 50 {
        memref.store %zero, %C[%i, %j] : memref<100x50xf32>
        affine.for %k = 0 to %n {
           %a_v  = memref.load %A[%i, %k] : memref<100x?xf32>
           %b_v  = memref.load %B[%k, %j] : memref<?x50xf32>
           %prod = arith.mulf %a_v, %b_v : f32
           %c_v  = memref.load %C[%i, %j] : memref<100x50xf32>
           %sum  = arith.addf %c_v, %prod : f32
           memref.store %sum, %C[%i, %j] : memref<100x50xf32>
        }
     }
  }
  return %C : memref<100x50xf32>
}
```

## Notation

MLIR has a simple and unambiguous grammar, allowing it to reliably round-trip
through a textual form. This is important for development of the compiler - e.g.
for understanding the state of code as it is being transformed and writing test
cases.

This document describes the grammar using
[Extended Backus-Naur Form (EBNF)](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form).

This is the EBNF grammar used in this document, presented in yellow boxes.

```
alternation ::= expr0 | expr1 | expr2  // Either expr0 or expr1 or expr2.
sequence    ::= expr0 expr1 expr2      // Sequence of expr0 expr1 expr2.
repetition0 ::= expr*  // 0 or more occurrences.
repetition1 ::= expr+  // 1 or more occurrences.
optionality ::= expr?  // 0 or 1 occurrence.
grouping    ::= (expr) // Everything inside parens is grouped together.
literal     ::= `abcd` // Matches the literal `abcd`.
```

Code examples are presented in blue boxes.

```
// This is an example use of the grammar above:
// This matches things like: ba, bana, boma, banana, banoma, bomana...
example ::= `b` (`an` | `om`)* `a`
```

### Common syntax

The following core grammar productions are used in this document:

```
// TODO: Clarify the split between lexing (tokens) and parsing (grammar).
digit     ::= [0-9]
hex_digit ::= [0-9a-fA-F]
letter    ::= [a-zA-Z]
id-punct  ::= [$._-]

integer-literal ::= decimal-literal | hexadecimal-literal
decimal-literal ::= digit+
hexadecimal-literal ::= `0x` hex_digit+
float-literal ::= [-+]?[0-9]+[.][0-9]*([eE][-+]?[0-9]+)?
string-literal  ::= `"` [^"\n\f\v\r]* `"`   // TODO: define escaping rules
```

Not listed here, but MLIR does support comments. They use standard BCPL syntax,
starting with a `//` and going until the end of the line.

### Top level Productions

```
// Top level production
toplevel ::= (operation | attribute-alias-def | type-alias-def)*
```

The production `toplevel` is the top level production that is parsed by any
parser consuming the MLIR syntax. [Operations](#operations),
[Attribute aliases](#attribute-value-aliases), and [Type aliases](#type-aliases)
can be declared on the toplevel.

### Identifiers and keywords

Syntax:

```
// Identifiers
bare-id ::= (letter|[_]) (letter|digit|[_$.])*
bare-id-list ::= bare-id (`,` bare-id)*
value-id ::= `%` suffix-id
alias-name ::= bare-id
suffix-id ::= (digit+ | ((letter|id-punct) (letter|id-punct|digit)*))

symbol-ref-id ::= `@` (suffix-id | string-literal) (`::` symbol-ref-id)?
value-id-list ::= value-id (`,` value-id)*

// Uses of value, e.g. in an operand list to an operation.
value-use ::= value-id (`#` decimal-literal)?
value-use-list ::= value-use (`,` value-use)*
```

Identifiers name entities such as values, types and functions, and are chosen by
the writer of MLIR code. Identifiers may be descriptive (e.g.
`%batch_size`,
`@matmul`), or may be non-descriptive when they are auto-generated (e.g. `%23`,
`@func42`). Identifier names for values may be used in an MLIR text file but are
not persisted as part of the IR - the printer will give them anonymous names
like `%42`.

MLIR guarantees identifiers never collide with keywords by prefixing identifiers
with a sigil (e.g. `%`, `#`, `@`, `^`, `!`). In certain unambiguous contexts
(e.g. affine expressions), identifiers are not prefixed, for brevity. New
keywords may be added to future versions of MLIR without danger of collision
with existing identifiers.

Value identifiers are only [in scope](#value-scoping) for the (nested) region in
which they are defined and cannot be accessed or referenced outside of that
region. Argument identifiers in mapping functions are in scope for the mapping
body. Particular operations may further limit which identifiers are in scope in
their regions. For instance, the scope of values in a region with
[SSA control flow semantics](#control-flow-and-ssacfg-regions) is constrained
according to the standard definition of
[SSA dominance](https://en.wikipedia.org/wiki/Dominator_\(graph_theory\)).
Another example is the [IsolatedFromAbove trait](Traits/#isolatedfromabove),
which restricts directly accessing values defined in containing regions.

Function identifiers and mapping identifiers are associated with
[Symbols](SymbolsAndSymbolTables.md) and have scoping rules dependent on symbol
attributes.

## Dialects

Dialects are the mechanism by which to engage with and extend the MLIR
ecosystem. They allow for defining new [operations](#operations), as well as
[attributes](#attributes) and [types](#type-system). Each dialect is given a
unique `namespace` that is prefixed to each defined attribute/operation/type.
For example, the [Affine dialect](Dialects/Affine.md) defines the namespace:
`affine`.
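
As a rough illustration of how the namespace prefix shows up in the textual
form, the sketch below uses a hypothetical dialect named `mydialect`. Its
operation, type, and attribute names are invented for this example and are not
part of MLIR, so a parser would presumably need to allow unregistered dialects
to accept the first operation.

```mlir
// `mydialect` is a hypothetical, unregistered dialect used only for
// illustration. The operation name, the attribute value, and the result type
// all carry the dialect namespace as a prefix.
%0 = "mydialect.make_widget"() {size = #mydialect.size<4>}
    : () -> !mydialect.widget

// An operation from the in-tree Affine dialect, whose namespace is `affine`.
affine.for %i = 0 to 10 {
}
```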

MLIR allows for multiple dialects, even those outside of the main tree, to
co-exist together within one module. Dialects are produced and consumed by
certain passes. MLIR provides a [framework](DialectConversion.md) to convert
between, and within, different dialects.

A few of the dialects supported by MLIR:

*   [Affine dialect](Dialects/Affine.md)
*   [Func dialect](Dialects/Func.md)
*   [GPU dialect](Dialects/GPU.md)
*   [LLVM dialect](Dialects/LLVM.md)
*   [SPIR-V dialect](Dialects/SPIR-V.md)
*   [Vector dialect](Dialects/Vector.md)

### Target specific operations

Dialects provide a modular way in which targets can expose target-specific
operations directly through to MLIR. As an example, some targets go through
LLVM. LLVM has a rich set of intrinsics for certain target-independent
operations (e.g. addition with overflow check) as well as providing access to
target-specific operations for the targets it supports (e.g. vector permutation
operations). LLVM intrinsics in MLIR are represented via operations that start
with an "llvm." name.

Example:

```mlir
// LLVM: %x = call {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
%x:2 = "llvm.sadd.with.overflow.i16"(%a, %b) : (i16, i16) -> (i16, i1)
```

These operations only work when targeting LLVM as a backend (e.g. for CPUs and
GPUs), and are required to align with the LLVM definition of these intrinsics.

## Operations

Syntax:

```
operation             ::= op-result-list? (generic-operation | custom-operation)
                          trailing-location?
generic-operation     ::= string-literal `(` value-use-list? `)` successor-list?
                          dictionary-properties? region-list? dictionary-attribute?
                          `:` function-type
custom-operation      ::= bare-id custom-operation-format
op-result-list        ::= op-result (`,` op-result)* `=`
op-result             ::= value-id (`:` integer-literal)?
successor-list        ::= `[` successor (`,` successor)* `]`
successor             ::= caret-id (`:` block-arg-list)?
dictionary-properties ::= `<` dictionary-attribute `>`
region-list           ::= `(` region (`,` region)* `)`
dictionary-attribute  ::= `{` (attribute-entry (`,` attribute-entry)*)? `}`
trailing-location     ::= `loc` `(` location `)`
```

MLIR introduces a uniform concept called *operations* to enable describing many
different levels of abstractions and computations. Operations in MLIR are fully
extensible (there is no fixed list of operations) and have application-specific
semantics. For example, MLIR supports
[target-independent operations](Dialects/MemRef.md),
[affine operations](Dialects/Affine.md), and
[target-specific machine operations](#target-specific-operations).

The internal representation of an operation is simple: an operation is
identified by a unique string (e.g. `dim`, `tf.Conv2d`, `x86.repmovsb`,
`ppc.eieio`, etc), can return zero or more results, take zero or more operands,
has storage for [properties](#properties), has a dictionary of
[attributes](#attributes), has zero or more successors, and zero or more
enclosed [regions](#regions). The generic printing form includes all these
elements literally, with a function type to indicate the types of the
results and operands.

Example:

```mlir
// An operation that produces two results.
// The results of %result can be accessed via the <name> `#` <opNo> syntax.
%result:2 = "foo_div"() : () -> (f32, i32)

// Pretty form that defines a unique name for each result.
%foo, %bar = "foo_div"() : () -> (f32, i32)

// Invoke a TensorFlow function called tf.scramble with two inputs
// and an attribute "fruit" stored in properties.
%2 = "tf.scramble"(%result#0, %bar) <{fruit = "banana"}> : (f32, i32) -> f32

// Invoke an operation with some discardable attributes
%foo, %bar = "foo_div"() {some_attr = "value", other_attr = 42 : i64} : () -> (f32, i32)
```

In addition to the basic syntax above, dialects may register known operations.
This allows those dialects to support *custom assembly form* for parsing and
printing operations. In the operation sets listed below, we show both forms.

### Builtin Operations

The [builtin dialect](Dialects/Builtin.md) defines a select few operations that
are widely applicable by MLIR dialects, such as a universal conversion cast
operation that simplifies inter/intra dialect conversion. This dialect also
defines a top-level `module` operation, that represents a useful IR container.

## Blocks

Syntax:

```
block           ::= block-label operation+
block-label     ::= block-id block-arg-list? `:`
block-id        ::= caret-id
caret-id        ::= `^` suffix-id
value-id-and-type ::= value-id `:` type

// Non-empty list of names and types.
value-id-and-type-list ::= value-id-and-type (`,` value-id-and-type)*

block-arg-list ::= `(` value-id-and-type-list? `)`
```

A *Block* is a list of operations. In
[SSACFG regions](#control-flow-and-ssacfg-regions), each block represents a
compiler [basic block](https://en.wikipedia.org/wiki/Basic_block) where
instructions inside the block are executed in order and terminator operations
implement control flow branches between basic blocks.

The last operation in a block must be a
[terminator operation](#control-flow-and-ssacfg-regions). A region with a single
block may opt out of this requirement by attaching the `NoTerminator` trait to
the enclosing op. The top-level `ModuleOp` is an example of such an operation
which defines this trait and whose block body does not have a terminator.
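
A minimal sketch of the difference, using an illustrative function name
(`@noop`): the block of a `module` may end with any operation, whereas the block
of a function body must end with a terminator.

```mlir
// The single block of a `module` ends with an ordinary operation (here a
// `func.func`); no terminator is required.
module {
  func.func @noop() {
    // The block of a `func.func` body, by contrast, must end with a
    // terminator such as `return`.
    return
  }
}
```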

Blocks in MLIR take a list of block arguments, notated in a function-like way.
Block arguments are bound to values specified by the semantics of individual
operations. Block arguments of the entry block of a region are also arguments to
the region and the values bound to these arguments are determined by the
semantics of the containing operation. Block arguments of other blocks are
determined by the semantics of terminator operations, e.g. Branches, which have
the block as a successor. In regions with
[control flow](#control-flow-and-ssacfg-regions), MLIR leverages this structure
to implicitly represent the passage of control-flow dependent values without the
complex nuances of PHI nodes in traditional SSA representations. Note that
values which are not control-flow dependent can be referenced directly and do
not need to be passed through block arguments.

Here is a simple example function showing branches, returns, and block
arguments:

```mlir
func.func @simple(i64, i1) -> i64 {
^bb0(%a: i64, %cond: i1): // Code dominated by ^bb0 may refer to %a
  cf.cond_br %cond, ^bb1, ^bb2

^bb1:
  cf.br ^bb3(%a: i64)    // Branch passes %a as the argument

^bb2:
  %b = arith.addi %a, %a : i64
  cf.br ^bb3(%b: i64)    // Branch passes %b as the argument

// ^bb3 receives an argument, named %c, from predecessors
// and passes it on to bb4 along with %a. %a is referenced
// directly from its defining operation and is not passed through
// an argument of ^bb3.
^bb3(%c: i64):
  cf.br ^bb4(%c, %a : i64, i64)

^bb4(%d : i64, %e : i64):
  %0 = arith.addi %d, %e : i64
  return %0 : i64   // Return is also a terminator.
}
```

**Context:** The "block argument" representation eliminates a number of special
cases from the IR compared to traditional "PHI nodes are operations" SSA IRs
(like LLVM).
For example, the
[parallel copy semantics](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.524.5461&rep=rep1&type=pdf)
of SSA is immediately apparent, and function arguments are no longer a special
case: they become arguments to the entry block
[[more rationale](Rationale/Rationale.md/#block-arguments-vs-phi-nodes)]. Blocks
are also a fundamental concept that cannot be represented by operations because
values defined in an operation cannot be accessed outside the operation.

## Regions

### Definition

A region is an ordered list of MLIR [Blocks](#blocks). The semantics within a
region is not imposed by the IR. Instead, the containing operation defines the
semantics of the regions it contains. MLIR currently defines two kinds of
regions: [SSACFG regions](#control-flow-and-ssacfg-regions), which describe
control flow between blocks, and [Graph regions](#graph-regions), which do not
require control flow between blocks. The kinds of regions within an operation
are described using the
[RegionKindInterface](Interfaces.md/#regionkindinterfaces).

Regions do not have a name or an address, only the blocks contained in a region
do. Regions must be contained within operations and have no type or attributes.
The first block in the region is a special block called the 'entry block'. The
arguments to the entry block are also the arguments of the region itself. The
entry block cannot be listed as a successor of any other block. The syntax for a
region is as follows:

```
region      ::= `{` entry-block? block* `}`
entry-block ::= operation+
```

A function body is an example of a region: it consists of a CFG of blocks and
has additional semantic restrictions that other types of regions may not have.
For example, in a function body, block terminators must either branch to a
different block, or return from a function where the types of the `return`
arguments must match the result types of the function signature. Similarly, the
function arguments must match the types and count of the region arguments. In
general, operations with regions can define these correspondences arbitrarily.

An *entry block* is a block with no label and no arguments that may occur at
the beginning of a region. It enables a common pattern of using a region to
open a new scope.

### Value Scoping

Regions provide hierarchical encapsulation of programs: it is impossible to
reference, i.e. branch to, a block which is not in the same region as the source
of the reference, i.e. a terminator operation. Similarly, regions provide a
natural scoping for value visibility: values defined in a region don't escape to
the enclosing region, if any. By default, operations inside a region can
reference values defined outside of the region whenever it would have been legal
for operands of the enclosing operation to reference those values, but this can
be restricted using traits, such as
[OpTrait::IsolatedFromAbove](Traits/#isolatedfromabove), or a custom
verifier.

Example:

```mlir
  "any_op"(%a) ({ // if %a is in-scope in the containing region...
     // then %a is in-scope here too.
    %new_value = "another_op"(%a) : (i64) -> (i64)
  }) : (i64) -> (i64)
```

MLIR defines a generalized 'hierarchical dominance' concept that operates across
hierarchy and defines whether a value is 'in scope' and can be used by a
particular operation. Whether a value can be used by another operation in the
same region is defined by the kind of region. A value defined in a region can be
used by an operation which has a parent in the same region, if and only if the
parent could use the value.
A value defined by an argument to a region can
always be used by any operation deeply contained in the region. A value defined
in a region can never be used outside of the region.

### Control Flow and SSACFG Regions

In MLIR, control flow semantics of a region is indicated by
[RegionKind::SSACFG](Interfaces.md/#regionkindinterfaces). Informally, these
regions support semantics where operations in a region 'execute sequentially'.
Before an operation executes, its operands have well-defined values. After an
operation executes, the operands have the same values and results also have
well-defined values. After an operation executes, the next operation in the
block executes until the operation is the terminator operation at the end of a
block, in which case some other operation will execute. The determination of the
next instruction to execute is the 'passing of control flow'.

In general, when control flow is passed to an operation, MLIR does not restrict
when control flow enters or exits the regions contained in that operation.
However, when control flow enters a region, it always begins in the first block
of the region, called the *entry* block. Terminator operations ending each block
represent control flow by explicitly specifying the successor blocks of the
block. Control flow can only pass to one of the specified successor blocks as in
a `branch` operation, or back to the containing operation as in a `return`
operation. Terminator operations without successors can only pass control back
to the containing operation. Within these restrictions, the particular semantics
of terminator operations is determined by the specific dialect operations
involved. Blocks (other than the entry block) that are not listed as a successor
of a terminator operation are defined to be unreachable and can be removed
without affecting the semantics of the containing operation.
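
The last rule can be sketched with a small function. The names
`@unreachable_example`, `^dead`, and `^exit` are chosen for this illustration
only: no terminator lists `^dead` as a successor, so under the rule above it is
unreachable and could be removed without changing what the function computes.

```mlir
func.func @unreachable_example(%x: i64) -> i64 {
  // The entry block has a single successor: ^exit.
  cf.br ^exit(%x : i64)

^dead:  // Not a successor of any terminator, hence unreachable.
  %y = arith.addi %x, %x : i64
  cf.br ^exit(%y : i64)

^exit(%r: i64):
  // A terminator without successors: control returns to the containing op.
  return %r : i64
}
```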

Although control flow always enters a region through the entry block, control
flow may exit a region through any block with an appropriate terminator. The
`func` dialect leverages this capability to define operations with
Single-Entry-Multiple-Exit (SEME) regions, possibly flowing through different
blocks in the region and exiting through any block with a `return` operation.
This behavior is similar to that of a function body in most programming
languages. In addition, control flow may also not reach the end of a block or
region, for example if a function call does not return.

Example:

```mlir
func.func @accelerator_compute(i64, i1) -> i64 { // An SSACFG region
^bb0(%a: i64, %cond: i1): // Code dominated by ^bb0 may refer to %a
  cf.cond_br %cond, ^bb1, ^bb2

^bb1:
  // This def for %value does not dominate ^bb2
  %value = "op.convert"(%a) : (i64) -> i64
  cf.br ^bb3(%a: i64)    // Branch passes %a as the argument

^bb2:
  accelerator.launch() { // An SSACFG region
    ^bb0:
      // Region of code nested under "accelerator.launch", it can reference %a but
      // not %value.
      %new_value = "accelerator.do_something"(%a) : (i64) -> (i64)
  }
  // %new_value cannot be referenced outside of the region

^bb3:
  ...
}
```

#### Operations with Multiple Regions

An operation containing multiple regions also completely determines the
semantics of those regions. In particular, when control flow is passed to an
operation, it may transfer control flow to any contained region. When control
flow exits a region and is returned to the containing operation, the containing
operation may pass control flow to any region in the same operation. An
operation may also pass control flow to multiple contained regions concurrently.
An operation may also pass control flow into regions that were specified in
other operations, in particular those that defined the values or symbols the
given operation uses as in a call operation. This passage of control is
generally independent of passage of control flow through the basic blocks of the
containing region.

#### Closure

Regions allow defining an operation that creates a closure, for example by
"boxing" the body of the region into a value they produce. It remains up to the
operation to define its semantics. Note that if an operation triggers
asynchronous execution of the region, it is the responsibility of the
operation's caller to wait for the region to be executed, guaranteeing that any
directly used values remain live.

### Graph Regions

In MLIR, graph-like semantics in a region is indicated by
[RegionKind::Graph](Interfaces.md/#regionkindinterfaces). Graph regions are
appropriate for concurrent semantics without control flow, or for modeling
generic directed graph data structures. They are also appropriate for
representing cyclic relationships between coupled values where there is no
fundamental order to the relationships. For instance, operations in a graph
region may represent independent threads of control with values representing
streams of data. As usual in MLIR, the particular semantics of a region is
completely determined by its containing operation. Graph regions may only
contain a single basic block (the entry block).

**Rationale:** Currently graph regions are arbitrarily limited to a single basic
block, although there is no particular semantic reason for this limitation. This
limitation has been added to make it easier to stabilize the pass infrastructure
and commonly used passes for processing graph regions to properly handle
feedback loops. Multi-block regions may be allowed in the future if use cases
that require it arise.

In graph regions, MLIR operations naturally represent nodes, while each MLIR
value represents a multi-edge connecting a single source node and multiple
destination nodes. All values defined in the region as results of operations are
in scope within the region and can be accessed by any other operation in the
region. In graph regions, the order of operations within a block and the order
of blocks in a region are not semantically meaningful and non-terminator
operations may be freely reordered, for instance, by canonicalization. Other
kinds of graphs, such as graphs with multiple source nodes and multiple
destination nodes, can also be represented by representing graph edges as MLIR
operations.

Note that cycles can occur within a single block in a graph region, or between
basic blocks.

```mlir
"test.graph_region"() ({ // A Graph region
  %1 = "op1"(%1, %3) : (i32, i32) -> (i32)  // OK: %1, %3 allowed here
  %2 = "test.ssacfg_region"() ({
     %5 = "op2"(%1, %2, %3, %4) : (i32, i32, i32, i32) -> (i32) // OK: %1, %2, %3, %4 all defined in the containing region
  }) : () -> (i32)
  %3 = "op2"(%1, %4) : (i32, i32) -> (i32)  // OK: %4 allowed here
  %4 = "op3"(%1) : (i32) -> (i32)
}) : () -> ()
```

### Arguments and Results

The arguments of the first block of a region are treated as arguments of the
region. The source of these arguments is defined by the semantics of the parent
operation. They may correspond to some of the values the operation itself uses.

Regions produce a (possibly empty) list of values. The operation semantics
defines the relation between the region results and the operation results.

## Type System

Each value in MLIR has a type defined by the type system. MLIR has an open type
system (i.e. there is no fixed list of types), and types may have
application-specific semantics.
MLIR dialects may define any number of types
with no restrictions on the abstractions they represent.

```
type ::= type-alias | dialect-type | builtin-type

type-list-no-parens ::=  type (`,` type)*
type-list-parens ::= `(` `)`
                   | `(` type-list-no-parens `)`

// This is a common way to refer to a value with a specified type.
ssa-use-and-type ::= ssa-use `:` type
ssa-use ::= value-use

// Non-empty list of names and types.
ssa-use-and-type-list ::= ssa-use-and-type (`,` ssa-use-and-type)*

function-type ::= (type | type-list-parens) `->` (type | type-list-parens)
```

### Type Aliases

```
type-alias-def ::= `!` alias-name `=` type
type-alias ::= `!` alias-name
```

MLIR supports defining named aliases for types. A type alias is an identifier
that can be used in the place of the type that it defines. These aliases *must*
be defined before their uses. Alias names may not contain a '.', since those
names are reserved for [dialect types](#dialect-types).

Example:

```mlir
!avx_m128 = vector<4 x f32>

// Using the original type.
"foo"(%x) : vector<4 x f32> -> ()

// Using the type alias.
"foo"(%x) : !avx_m128 -> ()
```

### Dialect Types

Similarly to operations, dialects may define custom extensions to the type
system.

```
dialect-namespace ::= bare-id

dialect-type ::= `!` (opaque-dialect-type | pretty-dialect-type)
opaque-dialect-type ::= dialect-namespace dialect-type-body
pretty-dialect-type ::= dialect-namespace `.` pretty-dialect-type-lead-ident
                                              dialect-type-body?
pretty-dialect-type-lead-ident ::= `[A-Za-z][A-Za-z0-9._]*`

dialect-type-body ::= `<` dialect-type-contents+ `>`
dialect-type-contents ::= dialect-type-body
                            | `(` dialect-type-contents+ `)`
                            | `[` dialect-type-contents+ `]`
                            | `{` dialect-type-contents+ `}`
                            | [^\[<({\]>)}\0]+
```

Dialect types are generally specified in an opaque form, where the contents
of the type are defined within a body wrapped with the dialect namespace
and `<>`. Consider the following examples:

```mlir
// A TensorFlow string type.
!tf<string>

// A type with complex components.
!foo<something<abcd>>

// An even more complex type.
!foo<"a123^^^" + bar>
```

Dialect types that are simple enough may use a prettier format, which unwraps
part of the syntax into an equivalent, but lighter weight form:

```mlir
// A TensorFlow string type.
!tf.string

// A type with complex components.
!foo.something<abcd>
```

See [here](DefiningDialects/AttributesAndTypes.md) to learn how to define dialect types.

### Builtin Types

The [builtin dialect](Dialects/Builtin.md) defines a set of types that are
directly usable by any other dialect in MLIR. These types range from
primitive integer and floating-point types to function types and more.

## Properties

Properties are extra data members stored directly on an Operation class. They
provide a way to store [inherent attributes](#attributes) and other arbitrary
data. The semantics of the data is specific to a given operation, and may be
exposed through [Interfaces](Interfaces.md) accessors and other methods.
Properties can always be serialized to an Attribute in order to be printed
generically.
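
As a rough mental model only (this is not MLIR's C++ API), properties can be
pictured as a typed struct stored on the operation that can still be converted
to attribute-like name/value pairs for generic printing. In the Python sketch
below, `CmpProperties`, `Operation`, and `properties_as_attribute` are
hypothetical names invented for illustration.

```python
from dataclasses import asdict, dataclass


@dataclass
class CmpProperties:
    # Hypothetical inherent data of a comparison op, stored as a plain member
    # rather than as an entry in an attribute dictionary.
    predicate: str


@dataclass
class Operation:
    name: str
    properties: CmpProperties

    def properties_as_attribute(self) -> dict:
        # Serialize the properties struct to name/value pairs, standing in
        # for the conversion to an Attribute used by generic printing.
        return asdict(self.properties)


op = Operation("arith.cmpi", CmpProperties(predicate="slt"))
serialized = op.properties_as_attribute()
```

The point of the sketch is the direction of the conversion: the operation owns
typed data, and a dictionary view is derived from it on demand.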

## Attributes

Syntax:

```
attribute-entry ::= (bare-id | string-literal) `=` attribute-value
attribute-value ::= attribute-alias | dialect-attribute | builtin-attribute
```

Attributes are the mechanism for specifying constant data on operations in
places where a variable is never allowed - e.g. the comparison predicate of a
[`cmpi` operation](Dialects/ArithOps.md/#arithcmpi-arithcmpiop). Each operation has an
attribute dictionary, which associates a set of attribute names to attribute
values. MLIR's builtin dialect provides a rich set of
[builtin attribute values](#builtin-attribute-values) out of the box (such as
arrays, dictionaries, strings, etc.). Additionally, dialects can define their
own [dialect attribute values](#dialect-attribute-values).

For dialects which haven't adopted properties yet, the top-level attribute
dictionary attached to an operation has special semantics. The attribute
entries are considered to be of two different kinds based on whether their
dictionary key has a dialect prefix:

-   *inherent attributes* are inherent to the definition of an operation's
    semantics. The operation itself is expected to verify the consistency of
    these attributes. An example is the `predicate` attribute of the
    `arith.cmpi` op. These attributes must have names that do not start with a
    dialect prefix.

-   *discardable attributes* have semantics defined externally to the operation
    itself, but must be compatible with the operation's semantics. These
    attributes must have names that start with a dialect prefix. The dialect
    indicated by the dialect prefix is expected to verify these attributes. An
    example is the `gpu.container_module` attribute.

Note that attribute values are allowed to themselves be dictionary attributes,
but only the top-level dictionary attribute attached to the operation is subject
to the classification above.
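
The classification above can be sketched in a few lines of Python. This is an
illustration, not MLIR's implementation: the helper name `classify_attribute`
is hypothetical, and the sketch assumes that a dialect prefix is the non-empty
text preceding the first `.` in the attribute name.

```python
def classify_attribute(name: str) -> str:
    """Classify a top-level attribute name as inherent or discardable.

    Assumption for this sketch: a dialect prefix is a non-empty string
    followed by a '.', e.g. the 'gpu' in 'gpu.container_module'.
    """
    prefix, dot, _rest = name.partition(".")
    return "discardable" if dot and prefix else "inherent"


# Top-level attribute dictionary of a hypothetical operation.
attrs = {"predicate": "slt", "gpu.container_module": "unit"}
kinds = {name: classify_attribute(name) for name in attrs}
```

Only the keys of the top-level dictionary are inspected; keys of nested
dictionary attributes are not classified this way.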

When properties are adopted, only discardable attributes are stored in the
top-level dictionary, while inherent attributes are stored in the properties
storage.

### Attribute Value Aliases

```
attribute-alias-def ::= `#` alias-name `=` attribute-value
attribute-alias ::= `#` alias-name
```

MLIR supports defining named aliases for attribute values. An attribute alias is
an identifier that can be used in the place of the attribute that it defines.
These aliases *must* be defined before their uses. Alias names may not contain a
'.', since those names are reserved for
[dialect attributes](#dialect-attribute-values).

Example:

```mlir
#map = affine_map<(d0) -> (d0 + 10)>

// Using the original attribute.
%b = affine.apply affine_map<(d0) -> (d0 + 10)> (%a)

// Using the attribute alias.
%b = affine.apply #map(%a)
```

### Dialect Attribute Values

Similarly to operations, dialects may define custom attribute values.

```
dialect-namespace ::= bare-id

dialect-attribute ::= `#` (opaque-dialect-attribute | pretty-dialect-attribute)
opaque-dialect-attribute ::= dialect-namespace dialect-attribute-body
pretty-dialect-attribute ::= dialect-namespace `.` pretty-dialect-attribute-lead-ident
                                              dialect-attribute-body?
pretty-dialect-attribute-lead-ident ::= `[A-Za-z][A-Za-z0-9._]*`

dialect-attribute-body ::= `<` dialect-attribute-contents+ `>`
dialect-attribute-contents ::= dialect-attribute-body
                            | `(` dialect-attribute-contents+ `)`
                            | `[` dialect-attribute-contents+ `]`
                            | `{` dialect-attribute-contents+ `}`
                            | [^\[<({\]>)}\0]+
```

Dialect attributes are generally specified in an opaque form, where the contents
of the attribute are defined within a body wrapped with the dialect namespace
and `<>`. Consider the following examples:

```mlir
// A string attribute.
#foo<string<"">>

// A complex attribute.
#foo<"a123^^^" + bar>
```

Dialect attributes that are simple enough may use a prettier format, which unwraps
part of the syntax into an equivalent, but lighter weight form:

```mlir
// A string attribute.
#foo.string<"">
```

See [here](DefiningDialects/AttributesAndTypes.md) to learn how to define dialect attribute values.

### Builtin Attribute Values

The [builtin dialect](Dialects/Builtin.md) defines a set of attribute values
that are directly usable by any other dialect in MLIR. These attribute values
range from primitive integer and floating-point values to attribute
dictionaries, dense multi-dimensional arrays, and more.

### IR Versioning

A dialect can opt in to versioning through the `BytecodeDialectInterface`. A
few hooks are exposed to the dialect to manage a version encoded into the
bytecode file. The version is loaded lazily, which makes the version
information available while parsing the input IR and gives each dialect for
which a version is present an opportunity to perform IR upgrades post-parsing
through the `upgradeFromVersion` method. Custom Attribute and Type encodings
can also be upgraded according to the dialect version using the
`readAttribute` and `readType` methods.

There is no restriction on what kind of information a dialect is allowed to
encode to model its versioning. Currently, versioning is supported only for
bytecode formats.

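
The post-parsing upgrade flow can be pictured with a toy Python model. Nothing
here mirrors the real `BytecodeDialectInterface` signatures: the function
names, the use of plain integers as versions, and the `foo.plus` to `foo.add`
rename are all assumptions made for illustration.

```python
def upgrade_parsed_ir(ops, file_versions, upgrade_hooks):
    """Run each dialect's upgrade hook for versions found in the input.

    ops: list of op-name strings standing in for parsed IR (toy model).
    file_versions: dialect name -> version recorded in the input file.
    upgrade_hooks: dialect name -> callable(ops, version) returning new ops.
    """
    for dialect, version in file_versions.items():
        hook = upgrade_hooks.get(dialect)
        if hook is not None:
            ops = hook(ops, version)
    return ops


def upgrade_foo(ops, version):
    # Hypothetical rule: before version 2, 'foo.add' was spelled 'foo.plus'.
    if version < 2:
        return ["foo.add" if op == "foo.plus" else op for op in ops]
    return ops


upgraded = upgrade_parsed_ir(
    ["foo.plus", "bar.mul"], {"foo": 1}, {"foo": upgrade_foo}
)
```

A dialect whose version is absent from the input is never asked to upgrade,
which matches the "for which a version is present" condition above.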