# MLIR Language Reference

MLIR (Multi-Level IR) is a compiler intermediate representation with
similarities to traditional three-address SSA representations (like
[LLVM IR](http://llvm.org/docs/LangRef.html) or
[SIL](https://github.com/apple/swift/blob/main/docs/SIL.rst)), but which
introduces notions from polyhedral loop optimization as first-class concepts.
This hybrid design is optimized to represent, analyze, and transform high level
dataflow graphs as well as target-specific code generated for high performance
data parallel systems. Beyond its representational capabilities, its single
continuous design provides a framework to lower from dataflow graphs to
high-performance target-specific code.

This document defines and describes the key concepts in MLIR, and is intended to
be a dry reference document - the
[rationale documentation](Rationale/Rationale.md),
[glossary](../getting_started/Glossary.md), and other content are hosted
elsewhere.

MLIR is designed to be used in three different forms: a human-readable textual
form suitable for debugging, an in-memory form suitable for programmatic
transformations and analysis, and a compact serialized form suitable for storage
and transport. The different forms all describe the same semantic content. This
document describes the human-readable textual form.

[TOC]

## High-Level Structure

MLIR is fundamentally based on a graph-like data structure of nodes, called
*Operations*, and edges, called *Values*. Each Value is the result of exactly
one Operation or Block Argument, and has a *Value Type* defined by the
[type system](#type-system). [Operations](#operations) are contained in
[Blocks](#blocks) and Blocks are contained in [Regions](#regions). Operations
are also ordered within their containing block and Blocks are ordered in their
containing region, although this order may or may not be semantically meaningful
in a given [kind of region](Interfaces.md/#regionkindinterfaces). Operations
may also contain regions, enabling hierarchical structures to be represented.
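
As a rough sketch of this nesting, the following module is written in the
generic operation form. The `example.*` operation names are hypothetical and
unregistered, chosen only for illustration; a tool such as `mlir-opt` would
need `--allow-unregistered-dialect` to accept them.

```mlir
// The builtin module operation holds one region with one block.
"builtin.module"() ({
  // A hypothetical operation that itself contains a region.
  "example.outer"() ({
    // This block belongs to the region of "example.outer".
    // %0 is a Value: the edge from "example.inner" to "example.use".
    %0 = "example.inner"() : () -> i32
    "example.use"(%0) : (i32) -> ()
  }) : () -> ()
}) : () -> ()
```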

Operations can represent many different concepts, from higher-level concepts
like function definitions, function calls, buffer allocations, views or slices
of buffers, and process creation, to lower-level concepts like
target-independent arithmetic, target-specific instructions, configuration
registers, and logic gates. These different concepts are represented by
different operations in MLIR and the set of operations usable in MLIR can be
arbitrarily extended.

MLIR also provides an extensible framework for transformations on operations,
using familiar concepts of compiler [Passes](Passes.md). Enabling an arbitrary
set of passes on an arbitrary set of operations results in a significant scaling
challenge, since each transformation must potentially take into account the
semantics of any operation. MLIR addresses this complexity by allowing operation
semantics to be described abstractly using [Traits](Traits) and
[Interfaces](Interfaces.md), enabling transformations to operate on operations
more generically. Traits often describe verification constraints on valid IR,
enabling complex invariants to be captured and checked (see
[Op vs Operation](Tutorials/Toy/Ch-2.md/#op-vs-operation-using-mlir-operations)).

One obvious application of MLIR is to represent an
[SSA-based](https://en.wikipedia.org/wiki/Static_single_assignment_form) IR,
like the LLVM core IR, with appropriate choice of operation types to define
Modules, Functions, Branches, Memory Allocation, and verification constraints to
ensure the SSA Dominance property. MLIR includes a collection of dialects which
define just such structures. However, MLIR is intended to be general enough to
represent other compiler-like data structures, such as Abstract Syntax Trees in
a language frontend, generated instructions in a target-specific backend, or
circuits in a High-Level Synthesis tool.

Here's an example of an MLIR module:

```mlir
// Compute A*B using an implementation of multiply kernel and print the
// result using a TensorFlow op. The dimensions of A and B are partially
// known. The shapes are assumed to match.
func.func @mul(%A: tensor<100x?xf32>, %B: tensor<?x50xf32>) -> (tensor<100x50xf32>) {
  // Compute the inner dimension of %A using the dim operation.
  %c1 = arith.constant 1 : index
  %n = tensor.dim %A, %c1 : tensor<100x?xf32>

  // Allocate addressable "buffers" and copy tensors %A and %B into them.
  %A_m = memref.alloc(%n) : memref<100x?xf32>
  bufferization.materialize_in_destination %A in writable %A_m
      : (tensor<100x?xf32>, memref<100x?xf32>) -> ()

  %B_m = memref.alloc(%n) : memref<?x50xf32>
  bufferization.materialize_in_destination %B in writable %B_m
      : (tensor<?x50xf32>, memref<?x50xf32>) -> ()

  // Call function @multiply passing memrefs as arguments,
  // and getting returned the result of the multiplication.
  %C_m = call @multiply(%A_m, %B_m)
          : (memref<100x?xf32>, memref<?x50xf32>) -> (memref<100x50xf32>)

  memref.dealloc %A_m : memref<100x?xf32>
  memref.dealloc %B_m : memref<?x50xf32>

  // Load the buffer data into a higher level "tensor" value.
  %C = memref.tensor_load %C_m : memref<100x50xf32>
  memref.dealloc %C_m : memref<100x50xf32>

  // Call TensorFlow built-in function to print the result tensor.
  "tf.Print"(%C) {message = "mul result"} : (tensor<100x50xf32>) -> (tensor<100x50xf32>)

  return %C : tensor<100x50xf32>
}

// A function that multiplies two memrefs and returns the result.
func.func @multiply(%A: memref<100x?xf32>, %B: memref<?x50xf32>)
          -> (memref<100x50xf32>)  {
  // Compute the inner dimension of %A.
  %c1 = arith.constant 1 : index
  %n = memref.dim %A, %c1 : memref<100x?xf32>

  // Allocate memory for the multiplication result.
  %C = memref.alloc() : memref<100x50xf32>

  // Initial value of each accumulator.
  %zero = arith.constant 0.0 : f32

  // Multiplication loop nest.
  affine.for %i = 0 to 100 {
     affine.for %j = 0 to 50 {
        memref.store %zero, %C[%i, %j] : memref<100x50xf32>
        affine.for %k = 0 to %n {
           %a_v  = memref.load %A[%i, %k] : memref<100x?xf32>
           %b_v  = memref.load %B[%k, %j] : memref<?x50xf32>
           %prod = arith.mulf %a_v, %b_v : f32
           %c_v  = memref.load %C[%i, %j] : memref<100x50xf32>
           %sum  = arith.addf %c_v, %prod : f32
           memref.store %sum, %C[%i, %j] : memref<100x50xf32>
        }
     }
  }
  return %C : memref<100x50xf32>
}
```

## Notation

MLIR has a simple and unambiguous grammar, allowing it to reliably round-trip
through a textual form. This is important for development of the compiler - e.g.
for understanding the state of code as it is being transformed and writing test
cases.

This document describes the grammar using
[Extended Backus-Naur Form (EBNF)](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form).

This is the EBNF grammar used in this document, presented in yellow boxes.

```
alternation ::= expr0 | expr1 | expr2  // Either expr0 or expr1 or expr2.
sequence    ::= expr0 expr1 expr2      // Sequence of expr0 expr1 expr2.
repetition0 ::= expr*  // 0 or more occurrences.
repetition1 ::= expr+  // 1 or more occurrences.
optionality ::= expr?  // 0 or 1 occurrence.
grouping    ::= (expr) // Everything inside parens is grouped together.
literal     ::= `abcd` // Matches the literal `abcd`.
```

Code examples are presented in blue boxes.

```
// This is an example use of the grammar above:
// This matches things like: ba, bana, boma, banana, banoma, bomana...
example ::= `b` (`an` | `om`)* `a`
```

### Common syntax

The following core grammar productions are used in this document:

```
// TODO: Clarify the split between lexing (tokens) and parsing (grammar).
digit     ::= [0-9]
hex_digit ::= [0-9a-fA-F]
letter    ::= [a-zA-Z]
id-punct  ::= [$._-]

integer-literal ::= decimal-literal | hexadecimal-literal
decimal-literal ::= digit+
hexadecimal-literal ::= `0x` hex_digit+
float-literal ::= [-+]?[0-9]+[.][0-9]*([eE][-+]?[0-9]+)?
string-literal  ::= `"` [^"\n\f\v\r]* `"`   // TODO: define escaping rules
```

Not listed here, but MLIR does support comments. They use standard BCPL syntax,
starting with a `//` and going until the end of the line.
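
The literal productions and the comment syntax can be seen together in a small
sketch. The `example.const` operation and its `value` attribute are
hypothetical names used only to give each literal somewhere to appear.

```mlir
// A comment runs from `//` to the end of the line.
%dec = "example.const"() {value = 42 : i32} : () -> i32      // decimal-literal
%hex = "example.const"() {value = 0x2A : i32} : () -> i32    // hexadecimal-literal
%flt = "example.const"() {value = 4.2e+1 : f32} : () -> f32  // float-literal
%str = "example.const"() {value = "forty-two"} : () -> i32   // string-literal
```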


### Top level Productions

```
// Top level production
toplevel ::= (operation | attribute-alias-def | type-alias-def)*
```

The production `toplevel` is the top level production that is parsed by any
parser consuming the MLIR syntax. [Operations](#operations),
[Attribute aliases](#attribute-value-aliases), and [Type aliases](#type-aliases)
can be declared at the top level.
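
A minimal sketch of a file that uses all three alternatives of `toplevel` might
look as follows. The alias names `!pair` and `#unit_name` and the
`example.make` operation are assumptions made for illustration.

```mlir
// type-alias-def
!pair = tuple<i32, f32>

// attribute-alias-def
#unit_name = "meters"

// operation (hypothetical, in generic form) that uses both aliases
%0 = "example.make"() {unit = #unit_name} : () -> !pair
```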

### Identifiers and keywords

Syntax:

```
// Identifiers
bare-id ::= (letter|[_]) (letter|digit|[_$.])*
bare-id-list ::= bare-id (`,` bare-id)*
value-id ::= `%` suffix-id
alias-name ::= bare-id
suffix-id ::= (digit+ | ((letter|id-punct) (letter|id-punct|digit)*))

symbol-ref-id ::= `@` (suffix-id | string-literal) (`::` symbol-ref-id)?
value-id-list ::= value-id (`,` value-id)*

// Uses of value, e.g. in an operand list to an operation.
value-use ::= value-id (`#` decimal-literal)?
value-use-list ::= value-use (`,` value-use)*
```

Identifiers name entities such as values, types and functions, and are chosen by
the writer of MLIR code. Identifiers may be descriptive (e.g. `%batch_size`,
`@matmul`), or may be non-descriptive when they are auto-generated (e.g. `%23`,
`@func42`). Identifier names for values may be used in an MLIR text file but are
not persisted as part of the IR - the printer will give them anonymous names
like `%42`.

MLIR guarantees identifiers never collide with keywords by prefixing identifiers
with a sigil (e.g. `%`, `#`, `@`, `^`, `!`). In certain unambiguous contexts
(e.g. affine expressions), identifiers are not prefixed, for brevity. New
keywords may be added to future versions of MLIR without danger of collision
with existing identifiers.
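
The sketch below shows several of these identifier forms side by side. The
function name `@matmul_size` and the unregistered `example.split` operation are
hypothetical and serve only to illustrate the syntax.

```mlir
// `@` introduces a symbol name, `%` a value name.
func.func @matmul_size(%batch_size: index) -> index {
  // Numeric value names are equally valid.
  %0 = arith.addi %batch_size, %batch_size : index
  // One name bound to two results; `#` selects a result by position.
  %pair:2 = "example.split"(%0) : (index) -> (index, index)
  %sum = arith.addi %pair#0, %pair#1 : index
  return %sum : index
}
```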

Value identifiers are only [in scope](#value-scoping) for the (nested) region in
which they are defined and cannot be accessed or referenced outside of that
region. Argument identifiers in mapping functions are in scope for the mapping
body. Particular operations may further limit which identifiers are in scope in
their regions. For instance, the scope of values in a region with
[SSA control flow semantics](#control-flow-and-ssacfg-regions) is constrained
according to the standard definition of
[SSA dominance](https://en.wikipedia.org/wiki/Dominator_\(graph_theory\)).
Another example is the [IsolatedFromAbove trait](Traits/#isolatedfromabove),
which restricts directly accessing values defined in containing regions.

Function identifiers and mapping identifiers are associated with
[Symbols](SymbolsAndSymbolTables.md) and have scoping rules dependent on symbol
attributes.

## Dialects

Dialects are the mechanism by which to engage with and extend the MLIR
ecosystem. They allow for defining new [operations](#operations), as well as
[attributes](#attributes) and [types](#type-system). Each dialect is given a
unique `namespace` that is prefixed to each defined attribute/operation/type.
For example, the [Affine dialect](Dialects/Affine.md) defines the namespace:
`affine`.

MLIR allows for multiple dialects, even those outside of the main tree, to
co-exist together within one module. Dialects are produced and consumed by
certain passes. MLIR provides a [framework](DialectConversion.md) to convert
between, and within, different dialects.
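
As an illustration of dialects co-existing, the following sketch mixes
operations from the `func`, `arith`, and `cf` namespaces in a single function.
The function `@scale` is a made-up example and not part of any dialect.

```mlir
func.func @scale(%x: f32, %flag: i1) -> f32 {
  // Operations in the `arith` namespace.
  %two = arith.constant 2.0 : f32
  %y = arith.mulf %x, %two : f32
  // Operations in the `cf` namespace.
  cf.cond_br %flag, ^then, ^done(%x : f32)
^then:
  cf.br ^done(%y : f32)
^done(%r: f32):
  // An operation in the `func` namespace.
  func.return %r : f32
}
```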

A few of the dialects supported by MLIR:

*   [Affine dialect](Dialects/Affine.md)
*   [Func dialect](Dialects/Func.md)
*   [GPU dialect](Dialects/GPU.md)
*   [LLVM dialect](Dialects/LLVM.md)
*   [SPIR-V dialect](Dialects/SPIR-V.md)
*   [Vector dialect](Dialects/Vector.md)

### Target specific operations

Dialects provide a modular way in which targets can expose target-specific
operations directly through to MLIR. As an example, some targets go through
LLVM. LLVM has a rich set of intrinsics for certain target-independent
operations (e.g. addition with overflow check) as well as providing access to
target-specific operations for the targets it supports (e.g. vector permutation
operations). LLVM intrinsics in MLIR are represented via operations whose names
start with `llvm.`.

Example:

```mlir
// LLVM: %x = call {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
%x:2 = "llvm.sadd.with.overflow.i16"(%a, %b) : (i16, i16) -> (i16, i1)
```

These operations only work when targeting LLVM as a backend (e.g. for CPUs and
GPUs), and are required to align with the LLVM definition of these intrinsics.

## Operations

Syntax:

```
operation             ::= op-result-list? (generic-operation | custom-operation)
                          trailing-location?
generic-operation     ::= string-literal `(` value-use-list? `)`  successor-list?
                          dictionary-properties? region-list? dictionary-attribute?
                          `:` function-type
custom-operation      ::= bare-id custom-operation-format
op-result-list        ::= op-result (`,` op-result)* `=`
op-result             ::= value-id (`:` integer-literal)?
successor-list        ::= `[` successor (`,` successor)* `]`
successor             ::= caret-id (`:` block-arg-list)?
dictionary-properties ::= `<` dictionary-attribute `>`
region-list           ::= `(` region (`,` region)* `)`
dictionary-attribute  ::= `{` (attribute-entry (`,` attribute-entry)*)? `}`
trailing-location     ::= `loc` `(` location `)`
```

MLIR introduces a uniform concept called *operations* to enable describing many
different levels of abstractions and computations. Operations in MLIR are fully
extensible (there is no fixed list of operations) and have application-specific
semantics. For example, MLIR supports
[target-independent operations](Dialects/MemRef.md),
[affine operations](Dialects/Affine.md), and
[target-specific machine operations](#target-specific-operations).

The internal representation of an operation is simple: an operation is
identified by a unique string (e.g. `dim`, `tf.Conv2d`, `x86.repmovsb`,
`ppc.eieio`, etc), can return zero or more results, take zero or more operands,
has storage for [properties](#properties), has a dictionary of
[attributes](#attributes), has zero or more successors, and zero or more
enclosed [regions](#regions). The generic printing form includes all these
elements literally, with a function type to indicate the types of the
results and operands.

Example:

```mlir
// An operation that produces two results.
// The results of %result can be accessed via the <name> `#` <opNo> syntax.
%result:2 = "foo_div"() : () -> (f32, i32)

// Pretty form that defines a unique name for each result.
%foo, %bar = "foo_div"() : () -> (f32, i32)

// Invoke a TensorFlow function called tf.scramble with two inputs
// and an attribute "fruit" stored in properties.
%2 = "tf.scramble"(%result#0, %bar) <{fruit = "banana"}> : (f32, i32) -> f32

// Invoke an operation with some discardable attributes
%foo, %bar = "foo_div"() {some_attr = "value", other_attr = 42 : i64} : () -> (f32, i32)
```

In addition to the basic syntax above, dialects may register known operations.
This allows those dialects to support *custom assembly form* for parsing and
printing operations. In the operation sets listed below, we show both forms.

### Builtin Operations

The [builtin dialect](Dialects/Builtin.md) defines a select few operations that
are widely applicable by MLIR dialects, such as a universal conversion cast
operation that simplifies inter/intra dialect conversion. This dialect also
defines a top-level `module` operation, that represents a useful IR container.
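
A small sketch of both builtin operations, assuming the conversion cast meant
here is `builtin.unrealized_conversion_cast`; the function `@cast` is a
hypothetical wrapper used only to give the cast an operand.

```mlir
// The top-level container, written in its custom assembly form.
module {
  func.func @cast(%arg0: i64) -> index {
    // A cast between two types that a later conversion is expected to reconcile.
    %0 = builtin.unrealized_conversion_cast %arg0 : i64 to index
    return %0 : index
  }
}
```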

## Blocks

Syntax:

```
block           ::= block-label operation+
block-label     ::= block-id block-arg-list? `:`
block-id        ::= caret-id
caret-id        ::= `^` suffix-id
value-id-and-type ::= value-id `:` type

// Non-empty list of names and types.
value-id-and-type-list ::= value-id-and-type (`,` value-id-and-type)*

block-arg-list ::= `(` value-id-and-type-list? `)`
```

A *Block* is a list of operations. In
[SSACFG regions](#control-flow-and-ssacfg-regions), each block represents a
compiler [basic block](https://en.wikipedia.org/wiki/Basic_block) where
instructions inside the block are executed in order and terminator operations
implement control flow branches between basic blocks.

The last operation in a block must be a
[terminator operation](#control-flow-and-ssacfg-regions). A region with a single
block may opt out of this requirement by attaching the `NoTerminator` trait to
the enclosing op. The top-level `ModuleOp` is an example of such an operation
which defines this trait and whose block body does not have a terminator.

Blocks in MLIR take a list of block arguments, notated in a function-like way.
Block arguments are bound to values specified by the semantics of individual
operations. Block arguments of the entry block of a region are also arguments to
the region and the values bound to these arguments are determined by the
semantics of the containing operation. Block arguments of other blocks are
determined by the semantics of terminator operations, e.g. Branches, which have
the block as a successor. In regions with
[control flow](#control-flow-and-ssacfg-regions), MLIR leverages this structure
to implicitly represent the passage of control-flow dependent values without the
complex nuances of PHI nodes in traditional SSA representations. Note that
values which are not control-flow dependent can be referenced directly and do
not need to be passed through block arguments.

Here is a simple example function showing branches, returns, and block
arguments:

```mlir
func.func @simple(i64, i1) -> i64 {
^bb0(%a: i64, %cond: i1): // Code dominated by ^bb0 may refer to %a
  cf.cond_br %cond, ^bb1, ^bb2

^bb1:
  cf.br ^bb3(%a: i64)    // Branch passes %a as the argument

^bb2:
  %b = arith.addi %a, %a : i64
  cf.br ^bb3(%b: i64)    // Branch passes %b as the argument

// ^bb3 receives an argument, named %c, from predecessors
// and passes it on to ^bb4 along with %a. %a is referenced
// directly from its defining operation and is not passed through
// an argument of ^bb3.
^bb3(%c: i64):
  cf.br ^bb4(%c, %a : i64, i64)

^bb4(%d : i64, %e : i64):
  %0 = arith.addi %d, %e : i64
  return %0 : i64   // Return is also a terminator.
}
```

**Context:** The "block argument" representation eliminates a number of special
cases from the IR compared to traditional "PHI nodes are operations" SSA IRs
(like LLVM). For example, the
[parallel copy semantics](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.524.5461&rep=rep1&type=pdf)
of SSA is immediately apparent, and function arguments are no longer a special
case: they become arguments to the entry block
[[more rationale](Rationale/Rationale.md/#block-arguments-vs-phi-nodes)]. Blocks
are also a fundamental concept that cannot be represented by operations because
values defined in an operation cannot be accessed outside the operation.

## Regions

### Definition

A region is an ordered list of MLIR [Blocks](#blocks). The semantics within a
region is not imposed by the IR. Instead, the containing operation defines the
semantics of the regions it contains. MLIR currently defines two kinds of
regions: [SSACFG regions](#control-flow-and-ssacfg-regions), which describe
control flow between blocks, and [Graph regions](#graph-regions), which do not
require control flow between blocks. The kinds of regions within an operation
are described using the
[RegionKindInterface](Interfaces.md/#regionkindinterfaces).

Regions do not have a name or an address, only the blocks contained in a region
do. Regions must be contained within operations and have no type or attributes.
The first block in the region is a special block called the 'entry block'. The
arguments to the entry block are also the arguments of the region itself. The
entry block cannot be listed as a successor of any other block. The syntax for a
region is as follows:

```
region      ::= `{` entry-block? block* `}`
entry-block ::= operation+
```

A function body is an example of a region: it consists of a CFG of blocks and
has additional semantic restrictions that other types of regions may not have.
For example, in a function body, block terminators must either branch to a
different block, or return from a function where the types of the `return`
arguments must match the result types of the function signature. Similarly, the
function arguments must match the types and count of the region arguments. In
general, operations with regions can define these correspondences arbitrarily.

An *entry block* is a block with no label and no arguments that may occur at
the beginning of a region. It enables a common pattern of using a region to
open a new scope.
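
The two ways of writing the first block of a region can be sketched as follows.
All `example.*` operations are hypothetical, unregistered operations in generic
form.

```mlir
// Entry block written without a label: it has no arguments.
"example.scope"() ({
  %0 = "example.produce"() : () -> i32
  "example.consume"(%0) : (i32) -> ()
}) : () -> ()

// First block written with a label so that it can declare an argument,
// which is then also an argument of the region.
"example.scope_with_arg"() ({
^bb0(%arg: i32):
  "example.consume"(%arg) : (i32) -> ()
}) : () -> ()
```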


### Value Scoping

Regions provide hierarchical encapsulation of programs: it is impossible to
reference, i.e. branch to, a block which is not in the same region as the source
of the reference, i.e. a terminator operation. Similarly, regions provide a
natural scoping for value visibility: values defined in a region don't escape to
the enclosing region, if any. By default, operations inside a region can
reference values defined outside of the region whenever it would have been legal
for operands of the enclosing operation to reference those values, but this can
be restricted using traits, such as
[OpTrait::IsolatedFromAbove](Traits/#isolatedfromabove), or a custom
verifier.

Example:

```mlir
  "any_op"(%a) ({ // if %a is in-scope in the containing region...
     // then %a is in-scope here too.
    %new_value = "another_op"(%a) : (i64) -> (i64)
  }) : (i64) -> (i64)
```

MLIR defines a generalized 'hierarchical dominance' concept that operates across
hierarchy and defines whether a value is 'in scope' and can be used by a
particular operation. Whether a value can be used by another operation in the
same region is defined by the kind of region. A value defined in a region can be
used by an operation which has a parent in the same region, if and only if the
parent could use the value. A value defined by an argument to a region can
always be used by any operation deeply contained in the region. A value defined
in a region can never be used outside of the region.
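
These rules can be traced in a short sketch. The function `@scoping` and the
`example.*` operations are hypothetical; the nested region is assumed not to be
isolated from above.

```mlir
func.func @scoping(%arg: i32) -> i32 {
  %outer = arith.addi %arg, %arg : i32
  "example.region_op"() ({
    // "example.region_op" could use %arg and %outer as operands, so the
    // operations nested in its region may use them as well.
    %inner = arith.muli %outer, %arg : i32
    "example.yield"(%inner) : (i32) -> ()
  }) : () -> ()
  // %inner is defined in the nested region and cannot be used here.
  return %outer : i32
}
```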

### Control Flow and SSACFG Regions

In MLIR, control flow semantics of a region is indicated by
[RegionKind::SSACFG](Interfaces.md/#regionkindinterfaces). Informally, these
regions support semantics where operations in a region 'execute sequentially'.
Before an operation executes, its operands have well-defined values. After an
operation executes, the operands have the same values and results also have
well-defined values. After an operation executes, the next operation in the
block executes until the operation is the terminator operation at the end of a
block, in which case some other operation will execute. The determination of the
next instruction to execute is the 'passing of control flow'.

In general, when control flow is passed to an operation, MLIR does not restrict
when control flow enters or exits the regions contained in that operation.
However, when control flow enters a region, it always begins in the first block
of the region, called the *entry* block. Terminator operations ending each block
represent control flow by explicitly specifying the successor blocks of the
block. Control flow can only pass to one of the specified successor blocks as in
a `branch` operation, or back to the containing operation as in a `return`
operation. Terminator operations without successors can only pass control back
to the containing operation. Within these restrictions, the particular semantics
of terminator operations is determined by the specific dialect operations
involved. Blocks (other than the entry block) that are not listed as a successor
of a terminator operation are defined to be unreachable and can be removed
without affecting the semantics of the containing operation.

Although control flow always enters a region through the entry block, control
flow may exit a region through any block with an appropriate terminator. The
`func` dialect leverages this capability to define operations with
Single-Entry-Multiple-Exit (SEME) regions, possibly flowing through different
blocks in the region and exiting through any block with a `return` operation.
This behavior is similar to that of a function body in most programming
languages. In addition, control flow may also not reach the end of a block or
region, for example if a function call does not return.

Example:

```mlir
func.func @accelerator_compute(i64, i1) -> i64 { // An SSACFG region
^bb0(%a: i64, %cond: i1): // Code dominated by ^bb0 may refer to %a
  cf.cond_br %cond, ^bb1, ^bb2

^bb1:
  // This def for %value does not dominate ^bb2
  %value = "op.convert"(%a) : (i64) -> i64
  cf.br ^bb3(%a: i64)    // Branch passes %a as the argument

^bb2:
  accelerator.launch() { // An SSACFG region
    ^bb0:
      // Region of code nested under "accelerator.launch", it can reference %a but
      // not %value.
      %new_value = "accelerator.do_something"(%a) : (i64) -> i64
  }
  // %new_value cannot be referenced outside of the region

^bb3(%c: i64):
  ...
}
```

#### Operations with Multiple Regions

An operation containing multiple regions also completely determines the
semantics of those regions. In particular, when control flow is passed to an
operation, it may transfer control flow to any contained region. When control
flow exits a region and is returned to the containing operation, the containing
operation may pass control flow to any region in the same operation. An
operation may also pass control flow to multiple contained regions concurrently.
An operation may also pass control flow into regions that were specified in
other operations, in particular those that defined the values or symbols the
given operation uses as in a call operation. This passage of control is
generally independent of passage of control flow through the basic blocks of the
containing region.
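
One concrete instance, offered only as an illustration, is the `scf.if`
operation from the `scf` dialect: it contains two regions and transfers control
to exactly one of them, depending on its operand. The function `@select` is a
hypothetical wrapper.

```mlir
func.func @select(%cond: i1, %a: i32, %b: i32) -> i32 {
  // The first region runs when %cond is true, the second otherwise.
  %r = scf.if %cond -> (i32) {
    scf.yield %a : i32
  } else {
    scf.yield %b : i32
  }
  return %r : i32
}
```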

#### Closure

Regions allow defining an operation that creates a closure, for example by
“boxing” the body of the region into a value they produce. It remains up to the
operation to define its semantics. Note that if an operation triggers
asynchronous execution of the region, it is the responsibility of the
operation's caller to wait for the region to finish executing, guaranteeing
that any directly used values remain live.

### Graph Regions

In MLIR, graph-like semantics in a region is indicated by
[RegionKind::Graph](Interfaces.md/#regionkindinterfaces). Graph regions are
appropriate for concurrent semantics without control flow, or for modeling
generic directed graph data structures. Graph regions are appropriate for
representing cyclic relationships between coupled values where there is no
fundamental order to the relationships. For instance, operations in a graph
region may represent independent threads of control with values representing
streams of data. As usual in MLIR, the particular semantics of a region is
completely determined by its containing operation. Graph regions may only
contain a single basic block (the entry block).

**Rationale:** Currently graph regions are arbitrarily limited to a single basic
block, although there is no particular semantic reason for this limitation. This
limitation has been added to make it easier to stabilize the pass infrastructure
and commonly used passes for processing graph regions to properly handle
feedback loops. Multi-block regions may be allowed in the future if use cases
that require it arise.

In graph regions, MLIR operations naturally represent nodes, while each MLIR
value represents a multi-edge connecting a single source node and multiple
destination nodes. All values defined in the region as results of operations are
in scope within the region and can be accessed by any other operation in the
region. In graph regions, the order of operations within a block and the order
of blocks in a region is not semantically meaningful and non-terminator
operations may be freely reordered, for instance, by canonicalization. Other
kinds of graphs, such as graphs with multiple source nodes and multiple
destination nodes, can also be represented by representing graph edges as MLIR
operations.

Note that cycles can occur within a single block in a graph region, or between
basic blocks.

```mlir
"test.graph_region"() ({ // A Graph region
  %1 = "op1"(%1, %3) : (i32, i32) -> (i32)  // OK: %1, %3 allowed here
  %2 = "test.ssacfg_region"() ({
     %5 = "op2"(%1, %2, %3, %4) : (i32, i32, i32, i32) -> (i32) // OK: %1, %2, %3, %4 all defined in the containing region
  }) : () -> (i32)
  %3 = "op2"(%1, %4) : (i32, i32) -> (i32)  // OK: %4 allowed here
  %4 = "op3"(%1) : (i32) -> (i32)
}) : () -> ()
```

### Arguments and Results

The arguments of the first block of a region are treated as arguments of the
region. The source of these arguments is defined by the semantics of the parent
operation. They may correspond to some of the values the operation itself uses.

Regions produce a (possibly empty) list of values. The operation semantics
defines the relation between the region results and the operation results.
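
As one possible illustration, the `scf.for` operation from the `scf` dialect
binds its region arguments to an induction variable and to loop-carried values,
and maps the values yielded by the region to its own results. The function
`@sum_to` is a hypothetical example.

```mlir
func.func @sum_to(%n: index) -> index {
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  // The region arguments are %i and %acc. The operation binds %i to the
  // induction variable and %acc to the carried value, initially %c0.
  %total = scf.for %i = %c0 to %n step %c1 iter_args(%acc = %c0) -> (index) {
    %next = arith.addi %acc, %i : index
    // The yielded value becomes %acc in the next iteration and the
    // operation result %total once the loop finishes.
    scf.yield %next : index
  }
  return %total : index
}
```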

## Type System

Each value in MLIR has a type defined by the type system. MLIR has an open type
system (i.e. there is no fixed list of types), and types may have
application-specific semantics. MLIR dialects may define any number of types
with no restrictions on the abstractions they represent.

```
type ::= type-alias | dialect-type | builtin-type

type-list-no-parens ::=  type (`,` type)*
type-list-parens ::= `(` `)`
                   | `(` type-list-no-parens `)`

// This is a common way to refer to a value with a specified type.
ssa-use-and-type ::= ssa-use `:` type
ssa-use ::= value-use

// Non-empty list of names and types.
ssa-use-and-type-list ::= ssa-use-and-type (`,` ssa-use-and-type)*

function-type ::= (type | type-list-parens) `->` (type | type-list-parens)
```
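
The `function-type` production appears as the trailing type of every operation
in generic form. The following sketch uses hypothetical `example.*` operations
to show a few shapes it can take.

```mlir
// No operands, one result: a single result type may omit the parentheses.
%0 = "example.source"() : () -> i32
// Two operands, two results: multiple result types are parenthesized.
%1:2 = "example.pair"(%0, %0) : (i32, i32) -> (i32, f32)
// One operand, no results.
"example.sink"(%1#1) : (f32) -> ()
```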

### Type Aliases

```
type-alias-def ::= `!` alias-name `=` type
type-alias ::= `!` alias-name
```

MLIR supports defining named aliases for types. A type alias is an identifier
that can be used in the place of the type that it defines. These aliases *must*
be defined before their uses. Alias names may not contain a '.', since those
names are reserved for [dialect types](#dialect-types).

Example:

```mlir
!avx_m128 = vector<4xf32>

// Using the original type.
"foo"(%x) : (vector<4xf32>) -> ()

// Using the type alias.
"foo"(%x) : (!avx_m128) -> ()
```
687
### Dialect Types

Similarly to operations, dialects may define custom extensions to the type
system.

```
dialect-namespace ::= bare-id

dialect-type ::= `!` (opaque-dialect-type | pretty-dialect-type)
opaque-dialect-type ::= dialect-namespace dialect-type-body
pretty-dialect-type ::= dialect-namespace `.` pretty-dialect-type-lead-ident
                                              dialect-type-body?
pretty-dialect-type-lead-ident ::= `[A-Za-z][A-Za-z0-9._]*`

dialect-type-body ::= `<` dialect-type-contents+ `>`
dialect-type-contents ::= dialect-type-body
                            | `(` dialect-type-contents+ `)`
                            | `[` dialect-type-contents+ `]`
                            | `{` dialect-type-contents+ `}`
                            | [^\[<({\]>)}\0]+
```

Dialect types are generally specified in an opaque form, where the contents
of the type are defined within a body wrapped with the dialect namespace
and `<>`. Consider the following examples:

```mlir
// A tensorflow string type.
!tf<string>

// A type with complex components.
!foo<something<abcd>>

// An even more complex type.
!foo<"a123^^^" + bar>
```

Dialect types that are simple enough may use a prettier format, which unwraps
part of the syntax into an equivalent, but lighter weight form:

```mlir
// A tensorflow string type.
!tf.string

// A type with complex components.
!foo.something<abcd>
```

See [here](DefiningDialects/AttributesAndTypes.md) to learn how to define dialect types.

### Builtin Types

The [builtin dialect](Dialects/Builtin.md) defines a set of types that are
directly usable by any other dialect in MLIR. These types range from primitive
integer and floating-point types to function types and more.

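As a quick, non-exhaustive illustration, the declaration below lists a handful
of builtin types as the argument types of an external function. The function
name is made up, and the [builtin dialect](Dialects/Builtin.md) documentation
remains the authoritative list.

```mlir
func.func private @builtin_type_samples(
    i32,              // signless 32-bit integer
    ui8,              // unsigned 8-bit integer
    f32,              // 32-bit floating point
    index,            // index type
    vector<4xf32>,    // vector of four f32 elements
    tensor<?x8xf32>,  // ranked tensor with one dynamic dimension
    memref<16xi8>,    // reference to a buffer of sixteen i8 elements
    tuple<i32, f32>,  // tuple of two types
    (i32) -> f32)     // function type
```
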
## Properties

Properties are extra data members stored directly on an Operation class. They
provide a way to store [inherent attributes](#attributes) and other arbitrary
data. The semantics of the data is specific to a given operation, and may be
exposed through [Interfaces](Interfaces.md) accessors and other methods.
Properties can always be serialized to an Attribute in order to be printed
generically.

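For instance, when an operation is printed in the generic form, its properties
appear as a dictionary wrapped in `<{` and `}>` after the operand list. The
sketch below assumes an `arith.cmpi` that stores its `predicate` as a property;
the function name is made up, and the integer `2` is assumed to encode the
`slt` predicate.

```mlir
func.func @less_than(%a: i32, %b: i32) -> i1 {
  // The inherent `predicate` is serialized as an attribute inside `<{...}>`.
  %0 = "arith.cmpi"(%a, %b) <{predicate = 2 : i64}> : (i32, i32) -> i1
  func.return %0 : i1
}
```
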
## Attributes

Syntax:

```
attribute-entry ::= (bare-id | string-literal) `=` attribute-value
attribute-value ::= attribute-alias | dialect-attribute | builtin-attribute
```

Attributes are the mechanism for specifying constant data on operations in
places where a variable is never allowed - e.g. the comparison predicate of a
[`cmpi` operation](Dialects/ArithOps.md/#arithcmpi-arithcmpiop). Each operation has an
attribute dictionary, which associates a set of attribute names to attribute
values. MLIR's builtin dialect provides a rich set of
[builtin attribute values](#builtin-attribute-values) out of the box (such as
arrays, dictionaries, strings, etc.). Additionally, dialects can define their
own [dialect attribute values](#dialect-attribute-values).

For dialects which haven't adopted properties yet, the top-level attribute
dictionary attached to an operation has special semantics. The attribute
entries are considered to be of two different kinds based on whether their
dictionary key has a dialect prefix:

-   *inherent attributes* are inherent to the definition of an operation's
    semantics. The operation itself is expected to verify the consistency of
    these attributes. An example is the `predicate` attribute of the
    `arith.cmpi` op. These attributes must have names that do not start with a
    dialect prefix.

-   *discardable attributes* have semantics defined externally to the operation
    itself, but must be compatible with the operation's semantics. These
    attributes must have names that start with a dialect prefix. The dialect
    indicated by the dialect prefix is expected to verify these attributes. An
    example is the `gpu.container_module` attribute.

Note that attribute values are allowed to themselves be dictionary attributes,
but only the top-level dictionary attribute attached to the operation is subject
to the classification above.

When properties are adopted, only discardable attributes are stored in the
top-level dictionary, while inherent attributes are stored in the properties
storage.

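The two kinds of attributes named above can be seen side by side in the
following sketch, which reuses the `gpu.container_module` and `predicate`
examples from the list. The function name is made up for illustration.

```mlir
// `gpu.container_module` starts with the `gpu` dialect prefix, so it is a
// discardable attribute on the module.
module attributes {gpu.container_module} {
  func.func @less_than(%a: i32, %b: i32) -> i1 {
    // `slt` is the value of the inherent `predicate` attribute, written here
    // in the custom assembly form of `arith.cmpi`.
    %0 = arith.cmpi slt, %a, %b : i32
    func.return %0 : i1
  }
}
```
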
### Attribute Value Aliases

```
attribute-alias-def ::= `#` alias-name `=` attribute-value
attribute-alias ::= `#` alias-name
```

MLIR supports defining named aliases for attribute values. An attribute alias is
an identifier that can be used in the place of the attribute that it defines.
These aliases *must* be defined before their uses. Alias names may not contain a
'.', since those names are reserved for
[dialect attributes](#dialect-attribute-values).

Example:

```mlir
#map = affine_map<(d0) -> (d0 + 10)>

// Using the original attribute.
%b = affine.apply affine_map<(d0) -> (d0 + 10)> (%a)

// Using the attribute alias.
%b = affine.apply #map(%a)
```

### Dialect Attribute Values

Similarly to operations, dialects may define custom attribute values.

```
dialect-namespace ::= bare-id

dialect-attribute ::= `#` (opaque-dialect-attribute | pretty-dialect-attribute)
opaque-dialect-attribute ::= dialect-namespace dialect-attribute-body
pretty-dialect-attribute ::= dialect-namespace `.` pretty-dialect-attribute-lead-ident
                                              dialect-attribute-body?
pretty-dialect-attribute-lead-ident ::= `[A-Za-z][A-Za-z0-9._]*`

dialect-attribute-body ::= `<` dialect-attribute-contents+ `>`
dialect-attribute-contents ::= dialect-attribute-body
                            | `(` dialect-attribute-contents+ `)`
                            | `[` dialect-attribute-contents+ `]`
                            | `{` dialect-attribute-contents+ `}`
                            | [^\[<({\]>)}\0]+
```

Dialect attributes are generally specified in an opaque form, where the contents
of the attribute are defined within a body wrapped with the dialect namespace
and `<>`. Consider the following examples:

```mlir
// A string attribute.
#foo<string<"">>

// A complex attribute.
#foo<"a123^^^" + bar>
```

Dialect attributes that are simple enough may use a prettier format, which unwraps
part of the syntax into an equivalent, but lighter weight form:

```mlir
// A string attribute.
#foo.string<"">
```

See [here](DefiningDialects/AttributesAndTypes.md) to learn how to define dialect attribute values.

### Builtin Attribute Values

The [builtin dialect](Dialects/Builtin.md) defines a set of attribute values
that are directly usable by any other dialect in MLIR. These attribute values
range from primitive integer and floating-point values to attribute
dictionaries, dense multi-dimensional arrays, and more.

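The sketch below attaches a small, non-exhaustive sample of builtin attribute
values to a single operation. The operation `foo.op` and all attribute names
are hypothetical; because `foo.op` is unregistered, a tool such as `mlir-opt`
would need to be told to accept unregistered dialects in order to parse it.

```mlir
"foo.op"() {
  a_bool = true,                                  // boolean
  an_int = 42 : i32,                              // typed integer
  a_float = 1.5 : f32,                            // typed float
  a_string = "hello",                             // string
  an_array = [1, 2, 3],                           // array of attributes
  a_dict = {nested = "value"},                    // nested dictionary
  a_dense = dense<[1, 2, 3, 4]> : tensor<4xi32>,  // dense elements
  a_type = vector<4xf32>,                         // type used as an attribute
  a_unit                                          // unit attribute
} : () -> ()
```
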
### IR Versioning

A dialect can opt in to versioning through the `BytecodeDialectInterface`. The
interface exposes a few hooks that let the dialect manage a version encoded in
the bytecode file. The version is loaded lazily and makes the version
information available while parsing the input IR. Each dialect for which a
version is present also gets an opportunity to perform IR upgrades after
parsing through the `upgradeFromVersion` method. Custom Attribute and Type
encodings can also be upgraded according to the dialect version using the
`readAttribute` and `readType` methods.

There is no restriction on what kind of information a dialect is allowed to
encode to model its versioning. Currently, versioning is supported only for
bytecode formats.
