# Defining Dialect Attributes and Types

This document describes how to define dialect
[attributes](../LangRef.md/#attributes) and [types](../LangRef.md/#type-system).

[TOC]

## LangRef Refresher

Before diving into how to define these constructs, below is a quick refresher
from the [MLIR LangRef](../LangRef.md).

### Attributes

Attributes are the mechanism for specifying constant data on operations in
places where a variable is never allowed - e.g. the comparison predicate of an
[`arith.cmpi` operation](../Dialects/ArithOps.md/#arithcmpi-arithcmpiop), or
the underlying value of an [`arith.constant` operation](../Dialects/ArithOps.md/#arithconstant-arithconstantop).
Each operation has an attribute dictionary, which associates a set of attribute
names to attribute values.

### Types

Every SSA value, such as operation results or block arguments, in MLIR has a type
defined by the type system. MLIR has an open type system with no fixed list of types,
and there are no restrictions on the abstractions they represent. For example, take
the following [Arithmetic AddI operation](../Dialects/ArithOps.md/#arithaddi-arithaddiop):

```mlir
  %result = arith.addi %lhs, %rhs : i64
```

It takes two input SSA values (`%lhs` and `%rhs`), and returns a single SSA
value (`%result`). The inputs and outputs of this operation are of type `i64`,
which is an instance of the [Builtin IntegerType](../Dialects/Builtin.md/#integertype).

## Attributes and Types

The C++ Attribute and Type classes in MLIR (like Ops, and many other things) are
value-typed. This means that instances of `Attribute` or `Type` are passed
around by-value, as opposed to by-pointer or by-reference. The `Attribute` and
`Type` classes act as wrappers around internal storage objects that are uniqued
within an instance of an `MLIRContext`.

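To illustrate what "value-typed wrappers around uniqued storage" means in
practice, the following is a minimal, self-contained C++ sketch. It does not use
any MLIR API: `ToyContext` and `ToyIntType` are hypothetical stand-ins for
`MLIRContext` and a `Type` class, and the real uniquing machinery is
considerably more involved.

```c++
#include <cassert>
#include <map>
#include <memory>

// Hypothetical stand-in for MLIRContext: owns one storage object per
// distinct parameter value.
struct ToyContext {
  struct IntStorage {
    unsigned width;
  };
  std::map<unsigned, std::unique_ptr<IntStorage>> intStorages;
};

// Hypothetical stand-in for a Type class: a cheap, copyable handle that only
// holds a pointer to storage owned by the context.
class ToyIntType {
public:
  // Returns the handle for `width`, creating the storage on first use.
  static ToyIntType get(ToyContext &ctx, unsigned width) {
    std::unique_ptr<ToyContext::IntStorage> &slot = ctx.intStorages[width];
    if (!slot)
      slot = std::make_unique<ToyContext::IntStorage>(
          ToyContext::IntStorage{width});
    return ToyIntType(slot.get());
  }

  unsigned getWidth() const {
    assert(impl && "handle must wrap a storage object");
    return impl->width;
  }

  // Storage is uniqued, so equality is a pointer comparison.
  bool operator==(ToyIntType other) const { return impl == other.impl; }
  bool operator!=(ToyIntType other) const { return impl != other.impl; }

private:
  explicit ToyIntType(const ToyContext::IntStorage *impl) : impl(impl) {}
  const ToyContext::IntStorage *impl;
};
```

In this sketch, two calls to `ToyIntType::get` with the same width return
handles that compare equal because they wrap the same storage object, which is
why passing such handles around by value is cheap.
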
The structure for defining Attributes and Types is nearly identical, with only a
few differences depending on the context. As such, a majority of this document
describes the process for defining both Attributes and Types side-by-side with
examples for both. If necessary, a section will explicitly call out any
distinct differences.

One difference is that generating C++ classes for custom attributes from
declarative TableGen definitions requires adding additional targets to your
`CMakeLists.txt`. This is not necessary for custom types. The details are
outlined further below.

### Adding a new Attribute or Type definition

As described above, C++ Attribute and Type objects in MLIR are value-typed and
essentially function as helpful wrappers around an internal storage object that
holds the actual data for the attribute or type. Similarly to Operations,
Attributes and Types are defined declaratively via
[TableGen](https://llvm.org/docs/TableGen/index.html),
a generic language with tooling to maintain records of domain-specific information.
It is highly recommended that users review the
[TableGen Programmer's Reference](https://llvm.org/docs/TableGen/ProgRef.html)
for an introduction to its syntax and constructs.

Starting the definition of a new attribute or type simply requires adding a
specialization for either the `AttrDef` or `TypeDef` class respectively. Instances
of the classes correspond to unique Attribute or Type classes.

The examples below showcase a Type definition followed by an Attribute
definition. We generally recommend
defining Attribute and Type classes in different `.td` files to better encapsulate
the different constructs, and define a proper layering between them. This
recommendation extends to all of the MLIR constructs, including [Interfaces](../Interfaces.md),
Operations, etc.

```tablegen
// Include the definition of the necessary tablegen constructs for defining
// our types.
include "mlir/IR/AttrTypeBase.td"

// It's common to define a base class for types in the same dialect. This
// removes the need to pass in the dialect for each type, and can also be used
// to define a few fields ahead of time.
class MyDialect_Type<string name, string typeMnemonic, list<Trait> traits = []>
    : TypeDef<My_Dialect, name, traits> {
  let mnemonic = typeMnemonic;
}

// Here is a simple definition of an "integer" type, with a width parameter.
def My_IntegerType : MyDialect_Type<"Integer", "int"> {
  let summary = "Integer type with arbitrary precision up to a fixed limit";
  let description = [{
    Integer types have a designated bit width.
  }];
  /// Here we define a single parameter for the type, which is the bitwidth.
  let parameters = (ins "unsigned":$width);

  /// Here we define the textual format of the type declaratively, which will
  /// automatically generate parser and printer logic. This will allow for
  /// instances of the type to be output as, for example:
  ///
  ///    !my.int<10> // a 10-bit integer.
  ///
  let assemblyFormat = "`<` $width `>`";

  /// Indicate that our type will add additional verification to the parameters.
  let genVerifyDecl = 1;
}
```

Below is an example of an Attribute:

```tablegen
// Include the definition of the necessary tablegen constructs for defining
// our attributes.
include "mlir/IR/AttrTypeBase.td"

// It's common to define a base class for attributes in the same dialect. This
// removes the need to pass in the dialect for each attribute, and can also be used
// to define a few fields ahead of time.
class MyDialect_Attr<string name, string attrMnemonic, list<Trait> traits = []>
    : AttrDef<My_Dialect, name, traits> {
  let mnemonic = attrMnemonic;
}

// Here is a simple definition of an "integer" attribute, with a type and value parameter.
def My_IntegerAttr : MyDialect_Attr<"Integer", "int"> {
  let summary = "An Attribute containing an integer value";
  let description = [{
    An integer attribute is a literal attribute that represents an integral
    value of the specified integer type.
  }];
  /// Here we've defined two parameters, one is a "self" type parameter, and the
  /// other is the integer value of the attribute. The self type parameter is
  /// specially handled by the assembly format.
  let parameters = (ins AttributeSelfTypeParameter<"">:$type, "APInt":$value);

  /// Here we've defined a custom builder for the attribute that removes the
  /// need to pass in an MLIRContext instance, as it can be inferred from the
  /// `type`.
  let builders = [
    AttrBuilderWithInferredContext<(ins "Type":$type,
                                        "const APInt &":$value), [{
      return $_get(type.getContext(), type, value);
    }]>
  ];

  /// Here we define the textual format of the attribute declaratively, which will
  /// automatically generate parser and printer logic. This will allow for
  /// instances of the attribute to be output as, for example:
  ///
  ///    #my.int<50> : !my.int<32> // a 32-bit integer of value 50.
  ///
  /// Note that the self type parameter is not included in the assembly format.
  /// Its value is derived from the optional trailing type on all attributes.
  let assemblyFormat = "`<` $value `>`";

  /// Indicate that our attribute will add additional verification to the parameters.
  let genVerifyDecl = 1;

  /// Indicate to the ODS generator that we do not want the default builders,
  /// as we have defined our own simpler ones.
  let skipDefaultBuilders = 1;
}
```

### Class Name

The name of the C++ class which gets generated defaults to
`<classParamName>Attr` or `<classParamName>Type` for attributes and types
respectively. In the examples above, this was the `name` template parameter that
was provided to `MyDialect_Attr` and `MyDialect_Type`. For the definitions we
added above, we would get C++ classes named `IntegerType` and `IntegerAttr`
respectively. This can be explicitly overridden via the `cppClassName` field.

### CMake Targets

If you added your dialect using `add_mlir_dialect()` in your `CMakeLists.txt`,
the above mentioned classes will automatically get generated for custom
_types_. They will be output in a file named `<Your Dialect>Types.h.inc`.

To also generate the classes for custom _attributes_, you will need to add
two additional TableGen targets to your `CMakeLists.txt`:

```cmake
mlir_tablegen(<Your Dialect>AttrDefs.h.inc -gen-attrdef-decls
              -attrdefs-dialect=<Your Dialect>)
mlir_tablegen(<Your Dialect>AttrDefs.cpp.inc -gen-attrdef-defs
              -attrdefs-dialect=<Your Dialect>)
add_public_tablegen_target(<Your Dialect>AttrDefsIncGen)
```

The generated `<Your Dialect>AttrDefs.h.inc` will need to be included wherever
you are referencing the custom attribute types.

### Documentation

The `summary` and `description` fields allow for providing user documentation
for the attribute or type. The `summary` field expects a simple single-line
string, with the `description` field used for long and extensive documentation.
This documentation can be used to generate markdown documentation for the
dialect and is used by upstream
[MLIR dialects](https://mlir.llvm.org/docs/Dialects/).

### Mnemonic

The `mnemonic` field, i.e. the template parameters `attrMnemonic` and
`typeMnemonic` we specified above, is used to specify a name for use during
parsing. This allows for more easily dispatching to the corresponding attribute
or type class when parsing IR. This field is generally optional, and custom
parsing/printing logic can be added without defining it, though most classes
will want to take advantage of the convenience it provides. This is why we
added it as a template parameter in the examples above.

### Parameters

The `parameters` field is a variable-length list containing the attribute or
type's parameters. If no parameters are specified (the default), this type is
considered a singleton type (meaning there is only one possible instance).
Parameters in this list take the form: `"c++Type":$paramName`. When the
`genAccessors` field is set to 1 (the default), an accessor method is generated
for each parameter (e.g. `unsigned getWidth() const` in the Type example above).

Parameter types with a C++ type that requires allocation when constructing the
storage instance in the context require one of the following:

- Utilize the `AttrParameter` or `TypeParameter` classes instead of the raw
  "c++Type" string. This allows for providing custom allocation code when using
  that parameter. `StringRefParameter` and `ArrayRefParameter` are examples of
  common parameter types that require allocation.
- Set the `hasCustomStorageConstructor` field to `1` to generate a storage class
  that only declares the constructor, allowing for you to specialize it with
  whatever allocation code necessary.

#### AttrParameter, TypeParameter, and AttrOrTypeParameter

As hinted at above, these classes allow for specifying parameter types with
additional functionality. This is generally useful for complex parameters, or those
with additional invariants that prevent using the raw C++ class. Examples
include documentation (e.g. the `summary` and `syntax` fields), the C++ type, a
custom allocator to use in the storage constructor method, a custom comparator
to decide if two instances of the parameter type are equal, etc. As the names
may suggest, `AttrParameter` is intended for parameters on Attributes,
`TypeParameter` for Type parameters, and `AttrOrTypeParameter` for either.

Below is a common parameter pitfall that highlights when to use these parameter
classes.

```tablegen
let parameters = (ins "ArrayRef<int>":$dims);
```

The above seems innocuous, but it is often a bug! The default storage
constructor blindly copies parameters by value. It does not know anything about
the types, meaning that the `ArrayRef` itself will be copied as-is without
copying the elements it points to. This is likely to lead to use-after-free
errors when using the created Attribute or Type if the underlying data does not
have a lifetime exceeding that of the MLIRContext.
If the lifetime of the data can't be guaranteed, the `ArrayRef<int>` requires
allocation to ensure that its elements reside within the MLIRContext, e.g. with
`dims = allocator.copyInto(dims)`.

Here is a simple example for the exact situation above:

```tablegen
def ArrayRefIntParam : TypeParameter<"::llvm::ArrayRef<int>", "Array of int"> {
  let allocator = "$_dst = $_allocator.copyInto($_self);";
}
```

The parameter can then be used as so:

```tablegen
...
let parameters = (ins ArrayRefIntParam:$dims);
```

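To make the lifetime issue concrete, here is a small standalone C++ sketch of
the copy-into-allocator idea. It is a toy model under stated assumptions, not
MLIR code: `IntView` stands in for `ArrayRef<int>`, and `ToyAllocator` is a
hypothetical allocator whose memory lives as long as the allocator itself,
loosely mirroring storage owned by the context.

```c++
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical non-owning view, standing in for ArrayRef<int>.
struct IntView {
  const int *data = nullptr;
  std::size_t size = 0;
};

// Hypothetical allocator: every copy it makes stays alive until the
// allocator is destroyed.
class ToyAllocator {
public:
  // Copies the viewed elements into owned memory and returns a view of the
  // copy.
  IntView copyInto(IntView src) {
    assert((src.data || src.size == 0) && "non-empty view needs data");
    owned.emplace_back(src.data, src.data + src.size);
    const std::vector<int> &copy = owned.back();
    return IntView{copy.data(), copy.size()};
  }

private:
  // A deque keeps references to existing elements valid when appending.
  std::deque<std::vector<int>> owned;
};

// Builds a view that is safe to keep after this function returns.
IntView makeStoredDims(ToyAllocator &allocator) {
  std::vector<int> temporary = {2, 3, 4};
  IntView dims{temporary.data(), temporary.size()};
  // Keeping `dims` as-is would dangle once `temporary` is destroyed, so the
  // elements are copied into allocator-owned memory first.
  return allocator.copyInto(dims);
}
```

The view returned by `makeStoredDims` remains valid for as long as the
`ToyAllocator` instance is alive, which is the guarantee the `allocator` code
block is meant to provide for the parameter's data.
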
The other available fields are described below.

The `allocator` code block has the following substitutions:

- `$_allocator` is the TypeStorageAllocator in which to allocate objects.
- `$_dst` is the variable in which to place the allocated data.

The `comparator` code block has the following substitutions:

- `$_lhs` is an instance of the parameter type.
- `$_rhs` is an instance of the parameter type.

MLIR includes several specialized classes for common situations:

- `APFloatParameter` for APFloats.

- `StringRefParameter<descriptionOfParam>` for StringRefs.

- `ArrayRefParameter<arrayOf, descriptionOfParam>` for ArrayRefs of value types.

- `SelfAllocationParameter<descriptionOfParam>` for C++ classes which contain a
  method called `allocateInto(StorageAllocator &allocator)` to allocate itself
  into `allocator`.

- `ArrayRefOfSelfAllocationParameter<arrayOf, descriptionOfParam>` for arrays of
  objects which self-allocate as per the last specialization.

- `AttributeSelfTypeParameter` is a special `AttrParameter` that represents
  parameters derived from the optional trailing type on attributes.

### Traits

Similarly to operations, Attribute and Type classes may attach `Traits` that
provide additional mixin methods and other data. `Trait`s may be attached via
the trailing template argument, i.e. the `traits` list parameter in the example
above. See the main [`Trait`](../Traits) documentation for more information
on defining and using traits.

### Interfaces

Attribute and Type classes may attach `Interfaces` to provide a virtual
interface into the Attribute or Type. `Interfaces` are added in the same way as
[Traits](#Traits), by using the `traits` list template parameter of the
`AttrDef` or `TypeDef`. See the main [`Interface`](../Interfaces.md)
documentation for more information on defining and using interfaces.

### Builders

For each attribute or type, there are a few builders (`get`/`getChecked`)
automatically generated based on the parameters of the type. These are used to
construct instances of the corresponding attribute or type. For example, given
the following definition:

```tablegen
def MyAttrOrType : ... {
  let parameters = (ins "int":$intParam);
}
```

The following builders are generated:

```c++
// Builders are named `get`, and return a new instance for a given set of parameters.
static MyAttrOrType get(MLIRContext *context, int intParam);

// If `genVerifyDecl` is set to 1, the following method is also generated. This method
// is similar to `get`, but is failable and on error will return nullptr.
static MyAttrOrType getChecked(function_ref<InFlightDiagnostic()> emitError,
                               MLIRContext *context, int intParam);
```

If these autogenerated methods are not desired, such as when they conflict with
a custom builder method, the `skipDefaultBuilders` field may be set to 1 to
signal that the default builders should not be generated.

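The split between an asserting `get` and a failable `getChecked` can be modeled
without MLIR. The following is only a sketch under stated assumptions:
`ToyIntType`, `verifyWidth`, `kMaxWidth`, and the string-based `EmitErrorFn`
callback are hypothetical names invented for illustration, and the real
generated builders return a null attribute or type rather than an empty
`std::optional`.

```c++
#include <cassert>
#include <functional>
#include <optional>
#include <string>

// Hypothetical type with a single parameter.
struct ToyIntType {
  unsigned width;
};

// Hypothetical error callback, standing in for the diagnostic emitter.
using EmitErrorFn = std::function<void(const std::string &)>;

// Assumed invariant for this sketch: 1 <= width <= kMaxWidth.
constexpr unsigned kMaxWidth = 64;

// Shared verification hook: reports the problem and returns false on failure.
bool verifyWidth(const EmitErrorFn &emitError, unsigned width) {
  if (width == 0 || width > kMaxWidth) {
    emitError("invalid width " + std::to_string(width));
    return false;
  }
  return true;
}

// Failable builder: verifies first and returns an empty value on error.
std::optional<ToyIntType> getChecked(const EmitErrorFn &emitError,
                                     unsigned width) {
  if (!verifyWidth(emitError, width))
    return std::nullopt;
  return ToyIntType{width};
}

// Asserting builder: invalid parameters are treated as a programmer error.
ToyIntType get(unsigned width) {
  bool ok = verifyWidth([](const std::string &) {}, width);
  assert(ok && "invalid parameters passed to get");
  (void)ok;
  return ToyIntType{width};
}
```

In this model, `getChecked` is the entry point to use when the parameters come
from untrusted input such as a parser, while `get` is appropriate when the
caller already knows the parameters are valid.
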
#### Custom builder methods

The default builder methods may cover a majority of the simple cases related to
construction, but when they cannot satisfy all of an attribute or type's needs,
additional builders may be defined via the `builders` field. The `builders`
field is a list of custom builders, either using `TypeBuilder` for types or
`AttrBuilder` for attributes, that are added to the attribute or type class. The
following showcases several examples of defining builders for a custom type
`MyType`. The process is the same for attributes, except that attributes use
`AttrBuilder` instead of `TypeBuilder`.

```tablegen
def MyType : ... {
  let parameters = (ins "int":$intParam);

  let builders = [
    TypeBuilder<(ins "int":$intParam)>,
    TypeBuilder<(ins CArg<"int", "0">:$intParam)>,
    TypeBuilder<(ins CArg<"int", "0">:$intParam), [{
      // Write the body of the `get` builder inline here.
      return Base::get($_ctxt, intParam);
    }]>,
    TypeBuilderWithInferredContext<(ins "Type":$typeParam), [{
      // This builder states that it can infer an MLIRContext instance from
      // its arguments.
      return Base::get(typeParam.getContext(), ...);
    }]>,
    TypeBuilder<(ins "int":$intParam), [{}], "IntegerType">,
  ];
}
```

In this example, we provide several different convenience builders that are
useful in different scenarios. The `ins` prefix is common to many function
declarations in ODS, which use a TableGen [`dag`](#tablegen-syntax). What
follows is a comma-separated list of types (quoted string or `CArg`) and names
prefixed with the `$` sign. The use of `CArg` allows for providing a default
value to that argument. Let's take a look at each of these builders individually.

The first builder will generate the declaration of a builder method that looks
like:

```tablegen
  let builders = [
    TypeBuilder<(ins "int":$intParam)>,
  ];
```

```c++
class MyType : /*...*/ {
  /*...*/
  static MyType get(::mlir::MLIRContext *context, int intParam);
};
```

This builder is identical to the one that will be automatically generated for
`MyType`. The `context` parameter is implicitly added by the generator, and is
used when building the Type instance (with `Base::get`). The distinction here is
that we can provide the implementation of this `get` method. With this style of
builder definition only the declaration is generated; the implementor of
`MyType` will need to provide a definition of `MyType::get`.

The second builder will generate the declaration of a builder method that looks
like:

```tablegen
  let builders = [
    TypeBuilder<(ins CArg<"int", "0">:$intParam)>,
  ];
```

```c++
class MyType : /*...*/ {
  /*...*/
  static MyType get(::mlir::MLIRContext *context, int intParam = 0);
};
```

The constraints here are identical to the first builder example except for the
fact that `intParam` now has a default value attached.

The third builder will generate the declaration of a builder method that looks
like:

```tablegen
  let builders = [
    TypeBuilder<(ins CArg<"int", "0">:$intParam), [{
      // Write the body of the `get` builder inline here.
      return Base::get($_ctxt, intParam);
    }]>,
  ];
```

```c++
class MyType : /*...*/ {
  /*...*/
  static MyType get(::mlir::MLIRContext *context, int intParam = 0);
};

MyType MyType::get(::mlir::MLIRContext *context, int intParam) {
  // Write the body of the `get` builder inline here.
  return Base::get(context, intParam);
}
```

This is identical to the second builder example, except that a
definition for the builder method will now be generated automatically using the
provided code block as the body. When specifying the body inline, `$_ctxt` may
be used to access the `MLIRContext *` parameter.

The fourth builder will generate the declaration of a builder method that looks
like:

```tablegen
  let builders = [
    TypeBuilderWithInferredContext<(ins "Type":$typeParam), [{
      // This builder states that it can infer an MLIRContext instance from
      // its arguments.
      return Base::get(typeParam.getContext(), ...);
    }]>,
  ];
```

```c++
class MyType : /*...*/ {
  /*...*/
  static MyType get(Type typeParam);
};

MyType MyType::get(Type typeParam) {
  // This builder states that it can infer an MLIRContext instance from its
  // arguments.
  return Base::get(typeParam.getContext(), ...);
}
```

In this builder example, the main difference from the third builder example
is that the `MLIRContext` parameter is no longer added. This is because
the use of `TypeBuilderWithInferredContext` indicates that the context
parameter is not necessary, as it can be inferred from the arguments to the
builder.

The fifth builder will generate the declaration of a builder method with a
custom return type, like:

```tablegen
  let builders = [
    TypeBuilder<(ins "int":$intParam), [{}], "IntegerType">,
  ];
```

```c++
class MyType : /*...*/ {
  /*...*/
  static IntegerType get(::mlir::MLIRContext *context, int intParam);
};
```

This generates a builder declaration the same as the first three examples, but
the return type of the builder is user-specified instead of the attribute or
type class. This is useful for defining builders of attributes and types that
may fold or canonicalize on construction.

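As a rough illustration of why a builder might return something other than its
own class, the standalone sketch below models a builder that canonicalizes on
construction. All names are hypothetical and unrelated to the MLIR API:
`ToyType` plays the role of a generic type handle, and `ToyVectorType::get`
folds a one-element vector to its element type, so it cannot promise to return
a `ToyVectorType`.

```c++
#include <cassert>
#include <string>

// Hypothetical generic type handle; the text stands in for uniqued storage.
struct ToyType {
  std::string text;
  bool operator==(const ToyType &other) const { return text == other.text; }
};

// Hypothetical vector type whose builder has a "custom return type".
struct ToyVectorType {
  // Returns ToyType rather than ToyVectorType because a vector with a single
  // element is canonicalized to the element type itself.
  static ToyType get(const ToyType &element, unsigned count) {
    assert(count > 0 && "a vector needs at least one element");
    if (count == 1)
      return element;
    return ToyType{"vector<" + std::to_string(count) + "x" + element.text +
                   ">"};
  }
};
```
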
### Parsing and Printing

If a mnemonic was specified, the `hasCustomAssemblyFormat` and `assemblyFormat`
fields may be used to specify the assembly format of an attribute or type. Attributes
and Types with no parameters need not use either of these fields, in which case
the syntax for the Attribute or Type is simply the mnemonic.

For each dialect, a pair of "dispatch" functions is created for attributes and
another for types: one for parsing and one for printing. These static functions
are placed alongside the class definitions and have the following function
signatures:

```c++
static ParseResult generatedAttributeParser(DialectAsmParser& parser, StringRef *mnemonic, Type attrType, Attribute &result);
static LogicalResult generatedAttributePrinter(Attribute attr, DialectAsmPrinter& printer);

static ParseResult generatedTypeParser(DialectAsmParser& parser, StringRef *mnemonic, Type &result);
static LogicalResult generatedTypePrinter(Type type, DialectAsmPrinter& printer);
```

The above functions should be called from the respective
`Dialect::parseAttribute`/`Dialect::printAttribute` and
`Dialect::parseType`/`Dialect::printType` methods of your dialect. Alternatively,
consider using the `useDefaultAttributePrinterParser` and
`useDefaultTypePrinterParser` ODS Dialect options if all attributes or types
define a mnemonic.

The `mnemonic`, `hasCustomAssemblyFormat`, and `assemblyFormat` fields are optional.
If none are defined, the generated code will not include any parsing or printing
code and omit the attribute or type from the dispatch functions above. In this
case, the dialect author is responsible for parsing/printing in the respective
`Dialect::parseAttribute`/`Dialect::printAttribute` and
`Dialect::parseType`/`Dialect::printType` methods.

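The role of the mnemonic in dispatching can be sketched with a small standalone
C++ model. This is an illustrative toy, not the generated code: `ToyDialect`,
`ToyType`, and `ParseFn` are hypothetical names, and the real dispatch functions
operate on MLIR's parser classes rather than on plain strings.

```c++
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical parse result: which class handled the text, and the raw body.
struct ToyType {
  std::string className;
  std::string body;
};

// Hypothetical per-class parse hook that receives the text after the mnemonic.
using ParseFn = std::function<ToyType(const std::string &body)>;

class ToyDialect {
public:
  // Registers the parse hook for one mnemonic.
  void registerType(const std::string &mnemonic, ParseFn parse) {
    bool inserted = parsers.emplace(mnemonic, std::move(parse)).second;
    assert(inserted && "mnemonic must be unique within the dialect");
    (void)inserted;
  }

  // Dispatches on the mnemonic. An empty result means "unknown mnemonic", so
  // the caller can fall back to hand-written parsing.
  std::optional<ToyType> parseType(const std::string &text) const {
    std::size_t split = text.find('<');
    std::string mnemonic = text.substr(0, split);
    std::string body =
        split == std::string::npos ? std::string() : text.substr(split);
    auto it = parsers.find(mnemonic);
    if (it == parsers.end())
      return std::nullopt;
    return it->second(body);
  }

private:
  std::map<std::string, ParseFn> parsers;
};
```

A class without a mnemonic has no key to register under in this model, which
mirrors why such attributes and types are omitted from the dispatch functions
and must be handled by the dialect author.
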
#### Using `hasCustomAssemblyFormat`

Attributes and types defined in ODS with a mnemonic can set
`hasCustomAssemblyFormat` to specify custom parsers and printers defined in C++.
When set to `1`, corresponding `parse` and `print` methods will be declared on
the Attribute or Type class, to be defined by the user.

For Types, these methods will have the form:

- `static Type MyType::parse(AsmParser &parser)`

- `void MyType::print(AsmPrinter &p) const`

For Attributes, these methods will have the form:

- `static Attribute MyAttr::parse(AsmParser &parser, Type attrType)`

- `void MyAttr::print(AsmPrinter &p) const`

#### Using `assemblyFormat`

Attributes and types defined in ODS with a mnemonic can define an
`assemblyFormat` to declaratively describe custom parsers and printers. The
assembly format consists of literals, variables, and directives.

- A literal is a keyword or valid punctuation enclosed in backticks, e.g.
  `` `keyword` `` or `` `<` ``.
- A variable is a parameter name preceded by a dollar sign, e.g. `$param0`,
  which captures one attribute or type parameter.
- A directive is a keyword followed by an optional argument list that defines
  special parser and printer behaviour.

```tablegen
// An example type with an assembly format.
def MyType : TypeDef<My_Dialect, "MyType"> {
  // Define a mnemonic to allow the dialect's parser hook to call into the
  // generated parser.
  let mnemonic = "my_type";

  // Define two parameters whose C++ types are indicated in string literals.
  let parameters = (ins "int":$count, "AffineMap":$map);

  // Define the assembly format. Surround the format with less `<` and greater
  // `>` so that MLIR's printer uses the pretty format.
  let assemblyFormat = "`<` $count `,` `map` `=` $map `>`";
}
```

The declarative assembly format for `MyType` results in the following format in
the IR:

```mlir
!my_dialect.my_type<42, map = affine_map<(i, j) -> (j, i)>>
```

##### Parameter Parsing and Printing

For many basic parameter types, no additional work is needed to define how these
parameters are parsed or printed.

- The default printer for any parameter is `$_printer << $_self`, where `$_self`
  is the C++ value of the parameter and `$_printer` is an `AsmPrinter`.
- The default parser for a parameter is
  `FieldParser<$cppClass>::parse($_parser)`, where `$cppClass` is the C++ type
  of the parameter and `$_parser` is an `AsmParser`.

Printing and parsing behaviour can be added to additional C++ types by
overloading these functions or by defining a `parser` and `printer` in an ODS
parameter class.

Example of overloading:

```c++
using MyParameter = std::pair<int, int>;

AsmPrinter &operator<<(AsmPrinter &printer, MyParameter param) {
  printer << param.first << " * " << param.second;
  // Return the printer so that calls can be chained.
  return printer;
}

template <> struct FieldParser<MyParameter> {
  static FailureOr<MyParameter> parse(AsmParser &parser) {
    int a, b;
    if (parser.parseInteger(a) || parser.parseStar() ||
        parser.parseInteger(b))
      return failure();
    return MyParameter(a, b);
  }
};
```

Example of using ODS parameter classes:

```tablegen
def MyParameter : TypeParameter<"std::pair<int, int>", "pair of ints"> {
  let printer = [{ $_printer << $_self.first << " * " << $_self.second }];
  let parser = [{ [&]() -> FailureOr<std::pair<int, int>> {
    int a, b;
    if ($_parser.parseInteger(a) || $_parser.parseStar() ||
        $_parser.parseInteger(b))
      return failure();
    return std::make_pair(a, b);
  }() }];
}
```

A type using this parameter with the assembly format `` `<` $myParam `>` `` will
look as follows in the IR:

```mlir
!my_dialect.my_type<42 * 24>
```

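A custom printer and parser for a parameter should round-trip: parsing the
printed text must yield the original value. The standalone sketch below checks
that property for the `a * b` syntax used above. It deliberately avoids the MLIR
API and works on plain strings, so `printParam`, `parseParam`, and
`checkRoundTrip` are hypothetical helpers rather than anything ODS generates.

```c++
#include <cassert>
#include <optional>
#include <sstream>
#include <string>
#include <utility>

using MyParameter = std::pair<int, int>;

// Prints the parameter in the `a * b` syntax.
std::string printParam(const MyParameter &param) {
  std::ostringstream os;
  os << param.first << " * " << param.second;
  return os.str();
}

// Parses `a * b`, returning an empty value on malformed input.
std::optional<MyParameter> parseParam(const std::string &text) {
  std::istringstream is(text);
  int a = 0, b = 0;
  char star = 0;
  if (!(is >> a >> star >> b) || star != '*')
    return std::nullopt;
  // Reject trailing characters other than whitespace.
  is >> std::ws;
  if (!is.eof())
    return std::nullopt;
  return MyParameter(a, b);
}

// Printing followed by parsing should reproduce the original value.
void checkRoundTrip(const MyParameter &param) {
  std::optional<MyParameter> reparsed = parseParam(printParam(param));
  assert(reparsed && *reparsed == param);
  (void)reparsed;
}
```
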
###### Non-POD Parameters

Parameters that aren't plain-old-data (e.g. references) may need to define a
`cppStorageType` to contain the data until it is copied into the allocator. For
example, `StringRefParameter` uses `std::string` as its storage type, whereas
`ArrayRefParameter` uses `SmallVector` as its storage type. The parsers for
these parameters are expected to return `FailureOr<$cppStorageType>`.

To add a custom conversion between the `cppStorageType` and the C++ type of the
parameter, parameters can override `convertFromStorage`, which by default is
`"$_self"` (i.e., it attempts an implicit conversion from `cppStorageType`).

###### Optional and Default-Valued Parameters

An optional parameter can be omitted from the assembly format of an attribute or
a type. An optional parameter is omitted when it is equal to its default value.
Optional parameters in the assembly format can be indicated by setting
`defaultValue`, a string of the C++ default value. If a value for the parameter
was not encountered during parsing, it is set to this default value. If a
parameter is equal to its default value, it is not printed. The comparison uses
the parameter's `comparator` field if one is specified, and the equality
operator otherwise.

When using `OptionalParameter`, the default value is set to the C++
default-constructed value for the C++ storage type. For example, `Optional<int>`
will be set to `std::nullopt` and `Attribute` will be set to `nullptr`. The
presence of these parameters is tested by comparing them to their "null" values.

An optional group is a set of elements optionally printed based on the presence
of an anchor. Only optional parameters or directives that only capture optional
parameters can be used in optional groups. The group in which the anchor is
placed is printed if the anchor is present; otherwise the other group is
printed. If a
directive that captures more than one optional parameter is used as the anchor,
the optional group is printed if any of the captured parameters is present. For
example, a `custom` directive may only be used as an optional group anchor if it
captures at least one optional parameter.

Suppose parameter `a` is an `IntegerAttr`.

```
( `(` $a^ `)` ) : (`x`)?
```

In the above assembly format, if `a` is present (non-null), then it will be
printed as `(5 : i32)`. If it is not present, it will be `x`. Directives that
are used inside optional groups are allowed only if all captured parameters are
also optional.

An optional parameter can also be specified with `DefaultValuedParameter`, which
specifies that a parameter should be omitted when it is equal to some given
value.

```tablegen
let parameters = (ins DefaultValuedParameter<"Optional<int>", "5">:$a);
let mnemonic = "default_valued";
let assemblyFormat = "(`<` $a^ `>`)?";
```

Which will look like:

```mlir
!test.default_valued     // a = 5
!test.default_valued<10> // a = 10
```

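The print and parse behaviour of the default-valued example above can be
mimicked in a few lines of standalone C++. This is a toy model of the semantics
only, with hypothetical helper names (`printDefaultValued`,
`parseDefaultValued`), and it assumes small non-negative values to keep the
parsing trivial.

```c++
#include <cassert>
#include <cctype>
#include <optional>
#include <string>

// Textual prefix and default value taken from the example above.
const std::string kPrefix = "!test.default_valued";
constexpr int kDefaultA = 5;

// Omits the parameter when it equals the default value.
std::string printDefaultValued(int a) {
  assert(a >= 0 && "this toy model only handles non-negative values");
  if (a == kDefaultA)
    return kPrefix;
  return kPrefix + "<" + std::to_string(a) + ">";
}

// Falls back to the default value when the parameter is absent. Returns an
// empty value on malformed input.
std::optional<int> parseDefaultValued(const std::string &text) {
  if (text.compare(0, kPrefix.size(), kPrefix) != 0)
    return std::nullopt;
  std::string rest = text.substr(kPrefix.size());
  if (rest.empty())
    return kDefaultA;
  if (rest.size() < 3 || rest.front() != '<' || rest.back() != '>')
    return std::nullopt;
  std::string digits = rest.substr(1, rest.size() - 2);
  for (char c : digits)
    if (!std::isdigit(static_cast<unsigned char>(c)))
      return std::nullopt;
  return std::stoi(digits);
}
```
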
For optional `Attribute` or `Type` parameters, the current MLIR context is
available through `$_ctxt`. E.g.

```tablegen
DefaultValuedParameter<"IntegerType", "IntegerType::get($_ctxt, 32)">
```

The value of parameters that appear __before__ the default-valued parameter in
the parameter declaration list are available as substitutions. E.g.

```tablegen
let parameters = (ins
  "IntegerAttr":$value,
  DefaultValuedParameter<"Type", "$value.getType()">:$type
);
```

737###### Attribute Self Type Parameter
738
739An attribute optionally has a trailing type after the assembly format of the
740attribute value itself. MLIR parses over the attribute value and optionally
741parses a colon-type before passing the `Type` into the dialect parser hook.
742
743```
744dialect-attribute  ::= `#` dialect-namespace `<` attr-data `>`
745                       (`:` type)?
746                     | `#` alias-name pretty-dialect-sym-body? (`:` type)?
747```
748
749`AttributeSelfTypeParameter` is an attribute parameter specially handled by the
750assembly format generator. Only one such parameter can be specified, and its
751value is derived from the trailing type. This parameter's default value is
752`NoneType::get($_ctxt)`.
753
754In order for the type to be printed by
755MLIR, however, the attribute must implement `TypedAttrInterface`. For example,
756
757```tablegen
758// This attribute has only a self type parameter.
759def MyExternAttr : AttrDef<MyDialect, "MyExtern", [TypedAttrInterface]> {
760  let parameters = (AttributeSelfTypeParameter<"">:$type);
761  let mnemonic = "extern";
762  let assemblyFormat = "";
763}
764```
765
766This attribute can look like:
767
768```mlir
769#my_dialect.extern // none
770#my_dialect.extern : i32
771#my_dialect.extern : tensor<4xi32>
772#my_dialect.extern : !my_dialect.my_type
773```

##### Assembly Format Directives

Attribute and type assembly formats have the following directives:

- `params`: capture all parameters of an attribute or type.
- `qualified`: mark a parameter to be printed with its leading dialect and
  mnemonic.
- `struct`: generate a "struct-like" parser and printer for a list of key-value
  pairs.
- `custom`: dispatch a call to user-defined parser and printer functions.
- `ref`: in a custom directive, references a previously bound variable.

###### `params` Directive

This directive is used to refer to all parameters of an attribute or type, except
for the attribute self type (which is handled separately from normal parameters).
When used as a top-level directive, `params` generates a parser and printer for a
comma-separated list of the parameters. For example:

```tablegen
def MyPairType : TypeDef<My_Dialect, "MyPairType"> {
  let parameters = (ins "int":$a, "int":$b);
  let mnemonic = "pair";
  let assemblyFormat = "`<` params `>`";
}
```

In the IR, this type will appear as:

```mlir
!my_dialect.pair<42, 24>
```

The `params` directive can also be passed to other directives, such as `struct`,
as an argument that refers to all parameters in place of explicitly listing all
parameters as variables.

###### `qualified` Directive

This directive can be used to wrap attribute or type parameters such that they
are printed in a fully qualified form, i.e., they include the dialect name and
mnemonic prefix.

For example:

```tablegen
def OuterType : TypeDef<My_Dialect, "MyOuterType"> {
  let parameters = (ins MyPairType:$inner);
  let mnemonic = "outer";
  let assemblyFormat = "`<` `pair` `:` $inner `>`";
}
def OuterQualifiedType : TypeDef<My_Dialect, "MyOuterQualifiedType"> {
  let parameters = (ins MyPairType:$inner);
  let mnemonic = "outer_qual";
  let assemblyFormat = "`<` `pair` `:` qualified($inner) `>`";
}
```

In the IR, the types will appear as:

```mlir
!my_dialect.outer<pair : <42, 24>>
!my_dialect.outer_qual<pair : !my_dialect.pair<42, 24>>
```

Optional parameters are omitted from the printed parameter list when they are
not present.

###### `struct` Directive

The `struct` directive accepts a list of variables to capture and will generate
a parser and printer for a comma-separated list of key-value pairs. If an
optional parameter is included in the `struct`, it can be elided. The variables
are printed in the order they are specified in the argument list **but can be
parsed in any order**. For example:

```tablegen
def MyStructType : TypeDef<My_Dialect, "MyStructType"> {
  let parameters = (ins StringRefParameter<>:$sym_name,
                        "int":$a, "int":$b, "int":$c);
  let mnemonic = "struct";
  let assemblyFormat = "`<` $sym_name `->` struct($a, $b, $c) `>`";
}
```

In the IR, this type can appear with any permutation of the order of the
parameters captured in the directive.

```mlir
!my_dialect.struct<"foo" -> a = 1, b = 2, c = 3>
!my_dialect.struct<"foo" -> b = 2, c = 3, a = 1>
```
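
The "parse in any order, print in declared order" behavior can be illustrated
with a small stand-alone model. This is only a sketch under simplifying
assumptions (integer values, the keys `a`, `b`, and `c`, no optional
parameters), and it is not the parser that `mlir-tblgen` generates. The names
`parseStruct`, `printStruct`, and `kDeclaredOrder` are hypothetical.

```c++
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Declared parameter order. Printing always follows this order.
static const std::vector<std::string> kDeclaredOrder = {"a", "b", "c"};

// Remove leading and trailing spaces.
static std::string trim(const std::string &s) {
  std::size_t begin = s.find_first_not_of(' ');
  if (begin == std::string::npos)
    return "";
  std::size_t end = s.find_last_not_of(' ');
  return s.substr(begin, end - begin + 1);
}

// Parse comma-separated `key = value` pairs given in any order. This sketch
// fails on malformed pairs and on unknown, duplicate, or missing keys.
std::optional<std::map<std::string, int>> parseStruct(const std::string &body) {
  std::map<std::string, int> values;
  std::stringstream stream(body);
  std::string item;
  while (std::getline(stream, item, ',')) {
    std::size_t eq = item.find('=');
    if (eq == std::string::npos)
      return std::nullopt;
    std::string key = trim(item.substr(0, eq));
    std::string value = trim(item.substr(eq + 1));
    bool known = false;
    for (const std::string &name : kDeclaredOrder)
      known = known || (name == key);
    if (!known || values.count(key) != 0 || value.empty() ||
        value.find_first_not_of("0123456789") != std::string::npos)
      return std::nullopt;
    values[key] = std::stoi(value);
  }
  if (values.size() != kDeclaredOrder.size())
    return std::nullopt;
  return values;
}

// Print the pairs in declared order, regardless of the order they were parsed.
std::string printStruct(const std::map<std::string, int> &values) {
  std::string text;
  for (const std::string &name : kDeclaredOrder) {
    if (!text.empty())
      text += ", ";
    text += name + " = " + std::to_string(values.at(name));
  }
  return text;
}
```

Parsing `b = 2, c = 3, a = 1` and printing the result yields
`a = 1, b = 2, c = 3`, so a round trip normalizes the order.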

Passing `params` as the only argument to `struct` makes the directive capture
all the parameters of the attribute or type. For the same type above, an
assembly format of `` `<` struct(params) `>` `` also captures `sym_name`, so the
type can be written as:

```mlir
!my_dialect.struct<b = 2, sym_name = "foo", c = 3, a = 1>
```

When printed, the parameters appear in the order in which they are declared in
the attribute's or type's `parameters` list.

###### `custom` and `ref` directive

The `custom` directive is used to dispatch calls to user-defined printer and
parser functions. For example, suppose we had the following type:

```tablegen
let parameters = (ins "int":$foo, "int":$bar);
let assemblyFormat = "custom<Foo>($foo) custom<Bar>($bar, ref($foo))";
```

The `custom` directive `custom<Foo>($foo)` will in the parser and printer
respectively generate calls to:

```c++
ParseResult parseFoo(AsmParser &parser, int &foo);
void printFoo(AsmPrinter &printer, int foo);
```

As you can see, by default parameters are passed into the parse function by
reference. This is only possible if the C++ type is default constructible.
If the C++ type is not default constructible, the parameter is wrapped in a
`FailureOr`. Therefore, given the following definition:

```tablegen
let parameters = (ins "NotDefaultConstructible":$foobar);
let assemblyFormat = "custom<Fizz>($foobar)";
```

It will generate calls expecting the following signature for `parseFizz`:

```c++
ParseResult parseFizz(AsmParser &parser, FailureOr<NotDefaultConstructible> &foobar);
```

A previously bound variable can be passed as a parameter to a `custom` directive
by wrapping it in a `ref` directive. In the previous example, `$foo` is bound by
the first directive. The second directive references it and expects the
following printer and parser signatures:

```c++
ParseResult parseBar(AsmParser &parser, int &bar, int foo);
void printBar(AsmPrinter &printer, int bar, int foo);
```

More complex C++ types can be used with the `custom` directive. The only caveat
is that the parameter for the parser must use the storage type of the parameter.
For example, `StringRefParameter` expects the parser and printer signatures as:

```c++
ParseResult parseStringParam(AsmParser &parser, std::string &value);
void printStringParam(AsmPrinter &printer, StringRef value);
```

The custom parser is considered to have failed if it returns failure or if any
bound parameters have failure values afterwards.

A string of C++ code can be used as a `custom` directive argument. When
generating the custom parser and printer call, the string is pasted as a
function argument. For example, `parseBar` and `printBar` can be re-used with
a constant integer:

```tablegen
let parameters = (ins "int":$bar);
let assemblyFormat = [{ custom<Bar>($bar, "1") }];
```

The string is pasted verbatim but with substitutions for `$_builder` and
`$_ctxt`. String literals can be used to parameterize custom directives.

### Verification

If the `genVerifyDecl` field is set, additional verification methods are
generated on the class.

- `static LogicalResult verify(function_ref<InFlightDiagnostic()> emitError, parameters...)`

This method is used to verify the parameters provided to the attribute or
type class on construction, and to emit any necessary diagnostics. It is
automatically invoked from the builders of the attribute or type class.

- `AttrOrType getChecked(function_ref<InFlightDiagnostic()> emitError, parameters...)`

As noted in the [Builders](#Builders) section, these methods are companions to
`get` builders that may fail. If the `verify` invocation fails when these
methods are called, they return nullptr instead of asserting.
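
The relationship between `verify`, `get`, and `getChecked` can be sketched in
plain C++. This is a simplified model rather than MLIR's implementation: the
type `NonZeroType`, the alias `EmitErrorFn`, and the free functions below are
hypothetical, `bool` stands in for `LogicalResult`, and an empty
`std::optional` stands in for a null attribute or type.

```c++
#include <cassert>
#include <functional>
#include <optional>
#include <string>

// Hypothetical stand-in for a type with a single parameter.
struct NonZeroType {
  unsigned value;
};

// Stand-in for `function_ref<InFlightDiagnostic()> emitError`.
using EmitErrorFn = std::function<void(const std::string &)>;

// Models `verify`: checks the parameters and reports a diagnostic on failure.
bool verifyNonZero(const EmitErrorFn &emitError, unsigned value) {
  if (value != 0)
    return true;
  emitError("expected a non-zero value");
  return false;
}

// Models `getChecked`: returns an empty result instead of asserting.
std::optional<NonZeroType> getChecked(const EmitErrorFn &emitError,
                                      unsigned value) {
  if (!verifyNonZero(emitError, value))
    return std::nullopt;
  return NonZeroType{value};
}

// Models `get`: the parameters are assumed valid, so failure is a bug.
NonZeroType get(unsigned value) {
  bool ok = verifyNonZero([](const std::string &) {}, value);
  assert(ok && "invalid parameters passed to get");
  (void)ok; // Avoid an unused-variable warning without assertions.
  return NonZeroType{value};
}
```

In this model, callers that handle untrusted input (such as a parser) would use
`getChecked` and test the result, while callers that construct known-good
values would use `get`.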

### Storage Classes

Somewhat alluded to in the sections above is the concept of a "storage class"
(often abbreviated to "storage"). Storage classes contain all of the data
necessary to construct and unique an attribute or type instance. These classes
are the "immortal" objects that get uniqued within an MLIRContext and get
wrapped by the `Attribute` and `Type` classes. Every Attribute or Type class has
a corresponding storage class, which can be accessed via the protected
`getImpl()` method.

In most cases the storage class is auto generated, but if necessary it can be
manually defined by setting the `genStorageClass` field to 0. The name and
namespace (defaults to `detail`) can additionally be controlled via the
`storageClass` and `storageNamespace` fields.

#### Defining a storage class

User defined storage classes must adhere to the following:

- Inherit from the base type storage class of `AttributeStorage` or
  `TypeStorage` respectively.
- Define a type alias, `KeyTy`, that maps to a type that uniquely identifies an
  instance of the derived type. For example, this could be a `std::tuple` of all
  of the storage parameters.
- Provide a construction method that is used to allocate a new instance of the
  storage class.
  - `static Storage *construct(StorageAllocator &allocator, const KeyTy &key)`
- Provide a comparison method between an instance of the storage and the
  `KeyTy`.
  - `bool operator==(const KeyTy &) const`
- Provide a method to generate the `KeyTy` from a list of arguments passed to
  the uniquer when building an Attribute or Type. (Note: This is only necessary
  if the `KeyTy` cannot be directly constructed from these arguments).
  - `static KeyTy getKey(Args &&...args)`
- Provide a method to hash an instance of the `KeyTy`. (Note: This is not
  necessary if an `llvm::DenseMapInfo<KeyTy>` specialization exists)
  - `static llvm::hash_code hashKey(const KeyTy &)`
- Provide a method to generate the `KeyTy` from an instance of the storage class.
  - `KeyTy getAsKey() const`

Let's look at an example:

```c++
/// Here we define a storage class for a ComplexType, that holds a non-zero
/// integer and an integer type.
struct ComplexTypeStorage : public TypeStorage {
  ComplexTypeStorage(unsigned nonZeroParam, Type integerType)
      : nonZeroParam(nonZeroParam), integerType(integerType) {}

  /// The hash key for this storage is a pair of the integer and type params.
  using KeyTy = std::pair<unsigned, Type>;

  /// Define the comparison function for the key type.
  bool operator==(const KeyTy &key) const {
    return key == KeyTy(nonZeroParam, integerType);
  }

  /// Define a hash function for the key type.
  /// Note: This isn't necessary because std::pair, unsigned, and Type all have
  /// hash functions already available.
  static llvm::hash_code hashKey(const KeyTy &key) {
    return llvm::hash_combine(key.first, key.second);
  }

  /// Define a construction function for the key type.
  /// Note: This isn't necessary because KeyTy can be directly constructed with
  /// the given parameters.
  static KeyTy getKey(unsigned nonZeroParam, Type integerType) {
    return KeyTy(nonZeroParam, integerType);
  }

  /// Define a construction method for creating a new instance of this storage.
  static ComplexTypeStorage *construct(StorageAllocator &allocator, const KeyTy &key) {
    return new (allocator.allocate<ComplexTypeStorage>())
        ComplexTypeStorage(key.first, key.second);
  }

  /// Construct an instance of the key from this storage class.
  KeyTy getAsKey() const {
    return KeyTy(nonZeroParam, integerType);
  }

  /// The parametric data held by the storage class.
  unsigned nonZeroParam;
  Type integerType;
};
```
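
To see how `KeyTy`, `hashKey`, `operator==`, and `construct` cooperate during
uniquing, the following is a simplified stand-alone model. It is not MLIR's
`StorageUniquer`: the classes `PairStorage` and `ToyUniquer` are hypothetical,
and a vector of `std::unique_ptr` stands in for `StorageAllocator`.

```c++
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical storage class keyed on a (width, tag) pair.
struct PairStorage {
  using KeyTy = std::pair<unsigned, int>;
  using Allocator = std::vector<std::unique_ptr<PairStorage>>;

  PairStorage(unsigned width, int tag) : width(width), tag(tag) {}

  // Compare an existing storage instance against a lookup key.
  bool operator==(const KeyTy &key) const { return key == KeyTy(width, tag); }

  // Hash a key so the uniquer can find candidate instances.
  static std::size_t hashKey(const KeyTy &key) {
    return std::hash<unsigned>()(key.first) * 31 + std::hash<int>()(key.second);
  }

  // Allocate a new instance owned by the allocator.
  static PairStorage *construct(Allocator &allocator, const KeyTy &key) {
    allocator.push_back(std::make_unique<PairStorage>(key.first, key.second));
    return allocator.back().get();
  }

  unsigned width;
  int tag;
};

// Toy uniquer: returns the existing instance for a key, or constructs one.
class ToyUniquer {
public:
  PairStorage *get(unsigned width, int tag) {
    PairStorage::KeyTy key(width, tag);
    std::vector<PairStorage *> &bucket = buckets[PairStorage::hashKey(key)];
    for (PairStorage *storage : bucket)
      if (*storage == key)
        return storage;
    PairStorage *storage = PairStorage::construct(allocator, key);
    bucket.push_back(storage);
    return storage;
  }

  // Number of distinct storage instances created so far.
  std::size_t size() const { return allocator.size(); }

private:
  PairStorage::Allocator allocator;
  std::unordered_map<std::size_t, std::vector<PairStorage *>> buckets;
};
```

Because equal keys always map to the same storage pointer, two handles wrapping
that pointer can be compared with a single pointer comparison.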

### Mutable attributes and types

Attributes and Types are immutable objects uniqued within an MLIRContext. That
being said, some parameters may be treated as "mutable" and modified after
construction. Mutable parameters should be reserved for parameters that cannot
be reasonably initialized during construction time. Given the mutable component,
these parameters do not take part in the uniquing of the Attribute or Type.

TODO: Mutable parameters are currently not supported in the declarative
specification of attributes and types, and thus require defining the Attribute
or Type class in C++.

#### Defining a mutable storage

In addition to the base requirements for a storage class, instances with a
mutable component must additionally adhere to the following:

- The mutable component must not participate in the storage `KeyTy`.
- Provide a mutation method that is used to modify an existing instance of the
  storage. This method modifies the mutable component based on arguments, using
  `allocator` for any newly dynamically-allocated storage, and indicates whether
  the modification was successful.
  - `LogicalResult mutate(StorageAllocator &allocator, Args &&...args)`

Let's define a simple storage for recursive types, where a type is identified by
its name and may contain another type including itself.

```c++
/// Here we define a storage class for a RecursiveType that is identified by its
/// name and contains another type.
struct RecursiveTypeStorage : public TypeStorage {
  /// The type is uniquely identified by its name. Note that the contained type
  /// is _not_ a part of the key.
  using KeyTy = StringRef;

  /// Construct the storage from the type name. Explicitly initialize the
  /// containedType to nullptr, which is used as marker for the mutable
  /// component being not yet initialized.
  RecursiveTypeStorage(StringRef name) : name(name), containedType(nullptr) {}

  /// Define the comparison function.
  bool operator==(const KeyTy &key) const { return key == name; }

  /// Define a construction method for creating a new instance of the storage.
  static RecursiveTypeStorage *construct(StorageAllocator &allocator,
                                         const KeyTy &key) {
    // Note that the key string is copied into the allocator to ensure it
    // remains live as long as the storage itself.
    return new (allocator.allocate<RecursiveTypeStorage>())
        RecursiveTypeStorage(allocator.copyInto(key));
  }

  /// Define a mutation method for changing the type after it is created. In
  /// many cases, we only want to set the mutable component once and reject
  /// any further modification, which can be achieved by returning failure from
  /// this function.
  LogicalResult mutate(StorageAllocator &, Type body) {
    // If the contained type has been initialized already, and the call tries
    // to change it, reject the change.
    if (containedType && containedType != body)
      return failure();

    // Change the body successfully.
    containedType = body;
    return success();
  }

  StringRef name;
  Type containedType;
};
```

#### Type class definition

Having defined the storage class, we can define the type class itself.
`Type::TypeBase` provides a `mutate` method that forwards its arguments to the
`mutate` method of the storage and ensures the mutation happens safely.

```c++
class RecursiveType : public Type::TypeBase<RecursiveType, Type,
                                            RecursiveTypeStorage> {
public:
  /// Inherit parent constructors.
  using Base::Base;

  /// Creates an instance of the Recursive type. This only takes the type name
  /// and returns the type with uninitialized body.
  static RecursiveType get(MLIRContext *ctx, StringRef name) {
    // Call into the base to get a uniqued instance of this type. The parameter
    // (name) is passed after the context.
    return Base::get(ctx, name);
  }

  /// Now we can change the mutable component of the type. This is an instance
  /// method callable on an already existing RecursiveType.
  void setBody(Type body) {
    // Call into the base to mutate the type.
    LogicalResult result = Base::mutate(body);

    // Most types expect the mutation to always succeed, but types can implement
    // custom logic for handling mutation failures.
    assert(succeeded(result) &&
           "attempting to change the body of an already-initialized type");

    // Avoid unused-variable warning when building without assertions.
    (void) result;
  }

  /// Returns the contained type, which may be null if it has not been
  /// initialized yet.
  Type getBody() { return getImpl()->containedType; }

  /// Returns the name.
  StringRef getName() { return getImpl()->name; }
};
```
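
The key property of a mutable component is that it does not affect identity:
looking the type up by name returns the same instance before and after the body
is set. The sketch below models this without MLIR. `ToyRecursiveStorage` and
`ToyContext` are hypothetical stand-ins, `bool` replaces `LogicalResult`, and a
`std::map` replaces the context's uniquer.

```c++
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical storage identified only by its name.
struct ToyRecursiveStorage {
  explicit ToyRecursiveStorage(std::string name) : name(std::move(name)) {}

  // Set the body once. Reject attempts to change it to a different value.
  bool mutate(const ToyRecursiveStorage *body) {
    if (containedType && containedType != body)
      return false;
    containedType = body;
    return true;
  }

  std::string name;
  // Mutable component: not part of the key, null until initialized.
  const ToyRecursiveStorage *containedType = nullptr;
};

// Hypothetical context that uniques storages by name.
class ToyContext {
public:
  ToyRecursiveStorage *get(const std::string &name) {
    std::unique_ptr<ToyRecursiveStorage> &slot = storages[name];
    if (!slot)
      slot = std::make_unique<ToyRecursiveStorage>(name);
    return slot.get();
  }

private:
  std::map<std::string, std::unique_ptr<ToyRecursiveStorage>> storages;
};
```

Setting the body of `"node"` to itself succeeds, a later lookup of `"node"`
sees that body, and a second attempt with a different body is rejected, which
mirrors the behavior of `RecursiveTypeStorage::mutate` above.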

### Extra declarations

The declarative Attribute and Type definitions try to auto-generate as much
logic and methods as possible. With that said, there will always be long-tail
cases that won't be covered. For such cases, `extraClassDeclaration` and
`extraClassDefinition` can be used. Code within the `extraClassDeclaration`
field will be copied literally to the generated C++ Attribute or Type class.
Code within `extraClassDefinition` will be added to the generated source file
inside the class's C++ namespace. The substitution `$cppClass` will be replaced
by the Attribute or Type's C++ class name.

Note that these are mechanisms intended for long-tail cases by power users; for
not-yet-implemented widely-applicable cases, improving the infrastructure is
preferable.

### Registering with the Dialect

Once the attributes and types have been defined, they must then be registered
with the parent `Dialect`. This is done via the `addAttributes` and `addTypes`
methods. Note that when registering, the full definition of the storage classes
must be visible.

```c++
void MyDialect::initialize() {
  /// Add the defined attributes to the dialect.
  addAttributes<
#define GET_ATTRDEF_LIST
#include "MyDialect/Attributes.cpp.inc"
  >();

  /// Add the defined types to the dialect.
  addTypes<
#define GET_TYPEDEF_LIST
#include "MyDialect/Types.cpp.inc"
  >();
}
```