# Linalg OpDSL

**_Warning: Linalg's OpDSL is currently being [deprecated](https://discourse.llvm.org/t/how-to-add-custom-linalg-named-ops-using-opdsl/83200/2),
with its operations slowly [being moved](https://github.com/llvm/llvm-project/pull/115319) into TableGen's ODS format.
Please refer to the [MLIR Restructuring discussion](https://discourse.llvm.org/t/rfc-mlir-project-charter-and-restructuring/82896)
for more in-depth information._**

Python-based DSL for authoring Linalg op definitions and generating
`linalg.generic` IR based on them for samples.

The Linalg OpDSL is a high-level DSL for constructing structured op definitions
in a way that can be exported to built-in, named structured ops via
[YAML-based definitions](_index.md/#yaml-gen) or used interactively to emit
corresponding `linalg.generic` IR for the composition.
## Basic usage

The tool is bundled with the MLIR Python bindings. To use from the CMake build
tree, MLIR must be built with Python bindings enabled
(`-DMLIR_ENABLE_BINDINGS_PYTHON=ON`). Then add the `python` directory in the
build tree to your `PYTHONPATH` environment variable (e.g. `export
PYTHONPATH=$PWD/build/tools/mlir/python_packages/mlir_core`). Optionally, use an
installed MLIR package, if available, to avoid building.

```shell
# Dump the `core_named_ops.py` module as YAML.
python -m mlir.dialects.linalg.opdsl.dump_oplib .ops.core_named_ops
```

Alternatively, run the `$PWD/build/bin/update_core_linalg_named_ops.sh` script,
which is available after building the `mlir-linalg-ods-yaml-gen` target. The tool
is meant for use during both development and runtime, but not as a build tool of
the core compiler: in order to export static named op definitions to be built as
part of the compiler, the corresponding Linalg dialect YAML file must be updated
and reviewed. TODO: Develop a script to automate op updates to these files.

## Language Guide

The language presented here is loosely inspired by the
[Tensor Comprehensions](https://arxiv.org/pdf/1802.04730.pdf) work, adapted to
represent linalg structured ops.

This tool is new and rapidly evolving. For language examples, refer to the
built-in ops in the `mlir.tools.linalg_opdsl.ops` package
(`lib/Bindings/Python/mlir/tools/linalg_opdsl/ops` in the repository).

Using a matmul as an example, we will decompose the language:

```python
T1 = TV.T1
T2 = TV.T2
U = TV.U

@linalg_structured_op
def matmul(A=TensorDef(T1, S.M, S.K),
           B=TensorDef(T2, S.K, S.N),
           C=TensorDef(U, S.M, S.N, output=True)):
  """Performs a matrix multiplication of two 2D inputs.

  Numeric casting is performed on the operands to the inner multiply, promoting
  them to the same data type as the accumulator/output.
  """
  domain(D.m, D.n, D.k)
  defines(Canonicalizer)
  implements(ContractionOpInterface)
  C[D.m, D.n] += TypeFn.cast_signed(
      U, A[D.m, D.k]) * TypeFn.cast_signed(U, B[D.k, D.n])
```

Here we have a simple type polymorphic contraction that takes arguments `A` and
`B` and outputs `C`. Each is bound to a `TensorDef`, which specifies:

*   The symbolic element type (`T1`, `T2`, `U` above).
*   Symbolic shape expressions with symbols that are bound globally for the op
    (note that in this simple example, the shape expressions are just symbol
    references, but they are permitted to be a constrained set of affine
    expressions).
*   Usage (`output=True`).

The docstring will be transferred to the op definition verbatim.
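
The assignment in the body can be read as a loop nest over the domain
dimensions. The following plain-Python sketch (not OpDSL code; `float()` stands
in for the `cast_signed` promotion to the accumulator type) shows the semantics
the definition describes:

```python
def matmul_reference(A, B):
    """Reference semantics of the matmul definition above (nested lists)."""
    M, K, N = len(A), len(B), len(B[0])
    C = [[0.0] * N for _ in range(M)]
    for m in range(M):          # parallel dimension D.m
        for n in range(N):      # parallel dimension D.n
            for k in range(K):  # reduction dimension D.k
                # cast_signed promotes both operands to the output element
                # type; float() stands in for that promotion here.
                C[m][n] += float(A[m][k]) * float(B[k][n])
    return C
```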

An explicit iteration domain dimension order can be declared for the op via
`domain(D.d0[, D.d1...])`.

Special identifying op interfaces can be declared for the op via
`implements(interface1[, interface2...])`.

Extra method definitions can be declared for the op via
`defines(definition1[, definition2...])`.

## Parameters

Structured operations take two types of runtime parameters, namely scalars and
tensors. While scalars are inputs only, a tensor may be marked as an output.
Assignment expressions index the tensor parameters to access the individual
elements, while scalars can be accessed directly.

The following example demonstrates the use of the two parameter types:

```python
@linalg_structured_op
def copy_and_scale(val=ScalarDef(T),
                   I=TensorDef(T, S.M, S.K),
                   O=TensorDef(T, S.M, S.K, output=True)):
  """Scale the input by the scalar value and store the result"""
  O[D.m, D.n] = I[D.m, D.n] * val
```

The operation scales the elements of the input tensor `I` by the value `val`
and writes the result to the output tensor `O`. The scalar `val` is bound to a
`ScalarDef`, which specifies the type of the scalar operand. The tensors are
bound to a `TensorDef` as demonstrated by the matmul example. All parameters
appear in the parameter list of the operation:

```python
copy_and_scale(val, in_tensor, outs=[out_tensor])
```
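
As a rough plain-Python analogue (a hypothetical helper, not part of OpDSL),
the scalar is used directly in the expression while the tensor operands are
indexed element by element:

```python
def copy_and_scale_reference(val, I):
    # The scalar `val` is read directly; the tensor `I` is indexed per element.
    return [[I[m][n] * val for n in range(len(I[0]))] for m in range(len(I))]
```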

## Index Attributes

Index attributes are compile-time constant parameters only accessible in index
expressions. They can be used to parameterize the access pattern of a structured
operation, for example, by setting its strides. They cannot take part in the
actual computation.

The following example demonstrates the use of index attributes:

```python
@linalg_structured_op
def strided_copy(I=TensorDef(T, S.IH, S.IW),
                 O=TensorDef(T, S.OH, S.OW, output=True),
                 strides=IndexAttrDef(S.SH, S.SW, default=[1, 1])):
  """Copy a subset of the input tensor elements to the output tensor"""
  O[D.oh, D.ow] = I[D.oh * S.SH, D.ow * S.SW]
```

The operation implements a strided copy from the input tensor `I` to the output
tensor `O`. The `strides` attribute is bound to an `IndexAttrDef`. It defines
the symbols `S.SH` and `S.SW`, which are used to index the input tensor `I`.
When instantiating the operation, the attribute is set using a named argument:

```python
strided_copy(in_tensor, outs=[out_tensor], strides=[1, 2])
```

The `strides` vector elements substitute the symbols `S.SH` and `S.SW` in the
index expressions of the operation instance. If no strides are provided, the
`default` vector elements are used instead.

Index attributes are currently limited to integer vectors and are only
accessible in index expressions. An operation may have multiple attributes, all
of them placed at the end of the parameter list after the output tensors.
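
A plain-Python sketch of the strided copy semantics can make the substitution
concrete. Note one assumption in the sketch: in linalg the output shape comes
from the output tensor, whereas here it is derived from the input shape and the
strides purely for illustration:

```python
def strided_copy_reference(I, strides=(1, 1)):
    """Sketch of O[oh, ow] = I[oh * SH, ow * SW] over nested lists."""
    SH, SW = strides
    # Assumed output sizes: every in-bounds strided index is produced.
    OH = (len(I) + SH - 1) // SH
    OW = (len(I[0]) + SW - 1) // SW
    return [[I[oh * SH][ow * SW] for ow in range(OW)] for oh in range(OH)]
```

With `strides=(1, 2)` every second column of the input is copied, matching the
`strides=[1, 2]` instantiation above.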

## Shape-Only Tensors

Structured operations derive the iteration space given the sizes of the input
and output tensors. Certain operations need shape-only tensors that are not
accessed and exist purely for the sake of specifying the iteration domain. An
example is the pooling operation that takes a shape-only tensor to define the
iteration space of the reduction. As shape-only tensors have no uses, the
`TensorDef` takes an additional optional `index_dims` parameter to map the shape
to index dimensions.

The following example demonstrates the index dimension annotation:

```python
@linalg_structured_op
def pooling_poly(
    I=TensorDef(T1, S.N, S.H, S.W, S.C),
    K=TensorDef(T2, S.KH, S.KW, index_dims=[D.kh, D.kw]),
    O=TensorDef(U, S.N, S.OH, S.OW, S.C, output=True),
    strides=IndexAttrDef(S.SH, S.SW, default=[1, 1]),
    dilations=IndexAttrDef(S.DH, S.DW, default=[1, 1])):
  O[D.n, D.oh, D.ow, D.c] += TypeFn.cast_signed(U,
          I[D.n, D.oh * S.SH + D.kh * S.DH, D.ow * S.SW + D.kw * S.DW, D.c])
```

The pooling operation does not access the shape-only tensor `K`. Instead, the
shapes `S.KH` and `S.KW` specify the iteration domain for the reduction
dimensions `D.kh` and `D.kw`.
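
The role of the shape-only tensor can be sketched in plain Python. The 2D
sum-pooling sketch below (batch and channel dimensions dropped for brevity, and
the output sizes derived from the standard pooling formula as an assumption)
shows that the kernel contributes only its shape, which bounds the reduction
loops:

```python
def pooling_sum_reference(I, kernel_shape, strides=(1, 1), dilations=(1, 1)):
    """2D sum-pooling sketch; `kernel_shape` plays the role of K's shape."""
    KH, KW = kernel_shape
    SH, SW = strides
    DH, DW = dilations
    # Assumed output sizes (standard pooling size formula).
    OH = (len(I) - (KH - 1) * DH - 1) // SH + 1
    OW = (len(I[0]) - (KW - 1) * DW - 1) // SW + 1
    O = [[0] * OW for _ in range(OH)]
    for oh in range(OH):
        for ow in range(OW):
            # kh/kw iterate over the kernel *shape*; the kernel values
            # themselves are never read, mirroring the shape-only tensor K.
            for kh in range(KH):
                for kw in range(KW):
                    O[oh][ow] += I[oh * SH + kh * DH][ow * SW + kw * DW]
    return O
```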

## Assignments

The bulk of the language consists of assignment expressions of the form above.
The iteration dimension order is determined lexically based on the order
encountered in the expression (following operator precedence if math operators
are used). TODO: Introduce a directive to fix the dimension bindings.

Reduction dimensions are inferred to be any dimensions on the RHS that are not
on the LHS.

A number of unary and binary arithmetic functions are supported:

*   `BinaryFn.add(a, b)` (also via overloading the binary `+` operator)
*   `BinaryFn.mul(a, b)` (also via overloading the binary `*` operator)
*   `BinaryFn.max_signed(a, b)`
*   `BinaryFn.min_signed(a, b)`
*   `BinaryFn.sub(a, b)` (also via overloading the binary `-` operator)
*   `BinaryFn.max_unsigned(a, b)`
*   `BinaryFn.min_unsigned(a, b)`
*   `UnaryFn.exp(a)`
*   `UnaryFn.log(a)`

As the integer types are signless, signedness is implemented by different
functions that treat integers as signed or unsigned values.

A subset of the arithmetic functions is supported in reductions. These
reduction functions can appear as the outermost function on the RHS:

*   `ReduceFn.add` (also via overloading the inplace `+=` on a LHS)
*   `ReduceFn.mul`
*   `ReduceFn.max_signed`
*   `ReduceFn.min_signed`
*   `ReduceFn.max_unsigned`
*   `ReduceFn.min_unsigned`

As the integer types are signless, signedness is implemented by different
functions that treat integers as signed or unsigned values.

Additionally, type conversion functions cast an operand to a target type:

*   `TypeFn.cast_signed(TypeVar, operand)`
*   `TypeFn.cast_unsigned(TypeVar, operand)`

As the integer types are signless, signedness is implemented by different
functions that treat integers as signed (`TypeFn.cast_signed`) or unsigned
(`TypeFn.cast_unsigned`) values.
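
The distinction is visible when widening the same bit pattern. A plain-Python
sketch (hypothetical helpers modeling the sign-extending and zero-extending
behavior of `arith.extsi`/`arith.extui` on an `i8` value):

```python
def cast_signed_i8_to_i16(bits):
    # Treat the 8-bit pattern as signed, then sign-extend (cf. arith.extsi).
    value = bits - 256 if bits >= 128 else bits
    return value & 0xFFFF  # back to a signless 16-bit pattern

def cast_unsigned_i8_to_i16(bits):
    # Treat the 8-bit pattern as unsigned, then zero-extend (cf. arith.extui).
    return bits & 0xFF
```

The pattern `0xFF` becomes `0xFFFF` (i.e. -1) under the signed cast but
`0x00FF` (i.e. 255) under the unsigned cast.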

There are also special forms:

*   `const(value)` returns a constant value.
*   `index(dim)` returns the iteration index in the given dimension `dim`.

## Function Attributes

Function attributes are compile-time constant function parameters. They can be
used to parameterize the computation performed by a structured operation, for
example, to support signed and unsigned computations.

The following example demonstrates the use of function attributes:

```python
@linalg_structured_op
def elemwise_binary(
    lhs=TensorDef(T1),
    rhs=TensorDef(T2),
    O=TensorDef(U, output=True),
    fun=BinaryFnAttrDef(default=BinaryFn.add),
    cast=TypeFnAttrDef(default=TypeFn.cast_signed)):
  O[None] = fun(cast(U, lhs[None]), cast(U, rhs[None]))
```
251
252The `fun` and `cast` function attributes by default are aliases for their
253default values `BinaryFn.add` and `TypeFn.cast_signed`, respectively. When
254instantiating the operation, the function attributes may be set to other
255functions using optional named arguments:
256
257```python
258elemwise_binary(lhs, rhs, outs=[out_tensor],
259                fun=BinaryFn.mul, cast=TypeFn.cast_unsigned)
260```
261
262In the example, the `fun` and `cast` arguments adapt the body of the operation
263to implement multiplication and unsigned casts instead of addition and signed
264casts.
265
266OpDSL supports unary, binary, and type conversion function attributes. An
267operation can take multiple attributes of different kinds placed at the end of
268the parameter list.
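
The mechanism is analogous to default function arguments in plain Python, where
ordinary callables stand in for `BinaryFn`/`TypeFn` (a sketch, not OpDSL code):

```python
import operator

def elemwise_binary_reference(lhs, rhs, fun=operator.add, cast=float):
    # The function attributes choose the body of the operation; here they
    # are plain callables with the same defaults as the definition above.
    return fun(cast(lhs), cast(rhs))
```

Calling it without the optional arguments adds; passing `fun` swaps in a
different body, just as `fun=BinaryFn.mul` does in the instantiation above.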

## Types

All types in assignment expressions are late bound based on the actual input
and output types of constructed ops. The exceptions are the predefined types
`I32`, `I64`, `F32`, and `F64`. These hardwired types enable intermediate
computations with a type that is independent of the input and output types. For
example, parts of a floating point computation may require double precision
arithmetic despite all inputs and outputs being single precision values.
Assignment expressions with no `TypeFn.cast_signed` calls will generally require
uniform types throughout and will fail to verify if violated. The presence of a
`TypeFn.cast_signed` or `TypeFn.cast_unsigned` allows for a limited form of
numeric type conversion between element types that can be derived from inputs
and outputs (and in the future, attributes). `TypeFn.cast_signed` calls with a
`TypeVar` first argument are emitted as `type_fn` primitives in the YAML
definition.

Casting will perform `int<->float` and `index->int` type conversions and will
perform any necessary extension or truncation within the type family. The
integer types themselves are signless, and signedness is implemented by
functions/operations. The `TypeFn.cast_signed` function treats all integers as
signed, while `TypeFn.cast_unsigned` treats them as unsigned.

The following examples illustrate the lowering of signed and unsigned functions:

*   cast_signed(I32 -> I64) -> `arith.ExtSIOp`
*   cast_signed(F32 -> I32) -> `arith.FPToSIOp`
*   cast_unsigned(I32 -> I64) -> `arith.ExtUIOp`
*   cast_unsigned(F32 -> I32) -> `arith.FPToUIOp`
*   max_signed -> `arith.MaxSIOp`
*   max_unsigned -> `arith.MaxUIOp`

Not all functions are applicable for all numeric types, and on mismatch, op
verification will fail.

## Pointwise Computations

Pointwise computations are expressible in a rank polymorphic form that supports
arbitrary ranked operands (all of them need to have the same rank) with a
single operation definition.

An example for a rank polymorphic operation is `fill`:

```python
@linalg_structured_op
def fill(value=ScalarDef(T1),
         O=TensorDef(U, output=True)):
  O[None] = TypeFn.cast_signed(U, value)
```

The operation sets the elements of the output tensor `O` to `value`. All
operands are either scalars or rank zero tensors that are accessed using the
index `None`. The operation thus performs a scalar computation that trivially
extends to a multi-dimensional pointwise computation. As a result, we may use
`fill` with arbitrary ranked output tensors:

```python
tensor_2d = tensor.EmptyOp([4, 8], f32)
tensor_3d = tensor.EmptyOp([4, 8, 16], f32)
fill(value, outs=[tensor_2d])
fill(value, outs=[tensor_3d])
```
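
Rank polymorphism can be pictured with a small plain-Python sketch
(a hypothetical helper, not part of OpDSL), where the same scalar assignment
recursively extends to a result of any rank:

```python
def fill_reference(value, shape):
    # Rank-polymorphic fill: the scalar assignment extends to any rank.
    if not shape:
        return value  # rank-zero case: just the scalar itself
    return [fill_reference(value, shape[1:]) for _ in range(shape[0])]
```

Here `shape=[]` yields the scalar (the rank-zero case indexed by `None`), while
`shape=[4, 8]` or `shape=[4, 8, 16]` yield 2D and 3D results from the same
definition.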
331