1# Operation Canonicalization 2 3Canonicalization is an important part of compiler IR design: it makes it easier 4to implement reliable compiler transformations and to reason about what is 5better or worse in the code, and it forces interesting discussions about the 6goals of a particular level of IR. Dan Gohman wrote 7[an article](https://sunfishcode.github.io/blog/2018/10/22/Canonicalization.html) 8exploring these issues; it is worth reading if you're not familiar with these 9concepts. 10 11Most compilers have canonicalization passes, and sometimes they have many 12different ones (e.g. instcombine, dag combine, etc in LLVM). Because MLIR is a 13multi-level IR, we can provide a single canonicalization infrastructure and 14reuse it across many different IRs that it represents. This document describes 15the general approach, global canonicalizations performed, and provides sections 16to capture IR-specific rules for reference. 17 18[TOC] 19 20## General Design 21 22MLIR has a single canonicalization pass, which iteratively applies the 23canonicalization patterns of all loaded dialects in a greedy way. 24Canonicalization is best-effort and not guaranteed to bring the entire IR in a 25canonical form. It applies patterns until either fixpoint is reached or the 26maximum number of iterations/rewrites (as specified via pass options) is 27exhausted. This is for efficiency reasons and to ensure that faulty patterns 28cannot cause infinite looping. 29 30Canonicalization patterns are registered with the operations themselves, which 31allows each dialect to define its own set of operations and canonicalizations 32together. 33 34Some important things to think about w.r.t. canonicalization patterns: 35 36* The goal of canonicalization is to make subsequent analyses and 37 optimizations more effective. Therefore, performance improvements are not 38 necessary for canonicalization. 39 40* Pass pipelines should not rely on the canonicalizer pass for correctness. 41 They should work correctly with all instances of the canonicalization pass 42 removed. 43 44* Repeated applications of patterns should converge. Unstable or cyclic 45 rewrites are considered a bug: they can make the canonicalizer pass less 46 predictable and less effective (i.e., some patterns may not be applied) and 47 prevent it from converging. 48 49* It is generally better to canonicalize towards operations that have fewer 50 uses of a value when the operands are duplicated, because some patterns only 51 match when a value has a single user. For example, it is generally good to 52 canonicalize "x + x" into "x * 2", because this reduces the number of uses 53 of x by one. 54 55* It is always good to eliminate operations entirely when possible, e.g. by 56 folding known identities (like "x + 0 = x"). 57 58* Pattens with expensive running time (i.e. have O(n) complexity) or 59 complicated cost models don't belong to canonicalization: since the 60 algorithm is executed iteratively until fixed-point we want patterns that 61 execute quickly (in particular their matching phase). 62 63* Canonicalize shouldn't lose the semantic of original operation: the original 64 information should always be recoverable from the transformed IR. 65 66For example, a pattern that transform 67 68``` 69 %transpose = linalg.transpose 70 ins(%input : tensor<1x2x3xf32>) 71 outs(%init1 : tensor<2x1x3xf32>) 72 dimensions = [1, 0, 2] 73 %out = linalg.transpose 74 ins(%transpose: tensor<2x1x3xf32>) 75 outs(%init2 : tensor<3x1x2xf32>) 76 permutation = [2, 1, 0] 77``` 78 79to 80 81``` 82 %out= linalg.transpose 83 ins(%input : tensor<1x2x3xf32>) 84 outs(%init2: tensor<3x1x2xf32>) 85 permutation = [2, 0, 1] 86``` 87 88is a good canonicalization pattern because it removes a redundant operation, 89making other analysis optimizations and more efficient. 90 91## Globally Applied Rules 92 93These transformations are applied to all levels of IR: 94 95* Elimination of operations that have no side effects and have no uses. 96 97* Constant folding - e.g. "(addi 1, 2)" to "3". Constant folding hooks are 98 specified by operations. 99 100* Move constant operands to commutative operators to the right side - e.g. 101 "(addi 4, x)" to "(addi x, 4)". 102 103* `constant-like` operations are uniqued and hoisted into the entry block of 104 the first parent barrier region. This is a region that is either isolated 105 from above, e.g. the entry block of a function, or one marked as a barrier 106 via the `shouldMaterializeInto` method on the `DialectFoldInterface`. 107 108## Defining Canonicalizations 109 110Two mechanisms are available with which to define canonicalizations; 111general `RewritePattern`s and the `fold` method. 112 113### Canonicalizing with `RewritePattern`s 114 115This mechanism allows for providing canonicalizations as a set of 116`RewritePattern`s, either imperatively defined in C++ or declaratively as 117[Declarative Rewrite Rules](DeclarativeRewrites.md). The pattern rewrite 118infrastructure allows for expressing many different types of canonicalizations. 119These transformations may be as simple as replacing a multiplication with a 120shift, or even replacing a conditional branch with an unconditional one. 121 122In [ODS](DefiningDialects/Operations.md), an operation can set the `hasCanonicalizer` bit or 123the `hasCanonicalizeMethod` bit to generate a declaration for the 124`getCanonicalizationPatterns` method: 125 126```tablegen 127def MyOp : ... { 128 // I want to define a fully general set of patterns for this op. 129 let hasCanonicalizer = 1; 130} 131 132def OtherOp : ... { 133 // A single "matchAndRewrite" style RewritePattern implemented as a method 134 // is good enough for me. 135 let hasCanonicalizeMethod = 1; 136} 137``` 138 139Canonicalization patterns can then be provided in the source file: 140 141```c++ 142void MyOp::getCanonicalizationPatterns(RewritePatternSet &patterns, 143 MLIRContext *context) { 144 patterns.add<...>(...); 145} 146 147LogicalResult OtherOp::canonicalize(OtherOp op, PatternRewriter &rewriter) { 148 // patterns and rewrites go here. 149 return failure(); 150} 151``` 152 153See the [quickstart guide](Tutorials/QuickstartRewrites.md) for information on 154defining operation rewrites. 155 156### Canonicalizing with the `fold` method 157 158The `fold` mechanism is an intentionally limited, but powerful mechanism that 159allows for applying canonicalizations in many places throughout the compiler. 160For example, outside of the canonicalizer pass, `fold` is used within the 161[dialect conversion infrastructure](DialectConversion.md) as a legalization 162mechanism, and can be invoked directly anywhere with an `OpBuilder` via 163`OpBuilder::createOrFold`. 164 165`fold` has the restriction that no new operations may be created, and only the 166root operation may be replaced (but not erased). It allows for updating an 167operation in-place, or returning a set of pre-existing values (or attributes) to 168replace the operation with. This ensures that the `fold` method is a truly 169"local" transformation, and can be invoked without the need for a pattern 170rewriter. 171 172In [ODS](DefiningDialects/Operations.md), an operation can set the `hasFolder` bit to generate 173a declaration for the `fold` method. This method takes on a different form, 174depending on the structure of the operation. 175 176```tablegen 177def MyOp : ... { 178 let hasFolder = 1; 179} 180``` 181 182If the operation has a single result the following will be generated: 183 184```c++ 185/// Implementations of this hook can only perform the following changes to the 186/// operation: 187/// 188/// 1. They can leave the operation alone and without changing the IR, and 189/// return nullptr. 190/// 2. They can mutate the operation in place, without changing anything else 191/// in the IR. In this case, return the operation itself. 192/// 3. They can return an existing value or attribute that can be used instead 193/// of the operation. The caller will remove the operation and use that 194/// result instead. 195/// 196OpFoldResult MyOp::fold(FoldAdaptor adaptor) { 197 ... 198} 199``` 200 201Otherwise, the following is generated: 202 203```c++ 204/// Implementations of this hook can only perform the following changes to the 205/// operation: 206/// 207/// 1. They can leave the operation alone and without changing the IR, and 208/// return failure. 209/// 2. They can mutate the operation in place, without changing anything else 210/// in the IR. In this case, return success. 211/// 3. They can return a list of existing values or attribute that can be used 212/// instead of the operation. In this case, fill in the results list and 213/// return success. The results list must correspond 1-1 with the results of 214/// the operation, partial folding is not supported. The caller will remove 215/// the operation and use those results instead. 216/// 217/// Note that this mechanism cannot be used to remove 0-result operations. 218LogicalResult MyOp::fold(FoldAdaptor adaptor, 219 SmallVectorImpl<OpFoldResult> &results) { 220 ... 221} 222``` 223 224In the above, for each method a `FoldAdaptor` is provided with getters for 225each of the operands, returning the corresponding constant attribute. These 226operands are those that implement the `ConstantLike` trait. If any of the 227operands are non-constant, a null `Attribute` value is provided instead. For 228example, if MyOp provides three operands [`a`, `b`, `c`], but only `b` is 229constant then `adaptor` will return Attribute() for `getA()` and `getC()`, 230and b-value for `getB()`. 231 232Also above, is the use of `OpFoldResult`. This class represents the possible 233result of folding an operation result: either an SSA `Value`, or an 234`Attribute`(for a constant result). If an SSA `Value` is provided, it *must* 235correspond to an existing value. The `fold` methods are not permitted to 236generate new `Value`s. There are no specific restrictions on the form of the 237`Attribute` value returned, but it is important to ensure that the `Attribute` 238representation of a specific `Type` is consistent. 239 240When the `fold` hook on an operation is not successful, the dialect can 241provide a fallback by implementing the `DialectFoldInterface` and overriding 242the fold hook. 243 244#### Generating Constants from Attributes 245 246When a `fold` method returns an `Attribute` as the result, it signifies that 247this result is "constant". The `Attribute` is the constant representation of the 248value. Users of the `fold` method, such as the canonicalizer pass, will take 249these `Attribute`s and materialize constant operations in the IR to represent 250them. To enable this materialization, the dialect of the operation must 251implement the `materializeConstant` hook. This hook takes in an `Attribute` 252value, generally returned by `fold`, and produces a "constant-like" operation 253that materializes that value. 254 255In [ODS](DefiningDialects/_index.md), a dialect can set the `hasConstantMaterializer` bit 256to generate a declaration for the `materializeConstant` method. 257 258```tablegen 259def MyDialect : ... { 260 let hasConstantMaterializer = 1; 261} 262``` 263 264Constants can then be materialized in the source file: 265 266```c++ 267/// Hook to materialize a single constant operation from a given attribute value 268/// with the desired resultant type. This method should use the provided builder 269/// to create the operation without changing the insertion position. The 270/// generated operation is expected to be constant-like. On success, this hook 271/// should return the value generated to represent the constant value. 272/// Otherwise, it should return nullptr on failure. 273Operation *MyDialect::materializeConstant(OpBuilder &builder, Attribute value, 274 Type type, Location loc) { 275 ... 276} 277``` 278