15b4a01d4SMehdi Amini# Chapter 3: High-level Language-Specific Analysis and Transformation 25b4a01d4SMehdi Amini 35b4a01d4SMehdi Amini[TOC] 45b4a01d4SMehdi Amini 55b4a01d4SMehdi AminiCreating a dialect that closely represents the semantics of an input language 65b4a01d4SMehdi Aminienables analyses, transformations and optimizations in MLIR that require 75b4a01d4SMehdi Aminihigh-level language information and are generally performed on the language AST. 85b4a01d4SMehdi AminiFor example, `clang` has a fairly 95b4a01d4SMehdi Amini[heavy mechanism](https://clang.llvm.org/doxygen/classclang_1_1TreeTransform.html) 105b4a01d4SMehdi Aminifor performing template instantiation in C++. 115b4a01d4SMehdi Amini 125b4a01d4SMehdi AminiWe divide compiler transformations into two categories: local and global. In 135b4a01d4SMehdi Aminithis chapter, we focus on how to leverage the Toy Dialect and its high-level 145b4a01d4SMehdi Aminisemantics to perform local pattern-match transformations that would be difficult 155b4a01d4SMehdi Aminiin LLVM. For this, we use MLIR's 16f7a13479SRiver Riddle[Generic DAG Rewriter](../../PatternRewriter.md). 175b4a01d4SMehdi Amini 185b4a01d4SMehdi AminiThere are two methods that can be used to implement pattern-match 195b4a01d4SMehdi Aminitransformations: 1. Imperative, C++ pattern-match and rewrite 2. Declarative, 205b4a01d4SMehdi Aminirule-based pattern-match and rewrite using table-driven 215b4a01d4SMehdi Amini[Declarative Rewrite Rules](../../DeclarativeRewrites.md) (DRR). Note that the 225b4a01d4SMehdi Aminiuse of DRR requires that the operations be defined using ODS, as described in 235b4a01d4SMehdi Amini[Chapter 2](Ch-2.md). 245b4a01d4SMehdi Amini 251842fd50SJacques Pienaar## Optimize Transpose using C++ style pattern-match and rewrite 265b4a01d4SMehdi Amini 275b4a01d4SMehdi AminiLet's start with a simple pattern and try to eliminate a sequence of two 2879c17330SMatthias Krammtransposes that cancel out: `transpose(transpose(X)) -> X`. Here is the 295b4a01d4SMehdi Aminicorresponding Toy example: 305b4a01d4SMehdi Amini 31430bba2aSJacques Pienaar```toy 325b4a01d4SMehdi Aminidef transpose_transpose(x) { 335b4a01d4SMehdi Amini return transpose(transpose(x)); 345b4a01d4SMehdi Amini} 355b4a01d4SMehdi Amini``` 365b4a01d4SMehdi Amini 375b4a01d4SMehdi AminiWhich corresponds to the following IR: 385b4a01d4SMehdi Amini 395b4a01d4SMehdi Amini```mlir 40ee2c6cd9SRiver Riddletoy.func @transpose_transpose(%arg0: tensor<*xf64>) -> tensor<*xf64> { 410050e8f0SRiver Riddle %0 = toy.transpose(%arg0 : tensor<*xf64>) to tensor<*xf64> 420050e8f0SRiver Riddle %1 = toy.transpose(%0 : tensor<*xf64>) to tensor<*xf64> 430050e8f0SRiver Riddle toy.return %1 : tensor<*xf64> 445b4a01d4SMehdi Amini} 455b4a01d4SMehdi Amini``` 465b4a01d4SMehdi Amini 475b4a01d4SMehdi AminiThis is a good example of a transformation that is trivial to match on the Toy 485b4a01d4SMehdi AminiIR but that would be quite hard for LLVM to figure. For example, today Clang 495b4a01d4SMehdi Aminican't optimize away the temporary array, and the computation with the naive 505b4a01d4SMehdi Aminitranspose is expressed with these loops: 515b4a01d4SMehdi Amini 525b4a01d4SMehdi Amini```c++ 535b4a01d4SMehdi Amini#define N 100 545b4a01d4SMehdi Amini#define M 100 555b4a01d4SMehdi Amini 565b4a01d4SMehdi Aminivoid sink(void *); 575b4a01d4SMehdi Aminivoid double_transpose(int A[N][M]) { 585b4a01d4SMehdi Amini int B[M][N]; 595b4a01d4SMehdi Amini for(int i = 0; i < N; ++i) { 605b4a01d4SMehdi Amini for(int j = 0; j < M; ++j) { 615b4a01d4SMehdi Amini B[j][i] = A[i][j]; 625b4a01d4SMehdi Amini } 635b4a01d4SMehdi Amini } 645b4a01d4SMehdi Amini for(int i = 0; i < N; ++i) { 655b4a01d4SMehdi Amini for(int j = 0; j < M; ++j) { 665b4a01d4SMehdi Amini A[i][j] = B[j][i]; 675b4a01d4SMehdi Amini } 685b4a01d4SMehdi Amini } 695b4a01d4SMehdi Amini sink(A); 705b4a01d4SMehdi Amini} 715b4a01d4SMehdi Amini``` 725b4a01d4SMehdi Amini 7379c17330SMatthias KrammFor a simple C++ approach to rewrite, involving matching a tree-like pattern in 745b4a01d4SMehdi Aminithe IR and replacing it with a different set of operations, we can plug into the 755b4a01d4SMehdi AminiMLIR `Canonicalizer` pass by implementing a `RewritePattern`: 765b4a01d4SMehdi Amini 775b4a01d4SMehdi Amini```c++ 785b4a01d4SMehdi Amini/// Fold transpose(transpose(x)) -> x 795b4a01d4SMehdi Aministruct SimplifyRedundantTranspose : public mlir::OpRewritePattern<TransposeOp> { 805b4a01d4SMehdi Amini /// We register this pattern to match every toy.transpose in the IR. 815b4a01d4SMehdi Amini /// The "benefit" is used by the framework to order the patterns and process 825b4a01d4SMehdi Amini /// them in order of profitability. 835b4a01d4SMehdi Amini SimplifyRedundantTranspose(mlir::MLIRContext *context) 845b4a01d4SMehdi Amini : OpRewritePattern<TransposeOp>(context, /*benefit=*/1) {} 855b4a01d4SMehdi Amini 865b4a01d4SMehdi Amini /// This method is attempting to match a pattern and rewrite it. The rewriter 875b4a01d4SMehdi Amini /// argument is the orchestrator of the sequence of rewrites. It is expected 885b4a01d4SMehdi Amini /// to interact with it to perform any changes to the IR from here. 89*db791b27SRamkumar Ramachandra llvm::LogicalResult 905b4a01d4SMehdi Amini matchAndRewrite(TransposeOp op, 915b4a01d4SMehdi Amini mlir::PatternRewriter &rewriter) const override { 925b4a01d4SMehdi Amini // Look through the input of the current transpose. 935b4a01d4SMehdi Amini mlir::Value transposeInput = op.getOperand(); 9498eead81SSean Silva TransposeOp transposeInputOp = transposeInput.getDefiningOp<TransposeOp>(); 95da025756SMatthias Kramm 96da025756SMatthias Kramm // Input defined by another transpose? If not, no match. 975b4a01d4SMehdi Amini if (!transposeInputOp) 983145427dSRiver Riddle return failure(); 995b4a01d4SMehdi Amini 100da025756SMatthias Kramm // Otherwise, we have a redundant transpose. Use the rewriter. 1015633813bSRahul Joshi rewriter.replaceOp(op, {transposeInputOp.getOperand()}); 1023145427dSRiver Riddle return success(); 1035b4a01d4SMehdi Amini } 1045b4a01d4SMehdi Amini}; 1055b4a01d4SMehdi Amini``` 1065b4a01d4SMehdi Amini 1075b4a01d4SMehdi AminiThe implementation of this rewriter is in `ToyCombine.cpp`. The 1085b4a01d4SMehdi Amini[canonicalization pass](../../Canonicalization.md) applies transformations 1095b4a01d4SMehdi Aminidefined by operations in a greedy, iterative manner. To ensure that the 1105b4a01d4SMehdi Aminicanonicalization pass applies our new transform, we set 1111294fa69SRiver Riddle[hasCanonicalizer = 1](../../DefiningDialects/Operations.md/#hascanonicalizer) and register the 1125b4a01d4SMehdi Aminipattern with the canonicalization framework. 1135b4a01d4SMehdi Amini 1145b4a01d4SMehdi Amini```c++ 1155b4a01d4SMehdi Amini// Register our patterns for rewrite by the Canonicalization framework. 1165b4a01d4SMehdi Aminivoid TransposeOp::getCanonicalizationPatterns( 117dc4e913bSChris Lattner RewritePatternSet &results, MLIRContext *context) { 118dc4e913bSChris Lattner results.add<SimplifyRedundantTranspose>(context); 1195b4a01d4SMehdi Amini} 1205b4a01d4SMehdi Amini``` 1215b4a01d4SMehdi Amini 1225b4a01d4SMehdi AminiWe also need to update our main file, `toyc.cpp`, to add an optimization 1235b4a01d4SMehdi Aminipipeline. In MLIR, the optimizations are run through a `PassManager` in a 1245b4a01d4SMehdi Aminisimilar way to LLVM: 1255b4a01d4SMehdi Amini 1265b4a01d4SMehdi Amini```c++ 12794a30928Srkayaith mlir::PassManager pm(module->getName()); 128ee2c6cd9SRiver Riddle pm.addNestedPass<mlir::toy::FuncOp>(mlir::createCanonicalizerPass()); 1295b4a01d4SMehdi Amini``` 1305b4a01d4SMehdi Amini 131495cf272SLucy FoxFinally, we can run `toyc-ch3 test/Examples/Toy/Ch3/transpose_transpose.toy 132495cf272SLucy Fox-emit=mlir -opt` and observe our pattern in action: 1335b4a01d4SMehdi Amini 1345b4a01d4SMehdi Amini```mlir 135ee2c6cd9SRiver Riddletoy.func @transpose_transpose(%arg0: tensor<*xf64>) -> tensor<*xf64> { 1360050e8f0SRiver Riddle %0 = toy.transpose(%arg0 : tensor<*xf64>) to tensor<*xf64> 1370050e8f0SRiver Riddle toy.return %arg0 : tensor<*xf64> 1385b4a01d4SMehdi Amini} 1395b4a01d4SMehdi Amini``` 1405b4a01d4SMehdi Amini 1415b4a01d4SMehdi AminiAs expected, we now directly return the function argument, bypassing any 1425b4a01d4SMehdi Aminitranspose operation. However, one of the transposes still hasn't been 1435b4a01d4SMehdi Aminieliminated. That is not ideal! What happened is that our pattern replaced the 1445b4a01d4SMehdi Aminilast transform with the function input and left behind the now dead transpose 1455b4a01d4SMehdi Aminiinput. The Canonicalizer knows to clean up dead operations; however, MLIR 1465b4a01d4SMehdi Aminiconservatively assumes that operations may have side-effects. We can fix this by 14708f31b8fSHsiangkai Wangadding a new trait, `Pure`, to our `TransposeOp`: 1485b4a01d4SMehdi Amini 149430bba2aSJacques Pienaar```tablegen 15008f31b8fSHsiangkai Wangdef TransposeOp : Toy_Op<"transpose", [Pure]> {...} 1515b4a01d4SMehdi Amini``` 1525b4a01d4SMehdi Amini 1535b4a01d4SMehdi AminiLet's retry now `toyc-ch3 test/transpose_transpose.toy -emit=mlir -opt`: 1545b4a01d4SMehdi Amini 1555b4a01d4SMehdi Amini```mlir 156ee2c6cd9SRiver Riddletoy.func @transpose_transpose(%arg0: tensor<*xf64>) -> tensor<*xf64> { 1570050e8f0SRiver Riddle toy.return %arg0 : tensor<*xf64> 1585b4a01d4SMehdi Amini} 1595b4a01d4SMehdi Amini``` 1605b4a01d4SMehdi Amini 1615b4a01d4SMehdi AminiPerfect! No `transpose` operation is left - the code is optimal. 1625b4a01d4SMehdi Amini 1635b4a01d4SMehdi AminiIn the next section, we use DRR for pattern match optimizations associated with 1645b4a01d4SMehdi Aminithe Reshape op. 1655b4a01d4SMehdi Amini 1661842fd50SJacques Pienaar## Optimize Reshapes using DRR 1675b4a01d4SMehdi Amini 1685b4a01d4SMehdi AminiDeclarative, rule-based pattern-match and rewrite (DRR) is an operation 1695b4a01d4SMehdi AminiDAG-based declarative rewriter that provides a table-based syntax for 1705b4a01d4SMehdi Aminipattern-match and rewrite rules: 1715b4a01d4SMehdi Amini 172430bba2aSJacques Pienaar```tablegen 1735b4a01d4SMehdi Aminiclass Pattern< 1745b4a01d4SMehdi Amini dag sourcePattern, list<dag> resultPatterns, 1755b4a01d4SMehdi Amini list<dag> additionalConstraints = [], 1765b4a01d4SMehdi Amini dag benefitsAdded = (addBenefit 0)>; 1775b4a01d4SMehdi Amini``` 1785b4a01d4SMehdi Amini 1795b4a01d4SMehdi AminiA redundant reshape optimization similar to SimplifyRedundantTranspose can be 1805b4a01d4SMehdi Aminiexpressed more simply using DRR as follows: 1815b4a01d4SMehdi Amini 182430bba2aSJacques Pienaar```tablegen 1835b4a01d4SMehdi Amini// Reshape(Reshape(x)) = Reshape(x) 1845b4a01d4SMehdi Aminidef ReshapeReshapeOptPattern : Pat<(ReshapeOp(ReshapeOp $arg)), 1855b4a01d4SMehdi Amini (ReshapeOp $arg)>; 1865b4a01d4SMehdi Amini``` 1875b4a01d4SMehdi Amini 1885b4a01d4SMehdi AminiThe automatically generated C++ code corresponding to each of the DRR patterns 1893737be89SJonathan Roelofscan be found under `path/to/BUILD/tools/mlir/examples/toy/Ch3/ToyCombine.inc`. 1905b4a01d4SMehdi Amini 1915b4a01d4SMehdi AminiDRR also provides a method for adding argument constraints when the 1925b4a01d4SMehdi Aminitransformation is conditional on some properties of the arguments and results. 1935b4a01d4SMehdi AminiAn example is a transformation that eliminates reshapes when they are redundant, 1945b4a01d4SMehdi Aminii.e. when the input and output shapes are identical. 1955b4a01d4SMehdi Amini 196430bba2aSJacques Pienaar```tablegen 1972bdf33ccSRiver Riddledef TypesAreIdentical : Constraint<CPred<"$0.getType() == $1.getType()">>; 1985b4a01d4SMehdi Aminidef RedundantReshapeOptPattern : Pat< 1995b4a01d4SMehdi Amini (ReshapeOp:$res $arg), (replaceWithValue $arg), 2005b4a01d4SMehdi Amini [(TypesAreIdentical $res, $arg)]>; 2015b4a01d4SMehdi Amini``` 2025b4a01d4SMehdi Amini 2035b4a01d4SMehdi AminiSome optimizations may require additional transformations on instruction 2045b4a01d4SMehdi Aminiarguments. This is achieved using NativeCodeCall, which allows for more complex 2055b4a01d4SMehdi Aminitransformations either by calling into a C++ helper function or by using inline 2065b4a01d4SMehdi AminiC++. An example of such an optimization is FoldConstantReshape, where we 2075b4a01d4SMehdi Aminioptimize Reshape of a constant value by reshaping the constant in place and 2085b4a01d4SMehdi Aminieliminating the reshape operation. 2095b4a01d4SMehdi Amini 210430bba2aSJacques Pienaar```tablegen 2112bdf33ccSRiver Riddledef ReshapeConstant : NativeCodeCall<"$0.reshape(($1.getType()).cast<ShapedType>())">; 2125b4a01d4SMehdi Aminidef FoldConstantReshapeOptPattern : Pat< 2135b4a01d4SMehdi Amini (ReshapeOp:$res (ConstantOp $arg)), 2145b4a01d4SMehdi Amini (ConstantOp (ReshapeConstant $arg, $res))>; 2155b4a01d4SMehdi Amini``` 2165b4a01d4SMehdi Amini 2175b4a01d4SMehdi AminiWe demonstrate these reshape optimizations using the following 218495cf272SLucy Foxtrivial_reshape.toy program: 2195b4a01d4SMehdi Amini 2205b4a01d4SMehdi Amini```c++ 2215b4a01d4SMehdi Aminidef main() { 2225b4a01d4SMehdi Amini var a<2,1> = [1, 2]; 2235b4a01d4SMehdi Amini var b<2,1> = a; 2245b4a01d4SMehdi Amini var c<2,1> = b; 2255b4a01d4SMehdi Amini print(c); 2265b4a01d4SMehdi Amini} 2275b4a01d4SMehdi Amini``` 2285b4a01d4SMehdi Amini 2295b4a01d4SMehdi Amini```mlir 2305b4a01d4SMehdi Aminimodule { 231ee2c6cd9SRiver Riddle toy.func @main() { 2320050e8f0SRiver Riddle %0 = toy.constant dense<[1.000000e+00, 2.000000e+00]> : tensor<2xf64> 2330050e8f0SRiver Riddle %1 = toy.reshape(%0 : tensor<2xf64>) to tensor<2x1xf64> 2340050e8f0SRiver Riddle %2 = toy.reshape(%1 : tensor<2x1xf64>) to tensor<2x1xf64> 2350050e8f0SRiver Riddle %3 = toy.reshape(%2 : tensor<2x1xf64>) to tensor<2x1xf64> 2360050e8f0SRiver Riddle toy.print %3 : tensor<2x1xf64> 2370050e8f0SRiver Riddle toy.return 2385b4a01d4SMehdi Amini } 2395b4a01d4SMehdi Amini} 2405b4a01d4SMehdi Amini``` 2415b4a01d4SMehdi Amini 242495cf272SLucy FoxWe can try to run `toyc-ch3 test/Examples/Toy/Ch3/trivial_reshape.toy -emit=mlir 243495cf272SLucy Fox-opt` and observe our pattern in action: 2445b4a01d4SMehdi Amini 2455b4a01d4SMehdi Amini```mlir 2465b4a01d4SMehdi Aminimodule { 247ee2c6cd9SRiver Riddle toy.func @main() { 2480050e8f0SRiver Riddle %0 = toy.constant dense<[[1.000000e+00], [2.000000e+00]]> : tensor<2x1xf64> 2490050e8f0SRiver Riddle toy.print %0 : tensor<2x1xf64> 2500050e8f0SRiver Riddle toy.return 2515b4a01d4SMehdi Amini } 2525b4a01d4SMehdi Amini} 2535b4a01d4SMehdi Amini``` 2545b4a01d4SMehdi Amini 2555b4a01d4SMehdi AminiAs expected, no reshape operations remain after canonicalization. 2565b4a01d4SMehdi Amini 2575b4a01d4SMehdi AminiFurther details on the declarative rewrite method can be found at 2585b4a01d4SMehdi Amini[Table-driven Declarative Rewrite Rule (DRR)](../../DeclarativeRewrites.md). 2595b4a01d4SMehdi Amini 2605b4a01d4SMehdi AminiIn this chapter, we saw how to use certain core transformations through always 2615b4a01d4SMehdi Aminiavailable hooks. In the [next chapter](Ch-4.md), we will see how to use generic 2625b4a01d4SMehdi Aminisolutions that scale better through Interfaces. 263