xref: /llvm-project/mlir/docs/Tutorials/Toy/Ch-3.md (revision db791b278a414fb6df1acc1799adcf11d8fb9169)
15b4a01d4SMehdi Amini# Chapter 3: High-level Language-Specific Analysis and Transformation
25b4a01d4SMehdi Amini
35b4a01d4SMehdi Amini[TOC]
45b4a01d4SMehdi Amini
55b4a01d4SMehdi AminiCreating a dialect that closely represents the semantics of an input language
65b4a01d4SMehdi Aminienables analyses, transformations and optimizations in MLIR that require
75b4a01d4SMehdi Aminihigh-level language information and are generally performed on the language AST.
85b4a01d4SMehdi AminiFor example, `clang` has a fairly
95b4a01d4SMehdi Amini[heavy mechanism](https://clang.llvm.org/doxygen/classclang_1_1TreeTransform.html)
105b4a01d4SMehdi Aminifor performing template instantiation in C++.
115b4a01d4SMehdi Amini
125b4a01d4SMehdi AminiWe divide compiler transformations into two categories: local and global. In
135b4a01d4SMehdi Aminithis chapter, we focus on how to leverage the Toy Dialect and its high-level
145b4a01d4SMehdi Aminisemantics to perform local pattern-match transformations that would be difficult
155b4a01d4SMehdi Aminiin LLVM. For this, we use MLIR's
16f7a13479SRiver Riddle[Generic DAG Rewriter](../../PatternRewriter.md).
175b4a01d4SMehdi Amini
185b4a01d4SMehdi AminiThere are two methods that can be used to implement pattern-match
195b4a01d4SMehdi Aminitransformations: 1. Imperative, C++ pattern-match and rewrite 2. Declarative,
205b4a01d4SMehdi Aminirule-based pattern-match and rewrite using table-driven
215b4a01d4SMehdi Amini[Declarative Rewrite Rules](../../DeclarativeRewrites.md) (DRR). Note that the
225b4a01d4SMehdi Aminiuse of DRR requires that the operations be defined using ODS, as described in
235b4a01d4SMehdi Amini[Chapter 2](Ch-2.md).
245b4a01d4SMehdi Amini
251842fd50SJacques Pienaar## Optimize Transpose using C++ style pattern-match and rewrite
265b4a01d4SMehdi Amini
275b4a01d4SMehdi AminiLet's start with a simple pattern and try to eliminate a sequence of two
2879c17330SMatthias Krammtransposes that cancel out: `transpose(transpose(X)) -> X`. Here is the
295b4a01d4SMehdi Aminicorresponding Toy example:
305b4a01d4SMehdi Amini
31430bba2aSJacques Pienaar```toy
325b4a01d4SMehdi Aminidef transpose_transpose(x) {
335b4a01d4SMehdi Amini  return transpose(transpose(x));
345b4a01d4SMehdi Amini}
355b4a01d4SMehdi Amini```
365b4a01d4SMehdi Amini
375b4a01d4SMehdi AminiWhich corresponds to the following IR:
385b4a01d4SMehdi Amini
395b4a01d4SMehdi Amini```mlir
40ee2c6cd9SRiver Riddletoy.func @transpose_transpose(%arg0: tensor<*xf64>) -> tensor<*xf64> {
410050e8f0SRiver Riddle  %0 = toy.transpose(%arg0 : tensor<*xf64>) to tensor<*xf64>
420050e8f0SRiver Riddle  %1 = toy.transpose(%0 : tensor<*xf64>) to tensor<*xf64>
430050e8f0SRiver Riddle  toy.return %1 : tensor<*xf64>
445b4a01d4SMehdi Amini}
455b4a01d4SMehdi Amini```
465b4a01d4SMehdi Amini
475b4a01d4SMehdi AminiThis is a good example of a transformation that is trivial to match on the Toy
485b4a01d4SMehdi AminiIR but that would be quite hard for LLVM to figure. For example, today Clang
495b4a01d4SMehdi Aminican't optimize away the temporary array, and the computation with the naive
505b4a01d4SMehdi Aminitranspose is expressed with these loops:
515b4a01d4SMehdi Amini
525b4a01d4SMehdi Amini```c++
535b4a01d4SMehdi Amini#define N 100
545b4a01d4SMehdi Amini#define M 100
555b4a01d4SMehdi Amini
565b4a01d4SMehdi Aminivoid sink(void *);
575b4a01d4SMehdi Aminivoid double_transpose(int A[N][M]) {
585b4a01d4SMehdi Amini  int B[M][N];
595b4a01d4SMehdi Amini  for(int i = 0; i < N; ++i) {
605b4a01d4SMehdi Amini    for(int j = 0; j < M; ++j) {
615b4a01d4SMehdi Amini       B[j][i] = A[i][j];
625b4a01d4SMehdi Amini    }
635b4a01d4SMehdi Amini  }
645b4a01d4SMehdi Amini  for(int i = 0; i < N; ++i) {
655b4a01d4SMehdi Amini    for(int j = 0; j < M; ++j) {
665b4a01d4SMehdi Amini       A[i][j] = B[j][i];
675b4a01d4SMehdi Amini    }
685b4a01d4SMehdi Amini  }
695b4a01d4SMehdi Amini  sink(A);
705b4a01d4SMehdi Amini}
715b4a01d4SMehdi Amini```
725b4a01d4SMehdi Amini
7379c17330SMatthias KrammFor a simple C++ approach to rewrite, involving matching a tree-like pattern in
745b4a01d4SMehdi Aminithe IR and replacing it with a different set of operations, we can plug into the
755b4a01d4SMehdi AminiMLIR `Canonicalizer` pass by implementing a `RewritePattern`:
765b4a01d4SMehdi Amini
775b4a01d4SMehdi Amini```c++
785b4a01d4SMehdi Amini/// Fold transpose(transpose(x)) -> x
795b4a01d4SMehdi Aministruct SimplifyRedundantTranspose : public mlir::OpRewritePattern<TransposeOp> {
805b4a01d4SMehdi Amini  /// We register this pattern to match every toy.transpose in the IR.
815b4a01d4SMehdi Amini  /// The "benefit" is used by the framework to order the patterns and process
825b4a01d4SMehdi Amini  /// them in order of profitability.
835b4a01d4SMehdi Amini  SimplifyRedundantTranspose(mlir::MLIRContext *context)
845b4a01d4SMehdi Amini      : OpRewritePattern<TransposeOp>(context, /*benefit=*/1) {}
855b4a01d4SMehdi Amini
865b4a01d4SMehdi Amini  /// This method is attempting to match a pattern and rewrite it. The rewriter
875b4a01d4SMehdi Amini  /// argument is the orchestrator of the sequence of rewrites. It is expected
885b4a01d4SMehdi Amini  /// to interact with it to perform any changes to the IR from here.
89*db791b27SRamkumar Ramachandra  llvm::LogicalResult
905b4a01d4SMehdi Amini  matchAndRewrite(TransposeOp op,
915b4a01d4SMehdi Amini                  mlir::PatternRewriter &rewriter) const override {
925b4a01d4SMehdi Amini    // Look through the input of the current transpose.
935b4a01d4SMehdi Amini    mlir::Value transposeInput = op.getOperand();
9498eead81SSean Silva    TransposeOp transposeInputOp = transposeInput.getDefiningOp<TransposeOp>();
95da025756SMatthias Kramm
96da025756SMatthias Kramm    // Input defined by another transpose? If not, no match.
975b4a01d4SMehdi Amini    if (!transposeInputOp)
983145427dSRiver Riddle      return failure();
995b4a01d4SMehdi Amini
100da025756SMatthias Kramm    // Otherwise, we have a redundant transpose. Use the rewriter.
1015633813bSRahul Joshi    rewriter.replaceOp(op, {transposeInputOp.getOperand()});
1023145427dSRiver Riddle    return success();
1035b4a01d4SMehdi Amini  }
1045b4a01d4SMehdi Amini};
1055b4a01d4SMehdi Amini```
1065b4a01d4SMehdi Amini
1075b4a01d4SMehdi AminiThe implementation of this rewriter is in `ToyCombine.cpp`. The
1085b4a01d4SMehdi Amini[canonicalization pass](../../Canonicalization.md) applies transformations
1095b4a01d4SMehdi Aminidefined by operations in a greedy, iterative manner. To ensure that the
1105b4a01d4SMehdi Aminicanonicalization pass applies our new transform, we set
1111294fa69SRiver Riddle[hasCanonicalizer = 1](../../DefiningDialects/Operations.md/#hascanonicalizer) and register the
1125b4a01d4SMehdi Aminipattern with the canonicalization framework.
1135b4a01d4SMehdi Amini
1145b4a01d4SMehdi Amini```c++
1155b4a01d4SMehdi Amini// Register our patterns for rewrite by the Canonicalization framework.
1165b4a01d4SMehdi Aminivoid TransposeOp::getCanonicalizationPatterns(
117dc4e913bSChris Lattner    RewritePatternSet &results, MLIRContext *context) {
118dc4e913bSChris Lattner  results.add<SimplifyRedundantTranspose>(context);
1195b4a01d4SMehdi Amini}
1205b4a01d4SMehdi Amini```
1215b4a01d4SMehdi Amini
1225b4a01d4SMehdi AminiWe also need to update our main file, `toyc.cpp`, to add an optimization
1235b4a01d4SMehdi Aminipipeline. In MLIR, the optimizations are run through a `PassManager` in a
1245b4a01d4SMehdi Aminisimilar way to LLVM:
1255b4a01d4SMehdi Amini
1265b4a01d4SMehdi Amini```c++
12794a30928Srkayaith  mlir::PassManager pm(module->getName());
128ee2c6cd9SRiver Riddle  pm.addNestedPass<mlir::toy::FuncOp>(mlir::createCanonicalizerPass());
1295b4a01d4SMehdi Amini```
1305b4a01d4SMehdi Amini
131495cf272SLucy FoxFinally, we can run `toyc-ch3 test/Examples/Toy/Ch3/transpose_transpose.toy
132495cf272SLucy Fox-emit=mlir -opt` and observe our pattern in action:
1335b4a01d4SMehdi Amini
1345b4a01d4SMehdi Amini```mlir
135ee2c6cd9SRiver Riddletoy.func @transpose_transpose(%arg0: tensor<*xf64>) -> tensor<*xf64> {
1360050e8f0SRiver Riddle  %0 = toy.transpose(%arg0 : tensor<*xf64>) to tensor<*xf64>
1370050e8f0SRiver Riddle  toy.return %arg0 : tensor<*xf64>
1385b4a01d4SMehdi Amini}
1395b4a01d4SMehdi Amini```
1405b4a01d4SMehdi Amini
1415b4a01d4SMehdi AminiAs expected, we now directly return the function argument, bypassing any
1425b4a01d4SMehdi Aminitranspose operation. However, one of the transposes still hasn't been
1435b4a01d4SMehdi Aminieliminated. That is not ideal! What happened is that our pattern replaced the
1445b4a01d4SMehdi Aminilast transform with the function input and left behind the now dead transpose
1455b4a01d4SMehdi Aminiinput. The Canonicalizer knows to clean up dead operations; however, MLIR
1465b4a01d4SMehdi Aminiconservatively assumes that operations may have side-effects. We can fix this by
14708f31b8fSHsiangkai Wangadding a new trait, `Pure`, to our `TransposeOp`:
1485b4a01d4SMehdi Amini
149430bba2aSJacques Pienaar```tablegen
15008f31b8fSHsiangkai Wangdef TransposeOp : Toy_Op<"transpose", [Pure]> {...}
1515b4a01d4SMehdi Amini```
1525b4a01d4SMehdi Amini
1535b4a01d4SMehdi AminiLet's retry now `toyc-ch3 test/transpose_transpose.toy -emit=mlir -opt`:
1545b4a01d4SMehdi Amini
1555b4a01d4SMehdi Amini```mlir
156ee2c6cd9SRiver Riddletoy.func @transpose_transpose(%arg0: tensor<*xf64>) -> tensor<*xf64> {
1570050e8f0SRiver Riddle  toy.return %arg0 : tensor<*xf64>
1585b4a01d4SMehdi Amini}
1595b4a01d4SMehdi Amini```
1605b4a01d4SMehdi Amini
1615b4a01d4SMehdi AminiPerfect! No `transpose` operation is left - the code is optimal.
1625b4a01d4SMehdi Amini
1635b4a01d4SMehdi AminiIn the next section, we use DRR for pattern match optimizations associated with
1645b4a01d4SMehdi Aminithe Reshape op.
1655b4a01d4SMehdi Amini
1661842fd50SJacques Pienaar## Optimize Reshapes using DRR
1675b4a01d4SMehdi Amini
1685b4a01d4SMehdi AminiDeclarative, rule-based pattern-match and rewrite (DRR) is an operation
1695b4a01d4SMehdi AminiDAG-based declarative rewriter that provides a table-based syntax for
1705b4a01d4SMehdi Aminipattern-match and rewrite rules:
1715b4a01d4SMehdi Amini
172430bba2aSJacques Pienaar```tablegen
1735b4a01d4SMehdi Aminiclass Pattern<
1745b4a01d4SMehdi Amini    dag sourcePattern, list<dag> resultPatterns,
1755b4a01d4SMehdi Amini    list<dag> additionalConstraints = [],
1765b4a01d4SMehdi Amini    dag benefitsAdded = (addBenefit 0)>;
1775b4a01d4SMehdi Amini```
1785b4a01d4SMehdi Amini
1795b4a01d4SMehdi AminiA redundant reshape optimization similar to SimplifyRedundantTranspose can be
1805b4a01d4SMehdi Aminiexpressed more simply using DRR as follows:
1815b4a01d4SMehdi Amini
182430bba2aSJacques Pienaar```tablegen
1835b4a01d4SMehdi Amini// Reshape(Reshape(x)) = Reshape(x)
1845b4a01d4SMehdi Aminidef ReshapeReshapeOptPattern : Pat<(ReshapeOp(ReshapeOp $arg)),
1855b4a01d4SMehdi Amini                                   (ReshapeOp $arg)>;
1865b4a01d4SMehdi Amini```
1875b4a01d4SMehdi Amini
1885b4a01d4SMehdi AminiThe automatically generated C++ code corresponding to each of the DRR patterns
1893737be89SJonathan Roelofscan be found under `path/to/BUILD/tools/mlir/examples/toy/Ch3/ToyCombine.inc`.
1905b4a01d4SMehdi Amini
1915b4a01d4SMehdi AminiDRR also provides a method for adding argument constraints when the
1925b4a01d4SMehdi Aminitransformation is conditional on some properties of the arguments and results.
1935b4a01d4SMehdi AminiAn example is a transformation that eliminates reshapes when they are redundant,
1945b4a01d4SMehdi Aminii.e. when the input and output shapes are identical.
1955b4a01d4SMehdi Amini
196430bba2aSJacques Pienaar```tablegen
1972bdf33ccSRiver Riddledef TypesAreIdentical : Constraint<CPred<"$0.getType() == $1.getType()">>;
1985b4a01d4SMehdi Aminidef RedundantReshapeOptPattern : Pat<
1995b4a01d4SMehdi Amini  (ReshapeOp:$res $arg), (replaceWithValue $arg),
2005b4a01d4SMehdi Amini  [(TypesAreIdentical $res, $arg)]>;
2015b4a01d4SMehdi Amini```
2025b4a01d4SMehdi Amini
2035b4a01d4SMehdi AminiSome optimizations may require additional transformations on instruction
2045b4a01d4SMehdi Aminiarguments. This is achieved using NativeCodeCall, which allows for more complex
2055b4a01d4SMehdi Aminitransformations either by calling into a C++ helper function or by using inline
2065b4a01d4SMehdi AminiC++. An example of such an optimization is FoldConstantReshape, where we
2075b4a01d4SMehdi Aminioptimize Reshape of a constant value by reshaping the constant in place and
2085b4a01d4SMehdi Aminieliminating the reshape operation.
2095b4a01d4SMehdi Amini
210430bba2aSJacques Pienaar```tablegen
2112bdf33ccSRiver Riddledef ReshapeConstant : NativeCodeCall<"$0.reshape(($1.getType()).cast<ShapedType>())">;
2125b4a01d4SMehdi Aminidef FoldConstantReshapeOptPattern : Pat<
2135b4a01d4SMehdi Amini  (ReshapeOp:$res (ConstantOp $arg)),
2145b4a01d4SMehdi Amini  (ConstantOp (ReshapeConstant $arg, $res))>;
2155b4a01d4SMehdi Amini```
2165b4a01d4SMehdi Amini
2175b4a01d4SMehdi AminiWe demonstrate these reshape optimizations using the following
218495cf272SLucy Foxtrivial_reshape.toy program:
2195b4a01d4SMehdi Amini
2205b4a01d4SMehdi Amini```c++
2215b4a01d4SMehdi Aminidef main() {
2225b4a01d4SMehdi Amini  var a<2,1> = [1, 2];
2235b4a01d4SMehdi Amini  var b<2,1> = a;
2245b4a01d4SMehdi Amini  var c<2,1> = b;
2255b4a01d4SMehdi Amini  print(c);
2265b4a01d4SMehdi Amini}
2275b4a01d4SMehdi Amini```
2285b4a01d4SMehdi Amini
2295b4a01d4SMehdi Amini```mlir
2305b4a01d4SMehdi Aminimodule {
231ee2c6cd9SRiver Riddle  toy.func @main() {
2320050e8f0SRiver Riddle    %0 = toy.constant dense<[1.000000e+00, 2.000000e+00]> : tensor<2xf64>
2330050e8f0SRiver Riddle    %1 = toy.reshape(%0 : tensor<2xf64>) to tensor<2x1xf64>
2340050e8f0SRiver Riddle    %2 = toy.reshape(%1 : tensor<2x1xf64>) to tensor<2x1xf64>
2350050e8f0SRiver Riddle    %3 = toy.reshape(%2 : tensor<2x1xf64>) to tensor<2x1xf64>
2360050e8f0SRiver Riddle    toy.print %3 : tensor<2x1xf64>
2370050e8f0SRiver Riddle    toy.return
2385b4a01d4SMehdi Amini  }
2395b4a01d4SMehdi Amini}
2405b4a01d4SMehdi Amini```
2415b4a01d4SMehdi Amini
242495cf272SLucy FoxWe can try to run `toyc-ch3 test/Examples/Toy/Ch3/trivial_reshape.toy -emit=mlir
243495cf272SLucy Fox-opt` and observe our pattern in action:
2445b4a01d4SMehdi Amini
2455b4a01d4SMehdi Amini```mlir
2465b4a01d4SMehdi Aminimodule {
247ee2c6cd9SRiver Riddle  toy.func @main() {
2480050e8f0SRiver Riddle    %0 = toy.constant dense<[[1.000000e+00], [2.000000e+00]]> : tensor<2x1xf64>
2490050e8f0SRiver Riddle    toy.print %0 : tensor<2x1xf64>
2500050e8f0SRiver Riddle    toy.return
2515b4a01d4SMehdi Amini  }
2525b4a01d4SMehdi Amini}
2535b4a01d4SMehdi Amini```
2545b4a01d4SMehdi Amini
2555b4a01d4SMehdi AminiAs expected, no reshape operations remain after canonicalization.
2565b4a01d4SMehdi Amini
2575b4a01d4SMehdi AminiFurther details on the declarative rewrite method can be found at
2585b4a01d4SMehdi Amini[Table-driven Declarative Rewrite Rule (DRR)](../../DeclarativeRewrites.md).
2595b4a01d4SMehdi Amini
2605b4a01d4SMehdi AminiIn this chapter, we saw how to use certain core transformations through always
2615b4a01d4SMehdi Aminiavailable hooks. In the [next chapter](Ch-4.md), we will see how to use generic
2625b4a01d4SMehdi Aminisolutions that scale better through Interfaces.
263