xref: /llvm-project/mlir/docs/Tutorials/Toy/Ch-3.md (revision 79c17330d35995d689ecec11dda0dfdb19f33428)
15b4a01d4SMehdi Amini# Chapter 3: High-level Language-Specific Analysis and Transformation
25b4a01d4SMehdi Amini
35b4a01d4SMehdi Amini[TOC]
45b4a01d4SMehdi Amini
55b4a01d4SMehdi AminiCreating a dialect that closely represents the semantics of an input language
65b4a01d4SMehdi Aminienables analyses, transformations and optimizations in MLIR that require
75b4a01d4SMehdi Aminihigh-level language information and are generally performed on the language AST.
85b4a01d4SMehdi AminiFor example, `clang` has a fairly
95b4a01d4SMehdi Amini[heavy mechanism](https://clang.llvm.org/doxygen/classclang_1_1TreeTransform.html)
105b4a01d4SMehdi Aminifor performing template instantiation in C++.
115b4a01d4SMehdi Amini
125b4a01d4SMehdi AminiWe divide compiler transformations into two categories: local and global. In
135b4a01d4SMehdi Aminithis chapter, we focus on how to leverage the Toy Dialect and its high-level
145b4a01d4SMehdi Aminisemantics to perform local pattern-match transformations that would be difficult
155b4a01d4SMehdi Aminiin LLVM. For this, we use MLIR's
165b4a01d4SMehdi Amini[Generic DAG Rewriter](../../GenericDAGRewriter.md).
175b4a01d4SMehdi Amini
185b4a01d4SMehdi AminiThere are two methods that can be used to implement pattern-match
195b4a01d4SMehdi Aminitransformations: 1. Imperative, C++ pattern-match and rewrite 2. Declarative,
205b4a01d4SMehdi Aminirule-based pattern-match and rewrite using table-driven
215b4a01d4SMehdi Amini[Declarative Rewrite Rules](../../DeclarativeRewrites.md) (DRR). Note that the
225b4a01d4SMehdi Aminiuse of DRR requires that the operations be defined using ODS, as described in
235b4a01d4SMehdi Amini[Chapter 2](Ch-2.md).
245b4a01d4SMehdi Amini
251842fd50SJacques Pienaar## Optimize Transpose using C++ style pattern-match and rewrite
265b4a01d4SMehdi Amini
275b4a01d4SMehdi AminiLet's start with a simple pattern and try to eliminate a sequence of two
28*79c17330SMatthias Krammtransposes that cancel out: `transpose(transpose(X)) -> X`. Here is the
295b4a01d4SMehdi Aminicorresponding Toy example:
305b4a01d4SMehdi Amini
31430bba2aSJacques Pienaar```toy
325b4a01d4SMehdi Aminidef transpose_transpose(x) {
335b4a01d4SMehdi Amini  return transpose(transpose(x));
345b4a01d4SMehdi Amini}
355b4a01d4SMehdi Amini```
365b4a01d4SMehdi Amini
375b4a01d4SMehdi AminiWhich corresponds to the following IR:
385b4a01d4SMehdi Amini
395b4a01d4SMehdi Amini```mlir
405b4a01d4SMehdi Aminifunc @transpose_transpose(%arg0: tensor<*xf64>) -> tensor<*xf64> {
410050e8f0SRiver Riddle  %0 = toy.transpose(%arg0 : tensor<*xf64>) to tensor<*xf64>
420050e8f0SRiver Riddle  %1 = toy.transpose(%0 : tensor<*xf64>) to tensor<*xf64>
430050e8f0SRiver Riddle  toy.return %1 : tensor<*xf64>
445b4a01d4SMehdi Amini}
455b4a01d4SMehdi Amini```
465b4a01d4SMehdi Amini
475b4a01d4SMehdi AminiThis is a good example of a transformation that is trivial to match on the Toy
485b4a01d4SMehdi AminiIR but that would be quite hard for LLVM to figure. For example, today Clang
495b4a01d4SMehdi Aminican't optimize away the temporary array, and the computation with the naive
505b4a01d4SMehdi Aminitranspose is expressed with these loops:
515b4a01d4SMehdi Amini
525b4a01d4SMehdi Amini```c++
535b4a01d4SMehdi Amini#define N 100
545b4a01d4SMehdi Amini#define M 100
555b4a01d4SMehdi Amini
565b4a01d4SMehdi Aminivoid sink(void *);
575b4a01d4SMehdi Aminivoid double_transpose(int A[N][M]) {
585b4a01d4SMehdi Amini  int B[M][N];
595b4a01d4SMehdi Amini  for(int i = 0; i < N; ++i) {
605b4a01d4SMehdi Amini    for(int j = 0; j < M; ++j) {
615b4a01d4SMehdi Amini       B[j][i] = A[i][j];
625b4a01d4SMehdi Amini    }
635b4a01d4SMehdi Amini  }
645b4a01d4SMehdi Amini  for(int i = 0; i < N; ++i) {
655b4a01d4SMehdi Amini    for(int j = 0; j < M; ++j) {
665b4a01d4SMehdi Amini       A[i][j] = B[j][i];
675b4a01d4SMehdi Amini    }
685b4a01d4SMehdi Amini  }
695b4a01d4SMehdi Amini  sink(A);
705b4a01d4SMehdi Amini}
715b4a01d4SMehdi Amini```
725b4a01d4SMehdi Amini
73*79c17330SMatthias KrammFor a simple C++ approach to rewrite, involving matching a tree-like pattern in
745b4a01d4SMehdi Aminithe IR and replacing it with a different set of operations, we can plug into the
755b4a01d4SMehdi AminiMLIR `Canonicalizer` pass by implementing a `RewritePattern`:
765b4a01d4SMehdi Amini
775b4a01d4SMehdi Amini```c++
785b4a01d4SMehdi Amini/// Fold transpose(transpose(x)) -> x
795b4a01d4SMehdi Aministruct SimplifyRedundantTranspose : public mlir::OpRewritePattern<TransposeOp> {
805b4a01d4SMehdi Amini  /// We register this pattern to match every toy.transpose in the IR.
815b4a01d4SMehdi Amini  /// The "benefit" is used by the framework to order the patterns and process
825b4a01d4SMehdi Amini  /// them in order of profitability.
835b4a01d4SMehdi Amini  SimplifyRedundantTranspose(mlir::MLIRContext *context)
845b4a01d4SMehdi Amini      : OpRewritePattern<TransposeOp>(context, /*benefit=*/1) {}
855b4a01d4SMehdi Amini
865b4a01d4SMehdi Amini  /// This method is attempting to match a pattern and rewrite it. The rewriter
875b4a01d4SMehdi Amini  /// argument is the orchestrator of the sequence of rewrites. It is expected
885b4a01d4SMehdi Amini  /// to interact with it to perform any changes to the IR from here.
895b4a01d4SMehdi Amini  mlir::PatternMatchResult
905b4a01d4SMehdi Amini  matchAndRewrite(TransposeOp op,
915b4a01d4SMehdi Amini                  mlir::PatternRewriter &rewriter) const override {
925b4a01d4SMehdi Amini    // Look through the input of the current transpose.
935b4a01d4SMehdi Amini    mlir::Value transposeInput = op.getOperand();
945b4a01d4SMehdi Amini    TransposeOp transposeInputOp =
952bdf33ccSRiver Riddle        llvm::dyn_cast_or_null<TransposeOp>(transposeInput.getDefiningOp());
965b4a01d4SMehdi Amini    // If the input is defined by another Transpose, bingo!
975b4a01d4SMehdi Amini    if (!transposeInputOp)
985b4a01d4SMehdi Amini      return matchFailure();
995b4a01d4SMehdi Amini
1005b4a01d4SMehdi Amini    // Use the rewriter to perform the replacement
1015b4a01d4SMehdi Amini    rewriter.replaceOp(op, {transposeInputOp.getOperand()}, {transposeInputOp});
1025b4a01d4SMehdi Amini    return matchSuccess();
1035b4a01d4SMehdi Amini  }
1045b4a01d4SMehdi Amini};
1055b4a01d4SMehdi Amini```
1065b4a01d4SMehdi Amini
1075b4a01d4SMehdi AminiThe implementation of this rewriter is in `ToyCombine.cpp`. The
1085b4a01d4SMehdi Amini[canonicalization pass](../../Canonicalization.md) applies transformations
1095b4a01d4SMehdi Aminidefined by operations in a greedy, iterative manner. To ensure that the
1105b4a01d4SMehdi Aminicanonicalization pass applies our new transform, we set
1115b4a01d4SMehdi Amini[hasCanonicalizer = 1](../../OpDefinitions.md#hascanonicalizer) and register the
1125b4a01d4SMehdi Aminipattern with the canonicalization framework.
1135b4a01d4SMehdi Amini
1145b4a01d4SMehdi Amini```c++
1155b4a01d4SMehdi Amini// Register our patterns for rewrite by the Canonicalization framework.
1165b4a01d4SMehdi Aminivoid TransposeOp::getCanonicalizationPatterns(
1175b4a01d4SMehdi Amini    OwningRewritePatternList &results, MLIRContext *context) {
1185b4a01d4SMehdi Amini  results.insert<SimplifyRedundantTranspose>(context);
1195b4a01d4SMehdi Amini}
1205b4a01d4SMehdi Amini```
1215b4a01d4SMehdi Amini
1225b4a01d4SMehdi AminiWe also need to update our main file, `toyc.cpp`, to add an optimization
1235b4a01d4SMehdi Aminipipeline. In MLIR, the optimizations are run through a `PassManager` in a
1245b4a01d4SMehdi Aminisimilar way to LLVM:
1255b4a01d4SMehdi Amini
1265b4a01d4SMehdi Amini```c++
1275b4a01d4SMehdi Amini  mlir::PassManager pm(module.getContext());
1285b4a01d4SMehdi Amini  pm.addNestedPass<mlir::FuncOp>(mlir::createCanonicalizerPass());
1295b4a01d4SMehdi Amini```
1305b4a01d4SMehdi Amini
1315b4a01d4SMehdi AminiFinally, we can run `toyc-ch3 test/transpose_transpose.toy -emit=mlir -opt` and
1325b4a01d4SMehdi Aminiobserve our pattern in action:
1335b4a01d4SMehdi Amini
1345b4a01d4SMehdi Amini```mlir
1355b4a01d4SMehdi Aminifunc @transpose_transpose(%arg0: tensor<*xf64>) -> tensor<*xf64> {
1360050e8f0SRiver Riddle  %0 = toy.transpose(%arg0 : tensor<*xf64>) to tensor<*xf64>
1370050e8f0SRiver Riddle  toy.return %arg0 : tensor<*xf64>
1385b4a01d4SMehdi Amini}
1395b4a01d4SMehdi Amini```
1405b4a01d4SMehdi Amini
1415b4a01d4SMehdi AminiAs expected, we now directly return the function argument, bypassing any
1425b4a01d4SMehdi Aminitranspose operation. However, one of the transposes still hasn't been
1435b4a01d4SMehdi Aminieliminated. That is not ideal! What happened is that our pattern replaced the
1445b4a01d4SMehdi Aminilast transform with the function input and left behind the now dead transpose
1455b4a01d4SMehdi Aminiinput. The Canonicalizer knows to clean up dead operations; however, MLIR
1465b4a01d4SMehdi Aminiconservatively assumes that operations may have side-effects. We can fix this by
1475b4a01d4SMehdi Aminiadding a new trait, `NoSideEffect`, to our `TransposeOp`:
1485b4a01d4SMehdi Amini
149430bba2aSJacques Pienaar```tablegen
1505b4a01d4SMehdi Aminidef TransposeOp : Toy_Op<"transpose", [NoSideEffect]> {...}
1515b4a01d4SMehdi Amini```
1525b4a01d4SMehdi Amini
1535b4a01d4SMehdi AminiLet's retry now `toyc-ch3 test/transpose_transpose.toy -emit=mlir -opt`:
1545b4a01d4SMehdi Amini
1555b4a01d4SMehdi Amini```mlir
1565b4a01d4SMehdi Aminifunc @transpose_transpose(%arg0: tensor<*xf64>) -> tensor<*xf64> {
1570050e8f0SRiver Riddle  toy.return %arg0 : tensor<*xf64>
1585b4a01d4SMehdi Amini}
1595b4a01d4SMehdi Amini```
1605b4a01d4SMehdi Amini
1615b4a01d4SMehdi AminiPerfect! No `transpose` operation is left - the code is optimal.
1625b4a01d4SMehdi Amini
1635b4a01d4SMehdi AminiIn the next section, we use DRR for pattern match optimizations associated with
1645b4a01d4SMehdi Aminithe Reshape op.
1655b4a01d4SMehdi Amini
1661842fd50SJacques Pienaar## Optimize Reshapes using DRR
1675b4a01d4SMehdi Amini
1685b4a01d4SMehdi AminiDeclarative, rule-based pattern-match and rewrite (DRR) is an operation
1695b4a01d4SMehdi AminiDAG-based declarative rewriter that provides a table-based syntax for
1705b4a01d4SMehdi Aminipattern-match and rewrite rules:
1715b4a01d4SMehdi Amini
172430bba2aSJacques Pienaar```tablegen
1735b4a01d4SMehdi Aminiclass Pattern<
1745b4a01d4SMehdi Amini    dag sourcePattern, list<dag> resultPatterns,
1755b4a01d4SMehdi Amini    list<dag> additionalConstraints = [],
1765b4a01d4SMehdi Amini    dag benefitsAdded = (addBenefit 0)>;
1775b4a01d4SMehdi Amini```
1785b4a01d4SMehdi Amini
1795b4a01d4SMehdi AminiA redundant reshape optimization similar to SimplifyRedundantTranspose can be
1805b4a01d4SMehdi Aminiexpressed more simply using DRR as follows:
1815b4a01d4SMehdi Amini
182430bba2aSJacques Pienaar```tablegen
1835b4a01d4SMehdi Amini// Reshape(Reshape(x)) = Reshape(x)
1845b4a01d4SMehdi Aminidef ReshapeReshapeOptPattern : Pat<(ReshapeOp(ReshapeOp $arg)),
1855b4a01d4SMehdi Amini                                   (ReshapeOp $arg)>;
1865b4a01d4SMehdi Amini```
1875b4a01d4SMehdi Amini
1885b4a01d4SMehdi AminiThe automatically generated C++ code corresponding to each of the DRR patterns
1895b4a01d4SMehdi Aminican be found under path/to/BUILD/projects/mlir/examples/toy/Ch3/ToyCombine.inc.
1905b4a01d4SMehdi Amini
1915b4a01d4SMehdi AminiDRR also provides a method for adding argument constraints when the
1925b4a01d4SMehdi Aminitransformation is conditional on some properties of the arguments and results.
1935b4a01d4SMehdi AminiAn example is a transformation that eliminates reshapes when they are redundant,
1945b4a01d4SMehdi Aminii.e. when the input and output shapes are identical.
1955b4a01d4SMehdi Amini
196430bba2aSJacques Pienaar```tablegen
1972bdf33ccSRiver Riddledef TypesAreIdentical : Constraint<CPred<"$0.getType() == $1.getType()">>;
1985b4a01d4SMehdi Aminidef RedundantReshapeOptPattern : Pat<
1995b4a01d4SMehdi Amini  (ReshapeOp:$res $arg), (replaceWithValue $arg),
2005b4a01d4SMehdi Amini  [(TypesAreIdentical $res, $arg)]>;
2015b4a01d4SMehdi Amini```
2025b4a01d4SMehdi Amini
2035b4a01d4SMehdi AminiSome optimizations may require additional transformations on instruction
2045b4a01d4SMehdi Aminiarguments. This is achieved using NativeCodeCall, which allows for more complex
2055b4a01d4SMehdi Aminitransformations either by calling into a C++ helper function or by using inline
2065b4a01d4SMehdi AminiC++. An example of such an optimization is FoldConstantReshape, where we
2075b4a01d4SMehdi Aminioptimize Reshape of a constant value by reshaping the constant in place and
2085b4a01d4SMehdi Aminieliminating the reshape operation.
2095b4a01d4SMehdi Amini
210430bba2aSJacques Pienaar```tablegen
2112bdf33ccSRiver Riddledef ReshapeConstant : NativeCodeCall<"$0.reshape(($1.getType()).cast<ShapedType>())">;
2125b4a01d4SMehdi Aminidef FoldConstantReshapeOptPattern : Pat<
2135b4a01d4SMehdi Amini  (ReshapeOp:$res (ConstantOp $arg)),
2145b4a01d4SMehdi Amini  (ConstantOp (ReshapeConstant $arg, $res))>;
2155b4a01d4SMehdi Amini```
2165b4a01d4SMehdi Amini
2175b4a01d4SMehdi AminiWe demonstrate these reshape optimizations using the following
2185b4a01d4SMehdi AminitrivialReshape.toy program:
2195b4a01d4SMehdi Amini
2205b4a01d4SMehdi Amini```c++
2215b4a01d4SMehdi Aminidef main() {
2225b4a01d4SMehdi Amini  var a<2,1> = [1, 2];
2235b4a01d4SMehdi Amini  var b<2,1> = a;
2245b4a01d4SMehdi Amini  var c<2,1> = b;
2255b4a01d4SMehdi Amini  print(c);
2265b4a01d4SMehdi Amini}
2275b4a01d4SMehdi Amini```
2285b4a01d4SMehdi Amini
2295b4a01d4SMehdi Amini```mlir
2305b4a01d4SMehdi Aminimodule {
2315b4a01d4SMehdi Amini  func @main() {
2320050e8f0SRiver Riddle    %0 = toy.constant dense<[1.000000e+00, 2.000000e+00]> : tensor<2xf64>
2330050e8f0SRiver Riddle    %1 = toy.reshape(%0 : tensor<2xf64>) to tensor<2x1xf64>
2340050e8f0SRiver Riddle    %2 = toy.reshape(%1 : tensor<2x1xf64>) to tensor<2x1xf64>
2350050e8f0SRiver Riddle    %3 = toy.reshape(%2 : tensor<2x1xf64>) to tensor<2x1xf64>
2360050e8f0SRiver Riddle    toy.print %3 : tensor<2x1xf64>
2370050e8f0SRiver Riddle    toy.return
2385b4a01d4SMehdi Amini  }
2395b4a01d4SMehdi Amini}
2405b4a01d4SMehdi Amini```
2415b4a01d4SMehdi Amini
2425b4a01d4SMehdi AminiWe can try to run `toyc-ch3 test/trivialReshape.toy -emit=mlir -opt` and observe
2435b4a01d4SMehdi Aminiour pattern in action:
2445b4a01d4SMehdi Amini
2455b4a01d4SMehdi Amini```mlir
2465b4a01d4SMehdi Aminimodule {
2475b4a01d4SMehdi Amini  func @main() {
2480050e8f0SRiver Riddle    %0 = toy.constant dense<[[1.000000e+00], [2.000000e+00]]> : tensor<2x1xf64>
2490050e8f0SRiver Riddle    toy.print %0 : tensor<2x1xf64>
2500050e8f0SRiver Riddle    toy.return
2515b4a01d4SMehdi Amini  }
2525b4a01d4SMehdi Amini}
2535b4a01d4SMehdi Amini```
2545b4a01d4SMehdi Amini
2555b4a01d4SMehdi AminiAs expected, no reshape operations remain after canonicalization.
2565b4a01d4SMehdi Amini
2575b4a01d4SMehdi AminiFurther details on the declarative rewrite method can be found at
2585b4a01d4SMehdi Amini[Table-driven Declarative Rewrite Rule (DRR)](../../DeclarativeRewrites.md).
2595b4a01d4SMehdi Amini
2605b4a01d4SMehdi AminiIn this chapter, we saw how to use certain core transformations through always
2615b4a01d4SMehdi Aminiavailable hooks. In the [next chapter](Ch-4.md), we will see how to use generic
2625b4a01d4SMehdi Aminisolutions that scale better through Interfaces.
263