1# Using `mlir-opt` 2 3`mlir-opt` is a command-line entry point for running passes and lowerings on MLIR code. 4This tutorial will explain how to use `mlir-opt`, show some examples of its usage, 5and mention some useful tips for working with it. 6 7Prerequisites: 8 9- [Building MLIR from source](/getting_started/) 10- [MLIR Language Reference](/docs/LangRef/) 11 12[TOC] 13 14## `mlir-opt` basics 15 16The `mlir-opt` tool loads a textual IR or bytecode into an in-memory structure, 17and optionally executes a sequence of passes 18before serializing back the IR (textual form by default). 19It is intended as a testing and debugging utility. 20 21After building the MLIR project, 22the `mlir-opt` binary (located in `build/bin`) 23is the entry point for running passes and lowerings, 24as well as emitting debug and diagnostic data. 25 26Running `mlir-opt` with no flags will consume textual or bytecode IR 27from the standard input, parse and run verifiers on it, 28and write the textual format back to the standard output. 29This is a good way to test if an input MLIR is well-formed. 30 31`mlir-opt --help` shows a complete list of flags 32(there are nearly 1000). 33Each pass has its own flag, 34though it is recommended to use `--pass-pipeline` 35to run passes rather than bare flags. 36 37## Running a pass 38 39Next we run [`convert-to-llvm`](/docs/Passes/#-convert-to-llvm), 40which converts all supported dialects to the `llvm` dialect, 41on the following IR: 42 43```mlir 44// mlir/test/Examples/mlir-opt/ctlz.mlir 45module { 46 func.func @main(%arg0: i32) -> i32 { 47 %0 = math.ctlz %arg0 : i32 48 func.return %0 : i32 49 } 50} 51``` 52 53After building MLIR, and from the `llvm-project` base directory, run 54 55```bash 56build/bin/mlir-opt --pass-pipeline="builtin.module(convert-math-to-llvm)" mlir/test/Examples/mlir-opt/ctlz.mlir 57``` 58 59which produces 60 61```mlir 62module { 63 func.func @main(%arg0: i32) -> i32 { 64 %0 = "llvm.intr.ctlz"(%arg0) <{is_zero_poison = false}> : (i32) -> i32 65 return %0 : i32 66 } 67} 68``` 69 70Note that `llvm` here is MLIR's `llvm` dialect, 71which would still need to be processed through `mlir-translate` 72to generate LLVM-IR. 73 74## Running a pass with options 75 76Next we will show how to run a pass that takes configuration options. 77Consider the following IR containing loops with poor cache locality. 78 79```mlir 80// mlir/test/Examples/mlir-opt/loop_fusion.mlir 81module { 82 func.func @producer_consumer_fusion(%arg0: memref<10xf32>, %arg1: memref<10xf32>) { 83 %0 = memref.alloc() : memref<10xf32> 84 %1 = memref.alloc() : memref<10xf32> 85 %cst = arith.constant 0.000000e+00 : f32 86 affine.for %arg2 = 0 to 10 { 87 affine.store %cst, %0[%arg2] : memref<10xf32> 88 affine.store %cst, %1[%arg2] : memref<10xf32> 89 } 90 affine.for %arg2 = 0 to 10 { 91 %2 = affine.load %0[%arg2] : memref<10xf32> 92 %3 = arith.addf %2, %2 : f32 93 affine.store %3, %arg0[%arg2] : memref<10xf32> 94 } 95 affine.for %arg2 = 0 to 10 { 96 %2 = affine.load %1[%arg2] : memref<10xf32> 97 %3 = arith.mulf %2, %2 : f32 98 affine.store %3, %arg1[%arg2] : memref<10xf32> 99 } 100 return 101 } 102} 103``` 104 105Running this with the [`affine-loop-fusion`](/docs/Passes/#-affine-loop-fusion) pass 106produces a fused loop. 107 108```bash 109build/bin/mlir-opt --pass-pipeline="builtin.module(affine-loop-fusion)" mlir/test/Examples/mlir-opt/loop_fusion.mlir 110``` 111 112```mlir 113module { 114 func.func @producer_consumer_fusion(%arg0: memref<10xf32>, %arg1: memref<10xf32>) { 115 %alloc = memref.alloc() : memref<1xf32> 116 %alloc_0 = memref.alloc() : memref<1xf32> 117 %cst = arith.constant 0.000000e+00 : f32 118 affine.for %arg2 = 0 to 10 { 119 affine.store %cst, %alloc[0] : memref<1xf32> 120 affine.store %cst, %alloc_0[0] : memref<1xf32> 121 %0 = affine.load %alloc_0[0] : memref<1xf32> 122 %1 = arith.mulf %0, %0 : f32 123 affine.store %1, %arg1[%arg2] : memref<10xf32> 124 %2 = affine.load %alloc[0] : memref<1xf32> 125 %3 = arith.addf %2, %2 : f32 126 affine.store %3, %arg0[%arg2] : memref<10xf32> 127 } 128 return 129 } 130} 131``` 132 133This pass has options that allow the user to configure its behavior. 134For example, the `fusion-compute-tolerance` option 135is described as the "fractional increase in additional computation tolerated while fusing." 136If this value is set to zero on the command line, 137the pass will not fuse the loops. 138 139```bash 140build/bin/mlir-opt --pass-pipeline="builtin.module(affine-loop-fusion{fusion-compute-tolerance=0})" \ 141mlir/test/Examples/mlir-opt/loop_fusion.mlir 142``` 143 144```mlir 145module { 146 func.func @producer_consumer_fusion(%arg0: memref<10xf32>, %arg1: memref<10xf32>) { 147 %alloc = memref.alloc() : memref<10xf32> 148 %alloc_0 = memref.alloc() : memref<10xf32> 149 %cst = arith.constant 0.000000e+00 : f32 150 affine.for %arg2 = 0 to 10 { 151 affine.store %cst, %alloc[%arg2] : memref<10xf32> 152 affine.store %cst, %alloc_0[%arg2] : memref<10xf32> 153 } 154 affine.for %arg2 = 0 to 10 { 155 %0 = affine.load %alloc[%arg2] : memref<10xf32> 156 %1 = arith.addf %0, %0 : f32 157 affine.store %1, %arg0[%arg2] : memref<10xf32> 158 } 159 affine.for %arg2 = 0 to 10 { 160 %0 = affine.load %alloc_0[%arg2] : memref<10xf32> 161 %1 = arith.mulf %0, %0 : f32 162 affine.store %1, %arg1[%arg2] : memref<10xf32> 163 } 164 return 165 } 166} 167``` 168 169Options passed to a pass 170are specified via the syntax `{option1=value1 option2=value2 ...}`, 171i.e., use space-separated `key=value` pairs for each option. 172 173## Building a pass pipeline on the command line 174 175The `--pass-pipeline` flag supports combining multiple passes into a pipeline. 176So far we have used the trivial pipeline with a single pass 177that is "anchored" on the top-level `builtin.module` op. 178[Pass anchoring](/docs/PassManagement/#oppassmanager) 179is a way for passes to specify 180that they only run on particular ops. 181While many passes are anchored on `builtin.module`, 182if you try to run a pass that is anchored on some other op 183inside `--pass-pipeline="builtin.module(pass-name)"`, 184it will not run. 185 186Multiple passes can be chained together 187by providing the pass names in a comma-separated list 188in the `--pass-pipeline` string, 189e.g., 190`--pass-pipeline="builtin.module(pass1,pass2)"`. 191The passes will be run sequentially. 192 193To use passes that have nontrivial anchoring, 194the appropriate level of nesting must be specified 195in the pass pipeline. 196For example, consider the following IR which has the same redundant code, 197but in two different levels of nesting. 198 199```mlir 200module { 201 module { 202 func.func @func1(%arg0: i32) -> i32 { 203 %0 = arith.addi %arg0, %arg0 : i32 204 %1 = arith.addi %arg0, %arg0 : i32 205 %2 = arith.addi %0, %1 : i32 206 func.return %2 : i32 207 } 208 } 209 210 gpu.module @gpu_module { 211 gpu.func @func2(%arg0: i32) -> i32 { 212 %0 = arith.addi %arg0, %arg0 : i32 213 %1 = arith.addi %arg0, %arg0 : i32 214 %2 = arith.addi %0, %1 : i32 215 gpu.return %2 : i32 216 } 217 } 218} 219``` 220 221The following pipeline runs `cse` (common subexpression elimination) 222but only on the `func.func` inside the two `builtin.module` ops. 223 224```bash 225build/bin/mlir-opt mlir/test/Examples/mlir-opt/ctlz.mlir --pass-pipeline=' 226 builtin.module( 227 builtin.module( 228 func.func(cse,canonicalize), 229 convert-to-llvm 230 ) 231 )' 232``` 233 234The output leaves the `gpu.module` alone 235 236```mlir 237module { 238 module { 239 llvm.func @func1(%arg0: i32) -> i32 { 240 %0 = llvm.add %arg0, %arg0 : i32 241 %1 = llvm.add %0, %0 : i32 242 llvm.return %1 : i32 243 } 244 } 245 gpu.module @gpu_module { 246 gpu.func @func2(%arg0: i32) -> i32 { 247 %0 = arith.addi %arg0, %arg0 : i32 248 %1 = arith.addi %arg0, %arg0 : i32 249 %2 = arith.addi %0, %1 : i32 250 gpu.return %2 : i32 251 } 252 } 253} 254``` 255 256Specifying a pass pipeline with nested anchoring 257is also beneficial for performance reasons: 258passes with anchoring can run on IR subsets in parallel, 259which provides better threaded runtime and cache locality 260within threads. 261For example, 262even if a pass is not restricted to anchor on `func.func`, 263running `builtin.module(func.func(cse, canonicalize))` 264is more efficient than `builtin.module(cse, canonicalize)`. 265 266For a spec of the pass-pipeline textual description language, 267see [the docs](/docs/PassManagement/#textual-pass-pipeline-specification). 268For more general information on pass management, see [Pass Infrastructure](/docs/PassManagement/#). 269 270## Useful CLI flags 271 272- `--debug` prints all debug information produced by `LLVM_DEBUG` calls. 273- `--debug-only="my-tag"` prints only the debug information produced by `LLVM_DEBUG` 274 in files that have the macro `#define DEBUG_TYPE "my-tag"`. 275 This often allows you to print only debug information associated with a specific pass. 276 - `"greedy-rewriter"` only prints debug information 277 for patterns applied with the greedy rewriter engine. 278 - `"dialect-conversion"` only prints debug information 279 for the dialect conversion framework. 280 - `--emit-bytecode` emits MLIR in the bytecode format. 281 - `--mlir-pass-statistics` print statistics about the passes run. 282 These are generated via [pass statistics](/docs/PassManagement/#pass-statistics). 283 - `--mlir-print-ir-after-all` prints the IR after each pass. 284 - See also `--mlir-print-ir-after-change`, `--mlir-print-ir-after-failure`, 285 and analogous versions of these flags with `before` instead of `after`. 286 - When using `print-ir` flags, adding `--mlir-print-ir-tree-dir` writes the 287 IRs to files in a directory tree, making them easier to inspect versus a 288 large dump to the terminal. 289 - `--mlir-timing` displays execution times of each pass. 290 291## Further readering 292 293- [List of passes](/docs/Passes/) 294- [List of dialects](/docs/Dialects/) 295