==========================
Using the New Pass Manager
==========================

.. contents::
   :local:

Overview
========

For an overview of the new pass manager, see the `blog post
<https://blog.llvm.org/posts/2021-03-26-the-new-pass-manager/>`_.

Just Tell Me How To Run The Default Optimization Pipeline With The New Pass Manager
====================================================================================

.. code-block:: c++

  // Create the analysis managers.
  // These must be declared in this order so that they are destroyed in the
  // correct order due to inter-analysis-manager references.
  LoopAnalysisManager LAM;
  FunctionAnalysisManager FAM;
  CGSCCAnalysisManager CGAM;
  ModuleAnalysisManager MAM;

  // Create the new pass manager builder.
  // Take a look at the PassBuilder constructor parameters for more
  // customization, e.g. specifying a TargetMachine or various debugging
  // options.
  PassBuilder PB;

  // Register all the basic analyses with the managers.
  PB.registerModuleAnalyses(MAM);
  PB.registerCGSCCAnalyses(CGAM);
  PB.registerFunctionAnalyses(FAM);
  PB.registerLoopAnalyses(LAM);
  PB.crossRegisterProxies(LAM, FAM, CGAM, MAM);

  // Create the pass manager.
  // This one corresponds to a typical -O2 optimization pipeline.
  ModulePassManager MPM =
      PB.buildPerModuleDefaultPipeline(OptimizationLevel::O2);

  // Optimize the IR!
  MPM.run(MyModule, MAM);

The C API also supports most of this, see ``llvm-c/Transforms/PassBuilder.h``.
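
A minimal sketch of the equivalent C API usage might look like the following
(``M`` and ``TM`` are assumed to be an existing ``LLVMModuleRef`` and
``LLVMTargetMachineRef``; ``TM`` may be null, and the exact set of option
setters available depends on the LLVM version):

.. code-block:: c++

  // "default<O2>" is the textual form of the default -O2 pipeline; any
  // -passes style pipeline string works here.
  LLVMPassBuilderOptionsRef Options = LLVMCreatePassBuilderOptions();
  if (LLVMErrorRef Err = LLVMRunPasses(M, "default<O2>", TM, Options)) {
    // Report the failure; LLVMGetErrorMessage() consumes the error.
    char *Msg = LLVMGetErrorMessage(Err);
    fprintf(stderr, "running passes failed: %s\n", Msg);
    LLVMDisposeErrorMessage(Msg);
  }
  LLVMDisposePassBuilderOptions(Options);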

Adding Passes to a Pass Manager
===============================

For how to write a new PM pass, see :doc:`this page <WritingAnLLVMNewPMPass>`.

To add a pass to a new PM pass manager, the important thing is to match the
pass type and the pass manager type. For example, a ``FunctionPassManager``
can only contain function passes:

.. code-block:: c++

  FunctionPassManager FPM;
  // InstSimplifyPass is a function pass
  FPM.addPass(InstSimplifyPass());

If you want to add a loop pass that runs on all loops in a function to a
``FunctionPassManager``, the loop pass must be wrapped in a function pass
adaptor that goes through all the loops in the function and runs the loop
pass on each one.

.. code-block:: c++

  FunctionPassManager FPM;
  // LoopRotatePass is a loop pass
  FPM.addPass(createFunctionToLoopPassAdaptor(LoopRotatePass()));

The IR hierarchy in terms of the new PM is Module -> (CGSCC ->) Function ->
Loop, where going through a CGSCC is optional.

.. code-block:: c++

  FunctionPassManager FPM;
  // loop -> function
  FPM.addPass(createFunctionToLoopPassAdaptor(LoopFooPass()));

  CGSCCPassManager CGPM;
  // loop -> function -> cgscc
  CGPM.addPass(createCGSCCToFunctionPassAdaptor(createFunctionToLoopPassAdaptor(LoopFooPass())));
  // function -> cgscc
  CGPM.addPass(createCGSCCToFunctionPassAdaptor(FunctionFooPass()));

  ModulePassManager MPM;
  // loop -> function -> module
  MPM.addPass(createModuleToFunctionPassAdaptor(createFunctionToLoopPassAdaptor(LoopFooPass())));
  // function -> module
  MPM.addPass(createModuleToFunctionPassAdaptor(FunctionFooPass()));

  // loop -> function -> cgscc -> module
  MPM.addPass(createModuleToPostOrderCGSCCPassAdaptor(createCGSCCToFunctionPassAdaptor(createFunctionToLoopPassAdaptor(LoopFooPass()))));
  // function -> cgscc -> module
  MPM.addPass(createModuleToPostOrderCGSCCPassAdaptor(createCGSCCToFunctionPassAdaptor(FunctionFooPass())));

A pass manager of a specific IR unit is also a pass of that kind. For
example, a ``FunctionPassManager`` is a function pass, meaning it can be
added to a ``ModulePassManager``:

.. code-block:: c++

  ModulePassManager MPM;

  FunctionPassManager FPM;
  // InstSimplifyPass is a function pass
  FPM.addPass(InstSimplifyPass());

  MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM)));

Generally you want to group CGSCC/function/loop passes together in a single
pass manager, as opposed to wrapping each pass in its own adaptor and adding
every adaptor to the containing upper level pass manager. For example,

.. code-block:: c++

  ModulePassManager MPM;
  MPM.addPass(createModuleToFunctionPassAdaptor(FunctionPass1()));
  MPM.addPass(createModuleToFunctionPassAdaptor(FunctionPass2()));
  MPM.run();

will run ``FunctionPass1`` on each function in the module, then run
``FunctionPass2`` on each function in the module. In contrast,

.. code-block:: c++

  ModulePassManager MPM;

  FunctionPassManager FPM;
  FPM.addPass(FunctionPass1());
  FPM.addPass(FunctionPass2());

  MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM)));

will run ``FunctionPass1`` and ``FunctionPass2`` on the first function in the
module, then run both passes on the second function, and so on. This is
better for cache locality around LLVM data structures. The same applies to
the other IR types, and in some cases it can even affect the quality of
optimization. For example, running all loop passes on a loop before moving on
to the next loop may allow a later loop to be optimized more than if each
loop pass were run over every loop separately.
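
The same grouping applies at the CGSCC level. As a sketch (reusing the
hypothetical ``FunctionPass1`` and ``FunctionPass2``), function passes can be
grouped in a ``FunctionPassManager`` nested inside a ``CGSCCPassManager``,
which is in turn added to the module pipeline through the corresponding
adaptors:

.. code-block:: c++

  CGSCCPassManager CGPM;

  FunctionPassManager FPM;
  FPM.addPass(FunctionPass1());
  FPM.addPass(FunctionPass2());

  // The grouped function passes run together on each function of an SCC
  // before moving on to the next SCC.
  CGPM.addPass(createCGSCCToFunctionPassAdaptor(std::move(FPM)));

  ModulePassManager MPM;
  MPM.addPass(createModuleToPostOrderCGSCCPassAdaptor(std::move(CGPM)));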

Inserting Passes into Default Pipelines
=======================================

Rather than manually adding passes to a pass manager, the typical way of
creating a pass manager is to use a ``PassBuilder`` and call something like
``PassBuilder::buildPerModuleDefaultPipeline()`` which creates a typical
pipeline for a given optimization level.

Sometimes either frontends or backends will want to inject passes into the
pipeline. For example, frontends may want to add instrumentation, and target
backends may want to add passes that lower custom intrinsics. For these
cases, ``PassBuilder`` exposes callbacks that allow injecting passes into
certain parts of the pipeline. For example,

.. code-block:: c++

  PassBuilder PB;
  PB.registerPipelineStartEPCallback(
      [&](ModulePassManager &MPM, OptimizationLevel Level) {
        MPM.addPass(FooPass());
      });

will add ``FooPass`` near the very beginning of the pipeline for pass
managers created by that ``PassBuilder``. See the documentation for
``PassBuilder`` for the various places that passes can be added.

If a ``PassBuilder`` has a corresponding ``TargetMachine`` for a backend, it
will call ``TargetMachine::registerPassBuilderCallbacks()`` to allow the
backend to inject passes into the pipeline.

Clang's ``BackendUtil.cpp`` shows examples of a frontend adding (mostly
sanitizer) passes to various parts of the pipeline.
``AMDGPUTargetMachine::registerPassBuilderCallbacks()`` is an example of a
backend adding passes to various parts of the pipeline.

Pass plugins can also add passes into default pipelines. Different tools have
different ways of loading dynamic pass plugins. For example, ``opt
-load-pass-plugin=path/to/plugin.so`` loads a pass plugin into ``opt``. For
information on writing a pass plugin, see :doc:`WritingAnLLVMNewPMPass`.
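
As a rough sketch (the plugin name and ``FooPass`` are hypothetical), a
dynamically loaded plugin exposes an ``llvmGetPassPluginInfo()`` entry point
and registers its passes through the same ``PassBuilder`` callbacks:

.. code-block:: c++

  #include "llvm/Passes/PassBuilder.h"
  #include "llvm/Passes/PassPlugin.h"

  using namespace llvm;

  extern "C" LLVM_ATTRIBUTE_WEAK PassPluginLibraryInfo llvmGetPassPluginInfo() {
    return {LLVM_PLUGIN_API_VERSION, "MyPlugin", "v0.1",
            [](PassBuilder &PB) {
              // Run the (hypothetical) FooPass near the start of the default
              // pipelines built by this PassBuilder.
              PB.registerPipelineStartEPCallback(
                  [](ModulePassManager &MPM, OptimizationLevel Level) {
                    MPM.addPass(FooPass());
                  });
            }};
  }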

Using Analyses
==============

LLVM provides many analyses that passes can use, such as a dominator tree.
Calculating these can be expensive, so the new pass manager has
infrastructure to cache analyses and reuse them when possible.

When a pass runs on some IR, it also receives an analysis manager which it
can query for analyses. Querying for an analysis will cause the manager to
check if it has already computed the result for the requested IR. If it
already has and the result is still valid, it will return that. Otherwise it
will construct a new result by calling the analysis's ``run()`` method, cache
it, and return it. You can also ask the analysis manager to only return an
analysis if it's already cached.
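
For example, a function pass (``FooPass`` here is hypothetical) might query
its ``FunctionAnalysisManager`` like this:

.. code-block:: c++

  PreservedAnalyses FooPass::run(Function &F, FunctionAnalysisManager &AM) {
    // Computes the dominator tree for F now if no valid cached result exists.
    DominatorTree &DT = AM.getResult<DominatorTreeAnalysis>(F);

    // Only returns loop info if it happens to be cached already; may be null.
    LoopInfo *LI = AM.getCachedResult<LoopAnalysis>(F);

    // ... use DT (and LI if non-null) ...
    return PreservedAnalyses::all();
  }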

The analysis manager only provides analysis results for the same IR type as
what the pass runs on. For example, a function pass receives an analysis
manager that only provides function-level analyses. This works for many
passes which work on a fixed scope. However, some passes want to peek up or
down the IR hierarchy. For example, an SCC pass may want to look at function
analyses for the functions inside the SCC. Or it may want to look at some
immutable global analysis. In these cases, the analysis manager can provide a
proxy to an outer or inner level analysis manager. For example, to get a
``FunctionAnalysisManager`` from a ``CGSCCAnalysisManager``, you can call

.. code-block:: c++

  FunctionAnalysisManager &FAM =
      AM.getResult<FunctionAnalysisManagerCGSCCProxy>(InitialC, CG)
          .getManager();

and use ``FAM`` as a typical ``FunctionAnalysisManager`` that a function pass
would have access to. To get access to an outer level IR analysis, you can
call

.. code-block:: c++

  const auto &MAMProxy =
      AM.getResult<ModuleAnalysisManagerCGSCCProxy>(InitialC, CG);
  FooAnalysisResult *AR = MAMProxy.getCachedResult<FooAnalysis>(M);

Asking for a cached and immutable outer level IR analysis works via
``getCachedResult()``, but getting direct access to an outer level IR
analysis manager to compute an outer level IR analysis is not allowed. This
is for a couple of reasons.

The first reason is that running analyses over outer level IR from inner
level IR passes can result in quadratic compile time behavior. For example, a
module analysis often scans every function, so allowing function passes to
run a module analysis may cause us to scan functions a quadratic number of
times. If passes could keep outer level analyses up to date rather than
computing them on demand, this wouldn't be an issue, but ensuring that every
pass updates all outer level analyses would be a lot of work, and so far this
hasn't been necessary and there isn't infrastructure for it (aside from
function analyses in loop passes as described below). Self-updating analyses
that gracefully degrade also handle this problem (e.g. GlobalsAA), but they
run into the issue of having to be manually recomputed somewhere in the
optimization pipeline if we want precision, and they block potential future
concurrency.

The second reason is potential future pass concurrency, for example
parallelizing function passes over different functions in a CGSCC or module.
Since passes can ask for a cached analysis result, allowing passes to trigger
outer level analysis computation could result in non-determinism if
concurrency was supported. A related limitation is that outer level IR
analyses that are used must be immutable, or else they could be invalidated
by changes to inner level IR. Outer analyses unused by inner passes can and
often will be invalidated by changes to inner level IR. These invalidations
happen after the inner pass manager finishes, so accessing mutable outer
analyses during the inner passes would give invalid results.

The exception to not being able to access outer level analyses is accessing
function analyses in loop passes. Loop passes often use function analyses
such as the dominator tree. Loop passes inherently require modifying the
function the loop is in, and that includes some of the function analyses the
loop analyses depend on. This rules out future concurrency over separate
loops in a function, but that's a tradeoff due to how tightly a loop and its
function are coupled. To make sure the function analyses that loop passes use
are valid, they are manually updated in the loop passes so that invalidation
is not necessary. There is a fixed set of common function analyses that loop
passes and analyses have access to, which is passed into loop passes as a
``LoopStandardAnalysisResults`` parameter. Other mutable function analyses
are not accessible from loop passes.
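
To illustrate, a loop pass (``FooLoopPass`` here is hypothetical) receives
those function-level results directly through its ``run()`` signature rather
than through a proxy:

.. code-block:: c++

  PreservedAnalyses FooLoopPass::run(Loop &L, LoopAnalysisManager &AM,
                                     LoopStandardAnalysisResults &AR,
                                     LPMUpdater &U) {
    // Common function analyses arrive through AR, e.g. the dominator tree
    // and loop info.
    DominatorTree &DT = AR.DT;
    LoopInfo &LI = AR.LI;

    // ... use DT and LI while working on L; a transformation would also have
    // to keep them up to date and report what it preserves ...
    return PreservedAnalyses::all();
  }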

As with any caching mechanism, we need some way to tell analysis managers
when results are no longer valid. Much of the analysis manager complexity
comes from trying to invalidate as few analysis results as possible to keep
compile times as low as possible.

There are two ways to deal with potentially invalid analysis results. One is
to simply force clear the results. This should generally only be used when
the IR that the result is keyed on becomes invalid. For example, when a
function is deleted, or when a CGSCC has become invalid due to call graph
changes.

The typical way to invalidate analysis results is for a pass to declare what
types of analyses it preserves and what types it does not. When transforming
IR, a pass either has the option to update analyses alongside the IR
transformation, or to tell the analysis manager that analyses are no longer
valid and should be invalidated. If a pass wants to keep some specific
analysis up to date, such as when updating it would be faster than
invalidating and recalculating it, the analysis itself may have methods to
update it for specific transformations, or there may be helper updaters like
``DomTreeUpdater`` for a ``DominatorTree``. Otherwise, to mark some analysis
as no longer valid, the pass can return a ``PreservedAnalyses`` with the
proper analyses invalidated.

.. code-block:: c++

  // We've made no transformations that can affect any analyses.
  return PreservedAnalyses::all();

  // We've made transformations and don't want to bother updating any
  // analyses.
  return PreservedAnalyses::none();

  // We've specifically updated the dominator tree alongside any
  // transformations, but other analysis results may be invalid.
  PreservedAnalyses PA;
  PA.preserve<DominatorTreeAnalysis>();
  return PA;

  // We haven't made any control flow changes, so any analyses that only care
  // about the control flow are still valid.
  PreservedAnalyses PA;
  PA.preserveSet<CFGAnalyses>();
  return PA;

The pass manager will call the analysis manager's ``invalidate()`` method
with the pass's returned ``PreservedAnalyses``. This can also be done
manually within the pass:

.. code-block:: c++

  PreservedAnalyses FooModulePass::run(Module &M, ModuleAnalysisManager &AM) {
    auto &FAM =
        AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();

    // Invalidate all analysis results for function F1.
    FAM.invalidate(F1, PreservedAnalyses::none());

    // Invalidate all analysis results across the entire module.
    AM.invalidate(M, PreservedAnalyses::none());

    // Clear the entry in the analysis manager for function F2 if we've
    // completely removed it from the module.
    FAM.clear(F2, F2.getName());

    ...
  }

One thing to note when accessing inner level IR analyses is cached results
for deleted IR. If a function is deleted in a module pass, its address is
still used as the key for cached analyses. Take care in the pass to either
clear the results for that function or not use inner analyses at all.

``AM.invalidate(M, PreservedAnalyses::none());`` will invalidate the inner
analysis manager proxy, which will clear all cached analyses, conservatively
assuming that there are invalid addresses used as keys for cached analyses.
However, if you'd like to be more selective about which analyses are
cached/invalidated, you can mark the analysis manager proxy as preserved,
essentially saying that all deleted entries have been taken care of manually.
This should only be done with measurable compile time gains, as it can be
tricky to make sure all the right analyses are invalidated.
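
As a sketch of that selective approach (``RemoveFunctionPass`` and the
deleted function ``F`` are hypothetical), a module pass that deletes a
function can clear the deleted entries itself and then mark the proxy as
preserved:

.. code-block:: c++

  PreservedAnalyses RemoveFunctionPass::run(Module &M,
                                            ModuleAnalysisManager &AM) {
    auto &FAM =
        AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();

    // F is some function this pass has decided to delete (details elided).
    // Drop its cached analyses before erasing it from the module.
    FAM.clear(F, F.getName());
    F.eraseFromParent();

    PreservedAnalyses PA;
    // The deleted function's entries were cleared manually above, so the
    // proxy itself can be marked preserved; this avoids conservatively
    // clearing every cached function analysis.
    PA.preserve<FunctionAnalysisManagerModuleProxy>();
    // The remaining functions were not modified, so their analyses are still
    // valid as well.
    PA.preserveSet<AllAnalysesOn<Function>>();
    return PA;
  }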

Implementing Analysis Invalidation
==================================

By default, an analysis is invalidated if ``PreservedAnalyses`` says that
analyses on the IR unit it runs on are not preserved (see
``AnalysisResultModel::invalidate()``). An analysis can implement
``invalidate()`` to be more selective about when it really needs to be
invalidated. For example,

.. code-block:: c++

  bool FooAnalysisResult::invalidate(Function &F, const PreservedAnalyses &PA,
                                     FunctionAnalysisManager::Invalidator &) {
    auto PAC = PA.getChecker<FooAnalysis>();
    // The default would be:
    // return !(PAC.preserved() ||
    //          PAC.preservedSet<AllAnalysesOn<Function>>());
    return !(PAC.preserved() || PAC.preservedSet<AllAnalysesOn<Function>>() ||
             PAC.preservedSet<CFGAnalyses>());
  }

says that if the ``PreservedAnalyses`` specifically preserves
``FooAnalysis``, or if ``PreservedAnalyses`` preserves all analyses (implicit
in ``PAC.preserved()``), or if ``PreservedAnalyses`` preserves all function
analyses, or if ``PreservedAnalyses`` preserves all analyses that only care
about the CFG, then ``FooAnalysisResult`` should not be invalidated.

If an analysis is stateless and generally shouldn't be invalidated, use the
following:

.. code-block:: c++

  bool FooAnalysisResult::invalidate(Function &F, const PreservedAnalyses &PA,
                                     FunctionAnalysisManager::Invalidator &) {
    // Check whether the analysis has been explicitly invalidated. Otherwise,
    // it's stateless and remains preserved.
    auto PAC = PA.getChecker<FooAnalysis>();
    return !PAC.preservedWhenStateless();
  }

If an analysis depends on other analyses, those analyses also need to be
checked for invalidation:

.. code-block:: c++

  bool FooAnalysisResult::invalidate(Function &F, const PreservedAnalyses &PA,
                                     FunctionAnalysisManager::Invalidator &Inv) {
    auto PAC = PA.getChecker<FooAnalysis>();
    if (!PAC.preserved() && !PAC.preservedSet<AllAnalysesOn<Function>>())
      return true;

    // Check transitive dependencies.
    return Inv.invalidate<BarAnalysis>(F, PA) ||
           Inv.invalidate<BazAnalysis>(F, PA);
  }

Combining invalidation and analysis manager proxies results in some
complexity. For example, when we invalidate all analyses in a module pass,
we have to make sure that we also invalidate function analyses accessible via
any existing inner proxies. The inner proxy's ``invalidate()`` first checks
if the proxy itself should be invalidated. If so, that means the proxy may
contain pointers to IR that is no longer valid, meaning that the inner proxy
needs to completely clear all relevant analysis results. Otherwise the proxy
simply forwards the invalidation to the inner analysis manager.

Generally for outer proxies, analysis results from the outer analysis manager
should be immutable, so invalidation shouldn't be a concern. However, it is
possible for some inner analysis to depend on some outer analysis, and when
the outer analysis is invalidated, we need to make sure that dependent inner
analyses are also invalidated. This actually happens with alias analysis
results. Alias analysis is a function-level analysis, but there are
module-level implementations of specific types of alias analysis. Currently
``GlobalsAA`` is the only module-level alias analysis and it generally is not
invalidated, so this is not much of a concern. See
``OuterAnalysisManagerProxy::Result::registerOuterAnalysisInvalidation()``
for more details.

Invoking ``opt``
================

.. code-block:: shell

  $ opt -passes='pass1,pass2' /tmp/a.ll -S
  # -p is an alias for -passes
  $ opt -p pass1,pass2 /tmp/a.ll -S

The new PM typically requires explicit pass nesting. For example, to run a
function pass, then a module pass, we need to wrap the function pass in a
module adaptor:

.. code-block:: shell

  $ opt -passes='function(no-op-function),no-op-module' /tmp/a.ll -S

A more complete example, using ``-debug-pass-manager`` to show the execution
order:

.. code-block:: shell

  $ opt -passes='no-op-module,cgscc(no-op-cgscc,function(no-op-function,loop(no-op-loop))),function(no-op-function,loop(no-op-loop))' /tmp/a.ll -S -debug-pass-manager

Improper nesting can lead to error messages such as

.. code-block:: shell

  $ opt -passes='no-op-function,no-op-module' /tmp/a.ll -S
  opt: unknown function pass 'no-op-module'

The nesting is: module (-> cgscc) -> function -> loop, where the CGSCC
nesting is optional.

There are a couple of special cases for easier typing:

* If the first pass is not a module pass, a pass manager of the first pass is
  implicitly created.

  * For example, the following are equivalent:

.. code-block:: shell

  $ opt -passes='no-op-function,no-op-function' /tmp/a.ll -S
  $ opt -passes='function(no-op-function,no-op-function)' /tmp/a.ll -S

* If there is an adaptor for a pass that lets it fit in the previous pass
  manager, that adaptor is implicitly created.

  * For example, the following are equivalent:

.. code-block:: shell

  $ opt -passes='no-op-function,no-op-loop' /tmp/a.ll -S
  $ opt -passes='no-op-function,loop(no-op-loop)' /tmp/a.ll -S

For a list of available passes and analyses, including the IR unit (module,
CGSCC, function, loop) they operate on, run

.. code-block:: shell

  $ opt --print-passes

or take a look at ``PassRegistry.def``.

To make sure an analysis named ``foo`` is available before a pass, add
``require<foo>`` to the pass pipeline. This adds a pass that simply requests
that the analysis is run. This pass is also subject to proper nesting. For
example, to make sure some function analysis is already computed for all
functions before a module pass:

.. code-block:: shell

  $ opt -passes='function(require<my-function-analysis>),my-module-pass' /tmp/a.ll -S

Status of the New and Legacy Pass Managers
==========================================

LLVM currently contains two pass managers, the legacy PM and the new PM. The
optimization pipeline (aka the middle-end) uses the new PM, whereas the
backend target-dependent code generation uses the legacy PM.

The legacy PM still somewhat works with the optimization pipeline, but this
is deprecated and there are ongoing efforts to remove its usage.

Some IR passes are considered part of the backend codegen pipeline even if
they are LLVM IR passes (whereas all MIR passes are codegen passes). This
includes anything added via ``TargetPassConfig`` hooks, e.g.
``TargetPassConfig::addCodeGenPrepare()``.

The ``TargetMachine::adjustPassManager()`` function that was used to extend a
legacy PM with passes on a per-target basis has been removed. It was mainly
used from ``opt``, but since support for using the default pipelines has been
removed from ``opt``, the function is no longer needed. In the new PM, such
adjustments are done via ``TargetMachine::registerPassBuilderCallbacks()``.

Currently there are efforts to make the codegen pipeline work with the new
PM.