=============================================
Building a JIT: Per-function Lazy Compilation
=============================================

.. contents::
   :local:

**This tutorial is under active development. It is incomplete and details may
change frequently.** Nonetheless we invite you to try it out as it stands, and
we welcome any feedback.

Chapter 3 Introduction
======================

**Warning: This text is currently out of date due to ORC API updates.**

**The example code has been updated and can be used. The text will be updated
once the API churn dies down.**

Welcome to Chapter 3 of the "Building an ORC-based JIT in LLVM" tutorial. This
chapter discusses lazy JITing and shows you how to enable it by adding an ORC
CompileOnDemand layer to the JIT from `Chapter 2 <BuildingAJIT2.html>`_.

Lazy Compilation
================

When we add a module to the KaleidoscopeJIT class from Chapter 2 it is
immediately optimized, compiled and linked for us by the IRTransformLayer,
IRCompileLayer and RTDyldObjectLinkingLayer respectively. This scheme, where all the
work to make a Module executable is done up front, is simple to understand and
its performance characteristics are easy to reason about. However, it will lead
to very high startup times if the amount of code to be compiled is large, and
may also do a lot of unnecessary compilation if only a few compiled functions
are ever called at runtime.
A truly "just-in-time" compiler should allow us to
defer the compilation of any given function until the moment that function is
first called, improving launch times and eliminating redundant work. In fact,
the ORC APIs provide us with a layer to lazily compile LLVM IR:
*CompileOnDemandLayer*.

The CompileOnDemandLayer class conforms to the layer interface described in
Chapter 2, but its addModule method behaves quite differently from the layers
we have seen so far: rather than doing any work up front, it just scans the
Modules being added and arranges for each function in them to be compiled the
first time it is called. To do this, the CompileOnDemandLayer creates two small
utilities for each function that it scans: a *stub* and a *compile
callback*. The stub is a pair of a function pointer (which will be pointed at
the function's implementation once the function has been compiled) and an
indirect jump through the pointer. By fixing the address of the indirect jump
for the lifetime of the program we can give the function a permanent "effective
address", one that can be safely used for indirection and function pointer
comparison even if the function's implementation is never compiled, or if it is
compiled more than once (due to, for example, recompiling the function at a
higher optimization level) and changes address. The second utility, the compile
callback, represents a re-entry point from the program into the compiler that
will trigger compilation and then execution of a function.
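
To make the stub and compile callback concrete, here is a minimal sketch in
plain C++ that imitates the mechanism with an ordinary function pointer. It
does not use any ORC APIs, and every name in it (``squareStub``,
``squarePtr``, ``squareCompileCallback``, ``squareImpl``, ``compileCount``) is
hypothetical. Real stubs are small pieces of machine code rather than C++
functions, and "compiling" here is only simulated by a counter.

```cpp
#include <cassert>

// Hypothetical illustration only; none of these names are part of ORC.
using FnPtr = int (*)(int);

// Counts how many times the simulated compiler has run.
int compileCount = 0;

// Stands in for the machine code a real JIT would produce on demand.
int squareImpl(int X) { return X * X; }

// Forward declaration so the pointer below can start out aimed at it.
int squareCompileCallback(int X);

// Pointer half of the stub: initially aimed at the compile callback.
FnPtr squarePtr = &squareCompileCallback;

// Jump half of the stub: its address never changes, so callers can hold
// and compare &squareStub no matter where the implementation ends up.
int squareStub(int X) {
  assert(squarePtr != nullptr && "stub pointer must always have a target");
  return squarePtr(X);
}

// Compile callback: "compile" the function, repoint the stub, then run it.
int squareCompileCallback(int X) {
  ++compileCount;          // simulated compilation
  squarePtr = &squareImpl; // later calls bypass the callback
  return squareImpl(X);    // finish the call that triggered compilation
}
```

In this sketch the first call, for example ``squareStub(3)``, goes through the
callback, which increments ``compileCount`` and repoints ``squarePtr``. Later
calls such as ``squareStub(4)`` jump straight to ``squareImpl``, while the
address of ``squareStub`` stays the same throughout.
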
By initializing the
function's stub to point at the function's compile callback, we enable lazy
compilation: The first attempted call to the function will follow the function
pointer and trigger the compile callback instead. The compile callback will
compile the function, update the function pointer for the stub, then execute
the function. On all subsequent calls to the function, the function pointer
will point at the already-compiled function, so there is no further overhead
from the compiler. We will look at this process in more detail in the next
chapter of this tutorial, but for now we'll trust the CompileOnDemandLayer to
set all the stubs and callbacks up for us. All we need to do is to add the
CompileOnDemandLayer to the top of our stack and we'll get the benefits of
lazy compilation. We just need a few changes to the source:

.. code-block:: c++

  ...
  #include "llvm/ExecutionEngine/SectionMemoryManager.h"
  #include "llvm/ExecutionEngine/Orc/CompileOnDemandLayer.h"
  #include "llvm/ExecutionEngine/Orc/CompileUtils.h"
  ...

  ...
  class KaleidoscopeJIT {
  private:
    std::unique_ptr<TargetMachine> TM;
    const DataLayout DL;
    RTDyldObjectLinkingLayer ObjectLayer;
    IRCompileLayer<decltype(ObjectLayer), SimpleCompiler> CompileLayer;

    using OptimizeFunction =
        std::function<std::shared_ptr<Module>(std::shared_ptr<Module>)>;

    IRTransformLayer<decltype(CompileLayer), OptimizeFunction> OptimizeLayer;

    // New in this chapter: the callback manager and the lazy layer on top.
    std::unique_ptr<JITCompileCallbackManager> CompileCallbackManager;
    CompileOnDemandLayer<decltype(OptimizeLayer)> CODLayer;

  public:
    using ModuleHandle = decltype(CODLayer)::ModuleHandleT;

First we need to include the CompileOnDemandLayer.h header, then add two new
members to our class: a std::unique_ptr<JITCompileCallbackManager> and a
CompileOnDemandLayer. The CompileCallbackManager member is used by the
CompileOnDemandLayer to create the compile callback needed for each function.

.. code-block:: c++

  KaleidoscopeJIT()
      : TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()),
        ObjectLayer([]() { return std::make_shared<SectionMemoryManager>(); }),
        CompileLayer(ObjectLayer, SimpleCompiler(*TM)),
        OptimizeLayer(CompileLayer,
                      [this](std::shared_ptr<Module> M) {
                        return optimizeModule(std::move(M));
                      }),
        // The second argument (0) is the error handler address.
        CompileCallbackManager(
            orc::createLocalCompileCallbackManager(TM->getTargetTriple(), 0)),
        CODLayer(OptimizeLayer,
                 // Partitioning function: compile only the called function.
                 [this](Function &F) { return std::set<Function*>({&F}); },
                 *CompileCallbackManager,
                 orc::createLocalIndirectStubsManagerBuilder(
                   TM->getTargetTriple())) {
    llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr);
  }

Next we have to update our constructor to initialize the new members. To create
an appropriate compile callback manager we use the
createLocalCompileCallbackManager function, which takes the target triple and a
JITTargetAddress to call if it receives a request to compile an unknown
function. In our simple JIT this situation is unlikely to come up, so we'll
cheat and just pass '0' here. In a production quality JIT you could give the
address of a function that throws an exception in order to unwind the JIT'd
code's stack.

Now we can construct our CompileOnDemandLayer. Following the pattern from
previous layers we start by passing a reference to the next layer down in our
stack -- the OptimizeLayer.
Next we need to supply a 'partitioning function':
when a not-yet-compiled function is called, the CompileOnDemandLayer will call
this function to ask us what we would like to compile. At a minimum we need to
compile the function being called (given by the argument to the partitioning
function), but we could also request that the CompileOnDemandLayer compile other
functions that are unconditionally called (or highly likely to be called) from
the function being called. For KaleidoscopeJIT we'll keep it simple and just
request compilation of the function that was called. Next we pass a reference to
our CompileCallbackManager. Finally, we need to supply an "indirect stubs
manager builder": a utility function that constructs IndirectStubsManagers, which
are in turn used to build the stubs for the functions in each module. The
CompileOnDemandLayer will call the indirect stubs manager builder once for each
call to addModule, and use the resulting indirect stubs manager to create
stubs for all functions in the module. If/when the module is
removed from the JIT the indirect stubs manager will be deleted, freeing any
memory allocated to the stubs. We supply this function by using the
createLocalIndirectStubsManagerBuilder utility.

.. code-block:: c++

  // ...
  if (auto Sym = CODLayer.findSymbol(Name, false))
  // ...
  return cantFail(CODLayer.addModule(std::move(Ms),
                                     std::move(Resolver)));
  // ...

  // ...
  return CODLayer.findSymbol(MangledNameStream.str(), true);
  // ...

  // ...
  CODLayer.removeModule(H);
  // ...

Finally, we need to replace the references to OptimizeLayer in our addModule,
findSymbol, and removeModule methods with references to CODLayer. With that,
we're up and running.

**To be done:**

**Chapter conclusion.**

Full Code Listing
=================

Here is the complete code listing for our running example with a CompileOnDemand
layer added to enable lazy function-at-a-time compilation. To build this example, use:

.. code-block:: bash

    # Compile
    clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orcjit native` -O3 -o toy
    # Run
    ./toy

Here is the code:

.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter3/KaleidoscopeJIT.h
   :language: c++

`Next: Extreme Laziness -- Using Compile Callbacks to JIT directly from ASTs <BuildingAJIT4.html>`_