====================================
JITLink and ORC's ObjectLinkingLayer
====================================

.. contents::
   :local:

Introduction
============

This document aims to provide a high-level overview of the design and API
of the JITLink library. It assumes some familiarity with linking and
relocatable object files, but should not require deep expertise. If you know
what a section, symbol, and relocation are, you should find this document
accessible. If it is not, please submit a patch (:doc:`Contributing`) or file a
bug (:doc:`HowToSubmitABug`).

JITLink is a library for :ref:`jit_linking`. It was built to support the ORC JIT
APIs and is most commonly accessed via ORC's ObjectLinkingLayer API. JITLink was
developed with the aim of supporting the full set of features provided by each
object format, including static initializers, exception handling, thread local
variables, and language runtime registration. Supporting these features enables
ORC to execute code generated from source languages which rely on these features
(e.g. C++ requires object format support for static initializers to support
static constructors, eh-frame registration for exceptions, and TLV support for
thread locals; Swift and Objective-C require language runtime registration for
many features). For some object format features support is provided entirely
within JITLink, and for others it is provided in cooperation with the
(prototype) ORC runtime.

JITLink aims to support the following features, some of which are still under
development:

1. Cross-process and cross-architecture linking of single relocatable objects
   into a target *executor* process.

2. Support for all object format features.

3. Open linker data structures (``LinkGraph``) and pass system.

JITLink and ObjectLinkingLayer
==============================

``ObjectLinkingLayer`` is ORC's wrapper for JITLink.
It is an ORC layer that
allows objects to be added to a ``JITDylib``, or emitted from some higher level
program representation. When an object is emitted, ``ObjectLinkingLayer`` uses
JITLink to construct a ``LinkGraph`` (see :ref:`constructing_linkgraphs`) and
calls JITLink's ``link`` function to link the graph into the executor process.

The ``ObjectLinkingLayer`` class provides a plugin API,
``ObjectLinkingLayer::Plugin``, which users can subclass in order to inspect and
modify ``LinkGraph`` instances at link time, and react to important JIT events
(such as an object being emitted into target memory). This enables many features
and optimizations that were not possible under MCJIT or RuntimeDyld.

ObjectLinkingLayer Plugins
--------------------------

The ``ObjectLinkingLayer::Plugin`` class provides the following methods:

* ``modifyPassConfig`` is called each time a ``LinkGraph`` is about to be
  linked. It can be overridden to install JITLink *Passes* to run during the
  link process.

  .. code-block:: c++

    void modifyPassConfig(MaterializationResponsibility &MR,
                          const Triple &TT,
                          jitlink::PassConfiguration &Config)

* ``notifyLoaded`` is called before the link begins, and can be overridden to
  set up any initial state for the given ``MaterializationResponsibility`` if
  needed.

  .. code-block:: c++

    void notifyLoaded(MaterializationResponsibility &MR)

* ``notifyEmitted`` is called after the link is complete and code has been
  emitted to the executor process. It can be overridden to finalize state
  for the ``MaterializationResponsibility`` if needed.

  .. code-block:: c++

    Error notifyEmitted(MaterializationResponsibility &MR)

* ``notifyFailed`` is called if the link fails at any point. It can be
  overridden to react to the failure (e.g. to deallocate any already allocated
  resources).

  .. code-block:: c++

    Error notifyFailed(MaterializationResponsibility &MR)

* ``notifyRemovingResources`` is called when a request is made to remove any
  resources associated with the ``ResourceKey`` *K* for the
  ``MaterializationResponsibility``.

  .. code-block:: c++

    Error notifyRemovingResources(ResourceKey K)

* ``notifyTransferringResources`` is called if/when a request is made to
  transfer tracking of any resources associated with ``ResourceKey``
  *SrcKey* to *DstKey*.

  .. code-block:: c++

    void notifyTransferringResources(ResourceKey DstKey,
                                     ResourceKey SrcKey)

Plugin authors are required to implement the ``notifyFailed``,
``notifyRemovingResources``, and ``notifyTransferringResources`` methods in
order to safely manage resources in the case of resource removal or transfer,
or link failure. If no resources are managed by the plugin then these methods
can be implemented as no-ops (returning ``Error::success()`` where the method
returns an ``Error``).

Plugin instances are added to an ``ObjectLinkingLayer`` by
calling the ``addPlugin`` method [1]_. E.g.

.. code-block:: c++

  // Plugin class to print the set of defined symbols in an object when that
  // object is linked.
  class MyPlugin : public ObjectLinkingLayer::Plugin {
  public:

    // Add passes to print the set of defined symbols after dead-stripping.
    void modifyPassConfig(MaterializationResponsibility &MR,
                          const Triple &TT,
                          jitlink::PassConfiguration &Config) override {
      Config.PostPrunePasses.push_back([this](jitlink::LinkGraph &G) {
        return printAllSymbols(G);
      });
    }

    // Implement mandatory overrides:
    Error notifyFailed(MaterializationResponsibility &MR) override {
      return Error::success();
    }
    Error notifyRemovingResources(ResourceKey K) override {
      return Error::success();
    }
    void notifyTransferringResources(ResourceKey DstKey,
                                     ResourceKey SrcKey) override {}

    // JITLink pass to print all defined symbols in G.
    Error printAllSymbols(jitlink::LinkGraph &G) {
      for (auto *Sym : G.defined_symbols())
        if (Sym->hasName())
          dbgs() << Sym->getName() << "\n";
      return Error::success();
    }
  };

  // Create our LLJIT instance using a custom object linking layer setup.
  // This gives us a chance to install our plugin.
  auto J = ExitOnErr(LLJITBuilder()
             .setObjectLinkingLayerCreator(
               [](ExecutionSession &ES, const Triple &T) {
                 // Manually set up the ObjectLinkingLayer for our LLJIT
                 // instance.
                 auto OLL = std::make_unique<ObjectLinkingLayer>(
                     ES, std::make_unique<jitlink::InProcessMemoryManager>());

                 // Install our plugin:
                 OLL->addPlugin(std::make_unique<MyPlugin>());

                 return OLL;
               })
             .create());

  // Add an object to the JIT. Nothing happens here: linking isn't triggered
  // until we look up some symbol in our object.
  ExitOnErr(J->addObject(loadFromDisk("main.o")));

  // Plugin triggers here when our lookup of main triggers linking of main.o
  auto MainSym = ExitOnErr(J->lookup("main"));

LinkGraph
=========

JITLink maps all relocatable object formats to a generic ``LinkGraph`` type
that is designed to make linking fast and easy. ``LinkGraph`` instances can
also be created manually (see :ref:`constructing_linkgraphs`).

Relocatable object formats (e.g. COFF, ELF, MachO) differ in their details,
but share a common goal: to represent machine level code and data with
annotations that allow them to be relocated in a virtual address space. To
this end they usually contain names (symbols) for content defined inside the
file or externally, chunks of content that must be moved as a unit (sections
or subsections, depending on the format), and annotations describing how to
patch content based on the final address of some target symbol/section
(relocations).

At a high level, the ``LinkGraph`` type represents these concepts as a decorated
graph. Nodes in the graph represent symbols and content, and edges represent
relocations. Each of the elements of the graph is listed here:

* ``Addressable`` -- A node in the link graph that can be assigned an address
  in the executor process's virtual address space.

  Absolute and external symbols are represented using plain ``Addressable``
  instances. Content defined inside the object file is represented using the
  ``Block`` subclass.

* ``Block`` -- An ``Addressable`` node that has ``Content`` (or is marked as
  zero-filled), a parent ``Section``, a ``Size``, an ``Alignment`` (and an
  ``AlignmentOffset``), and a list of ``Edge`` instances.

  Blocks provide a container for binary content which must remain contiguous in
  the target address space (a *layout unit*). Many interesting low level
  operations on ``LinkGraph`` instances involve inspecting or mutating block
  content or edges.

  * ``Content`` is represented as an ``llvm::StringRef``, and accessible via
    the ``getContent`` method. Content is only available for content blocks,
    and not for zero-fill blocks (use ``isZeroFill`` to check, and prefer
    ``getSize`` when only the block size is needed as it works for both
    zero-fill and content blocks).

  * ``Section`` is represented as a ``Section&`` reference, and accessible via
    the ``getSection`` method. The ``Section`` class is described in more detail
    below.

  * ``Size`` is represented as a ``size_t``, and is accessible via the
    ``getSize`` method for both content and zero-filled blocks.

  * ``Alignment`` is represented as a ``uint64_t``, and available via the
    ``getAlignment`` method. It represents the minimum alignment requirement (in
    bytes) of the start of the block.

  * ``AlignmentOffset`` is represented as a ``uint64_t``, and accessible via the
    ``getAlignmentOffset`` method. It represents the offset from the alignment
    required for the start of the block. This is required to support blocks
    whose minimum alignment requirement comes from data at some non-zero offset
    inside the block. E.g. if a block consists of a single byte (with byte
    alignment) followed by a ``uint64_t`` (with 8-byte alignment), then the
    block will have 8-byte alignment with an alignment offset of 7.

  * list of ``Edge`` instances. An iterator range for this list is returned by
    the ``edges`` method. The ``Edge`` class is described in more detail below.

* ``Symbol`` -- An offset from an ``Addressable`` (often a ``Block``), with an
  optional ``Name``, a ``Linkage``, a ``Scope``, a ``Callable`` flag, and a
  ``Live`` flag.

  Symbols make it possible to name content (blocks and addressables are
  anonymous), or target content with an ``Edge``.

  * ``Name`` is represented as an ``llvm::StringRef`` (equal to
    ``llvm::StringRef()`` if the symbol has no name), and accessible via the
    ``getName`` method.

  * ``Linkage`` is one of *Strong* or *Weak*, and is accessible via the
    ``getLinkage`` method. The ``JITLinkContext`` can use this flag to determine
    whether this symbol definition should be kept or dropped.

  * ``Scope`` is one of *Default*, *Hidden*, or *Local*, and is accessible via
    the ``getScope`` method. The ``JITLinkContext`` can use this to determine
    who should be able to see the symbol. A symbol with default scope should be
    globally visible. A symbol with hidden scope should be visible to other
    definitions within the same simulated dylib (e.g. ORC ``JITDylib``) or
    executable, but not from elsewhere. A symbol with local scope should only be
    visible within the current ``LinkGraph``.

  * ``Callable`` is a boolean which is set to true if this symbol can be called,
    and is accessible via the ``isCallable`` method. This can be used to
    automate the introduction of call-stubs for lazy compilation.

  * ``Live`` is a boolean that can be set to mark this symbol as a root for
    dead-stripping purposes (see :ref:`generic_link_algorithm`). JITLink's
    dead-stripping algorithm will propagate liveness flags through the graph to
    all reachable symbols before deleting any symbols (and blocks) that are not
    marked live.

* ``Edge`` -- A quad of an ``Offset`` (implicitly from the start of the
  containing ``Block``), a ``Kind`` (describing the relocation type), a
  ``Target``, and an ``Addend``.

  Edges represent relocations, and occasionally other relationships, between
  blocks and symbols.

  * ``Offset``, accessible via ``getOffset``, is an offset from the start of the
    ``Block`` containing the ``Edge``.

  * ``Kind``, accessible via ``getKind``, is a relocation type -- it describes
    what kinds of changes (if any) should be made to block content at the given
    ``Offset`` based on the address of the ``Target``.

  * ``Target``, accessible via ``getTarget``, is a pointer to the ``Symbol``
    whose address is relevant to the fixup calculation specified by the edge's
    ``Kind``.

  * ``Addend``, accessible via ``getAddend``, is a constant whose interpretation
    is determined by the edge's ``Kind``.

* ``Section`` -- A set of ``Symbol`` instances, plus a set of ``Block``
  instances, with a ``Name``, a set of ``ProtectionFlags``, and an ``Ordinal``.

  Sections make it easy to iterate over the symbols or blocks associated with
  a particular section in the source object file.

  * ``blocks()`` returns an iterator over the set of blocks defined in the
    section (as ``Block*`` pointers).

  * ``symbols()`` returns an iterator over the set of symbols defined in the
    section (as ``Symbol*`` pointers).

  * ``Name`` is represented as an ``llvm::StringRef``, and is accessible via the
    ``getName`` method.

  * ``ProtectionFlags`` are represented as a
    ``sys::Memory::ProtectionFlags`` enum, and accessible via the
    ``getProtectionFlags`` method. These flags describe whether the section is
    readable, writable, executable, or some combination of these. The most
    common combinations are ``RW-`` for writable data, ``R--`` for constant
    data, and ``R-X`` for code.

  * ``SectionOrdinal``, accessible via ``getOrdinal``, is a number used to order
    the section relative to others. It is usually used to preserve section
    order within a segment (a set of sections with the same memory protections)
    when laying out memory.

For the graph-theorists: The ``LinkGraph`` is bipartite, with one set of
``Symbol`` nodes and one set of ``Addressable`` nodes. Each ``Symbol`` node has
one (implicit) edge to its target ``Addressable``. Each ``Block`` has a set of
edges (possibly empty, represented as ``Edge`` instances) back to elements of
the ``Symbol`` set. For convenience and performance of common algorithms,
symbols and blocks are further grouped into ``Sections``.
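
The bipartite structure described above, and the way ``Live`` flags propagate
across it, can be illustrated with a small standalone model. This is only a
sketch: the ``Symbol``, ``Block``, and ``Edge`` structs below are hypothetical
stand-ins written for this example, not the JITLink classes, and they omit
addresses, sections, content, and edge kinds. The helper names
``markReachableLive`` and ``liveNamesInExample`` are likewise invented here.

.. code-block:: c++

  #include <cassert>
  #include <string>
  #include <vector>

  // Hypothetical stand-ins for illustration only (NOT the JITLink types).
  struct Symbol;

  struct Edge {
    Symbol *Target; // Every edge points from a block back to a symbol.
  };

  struct Block {
    std::vector<Edge> Edges; // Outgoing edges (relocations) of this block.
  };

  struct Symbol {
    std::string Name;
    Block *Target; // Implicit symbol -> block edge; nullptr models an external.
    bool Live;     // Set on roots by the caller, propagated below.
  };

  // Propagate liveness from the initially live symbols:
  // Symbol -> Block -> Edge -> Symbol, using a simple worklist.
  void markReachableLive(const std::vector<Symbol *> &Symbols) {
    std::vector<Symbol *> Worklist;
    for (Symbol *S : Symbols)
      if (S->Live)
        Worklist.push_back(S);

    while (!Worklist.empty()) {
      Symbol *S = Worklist.back();
      Worklist.pop_back();
      if (!S->Target)
        continue; // External symbol: no block content to traverse.
      for (const Edge &E : S->Target->Edges) {
        assert(E.Target && "edges must have a target symbol");
        if (!E.Target->Live) {
          E.Target->Live = true;
          Worklist.push_back(E.Target);
        }
      }
    }
  }

  // Builds main -> helper -> printf (external), plus an `unused` symbol that
  // references helper but is itself unreachable. Returns the names of the
  // symbols that are live after propagation.
  std::vector<std::string> liveNamesInExample() {
    Block MainBlock, HelperBlock, UnusedBlock;
    Symbol Printf{"printf", nullptr, false};
    Symbol Helper{"helper", &HelperBlock, false};
    Symbol Main{"main", &MainBlock, true}; // The only root.
    Symbol Unused{"unused", &UnusedBlock, false};

    MainBlock.Edges.push_back(Edge{&Helper});
    HelperBlock.Edges.push_back(Edge{&Printf});
    UnusedBlock.Edges.push_back(Edge{&Helper});

    std::vector<Symbol *> Symbols = {&Main, &Helper, &Printf, &Unused};
    markReachableLive(Symbols);

    std::vector<std::string> Live;
    for (Symbol *S : Symbols)
      if (S->Live)
        Live.push_back(S->Name);
    return Live;
  }

In this model ``unused`` stays dead even though it references ``helper``,
because liveness only flows along outgoing edges from live symbols. A real
dead-stripping step would go on to delete the symbols and blocks left unmarked.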

The ``LinkGraph`` itself provides operations for constructing, removing, and
iterating over sections, symbols, and blocks. It also provides metadata
and utilities relevant to the linking process:

* Graph element operations

  * ``sections`` returns an iterator over all sections in the graph.

  * ``findSectionByName`` returns a pointer to the section with the given
    name (as a ``Section*``) if it exists, otherwise returns a nullptr.

  * ``blocks`` returns an iterator over all blocks in the graph (across all
    sections).

  * ``defined_symbols`` returns an iterator over all defined symbols in the
    graph (across all sections).

  * ``external_symbols`` returns an iterator over all external symbols in the
    graph.

  * ``absolute_symbols`` returns an iterator over all absolute symbols in the
    graph.

  * ``createSection`` creates a section with a given name and protection flags.

  * ``createContentBlock`` creates a block with the given initial content,
    parent section, address, alignment, and alignment offset.

  * ``createZeroFillBlock`` creates a zero-fill block with the given size,
    parent section, address, alignment, and alignment offset.

  * ``addExternalSymbol`` creates a new addressable and symbol with a given
    name, size, and linkage.

  * ``addAbsoluteSymbol`` creates a new addressable and symbol with a given
    name, address, size, linkage, scope, and liveness.

  * ``addCommonSymbol`` is a convenience function for creating a zero-filled
    block and weak symbol with a given name, scope, section, initial address,
    size, alignment, and liveness.

  * ``addAnonymousSymbol`` creates a new anonymous symbol for a given block,
    offset, size, callable-ness, and liveness.

  * ``addDefinedSymbol`` creates a new symbol for a given block with a name,
    offset, size, linkage, scope, callable-ness, and liveness.

  * ``makeExternal`` transforms a formerly defined symbol into an external one
    by creating a new addressable and pointing the symbol at it. The existing
    block is not deleted, but can be manually removed (if unreferenced) by
    calling ``removeBlock``. All edges to the symbol remain valid, but the
    symbol must now be defined outside this ``LinkGraph``.

  * ``removeExternalSymbol`` removes an external symbol and its target
    addressable. The target addressable must not be referenced by any other
    symbols.

  * ``removeAbsoluteSymbol`` removes an absolute symbol and its target
    addressable. The target addressable must not be referenced by any other
    symbols.

  * ``removeDefinedSymbol`` removes a defined symbol, but *does not* remove
    its target block.

  * ``removeBlock`` removes the given block.

  * ``splitBlock`` splits a given block in two at a given index (useful where
    it is known that a block contains decomposable records, e.g. CFI records
    in an eh-frame section).

* Graph utility operations

  * ``getName`` returns the name of this graph, which is usually based on the
    name of the input object file.

  * ``getTargetTriple`` returns an ``llvm::Triple`` for the executor process.

  * ``getPointerSize`` returns the size of a pointer (in bytes) in the executor
    process.

  * ``getEndianness`` returns the endianness of the executor process.

  * ``allocateString`` copies data from a given ``llvm::Twine`` into the
    link graph's internal allocator. This can be used to ensure that content
    created inside a pass outlives that pass's execution.

.. _generic_link_algorithm:

Generic Link Algorithm
======================

JITLink provides a generic link algorithm which can be extended / modified at
certain points by the introduction of JITLink :ref:`passes`:

#. Phase 1

   This phase is called immediately by the ``link`` function as soon as the
   initial configuration (including the pass pipeline setup) is complete.

   #. Run pre-prune passes.

      These passes are called on the graph before it is pruned. At this stage
      ``LinkGraph`` nodes still have their original vmaddrs. A mark-live pass
      (supplied by the ``JITLinkContext``) will be run at the end of this
      sequence to mark the initial set of live symbols.

      Notable use cases: marking nodes live, accessing/copying graph data that
      will be pruned (e.g. metadata that's important for the JIT, but not needed
      for the link process).

   #. Prune (dead-strip) the ``LinkGraph``.

      Removes all symbols and blocks not reachable from the initial set of live
      symbols.

      This allows JITLink to remove unreachable symbols / content, including
      overridden weak and redundant ODR definitions.

   #. Run post-prune passes.

      These passes are run on the graph after dead-stripping, but before memory
      is allocated or nodes assigned their final target vmaddrs.

      Passes run at this stage benefit from pruning, as dead functions and data
      have been stripped from the graph. However new content can still be added
      to the graph, as target and working memory have not been allocated yet.

      Notable use cases: Building Global Offset Table (GOT), Procedure Linkage
      Table (PLT), and Thread Local Variable (TLV) entries.

   #. Sort blocks into segments.

      Sorts all blocks by ordinal and then address. Collects sections with
      matching permissions into segments and computes the size of these
      segments for memory allocation.

   #. Allocate segment memory, update node addresses.

      Calls the ``JITLinkContext``'s ``JITLinkMemoryManager`` to allocate both
      working and target memory for the graph, then updates all node addresses
      to their assigned target address.

      Note: This step only updates the addresses of nodes defined in this graph.
      External symbols will still have null addresses.

   #. Run post-allocation passes.

      These passes are run on the graph after working and target memory have
      been allocated, but before the ``JITLinkContext`` is notified of the
      final addresses of the symbols in the graph. This gives these passes a
      chance to set up data structures associated with target addresses before
      any JITLink clients (especially ORC queries for symbol resolution) can
      attempt to access them.

      Notable use cases: Setting up mappings between target addresses and
      JIT data structures, such as a mapping between ``__dso_handle`` and
      ``JITDylib*``.

   #. Notify the ``JITLinkContext`` of the assigned symbol addresses.

      Calls ``JITLinkContext::notifyResolved`` on the link graph, allowing
      clients to react to the symbol address assignments made for this graph.
      In ORC this is used to notify any pending queries for *resolved* symbols,
      including pending queries from concurrently running JITLink instances that
      have reached the next step and are waiting on the address of a symbol in
      this graph to proceed with their link.

   #. Identify external symbols and resolve their addresses asynchronously.

      Calls the ``JITLinkContext`` to resolve the target address of any external
      symbols in the graph. This step is asynchronous -- JITLink will pack the
      link state into a *continuation* to be run once the symbols are resolved.

      This is the final step of Phase 1.

#. Phase 2

   This phase is called by the continuation constructed at the end of the
   external symbol resolution step above.

   #. Apply external symbol resolution results.

      This updates the addresses of all external symbols.
      At this point all
      nodes in the graph have their final target addresses, however node
      content still points back to the original data in the object file.

   #. Run pre-fixup passes.

      These passes are called on the graph after all nodes have been assigned
      their final target addresses, but before node content is copied into
      working memory and fixed up. Passes run at this stage can make late
      optimizations to the graph and content based on address layout.

      Notable use cases: GOT and PLT relaxation, where GOT and PLT accesses are
      bypassed for fixup targets that are directly accessible under the assigned
      memory layout.

   #. Copy block content to working memory and apply fixups.

      Copies all block content into allocated working memory (following the
      target layout) and applies fixups. Graph blocks are updated to point at
      the fixed up content.

   #. Run post-fixup passes.

      These passes are called on the graph after fixups have been applied and
      blocks updated to point to the fixed up content.

      Post-fixup passes can inspect block contents to see the exact bytes that
      will be copied to the assigned target addresses.

   #. Finalize memory asynchronously.

      Calls the ``JITLinkMemoryManager`` to copy working memory to the executor
      process and apply the requested permissions. This step is asynchronous --
      JITLink will pack the link state into a *continuation* to be run once
      memory has been copied and protected.

      This is the final step of Phase 2.

#. Phase 3

   This phase is called by the continuation constructed at the end of the
   memory finalization step above.

   #. Notify the context that the graph has been emitted.

      Calls ``JITLinkContext::notifyFinalized`` and hands off the
      ``JITLinkMemoryManager::Allocation`` object for this graph's memory
      allocation.
      This allows the context to track/hold memory allocations and
      react to the newly emitted definitions. In ORC this is used to update the
      ``ExecutionSession`` instance's dependence graph, which may result in
      these symbols (and possibly others) becoming *Ready* if all of their
      dependencies have also been emitted.

.. _passes:

Passes
------

JITLink passes are ``std::function<Error(LinkGraph&)>`` instances. They are free
to inspect and modify the given ``LinkGraph`` subject to the constraints of
whatever phase they are running in (see :ref:`generic_link_algorithm`). If a
pass returns ``Error::success()`` then linking continues. If a pass returns
a failure value then linking is stopped and the ``JITLinkContext`` is notified
that the link failed.

Passes may be used by both JITLink backends (e.g. MachO/x86-64 implements GOT
and PLT construction as a pass), and external clients like
``ObjectLinkingLayer::Plugin``.

In combination with the open ``LinkGraph`` API, JITLink passes enable the
implementation of powerful new features. For example:

* Relaxation optimizations -- A pre-fixup pass can inspect GOT accesses and PLT
  calls and identify situations where the addresses of the entry target and the
  access are close enough to be accessed directly. In this case the pass can
  rewrite the instruction stream of the containing block and update the fixup
  edges to make the access direct.

  Code for this looks like:

.. code-block:: c++

  Error relaxGOTEdges(LinkGraph &G) {
    for (auto *B : G.blocks())
      for (auto &E : B->edges())
        if (E.getKind() == x86_64::GOTLoad) {
          auto &GOTTarget = getGOTEntryTarget(E.getTarget());
          if (isInRange(B->getFixupAddress(E), GOTTarget)) {
            // Rewrite B->getContent() at fixup address from
            // MOVQ to LEAQ

            // Update edge target and kind.
            E.setTarget(GOTTarget);
            E.setKind(x86_64::PCRel32);
          }
        }

    return Error::success();
  }

* Metadata registration -- Post-allocation passes can be used to record the
  address range of sections in the target. This can be used to register the
  metadata (e.g. exception handling frames, language metadata) in the target
  once memory has been finalized.

.. code-block:: c++

  Error registerEHFrameSection(LinkGraph &G) {
    if (auto *Sec = G.findSectionByName("__eh_frame")) {
      SectionRange SR(*Sec);
      registerEHFrameSection(SR.getStart(), SR.getEnd());
    }

    return Error::success();
  }

* Record call sites for later mutation -- A post-allocation pass can record
  the call sites of all calls to a particular function, allowing those call
  sites to be updated later at runtime (e.g. for instrumentation, or to
  enable the function to be lazily compiled but still called directly after
  compilation).

.. code-block:: c++

  StringRef FunctionName = "foo";
  std::vector<JITTargetAddress> CallSitesForFunction;

  auto RecordCallSites =
    [&](LinkGraph &G) -> Error {
      for (auto *B : G.blocks())
        for (auto &E : B->edges())
          if (E.getKind() == CallEdgeKind &&
              E.getTarget().hasName() &&
              E.getTarget().getName() == FunctionName)
            CallSitesForFunction.push_back(B->getFixupAddress(E));
      return Error::success();
    };

Memory Management with JITLinkMemoryManager
-------------------------------------------

JIT linking requires allocation of two kinds of memory: working memory in the
JIT process and target memory in the execution process (these processes and
memory allocations may be one and the same, depending on how the user wants
to build their JIT). It also requires that these allocations conform to the
requested code model in the target process (e.g.
MachO/x86-64's Small code
model requires that all code and data for a simulated dylib is allocated within
4GiB). Finally, it is natural to make the memory manager responsible for
transferring memory to the target address space and applying memory protections,
since the memory manager must know how to communicate with the executor, and
since sharing and protection assignment can often be efficiently managed (in
the common case of running across processes on the same machine for security)
via the host operating system's virtual memory management APIs.

To satisfy these requirements ``JITLinkMemoryManager`` adopts the following
design: The memory manager itself has just one virtual method that returns a
``JITLinkMemoryManager::Allocation``:

.. code-block:: c++

  virtual Expected<std::unique_ptr<Allocation>>
  allocate(const JITLinkDylib *JD, const SegmentsRequestMap &Request) = 0;

This method takes a ``JITLinkDylib*`` representing the target simulated
dylib, and the full set of sections that must be allocated for this object.
``JITLinkMemoryManager`` implementations can (optionally) use the ``JD``
argument to manage a per-simulated-dylib memory pool (since code model
constraints are typically imposed on a per-dylib basis, and not across
dylibs) [2]_. The ``Request`` argument, by describing all sections in the current
object up-front, allows the implementer to allocate those sections as a
single slab, either within a pre-allocated per-jitdylib pool or directly
from system memory.

All subsequent operations are provided by the
``JITLinkMemoryManager::Allocation`` interface:

* ``virtual MutableArrayRef<char> getWorkingMemory(ProtectionFlags Seg)``

  Should be overridden to return the address in working memory of the segment
  with the given protection flags.

* ``virtual JITTargetAddress getTargetMemory(ProtectionFlags Seg)``

  Should be overridden to return the address in the executor's address space of
  the segment with the given protection flags.

* ``virtual void finalizeAsync(FinalizeContinuation OnFinalize)``

  Should be overridden to copy the contents of working memory to the target
  address space and apply memory protections for all segments. Where working
  memory and target memory are separate, this method should deallocate the
  working memory.

* ``virtual Error deallocate()``

  Should be overridden to deallocate memory in the target address space.

JITLink provides a simple in-process implementation of this interface:
``InProcessMemoryManager``. It allocates pages once and re-uses them as both
working and target memory.

ORC provides a cross-process ``JITLinkMemoryManager`` based on an ORC-RPC-based
implementation of the ``orc::TargetProcessControl`` API:
``OrcRPCTPCJITLinkMemoryManager``. This memory manager uses
``TargetProcessControl`` API calls to allocate and manage memory in a remote
process. The underlying communication channel is determined by the ORC-RPC
channel type. Common options include unix sockets or TCP.

JITLinkMemoryManager and Security
---------------------------------

JITLink's ability to link JIT'd code for a separate executor process can be
used to improve the security of a JIT system: The executor process can be
sandboxed, run within a VM, or even run on a fully separate machine.

JITLink's memory manager interface is flexible enough to allow for a range of
trade-offs between performance and security. For example, on a system where code
pages must be signed (preventing code from being updated), the memory manager
can deallocate working memory pages after linking to free memory in the process
running JITLink.
Alternatively, on a system that allows RWX pages, the memory
manager may use the same pages for both working and target memory by marking
them as RWX, allowing code to be modified in place without further overhead.
Finally, if RWX pages are not permitted but dual-virtual-mappings of
physical memory pages are, then the memory manager can dual map physical pages
as RW- in the JITLink process and R-X in the executor process, allowing
modification from the JITLink process but not from the executor (at the cost of
extra administrative overhead for the dual mapping).

Error Handling
--------------

JITLink makes extensive use of the ``llvm::Error`` type (see the error handling
section of :doc:`ProgrammersManual` for details). The link process itself, all
passes, the memory manager interface, and operations on the ``JITLinkContext``
are all permitted to fail. Link graph construction utilities (especially parsers
for object formats) are encouraged to validate input, and validate fixups
(e.g. with range checks) before application.

Any error will halt the link process and notify the context of failure. In ORC,
reported failures are propagated to queries pending on definitions provided by
the failing link, and also through edges of the dependence graph to any queries
waiting on dependent symbols.

.. _connection_to_orc_runtime:

Connection to the ORC Runtime
=============================

The ORC Runtime (currently under development) aims to provide runtime support
for advanced JIT features, including object format features that require
non-trivial action in the executor (e.g. running initializers, managing thread
local storage, registering with language runtimes, etc.).

ORC Runtime support for object format features typically requires cooperation
between the runtime (which executes in the executor process) and JITLink (which
runs in the JIT process and can inspect LinkGraphs to determine what actions
must be taken in the executor). For example: Execution of MachO static
initializers in the ORC runtime is performed by the ``jit_dlopen`` function,
which calls back to the JIT process to ask for the list of address ranges of
``__mod_init`` sections to walk. This list is collated by the
``MachOPlatformPlugin``, which installs a pass to record this information for
each object as it is linked into the target.

.. _constructing_linkgraphs:

Constructing LinkGraphs
=======================

Clients usually access and manipulate ``LinkGraph`` instances that were created
for them by an ``ObjectLinkingLayer`` instance, but they can be created manually:

#. By directly constructing and populating a ``LinkGraph`` instance.

#. By using the ``createLinkGraph`` family of functions to create a
   ``LinkGraph`` from an in-memory buffer containing an object file. This is how
   ``ObjectLinkingLayer`` usually creates ``LinkGraphs``.

   #. ``createLinkGraph_<Object-Format>_<Architecture>`` can be used when
      both the object format and architecture are known ahead of time.

   #. ``createLinkGraph_<Object-Format>`` can be used when the object format is
      known ahead of time, but the architecture is not. In this case the
      architecture will be determined by inspection of the object header.

   #. ``createLinkGraph`` can be used when neither the object format nor
      the architecture are known ahead of time. In this case the object header
      will be inspected to determine both the format and architecture.

.. _jit_linking:

JIT Linking
===========

The JIT linker concept was introduced in LLVM's earlier generation of JIT APIs,
MCJIT.
In MCJIT the *RuntimeDyld* component enabled re-use of LLVM as an
in-memory compiler by adding an in-memory link step to the end of the usual
compiler pipeline. Rather than dumping relocatable objects to disk as a compiler
usually would, MCJIT passed them to RuntimeDyld to be linked into a target
process.

This approach to linking differs from standard *static* or *dynamic* linking:

A *static linker* takes one or more relocatable object files as input and links
them into an executable or dynamic library on disk.

A *dynamic linker* applies relocations to executables and dynamic libraries that
have been loaded into memory.

A *JIT linker* takes a single relocatable object file at a time and links it
into a target process, usually using a context object to allow the linked code
to resolve symbols in the target.

RuntimeDyld
-----------

In order to keep RuntimeDyld's implementation simple MCJIT imposed some
restrictions on compiled code:

#. It had to use the Large code model, and often restricted available relocation
   models in order to limit the kinds of relocations that had to be supported.

#. It required strong linkage and default visibility on all symbols -- behavior
   for other linkages/visibilities was not well defined.

#. It constrained and/or prohibited the use of features requiring runtime
   support, e.g. static initializers or thread local storage.

As a result of these restrictions not all language features supported by LLVM
worked under MCJIT, and objects to be loaded under the JIT had to be compiled to
target it (precluding the use of precompiled code from other sources under the
JIT).

RuntimeDyld also provided very limited visibility into the linking process
itself: Clients could access conservative estimates of section size
(RuntimeDyld bundled stub size and padding estimates into the section size
value) and the final relocated bytes, but could not access RuntimeDyld's
internal object representations.

Eliminating these restrictions and limitations was one of the primary motivations
for the development of JITLink.

The llvm-jitlink tool
=====================

The ``llvm-jitlink`` tool is a command line wrapper for the JITLink library.
It loads some set of relocatable object files and then links them using
JITLink. Depending on the options used it will then execute them, or validate
the linked memory.

The ``llvm-jitlink`` tool was originally designed to aid JITLink development by
providing a simple environment for testing.

Basic usage
-----------

By default, ``llvm-jitlink`` will link the set of objects passed on the command
line, then search for a "main" function and execute it:

.. code-block:: sh

  % cat hello-world.c
  #include <stdio.h>

  int main(int argc, char *argv[]) {
    printf("hello, world!\n");
    return 0;
  }

  % clang -c -o hello-world.o hello-world.c
  % llvm-jitlink hello-world.o
  hello, world!

Multiple objects may be specified, and arguments may be provided to the JIT'd
main function using the ``-args`` option:

.. code-block:: sh

  % cat print-args.c
  #include <stdio.h>

  void print_args(int argc, char *argv[]) {
    for (int i = 0; i != argc; ++i)
      printf("arg %i is \"%s\"\n", i, argv[i]);
  }

  % cat print-args-main.c
  void print_args(int argc, char *argv[]);

  int main(int argc, char *argv[]) {
    print_args(argc, argv);
    return 0;
  }

  % clang -c -o print-args.o print-args.c
  % clang -c -o print-args-main.o print-args-main.c
  % llvm-jitlink print-args.o print-args-main.o -args a b c
  arg 0 is "a"
  arg 1 is "b"
  arg 2 is "c"

Alternative entry points may be specified using the
``-entry <entry point name>`` option.

Other options can be found by calling ``llvm-jitlink -help``.

llvm-jitlink as a regression testing utility
--------------------------------------------

One of the primary aims of ``llvm-jitlink`` was to enable readable regression
tests for JITLink. To do this it supports two options:

The ``-noexec`` option tells llvm-jitlink to stop after looking up the entry
point, and before attempting to execute it. Since the linked code is not
executed, this can be used to link for other targets even if you do not have
access to the target being linked (the ``-define-abs`` or ``-phony-externals``
options can be used to supply any missing definitions in this case).

The ``-check <check-file>`` option can be used to run a set of ``jitlink-check``
expressions against working memory. It is typically used in conjunction with
``-noexec``, since the aim is to validate JIT'd memory rather than to run the
code and ``-noexec`` allows us to link for any supported target architecture
from the current process. In ``-check`` mode, ``llvm-jitlink`` will scan the
given check-file for lines of the form ``# jitlink-check: <expr>``. See
examples of this usage in ``llvm/test/ExecutionEngine/JITLink``.
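
As a rough sketch of the ``-noexec`` and ``-check`` workflow, a check file and
invocation might look like the following. The file name ``checks.txt``, the
object ``test.o``, and the symbols ``test_quad`` and ``named_data`` are
hypothetical, and the expression form (an 8-byte load compared against a symbol
address) is an assumption modeled on the tests in
``llvm/test/ExecutionEngine/JITLink``, which remain the authoritative reference
for the expression syntax:

.. code-block:: sh

  % cat checks.txt
  # Hypothetical rule: the 8 bytes at test_quad should hold named_data's address.
  # jitlink-check: *{8}test_quad = named_data

  % llvm-jitlink -noexec -check checks.txt test.o

Only lines of the ``# jitlink-check: <expr>`` form are treated as checks, so the
same file can carry ordinary comments alongside them.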

Remote execution via llvm-jitlink-executor
------------------------------------------

By default ``llvm-jitlink`` will link the given objects into its own process,
but this can be overridden by two options:

The ``-oop-executor[=/path/to/executor]`` option tells ``llvm-jitlink`` to
execute the given executor (which defaults to ``llvm-jitlink-executor``) and
communicate with it via file descriptors which it passes to the executor
as the first argument with the format ``filedescs=<in-fd>,<out-fd>``.

The ``-oop-executor-connect=<host>:<port>`` option tells ``llvm-jitlink`` to
connect to an already running executor via TCP on the given host and port. To
use this option you will need to start ``llvm-jitlink-executor`` manually with
``listen=<host>:<port>`` as the first argument.

Harness mode
------------

The ``-harness`` option allows a set of input objects to be designated as a test
harness, with the regular object files implicitly treated as objects to be
tested. Definitions of symbols in the harness set override definitions in the
test set, and external references from the harness cause automatic scope
promotion of local symbols in the test set (these modifications to the usual
linker rules are accomplished via an ``ObjectLinkingLayer::Plugin`` installed by
``llvm-jitlink`` when it sees the ``-harness`` option).

With these modifications in place we can selectively test functions in an object
file by mocking those functions' callees. For example, suppose we have an object
file, ``test_code.o``, compiled from the following C source (which we need not
have access to):

.. code-block:: c

  void irrelevant_function() { irrelevant_external(); }

  int function_to_mock(int X) {
    return /* some function of X */;
  }

  static void function_to_test() {
    ...
    int Y = function_to_mock();
    printf("Y is %i\n", Y);
  }

If we want to know how ``function_to_test`` behaves when we change the behavior
of ``function_to_mock`` we can test it by writing a test harness:

.. code-block:: c

  #include <stdio.h>

  void function_to_test();

  int function_to_mock(int X) {
    printf("used mock utility function\n");
    return 42;
  }

  int main(int argc, char *argv[]) {
    function_to_test();
    return 0;
  }

Under normal circumstances these objects could not be linked together:
``function_to_test`` is static and could not be resolved outside
``test_code.o``, the two ``function_to_mock`` functions would result in a
duplicate definition error, and ``irrelevant_external`` is undefined.
However, using ``-harness`` and ``-phony-externals`` we can run this code
with:

.. code-block:: sh

  % clang -c -o test_code_harness.o test_code_harness.c
  % llvm-jitlink -phony-externals test_code.o -harness test_code_harness.o
  used mock utility function
  Y is 42

The ``-harness`` option may be of interest to people who want to perform some
very late testing on build products to verify that compiled code behaves as
expected. On basic C test cases this is relatively straightforward. Mocks for
more complicated languages (e.g. C++) are much trickier: Any code involving
classes tends to have a lot of non-trivial surface area (e.g. vtables) that
would require great care to mock.

Tips for JITLink backend developers
-----------------------------------

#. Make liberal use of assert and ``llvm::Error``. Do *not* assume that the input
   object is well formed: Return any errors produced by libObject (or your own
   object parsing code) and validate as you construct.
   Think carefully about the
   distinction between contract (which should be validated with asserts and
   llvm_unreachable) and environmental errors (which should generate
   ``llvm::Error`` instances).

#. Don't assume you're linking in-process. Use libSupport's sized,
   endian-specific types when reading/writing content in the ``LinkGraph``.

As a "minimum viable" JITLink wrapper, the ``llvm-jitlink`` tool is an
invaluable resource for developers bringing in a new JITLink backend. A standard
workflow is to start by throwing an unsupported object at the tool and seeing
what error is returned, then fixing that (you can often make a reasonable guess
at what should be done based on existing code for other formats or
architectures).

In debug builds of LLVM, the ``-debug-only=jitlink`` option dumps logs from the
JITLink library during the link process. These can be useful for spotting some bugs at
a glance. The ``-debug-only=llvm_jitlink`` option dumps logs from the ``llvm-jitlink``
tool, which can be useful for debugging both testcases (it is often less verbose than
``-debug-only=jitlink``) and the tool itself.

The ``-oop-executor`` and ``-oop-executor-connect`` options are helpful for testing
handling of cross-process and cross-architecture use cases.

Roadmap
=======

JITLink is under active development. Work so far has focused on the MachO
implementation. In LLVM 12 there is limited support for ELF on x86-64.

Major outstanding projects include:

* Refactor architecture support to maximize sharing across formats.

  All formats should be able to share the bulk of the architecture specific
  code (especially relocations) for each supported architecture.

* Refactor ELF link graph construction.

  ELF's link graph construction is currently implemented in the ``ELF_x86_64.cpp``
  file, and tied to the x86-64 relocation parsing code. The bulk of the code is
  generic and should be split into an ELFLinkGraphBuilder base class along the
  same lines as the existing generic MachOLinkGraphBuilder.

* Implement ELF support for arm64.

  Once the architecture support code has been refactored to enable sharing and
  ELF link graph construction has been refactored to allow re-use we should be
  able to construct an ELF / arm64 JITLink implementation by combining
  these existing pieces.

* Implement support for new architectures.

* Implement support for COFF.

  There is no COFF implementation of JITLink yet. Such an implementation should
  follow the MachO and ELF paths: a generic COFFLinkGraphBuilder base class that
  can be specialized for each architecture.

* Design and implement a shared-memory based JITLinkMemoryManager.

  One use-case that is expected to be common is out-of-process linking targeting
  another process on the same machine. This allows JITs to sandbox JIT'd code.
  For this use case a shared-memory based JITLinkMemoryManager would provide the
  most efficient form of allocation. Creating one will require designing a
  generic API for shared memory though, as LLVM does not currently have one.

JITLink Availability and Feature Status
---------------------------------------

.. list-table:: Availability and Status
   :widths: 10 30 30 30
   :header-rows: 1

   * - Architecture
     - ELF
     - COFF
     - MachO
   * - arm64
     -
     -
     - Partial (small code model, PIC relocation model only)
   * - x86-64
     - Partial
     -
     - Full (except TLV and debugging)

.. [1] See ``llvm/examples/OrcV2Examples/LLJITWithObjectLinkingLayerPlugin`` for
       a full worked example.

.. [2] If not for *hidden* scoped symbols we could eliminate the
       ``JITLinkDylib*`` argument to ``JITLinkMemoryManager::allocate`` and
       treat every object as a separate simulated dylib for the purposes of
       memory layout. Hidden symbols break this by generating in-range accesses
       to external symbols, requiring the access and symbol to be allocated
       within range of one another. That said, providing a pre-reserved address
       range pool for each simulated dylib guarantees that the relaxation
       optimizations will kick in for all intra-dylib references, which is good
       for performance (at the cost of whatever overhead is introduced by
       reserving the address-range up-front).