====================================
JITLink and ORC's ObjectLinkingLayer
====================================

.. contents::
   :local:

Introduction
============

This document aims to provide a high-level overview of the design and API
of the JITLink library. It assumes some familiarity with linking and
relocatable object files, but should not require deep expertise. If you know
what a section, symbol, and relocation are then you should find this document
accessible. If you do not, please submit a patch (:doc:`Contributing`) or file
a bug (:doc:`HowToSubmitABug`).

JITLink is a library for :ref:`jit_linking`. It was built to support the
:doc:`ORC JIT APIs<ORCv2>` and is most commonly accessed via ORC's
ObjectLinkingLayer API. JITLink was developed with the aim of supporting the
full set of features provided by each object format, including static
initializers, exception handling, thread local variables, and language runtime
registration. Supporting these features enables ORC to execute code generated
from source languages which rely on these features (e.g. C++ requires object
format support for static initializers to support static constructors,
eh-frame registration for exceptions, and TLV support for thread locals; Swift
and Objective-C require language runtime registration for many features). For
some object format features support is provided entirely within JITLink, and
for others it is provided in cooperation with the (prototype) ORC runtime.

JITLink aims to support the following features, some of which are still under
development:

1. Cross-process and cross-architecture linking of single relocatable objects
   into a target *executor* process.

2. Support for all object format features.

3. Open linker data structures (``LinkGraph``) and pass system.

JITLink and ObjectLinkingLayer
==============================

``ObjectLinkingLayer`` is ORC's wrapper for JITLink.
It is an ORC layer that
allows objects to be added to a ``JITDylib``, or emitted from some higher level
program representation. When an object is emitted, ``ObjectLinkingLayer`` uses
JITLink to construct a ``LinkGraph`` (see :ref:`constructing_linkgraphs`) and
calls JITLink's ``link`` function to link the graph into the executor process.

The ``ObjectLinkingLayer`` class provides a plugin API,
``ObjectLinkingLayer::Plugin``, which users can subclass in order to inspect and
modify ``LinkGraph`` instances at link time, and react to important JIT events
(such as an object being emitted into target memory). This enables many features
and optimizations that were not possible under MCJIT or RuntimeDyld.

ObjectLinkingLayer Plugins
--------------------------

The ``ObjectLinkingLayer::Plugin`` class provides the following methods:

* ``modifyPassConfig`` is called each time a ``LinkGraph`` is about to be
  linked. It can be overridden to install JITLink *Passes* to run during the
  link process.

  .. code-block:: c++

    void modifyPassConfig(MaterializationResponsibility &MR,
                          jitlink::LinkGraph &G,
                          jitlink::PassConfiguration &Config)

* ``notifyLoaded`` is called before the link begins, and can be overridden to
  set up any initial state for the given ``MaterializationResponsibility`` if
  needed.

  .. code-block:: c++

    void notifyLoaded(MaterializationResponsibility &MR)

* ``notifyEmitted`` is called after the link is complete and code has been
  emitted to the executor process. It can be overridden to finalize state
  for the ``MaterializationResponsibility`` if needed.

  .. code-block:: c++

    Error notifyEmitted(MaterializationResponsibility &MR)

* ``notifyFailed`` is called if the link fails at any point. It can be
  overridden to react to the failure (e.g. to deallocate any already allocated
  resources).

  .. code-block:: c++

    Error notifyFailed(MaterializationResponsibility &MR)

* ``notifyRemovingResources`` is called when a request is made to remove any
  resources associated with the ``ResourceKey`` *K* for the
  ``MaterializationResponsibility``.

  .. code-block:: c++

    Error notifyRemovingResources(JITDylib &JD, ResourceKey K)

* ``notifyTransferringResources`` is called if/when a request is made to
  transfer tracking of any resources associated with ``ResourceKey``
  *SrcKey* to *DstKey*.

  .. code-block:: c++

    void notifyTransferringResources(JITDylib &JD, ResourceKey DstKey,
                                     ResourceKey SrcKey)

Plugin authors are required to implement the ``notifyFailed``,
``notifyRemovingResources``, and ``notifyTransferringResources`` methods in
order to safely manage resources in the case of resource removal or transfer,
or link failure. If no resources are managed by the plugin then these methods
can be implemented as no-ops (returning ``Error::success()`` where an ``Error``
return value is required).

Plugin instances are added to an ``ObjectLinkingLayer`` by
calling the ``addPlugin`` method [1]_. E.g.

.. code-block:: c++

  // Plugin class to print the set of defined symbols in an object when that
  // object is linked.
  class MyPlugin : public ObjectLinkingLayer::Plugin {
  public:

    // Add passes to print the set of defined symbols after dead-stripping.
    void modifyPassConfig(MaterializationResponsibility &MR,
                          jitlink::LinkGraph &G,
                          jitlink::PassConfiguration &Config) override {
      Config.PostPrunePasses.push_back([this](jitlink::LinkGraph &G) {
        return printAllSymbols(G);
      });
    }

    // Implement mandatory overrides:
    Error notifyFailed(MaterializationResponsibility &MR) override {
      return Error::success();
    }
    Error notifyRemovingResources(JITDylib &JD, ResourceKey K) override {
      return Error::success();
    }
    void notifyTransferringResources(JITDylib &JD, ResourceKey DstKey,
                                     ResourceKey SrcKey) override {}

    // JITLink pass to print all defined symbols in G.
    Error printAllSymbols(jitlink::LinkGraph &G) {
      for (auto *Sym : G.defined_symbols())
        if (Sym->hasName())
          dbgs() << Sym->getName() << "\n";
      return Error::success();
    }
  };

  // Create our LLJIT instance using a custom object linking layer setup.
  // This gives us a chance to install our plugin.
  auto J = ExitOnErr(LLJITBuilder()
             .setObjectLinkingLayerCreator(
               [](ExecutionSession &ES, const Triple &T) {
                 // Manually set up the ObjectLinkingLayer for our LLJIT
                 // instance.
                 auto OLL = std::make_unique<ObjectLinkingLayer>(
                     ES, std::make_unique<jitlink::InProcessMemoryManager>());

                 // Install our plugin:
                 OLL->addPlugin(std::make_unique<MyPlugin>());

                 return OLL;
               })
             .create());

  // Add an object to the JIT. Nothing happens here: linking isn't triggered
  // until we look up some symbol in our object.
  ExitOnErr(J->addObject(loadFromDisk("main.o")));

  // Plugin triggers here when our lookup of main triggers linking of main.o
  auto MainSym = J->lookup("main");

LinkGraph
=========

JITLink maps all relocatable object formats to a generic ``LinkGraph`` type
that is designed to make linking fast and easy (``LinkGraph`` instances can
also be created manually. See :ref:`constructing_linkgraphs`).

Relocatable object formats (e.g. COFF, ELF, MachO) differ in their details,
but share a common goal: to represent machine level code and data with
annotations that allow them to be relocated in a virtual address space. To
this end they usually contain names (symbols) for content defined inside the
file or externally, chunks of content that must be moved as a unit (sections
or subsections, depending on the format), and annotations describing how to
patch content based on the final address of some target symbol/section
(relocations).

At a high level, the ``LinkGraph`` type represents these concepts as a decorated
graph. Nodes in the graph represent symbols and content, and edges represent
relocations. Each of the elements of the graph is listed here:

* ``Addressable`` -- A node in the link graph that can be assigned an address
  in the executor process's virtual address space.

  Absolute and external symbols are represented using plain ``Addressable``
  instances. Content defined inside the object file is represented using the
  ``Block`` subclass.

* ``Block`` -- An ``Addressable`` node that has ``Content`` (or is marked as
  zero-filled), a parent ``Section``, a ``Size``, an ``Alignment`` (and an
  ``AlignmentOffset``), and a list of ``Edge`` instances.

  Blocks provide a container for binary content which must remain contiguous in
  the target address space (a *layout unit*). Many interesting low level
  operations on ``LinkGraph`` instances involve inspecting or mutating block
  content or edges.

  * ``Content`` is represented as an ``llvm::StringRef``, and accessible via
    the ``getContent`` method. Content is only available for content blocks,
    and not for zero-fill blocks (use ``isZeroFill`` to check, and prefer
    ``getSize`` when only the block size is needed as it works for both
    zero-fill and content blocks).

  * ``Section`` is represented as a ``Section&`` reference, and accessible via
    the ``getSection`` method. The ``Section`` class is described in more detail
    below.

  * ``Size`` is represented as a ``size_t``, and is accessible via the
    ``getSize`` method for both content and zero-filled blocks.

  * ``Alignment`` is represented as a ``uint64_t``, and available via the
    ``getAlignment`` method. It represents the minimum alignment requirement (in
    bytes) of the start of the block.

  * ``AlignmentOffset`` is represented as a ``uint64_t``, and accessible via the
    ``getAlignmentOffset`` method. It represents the offset from the alignment
    required for the start of the block. This is required to support blocks
    whose minimum alignment requirement comes from data at some non-zero offset
    inside the block. E.g. if a block consists of a single byte (with byte
    alignment) followed by a ``uint64_t`` (with 8-byte alignment), then the
    block will have 8-byte alignment with an alignment offset of 7.

  * list of ``Edge`` instances. An iterator range for this list is returned by
    the ``edges`` method. The ``Edge`` class is described in more detail below.

* ``Symbol`` -- An offset from an ``Addressable`` (often a ``Block``), with an
  optional ``Name``, a ``Linkage``, a ``Scope``, a ``Callable`` flag, and a
  ``Live`` flag.

  Symbols make it possible to name content (blocks and addressables are
  anonymous), or target content with an ``Edge``.

  * ``Name`` is represented as an ``llvm::StringRef`` (equal to
    ``llvm::StringRef()`` if the symbol has no name), and accessible via the
    ``getName`` method.

  * ``Linkage`` is one of *Strong* or *Weak*, and is accessible via the
    ``getLinkage`` method. The ``JITLinkContext`` can use this flag to determine
    whether this symbol definition should be kept or dropped.

  * ``Scope`` is one of *Default*, *Hidden*, or *Local*, and is accessible via
    the ``getScope`` method. The ``JITLinkContext`` can use this to determine
    who should be able to see the symbol. A symbol with default scope should be
    globally visible. A symbol with hidden scope should be visible to other
    definitions within the same simulated dylib (e.g. ORC ``JITDylib``) or
    executable, but not from elsewhere. A symbol with local scope should only be
    visible within the current ``LinkGraph``.

  * ``Callable`` is a boolean which is set to true if this symbol can be called,
    and is accessible via the ``isCallable`` method. This can be used to
    automate the introduction of call-stubs for lazy compilation.

  * ``Live`` is a boolean that can be set to mark this symbol as a root for
    dead-stripping purposes (see :ref:`generic_link_algorithm`). JITLink's
    dead-stripping algorithm will propagate liveness flags through the graph to
    all reachable symbols before deleting any symbols (and blocks) that are not
    marked live.

* ``Edge`` -- A quad of an ``Offset`` (implicitly from the start of the
  containing ``Block``), a ``Kind`` (describing the relocation type), a
  ``Target``, and an ``Addend``.

  Edges represent relocations, and occasionally other relationships, between
  blocks and symbols.

  * ``Offset``, accessible via ``getOffset``, is an offset from the start of the
    ``Block`` containing the ``Edge``.

  * ``Kind``, accessible via ``getKind``, is a relocation type -- it describes
    what kinds of changes (if any) should be made to block content at the given
    ``Offset`` based on the address of the ``Target``.

  * ``Target``, accessible via ``getTarget``, is a pointer to a ``Symbol``,
    identifying the symbol whose address is relevant to the fixup calculation
    specified by the edge's ``Kind``.

  * ``Addend``, accessible via ``getAddend``, is a constant whose interpretation
    is determined by the edge's ``Kind``.

* ``Section`` -- A set of ``Symbol`` instances, plus a set of ``Block``
  instances, with a ``Name``, a set of ``ProtectionFlags``, and an ``Ordinal``.

  Sections make it easy to iterate over the symbols or blocks associated with
  a particular section in the source object file.

  * ``blocks()`` returns an iterator over the set of blocks defined in the
    section (as ``Block*`` pointers).

  * ``symbols()`` returns an iterator over the set of symbols defined in the
    section (as ``Symbol*`` pointers).

  * ``Name`` is represented as an ``llvm::StringRef``, and is accessible via the
    ``getName`` method.

  * ``ProtectionFlags`` are represented as a ``sys::Memory::ProtectionFlags``
    enum, and accessible via the ``getProtectionFlags`` method. These flags
    describe whether the section is readable, writable, executable, or some
    combination of these. The most common combinations are ``RW-`` for writable
    data, ``R--`` for constant data, and ``R-X`` for code.

  * ``SectionOrdinal``, accessible via ``getOrdinal``, is a number used to order
    the section relative to others. It is usually used to preserve section
    order within a segment (a set of sections with the same memory protections)
    when laying out memory.

For the graph-theorists: The ``LinkGraph`` is bipartite, with one set of
``Symbol`` nodes and one set of ``Addressable`` nodes. Each ``Symbol`` node has
one (implicit) edge to its target ``Addressable``. Each ``Block`` has a set of
edges (possibly empty, represented as ``Edge`` instances) back to elements of
the ``Symbol`` set. For convenience and performance of common algorithms,
symbols and blocks are further grouped into ``Sections``.
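
The ``AlignmentOffset`` rule described for ``Block`` above can be checked with
a little arithmetic. The following is a minimal sketch in plain C++ that does
not use JITLink at all: ``alignBlockStart`` is a hypothetical helper written
for this illustration, and it assumes that a block is correctly placed when
``Addr % Alignment == AlignmentOffset``.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>

  // Hypothetical helper (not part of JITLink): returns the lowest address
  // >= Addr such that Result % Alignment == AlignmentOffset.
  // Assumes Alignment is non-zero and AlignmentOffset < Alignment.
  inline uint64_t alignBlockStart(uint64_t Addr, uint64_t Alignment,
                                  uint64_t AlignmentOffset) {
    assert(Alignment != 0 && AlignmentOffset < Alignment);
    uint64_t Rem = Addr % Alignment;
    // Distance from Addr to the next address with the requested remainder.
    uint64_t Delta = (AlignmentOffset + Alignment - Rem) % Alignment;
    return Addr + Delta;
  }

For the one-byte-then-``uint64_t`` block from the text (alignment 8, alignment
offset 7), a candidate address of ``0x1000`` is bumped to ``0x1007``, which
puts the ``uint64_t`` at the 8-byte-aligned address ``0x1008``.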

The ``LinkGraph`` itself provides operations for constructing, removing, and
iterating over sections, symbols, and blocks. It also provides metadata
and utilities relevant to the linking process:

* Graph element operations

  * ``sections`` returns an iterator over all sections in the graph.

  * ``findSectionByName`` returns a pointer to the section with the given
    name (as a ``Section*``) if it exists, otherwise returns ``nullptr``.

  * ``blocks`` returns an iterator over all blocks in the graph (across all
    sections).

  * ``defined_symbols`` returns an iterator over all defined symbols in the
    graph (across all sections).

  * ``external_symbols`` returns an iterator over all external symbols in the
    graph.

  * ``absolute_symbols`` returns an iterator over all absolute symbols in the
    graph.

  * ``createSection`` creates a section with a given name and protection flags.

  * ``createContentBlock`` creates a block with the given initial content,
    parent section, address, alignment, and alignment offset.

  * ``createZeroFillBlock`` creates a zero-fill block with the given size,
    parent section, address, alignment, and alignment offset.

  * ``addExternalSymbol`` creates a new addressable and symbol with a given
    name, size, and linkage.

  * ``addAbsoluteSymbol`` creates a new addressable and symbol with a given
    name, address, size, linkage, scope, and liveness.

  * ``addCommonSymbol`` is a convenience function for creating a zero-filled
    block and weak symbol with a given name, scope, section, initial address,
    size, alignment and liveness.

  * ``addAnonymousSymbol`` creates a new anonymous symbol for a given block,
    offset, size, callable-ness, and liveness.

  * ``addDefinedSymbol`` creates a new symbol for a given block with a name,
    offset, size, linkage, scope, callable-ness and liveness.

  * ``makeExternal`` transforms a formerly defined symbol into an external one
    by creating a new addressable and pointing the symbol at it. The existing
    block is not deleted, but can be manually removed (if unreferenced) by
    calling ``removeBlock``. All edges to the symbol remain valid, but the
    symbol must now be defined outside this ``LinkGraph``.

  * ``removeExternalSymbol`` removes an external symbol and its target
    addressable. The target addressable must not be referenced by any other
    symbols.

  * ``removeAbsoluteSymbol`` removes an absolute symbol and its target
    addressable. The target addressable must not be referenced by any other
    symbols.

  * ``removeDefinedSymbol`` removes a defined symbol, but *does not* remove
    its target block.

  * ``removeBlock`` removes the given block.

  * ``splitBlock`` splits a given block in two at a given index (useful where
    it is known that a block contains decomposable records, e.g. CFI records
    in an eh-frame section).

* Graph utility operations

  * ``getName`` returns the name of this graph, which is usually based on the
    name of the input object file.

  * ``getTargetTriple`` returns an ``llvm::Triple`` for the executor process.

  * ``getPointerSize`` returns the size of a pointer (in bytes) in the executor
    process.

  * ``getEndianness`` returns the endianness of the executor process.

  * ``allocateString`` copies data from a given ``llvm::Twine`` into the
    link graph's internal allocator. This can be used to ensure that content
    created inside a pass outlives that pass's execution.

.. _generic_link_algorithm:

Generic Link Algorithm
======================

JITLink provides a generic link algorithm which can be extended / modified at
certain points by the introduction of JITLink :ref:`passes`.
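
One fixed step of this algorithm is dead-stripping, which is driven by the
``Live`` flag described in the ``LinkGraph`` section. The following is a rough
sketch of how liveness propagation might work, written against a toy graph
model in plain C++. ``ToyEdge``, ``ToyBlock``, ``ToySymbol``, ``ToyGraph``, and
``findDeadSymbols`` are all hypothetical names invented for this illustration;
they are not the JITLink classes, and the real implementation may differ.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <string>
  #include <vector>

  // Toy stand-ins for Edge, Block, and Symbol. Blocks and symbols are
  // referred to by index into the vectors held by ToyGraph.
  struct ToyEdge { std::size_t TargetSym; };
  struct ToyBlock { std::vector<ToyEdge> Edges; };
  struct ToySymbol {
    std::string Name;
    std::size_t Block; // Index of the block this symbol points into.
    bool Live;         // Initially true only for dead-strip roots.
  };
  struct ToyGraph {
    std::vector<ToyBlock> Blocks;
    std::vector<ToySymbol> Symbols;
  };

  // Propagates liveness from the root symbols through block edges, then
  // returns the names of the symbols that a prune step would remove.
  inline std::vector<std::string> findDeadSymbols(ToyGraph &G) {
    std::vector<bool> BlockVisited(G.Blocks.size(), false);
    std::vector<std::size_t> Worklist;
    for (std::size_t I = 0; I != G.Symbols.size(); ++I)
      if (G.Symbols[I].Live)
        Worklist.push_back(I);

    while (!Worklist.empty()) {
      std::size_t BlockIdx = G.Symbols[Worklist.back()].Block;
      Worklist.pop_back();
      assert(BlockIdx < G.Blocks.size() && "symbol refers to a missing block");
      if (BlockVisited[BlockIdx])
        continue; // The edges of this block were already followed.
      BlockVisited[BlockIdx] = true;
      for (const ToyEdge &E : G.Blocks[BlockIdx].Edges) {
        ToySymbol &Target = G.Symbols[E.TargetSym];
        if (!Target.Live) {
          Target.Live = true;
          Worklist.push_back(E.TargetSym);
        }
      }
    }

    std::vector<std::string> Dead;
    for (const ToySymbol &S : G.Symbols)
      if (!S.Live)
        Dead.push_back(S.Name);
    return Dead;
  }

In this model a symbol survives only if it is a root or is the target of an
edge in a block reached from a root; a block that no live symbol points into
is never visited, so its outgoing edges keep nothing alive.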

At the end of each phase the linker packages its state into a *continuation*
and calls the ``JITLinkContext`` object to perform a (potentially high-latency)
asynchronous operation: allocating memory, resolving external symbols, and
finally transferring linked memory to the executing process.

#. Phase 1

   This phase is called immediately by the ``link`` function as soon as the
   initial configuration (including the pass pipeline setup) is complete.

   #. Run pre-prune passes.

      These passes are called on the graph before it is pruned. At this stage
      ``LinkGraph`` nodes still have their original vmaddrs. A mark-live pass
      (supplied by the ``JITLinkContext``) will be run at the end of this
      sequence to mark the initial set of live symbols.

      Notable use cases: marking nodes live, accessing/copying graph data that
      will be pruned (e.g. metadata that's important for the JIT, but not needed
      for the link process).

   #. Prune (dead-strip) the ``LinkGraph``.

      Removes all symbols and blocks not reachable from the initial set of live
      symbols.

      This allows JITLink to remove unreachable symbols / content, including
      overridden weak and redundant ODR definitions.

   #. Run post-prune passes.

      These passes are run on the graph after dead-stripping, but before memory
      is allocated or nodes assigned their final target vmaddrs.

      Passes run at this stage benefit from pruning, as dead functions and data
      have been stripped from the graph. However new content can still be added
      to the graph, as target and working memory have not been allocated yet.

      Notable use cases: Building Global Offset Table (GOT), Procedure Linkage
      Table (PLT), and Thread Local Variable (TLV) entries.

   #. Asynchronously allocate memory.

      Calls the ``JITLinkContext``'s ``JITLinkMemoryManager`` to allocate both
      working and target memory for the graph.
      As part of this process the
      ``JITLinkMemoryManager`` will update the addresses of all nodes
      defined in the graph to their assigned target address.

      Note: This step only updates the addresses of nodes defined in this graph.
      External symbols will still have null addresses.

#. Phase 2

   #. Run post-allocation passes.

      These passes are run on the graph after working and target memory have
      been allocated, but before the ``JITLinkContext`` is notified of the
      final addresses of the symbols in the graph. This gives these passes a
      chance to set up data structures associated with target addresses before
      any JITLink clients (especially ORC queries for symbol resolution) can
      attempt to access them.

      Notable use cases: Setting up mappings between target addresses and
      JIT data structures, such as a mapping between ``__dso_handle`` and
      ``JITDylib*``.

   #. Notify the ``JITLinkContext`` of the assigned symbol addresses.

      Calls ``JITLinkContext::notifyResolved`` on the link graph, allowing
      clients to react to the symbol address assignments made for this graph.
      In ORC this is used to notify any pending queries for *resolved* symbols,
      including pending queries from concurrently running JITLink instances that
      have reached the next step and are waiting on the address of a symbol in
      this graph to proceed with their link.

   #. Identify external symbols and resolve their addresses asynchronously.

      Calls the ``JITLinkContext`` to resolve the target address of any external
      symbols in the graph.

#. Phase 3

   #. Apply external symbol resolution results.

      This updates the addresses of all external symbols. At this point all
      nodes in the graph have their final target addresses, however node
      content still points back to the original data in the object file.

   #. Run pre-fixup passes.

      These passes are called on the graph after all nodes have been assigned
      their final target addresses, but before node content is copied into
      working memory and fixed up. Passes run at this stage can make late
      optimizations to the graph and content based on address layout.

      Notable use cases: GOT and PLT relaxation, where GOT and PLT accesses are
      bypassed for fixup targets that are directly accessible under the assigned
      memory layout.

   #. Copy block content to working memory and apply fixups.

      Copies all block content into allocated working memory (following the
      target layout) and applies fixups. Graph blocks are updated to point at
      the fixed up content.

   #. Run post-fixup passes.

      These passes are called on the graph after fixups have been applied and
      blocks updated to point to the fixed up content.

      Post-fixup passes can inspect block contents to see the exact bytes that
      will be copied to the assigned target addresses.

   #. Finalize memory asynchronously.

      Calls the ``JITLinkMemoryManager`` to copy working memory to the executor
      process and apply the requested permissions.

#. Phase 4

   #. Notify the context that the graph has been emitted.

      Calls ``JITLinkContext::notifyFinalized`` and hands off the
      ``JITLinkMemoryManager::FinalizedAlloc`` object for this graph's memory
      allocation. This allows the context to track/hold memory allocations and
      react to the newly emitted definitions. In ORC this is used to update the
      ``ExecutionSession`` instance's dependence graph, which may result in
      these symbols (and possibly others) becoming *Ready* if all of their
      dependencies have also been emitted.

.. _passes:

Passes
------

JITLink passes are ``std::function<Error(LinkGraph&)>`` instances.
They are free
to inspect and modify the given ``LinkGraph`` subject to the constraints of
whatever phase they are running in (see :ref:`generic_link_algorithm`). If a
pass returns ``Error::success()`` then linking continues. If a pass returns
a failure value then linking is stopped and the ``JITLinkContext`` is notified
that the link failed.

Passes may be used by both JITLink backends (e.g. MachO/x86-64 implements GOT
and PLT construction as a pass), and external clients like
``ObjectLinkingLayer::Plugin``.

In combination with the open ``LinkGraph`` API, JITLink passes enable the
implementation of powerful new features. For example:

* Relaxation optimizations -- A pre-fixup pass can inspect GOT accesses and PLT
  calls and identify situations where the addresses of the entry target and the
  access are close enough to be accessed directly. In this case the pass can
  rewrite the instruction stream of the containing block and update the fixup
  edges to make the access direct.

  Code for this looks like:

.. code-block:: c++

  Error relaxGOTEdges(LinkGraph &G) {
    for (auto *B : G.blocks())
      for (auto &E : B->edges())
        if (E.getKind() == x86_64::GOTLoad) {
          auto &GOTTarget = getGOTEntryTarget(E.getTarget());
          if (isInRange(B->getFixupAddress(E), GOTTarget)) {
            // Rewrite B->getContent() at fixup address from
            // MOVQ to LEAQ

            // Update edge target and kind.
            E.setTarget(GOTTarget);
            E.setKind(x86_64::PCRel32);
          }
        }

    return Error::success();
  }

* Metadata registration -- Post allocation passes can be used to record the
  address range of sections in the target. This can be used to register the
  metadata (e.g. exception handling frames, language metadata) in the target
  once memory has been finalized.

.. code-block:: c++

  Error registerEHFrameSection(LinkGraph &G) {
    if (auto *Sec = G.findSectionByName("__eh_frame")) {
      SectionRange SR(*Sec);
      registerEHFrameSection(SR.getStart(), SR.getEnd());
    }

    return Error::success();
  }

* Record call sites for later mutation -- A post-allocation pass can record
  the call sites of all calls to a particular function, allowing those call
  sites to be updated later at runtime (e.g. for instrumentation, or to
  enable the function to be lazily compiled but still called directly after
  compilation).

.. code-block:: c++

  StringRef FunctionName = "foo";
  std::vector<ExecutorAddr> CallSitesForFunction;

  auto RecordCallSites =
    [&](LinkGraph &G) -> Error {
      for (auto *B : G.blocks())
        for (auto &E : B->edges())
          if (E.getKind() == CallEdgeKind &&
              E.getTarget().hasName() &&
              E.getTarget().getName() == FunctionName)
            CallSitesForFunction.push_back(B->getFixupAddress(E));
      return Error::success();
    };

Memory Management with JITLinkMemoryManager
-------------------------------------------

JIT linking requires allocation of two kinds of memory: working memory in the
JIT process and target memory in the executor process (these processes and
memory allocations may be one and the same, depending on how the user wants
to build their JIT). It also requires that these allocations conform to the
requested code model in the target process (e.g. MachO/x86-64's Small code
model requires that all code and data for a simulated dylib is allocated within
4GB).
Finally, it is natural to make the memory manager responsible for
transferring memory to the target address space and applying memory protections,
since the memory manager must know how to communicate with the executor, and
since sharing and protection assignment can often be efficiently managed (in
the common case of running across processes on the same machine for security)
via the host operating system's virtual memory management APIs.

To satisfy these requirements ``JITLinkMemoryManager`` adopts the following
design: The memory manager itself has just two virtual methods for asynchronous
operations (each with convenience overloads for calling synchronously):

.. code-block:: c++

  /// Called when allocation has been completed.
  using OnAllocatedFunction =
    unique_function<void(Expected<std::unique_ptr<InFlightAlloc>>)>;

  /// Called when deallocation has completed.
  using OnDeallocatedFunction = unique_function<void(Error)>;

  /// Call to allocate memory.
  virtual void allocate(const JITLinkDylib *JD, LinkGraph &G,
                        OnAllocatedFunction OnAllocated) = 0;

  /// Call to deallocate memory.
  virtual void deallocate(std::vector<FinalizedAlloc> Allocs,
                          OnDeallocatedFunction OnDeallocated) = 0;

The ``allocate`` method takes a ``JITLinkDylib*`` representing the target
simulated dylib, a reference to the ``LinkGraph`` that must be allocated for,
and a callback to run once an ``InFlightAlloc`` has been constructed.
``JITLinkMemoryManager`` implementations can (optionally) use the ``JD``
argument to manage a per-simulated-dylib memory pool (since code model
constraints are typically imposed on a per-dylib basis, and not across
dylibs) [2]_. The ``LinkGraph`` describes the object file that we need to
allocate memory for.
The allocator must allocate working memory for all of
the Blocks defined in the graph, assign address space for each Block within the
executor process's memory, and update the Blocks' addresses to reflect this
assignment. Block content should be copied to working memory, but does not need
to be transferred to executor memory yet (that will be done once the content is
fixed up). ``JITLinkMemoryManager`` implementations can take full
responsibility for these steps, or use the ``BasicLayout`` utility to reduce
the task to allocating working and executor memory for *segments*: chunks of
memory defined by permissions, alignments, content sizes, and zero-fill sizes.
Once the allocation step is complete the memory manager should construct an
``InFlightAlloc`` object to represent the allocation, and then pass this object
to the ``OnAllocated`` callback.

The ``InFlightAlloc`` object has two virtual methods:

.. code-block:: c++

  using OnFinalizedFunction = unique_function<void(Expected<FinalizedAlloc>)>;
  using OnAbandonedFunction = unique_function<void(Error)>;

  /// Called prior to finalization if the allocation should be abandoned.
  virtual void abandon(OnAbandonedFunction OnAbandoned) = 0;

  /// Called to transfer working memory to the target and apply finalization.
  virtual void finalize(OnFinalizedFunction OnFinalized) = 0;

The linking process will call the ``finalize`` method on the ``InFlightAlloc``
object if linking succeeds up to the finalization step, otherwise it will call
``abandon`` to indicate that some error occurred during linking. A call to the
``InFlightAlloc::finalize`` method should cause content for the allocation to be
transferred from working to executor memory, and permissions to be applied. A
call to ``abandon`` should result in both kinds of memory being deallocated.
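
The finalize-or-abandon contract above can be modelled in a few lines. This is
only a sketch under simplifying assumptions: ``ToyInFlightAlloc``,
``ToyFinalizedAlloc``, and ``finishLink`` are hypothetical names, callbacks are
plain ``std::function`` objects invoked synchronously, and plain values stand
in for the ``Expected`` and ``Error`` types used by the real interface.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>
  #include <functional>

  // Toy stand-in for FinalizedAlloc: just an id naming executor memory.
  struct ToyFinalizedAlloc { uint64_t Id; };

  // Toy stand-in for InFlightAlloc. Exactly one of finalize/abandon may be
  // called, and only once.
  class ToyInFlightAlloc {
  public:
    enum class State { InFlight, Finalized, Abandoned };

    explicit ToyInFlightAlloc(uint64_t Id) : Id(Id) {}

    // Models copying content to executor memory and applying permissions.
    void finalize(std::function<void(ToyFinalizedAlloc)> OnFinalized) {
      assert(S == State::InFlight && "allocation already completed");
      S = State::Finalized;
      OnFinalized(ToyFinalizedAlloc{Id});
    }

    // Models releasing both working and executor memory.
    void abandon(std::function<void()> OnAbandoned) {
      assert(S == State::InFlight && "allocation already completed");
      S = State::Abandoned;
      OnAbandoned();
    }

    State state() const { return S; }

  private:
    uint64_t Id;
    State S = State::InFlight;
  };

  // Mirrors the linker's choice: finalize if linking succeeded, otherwise
  // abandon. Returns true and fills Result only when finalization ran.
  inline bool finishLink(ToyInFlightAlloc &A, bool LinkSucceeded,
                         ToyFinalizedAlloc &Result) {
    bool Finalized = false;
    if (LinkSucceeded)
      A.finalize([&](ToyFinalizedAlloc FA) { Result = FA; Finalized = true; });
    else
      A.abandon([] {});
    return Finalized;
  }

A real memory manager would typically complete these callbacks asynchronously
and report failures through them; the toy version only captures which of the
two paths runs and what is handed back on success.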

On successful finalization, the ``InFlightAlloc::finalize`` method should
construct a ``FinalizedAlloc`` object (an opaque ``uint64_t`` id that the
``JITLinkMemoryManager`` can use to identify executor memory for deallocation)
and pass it to the ``OnFinalized`` callback.

Finalized allocations (represented by ``FinalizedAlloc`` objects) can be
deallocated by calling the ``JITLinkMemoryManager::deallocate`` method. This
method takes a vector of ``FinalizedAlloc`` objects, since it is common to
deallocate multiple objects at the same time and this allows us to batch these
requests for transmission to the executing process.

JITLink provides a simple in-process implementation of this interface:
``InProcessMemoryManager``. It allocates pages once and re-uses them as both
working and target memory.

ORC provides a cross-process-capable ``MapperJITLinkMemoryManager`` that can use
shared memory or ORC-RPC-based communication to transfer content to the executing
process.

JITLinkMemoryManager and Security
---------------------------------

JITLink's ability to link JIT'd code for a separate executor process can be
used to improve the security of a JIT system: The executor process can be
sandboxed, run within a VM, or even run on a fully separate machine.

JITLink's memory manager interface is flexible enough to allow for a range of
trade-offs between performance and security. For example, on a system where code
pages must be signed (preventing code from being updated), the memory manager
can deallocate working memory pages after linking to free memory in the process
running JITLink. Alternatively, on a system that allows RWX pages, the memory
manager may use the same pages for both working and target memory by marking
them as RWX, allowing code to be modified in place without further overhead.
Finally, if RWX pages are not permitted but dual-virtual-mappings of
physical memory pages are, then the memory manager can dual map physical pages
as RW- in the JITLink process and R-X in the executor process, allowing
modification from the JITLink process but not from the executor (at the cost of
extra administrative overhead for the dual mapping).

Error Handling
--------------

JITLink makes extensive use of the ``llvm::Error`` type (see the error handling
section of :doc:`ProgrammersManual` for details). The link process itself, all
passes, the memory manager interface, and operations on the ``JITLinkContext``
are all permitted to fail. Link graph construction utilities (especially parsers
for object formats) are encouraged to validate input, and validate fixups
(e.g. with range checks) before application.

Any error will halt the link process and notify the context of failure. In ORC,
reported failures are propagated to queries pending on definitions provided by
the failing link, and also through edges of the dependence graph to any queries
waiting on dependent symbols.

.. _connection_to_orc_runtime:

Connection to the ORC Runtime
=============================

The ORC Runtime (currently under development) aims to provide runtime support
for advanced JIT features, including object format features that require
non-trivial action in the executor (e.g. running initializers, managing thread
local storage, registering with language runtimes, etc.).

ORC Runtime support for object format features typically requires cooperation
between the runtime (which executes in the executor process) and JITLink (which
runs in the JIT process and can inspect LinkGraphs to determine what actions
must be taken in the executor).
For example: Execution of MachO static
initializers in the ORC runtime is performed by the ``jit_dlopen`` function,
which calls back to the JIT process to ask for the list of address ranges of
``__mod_init`` sections to walk. This list is collated by the
``MachOPlatformPlugin``, which installs a pass to record this information for
each object as it is linked into the target.

.. _constructing_linkgraphs:

Constructing LinkGraphs
=======================

Clients usually access and manipulate ``LinkGraph`` instances that were created
for them by an ``ObjectLinkingLayer`` instance, but they can also be created
manually:

#. By directly constructing and populating a ``LinkGraph`` instance.

#. By using the ``createLinkGraph`` family of functions to create a
   ``LinkGraph`` from an in-memory buffer containing an object file. This is how
   ``ObjectLinkingLayer`` usually creates ``LinkGraphs``.

   #. ``createLinkGraph_<Object-Format>_<Architecture>`` can be used when
      both the object format and architecture are known ahead of time.

   #. ``createLinkGraph_<Object-Format>`` can be used when the object format is
      known ahead of time, but the architecture is not. In this case the
      architecture will be determined by inspection of the object header.

   #. ``createLinkGraph`` can be used when neither the object format nor
      the architecture is known ahead of time. In this case the object header
      will be inspected to determine both the format and architecture.

.. _jit_linking:

JIT Linking
===========

The JIT linker concept was introduced in LLVM's earlier generation of JIT APIs,
MCJIT. In MCJIT the *RuntimeDyld* component enabled re-use of LLVM as an
in-memory compiler by adding an in-memory link step to the end of the usual
compiler pipeline.
Rather than dumping relocatable objects to disk as a compiler
usually would, MCJIT passed them to RuntimeDyld to be linked into a target
process.

This approach to linking differs from standard *static* or *dynamic* linking:

A *static linker* takes one or more relocatable object files as input and links
them into an executable or dynamic library on disk.

A *dynamic linker* applies relocations to executables and dynamic libraries that
have been loaded into memory.

A *JIT linker* takes a single relocatable object file at a time and links it
into a target process, usually using a context object to allow the linked code
to resolve symbols in the target.

RuntimeDyld
-----------

In order to keep RuntimeDyld's implementation simple MCJIT imposed some
restrictions on compiled code:

#. It had to use the Large code model, and often restricted available relocation
   models in order to limit the kinds of relocations that had to be supported.

#. It required strong linkage and default visibility on all symbols -- behavior
   for other linkages/visibilities was not well defined.

#. It constrained and/or prohibited the use of features requiring runtime
   support, e.g. static initializers or thread local storage.

As a result of these restrictions not all language features supported by LLVM
worked under MCJIT, and objects to be loaded under the JIT had to be compiled to
target it (precluding the use of precompiled code from other sources under the
JIT).

RuntimeDyld also provided very limited visibility into the linking process
itself: Clients could access conservative estimates of section size
(RuntimeDyld bundled stub size and padding estimates into the section size
value) and the final relocated bytes, but could not access RuntimeDyld's
internal object representations.

Eliminating these restrictions and limitations was one of the primary motivations
for the development of JITLink.

The llvm-jitlink tool
=====================

The ``llvm-jitlink`` tool is a command line wrapper for the JITLink library.
It loads some set of relocatable object files and then links them using
JITLink. Depending on the options used it will then execute them, or validate
the linked memory.

The ``llvm-jitlink`` tool was originally designed to aid JITLink development by
providing a simple environment for testing.

Basic usage
-----------

By default, ``llvm-jitlink`` will link the set of objects passed on the command
line, then search for a "main" function and execute it:

.. code-block:: sh

  % cat hello-world.c
  #include <stdio.h>

  int main(int argc, char *argv[]) {
    printf("hello, world!\n");
    return 0;
  }

  % clang -c -o hello-world.o hello-world.c
  % llvm-jitlink hello-world.o
  hello, world!

Multiple objects may be specified, and arguments may be provided to the JIT'd
main function using the ``-args`` option:

.. code-block:: sh

  % cat print-args.c
  #include <stdio.h>

  void print_args(int argc, char *argv[]) {
    for (int i = 0; i != argc; ++i)
      printf("arg %i is \"%s\"\n", i, argv[i]);
  }

  % cat print-args-main.c
  void print_args(int argc, char *argv[]);

  int main(int argc, char *argv[]) {
    print_args(argc, argv);
    return 0;
  }

  % clang -c -o print-args.o print-args.c
  % clang -c -o print-args-main.o print-args-main.c
  % llvm-jitlink print-args.o print-args-main.o -args a b c
  arg 0 is "a"
  arg 1 is "b"
  arg 2 is "c"

Alternative entry points may be specified using the ``-entry <entry point
name>`` option.

Other options can be found by calling ``llvm-jitlink -help``.

llvm-jitlink as a regression testing utility
--------------------------------------------

One of the primary aims of ``llvm-jitlink`` was to enable readable regression
tests for JITLink. To do this it supports two options:

The ``-noexec`` option tells llvm-jitlink to stop after looking up the entry
point, and before attempting to execute it. Since the linked code is not
executed, this can be used to link for other targets even if you do not have
access to the target being linked (the ``-define-abs`` or ``-phony-externals``
options can be used to supply any missing definitions in this case).

The ``-check <check-file>`` option can be used to run a set of ``jitlink-check``
expressions against working memory. It is typically used in conjunction with
``-noexec``, since the aim is to validate JIT'd memory rather than to run the
code and ``-noexec`` allows us to link for any supported target architecture
from the current process. In ``-check`` mode, ``llvm-jitlink`` will scan the
given check-file for lines of the form ``# jitlink-check: <expr>``. See
examples of this usage in ``llvm/test/ExecutionEngine/JITLink``.

Remote execution via llvm-jitlink-executor
------------------------------------------

By default ``llvm-jitlink`` will link the given objects into its own process,
but this can be overridden by two options:

The ``-oop-executor[=/path/to/executor]`` option tells ``llvm-jitlink`` to
execute the given executor (which defaults to ``llvm-jitlink-executor``) and
communicate with it via file descriptors which it passes to the executor
as the first argument with the format ``filedescs=<in-fd>,<out-fd>``.

The ``-oop-executor-connect=<host>:<port>`` option tells ``llvm-jitlink`` to
connect to an already running executor via TCP on the given host and port.
To use this option you will need to start ``llvm-jitlink-executor`` manually
with ``listen=<host>:<port>`` as the first argument.

Harness mode
------------

The ``-harness`` option allows a set of input objects to be designated as a test
harness, with the regular object files implicitly treated as objects to be
tested. Definitions of symbols in the harness set override definitions in the
test set, and external references from the harness cause automatic scope
promotion of local symbols in the test set (these modifications to the usual
linker rules are accomplished via an ``ObjectLinkingLayer::Plugin`` installed by
``llvm-jitlink`` when it sees the ``-harness`` option).

With these modifications in place we can selectively test functions in an object
file by mocking those functions' callees. For example, suppose we have an object
file, ``test_code.o``, compiled from the following C source (which we need not
have access to):

.. code-block:: c

  void irrelevant_function() { irrelevant_external(); }

  int function_to_mock(int X) {
    return /* some function of X */;
  }

  static void function_to_test() {
    ...
    int Y = function_to_mock(/* some X */);
    printf("Y is %i\n", Y);
  }

If we want to know how ``function_to_test`` behaves when we change the behavior
of ``function_to_mock`` we can test it by writing a test harness:

.. code-block:: c

  #include <stdio.h>

  void function_to_test();

  int function_to_mock(int X) {
    printf("used mock utility function\n");
    return 42;
  }

  int main(int argc, char *argv[]) {
    function_to_test();
    return 0;
  }

Under normal circumstances these objects could not be linked together:
``function_to_test`` is static and could not be resolved outside
``test_code.o``, the two ``function_to_mock`` functions would result in a
duplicate definition error, and ``irrelevant_external`` is undefined.
However, using ``-harness`` and ``-phony-externals`` we can run this code
with:

.. code-block:: sh

  % clang -c -o test_code_harness.o test_code_harness.c
  % llvm-jitlink -phony-externals test_code.o -harness test_code_harness.o
  used mock utility function
  Y is 42

The ``-harness`` option may be of interest to people who want to perform some
very late testing on build products to verify that compiled code behaves as
expected. On basic C test cases this is relatively straightforward. Mocks for
more complicated languages (e.g. C++) are much trickier: Any code involving
classes tends to have a lot of non-trivial surface area (e.g. vtables) that
would require great care to mock.

Tips for JITLink backend developers
-----------------------------------

#. Make liberal use of assert and ``llvm::Error``. Do *not* assume that the input
   object is well formed: Return any errors produced by libObject (or your own
   object parsing code) and validate as you construct. Think carefully about the
   distinction between contract errors (which should be validated with asserts
   and ``llvm_unreachable``) and environmental errors (which should generate
   ``llvm::Error`` instances).

#. Don't assume you're linking in-process. Use libSupport's sized,
   endian-specific types when reading/writing content in the ``LinkGraph``.

As a "minimum viable" JITLink wrapper, the ``llvm-jitlink`` tool is an
invaluable resource for developers bringing in a new JITLink backend. A standard
workflow is to start by throwing an unsupported object at the tool and seeing
what error is returned, then fixing that (you can often make a reasonable guess
at what should be done based on existing code for other formats or
architectures).

In debug builds of LLVM, the ``-debug-only=jitlink`` option dumps logs from the
JITLink library during the link process.
These can be useful for spotting some bugs at
a glance. The ``-debug-only=llvm_jitlink`` option dumps logs from the ``llvm-jitlink``
tool, which can be useful for debugging both testcases (it is often less verbose than
``-debug-only=jitlink``) and the tool itself.

The ``-oop-executor`` and ``-oop-executor-connect`` options are helpful for testing
handling of cross-process and cross-architecture use cases.

Roadmap
=======

JITLink is under active development. Work so far has focused on the MachO
implementation. In LLVM 12 there is limited support for ELF on x86-64.

Major outstanding projects include:

* Refactor architecture support to maximize sharing across formats.

  All formats should be able to share the bulk of the architecture specific
  code (especially relocations) for each supported architecture.

* Refactor ELF link graph construction.

  ELF's link graph construction is currently implemented in the ``ELF_x86_64.cpp``
  file, and tied to the x86-64 relocation parsing code. The bulk of the code is
  generic and should be split into an ``ELFLinkGraphBuilder`` base class along the
  same lines as the existing generic ``MachOLinkGraphBuilder``.

* Implement support for arm32.

* Implement support for other new architectures.

JITLink Availability and Feature Status
---------------------------------------

The following table describes the status of the JITLink backends for various
format / architecture combinations (as of July 2023).

Support levels:

* None: No backend. JITLink will return an "architecture not supported" error.
  Represented by empty cells in the table below.
* Skeleton: A backend exists, but does not support commonly used relocations.
  Even simple programs are likely to trigger an "unsupported relocation" error.
  Backends in this state may be easy to improve by implementing new relocations.
  Consider getting involved!
* Basic: The backend supports simple programs, but isn't ready for general use
  yet.
* Usable: The backend is suitable for general use for at least one code and
  relocation model.
* Good: The backend supports almost all relocations. Advanced features like
  native thread local storage may not be available yet.
* Complete: The backend supports all relocations and object format features.

.. list-table:: Availability and Status
   :widths: 10 30 30 30
   :header-rows: 1
   :stub-columns: 1

   * - Architecture
     - ELF
     - COFF
     - MachO
   * - arm32
     - Skeleton
     -
     -
   * - arm64
     - Usable
     -
     - Good
   * - LoongArch
     - Good
     -
     -
   * - PowerPC 64
     - Usable
     -
     -
   * - RISC-V
     - Good
     -
     -
   * - x86-32
     - Basic
     -
     -
   * - x86-64
     - Good
     - Usable
     - Good

.. [1] See ``llvm/examples/OrcV2Examples/LLJITWithObjectLinkingLayerPlugin`` for
       a full worked example.

.. [2] If not for *hidden* scoped symbols we could eliminate the
       ``JITLinkDylib*`` argument to ``JITLinkMemoryManager::allocate`` and
       treat every object as a separate simulated dylib for the purposes of
       memory layout. Hidden symbols break this by generating in-range accesses
       to external symbols, requiring the access and symbol to be allocated
       within range of one another. That said, providing a pre-reserved address
       range pool for each simulated dylib guarantees that the relaxation
       optimizations will kick in for all intra-dylib references, which is good
       for performance (at the cost of whatever overhead is introduced by
       reserving the address-range up-front).