====================================
JITLink and ORC's ObjectLinkingLayer
====================================

.. contents::
   :local:

Introduction
============

This document aims to provide a high-level overview of the design and API
of the JITLink library. It assumes some familiarity with linking and
relocatable object files, but should not require deep expertise. If you know
what a section, symbol, and relocation are then you should find this document
accessible. If it is not, please submit a patch (:doc:`Contributing`) or file a
bug (:doc:`HowToSubmitABug`).

JITLink is a library for :ref:`jit_linking`. It was built to support the :doc:`ORC JIT
APIs<ORCv2>` and is most commonly accessed via ORC's ObjectLinkingLayer API. JITLink was
developed with the aim of supporting the full set of features provided by each
object format, including static initializers, exception handling, thread local
variables, and language runtime registration. Supporting these features enables
ORC to execute code generated from source languages which rely on these features
(e.g. C++ requires object format support for static initializers to support
static constructors, eh-frame registration for exceptions, and TLV support for
thread locals; Swift and Objective-C require language runtime registration for
many features). For some object format features support is provided entirely
within JITLink, and for others it is provided in cooperation with the
(prototype) ORC runtime.

JITLink aims to support the following features, some of which are still under
development:

1. Cross-process and cross-architecture linking of single relocatable objects
   into a target *executor* process.

2. Support for all object format features.

3. Open linker data structures (``LinkGraph``) and pass system.

JITLink and ObjectLinkingLayer
==============================

``ObjectLinkingLayer`` is ORC's wrapper for JITLink. It is an ORC layer that
allows objects to be added to a ``JITDylib``, or emitted from some higher level
program representation. When an object is emitted, ``ObjectLinkingLayer`` uses
JITLink to construct a ``LinkGraph`` (see :ref:`constructing_linkgraphs`) and
calls JITLink's ``link`` function to link the graph into the executor process.

The ``ObjectLinkingLayer`` class provides a plugin API,
``ObjectLinkingLayer::Plugin``, which users can subclass in order to inspect and
modify ``LinkGraph`` instances at link time, and react to important JIT events
(such as an object being emitted into target memory). This enables many features
and optimizations that were not possible under MCJIT or RuntimeDyld.

ObjectLinkingLayer Plugins
--------------------------

The ``ObjectLinkingLayer::Plugin`` class provides the following methods:

* ``modifyPassConfig`` is called each time a LinkGraph is about to be linked. It
  can be overridden to install JITLink *Passes* to run during the link process.

  .. code-block:: c++

    void modifyPassConfig(MaterializationResponsibility &MR,
                          jitlink::LinkGraph &G,
                          jitlink::PassConfiguration &Config)

* ``notifyLoaded`` is called before the link begins, and can be overridden to
  set up any initial state for the given ``MaterializationResponsibility`` if
  needed.

  .. code-block:: c++

    void notifyLoaded(MaterializationResponsibility &MR)

* ``notifyEmitted`` is called after the link is complete and code has been
  emitted to the executor process. It can be overridden to finalize state
  for the ``MaterializationResponsibility`` if needed.

  .. code-block:: c++

    Error notifyEmitted(MaterializationResponsibility &MR)

* ``notifyFailed`` is called if the link fails at any point. It can be
  overridden to react to the failure (e.g. to deallocate any already allocated
  resources).

  .. code-block:: c++

    Error notifyFailed(MaterializationResponsibility &MR)

* ``notifyRemovingResources`` is called when a request is made to remove any
  resources associated with the ``ResourceKey`` *K* for the
  ``MaterializationResponsibility``.

  .. code-block:: c++

    Error notifyRemovingResources(JITDylib &JD, ResourceKey K)

* ``notifyTransferringResources`` is called if/when a request is made to
  transfer tracking of any resources associated with ``ResourceKey``
  *SrcKey* to *DstKey*.

  .. code-block:: c++

    void notifyTransferringResources(JITDylib &JD, ResourceKey DstKey,
                                     ResourceKey SrcKey)

Plugin authors are required to implement the ``notifyFailed``,
``notifyRemovingResources``, and ``notifyTransferringResources`` methods in
order to safely manage resources in the case of resource removal or transfer,
or link failure. If no resources are managed by the plugin then these methods
can be implemented as no-ops (returning ``Error::success()`` where an ``Error``
result is required).

Plugin instances are added to an ``ObjectLinkingLayer`` by
calling the ``addPlugin`` method [1]_. E.g.

.. code-block:: c++

  // Plugin class to print the set of defined symbols in an object when that
  // object is linked.
  class MyPlugin : public ObjectLinkingLayer::Plugin {
  public:

    // Add passes to print the set of defined symbols after dead-stripping.
    void modifyPassConfig(MaterializationResponsibility &MR,
                          jitlink::LinkGraph &G,
                          jitlink::PassConfiguration &Config) override {
      Config.PostPrunePasses.push_back([this](jitlink::LinkGraph &G) {
        return printAllSymbols(G);
      });
    }

    // Implement mandatory overrides:
    Error notifyFailed(MaterializationResponsibility &MR) override {
      return Error::success();
    }
    Error notifyRemovingResources(JITDylib &JD, ResourceKey K) override {
      return Error::success();
    }
    void notifyTransferringResources(JITDylib &JD, ResourceKey DstKey,
                                     ResourceKey SrcKey) override {}

    // JITLink pass to print all defined symbols in G.
    Error printAllSymbols(jitlink::LinkGraph &G) {
      for (auto *Sym : G.defined_symbols())
        if (Sym->hasName())
          dbgs() << Sym->getName() << "\n";
      return Error::success();
    }
  };

  // Create our LLJIT instance using a custom object linking layer setup.
  // This gives us a chance to install our plugin.
  auto J = ExitOnErr(LLJITBuilder()
             .setObjectLinkingLayerCreator(
               [](ExecutionSession &ES, const Triple &T) {
                 // Manually set up the ObjectLinkingLayer for our LLJIT
                 // instance.
                 auto OLL = std::make_unique<ObjectLinkingLayer>(
                     ES, std::make_unique<jitlink::InProcessMemoryManager>());

                 // Install our plugin:
                 OLL->addPlugin(std::make_unique<MyPlugin>());

                 return OLL;
               })
             .create());

  // Add an object to the JIT. Nothing happens here: linking isn't triggered
  // until we look up some symbol in our object.
  ExitOnErr(J->addObject(loadFromDisk("main.o")));

  // Plugin triggers here when our lookup of main triggers linking of main.o
  auto MainSym = J->lookup("main");

LinkGraph
=========

JITLink maps all relocatable object formats to a generic ``LinkGraph`` type
that is designed to make linking fast and easy. ``LinkGraph`` instances can
also be created manually (see :ref:`constructing_linkgraphs`).

Relocatable object formats (e.g. COFF, ELF, MachO) differ in their details,
but share a common goal: to represent machine level code and data with
annotations that allow them to be relocated in a virtual address space. To
this end they usually contain names (symbols) for content defined inside the
file or externally, chunks of content that must be moved as a unit (sections
or subsections, depending on the format), and annotations describing how to
patch content based on the final address of some target symbol/section
(relocations).

At a high level, the ``LinkGraph`` type represents these concepts as a decorated
graph. Nodes in the graph represent symbols and content, and edges represent
relocations. Each of the elements of the graph is listed here:

* ``Addressable`` -- A node in the link graph that can be assigned an address
  in the executor process's virtual address space.

  Absolute and external symbols are represented using plain ``Addressable``
  instances. Content defined inside the object file is represented using the
  ``Block`` subclass.

* ``Block`` -- An ``Addressable`` node that has ``Content`` (or is marked as
  zero-filled), a parent ``Section``, a ``Size``, an ``Alignment`` (and an
  ``AlignmentOffset``), and a list of ``Edge`` instances.

  Blocks provide a container for binary content which must remain contiguous in
  the target address space (a *layout unit*). Many interesting low level
  operations on ``LinkGraph`` instances involve inspecting or mutating block
  content or edges.

  * ``Content`` is represented as an ``llvm::StringRef``, and accessible via
    the ``getContent`` method. Content is only available for content blocks,
    and not for zero-fill blocks (use ``isZeroFill`` to check, and prefer
    ``getSize`` when only the block size is needed as it works for both
    zero-fill and content blocks).

  * ``Section`` is represented as a ``Section&`` reference, and accessible via
    the ``getSection`` method. The ``Section`` class is described in more detail
    below.

  * ``Size`` is represented as a ``size_t``, and is accessible via the
    ``getSize`` method for both content and zero-filled blocks.

  * ``Alignment`` is represented as a ``uint64_t``, and available via the
    ``getAlignment`` method. It represents the minimum alignment requirement (in
    bytes) of the start of the block.

  * ``AlignmentOffset`` is represented as a ``uint64_t``, and accessible via the
    ``getAlignmentOffset`` method. It represents the offset from the alignment
    required for the start of the block. This is required to support blocks
    whose minimum alignment requirement comes from data at some non-zero offset
    inside the block. E.g. if a block consists of a single byte (with byte
    alignment) followed by a ``uint64_t`` (with 8-byte alignment), then the
    block will have 8-byte alignment with an alignment offset of 7.

  * list of ``Edge`` instances. An iterator range for this list is returned by
    the ``edges`` method. The ``Edge`` class is described in more detail below.

* ``Symbol`` -- An offset from an ``Addressable`` (often a ``Block``), with an
  optional ``Name``, a ``Linkage``, a ``Scope``, a ``Callable`` flag, and a
  ``Live`` flag.

  Symbols make it possible to name content (blocks and addressables are
  anonymous), or target content with an ``Edge``.

  * ``Name`` is represented as an ``llvm::StringRef`` (equal to
    ``llvm::StringRef()`` if the symbol has no name), and accessible via the
    ``getName`` method.

  * ``Linkage`` is one of *Strong* or *Weak*, and is accessible via the
    ``getLinkage`` method. The ``JITLinkContext`` can use this flag to determine
    whether this symbol definition should be kept or dropped.

  * ``Scope`` is one of *Default*, *Hidden*, or *Local*, and is accessible via
    the ``getScope`` method. The ``JITLinkContext`` can use this to determine
    who should be able to see the symbol. A symbol with default scope should be
    globally visible. A symbol with hidden scope should be visible to other
    definitions within the same simulated dylib (e.g. ORC ``JITDylib``) or
    executable, but not from elsewhere. A symbol with local scope should only be
    visible within the current ``LinkGraph``.

  * ``Callable`` is a boolean which is set to true if this symbol can be called,
    and is accessible via the ``isCallable`` method. This can be used to
    automate the introduction of call-stubs for lazy compilation.

  * ``Live`` is a boolean that can be set to mark this symbol as a root for
    dead-stripping purposes (see :ref:`generic_link_algorithm`). JITLink's
    dead-stripping algorithm will propagate liveness flags through the graph to
    all reachable symbols before deleting any symbols (and blocks) that are not
    marked live.

* ``Edge`` -- A quad of an ``Offset`` (implicitly from the start of the
  containing ``Block``), a ``Kind`` (describing the relocation type), a
  ``Target``, and an ``Addend``.

  Edges represent relocations, and occasionally other relationships, between
  blocks and symbols.

  * ``Offset``, accessible via ``getOffset``, is an offset from the start of the
    ``Block`` containing the ``Edge``.

  * ``Kind``, accessible via ``getKind``, is a relocation type -- it describes
    what kinds of changes (if any) should be made to block content at the given
    ``Offset`` based on the address of the ``Target``.

  * ``Target``, accessible via ``getTarget``, is the ``Symbol`` whose address
    is relevant to the fixup calculation specified by the edge's ``Kind``.

  * ``Addend``, accessible via ``getAddend``, is a constant whose interpretation
    is determined by the edge's ``Kind``.

* ``Section`` -- A set of ``Symbol`` instances, plus a set of ``Block``
  instances, with a ``Name``, a set of ``ProtectionFlags``, and an ``Ordinal``.

  Sections make it easy to iterate over the symbols or blocks associated with
  a particular section in the source object file.

  * ``blocks()`` returns an iterator over the set of blocks defined in the
    section (as ``Block*`` pointers).

  * ``symbols()`` returns an iterator over the set of symbols defined in the
    section (as ``Symbol*`` pointers).

  * ``Name`` is represented as an ``llvm::StringRef``, and is accessible via the
    ``getName`` method.

  * ``ProtectionFlags`` are represented as a ``sys::Memory::ProtectionFlags``
    enum, and accessible via the ``getProtectionFlags`` method. These flags
    describe whether the section is readable, writable, executable, or some
    combination of these. The most common combinations are ``RW-`` for writable
    data, ``R--`` for constant data, and ``R-X`` for code.

  * ``Ordinal``, accessible via ``getOrdinal``, is a number used to order
    the section relative to others. It is usually used to preserve section
    order within a segment (a set of sections with the same memory protections)
    when laying out memory.
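
The ``Alignment`` and ``AlignmentOffset`` rule for blocks can be checked with a
little arithmetic. The following is a minimal standalone sketch, not JITLink
code: ``isValidBlockAddress`` and ``nextValidBlockAddress`` are hypothetical
helpers, and the sketch assumes that alignments are powers of two and that a
block may start at address ``A`` when ``A % Alignment == AlignmentOffset``.

.. code-block:: c++

  #include <cstdint>

  // Returns true if a block with the given alignment and alignment offset may
  // start at Addr. Alignment is assumed to be a non-zero power of two.
  bool isValidBlockAddress(uint64_t Addr, uint64_t Alignment,
                           uint64_t AlignmentOffset) {
    return Addr % Alignment == AlignmentOffset % Alignment;
  }

  // Returns the lowest address >= Addr at which such a block may start.
  // Unsigned wrap-around in the subtraction is harmless for power-of-two
  // alignments.
  uint64_t nextValidBlockAddress(uint64_t Addr, uint64_t Alignment,
                                 uint64_t AlignmentOffset) {
    uint64_t Delta = (AlignmentOffset - Addr) % Alignment;
    return Addr + Delta;
  }

For the byte-plus-``uint64_t`` block described above (alignment 8, alignment
offset 7), the first valid start address at or after ``0x1000`` is ``0x1007``,
which places the ``uint64_t`` at the 8-byte aligned address ``0x1008``.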

For the graph-theorists: The ``LinkGraph`` is bipartite, with one set of
``Symbol`` nodes and one set of ``Addressable`` nodes. Each ``Symbol`` node has
one (implicit) edge to its target ``Addressable``. Each ``Block`` has a set of
edges (possibly empty, represented as ``Edge`` instances) back to elements of
the ``Symbol`` set. For convenience and performance of common algorithms,
symbols and blocks are further grouped into ``Sections``.
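
To make the shape of the graph concrete, here is a deliberately tiny standalone
model. The ``Toy*`` types are illustrative stand-ins, not the real JITLink
classes, and carry only the fields needed to show how a block reaches other
addressables through its edges and their target symbols.

.. code-block:: c++

  #include <cstdint>
  #include <string>
  #include <vector>

  // Anything that can be assigned an address in the executor.
  struct ToyAddressable {
    uint64_t Address = 0;
  };

  // A named (or anonymous) offset from an addressable. The Base pointer is
  // the implicit symbol -> addressable edge.
  struct ToySymbol {
    std::string Name;
    ToyAddressable *Base;
    uint64_t Offset;

    uint64_t getAddress() const { return Base->Address + Offset; }
  };

  // A relocation: patch the containing block at Offset using Target's address.
  struct ToyEdge {
    uint32_t Offset;
    int Kind;
    ToySymbol *Target;
    int64_t Addend;
  };

  // An addressable with content and outgoing edges.
  struct ToyBlock : ToyAddressable {
    std::string Content;
    std::vector<ToyEdge> Edges;
  };

  // Follows block -> edge -> symbol -> addressable for every edge in B.
  std::vector<const ToyAddressable *>
  referencedAddressables(const ToyBlock &B) {
    std::vector<const ToyAddressable *> Result;
    for (const ToyEdge &E : B.Edges)
      Result.push_back(E.Target->Base);
    return Result;
  }

Note that a block never points at another block directly. Every reference goes
through a symbol, which is what makes the graph bipartite.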

The ``LinkGraph`` itself provides operations for constructing, removing, and
iterating over sections, symbols, and blocks. It also provides metadata
and utilities relevant to the linking process:

* Graph element operations

  * ``sections`` returns an iterator over all sections in the graph.

  * ``findSectionByName`` returns a pointer to the section with the given
    name (as a ``Section*``) if it exists, otherwise returns a nullptr.

  * ``blocks`` returns an iterator over all blocks in the graph (across all
    sections).

  * ``defined_symbols`` returns an iterator over all defined symbols in the
    graph (across all sections).

  * ``external_symbols`` returns an iterator over all external symbols in the
    graph.

  * ``absolute_symbols`` returns an iterator over all absolute symbols in the
    graph.

  * ``createSection`` creates a section with a given name and protection flags.

  * ``createContentBlock`` creates a block with the given initial content,
    parent section, address, alignment, and alignment offset.

  * ``createZeroFillBlock`` creates a zero-fill block with the given size,
    parent section, address, alignment, and alignment offset.

  * ``addExternalSymbol`` creates a new addressable and symbol with a given
    name, size, and linkage.

  * ``addAbsoluteSymbol`` creates a new addressable and symbol with a given
    name, address, size, linkage, scope, and liveness.

  * ``addCommonSymbol`` is a convenience function for creating a zero-filled
    block and weak symbol with a given name, scope, section, initial address,
    size, alignment and liveness.

  * ``addAnonymousSymbol`` creates a new anonymous symbol for a given block,
    offset, size, callable-ness, and liveness.

  * ``addDefinedSymbol`` creates a new symbol for a given block with a name,
    offset, size, linkage, scope, callable-ness and liveness.

  * ``makeExternal`` transforms a formerly defined symbol into an external one
    by creating a new addressable and pointing the symbol at it. The existing
    block is not deleted, but can be manually removed (if unreferenced) by
    calling ``removeBlock``. All edges to the symbol remain valid, but the
    symbol must now be defined outside this ``LinkGraph``.

  * ``removeExternalSymbol`` removes an external symbol and its target
    addressable. The target addressable must not be referenced by any other
    symbols.

  * ``removeAbsoluteSymbol`` removes an absolute symbol and its target
    addressable. The target addressable must not be referenced by any other
    symbols.

  * ``removeDefinedSymbol`` removes a defined symbol, but *does not* remove
    its target block.

  * ``removeBlock`` removes the given block.

  * ``splitBlock`` splits a given block in two at a given index (useful where
    it is known that a block contains decomposable records, e.g. CFI records
    in an eh-frame section).

* Graph utility operations

  * ``getName`` returns the name of this graph, which is usually based on the
    name of the input object file.

  * ``getTargetTriple`` returns an ``llvm::Triple`` for the executor process.

  * ``getPointerSize`` returns the size of a pointer (in bytes) in the executor
    process.

  * ``getEndianness`` returns the endianness of the executor process.

  * ``allocateString`` copies data from a given ``llvm::Twine`` into the
    link graph's internal allocator. This can be used to ensure that content
    created inside a pass outlives that pass's execution.
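
As an illustration of what splitting a block involves, the following standalone
sketch splits a toy block and redistributes its edges. It is not the JITLink
implementation: ``ToyBlock``, ``ToyEdge`` and this ``splitBlock`` are
hypothetical, symbols are not modelled, and the choice that the returned block
covers the lower range while the original is narrowed to the upper range is an
assumption made for the sketch. Consult the JITLink headers for the real
contract.

.. code-block:: c++

  #include <cstdint>
  #include <string>
  #include <utility>
  #include <vector>

  struct ToyEdge {
    uint64_t Offset; // Offset from the start of the containing block.
  };

  struct ToyBlock {
    uint64_t Address;
    std::string Content;
    std::vector<ToyEdge> Edges;
  };

  // Splits B at SplitIndex, where 0 < SplitIndex < B.Content.size().
  // Returns a new block covering [0, SplitIndex); B is narrowed to cover
  // [SplitIndex, size) and its remaining edge offsets are rebased.
  ToyBlock splitBlock(ToyBlock &B, uint64_t SplitIndex) {
    ToyBlock Low{B.Address, B.Content.substr(0, SplitIndex), {}};
    std::vector<ToyEdge> HighEdges;
    for (const ToyEdge &E : B.Edges) {
      if (E.Offset < SplitIndex)
        Low.Edges.push_back(E);
      else
        HighEdges.push_back(ToyEdge{E.Offset - SplitIndex});
    }
    B.Address += SplitIndex;
    B.Content = B.Content.substr(SplitIndex);
    B.Edges = std::move(HighEdges);
    return Low;
  }

The point of the sketch is that edge offsets are relative to their containing
block, so any edge that ends up in the upper half has to be rebased.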

.. _generic_link_algorithm:

Generic Link Algorithm
======================

JITLink provides a generic link algorithm which can be extended / modified at
certain points by the introduction of JITLink :ref:`passes`.

At the end of each phase the linker packages its state into a *continuation*
and calls the ``JITLinkContext`` object to perform a (potentially high-latency)
asynchronous operation: allocating memory, resolving external symbols, and
finally transferring linked memory to the executor process.
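
The continuation style described above can be sketched in a few lines of
standalone C++. Everything here is hypothetical: ``ToyContext`` stands in for
``JITLinkContext``, its three operations simply queue their completion
callbacks, and ``std::shared_ptr`` is used for the link state only so that the
callbacks fit in ``std::function``.

.. code-block:: c++

  #include <functional>
  #include <memory>
  #include <queue>
  #include <string>
  #include <utility>
  #include <vector>

  struct ToyContext {
    std::queue<std::function<void()>> Pending;
    std::vector<std::string> Log;

    // Each "asynchronous" operation records the request and queues the
    // continuation instead of running it immediately.
    void allocate(std::function<void()> OnAllocated) {
      Log.push_back("allocate");
      Pending.push(std::move(OnAllocated));
    }
    void resolveExternals(std::function<void()> OnResolved) {
      Log.push_back("resolve");
      Pending.push(std::move(OnResolved));
    }
    void finalize(std::function<void()> OnFinalized) {
      Log.push_back("finalize");
      Pending.push(std::move(OnFinalized));
    }

    // Runs queued continuations until none are left.
    void drain() {
      while (!Pending.empty()) {
        std::function<void()> Next = std::move(Pending.front());
        Pending.pop();
        Next();
      }
    }
  };

  struct ToyLinkState {
    std::string GraphName;
  };
  using StatePtr = std::shared_ptr<ToyLinkState>;

  void afterFinalization(ToyContext &Ctx, StatePtr S) {
    Ctx.Log.push_back("emitted " + S->GraphName);
  }
  void afterResolution(ToyContext &Ctx, StatePtr S) {
    Ctx.finalize([&Ctx, S] { afterFinalization(Ctx, S); });
  }
  void afterAllocation(ToyContext &Ctx, StatePtr S) {
    Ctx.resolveExternals([&Ctx, S] { afterResolution(Ctx, S); });
  }
  void startLink(ToyContext &Ctx, StatePtr S) {
    // The link state travels inside the continuation, so the caller's stack
    // frame can unwind while the operation is in flight.
    Ctx.allocate([&Ctx, S] { afterAllocation(Ctx, S); });
  }

Calling ``startLink`` returns as soon as the allocation has been requested. The
rest of the link only makes progress when the context runs the queued
continuations (here, via ``drain``).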

#. Phase 1

   This phase is called immediately by the ``link`` function as soon as the
   initial configuration (including the pass pipeline setup) is complete.

   #. Run pre-prune passes.

      These passes are called on the graph before it is pruned. At this stage
      ``LinkGraph`` nodes still have their original vmaddrs. A mark-live pass
      (supplied by the ``JITLinkContext``) will be run at the end of this
      sequence to mark the initial set of live symbols.

      Notable use cases: marking nodes live, accessing/copying graph data that
      will be pruned (e.g. metadata that's important for the JIT, but not needed
      for the link process).

   #. Prune (dead-strip) the ``LinkGraph``.

      Removes all symbols and blocks not reachable from the initial set of live
      symbols.

      This allows JITLink to remove unreachable symbols / content, including
      overridden weak and redundant ODR definitions.

   #. Run post-prune passes.

      These passes are run on the graph after dead-stripping, but before memory
      is allocated or nodes assigned their final target vmaddrs.

      Passes run at this stage benefit from pruning, as dead functions and data
      have been stripped from the graph. However new content can still be added
      to the graph, as target and working memory have not been allocated yet.

      Notable use cases: Building Global Offset Table (GOT), Procedure Linkage
      Table (PLT), and Thread Local Variable (TLV) entries.

   #. Asynchronously allocate memory.

      Calls the ``JITLinkContext``'s ``JITLinkMemoryManager`` to allocate both
      working and target memory for the graph. As part of this process the
      ``JITLinkMemoryManager`` will update the addresses of all nodes
      defined in the graph to their assigned target address.

      Note: This step only updates the addresses of nodes defined in this graph.
      External symbols will still have null addresses.

#. Phase 2

   #. Run post-allocation passes.

      These passes are run on the graph after working and target memory have
      been allocated, but before the ``JITLinkContext`` is notified of the
      final addresses of the symbols in the graph. This gives these passes a
      chance to set up data structures associated with target addresses before
      any JITLink clients (especially ORC queries for symbol resolution) can
      attempt to access them.

      Notable use cases: Setting up mappings between target addresses and
      JIT data structures, such as a mapping between ``__dso_handle`` and
      ``JITDylib*``.

   #. Notify the ``JITLinkContext`` of the assigned symbol addresses.

      Calls ``JITLinkContext::notifyResolved`` on the link graph, allowing
      clients to react to the symbol address assignments made for this graph.
      In ORC this is used to notify any pending queries for *resolved* symbols,
      including pending queries from concurrently running JITLink instances that
      have reached the next step and are waiting on the address of a symbol in
      this graph to proceed with their link.

   #. Identify external symbols and resolve their addresses asynchronously.

      Calls the ``JITLinkContext`` to resolve the target address of any external
      symbols in the graph.

#. Phase 3

   #. Apply external symbol resolution results.

      This updates the addresses of all external symbols. At this point all
      nodes in the graph have their final target addresses, however node
      content still points back to the original data in the object file.

   #. Run pre-fixup passes.

      These passes are called on the graph after all nodes have been assigned
      their final target addresses, but before node content is copied into
      working memory and fixed up. Passes run at this stage can make late
      optimizations to the graph and content based on address layout.

      Notable use cases: GOT and PLT relaxation, where GOT and PLT accesses are
      bypassed for fixup targets that are directly accessible under the assigned
      memory layout.

   #. Copy block content to working memory and apply fixups.

      Copies all block content into allocated working memory (following the
      target layout) and applies fixups. Graph blocks are updated to point at
      the fixed up content.

   #. Run post-fixup passes.

      These passes are called on the graph after fixups have been applied and
      blocks updated to point to the fixed up content.

      Post-fixup passes can inspect block contents to see the exact bytes that
      will be copied to the assigned target addresses.

   #. Finalize memory asynchronously.

      Calls the ``JITLinkMemoryManager`` to copy working memory to the executor
      process and apply the requested permissions.

#. Phase 4

   #. Notify the context that the graph has been emitted.

      Calls ``JITLinkContext::notifyFinalized`` and hands off the
      ``JITLinkMemoryManager::FinalizedAlloc`` object for this graph's memory
      allocation. This allows the context to track/hold memory allocations and
      react to the newly emitted definitions. In ORC this is used to update the
      ``ExecutionSession`` instance's dependence graph, which may result in
      these symbols (and possibly others) becoming *Ready* if all of their
      dependencies have also been emitted.
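
The prune step above amounts to a reachability walk from the initially live
symbols. The following standalone sketch shows the idea on a toy graph that
uses indices instead of pointers. ``ToyGraph`` and ``propagateLiveness`` are
hypothetical names, and details of the real implementation (such as how
external symbols are handled) are left out.

.. code-block:: c++

  #include <cstddef>
  #include <vector>

  struct ToySymbol {
    std::size_t Block; // Index of the block this symbol points into.
    bool Live;         // Set up front for roots, propagated for the rest.
  };

  struct ToyBlock {
    std::vector<std::size_t> EdgeTargets; // Indices of target symbols.
    bool Live = false;
  };

  struct ToyGraph {
    std::vector<ToyBlock> Blocks;
    std::vector<ToySymbol> Symbols;
  };

  // Marks every block and symbol reachable from the initially live symbols.
  // Anything still not live afterwards is safe to remove.
  void propagateLiveness(ToyGraph &G) {
    std::vector<std::size_t> Worklist;
    for (std::size_t I = 0; I != G.Symbols.size(); ++I)
      if (G.Symbols[I].Live)
        Worklist.push_back(I);

    while (!Worklist.empty()) {
      std::size_t SymIdx = Worklist.back();
      Worklist.pop_back();

      ToyBlock &B = G.Blocks[G.Symbols[SymIdx].Block];
      if (B.Live)
        continue; // Block already visited.
      B.Live = true;

      for (std::size_t T : B.EdgeTargets)
        if (!G.Symbols[T].Live) {
          G.Symbols[T].Live = true;
          Worklist.push_back(T);
        }
    }
  }

A block that is only referenced from dead code is never visited, so both it and
the symbols that point into it stay unmarked.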

.. _passes:

Passes
------

JITLink passes are ``std::function<Error(LinkGraph&)>`` instances. They are free
to inspect and modify the given ``LinkGraph`` subject to the constraints of
whatever phase they are running in (see :ref:`generic_link_algorithm`). If a
pass returns ``Error::success()`` then linking continues. If a pass returns
a failure value then linking is stopped and the ``JITLinkContext`` is notified
that the link failed.
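
The stop-on-first-failure behaviour can be sketched without any LLVM
dependencies. In this hypothetical model ``ToyError`` is a plain string (empty
meaning success) standing in for ``llvm::Error``, and ``runPasses`` is an
illustrative helper rather than a JITLink function.

.. code-block:: c++

  #include <functional>
  #include <string>
  #include <vector>

  struct ToyGraph {
    std::vector<std::string> Trace; // Records which passes ran.
  };

  using ToyError = std::string; // Empty string means success.
  using ToyPass = std::function<ToyError(ToyGraph &)>;

  // Runs the passes in order and returns the first failure. Passes after a
  // failing pass are not run.
  ToyError runPasses(ToyGraph &G, const std::vector<ToyPass> &Passes) {
    for (const ToyPass &P : Passes) {
      ToyError Err = P(G);
      if (!Err.empty())
        return Err;
    }
    return ToyError();
  }

A caller would run one such list per phase and report any returned error
instead of continuing with the link.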

Passes may be used by both JITLink backends (e.g. MachO/x86-64 implements GOT
and PLT construction as a pass), and external clients like
``ObjectLinkingLayer::Plugin``.

In combination with the open ``LinkGraph`` API, JITLink passes enable the
implementation of powerful new features. For example:

* Relaxation optimizations -- A pre-fixup pass can inspect GOT accesses and PLT
  calls and identify situations where the addresses of the entry target and the
  access are close enough to be accessed directly. In this case the pass can
  rewrite the instruction stream of the containing block and update the fixup
  edges to make the access direct.

  Code for this looks like:

.. code-block:: c++

  Error relaxGOTEdges(LinkGraph &G) {
    for (auto *B : G.blocks())
      for (auto &E : B->edges())
        if (E.getKind() == x86_64::GOTLoad) {
          auto &GOTTarget = getGOTEntryTarget(E.getTarget());
          if (isInRange(B->getFixupAddress(E), GOTTarget)) {
            // Rewrite B->getContent() at the fixup address from
            // MOVQ to LEAQ.

            // Update edge target and kind.
            E.setTarget(GOTTarget);
            E.setKind(x86_64::PCRel32);
          }
        }

    return Error::success();
  }

* Metadata registration -- Post-allocation passes can be used to record the
  address range of sections in the target. This can be used to register the
  metadata (e.g. exception handling frames, language metadata) in the target
  once memory has been finalized.

.. code-block:: c++

  Error registerEHFrameSection(LinkGraph &G) {
    if (auto *Sec = G.findSectionByName("__eh_frame")) {
      SectionRange SR(*Sec);
      registerEHFrameSection(SR.getStart(), SR.getEnd());
    }

    return Error::success();
  }

* Record call sites for later mutation -- A post-allocation pass can record
  the call sites of all calls to a particular function, allowing those call
  sites to be updated later at runtime (e.g. for instrumentation, or to
  enable the function to be lazily compiled but still called directly after
  compilation).

.. code-block:: c++

  StringRef FunctionName = "foo";
  std::vector<ExecutorAddr> CallSitesForFunction;

  auto RecordCallSites =
    [&](LinkGraph &G) -> Error {
      for (auto *B : G.blocks())
        for (auto &E : B->edges())
          if (E.getKind() == CallEdgeKind &&
              E.getTarget().hasName() &&
              E.getTarget().getName() == FunctionName)
            CallSitesForFunction.push_back(B->getFixupAddress(E));
      return Error::success();
    };
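
The relaxation example earlier in this section calls an ``isInRange`` helper
that this document does not define. One plausible shape for such a check,
written as a standalone sketch, is shown below. ``fitsInPCRel32`` is a
hypothetical name, it takes two raw addresses rather than a symbol, and it
ignores the addend and any instruction-relative bias that a real backend would
have to account for.

.. code-block:: c++

  #include <cstdint>
  #include <limits>

  // Returns true if the signed distance from FixupAddr to TargetAddr can be
  // encoded in a 32-bit displacement.
  bool fitsInPCRel32(uint64_t FixupAddr, uint64_t TargetAddr) {
    int64_t Delta = static_cast<int64_t>(TargetAddr - FixupAddr);
    return Delta >= std::numeric_limits<int32_t>::min() &&
           Delta <= std::numeric_limits<int32_t>::max();
  }

If the check fails the pass should leave the edge untouched, so that the access
continues to go through the GOT entry.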

Memory Management with JITLinkMemoryManager
-------------------------------------------

JIT linking requires allocation of two kinds of memory: working memory in the
JIT process and target memory in the executor process (these processes and
memory allocations may be one and the same, depending on how the user wants
to build their JIT). It also requires that these allocations conform to the
requested code model in the target process (e.g. MachO/x86-64's Small code
model requires that all code and data for a simulated dylib is allocated within
4Gb). Finally, it is natural to make the memory manager responsible for
transferring memory to the target address space and applying memory protections,
since the memory manager must know how to communicate with the executor, and
since sharing and protection assignment can often be efficiently managed (in
the common case of running across processes on the same machine for security)
via the host operating system's virtual memory management APIs.

To satisfy these requirements ``JITLinkMemoryManager`` adopts the following
656design: The memory manager itself has just two virtual methods for asynchronous
657operations (each with convenience overloads for calling synchronously):
658
659.. code-block:: c++
660
661  /// Called when allocation has been completed.
662  using OnAllocatedFunction =
663    unique_function<void(Expected<std::unique_ptr<InFlightAlloc>)>;
664
665  /// Called when deallocation has completed.
666  using OnDeallocatedFunction = unique_function<void(Error)>;
667
668  /// Call to allocate memory.
669  virtual void allocate(const JITLinkDylib *JD, LinkGraph &G,
670                        OnAllocatedFunction OnAllocated) = 0;
671
672  /// Call to deallocate memory.
673  virtual void deallocate(std::vector<FinalizedAlloc> Allocs,
674                          OnDeallocatedFunction OnDeallocated) = 0;
675
676The ``allocate`` method takes a ``JITLinkDylib*`` representing the target
677simulated dylib, a reference to the ``LinkGraph`` that must be allocated for,
678and a callback to run once an ``InFlightAlloc`` has been constructed.
679``JITLinkMemoryManager`` implementations can (optionally) use the ``JD``
680argument to manage a per-simulated-dylib memory pool (since code model
681constraints are typically imposed on a per-dylib basis, and not across
682dylibs) [2]_. The ``LinkGraph`` describes the object file that we need to
683allocate memory for. The allocator must allocate working memory for all of
684the Blocks defined in the graph, assign address space for each Block within the
685executing processes memory, and update the Blocks' addresses to reflect this
686assignment. Block content should be copied to working memory, but does not need
687to be transferred to executor memory yet (that will be done once the content is
688fixed up). ``JITLinkMemoryManager`` implementations can take full
689responsibility for these steps, or use the ``BasicLayout`` utility to reduce
690the task to allocating working and executor memory for *segments*: chunks of
691memory defined by permissions, alignments, content sizes, and zero-fill sizes.
692Once the allocation step is complete the memory manager should construct an
693``InFlightAlloc`` object to represent the allocation, and then pass this object
694to the ``OnAllocated`` callback.
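
The segment-based approach can be illustrated with a small, self-contained
sketch. It is not the ``BasicLayout`` implementation: ``ToyBlock``,
``ToySegment``, and ``layOutSegments`` are hypothetical names, and the model
assumes that blocks are grouped by permissions alone, with content blocks placed
before zero-fill blocks in each segment:

.. code-block:: c++

  #include <algorithm>
  #include <cassert>
  #include <cstdint>
  #include <map>
  #include <vector>

  // Hypothetical, simplified stand-ins for JITLink's blocks and segments.
  enum Perm : unsigned { Read = 1, Write = 2, Exec = 4 };

  struct ToyBlock {
    unsigned Perms;     // Permissions of the containing section.
    uint64_t Size;      // Size in bytes.
    uint64_t Alignment; // Required alignment (a power of two).
    bool ZeroFill;      // True if the block has no content (e.g. bss).
    uint64_t Offset;    // Offset within its segment, set by layOutSegments.
  };

  struct ToySegment {
    uint64_t Alignment = 1;
    uint64_t ContentSize = 0;
    uint64_t ZeroFillSize = 0;
  };

  inline uint64_t alignTo(uint64_t Value, uint64_t Align) {
    assert(Align != 0 && (Align & (Align - 1)) == 0 && "not a power of two");
    return (Value + Align - 1) & ~(Align - 1);
  }

  // Groups blocks into one segment per permission set. Content blocks are
  // placed first; zero-fill blocks follow the content of their segment.
  inline std::map<unsigned, ToySegment>
  layOutSegments(std::vector<ToyBlock> &Blocks) {
    std::map<unsigned, ToySegment> Segs;
    for (bool ZeroFillPass : {false, true}) {
      for (ToyBlock &B : Blocks) {
        if (B.ZeroFill != ZeroFillPass)
          continue;
        ToySegment &S = Segs[B.Perms];
        S.Alignment = std::max(S.Alignment, B.Alignment);
        B.Offset = alignTo(S.ContentSize + S.ZeroFillSize, B.Alignment);
        uint64_t End = B.Offset + B.Size;
        if (ZeroFillPass)
          S.ZeroFillSize = End - S.ContentSize;
        else
          S.ContentSize = End;
      }
    }
    return Segs;
  }

In this model, a 10-byte block followed by a 4-byte block with 4-byte alignment,
both in an R-X section, yield one segment with a content size of 16 bytes, with
the second block placed at offset 12. A memory manager would then only need to
obtain working and executor memory for each segment.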
695
696The ``InFlightAlloc`` object has two virtual methods:
697
698.. code-block:: c++
699
700    using OnFinalizedFunction = unique_function<void(Expected<FinalizedAlloc>)>;
701    using OnAbandonedFunction = unique_function<void(Error)>;
702
703    /// Called prior to finalization if the allocation should be abandoned.
704    virtual void abandon(OnAbandonedFunction OnAbandoned) = 0;
705
706    /// Called to transfer working memory to the target and apply finalization.
707    virtual void finalize(OnFinalizedFunction OnFinalized) = 0;
708
709The linking process will call the ``finalize`` method on the ``InFlightAlloc``
710object if linking succeeds up to the finalization step, otherwise it will call
711``abandon`` to indicate that some error occurred during linking. A call to the
712``InFlightAlloc::finalize`` method should cause content for the allocation to be
transferred from working to executor memory, and permissions to be applied. A call
714to ``abandon`` should result in both kinds of memory being deallocated.
715
716On successful finalization, the ``InFlightAlloc::finalize`` method should
construct a ``FinalizedAlloc`` object (an opaque ``uint64_t`` id that the
718``JITLinkMemoryManager`` can use to identify executor memory for deallocation)
719and pass it to the ``OnFinalized`` callback.
720
721Finalized allocations (represented by ``FinalizedAlloc`` objects) can be
deallocated by calling the ``JITLinkMemoryManager::deallocate`` method. This
method takes a vector of ``FinalizedAlloc`` objects, since it is common to
deallocate multiple objects at the same time and this allows us to batch these
requests for transmission to the executing process.
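
The lifecycle described above (allocate, then finalize or abandon, then batched
deallocation) can be modeled as follows. This is a toy, synchronous, in-process
sketch rather than the JITLink API: all ``Toy*`` names are hypothetical, errors
are represented by strings instead of ``llvm::Error``, and no real memory is
mapped:

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <cstdint>
  #include <functional>
  #include <memory>
  #include <set>
  #include <string>
  #include <utility>
  #include <vector>

  // Hypothetical stand-ins: an empty string means success, whereas JITLink
  // reports failures via llvm::Error and llvm::Expected.
  using ToyError = std::string;
  using ToyFinalizedAlloc = uint64_t; // Opaque id for finalized memory.

  class ToyMemoryManager {
  public:
    class InFlight {
    public:
      InFlight(ToyMemoryManager &MM, std::size_t Size)
          : MM(MM), Working(Size) {}

      // Linking succeeded: publish the memory and hand back an id that
      // identifies it for later deallocation.
      void finalize(std::function<void(ToyError, ToyFinalizedAlloc)> OnDone) {
        ToyFinalizedAlloc Id = MM.NextId++;
        MM.Live.insert(Id);
        Working.clear(); // Working memory is no longer needed.
        OnDone("", Id);
      }

      // Linking failed: release everything without publishing anything.
      void abandon(std::function<void(ToyError)> OnDone) {
        Working.clear();
        OnDone("");
      }

    private:
      ToyMemoryManager &MM;
      std::vector<char> Working;
    };

    using OnAllocatedFn =
        std::function<void(ToyError, std::unique_ptr<InFlight>)>;

    void allocate(std::size_t Size, OnAllocatedFn OnAllocated) {
      OnAllocated("", std::unique_ptr<InFlight>(new InFlight(*this, Size)));
    }

    // Takes a batch of ids so several allocations can be released at once.
    void deallocate(std::vector<ToyFinalizedAlloc> Allocs,
                    std::function<void(ToyError)> OnDone) {
      for (ToyFinalizedAlloc Id : Allocs)
        if (Live.erase(Id) == 0)
          return OnDone("unknown allocation id");
      OnDone("");
    }

    std::size_t liveCount() const { return Live.size(); }

  private:
    uint64_t NextId = 1;
    std::set<ToyFinalizedAlloc> Live;
  };

Only finalized allocations receive an id in this model. An abandoned allocation
never becomes visible to ``deallocate``, which mirrors the rule that ``abandon``
must release both working and executor memory itself.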
726
727JITLink provides a simple in-process implementation of this interface:
728``InProcessMemoryManager``. It allocates pages once and re-uses them as both
729working and target memory.
730
731ORC provides a cross-process-capable ``MapperJITLinkMemoryManager`` that can use
732shared memory or ORC-RPC-based communication to transfer content to the executing
733process.
734
735JITLinkMemoryManager and Security
736---------------------------------
737
738JITLink's ability to link JIT'd code for a separate executor process can be
739used to improve the security of a JIT system: The executor process can be
740sandboxed, run within a VM, or even run on a fully separate machine.
741
742JITLink's memory manager interface is flexible enough to allow for a range of
743trade-offs between performance and security. For example, on a system where code
744pages must be signed (preventing code from being updated), the memory manager
745can deallocate working memory pages after linking to free memory in the process
746running JITLink. Alternatively, on a system that allows RWX pages, the memory
747manager may use the same pages for both working and target memory by marking
748them as RWX, allowing code to be modified in place without further overhead.
749Finally, if RWX pages are not permitted but dual-virtual-mappings of
750physical memory pages are, then the memory manager can dual map physical pages
751as RW- in the JITLink process and R-X in the executor process, allowing
752modification from the JITLink process but not from the executor (at the cost of
753extra administrative overhead for the dual mapping).
754
755Error Handling
756--------------
757
758JITLink makes extensive use of the ``llvm::Error`` type (see the error handling
759section of :doc:`ProgrammersManual` for details). The link process itself, all
760passes, the memory manager interface, and operations on the ``JITLinkContext``
761are all permitted to fail. Link graph construction utilities (especially parsers
762for object formats) are encouraged to validate input, and validate fixups
763(e.g. with range checks) before application.
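
As a sketch of what validating a fixup before application can look like, the
following applies a 32-bit PC-relative fixup only after checking both the block
bounds and the range of the computed value. The function name, its signature,
and the string-based error reporting are assumptions made for illustration; real
backends report failures via ``llvm::Error``:

.. code-block:: c++

  #include <cstddef>
  #include <cstdint>
  #include <limits>
  #include <string>
  #include <vector>

  // Stores Target - Fixup + Addend as a little-endian 32-bit value at
  // Content[Offset]. Returns an empty string on success, or a description of
  // the problem (leaving Content untouched) if validation fails.
  inline std::string applyPCRel32(std::vector<uint8_t> &Content,
                                  std::size_t Offset, uint64_t FixupAddr,
                                  uint64_t TargetAddr, int64_t Addend) {
    if (Offset > Content.size() || Content.size() - Offset < 4)
      return "fixup extends past the end of the block";

    // Assumes inputs small enough that this arithmetic cannot overflow.
    int64_t Value = static_cast<int64_t>(TargetAddr - FixupAddr) + Addend;
    if (Value < std::numeric_limits<int32_t>::min() ||
        Value > std::numeric_limits<int32_t>::max())
      return "relocation value does not fit in 32 bits";

    uint32_t Bits = static_cast<uint32_t>(Value);
    for (unsigned I = 0; I != 4; ++I)
      Content[Offset + I] = static_cast<uint8_t>(Bits >> (8 * I));
    return "";
  }

Performing both checks before the first byte is written means that a failed
fixup never leaves partially updated content behind.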
764
765Any error will halt the link process and notify the context of failure. In ORC,
766reported failures are propagated to queries pending on definitions provided by
767the failing link, and also through edges of the dependence graph to any queries
768waiting on dependent symbols.
769
770.. _connection_to_orc_runtime:
771
772Connection to the ORC Runtime
773=============================
774
775The ORC Runtime (currently under development) aims to provide runtime support
776for advanced JIT features, including object format features that require
777non-trivial action in the executor (e.g. running initializers, managing thread
778local storage, registering with language runtimes, etc.).
779
780ORC Runtime support for object format features typically requires cooperation
781between the runtime (which executes in the executor process) and JITLink (which
782runs in the JIT process and can inspect LinkGraphs to determine what actions
783must be taken in the executor). For example: Execution of MachO static
784initializers in the ORC runtime is performed by the ``jit_dlopen`` function,
785which calls back to the JIT process to ask for the list of address ranges of
786``__mod_init`` sections to walk. This list is collated by the
787``MachOPlatformPlugin``, which installs a pass to record this information for
788each object as it is linked into the target.
789
790.. _constructing_linkgraphs:
791
792Constructing LinkGraphs
793=======================
794
795Clients usually access and manipulate ``LinkGraph`` instances that were created
for them by an ``ObjectLinkingLayer`` instance, but they can also be created
manually:
797
798#. By directly constructing and populating a ``LinkGraph`` instance.
799
800#. By using the ``createLinkGraph`` family of functions to create a
801   ``LinkGraph`` from an in-memory buffer containing an object file. This is how
802   ``ObjectLinkingLayer`` usually creates ``LinkGraphs``.
803
   #. ``createLinkGraph_<Object-Format>_<Architecture>`` can be used when
      both the object format and architecture are known ahead of time.

   #. ``createLinkGraph_<Object-Format>`` can be used when the object format is
      known ahead of time, but the architecture is not. In this case the
      architecture will be determined by inspection of the object header.

   #. ``createLinkGraph`` can be used when neither the object format nor
      the architecture is known ahead of time. In this case the object header
      will be inspected to determine both the format and architecture.
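
What inspecting the object header involves can be sketched for two cases,
little-endian ELF and 64-bit little-endian MachO. The ``Toy*`` types and
``inspectHeader`` are hypothetical and far less thorough than the format
detection that LLVM performs; the magic numbers and machine constants are those
defined by the ELF and MachO formats:

.. code-block:: c++

  #include <cstddef>
  #include <cstdint>
  #include <vector>

  enum class ToyFormat { Unknown, ELF, MachO };
  enum class ToyArch { Unknown, X86_64, AArch64 };

  struct ToyObjectInfo {
    ToyFormat Format;
    ToyArch Arch;
  };

  inline uint16_t readLE16(const uint8_t *P) {
    return static_cast<uint16_t>(P[0] | (P[1] << 8));
  }

  inline uint32_t readLE32(const uint8_t *P) {
    return static_cast<uint32_t>(P[0]) | (static_cast<uint32_t>(P[1]) << 8) |
           (static_cast<uint32_t>(P[2]) << 16) |
           (static_cast<uint32_t>(P[3]) << 24);
  }

  inline ToyObjectInfo inspectHeader(const std::vector<uint8_t> &Buf) {
    ToyObjectInfo Info = {ToyFormat::Unknown, ToyArch::Unknown};

    // ELF: e_ident starts with 0x7f 'E' 'L' 'F'; e_machine is at offset 18.
    if (Buf.size() >= 20 && Buf[0] == 0x7f && Buf[1] == 'E' &&
        Buf[2] == 'L' && Buf[3] == 'F') {
      Info.Format = ToyFormat::ELF;
      if (Buf[5] == 1) { // EI_DATA == ELFDATA2LSB (little-endian).
        uint16_t Machine = readLE16(&Buf[18]);
        if (Machine == 62) // EM_X86_64
          Info.Arch = ToyArch::X86_64;
        else if (Machine == 183) // EM_AARCH64
          Info.Arch = ToyArch::AArch64;
      }
      return Info;
    }

    // 64-bit little-endian MachO: MH_MAGIC_64, then the cputype field.
    if (Buf.size() >= 8 && readLE32(&Buf[0]) == 0xfeedfacfu) {
      Info.Format = ToyFormat::MachO;
      uint32_t CPUType = readLE32(&Buf[4]);
      if (CPUType == 0x01000007u) // CPU_TYPE_X86_64
        Info.Arch = ToyArch::X86_64;
      else if (CPUType == 0x0100000cu) // CPU_TYPE_ARM64
        Info.Arch = ToyArch::AArch64;
    }
    return Info;
  }

A dispatcher in the style of ``createLinkGraph`` would use both fields of the
result, while one in the style of ``createLinkGraph_<Object-Format>`` would need
only the architecture.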
814
815.. _jit_linking:
816
817JIT Linking
818===========
819
820The JIT linker concept was introduced in LLVM's earlier generation of JIT APIs,
821MCJIT. In MCJIT the *RuntimeDyld* component enabled re-use of LLVM as an
822in-memory compiler by adding an in-memory link step to the end of the usual
823compiler pipeline. Rather than dumping relocatable objects to disk as a compiler
824usually would, MCJIT passed them to RuntimeDyld to be linked into a target
825process.
826
827This approach to linking differs from standard *static* or *dynamic* linking:
828
829A *static linker* takes one or more relocatable object files as input and links
830them into an executable or dynamic library on disk.
831
832A *dynamic linker* applies relocations to executables and dynamic libraries that
833have been loaded into memory.
834
835A *JIT linker* takes a single relocatable object file at a time and links it
836into a target process, usually using a context object to allow the linked code
837to resolve symbols in the target.
838
839RuntimeDyld
840-----------
841
842In order to keep RuntimeDyld's implementation simple MCJIT imposed some
843restrictions on compiled code:
844
845#. It had to use the Large code model, and often restricted available relocation
846   models in order to limit the kinds of relocations that had to be supported.
847
848#. It required strong linkage and default visibility on all symbols -- behavior
849   for other linkages/visibilities was not well defined.
850
851#. It constrained and/or prohibited the use of features requiring runtime
852   support, e.g. static initializers or thread local storage.
853
854As a result of these restrictions not all language features supported by LLVM
855worked under MCJIT, and objects to be loaded under the JIT had to be compiled to
856target it (precluding the use of precompiled code from other sources under the
857JIT).
858
859RuntimeDyld also provided very limited visibility into the linking process
860itself: Clients could access conservative estimates of section size
861(RuntimeDyld bundled stub size and padding estimates into the section size
862value) and the final relocated bytes, but could not access RuntimeDyld's
863internal object representations.
864
865Eliminating these restrictions and limitations was one of the primary motivations
866for the development of JITLink.
867
868The llvm-jitlink tool
869=====================
870
871The ``llvm-jitlink`` tool is a command line wrapper for the JITLink library.
872It loads some set of relocatable object files and then links them using
873JITLink. Depending on the options used it will then execute them, or validate
874the linked memory.
875
876The ``llvm-jitlink`` tool was originally designed to aid JITLink development by
877providing a simple environment for testing.
878
879Basic usage
880-----------
881
882By default, ``llvm-jitlink`` will link the set of objects passed on the command
883line, then search for a "main" function and execute it:
884
885.. code-block:: sh
886
887  % cat hello-world.c
888  #include <stdio.h>
889
890  int main(int argc, char *argv[]) {
891    printf("hello, world!\n");
892    return 0;
893  }
894
895  % clang -c -o hello-world.o hello-world.c
896  % llvm-jitlink hello-world.o
  hello, world!
898
899Multiple objects may be specified, and arguments may be provided to the JIT'd
main function using the ``-args`` option:
901
902.. code-block:: sh
903
904  % cat print-args.c
905  #include <stdio.h>
906
907  void print_args(int argc, char *argv[]) {
908    for (int i = 0; i != argc; ++i)
909      printf("arg %i is \"%s\"\n", i, argv[i]);
910  }
911
912  % cat print-args-main.c
913  void print_args(int argc, char *argv[]);
914
915  int main(int argc, char *argv[]) {
916    print_args(argc, argv);
917    return 0;
918  }
919
920  % clang -c -o print-args.o print-args.c
921  % clang -c -o print-args-main.o print-args-main.c
922  % llvm-jitlink print-args.o print-args-main.o -args a b c
923  arg 0 is "a"
924  arg 1 is "b"
925  arg 2 is "c"
926
927Alternative entry points may be specified using the ``-entry <entry point
928name>`` option.
929
930Other options can be found by calling ``llvm-jitlink -help``.
931
932llvm-jitlink as a regression testing utility
933--------------------------------------------
934
935One of the primary aims of ``llvm-jitlink`` was to enable readable regression
936tests for JITLink. To do this it supports two options:
937
938The ``-noexec`` option tells llvm-jitlink to stop after looking up the entry
939point, and before attempting to execute it. Since the linked code is not
940executed, this can be used to link for other targets even if you do not have
941access to the target being linked (the ``-define-abs`` or ``-phony-externals``
942options can be used to supply any missing definitions in this case).
943
944The ``-check <check-file>`` option can be used to run a set of ``jitlink-check``
945expressions against working memory. It is typically used in conjunction with
946``-noexec``, since the aim is to validate JIT'd memory rather than to run the
947code and ``-noexec`` allows us to link for any supported target architecture
948from the current process. In ``-check`` mode, ``llvm-jitlink`` will scan the
949given check-file for lines of the form ``# jitlink-check: <expr>``. See
950examples of this usage in ``llvm/test/ExecutionEngine/JITLink``.
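
The scanning step can be pictured with the following sketch, which collects the
expression text from each matching line. ``collectCheckExprs`` is a hypothetical
helper, not the code used by ``llvm-jitlink``, and it assumes one expression per
line:

.. code-block:: c++

  #include <istream>
  #include <sstream>
  #include <string>
  #include <vector>

  // Returns the text following "# jitlink-check:" on each line that contains
  // it, with surrounding whitespace removed. Other lines are ignored.
  inline std::vector<std::string> collectCheckExprs(std::istream &In) {
    const std::string Prefix = "# jitlink-check:";
    std::vector<std::string> Exprs;
    std::string Line;
    while (std::getline(In, Line)) {
      std::string::size_type Pos = Line.find(Prefix);
      if (Pos == std::string::npos)
        continue;
      std::string Expr = Line.substr(Pos + Prefix.size());
      std::string::size_type Begin = Expr.find_first_not_of(" \t\r");
      if (Begin == std::string::npos)
        continue; // Prefix with no expression after it.
      std::string::size_type End = Expr.find_last_not_of(" \t\r");
      Exprs.push_back(Expr.substr(Begin, End - Begin + 1));
    }
    return Exprs;
  }

Because the checks live in comments, the same file can serve both as the
assembly input for the test and as the check-file.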
951
952Remote execution via llvm-jitlink-executor
953------------------------------------------
954
955By default ``llvm-jitlink`` will link the given objects into its own process,
956but this can be overridden by two options:
957
958The ``-oop-executor[=/path/to/executor]`` option tells ``llvm-jitlink`` to
959execute the given executor (which defaults to ``llvm-jitlink-executor``) and
960communicate with it via file descriptors which it passes to the executor
961as the first argument with the format ``filedescs=<in-fd>,<out-fd>``.
962
963The ``-oop-executor-connect=<host>:<port>`` option tells ``llvm-jitlink`` to
964connect to an already running executor via TCP on the given host and port. To
965use this option you will need to start ``llvm-jitlink-executor`` manually with
966``listen=<host>:<port>`` as the first argument.
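
An executor has to decode the first-argument formats described above. A minimal
sketch of such a decoder is shown below; ``ToyExecutorArg`` and
``parseExecutorArg`` are hypothetical names, and the validation rules (decimal
descriptors, a non-empty port) are assumptions rather than the behavior of
``llvm-jitlink-executor``:

.. code-block:: c++

  #include <string>

  struct ToyExecutorArg {
    enum Kind { Invalid, FileDescs, Listen };
    Kind K = Invalid;
    int InFD = -1;
    int OutFD = -1;
    std::string Host;
    std::string Port;
  };

  // Parses a short, non-negative decimal integer. Returns false on failure.
  inline bool parseSmallInt(const std::string &S, int &Out) {
    if (S.empty() || S.size() > 9)
      return false;
    int V = 0;
    for (char C : S) {
      if (C < '0' || C > '9')
        return false;
      V = V * 10 + (C - '0');
    }
    Out = V;
    return true;
  }

  // Accepts "filedescs=<in-fd>,<out-fd>" or "listen=<host>:<port>".
  inline ToyExecutorArg parseExecutorArg(const std::string &Arg) {
    ToyExecutorArg R;
    const std::string FDPrefix = "filedescs=";
    const std::string ListenPrefix = "listen=";
    if (Arg.compare(0, FDPrefix.size(), FDPrefix) == 0) {
      std::string Rest = Arg.substr(FDPrefix.size());
      std::string::size_type Comma = Rest.find(',');
      if (Comma != std::string::npos &&
          parseSmallInt(Rest.substr(0, Comma), R.InFD) &&
          parseSmallInt(Rest.substr(Comma + 1), R.OutFD))
        R.K = ToyExecutorArg::FileDescs;
    } else if (Arg.compare(0, ListenPrefix.size(), ListenPrefix) == 0) {
      std::string Rest = Arg.substr(ListenPrefix.size());
      std::string::size_type Colon = Rest.rfind(':');
      if (Colon != std::string::npos && Colon + 1 < Rest.size()) {
        R.Host = Rest.substr(0, Colon);
        R.Port = Rest.substr(Colon + 1);
        R.K = ToyExecutorArg::Listen;
      }
    }
    return R;
  }

Callers should check the ``K`` field first: the remaining fields are meaningful
only when it is not ``Invalid``.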
967
968Harness mode
969------------
970
971The ``-harness`` option allows a set of input objects to be designated as a test
972harness, with the regular object files implicitly treated as objects to be
973tested. Definitions of symbols in the harness set override definitions in the
974test set, and external references from the harness cause automatic scope
975promotion of local symbols in the test set (these modifications to the usual
976linker rules are accomplished via an ``ObjectLinkingLayer::Plugin`` installed by
977``llvm-jitlink`` when it sees the ``-harness`` option).
978
979With these modifications in place we can selectively test functions in an object
file by mocking those functions' callees. For example, suppose we have an object
981file, ``test_code.o``, compiled from the following C source (which we need not
982have access to):
983
984.. code-block:: c
985
986  void irrelevant_function() { irrelevant_external(); }
987
988  int function_to_mock(int X) {
989    return /* some function of X */;
990  }
991
992  static void function_to_test() {
993    ...
994    int Y = function_to_mock();
995    printf("Y is %i\n", Y);
996  }
997
998If we want to know how ``function_to_test`` behaves when we change the behavior
999of ``function_to_mock`` we can test it by writing a test harness:
1000
1001.. code-block:: c
1002
  #include <stdio.h>

  void function_to_test();
1004
1005  int function_to_mock(int X) {
1006    printf("used mock utility function\n");
1007    return 42;
1008  }
1009
1010  int main(int argc, char *argv[]) {
    function_to_test();
1012    return 0;
1013  }
1014
1015Under normal circumstances these objects could not be linked together:
1016``function_to_test`` is static and could not be resolved outside
1017``test_code.o``, the two ``function_to_mock`` functions would result in a
1018duplicate definition error, and ``irrelevant_external`` is undefined.
1019However, using ``-harness`` and ``-phony-externals`` we can run this code
1020with:
1021
1022.. code-block:: sh
1023
1024  % clang -c -o test_code_harness.o test_code_harness.c
1025  % llvm-jitlink -phony-externals test_code.o -harness test_code_harness.o
1026  used mock utility function
1027  Y is 42
1028
1029The ``-harness`` option may be of interest to people who want to perform some
1030very late testing on build products to verify that compiled code behaves as
1031expected. On basic C test cases this is relatively straightforward. Mocks for
1032more complicated languages (e.g. C++) are much trickier: Any code involving
1033classes tends to have a lot of non-trivial surface area (e.g. vtables) that
1034would require great care to mock.
1035
1036Tips for JITLink backend developers
1037-----------------------------------
1038
#. Make liberal use of ``assert`` and ``llvm::Error``. Do *not* assume that the
   input object is well formed: Return any errors produced by libObject (or your
   own object parsing code) and validate as you construct. Think carefully about
   the distinction between contract violations (which should be caught with
   asserts and ``llvm_unreachable``) and environmental errors (which should
   generate ``llvm::Error`` instances).
1045
1046#. Don't assume you're linking in-process. Use libSupport's sized,
1047   endian-specific types when reading/writing content in the ``LinkGraph``.
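
To illustrate the second tip, the sketch below stores and loads a 32-bit value
one byte at a time, so the result does not depend on the endianness of the host
running JITLink. The helper names are hypothetical; within LLVM,
``llvm/Support/Endian.h`` provides equivalents such as
``llvm::support::endian::read32le`` and ``llvm::support::endian::write32le``:

.. code-block:: c++

  #include <cstdint>

  // Writes V to P in little-endian byte order, regardless of host endianness.
  inline void toyWrite32le(uint8_t *P, uint32_t V) {
    for (unsigned I = 0; I != 4; ++I)
      P[I] = static_cast<uint8_t>(V >> (8 * I));
  }

  // Writes V to P in big-endian byte order, regardless of host endianness.
  inline void toyWrite32be(uint8_t *P, uint32_t V) {
    for (unsigned I = 0; I != 4; ++I)
      P[I] = static_cast<uint8_t>(V >> (8 * (3 - I)));
  }

  // Reads a little-endian 32-bit value from P.
  inline uint32_t toyRead32le(const uint8_t *P) {
    uint32_t V = 0;
    for (unsigned I = 0; I != 4; ++I)
      V |= static_cast<uint32_t>(P[I]) << (8 * I);
    return V;
  }

Copying a host-order integer into block content with ``memcpy`` would produce
the right bytes only when the host and target happen to share a byte order,
which is exactly the assumption that cross-architecture linking breaks.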
1048
1049As a "minimum viable" JITLink wrapper, the ``llvm-jitlink`` tool is an
1050invaluable resource for developers bringing in a new JITLink backend. A standard
1051workflow is to start by throwing an unsupported object at the tool and seeing
1052what error is returned, then fixing that (you can often make a reasonable guess
1053at what should be done based on existing code for other formats or
1054architectures).
1055
1056In debug builds of LLVM, the ``-debug-only=jitlink`` option dumps logs from the
1057JITLink library during the link process. These can be useful for spotting some bugs at
1058a glance. The ``-debug-only=llvm_jitlink`` option dumps logs from the ``llvm-jitlink``
1059tool, which can be useful for debugging both testcases (it is often less verbose than
1060``-debug-only=jitlink``) and the tool itself.
1061
1062The ``-oop-executor`` and ``-oop-executor-connect`` options are helpful for testing
1063handling of cross-process and cross-architecture use cases.
1064
1065Roadmap
1066=======
1067
1068JITLink is under active development. Work so far has focused on the MachO
1069implementation. In LLVM 12 there is limited support for ELF on x86-64.
1070
1071Major outstanding projects include:
1072
1073* Refactor architecture support to maximize sharing across formats.
1074
1075  All formats should be able to share the bulk of the architecture specific
1076  code (especially relocations) for each supported architecture.
1077
1078* Refactor ELF link graph construction.
1079
  ELF's link graph construction is currently implemented in the
  ``ELF_x86_64.cpp`` file, and tied to the x86-64 relocation parsing code. The
  bulk of the code is generic and should be split into an
  ``ELFLinkGraphBuilder`` base class along the same lines as the existing
  generic ``MachOLinkGraphBuilder``.
1084
1085* Implement support for arm32.
1086
1087* Implement support for other new architectures.
1088
1089JITLink Availability and Feature Status
1090---------------------------------------
1091
The following table describes the status of the JITLink backends for various
1093format / architecture combinations (as of July 2023).
1094
1095Support levels:
1096
1097* None: No backend. JITLink will return an "architecture not supported" error.
1098  Represented by empty cells in the table below.
1099* Skeleton: A backend exists, but does not support commonly used relocations.
1100  Even simple programs are likely to trigger an "unsupported relocation" error.
1101  Backends in this state may be easy to improve by implementing new relocations.
1102  Consider getting involved!
* Basic: The backend supports simple programs, but isn't ready for general use
  yet.
* Usable: The backend is ready for general use with at least one code and
  relocation model.
1106* Good: The backend supports almost all relocations. Advanced features like
1107  native thread local storage may not be available yet.
1108* Complete: The backend supports all relocations and object format features.
1109
1110.. list-table:: Availability and Status
1111   :widths: 10 30 30 30
1112   :header-rows: 1
1113   :stub-columns: 1
1114
1115   * - Architecture
1116     - ELF
1117     - COFF
1118     - MachO
1119   * - arm32
1120     - Skeleton
1121     -
1122     -
1123   * - arm64
1124     - Usable
1125     -
1126     - Good
1127   * - LoongArch
1128     - Good
1129     -
1130     -
1131   * - PowerPC 64
1132     - Usable
1133     -
1134     -
1135   * - RISC-V
1136     - Good
1137     -
1138     -
1139   * - x86-32
1140     - Basic
1141     -
1142     -
1143   * - x86-64
1144     - Good
1145     - Usable
1146     - Good
1147
1148.. [1] See ``llvm/examples/OrcV2Examples/LLJITWithObjectLinkingLayerPlugin`` for
1149       a full worked example.
1150
1151.. [2] If not for *hidden* scoped symbols we could eliminate the
1152       ``JITLinkDylib*`` argument to ``JITLinkMemoryManager::allocate`` and
1153       treat every object as a separate simulated dylib for the purposes of
1154       memory layout. Hidden symbols break this by generating in-range accesses
1155       to external symbols, requiring the access and symbol to be allocated
1156       within range of one another. That said, providing a pre-reserved address
1157       range pool for each simulated dylib guarantees that the relaxation
1158       optimizations will kick in for all intra-dylib references, which is good
1159       for performance (at the cost of whatever overhead is introduced by
1160       reserving the address-range up-front).
1161