====================================
JITLink and ORC's ObjectLinkingLayer
====================================

.. contents::
   :local:

Introduction
============

This document aims to provide a high-level overview of the design and API
of the JITLink library. It assumes some familiarity with linking and
relocatable object files, but should not require deep expertise. If you know
what a section, symbol, and relocation are you should find this document
accessible. If it is not, please submit a patch (:doc:`Contributing`) or file a
bug (:doc:`HowToSubmitABug`).

JITLink is a library for :ref:`jit_linking`. It was built to support the ORC JIT
APIs and is most commonly accessed via ORC's ObjectLinkingLayer API. JITLink was
developed with the aim of supporting the full set of features provided by each
object format, including static initializers, exception handling, thread local
variables, and language runtime registration. Supporting these features enables
ORC to execute code generated from source languages which rely on these features
(e.g. C++ requires object format support for static initializers to support
static constructors, eh-frame registration for exceptions, and TLV support for
thread locals; Swift and Objective-C require language runtime registration for
many features). For some object format features support is provided entirely
within JITLink, and for others it is provided in cooperation with the
(prototype) ORC runtime.

JITLink aims to support the following features, some of which are still under
development:

1. Cross-process and cross-architecture linking of single relocatable objects
   into a target *executor* process.

2. Support for all object format features.

3. Open linker data structures (``LinkGraph``) and pass system.

JITLink and ObjectLinkingLayer
==============================

``ObjectLinkingLayer`` is ORC's wrapper for JITLink. It is an ORC layer that
allows objects to be added to a ``JITDylib``, or emitted from some higher level
program representation. When an object is emitted, ``ObjectLinkingLayer`` uses
JITLink to construct a ``LinkGraph`` (see :ref:`constructing_linkgraphs`) and
calls JITLink's ``link`` function to link the graph into the executor process.

The ``ObjectLinkingLayer`` class provides a plugin API,
``ObjectLinkingLayer::Plugin``, which users can subclass in order to inspect and
modify ``LinkGraph`` instances at link time, and react to important JIT events
(such as an object being emitted into target memory). This enables many features
and optimizations that were not possible under MCJIT or RuntimeDyld.

ObjectLinkingLayer Plugins
--------------------------

The ``ObjectLinkingLayer::Plugin`` class provides the following methods:

* ``modifyPassConfig`` is called each time a ``LinkGraph`` is about to be linked. It
  can be overridden to install JITLink *Passes* to run during the link process.

  .. code-block:: c++

    void modifyPassConfig(MaterializationResponsibility &MR,
                          const Triple &TT,
                          jitlink::PassConfiguration &Config)

* ``notifyLoaded`` is called before the link begins, and can be overridden to
  set up any initial state for the given ``MaterializationResponsibility`` if
  needed.

  .. code-block:: c++

    void notifyLoaded(MaterializationResponsibility &MR)

* ``notifyEmitted`` is called after the link is complete and code has been
  emitted to the executor process. It can be overridden to finalize state
  for the ``MaterializationResponsibility`` if needed.

  .. code-block:: c++

    Error notifyEmitted(MaterializationResponsibility &MR)

* ``notifyFailed`` is called if the link fails at any point. It can be
  overridden to react to the failure (e.g. to deallocate any already allocated
  resources).

  .. code-block:: c++

    Error notifyFailed(MaterializationResponsibility &MR)

* ``notifyRemovingResources`` is called when a request is made to remove any
  resources associated with the ``ResourceKey`` *K* for the
  ``MaterializationResponsibility``.

  .. code-block:: c++

    Error notifyRemovingResources(ResourceKey K)

* ``notifyTransferringResources`` is called if/when a request is made to
  transfer tracking of any resources associated with ``ResourceKey``
  *SrcKey* to *DstKey*.

  .. code-block:: c++

    void notifyTransferringResources(ResourceKey DstKey,
                                     ResourceKey SrcKey)

Plugin authors are required to implement the ``notifyFailed``,
``notifyRemovingResources``, and ``notifyTransferringResources`` methods in
order to safely manage resources in the case of resource removal or transfer,
or link failure. If no resources are managed by the plugin then these methods
can be implemented as no-ops (returning ``Error::success()`` where a return
value is required).

Plugin instances are added to an ``ObjectLinkingLayer`` by
calling the ``addPlugin`` method [1]_. E.g.

.. code-block:: c++

  // Plugin class to print the set of defined symbols in an object when that
  // object is linked.
  class MyPlugin : public ObjectLinkingLayer::Plugin {
  public:

    // Add passes to print the set of defined symbols after dead-stripping.
    void modifyPassConfig(MaterializationResponsibility &MR,
                          const Triple &TT,
                          jitlink::PassConfiguration &Config) override {
      Config.PostPrunePasses.push_back([this](jitlink::LinkGraph &G) {
        return printAllSymbols(G);
      });
    }

    // Implement mandatory overrides:
    Error notifyFailed(MaterializationResponsibility &MR) override {
      return Error::success();
    }
    Error notifyRemovingResources(ResourceKey K) override {
      return Error::success();
    }
    void notifyTransferringResources(ResourceKey DstKey,
                                     ResourceKey SrcKey) override {}

    // JITLink pass to print all defined symbols in G.
    Error printAllSymbols(jitlink::LinkGraph &G) {
      for (auto *Sym : G.defined_symbols())
        if (Sym->hasName())
          dbgs() << Sym->getName() << "\n";
      return Error::success();
    }
  };

  // Create our LLJIT instance using a custom object linking layer setup.
  // This gives us a chance to install our plugin.
  auto J = ExitOnErr(LLJITBuilder()
             .setObjectLinkingLayerCreator(
               [](ExecutionSession &ES, const Triple &T) {
                 // Manually set up the ObjectLinkingLayer for our LLJIT
                 // instance.
                 auto OLL = std::make_unique<ObjectLinkingLayer>(
                     ES, std::make_unique<jitlink::InProcessMemoryManager>());

                 // Install our plugin:
                 OLL->addPlugin(std::make_unique<MyPlugin>());

                 return OLL;
               })
             .create());

  // Add an object to the JIT. Nothing happens here: linking isn't triggered
  // until we look up some symbol in our object.
  ExitOnErr(J->addObject(loadFromDisk("main.o")));

  // Plugin triggers here when our lookup of main triggers linking of main.o
  auto MainSym = J->lookup("main");

LinkGraph
=========

JITLink maps all relocatable object formats to a generic ``LinkGraph`` type
that is designed to make linking fast and easy. ``LinkGraph`` instances can
also be created manually (see :ref:`constructing_linkgraphs`).

Relocatable object formats (e.g. COFF, ELF, MachO) differ in their details,
but share a common goal: to represent machine level code and data with
annotations that allow them to be relocated in a virtual address space. To
this end they usually contain names (symbols) for content defined inside the
file or externally, chunks of content that must be moved as a unit (sections
or subsections, depending on the format), and annotations describing how to
patch content based on the final address of some target symbol/section
(relocations).

At a high level, the ``LinkGraph`` type represents these concepts as a decorated
graph. Nodes in the graph represent symbols and content, and edges represent
relocations. Each of the elements of the graph is listed here:

* ``Addressable`` -- A node in the link graph that can be assigned an address
  in the executor process's virtual address space.

  Absolute and external symbols are represented using plain ``Addressable``
  instances. Content defined inside the object file is represented using the
  ``Block`` subclass.

* ``Block`` -- An ``Addressable`` node that has ``Content`` (or is marked as
  zero-filled), a parent ``Section``, a ``Size``, an ``Alignment`` (and an
  ``AlignmentOffset``), and a list of ``Edge`` instances.

  Blocks provide a container for binary content which must remain contiguous in
  the target address space (a *layout unit*). Many interesting low level
  operations on ``LinkGraph`` instances involve inspecting or mutating block
  content or edges.

  * ``Content`` is represented as an ``llvm::StringRef``, and accessible via
    the ``getContent`` method. Content is only available for content blocks,
    and not for zero-fill blocks (use ``isZeroFill`` to check, and prefer
    ``getSize`` when only the block size is needed as it works for both
    zero-fill and content blocks).

  * ``Section`` is represented as a ``Section&`` reference, and accessible via
    the ``getSection`` method. The ``Section`` class is described in more detail
    below.

  * ``Size`` is represented as a ``size_t``, and is accessible via the
    ``getSize`` method for both content and zero-filled blocks.

  * ``Alignment`` is represented as a ``uint64_t``, and available via the
    ``getAlignment`` method. It represents the minimum alignment requirement (in
    bytes) of the start of the block.

  * ``AlignmentOffset`` is represented as a ``uint64_t``, and accessible via the
    ``getAlignmentOffset`` method. It represents the offset from the alignment
    required for the start of the block. This is required to support blocks
    whose minimum alignment requirement comes from data at some non-zero offset
    inside the block. E.g. if a block consists of a single byte (with byte
    alignment) followed by a ``uint64_t`` (with 8-byte alignment), then the block
    will have 8-byte alignment with an alignment offset of 7.

  * The list of ``Edge`` instances is accessible via the ``edges`` method, which
    returns an iterator range. The ``Edge`` class is described in more detail
    below.

* ``Symbol`` -- An offset from an ``Addressable`` (often a ``Block``), with an
  optional ``Name``, a ``Linkage``, a ``Scope``, a ``Callable`` flag, and a
  ``Live`` flag.

  Symbols make it possible to name content (blocks and addressables are
  anonymous), or target content with an ``Edge``.

  * ``Name`` is represented as an ``llvm::StringRef`` (equal to
    ``llvm::StringRef()`` if the symbol has no name), and accessible via the
    ``getName`` method.

  * ``Linkage`` is one of *Strong* or *Weak*, and is accessible via the
    ``getLinkage`` method. The ``JITLinkContext`` can use this flag to determine
    whether this symbol definition should be kept or dropped.

  * ``Scope`` is one of *Default*, *Hidden*, or *Local*, and is accessible via
    the ``getScope`` method. The ``JITLinkContext`` can use this to determine
    who should be able to see the symbol. A symbol with default scope should be
    globally visible. A symbol with hidden scope should be visible to other
    definitions within the same simulated dylib (e.g. ORC ``JITDylib``) or
    executable, but not from elsewhere. A symbol with local scope should only be
    visible within the current ``LinkGraph``.

  * ``Callable`` is a boolean which is set to true if this symbol can be called,
    and is accessible via the ``isCallable`` method. This can be used to
    automate the introduction of call-stubs for lazy compilation.

  * ``Live`` is a boolean that can be set to mark this symbol as a root for
    dead-stripping purposes (see :ref:`generic_link_algorithm`). JITLink's
    dead-stripping algorithm will propagate liveness flags through the graph to
    all reachable symbols before deleting any symbols (and blocks) that are not
    marked live.

* ``Edge`` -- A quad of an ``Offset`` (implicitly from the start of the
  containing ``Block``), a ``Kind`` (describing the relocation type), a
  ``Target``, and an ``Addend``.

  Edges represent relocations, and occasionally other relationships, between
  blocks and symbols.

  * ``Offset``, accessible via ``getOffset``, is an offset from the start of the
    ``Block`` containing the ``Edge``.

  * ``Kind``, accessible via ``getKind``, is a relocation type -- it describes
    what kinds of changes (if any) should be made to block content at the given
    ``Offset`` based on the address of the ``Target``.

  * ``Target``, accessible via ``getTarget``, is a pointer to the ``Symbol``
    whose address is relevant to the fixup calculation specified by the edge's
    ``Kind``.

  * ``Addend``, accessible via ``getAddend``, is a constant whose interpretation
    is determined by the edge's ``Kind``.

* ``Section`` -- A set of ``Symbol`` instances, plus a set of ``Block``
  instances, with a ``Name``, a set of ``ProtectionFlags``, and an ``Ordinal``.

  Sections make it easy to iterate over the symbols or blocks associated with
  a particular section in the source object file.

  * ``blocks()`` returns an iterator over the set of blocks defined in the
    section (as ``Block*`` pointers).

  * ``symbols()`` returns an iterator over the set of symbols defined in the
    section (as ``Symbol*`` pointers).

  * ``Name`` is represented as an ``llvm::StringRef``, and is accessible via the
    ``getName`` method.

  * ``ProtectionFlags`` are represented as a ``sys::Memory::ProtectionFlags``
    enum, and accessible via the ``getProtectionFlags`` method. These flags
    describe whether the section is readable, writable, executable, or some
    combination of these. The most common combinations are ``RW-`` for writable
    data, ``R--`` for constant data, and ``R-X`` for code.

  * ``Ordinal``, accessible via ``getOrdinal``, is a number used to order
    the section relative to others. It is usually used to preserve section
    order within a segment (a set of sections with the same memory protections)
    when laying out memory.
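
The interaction between ``Alignment`` and ``AlignmentOffset`` can be
illustrated with a small standalone sketch. The helper below is hypothetical:
it is not part of the JITLink API and works on plain integers rather than
``Block`` instances. It assumes that the alignment is a power of two, and
returns the lowest address at or above ``Addr`` whose remainder modulo the
alignment equals the alignment offset.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>

  // Hypothetical helper, for illustration only: returns the lowest address
  // A >= Addr such that A % Alignment == AlignmentOffset, i.e. the first
  // address at which a block with these requirements may start.
  inline uint64_t alignToBlockStart(uint64_t Addr, uint64_t Alignment,
                                    uint64_t AlignmentOffset) {
    assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0 &&
           "Alignment must be a power of two");
    assert(AlignmentOffset < Alignment &&
           "AlignmentOffset must be less than Alignment");
    // Unsigned wrap-around is harmless here: Alignment divides 2^64.
    uint64_t Delta = (AlignmentOffset - Addr) % Alignment;
    return Addr + Delta;
  }

For the block described above (one byte followed by a ``uint64_t``),
``alignToBlockStart(0x1000, 8, 7)`` yields ``0x1007``, which places the
``uint64_t`` at the 8-byte aligned address ``0x1008``.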

For the graph-theorists: The ``LinkGraph`` is bipartite, with one set of
``Symbol`` nodes and one set of ``Addressable`` nodes. Each ``Symbol`` node has
one (implicit) edge to its target ``Addressable``. Each ``Block`` has a set of
edges (possibly empty, represented as ``Edge`` instances) back to elements of
the ``Symbol`` set. For convenience and performance of common algorithms,
symbols and blocks are further grouped into ``Sections``.

The ``LinkGraph`` itself provides operations for constructing, removing, and
iterating over sections, symbols, and blocks. It also provides metadata
and utilities relevant to the linking process:

* Graph element operations

  * ``sections`` returns an iterator over all sections in the graph.

  * ``findSectionByName`` returns a pointer to the section with the given
    name (as a ``Section*``) if it exists, otherwise returns ``nullptr``.

  * ``blocks`` returns an iterator over all blocks in the graph (across all
    sections).

  * ``defined_symbols`` returns an iterator over all defined symbols in the
    graph (across all sections).

  * ``external_symbols`` returns an iterator over all external symbols in the
    graph.

  * ``absolute_symbols`` returns an iterator over all absolute symbols in the
    graph.

  * ``createSection`` creates a section with a given name and protection flags.

  * ``createContentBlock`` creates a block with the given initial content,
    parent section, address, alignment, and alignment offset.

  * ``createZeroFillBlock`` creates a zero-fill block with the given size,
    parent section, address, alignment, and alignment offset.

  * ``addExternalSymbol`` creates a new addressable and symbol with a given
    name, size, and linkage.

  * ``addAbsoluteSymbol`` creates a new addressable and symbol with a given
    name, address, size, linkage, scope, and liveness.

  * ``addCommonSymbol`` is a convenience function for creating a zero-filled block
    and weak symbol with a given name, scope, section, initial address, size,
    alignment and liveness.

  * ``addAnonymousSymbol`` creates a new anonymous symbol for a given block,
    offset, size, callable-ness, and liveness.

  * ``addDefinedSymbol`` creates a new symbol for a given block with a name,
    offset, size, linkage, scope, callable-ness and liveness.

  * ``makeExternal`` transforms a formerly defined symbol into an external one
    by creating a new addressable and pointing the symbol at it. The existing
    block is not deleted, but can be manually removed (if unreferenced) by
    calling ``removeBlock``. All edges to the symbol remain valid, but the
    symbol must now be defined outside this ``LinkGraph``.

  * ``removeExternalSymbol`` removes an external symbol and its target
    addressable. The target addressable must not be referenced by any other
    symbols.

  * ``removeAbsoluteSymbol`` removes an absolute symbol and its target
    addressable. The target addressable must not be referenced by any other
    symbols.

  * ``removeDefinedSymbol`` removes a defined symbol, but *does not* remove
    its target block.

  * ``removeBlock`` removes the given block.

  * ``splitBlock`` splits a given block in two at a given index (useful where
    it is known that a block contains decomposable records, e.g. CFI records
    in an eh-frame section).

* Graph utility operations

  * ``getName`` returns the name of this graph, which is usually based on the
    name of the input object file.

  * ``getTargetTriple`` returns an ``llvm::Triple`` for the executor process.

  * ``getPointerSize`` returns the size of a pointer (in bytes) in the executor
    process.

  * ``getEndianness`` returns the endianness of the executor process.

  * ``allocateString`` copies data from a given ``llvm::Twine`` into the
    link graph's internal allocator. This can be used to ensure that content
    created inside a pass outlives that pass's execution.

.. _generic_link_algorithm:

Generic Link Algorithm
======================

JITLink provides a generic link algorithm which can be extended / modified at
certain points by the introduction of JITLink :ref:`passes`:

#. Phase 1

   This phase is called immediately by the ``link`` function as soon as the
   initial configuration (including the pass pipeline setup) is complete.

   #. Run pre-prune passes.

      These passes are called on the graph before it is pruned. At this stage
      ``LinkGraph`` nodes still have their original vmaddrs. A mark-live pass
      (supplied by the ``JITLinkContext``) will be run at the end of this
      sequence to mark the initial set of live symbols.

      Notable use cases: marking nodes live, accessing/copying graph data that
      will be pruned (e.g. metadata that's important for the JIT, but not needed
      for the link process).

   #. Prune (dead-strip) the ``LinkGraph``.

      Removes all symbols and blocks not reachable from the initial set of live
      symbols.

      This allows JITLink to remove unreachable symbols / content, including
      overridden weak and redundant ODR definitions.

   #. Run post-prune passes.

      These passes are run on the graph after dead-stripping, but before memory
      is allocated or nodes assigned their final target vmaddrs.

      Passes run at this stage benefit from pruning, as dead functions and data
      have been stripped from the graph. However new content can still be added
      to the graph, as target and working memory have not been allocated yet.

      Notable use cases: Building Global Offset Table (GOT), Procedure Linkage
      Table (PLT), and Thread Local Variable (TLV) entries.

   #. Sort blocks into segments.

      Sorts all blocks by ordinal and then address. Collects sections with
      matching permissions into segments and computes the size of these
      segments for memory allocation.

   #. Allocate segment memory, update node addresses.

      Calls the ``JITLinkContext``'s ``JITLinkMemoryManager`` to allocate both
      working and target memory for the graph, then updates all node addresses
      to their assigned target address.

      Note: This step only updates the addresses of nodes defined in this graph.
      External symbols will still have null addresses.

   #. Run post-allocation passes.

      These passes are run on the graph after working and target memory have
      been allocated, but before the ``JITLinkContext`` is notified of the
      final addresses of the symbols in the graph. This gives these passes a
      chance to set up data structures associated with target addresses before
      any JITLink clients (especially ORC queries for symbol resolution) can
      attempt to access them.

      Notable use cases: Setting up mappings between target addresses and
      JIT data structures, such as a mapping between ``__dso_handle`` and
      ``JITDylib*``.

   #. Notify the ``JITLinkContext`` of the assigned symbol addresses.

      Calls ``JITLinkContext::notifyResolved`` on the link graph, allowing
      clients to react to the symbol address assignments made for this graph.
      In ORC this is used to notify any pending queries for *resolved* symbols,
      including pending queries from concurrently running JITLink instances that
      have reached the next step and are waiting on the address of a symbol in
      this graph to proceed with their link.

   #. Identify external symbols and resolve their addresses asynchronously.

      Calls the ``JITLinkContext`` to resolve the target address of any external
      symbols in the graph. This step is asynchronous -- JITLink will pack the
      link state into a *continuation* to be run once the symbols are resolved.

      This is the final step of Phase 1.

#. Phase 2

   This phase is called by the continuation constructed at the end of the
   external symbol resolution step above.

   #. Apply external symbol resolution results.

      This updates the addresses of all external symbols. At this point all
      nodes in the graph have their final target addresses, however node
      content still points back to the original data in the object file.

   #. Run pre-fixup passes.

      These passes are called on the graph after all nodes have been assigned
      their final target addresses, but before node content is copied into
      working memory and fixed up. Passes run at this stage can make late
      optimizations to the graph and content based on address layout.

      Notable use cases: GOT and PLT relaxation, where GOT and PLT accesses are
      bypassed for fixup targets that are directly accessible under the assigned
      memory layout.

   #. Copy block content to working memory and apply fixups.

      Copies all block content into allocated working memory (following the
      target layout) and applies fixups. Graph blocks are updated to point at
      the fixed up content.

   #. Run post-fixup passes.

      These passes are called on the graph after fixups have been applied and
      blocks updated to point to the fixed up content.

      Post-fixup passes can inspect block contents to see the exact bytes that
      will be copied to the assigned target addresses.

   #. Finalize memory asynchronously.

      Calls the ``JITLinkMemoryManager`` to copy working memory to the executor
      process and apply the requested permissions. This step is asynchronous --
      JITLink will pack the link state into a *continuation* to be run once
      memory has been copied and protected.

      This is the final step of Phase 2.

#. Phase 3

   This phase is called by the continuation constructed at the end of the
   memory finalization step above.

   #. Notify the context that the graph has been emitted.

      Calls ``JITLinkContext::notifyFinalized`` and hands off the
      ``JITLinkMemoryManager::Allocation`` object for this graph's memory
      allocation. This allows the context to track/hold memory allocations and
      react to the newly emitted definitions. In ORC this is used to update the
      ``ExecutionSession`` instance's dependence graph, which may result in
      these symbols (and possibly others) becoming *Ready* if all of their
      dependencies have also been emitted.
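
The prune step of Phase 1 can be pictured with the following standalone
sketch. It is a simplified model rather than JITLink's implementation:
``SimpleSymbol``, ``SimpleBlock``, and ``propagateLiveness`` are hypothetical
names, and symbols and blocks are identified by index instead of by pointer.
Liveness flows from the initially live symbols to their blocks, and from each
live block along its edges to the targeted symbols.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <vector>

  // Simplified stand-ins for LinkGraph elements (hypothetical types).
  struct SimpleSymbol {
    int Block; // Index of the containing block, or -1 for external/absolute.
    bool Live; // Set for roots by a mark-live pass; propagated below.
  };

  struct SimpleBlock {
    std::vector<std::size_t> EdgeTargets; // Symbols targeted by this block.
  };

  // Propagates liveness from the initially live symbols. Symbol flags are
  // updated in place; the result holds one liveness flag per block. Anything
  // left unmarked afterwards is unreachable and could be removed.
  inline std::vector<bool>
  propagateLiveness(std::vector<SimpleSymbol> &Syms,
                    const std::vector<SimpleBlock> &Blocks) {
    std::vector<bool> BlockLive(Blocks.size(), false);
    std::vector<std::size_t> Worklist;
    for (std::size_t I = 0; I != Syms.size(); ++I)
      if (Syms[I].Live)
        Worklist.push_back(I);

    while (!Worklist.empty()) {
      std::size_t S = Worklist.back();
      Worklist.pop_back();
      if (Syms[S].Block < 0)
        continue; // External or absolute symbol: no content to visit.
      std::size_t B = static_cast<std::size_t>(Syms[S].Block);
      if (BlockLive[B])
        continue; // Block already visited.
      BlockLive[B] = true;
      for (std::size_t T : Blocks[B].EdgeTargets) {
        assert(T < Syms.size() && "Edge target out of range");
        if (!Syms[T].Live) {
          Syms[T].Live = true;
          Worklist.push_back(T);
        }
      }
    }
    return BlockLive;
  }

In this model a symbol that is neither a root nor the target of an edge from a
live block stays dead, even if it points into a live block.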

.. _passes:

Passes
------

JITLink passes are ``std::function<Error(LinkGraph&)>`` instances. They are free
to inspect and modify the given ``LinkGraph`` subject to the constraints of
whatever phase they are running in (see :ref:`generic_link_algorithm`). If a
pass returns ``Error::success()`` then linking continues. If a pass returns
a failure value then linking is stopped and the ``JITLinkContext`` is notified
that the link failed.
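
The stop-at-first-failure behaviour can be sketched in isolation. The code
below is a toy model that assumes C++17: ``ToyGraph``, ``ToyError``,
``ToyPass``, and ``runPasses`` are hypothetical stand-ins for ``LinkGraph``,
``llvm::Error``, a JITLink pass, and the linker's pass runner, none of which
are used directly here.

.. code-block:: c++

  #include <functional>
  #include <optional>
  #include <string>
  #include <vector>

  // Stand-in for jitlink::LinkGraph (hypothetical): records which passes ran.
  struct ToyGraph {
    std::vector<std::string> Log;
  };

  // Stand-in for llvm::Error: an empty optional means success, otherwise the
  // optional holds an error message.
  using ToyError = std::optional<std::string>;
  using ToyPass = std::function<ToyError(ToyGraph &)>;

  // Runs the passes in order and stops at the first failure, returning its
  // error so that the caller can report the failed link.
  inline ToyError runPasses(ToyGraph &G, const std::vector<ToyPass> &Passes) {
    for (const ToyPass &P : Passes)
      if (ToyError Err = P(G))
        return Err;
    return std::nullopt;
  }

Passes after the failing one never run, so a pass can rely on every earlier
pass in the same list having succeeded.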

Passes may be used by both JITLink backends (e.g. MachO/x86-64 implements GOT
and PLT construction as a pass), and external clients like
``ObjectLinkingLayer::Plugin``.

In combination with the open ``LinkGraph`` API, JITLink passes enable the
implementation of powerful new features. For example:

* Relaxation optimizations -- A pre-fixup pass can inspect GOT accesses and PLT
  calls and identify situations where the addresses of the entry target and the
  access are close enough to be accessed directly. In this case the pass can
  rewrite the instruction stream of the containing block and update the fixup
  edges to make the access direct.

  Code for this looks like:

.. code-block:: c++

  Error relaxGOTEdges(LinkGraph &G) {
    for (auto *B : G.blocks())
      for (auto &E : B->edges())
        if (E.getKind() == x86_64::GOTLoad) {
          auto &GOTTarget = getGOTEntryTarget(E.getTarget());
          if (isInRange(B->getFixupAddress(E), GOTTarget)) {
            // Rewrite B->getContent() at fixup address from
            // MOVQ to LEAQ

            // Update edge target and kind.
            E.setTarget(GOTTarget);
            E.setKind(x86_64::PCRel32);
          }
        }

    return Error::success();
  }

* Metadata registration -- Post allocation passes can be used to record the
  address range of sections in the target. This can be used to register the
  metadata (e.g. exception handling frames, language metadata) in the target
  once memory has been finalized.

.. code-block:: c++

  Error registerEHFrameSection(LinkGraph &G) {
    if (auto *Sec = G.findSectionByName("__eh_frame")) {
      SectionRange SR(*Sec);
      registerEHFrameSection(SR.getStart(), SR.getEnd());
    }

    return Error::success();
  }

* Record call sites for later mutation -- A post-allocation pass can record
  the call sites of all calls to a particular function, allowing those call
  sites to be updated later at runtime (e.g. for instrumentation, or to
  enable the function to be lazily compiled but still called directly after
  compilation).

.. code-block:: c++

  StringRef FunctionName = "foo";
  std::vector<JITTargetAddress> CallSitesForFunction;

  auto RecordCallSites =
    [&](LinkGraph &G) -> Error {
      for (auto *B : G.blocks())
        for (auto &E : B->edges())
          if (E.getKind() == CallEdgeKind &&
              E.getTarget().hasName() &&
              E.getTarget().getName() == FunctionName)
            CallSitesForFunction.push_back(B->getFixupAddress(E));
      return Error::success();
    };
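
The relaxation example above leaves ``isInRange`` undefined. One plausible
shape for such a check, for a 32-bit PC-relative field, is sketched below. The
helper name and its address-based signature are assumptions made for
illustration and are not JITLink API; the sketch also assumes that the field is
the last four bytes of the instruction, so that the displacement is measured
from the end of the field.

.. code-block:: c++

  #include <cstdint>
  #include <limits>

  // Hypothetical helper, for illustration only: returns true if TargetAddr
  // can be encoded as a signed 32-bit displacement relative to the end of the
  // 4-byte field located at FixupAddr.
  inline bool isInRangeForPCRel32(uint64_t FixupAddr, uint64_t TargetAddr) {
    // The unsigned subtraction wraps around; the cast reinterprets the result
    // as a two's complement signed distance.
    int64_t Delta = static_cast<int64_t>(TargetAddr - (FixupAddr + 4));
    return Delta >= std::numeric_limits<int32_t>::min() &&
           Delta <= std::numeric_limits<int32_t>::max();
  }

A target that fails this check has to keep going through its GOT entry, since
the direct displacement would not fit in the instruction.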

Memory Management with JITLinkMemoryManager
-------------------------------------------

JIT linking requires allocation of two kinds of memory: working memory in the
JIT process and target memory in the executor process (these processes and
memory allocations may be one and the same, depending on how the user wants
to build their JIT). It also requires that these allocations conform to the
requested code model in the target process (e.g. MachO/x86-64's Small code
model requires that all code and data for a simulated dylib is allocated within
4GB). Finally, it is natural to make the memory manager responsible for
transferring memory to the target address space and applying memory protections,
since the memory manager must know how to communicate with the executor, and
since sharing and protection assignment can often be efficiently managed (in
the common case of running across processes on the same machine for security)
via the host operating system's virtual memory management APIs.

To satisfy these requirements ``JITLinkMemoryManager`` adopts the following
design: The memory manager itself has just one virtual method that returns a
``JITLinkMemoryManager::Allocation``:

.. code-block:: c++

  virtual Expected<std::unique_ptr<Allocation>>
  allocate(const JITLinkDylib *JD, const SegmentsRequestMap &Request) = 0;

This method takes a ``JITLinkDylib*`` representing the target simulated
dylib, and the full set of sections that must be allocated for this object.
``JITLinkMemoryManager`` implementations can (optionally) use the ``JD``
argument to manage a per-simulated-dylib memory pool (since code model
constraints are typically imposed on a per-dylib basis, and not across
dylibs) [2]_. The ``Request`` argument, by describing all sections in the current
object up-front, allows the implementer to allocate those sections as a
single slab, either within a pre-allocated per-jitdylib pool or directly
from system memory.
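
To make the single-slab idea concrete, the sketch below lays out a set of
requested segments inside one contiguous range. It is a simplified model under
stated assumptions: ``ToySegmentRequest``, ``ToySegmentsRequestMap``,
``ToySlabLayout``, and ``layOutSlab`` are hypothetical names, an ``unsigned``
key stands in for the protection flags, and every segment is assumed to start
on a page boundary so that protections can be applied per segment.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>
  #include <map>

  // Hypothetical model of one requested segment.
  struct ToySegmentRequest {
    uint64_t Size;
    uint64_t Alignment; // Must be a power of two, no larger than a page.
  };
  using ToySegmentsRequestMap = std::map<unsigned, ToySegmentRequest>;

  struct ToySlabLayout {
    std::map<unsigned, uint64_t> SegmentOffsets; // Start offset per segment.
    uint64_t TotalSize = 0;                      // Size of the whole slab.
  };

  // Rounds Value up to the next multiple of Align (a power of two).
  inline uint64_t alignUp(uint64_t Value, uint64_t Align) {
    assert(Align != 0 && (Align & (Align - 1)) == 0 &&
           "Align must be a power of two");
    return (Value + Align - 1) & ~(Align - 1);
  }

  // Places all requested segments back-to-back in one slab, starting each
  // segment on a page boundary and padding the slab to a whole page.
  inline ToySlabLayout layOutSlab(const ToySegmentsRequestMap &Request,
                                  uint64_t PageSize) {
    ToySlabLayout Layout;
    for (const auto &KV : Request) {
      assert(KV.second.Alignment <= PageSize && "Over-aligned segment");
      uint64_t Start = alignUp(Layout.TotalSize, PageSize);
      Layout.SegmentOffsets[KV.first] = Start;
      Layout.TotalSize = Start + KV.second.Size;
    }
    Layout.TotalSize = alignUp(Layout.TotalSize, PageSize);
    return Layout;
  }

A memory manager following this approach would reserve ``TotalSize`` bytes once
and derive each segment's working and target addresses from the recorded
offsets.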

All subsequent operations are provided by the
``JITLinkMemoryManager::Allocation`` interface:

* ``virtual MutableArrayRef<char> getWorkingMemory(ProtectionFlags Seg)``

  Should be overridden to return the address in working memory of the segment
  with the given protection flags.

* ``virtual JITTargetAddress getTargetMemory(ProtectionFlags Seg)``

  Should be overridden to return the address in the executor's address space of
  the segment with the given protection flags.

* ``virtual void finalizeAsync(FinalizeContinuation OnFinalize)``

  Should be overridden to copy the contents of working memory to the target
  address space and apply memory protections for all segments. Where working
  memory and target memory are separate, this method should deallocate the
  working memory.

* ``virtual Error deallocate()``

  Should be overridden to deallocate memory in the target address space.

JITLink provides a simple in-process implementation of this interface:
``InProcessMemoryManager``. It allocates pages once and re-uses them as both
working and target memory.
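
The in-process case can be modelled with a very small class in which the working
memory and target memory accessors return the same address. The sketch below is
a simplified, hypothetical model (``ToyInProcessAllocation`` is not a JITLink
class, segments are keyed by a plain ``unsigned``, and no page protections are
actually applied); it only illustrates the lifecycle described above.

.. code-block:: c++

  #include <cstddef>
  #include <cstdint>
  #include <map>
  #include <vector>

  // Hypothetical, simplified model of an in-process allocation. It mirrors the
  // shape of the interface described above but is not the JITLink class.
  class ToyInProcessAllocation {
  public:
    // Reserve one zero-filled buffer per protection-flag value.
    explicit ToyInProcessAllocation(
        const std::map<unsigned, std::size_t> &Sizes) {
      for (const auto &KV : Sizes)
        Segments[KV.first].assign(KV.second, 0);
    }

    // Working memory: where the linker writes content and applies fixups.
    char *getWorkingMemory(unsigned Prot) { return Segments.at(Prot).data(); }

    // Target memory: in-process, this is the same address as working memory.
    std::uintptr_t getTargetMemory(unsigned Prot) {
      return reinterpret_cast<std::uintptr_t>(Segments.at(Prot).data());
    }

    // A real implementation would apply page protections here (and copy to
    // the executor if working and target memory differed). This model only
    // records that finalization happened.
    void finalize() { Finalized = true; }
    bool isFinalized() const { return Finalized; }

    // Release the memory backing all segments.
    void deallocate() {
      Segments.clear();
      Finalized = false;
    }

  private:
    std::map<unsigned, std::vector<char>> Segments;
    bool Finalized = false;
  };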

ORC provides a cross-process ``JITLinkMemoryManager`` based on an ORC-RPC-based
implementation of the ``orc::TargetProcessControl`` API:
``OrcRPCTPCJITLinkMemoryManager``. This memory manager uses
``TargetProcessControl`` API calls to allocate and manage memory in a remote
process. The underlying communication channel is determined by the ORC-RPC
channel type. Common options include Unix sockets or TCP.

JITLinkMemoryManager and Security
---------------------------------

JITLink's ability to link JIT'd code for a separate executor process can be
used to improve the security of a JIT system: The executor process can be
sandboxed, run within a VM, or even run on a fully separate machine.

JITLink's memory manager interface is flexible enough to allow for a range of
trade-offs between performance and security. For example, on a system where code
pages must be signed (preventing code from being updated), the memory manager
can deallocate working memory pages after linking to free memory in the process
running JITLink. Alternatively, on a system that allows RWX pages, the memory
manager may use the same pages for both working and target memory by marking
them as RWX, allowing code to be modified in place without further overhead.
Finally, if RWX pages are not permitted but dual-virtual-mappings of
physical memory pages are, then the memory manager can dual map physical pages
as RW- in the JITLink process and R-X in the executor process, allowing
modification from the JITLink process but not from the executor (at the cost of
extra administrative overhead for the dual mapping).

Error Handling
--------------

JITLink makes extensive use of the ``llvm::Error`` type (see the error handling
section of :doc:`ProgrammersManual` for details). The link process itself, all
passes, the memory manager interface, and operations on the ``JITLinkContext``
are all permitted to fail. Link graph construction utilities (especially parsers
for object formats) are encouraged to validate input, and validate fixups
(e.g. with range checks) before application.

Any error will halt the link process and notify the context of failure. In ORC,
reported failures are propagated to queries pending on definitions provided by
the failing link, and also through edges of the dependence graph to any queries
waiting on dependent symbols.
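
To make the "validate fixups before application" advice concrete, the sketch
below range-checks a 32-bit PC-relative displacement and only writes to memory
if the value fits. It is an assumption-laden illustration, not JITLink code:
``applyPCRel32`` is a hypothetical helper, and a ``std::string`` return value
(empty on success) stands in for ``llvm::Error`` so that the example has no
LLVM dependencies.

.. code-block:: c++

  #include <cstdint>
  #include <limits>
  #include <string>

  // Hypothetical helper for illustration. The returned string stands in for
  // llvm::Error: an empty string means success.
  inline std::string applyPCRel32(unsigned char *FixupPtr,
                                  std::uint64_t FixupAddr,
                                  std::uint64_t TargetAddr,
                                  std::int64_t Addend) {
    // Compute the displacement that should be encoded at the fixup location.
    std::int64_t Delta =
        static_cast<std::int64_t>(TargetAddr - FixupAddr) + Addend;

    // Range check before touching memory: the value must fit in 32 signed bits.
    if (Delta < std::numeric_limits<std::int32_t>::min() ||
        Delta > std::numeric_limits<std::int32_t>::max())
      return std::string("PC-relative fixup out of range");

    // Write the value in little-endian byte order, independent of the host.
    std::uint32_t Value =
        static_cast<std::uint32_t>(static_cast<std::int32_t>(Delta));
    for (int I = 0; I != 4; ++I)
      FixupPtr[I] = static_cast<unsigned char>(Value >> (8 * I));
    return std::string();
  }

Because the check happens before the write, a failing fixup leaves the working
memory untouched and the caller can simply propagate the error.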

.. _connection_to_orc_runtime:

Connection to the ORC Runtime
=============================

The ORC Runtime (currently under development) aims to provide runtime support
for advanced JIT features, including object format features that require
non-trivial action in the executor (e.g. running initializers, managing thread
local storage, registering with language runtimes, etc.).

ORC Runtime support for object format features typically requires cooperation
between the runtime (which executes in the executor process) and JITLink (which
runs in the JIT process and can inspect LinkGraphs to determine what actions
must be taken in the executor). For example: Execution of MachO static
initializers in the ORC runtime is performed by the ``jit_dlopen`` function,
which calls back to the JIT process to ask for the list of address ranges of
``__mod_init`` sections to walk. This list is collated by the
``MachOPlatformPlugin``, which installs a pass to record this information for
each object as it is linked into the target.

.. _constructing_linkgraphs:

Constructing LinkGraphs
=======================

Clients usually access and manipulate ``LinkGraph`` instances that were created
for them by an ``ObjectLinkingLayer`` instance, but they can be created manually:

#. By directly constructing and populating a ``LinkGraph`` instance.

#. By using the ``createLinkGraph`` family of functions to create a
   ``LinkGraph`` from an in-memory buffer containing an object file. This is how
   ``ObjectLinkingLayer`` usually creates ``LinkGraphs``.

   #. ``createLinkGraph_<Object-Format>_<Architecture>`` can be used when
      both the object format and architecture are known ahead of time.

   #. ``createLinkGraph_<Object-Format>`` can be used when the object format is
      known ahead of time, but the architecture is not. In this case the
      architecture will be determined by inspection of the object header.

   #. ``createLinkGraph`` can be used when neither the object format nor
      the architecture is known ahead of time. In this case the object header
      will be inspected to determine both the format and architecture.
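
The idea of determining format and architecture by header inspection can be
sketched without any LLVM dependencies. The example below is not how
``createLinkGraph`` is implemented; ``ObjectKind``, ``readLE`` and
``identifyObject`` are hypothetical names, and only little-endian ELF and 64-bit
MachO headers for x86-64 and arm64 are recognised.

.. code-block:: c++

  #include <cstddef>
  #include <cstdint>
  #include <string>

  // Hypothetical result type for this sketch; not a JITLink type.
  struct ObjectKind {
    std::string Format = "unknown";
    std::string Arch = "unknown";
  };

  // Read a little-endian integer of NumBytes (at most 4) bytes from P.
  inline std::uint32_t readLE(const unsigned char *P, unsigned NumBytes) {
    std::uint32_t V = 0;
    for (unsigned I = 0; I != NumBytes; ++I)
      V |= static_cast<std::uint32_t>(P[I]) << (8 * I);
    return V;
  }

  // Identify format and architecture from the first bytes of an object file.
  inline ObjectKind identifyObject(const unsigned char *Buf, std::size_t Size) {
    ObjectKind K;
    if (Size >= 20 && Buf[0] == 0x7f && Buf[1] == 'E' && Buf[2] == 'L' &&
        Buf[3] == 'F') {
      K.Format = "ELF";
      if (Buf[5] == 1) { // EI_DATA == ELFDATA2LSB (little-endian)
        std::uint32_t Machine = readLE(Buf + 18, 2); // e_machine
        if (Machine == 62)
          K.Arch = "x86-64"; // EM_X86_64
        if (Machine == 183)
          K.Arch = "arm64"; // EM_AARCH64
      }
    } else if (Size >= 8 && readLE(Buf, 4) == 0xFEEDFACFu) { // MH_MAGIC_64
      K.Format = "MachO";
      std::uint32_t CPUType = readLE(Buf + 4, 4);
      if (CPUType == 0x01000007u)
        K.Arch = "x86-64"; // CPU_TYPE_X86_64
      if (CPUType == 0x0100000Cu)
        K.Arch = "arm64"; // CPU_TYPE_ARM64
    }
    return K;
  }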

.. _jit_linking:

JIT Linking
===========

The JIT linker concept was introduced in LLVM's earlier generation of JIT APIs,
MCJIT. In MCJIT the *RuntimeDyld* component enabled re-use of LLVM as an
in-memory compiler by adding an in-memory link step to the end of the usual
compiler pipeline. Rather than dumping relocatable objects to disk as a compiler
usually would, MCJIT passed them to RuntimeDyld to be linked into a target
process.

This approach to linking differs from standard *static* or *dynamic* linking:

A *static linker* takes one or more relocatable object files as input and links
them into an executable or dynamic library on disk.

A *dynamic linker* applies relocations to executables and dynamic libraries that
have been loaded into memory.

A *JIT linker* takes a single relocatable object file at a time and links it
into a target process, usually using a context object to allow the linked code
to resolve symbols in the target.

RuntimeDyld
-----------

In order to keep RuntimeDyld's implementation simple MCJIT imposed some
restrictions on compiled code:

#. It had to use the Large code model, and often restricted available relocation
   models in order to limit the kinds of relocations that had to be supported.

#. It required strong linkage and default visibility on all symbols -- behavior
   for other linkages/visibilities was not well defined.

#. It constrained and/or prohibited the use of features requiring runtime
   support, e.g. static initializers or thread local storage.

As a result of these restrictions not all language features supported by LLVM
worked under MCJIT, and objects to be loaded under the JIT had to be compiled to
target it (precluding the use of precompiled code from other sources under the
JIT).

RuntimeDyld also provided very limited visibility into the linking process
itself: Clients could access conservative estimates of section size
(RuntimeDyld bundled stub size and padding estimates into the section size
value) and the final relocated bytes, but could not access RuntimeDyld's
internal object representations.

Eliminating these restrictions and limitations was one of the primary motivations
for the development of JITLink.

The llvm-jitlink tool
=====================

The ``llvm-jitlink`` tool is a command line wrapper for the JITLink library.
It loads some set of relocatable object files and then links them using
JITLink. Depending on the options used it will then execute them, or validate
the linked memory.

The ``llvm-jitlink`` tool was originally designed to aid JITLink development by
providing a simple environment for testing.

Basic usage
-----------

By default, ``llvm-jitlink`` will link the set of objects passed on the command
line, then search for a "main" function and execute it:

.. code-block:: sh

  % cat hello-world.c
  #include <stdio.h>

  int main(int argc, char *argv[]) {
    printf("hello, world!\n");
    return 0;
  }

  % clang -c -o hello-world.o hello-world.c
  % llvm-jitlink hello-world.o
  hello, world!

Multiple objects may be specified, and arguments may be provided to the JIT'd
main function using the ``-args`` option:

.. code-block:: sh

  % cat print-args.c
  #include <stdio.h>

  void print_args(int argc, char *argv[]) {
    for (int i = 0; i != argc; ++i)
      printf("arg %i is \"%s\"\n", i, argv[i]);
  }

  % cat print-args-main.c
  void print_args(int argc, char *argv[]);

  int main(int argc, char *argv[]) {
    print_args(argc, argv);
    return 0;
  }

  % clang -c -o print-args.o print-args.c
  % clang -c -o print-args-main.o print-args-main.c
  % llvm-jitlink print-args.o print-args-main.o -args a b c
  arg 0 is "a"
  arg 1 is "b"
  arg 2 is "c"

Alternative entry points may be specified using the ``-entry <entry point
name>`` option.

Other options can be found by calling ``llvm-jitlink -help``.

llvm-jitlink as a regression testing utility
--------------------------------------------

One of the primary aims of ``llvm-jitlink`` was to enable readable regression
tests for JITLink. To do this it supports two options:

The ``-noexec`` option tells ``llvm-jitlink`` to stop after looking up the entry
point, and before attempting to execute it. Since the linked code is not
executed, this can be used to link for other targets even if you do not have
access to the target being linked (the ``-define-abs`` or ``-phony-externals``
options can be used to supply any missing definitions in this case).

The ``-check <check-file>`` option can be used to run a set of ``jitlink-check``
expressions against working memory. It is typically used in conjunction with
``-noexec``, since the aim is to validate JIT'd memory rather than to run the
code and ``-noexec`` allows us to link for any supported target architecture
from the current process. In ``-check`` mode, ``llvm-jitlink`` will scan the
given check-file for lines of the form ``# jitlink-check: <expr>``. See
examples of this usage in ``llvm/test/ExecutionEngine/JITLink``.
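
The scanning step can be pictured as a simple line filter. The following sketch
collects the expression text from every line that contains the
``# jitlink-check:`` marker. ``collectCheckExprs`` is a hypothetical helper
written for this document; the tool's own scanning logic may differ in detail,
and the sketch does not attempt to evaluate the expressions.

.. code-block:: c++

  #include <sstream>
  #include <string>
  #include <vector>

  // Collect the expression text from every line containing the marker
  // "# jitlink-check:". Hypothetical helper for illustration only.
  inline std::vector<std::string>
  collectCheckExprs(const std::string &FileText) {
    const std::string Marker = "# jitlink-check:";
    const char *Whitespace = " \t\r";
    std::vector<std::string> Exprs;
    std::istringstream In(FileText);
    std::string Line;
    while (std::getline(In, Line)) {
      std::string::size_type Pos = Line.find(Marker);
      if (Pos == std::string::npos)
        continue; // not a check line
      std::string Expr = Line.substr(Pos + Marker.size());
      // Trim leading and trailing whitespace from the expression.
      std::string::size_type B = Expr.find_first_not_of(Whitespace);
      if (B == std::string::npos)
        continue; // marker with no expression
      std::string::size_type E = Expr.find_last_not_of(Whitespace);
      Exprs.push_back(Expr.substr(B, E - B + 1));
    }
    return Exprs;
  }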

Remote execution via llvm-jitlink-executor
------------------------------------------

By default ``llvm-jitlink`` will link the given objects into its own process,
but this can be overridden by two options:

The ``-oop-executor[=/path/to/executor]`` option tells ``llvm-jitlink`` to
execute the given executor (which defaults to ``llvm-jitlink-executor``) and
communicate with it via file descriptors which it passes to the executor
as the first argument with the format ``filedescs=<in-fd>,<out-fd>``.

The ``-oop-executor-connect=<host>:<port>`` option tells ``llvm-jitlink`` to
connect to an already running executor via TCP on the given host and port. To
use this option you will need to start ``llvm-jitlink-executor`` manually with
``listen=<host>:<port>`` as the first argument.
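
For readers writing their own executor, the ``filedescs=<in-fd>,<out-fd>``
argument format is easy to parse. The sketch below shows one possible way to do
so; ``parseFileDescs`` is a hypothetical helper and is not the code used by
``llvm-jitlink-executor``.

.. code-block:: c++

  #include <cstdlib>
  #include <string>

  // Parse an argument of the form "filedescs=<in-fd>,<out-fd>". Returns true
  // and sets InFD/OutFD on success. Hypothetical helper for illustration.
  inline bool parseFileDescs(const std::string &Arg, int &InFD, int &OutFD) {
    const std::string Prefix = "filedescs=";
    if (Arg.compare(0, Prefix.size(), Prefix) != 0)
      return false;
    std::string Rest = Arg.substr(Prefix.size());
    std::string::size_type Comma = Rest.find(',');
    if (Comma == std::string::npos || Comma == 0 || Comma + 1 == Rest.size())
      return false;
    std::string In = Rest.substr(0, Comma);
    std::string Out = Rest.substr(Comma + 1);
    // Reject anything that is not a plain decimal number.
    if (In.find_first_not_of("0123456789") != std::string::npos ||
        Out.find_first_not_of("0123456789") != std::string::npos)
      return false;
    InFD = std::atoi(In.c_str());
    OutFD = std::atoi(Out.c_str());
    return true;
  }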

Harness mode
------------

The ``-harness`` option allows a set of input objects to be designated as a test
harness, with the regular object files implicitly treated as objects to be
tested. Definitions of symbols in the harness set override definitions in the
test set, and external references from the harness cause automatic scope
promotion of local symbols in the test set (these modifications to the usual
linker rules are accomplished via an ``ObjectLinkingLayer::Plugin`` installed by
``llvm-jitlink`` when it sees the ``-harness`` option).

With these modifications in place we can selectively test functions in an object
file by mocking those functions' callees. For example, suppose we have an object
file, ``test_code.o``, compiled from the following C source (which we need not
have access to):

.. code-block:: c

  void irrelevant_function() { irrelevant_external(); }

  int function_to_mock(int X) {
    return /* some function of X */;
  }

  static void function_to_test() {
    ...
    int Y = function_to_mock();
    printf("Y is %i\n", Y);
  }

If we want to know how ``function_to_test`` behaves when we change the behavior
of ``function_to_mock`` we can test it by writing a test harness:

.. code-block:: c

  #include <stdio.h>

  void function_to_test();

  int function_to_mock(int X) {
    printf("used mock utility function\n");
    return 42;
  }

  int main(int argc, char *argv[]) {
    function_to_test();
    return 0;
  }

Under normal circumstances these objects could not be linked together:
``function_to_test`` is static and could not be resolved outside
``test_code.o``, the two ``function_to_mock`` functions would result in a
duplicate definition error, and ``irrelevant_external`` is undefined.
However, using ``-harness`` and ``-phony-externals`` we can run this code
with:

.. code-block:: sh

  % clang -c -o test_code_harness.o test_code_harness.c
  % llvm-jitlink -phony-externals test_code.o -harness test_code_harness.o
  used mock utility function
  Y is 42

The ``-harness`` option may be of interest to people who want to perform some
very late testing on build products to verify that compiled code behaves as
expected. On basic C test cases this is relatively straightforward. Mocks for
more complicated languages (e.g. C++) are much trickier: Any code involving
classes tends to have a lot of non-trivial surface area (e.g. vtables) that
would require great care to mock.

Tips for JITLink backend developers
-----------------------------------

#. Make liberal use of ``assert`` and ``llvm::Error``. Do *not* assume that the
   input object is well formed: Return any errors produced by libObject (or your
   own object parsing code) and validate as you construct. Think carefully about
   the distinction between contract (which should be validated with asserts and
   ``llvm_unreachable``) and environmental errors (which should generate
   ``llvm::Error`` instances).

#. Don't assume you're linking in-process. Use libSupport's sized,
   endian-specific types when reading/writing content in the ``LinkGraph``.
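
The reason for the second tip is that the executor may have a different byte
order or word size from the process running JITLink, so content must be encoded
for the target rather than for the host. LLVM's ``llvm/Support/Endian.h`` header
provides helpers for this. The stand-alone functions below (``writeLE32`` and
``readLE32``, hypothetical names that do not come from LLVM) sketch the
underlying idea using only the standard library:

.. code-block:: c++

  #include <cstdint>

  // Write a 32-bit value in little-endian byte order regardless of the host's
  // endianness. Hypothetical stand-alone helper for illustration.
  inline void writeLE32(unsigned char *P, std::uint32_t V) {
    for (int I = 0; I != 4; ++I)
      P[I] = static_cast<unsigned char>(V >> (8 * I));
  }

  // Read a 32-bit little-endian value regardless of the host's endianness.
  inline std::uint32_t readLE32(const unsigned char *P) {
    std::uint32_t V = 0;
    for (int I = 0; I != 4; ++I)
      V |= static_cast<std::uint32_t>(P[I]) << (8 * I);
    return V;
  }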

As a "minimum viable" JITLink wrapper, the ``llvm-jitlink`` tool is an
invaluable resource for developers bringing in a new JITLink backend. A standard
workflow is to start by throwing an unsupported object at the tool and seeing
what error is returned, then fixing that (you can often make a reasonable guess
at what should be done based on existing code for other formats or
architectures).

In debug builds of LLVM, the ``-debug-only=jitlink`` option dumps logs from the
JITLink library during the link process. These can be useful for spotting some bugs at
a glance. The ``-debug-only=llvm_jitlink`` option dumps logs from the ``llvm-jitlink``
tool, which can be useful for debugging both testcases (it is often less verbose than
``-debug-only=jitlink``) and the tool itself.

The ``-oop-executor`` and ``-oop-executor-connect`` options are helpful for testing
handling of cross-process and cross-architecture use cases.

Roadmap
=======

JITLink is under active development. Work so far has focused on the MachO
implementation. In LLVM 12 there is limited support for ELF on x86-64.

Major outstanding projects include:

* Refactor architecture support to maximize sharing across formats.

  All formats should be able to share the bulk of the architecture specific
  code (especially relocations) for each supported architecture.

* Refactor ELF link graph construction.

  ELF's link graph construction is currently implemented in the ``ELF_x86_64.cpp``
  file, and tied to the x86-64 relocation parsing code. The bulk of the code is
  generic and should be split into an ELFLinkGraphBuilder base class along the
  same lines as the existing generic MachOLinkGraphBuilder.

* Implement ELF support for arm64.

  Once the architecture support code has been refactored to enable sharing and
  ELF link graph construction has been refactored to allow re-use we should be
  able to construct an ELF / arm64 JITLink implementation by combining
  these existing pieces.

* Implement support for new architectures.

* Implement support for COFF.

  There is no COFF implementation of JITLink yet. Such an implementation should
  follow the MachO and ELF paths: a generic COFFLinkGraphBuilder base class that
  can be specialized for each architecture.

* Design and implement a shared-memory based JITLinkMemoryManager.

  One use-case that is expected to be common is out-of-process linking targeting
  another process on the same machine. This allows JITs to sandbox JIT'd code.
  For this use case a shared-memory based JITLinkMemoryManager would provide the
  most efficient form of allocation. Creating one will require designing a
  generic API for shared memory though, as LLVM does not currently have one.

JITLink Availability and Feature Status
---------------------------------------

.. list-table:: Availability and Status
   :widths: 10 30 30 30
   :header-rows: 1

   * - Architecture
     - ELF
     - COFF
     - MachO
   * - arm64
     -
     -
     - Partial (small code model, PIC relocation model only)
   * - x86-64
     - Partial
     -
     - Full (except TLV and debugging)

.. [1] See ``llvm/examples/OrcV2Examples/LLJITWithObjectLinkingLayerPlugin`` for
       a full worked example.

.. [2] If not for *hidden* scoped symbols we could eliminate the
       ``JITLinkDylib*`` argument to ``JITLinkMemoryManager::allocate`` and
       treat every object as a separate simulated dylib for the purposes of
       memory layout. Hidden symbols break this by generating in-range accesses
       to external symbols, requiring the access and symbol to be allocated
       within range of one another. That said, providing a pre-reserved address
       range pool for each simulated dylib guarantees that the relaxation
       optimizations will kick in for all intra-dylib references, which is good
       for performance (at the cost of whatever overhead is introduced by
       reserving the address-range up-front).