===============================
ORC Design and Implementation
===============================

.. contents::
   :local:

Introduction
============

This document aims to provide a high-level overview of the design and
implementation of the ORC JIT APIs. Except where otherwise stated, all
discussion refers to the modern ORCv2 APIs (available since LLVM 7). Clients
wishing to transition from ORCv1 should see Section
:ref:`transitioning_orcv1_to_orcv2`.

Use-cases
=========

ORC provides a modular API for building JIT compilers. There are a number
of use cases for such an API. For example:

1. The LLVM tutorials use a simple ORC-based JIT class to execute expressions
   compiled from a toy language: Kaleidoscope.

2. The LLVM debugger, LLDB, uses a cross-compiling JIT for expression
   evaluation. In this use case, cross compilation allows expressions compiled
   in the debugger process to be executed on the debug target process, which
   may be on a different device/architecture.

3. High-performance JITs (e.g. JVMs, Julia) that want to make use of LLVM's
   optimizations within an existing JIT infrastructure.

4. Interpreters and REPLs, e.g. Cling (C++) and the Swift interpreter.

By adopting a modular, library-based design we aim to make ORC useful in as many
of these contexts as possible.

Features
========

ORC provides the following features:

**JIT-linking**
  ORC provides APIs to link relocatable object files (COFF, ELF, MachO) [1]_
  into a target process at runtime. The target process may be the same process
  that contains the JIT session object and jit-linker, or may be another process
  (even one running on a different machine or architecture) that communicates
  with the JIT via RPC.

**LLVM IR compilation**
  ORC provides off the shelf components (IRCompileLayer, SimpleCompiler,
  ConcurrentIRCompiler) that make it easy to add LLVM IR to a JIT'd process.

**Eager and lazy compilation**
  By default, ORC will compile symbols as soon as they are looked up in the JIT
  session object (``ExecutionSession``). Compiling eagerly by default makes it
  easy to use ORC as an in-memory compiler for an existing JIT (similar to how
  MCJIT is commonly used). However, ORC also provides built-in support for lazy
  compilation via lazy-reexports (see :ref:`Laziness`).

**Support for Custom Compilers and Program Representations**
  Clients can supply custom compilers for each symbol that they define in their
  JIT session. ORC will run the user-supplied compiler when a definition of
  a symbol is needed. ORC is actually fully language agnostic: LLVM IR is not
  treated specially, and is supported via the same wrapper mechanism (the
  ``MaterializationUnit`` class) that is used for custom compilers.

**Concurrent JIT'd code** and **Concurrent Compilation**
  JIT'd code may be executed in multiple threads, may spawn new threads, and may
  re-enter ORC (e.g. to request lazy compilation) concurrently from multiple
  threads. Compilers launched by ORC can run concurrently (provided the client
  sets up an appropriate dispatcher). Built-in dependency tracking ensures that
  ORC does not release pointers to JIT'd code or data until all dependencies
  have also been JIT'd and they are safe to call or use.

**Removable Code**
  Resources for JIT'd program representations can be tracked and removed (see
  `How to remove code`_).

**Orthogonality** and **Composability**
  Each of the features above can be used independently. It is possible to put
  ORC components together to make a non-lazy, in-process, single threaded JIT
  or a lazy, out-of-process, concurrent JIT, or anything in between.

LLJIT and LLLazyJIT
===================

ORC provides two basic JIT classes off-the-shelf. These are useful both as
examples of how to assemble ORC components to make a JIT, and as replacements
for earlier LLVM JIT APIs (e.g. MCJIT).

The LLJIT class uses an IRCompileLayer and RTDyldObjectLinkingLayer to support
compilation of LLVM IR and linking of relocatable object files. All operations
are performed eagerly on symbol lookup (i.e. a symbol's definition is compiled
as soon as you attempt to look up its address). LLJIT is a suitable replacement
for MCJIT in most cases (note: some more advanced features, e.g.
JITEventListeners, are not supported yet).

The LLLazyJIT class extends LLJIT and adds a CompileOnDemandLayer to enable lazy
compilation of LLVM IR. When an LLVM IR module is added via the addLazyIRModule
method, function bodies in that module will not be compiled until they are first
called. LLLazyJIT aims to provide a replacement for LLVM's original (pre-MCJIT)
JIT API.

LLJIT and LLLazyJIT instances can be created using their respective builder
classes: LLJITBuilder and LLLazyJITBuilder. For example, assuming you have a
module ``M`` loaded on a ThreadSafeContext ``Ctx``:

.. code-block:: c++

  // Try to detect the host arch and construct an LLJIT instance.
  auto JIT = LLJITBuilder().create();

  // If we could not construct an instance, return an error.
  if (!JIT)
    return JIT.takeError();

  // Add the module.
  if (auto Err = JIT->addIRModule(ThreadSafeModule(std::move(M), Ctx)))
    return Err;

  // Look up the JIT'd code entry point.
  auto EntrySym = JIT->lookup("entry");
  if (!EntrySym)
    return EntrySym.takeError();

  // Cast the entry point address to a function pointer.
  auto *Entry = EntrySym->toPtr<void(*)()>();

  // Call into JIT'd code.
  Entry();

The builder classes provide a number of configuration options that can be
specified before the JIT instance is constructed. For example:

.. code-block:: c++

  // Build an LLLazyJIT instance that uses four worker threads for compilation,
  // and jumps to a specific error handler (rather than null) on lazy compile
  // failures.

  void handleLazyCompileFailure() {
    // JIT'd code will jump here if lazy compilation fails, giving us an
    // opportunity to exit or throw an exception into JIT'd code.
    throw JITFailed();
  }

  auto JIT = LLLazyJITBuilder()
               .setNumCompileThreads(4)
               .setLazyCompileFailureAddr(
                   ExecutorAddr::fromPtr(&handleLazyCompileFailure))
               .create();

  // ...

For users wanting to get started with LLJIT a minimal example program can be
found at ``llvm/examples/HowToUseLLJIT``.

Design Overview
===============

ORC's JIT program model aims to emulate the linking and symbol resolution
rules used by the static and dynamic linkers. This allows ORC to JIT
arbitrary LLVM IR, including IR produced by an ordinary static compiler (e.g.
clang) that uses constructs like symbol linkage and visibility, and weak [3]_
and common symbol definitions.

To see how this works, imagine a program ``myapp`` which links against a pair
of dynamic libraries: ``libA`` and ``libB``. On the command line, building this
program might look like:

.. code-block:: bash

  $ clang++ -shared -o libA.dylib a1.cpp a2.cpp
  $ clang++ -shared -o libB.dylib b1.cpp b2.cpp
  $ clang++ -o myapp myapp.cpp -L. -lA -lB
  $ ./myapp

In ORC, this would translate into API calls on a hypothetical CXXCompileLayer
(with error checking omitted for brevity) as:

.. code-block:: c++

  ExecutionSession ES;
  RTDyldObjectLinkingLayer ObjLinkingLayer(
      ES, []() { return std::make_unique<SectionMemoryManager>(); });
  CXXCompileLayer CXXLayer(ES, ObjLinkingLayer);

  // Create JITDylib "A" and add code to it using the CXX layer.
  auto &LibA = ES.createJITDylib("A");
  CXXLayer.add(LibA, MemoryBuffer::getFile("a1.cpp"));
  CXXLayer.add(LibA, MemoryBuffer::getFile("a2.cpp"));

  // Create JITDylib "B" and add code to it using the CXX layer.
  auto &LibB = ES.createJITDylib("B");
  CXXLayer.add(LibB, MemoryBuffer::getFile("b1.cpp"));
  CXXLayer.add(LibB, MemoryBuffer::getFile("b2.cpp"));

  // Create and specify the search order for the main JITDylib. This is
  // equivalent to a "links against" relationship in a command-line link.
  auto &MainJD = ES.createJITDylib("main");
  MainJD.addToLinkOrder(LibA);
  MainJD.addToLinkOrder(LibB);
  CXXLayer.add(MainJD, MemoryBuffer::getFile("main.cpp"));

  // Look up the JIT'd main, cast it to a function pointer, then call it.
  auto MainSym = ExitOnErr(ES.lookup({&MainJD}, "main"));
  auto *Main = MainSym.getAddress().toPtr<int(*)(int, char *[])>();

  int Result = Main(...);

This example tells us nothing about *how* or *when* compilation will happen.
That will depend on the implementation of the hypothetical CXXCompileLayer.
The same linker-based symbol resolution rules will apply regardless of that
implementation, however. For example, if a1.cpp and a2.cpp both define a
function "foo" then ORCv2 will generate a duplicate definition error. On the
other hand, if a1.cpp and b1.cpp both define "foo" there is no error (different
dynamic libraries may define the same symbol). If main.cpp refers to "foo", it
should bind to the definition in LibA rather than the one in LibB, since
main.cpp is part of the "main" dylib, and the main dylib links against LibA
before LibB.

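The resolution rules above can be sketched with a small standalone model. This
is not ORC code: ``ToyDylib`` and ``toyLookup`` are hypothetical names, addresses
are plain integers, and the model assumes that a dylib is searched before the
entries of its link order.

.. code-block:: c++

  #include <cassert>
  #include <map>
  #include <stdexcept>
  #include <string>
  #include <vector>

  // Toy stand-in for a JITDylib: a named symbol table plus a link order.
  // Illustrative only; this is not the llvm::orc::JITDylib class.
  struct ToyDylib {
    std::string Name;
    std::map<std::string, int> Symbols; // symbol name -> fake address
    std::vector<ToyDylib *> LinkOrder;  // searched after this dylib itself

    // Defining the same symbol twice in one dylib is an error.
    void define(const std::string &Sym, int Addr) {
      if (!Symbols.emplace(Sym, Addr).second)
        throw std::runtime_error("duplicate definition of " + Sym + " in " +
                                 Name);
    }
  };

  // Search JD first, then each entry of its link order in turn. Returns a
  // pointer to the fake address, or nullptr if the symbol is not found.
  inline const int *toyLookup(const ToyDylib &JD, const std::string &Sym) {
    auto I = JD.Symbols.find(Sym);
    if (I != JD.Symbols.end())
      return &I->second;
    for (const ToyDylib *Dep : JD.LinkOrder) {
      assert(Dep && "link order entries must be non-null");
      auto J = Dep->Symbols.find(Sym);
      if (J != Dep->Symbols.end())
        return &J->second;
    }
    return nullptr;
  }

With "foo" defined in both a toy ``A`` and a toy ``B``, and a main dylib whose
link order is ``{&A, &B}``, ``toyLookup`` returns the definition from ``A``,
while a second ``define`` of "foo" in ``A`` throws.
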
Many JIT clients will have no need for this strict adherence to the usual
ahead-of-time linking rules, and should be able to get by just fine by putting
all of their code in a single JITDylib. However, clients who want to JIT code
for languages/projects that traditionally rely on ahead-of-time linking (e.g.
C++) will find that this feature makes life much easier.

Symbol lookup in ORC serves two other important functions, beyond providing
addresses for symbols: (1) It triggers compilation of the symbol(s) searched for
(if they have not been compiled already), and (2) it provides the
synchronization mechanism for concurrent compilation. The pseudo-code for the
lookup process is:

.. code-block:: none

  construct a query object from a query set and query handler
  lock the session
  lodge query against requested symbols, collect required materializers (if any)
  unlock the session
  dispatch materializers (if any)

In this context a materializer is something that provides a working definition
of a symbol upon request. Usually materializers are just wrappers for compilers,
but they may also wrap a jit-linker directly (if the program representation
backing the definitions is an object file), or may even be a class that writes
bits directly into memory (for example, if the definitions are stubs).
Materialization is the blanket term for any actions (compiling, linking,
splatting bits, registering with runtimes, etc.) that are required to generate a
symbol definition that is safe to call or access.

As each materializer completes its work it notifies the JITDylib, which in turn
notifies any query objects that are waiting on the newly materialized
definitions. Each query object maintains a count of the number of symbols that
it is still waiting on, and once this count reaches zero the query object calls
the query handler with a *SymbolMap* (a map of symbol names to addresses)
describing the result. If any symbol fails to materialize the query immediately
calls the query handler with an error.

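The counting behavior of a query object can be illustrated with a toy model.
The sketch below is an assumption-laden simplification: ``ToyQuery`` is a
hypothetical class, addresses are integers, and both the session lock and the
error path are left out.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <functional>
  #include <map>
  #include <set>
  #include <string>
  #include <utility>

  // Toy model of a lookup query. Hypothetical names throughout; locking and
  // error handling are omitted.
  class ToyQuery {
  public:
    using SymbolMap = std::map<std::string, int>; // name -> fake address
    using Handler = std::function<void(const SymbolMap &)>;

    ToyQuery(std::set<std::string> Symbols, Handler H)
        : Pending(std::move(Symbols)), OnComplete(std::move(H)) {
      assert(OnComplete && "a query needs a handler");
    }

    // Called when a materializer finishes a symbol. Once no symbols are
    // pending, the handler runs once with the complete result.
    void notifyMaterialized(const std::string &Name, int Addr) {
      if (Pending.erase(Name) == 0)
        return; // Not a symbol this query is waiting on.
      Result[Name] = Addr;
      if (Pending.empty())
        OnComplete(Result);
    }

    std::size_t numPending() const { return Pending.size(); }

  private:
    std::set<std::string> Pending;
    SymbolMap Result;
    Handler OnComplete;
  };

A query for "foo" and "bar" stays pending after "foo" is reported, ignores
notifications for unrelated symbols, and calls its handler only when "bar"
arrives.
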
The collected materialization units are sent to the ExecutionSession to be
dispatched, and the dispatch behavior can be set by the client. By default each
materializer is run on the calling thread. Clients are free to create new
threads to run materializers, or to send the work to a work queue for a thread
pool (this is what LLJIT/LLLazyJIT do).

Top Level APIs
==============

Many of ORC's top-level APIs are visible in the example above:

- *ExecutionSession* represents the JIT'd program and provides context for the
  JIT: It contains the JITDylibs, error reporting mechanisms, and dispatches the
  materializers.

- *JITDylibs* provide the symbol tables.

- *Layers* (ObjLinkingLayer and CXXLayer) are wrappers around compilers and
  allow clients to add uncompiled program representations supported by those
  compilers to JITDylibs.

- *ResourceTrackers* allow you to remove code.

Several other important APIs are used implicitly. JIT clients need not be aware
of them, but Layer authors will use them:

- *MaterializationUnit* - When XXXLayer::add is invoked it wraps the given
  program representation (in this example, C++ source) in a MaterializationUnit,
  which is then stored in the JITDylib. MaterializationUnits are responsible for
  describing the definitions they provide, and for unwrapping the program
  representation and passing it back to the layer when compilation is required
  (this ownership shuffle makes writing thread-safe layers easier, since the
  ownership of the program representation will be passed back on the stack,
  rather than having to be fished out of a Layer member, which would require
  synchronization).

- *MaterializationResponsibility* - When a MaterializationUnit hands a program
  representation back to the layer it comes with an associated
  MaterializationResponsibility object. This object tracks the definitions
  that must be materialized and provides a way to notify the JITDylib once they
  are either successfully materialized or a failure occurs.

Absolute Symbols, Aliases, and Reexports
========================================

ORC makes it easy to define symbols with absolute addresses, or symbols that
are simply aliases of other symbols:

Absolute Symbols
----------------

Absolute symbols are symbols that map directly to addresses without requiring
further materialization, for example: "foo" = 0x1234. One use case for
absolute symbols is allowing resolution of process symbols. E.g.

.. code-block:: c++

  JD.define(absoluteSymbols(SymbolMap({
      { Mangle("printf"),
        { ExecutorAddr::fromPtr(&printf),
          JITSymbolFlags::Callable } }
    })));

With this mapping established, code added to the JIT can refer to printf
symbolically rather than requiring the address of printf to be "baked in".
This in turn allows cached versions of the JIT'd code (e.g. compiled objects)
to be re-used across JIT sessions as the JIT'd code no longer changes, only the
absolute symbol definition does.

For process and library symbols the DynamicLibrarySearchGenerator utility (see
:ref:`How to Add Process and Library Symbols to JITDylibs
<ProcessAndLibrarySymbols>`) can be used to automatically build absolute
symbol mappings for you. However, the absoluteSymbols function is still useful
for making non-global objects in your JIT visible to JIT'd code. For example,
imagine that your JIT standard library needs access to your JIT object to make
some calls. We could bake the address of your object into the library, but then
it would need to be recompiled for each session:

.. code-block:: c++

  // From standard library for JIT'd code:

  class MyJIT {
  public:
    void log(const char *Msg);
  };

  void log(const char *Msg) { ((MyJIT*)0x1234)->log(Msg); }

We can turn this into a symbolic reference in the JIT standard library:

.. code-block:: c++

  extern MyJIT *__MyJITInstance;

  void log(const char *Msg) { __MyJITInstance->log(Msg); }

And then make our JIT object visible to the JIT standard library with an
absolute symbol definition when the JIT is started:

.. code-block:: c++

  MyJIT J = ...;

  auto &JITStdLibJD = ... ;

  JITStdLibJD.define(absoluteSymbols(SymbolMap({
      { Mangle("__MyJITInstance"),
        { ExecutorAddr::fromPtr(&J), JITSymbolFlags() } }
    })));

Aliases and Reexports
---------------------

Aliases and reexports allow you to define new symbols that map to existing
symbols. This can be useful for changing linkage relationships between symbols
across sessions without having to recompile code. For example, imagine that
JIT'd code has access to a log function, ``void log(const char*)`` for which
there are two implementations in the JIT standard library: ``log_fast`` and
``log_detailed``. Your JIT can choose which one of these definitions will be
used when the ``log`` symbol is referenced by setting up an alias at JIT startup
time:

.. code-block:: c++

  auto &JITStdLibJD = ... ;

  auto LogImplementationSymbol =
   Verbose ? Mangle("log_detailed") : Mangle("log_fast");

  JITStdLibJD.define(
    symbolAliases(SymbolAliasMap({
        { Mangle("log"),
          { LogImplementationSymbol,
            JITSymbolFlags::Exported | JITSymbolFlags::Callable } }
      })));

The ``symbolAliases`` function allows you to define aliases within a single
JITDylib. The ``reexports`` function provides the same functionality, but
operates across JITDylib boundaries. E.g.

.. code-block:: c++

  auto &JD1 = ... ;
  auto &JD2 = ... ;

  // Make 'bar' in JD2 an alias for 'foo' from JD1.
  JD2.define(
    reexports(JD1, SymbolAliasMap({
        { Mangle("bar"), { Mangle("foo"), JITSymbolFlags::Exported } }
      })));

The reexports utility can be handy for composing a single JITDylib interface by
re-exporting symbols from several other JITDylibs.

.. _Laziness:

Laziness
========

Laziness in ORC is provided by a utility called "lazy reexports". A lazy
reexport is similar to a regular reexport or alias: It provides a new name for
an existing symbol. Unlike regular reexports however, lookups of lazy reexports
do not trigger immediate materialization of the reexported symbol. Instead, they
only trigger materialization of a function stub. This function stub is
initialized to point at a *lazy call-through*, which provides reentry into the
JIT. If the stub is called at runtime then the lazy call-through will look up
the reexported symbol (triggering materialization for it if necessary), update
the stub (to call directly to the reexported symbol on subsequent calls), and
then return via the reexported symbol. By re-using the existing symbol lookup
mechanism, lazy reexports inherit the same concurrency guarantees: calls to lazy
reexports can be made from multiple threads concurrently, and the reexported
symbol can be in any state of compilation (uncompiled, already in the process of
being compiled, or already compiled) and the call will succeed. This allows
laziness to be safely mixed with features like remote compilation, concurrent
compilation, concurrent JIT'd code, and speculative compilation.

There is one other key difference between regular reexports and lazy reexports
that some clients must be aware of: The address of a lazy reexport will be
*different* from the address of the reexported symbol (whereas a regular
reexport is guaranteed to have the same address as the reexported symbol).
Clients who care about pointer equality will generally want to use the address
of the reexport as the canonical address of the reexported symbol. This will
allow the address to be taken without forcing materialization of the reexport.

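The stub and call-through mechanism described above can be modeled in a few
lines of ordinary C++. This is only a sketch of the idea: ``ToyLazyStub`` and
``fooBody`` are hypothetical, real ORC stubs are patched machine code rather
than objects, and the synchronization that makes concurrent calls safe is
omitted.

.. code-block:: c++

  #include <cassert>
  #include <functional>
  #include <utility>

  // Toy model of a stub plus lazy call-through for functions of type
  // int(int). Illustrative only; not how ORC implements stubs.
  class ToyLazyStub {
  public:
    using Fn = int (*)(int);

    // R stands in for the JIT lookup that materializes the real body.
    explicit ToyLazyStub(std::function<Fn()> R) : Resolver(std::move(R)) {}

    int operator()(int Arg) {
      if (!Target) {         // First call: take the call-through path.
        Target = Resolver(); // Look up (and materialize) the body.
        assert(Target && "resolver must return a function");
        ++NumResolutions;
      }
      return Target(Arg);    // Later calls go straight to the body.
    }

    int numResolutions() const { return NumResolutions; }

  private:
    std::function<Fn()> Resolver;
    Fn Target = nullptr;
    int NumResolutions = 0;
  };

  // Stand-in for a JIT'd function body.
  inline int fooBody(int X) { return X + 1; }

Constructing the stub performs no lookup. The first call resolves ``fooBody``
and caches it, so later calls reach the body without another lookup.
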
Usage example:

If JITDylib ``JD`` contains definitions for symbols ``foo_body`` and
``bar_body``, we can create lazy entry points ``foo`` and ``bar`` in JITDylib
``JD2`` by calling:

.. code-block:: c++

  auto ReexportFlags = JITSymbolFlags::Exported | JITSymbolFlags::Callable;
  JD2.define(
    lazyReexports(CallThroughMgr, StubsMgr, JD,
                  SymbolAliasMap({
                    { Mangle("foo"), { Mangle("foo_body"), ReexportFlags } },
                    { Mangle("bar"), { Mangle("bar_body"), ReexportFlags } }
                  })));

A full example of how to use lazyReexports with the LLJIT class can be found at
``llvm/examples/OrcV2Examples/LLJITWithLazyReexports``.

Supporting Custom Compilers
===========================

TBD.

.. _transitioning_orcv1_to_orcv2:

Transitioning from ORCv1 to ORCv2
=================================

Since LLVM 7.0, new ORC development work has focused on adding support for
concurrent JIT compilation. The new APIs (including new layer interfaces and
implementations, and new utilities) that support concurrency are collectively
referred to as ORCv2, and the original, non-concurrent layers and utilities
are now referred to as ORCv1.

The majority of the ORCv1 layers and utilities were renamed with a 'Legacy'
prefix in LLVM 8.0, and have deprecation warnings attached in LLVM 9.0. In LLVM
12.0 ORCv1 will be removed entirely.

Transitioning from ORCv1 to ORCv2 should be easy for most clients. Most of the
ORCv1 layers and utilities have ORCv2 counterparts [2]_ that can be directly
substituted. However, there are some design differences between ORCv1 and ORCv2
to be aware of:

  1. ORCv2 fully adopts the JIT-as-linker model that began with MCJIT. Modules
     (and other program representations, e.g. Object Files) are no longer added
     directly to JIT classes or layers. Instead, they are added to ``JITDylib``
     instances *by* layers. The ``JITDylib`` determines *where* the definitions
     reside, the layers determine *how* the definitions will be compiled.
     Linkage relationships between ``JITDylibs`` determine how inter-module
     references are resolved, and symbol resolvers are no longer used. See the
     section `Design Overview`_ for more details.

     Unless multiple JITDylibs are needed to model linkage relationships, ORCv1
     clients should place all code in a single JITDylib.
     MCJIT clients should use LLJIT (see `LLJIT and LLLazyJIT`_), and can place
     code in LLJIT's default-created main JITDylib (see
     ``LLJIT::getMainJITDylib()``).

  2. All JIT stacks now need an ``ExecutionSession`` instance. ExecutionSession
     manages the string pool, error reporting, synchronization, and symbol
     lookup.

  3. ORCv2 uses uniqued strings (``SymbolStringPtr`` instances) rather than
     string values in order to reduce memory overhead and improve lookup
     performance. See the subsection `How to manage symbol strings`_.

  4. IR layers require ThreadSafeModule instances, rather than
     std::unique_ptr<Module>s. ThreadSafeModule is a wrapper that ensures that
     Modules that use the same LLVMContext are not accessed concurrently.
     See `How to use ThreadSafeModule and ThreadSafeContext`_.

  5. Symbol lookup is no longer handled by layers. Instead, there is a
     ``lookup`` method on ExecutionSession that takes a list of JITDylibs to
     scan.

     .. code-block:: c++

       ExecutionSession ES;
       JITDylib &JD1 = ...;
       JITDylib &JD2 = ...;

       auto Sym = ES.lookup({&JD1, &JD2}, ES.intern("_main"));

  6. The removeModule/removeObject methods are replaced by
     ``ResourceTracker::remove``.
     See the subsection `How to remove code`_.

For code examples and suggestions of how to use the ORCv2 APIs, please see
the section `How-tos`_.

How-tos
=======

How to manage symbol strings
----------------------------

Symbol strings in ORC are uniqued to improve lookup performance, reduce memory
overhead, and allow symbol names to function as efficient keys. To get the
unique ``SymbolStringPtr`` for a string value, call the
``ExecutionSession::intern`` method:

  .. code-block:: c++

    ExecutionSession ES;
    /// ...
    auto MainSymbolName = ES.intern("main");

If you wish to perform lookup using the C/IR name of a symbol you will also
need to apply the platform linker-mangling before interning the string. On
Linux this mangling is a no-op, but on other platforms it usually involves
adding a prefix to the string (e.g. '_' on Darwin). The mangling scheme is
based on the DataLayout for the target. Given a DataLayout and an
ExecutionSession, you can create a MangleAndInterner function object that
will perform both jobs for you:

  .. code-block:: c++

    ExecutionSession ES;
    const DataLayout &DL = ...;
    MangleAndInterner Mangle(ES, DL);

    // ...

    // Portable IR-symbol-name lookup:
    auto Sym = ES.lookup({&MainJD}, Mangle("main"));

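To show why uniqued strings make efficient keys, and how mangling composes with
interning, here is a standalone toy model. ``ToyStringPool`` and
``ToyMangleAndInterner`` are hypothetical stand-ins, and the single
``GlobalPrefix`` character is an assumed simplification of what the real
utility derives from the DataLayout.

  .. code-block:: c++

    #include <cassert>
    #include <string>
    #include <unordered_set>

    // Toy string pool: equal strings share one stored copy, so interned
    // names can be compared by pointer. Not the real ORC string pool.
    class ToyStringPool {
    public:
      const std::string *intern(const std::string &S) {
        // Pointers to unordered_set elements stay valid across rehashing.
        return &*Pool.insert(S).first;
      }

    private:
      std::unordered_set<std::string> Pool;
    };

    // Toy mangle-and-intern helper. GlobalPrefix models the per-target
    // prefix (e.g. '_' on Darwin); '\0' means "no prefix".
    class ToyMangleAndInterner {
    public:
      ToyMangleAndInterner(ToyStringPool &P, char Prefix)
          : Pool(P), GlobalPrefix(Prefix) {}

      const std::string *operator()(const std::string &Name) {
        assert(!Name.empty() && "cannot mangle an empty name");
        std::string Mangled;
        if (GlobalPrefix != '\0')
          Mangled.push_back(GlobalPrefix);
        Mangled += Name;
        return Pool.intern(Mangled);
      }

    private:
      ToyStringPool &Pool;
      char GlobalPrefix;
    };

In this model, mangling "main" with a ``'_'`` prefix yields the same pointer as
interning "_main" directly, so later comparisons need no string comparison.
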
How to create JITDylibs and set up linkage relationships
--------------------------------------------------------

In ORC, all symbol definitions reside in JITDylibs. JITDylibs are created by
calling the ``ExecutionSession::createJITDylib`` method with a unique name:

  .. code-block:: c++

    ExecutionSession ES;
    auto &JD = ES.createJITDylib("libFoo.dylib");

The JITDylib is owned by the ``ExecutionSession`` instance and will be freed
when it is destroyed.

How to remove code
------------------

To remove an individual module from a JITDylib it must first be added using an
explicit ``ResourceTracker``. The module can then be removed by calling
``ResourceTracker::remove``:

  .. code-block:: c++

    auto &JD = ... ;
    auto M = ... ;

    auto RT = JD.createResourceTracker();
    Layer.add(RT, std::move(M)); // Add M to JD, tracking resources with RT

    RT->remove(); // Remove M from JD.

Modules added directly to a JITDylib will be tracked by that JITDylib's default
resource tracker.

All code can be removed from a JITDylib by calling ``JITDylib::clear``. This
leaves the cleared JITDylib in an empty but usable state.

JITDylibs can be removed by calling ``ExecutionSession::removeJITDylib``. This
clears the JITDylib and then puts it into a defunct state. No further operations
can be performed on the JITDylib, and it will be destroyed as soon as the last
handle to it is released.

An example of how to use the resource management APIs can be found at
``llvm/examples/OrcV2Examples/LLJITRemovableCode``.


How to add support for a custom program representation
--------------------------------------------------------

In order to add support for a custom program representation, a custom
``MaterializationUnit`` for the program representation and a custom ``Layer``
are needed. The Layer will have two operations: ``add`` and ``emit``. The
``add`` operation takes an instance of your program representation, builds one
of your custom ``MaterializationUnits`` to hold it, then adds it to a
``JITDylib``. The ``emit`` operation takes a ``MaterializationResponsibility``
object and an instance of your program representation and materializes it,
usually by compiling it and handing the resulting object off to an
``ObjectLinkingLayer``.

Your custom ``MaterializationUnit`` will have two operations: ``materialize``
and ``discard``. The ``materialize`` function will be called for you when any
symbol provided by the unit is looked up, and it should just call the ``emit``
function on your layer, passing in the given ``MaterializationResponsibility``
and the wrapped program representation. The ``discard`` function will be called
if some weak symbol provided by your unit is not needed (because the JIT found
an overriding definition). You can use this to drop your definition early, or
just ignore it and let the linker drop the definition later.

Here is an example of an AstLayer:

  .. code-block:: c++

    // ... In your JIT class
    AstLayer astLayer;
    // ...


    class AstMaterializationUnit : public llvm::orc::MaterializationUnit {
    public:
      AstMaterializationUnit(AstLayer &l, Ast &ast)
          : llvm::orc::MaterializationUnit(l.getInterface(ast)), astLayer(l),
            ast(ast) {}

      llvm::StringRef getName() const override {
        return "AstMaterializationUnit";
      }

      void materialize(
          std::unique_ptr<llvm::orc::MaterializationResponsibility> r) override {
        astLayer.emit(std::move(r), ast);
      }

    private:
      void discard(const llvm::orc::JITDylib &jd,
                   const llvm::orc::SymbolStringPtr &sym) override {
        llvm_unreachable("functions are not overridable");
      }

      AstLayer &astLayer;
      Ast &ast;
    };

    class AstLayer {
      llvm::orc::IRLayer &baseLayer;
      llvm::orc::MangleAndInterner &mangler;

    public:
      AstLayer(llvm::orc::IRLayer &baseLayer,
               llvm::orc::MangleAndInterner &mangler)
          : baseLayer(baseLayer), mangler(mangler) {}

      llvm::Error add(llvm::orc::ResourceTrackerSP &rt, Ast &ast) {
        return rt->getJITDylib().define(
            std::make_unique<AstMaterializationUnit>(*this, ast), rt);
      }

      void emit(std::unique_ptr<llvm::orc::MaterializationResponsibility> mr,
                Ast &ast) {
        // compileAst is just a function that compiles the given AST and
        // returns an `llvm::orc::ThreadSafeModule`.
        baseLayer.emit(std::move(mr), compileAst(ast));
      }

      llvm::orc::MaterializationUnit::Interface getInterface(Ast &ast) {
        llvm::orc::SymbolFlagsMap Symbols;
        // Find all the symbols in the AST and for each of them
        // add it to the Symbols map.
        Symbols[mangler(someNameFromAST)] = llvm::JITSymbolFlags(
            llvm::JITSymbolFlags::Exported | llvm::JITSymbolFlags::Callable);
        return llvm::orc::MaterializationUnit::Interface(std::move(Symbols),
                                                         nullptr);
      }
    };

Take a look at the source code of `Building A JIT's Chapter 4 <tutorial/BuildingAJIT4.html>`_ for a complete example.

697How to use ThreadSafeModule and ThreadSafeContext
698-------------------------------------------------
699
700ThreadSafeModule and ThreadSafeContext are wrappers around Modules and
701LLVMContexts respectively. A ThreadSafeModule is a pair of a
702std::unique_ptr<Module> and a (possibly shared) ThreadSafeContext value. A
703ThreadSafeContext is a pair of a std::unique_ptr<LLVMContext> and a lock.
704This design serves two purposes: providing a locking scheme and lifetime
705management for LLVMContexts. The ThreadSafeContext may be locked to prevent
706accidental concurrent access by two Modules that use the same LLVMContext.
707The underlying LLVMContext is freed once all ThreadSafeContext values pointing
708to it are destroyed, allowing the context memory to be reclaimed as soon as
709the Modules referring to it are destroyed.
710
711ThreadSafeContexts can be explicitly constructed from a
712std::unique_ptr<LLVMContext>:
713
714  .. code-block:: c++
715
716    ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());
717
ThreadSafeModules can be constructed from a pair of a std::unique_ptr<Module>
and a ThreadSafeContext value. ThreadSafeContext values may be shared between
multiple ThreadSafeModules:

  .. code-block:: c++

    ThreadSafeModule TSM1(
      std::make_unique<Module>("M1", *TSCtx.getContext()), TSCtx);

    ThreadSafeModule TSM2(
      std::make_unique<Module>("M2", *TSCtx.getContext()), TSCtx);

Before using a ThreadSafeContext, clients should ensure that either the context
is only accessible on the current thread, or that the context is locked. In the
example above (where the context is never locked) we rely on the fact that
``TSM1``, ``TSM2``, and ``TSCtx`` are all created on one thread. If a context is
going to be shared between threads then it must be locked before accessing or
creating any Modules attached to it. E.g.

  .. code-block:: c++

    ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());

    DefaultThreadPool TP(NumThreads);
    JITStack J;

    for (auto &ModulePath : ModulePaths) {
      TP.async(
        [&]() {
          auto Lock = TSCtx.getLock();
          auto M = loadModuleOnContext(ModulePath, TSCtx.getContext());
          J.addModule(ThreadSafeModule(std::move(M), TSCtx));
        });
    }

    TP.wait();

To make exclusive access to Modules easier to manage, the ThreadSafeModule class
provides a convenience function, ``withModuleDo``, that implicitly (1) locks the
associated context, (2) runs a given function object, (3) unlocks the context,
and (4) returns the result generated by the function object. E.g.

  .. code-block:: c++

    ThreadSafeModule TSM = getModule(...);

    // Count the functions in the module:
    size_t NumFunctionsInModule =
      TSM.withModuleDo(
        [](Module &M) { // <- Context locked before entering lambda.
          return M.size();
        } // <- Context unlocked after leaving.
      );

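The lock, run, unlock, return sequence can be illustrated without LLVM. The
sketch below is a simplified model under stated assumptions: ``ToyModule``,
``LockedModule`` and ``countFunctions`` are hypothetical names, and the real
``ThreadSafeModule`` may be implemented differently.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for llvm::Module: just a list of function names.
struct ToyModule {
  std::vector<std::string> Functions;
  std::size_t size() const { return Functions.size(); }
};

// Hypothetical wrapper pairing a module with the lock of its shared context.
class LockedModule {
  std::unique_ptr<ToyModule> Mod;
  std::shared_ptr<std::recursive_mutex> CtxLock;

public:
  LockedModule(std::unique_ptr<ToyModule> M,
               std::shared_ptr<std::recursive_mutex> L)
      : Mod(std::move(M)), CtxLock(std::move(L)) {}

  // (1) lock the context, (2) run F on the module, (3) unlock when Guard is
  // destroyed on scope exit, (4) hand F's result back to the caller.
  template <typename Func> decltype(auto) withModuleDo(Func &&F) {
    std::lock_guard<std::recursive_mutex> Guard(*CtxLock);
    return std::forward<Func>(F)(*Mod);
  }
};

// Builds a two-function module and counts its functions under the lock.
std::size_t countFunctions() {
  auto Lock = std::make_shared<std::recursive_mutex>();
  auto M = std::make_unique<ToyModule>();
  M->Functions = {"main", "helper"};
  LockedModule LM(std::move(M), Lock);
  return LM.withModuleDo([](ToyModule &TM) { return TM.size(); });
}
```

Tying the unlock to a scope guard means the context is released even if the
function object throws.
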
Clients wishing to maximize possibilities for concurrent compilation will want
to create every new ThreadSafeModule on a new ThreadSafeContext. For this
reason a convenience constructor for ThreadSafeModule is provided that implicitly
constructs a new ThreadSafeContext value from a std::unique_ptr<LLVMContext>:

  .. code-block:: c++

    // Maximize concurrency opportunities by loading every module on a
    // separate context.
    for (const auto &IRPath : IRPaths) {
      auto Ctx = std::make_unique<LLVMContext>();
      auto M = std::make_unique<Module>("M", *Ctx);
      CompileLayer.add(MainJD, ThreadSafeModule(std::move(M), std::move(Ctx)));
    }

Clients who plan to run single-threaded may choose to save memory by loading
all modules on the same context:

  .. code-block:: c++

    // Save memory by using one context for all Modules:
    ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());
    for (const auto &IRPath : IRPaths) {
      ThreadSafeModule TSM(parsePath(IRPath, *TSCtx.getContext()), TSCtx);
      CompileLayer.add(MainJD, std::move(TSM));
    }

.. _ProcessAndLibrarySymbols:

How to Add Process and Library Symbols to JITDylibs
===================================================

JIT'd code may need to access symbols in the host program or in supporting
libraries. The best way to enable this is to reflect these symbols into your
JITDylibs so that they appear the same as any other symbol defined within the
execution session (i.e. they are findable via ``ExecutionSession::lookup``, and
so visible to the JIT linker during linking).

One way to reflect external symbols is to add them manually using the
absoluteSymbols function:

  .. code-block:: c++

    const DataLayout &DL = getDataLayout();
    MangleAndInterner Mangle(ES, DL);

    auto &JD = ES.createJITDylib("main");

    JD.define(
      absoluteSymbols({
        { Mangle("puts"), ExecutorAddr::fromPtr(&puts)},
        { Mangle("gets"), ExecutorAddr::fromPtr(&gets)}
      }));

Using absoluteSymbols is reasonable if the set of symbols to be reflected is
small and fixed. On the other hand, if the set of symbols is large or variable
it may make more sense to have the definitions added for you on demand by a
*definition generator*. A definition generator is an object that can be attached
to a JITDylib, receiving a callback whenever a lookup within that JITDylib fails
to find one or more symbols. The definition generator is given a chance to
produce a definition of the missing symbol(s) before the lookup proceeds.

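The callback-on-miss behaviour can be modelled in a few lines of standard C++.
This is only a sketch of the idea: ``ToyDylib``, its ``Generator`` signature and
``makeDylibWithProcessSymbols`` are hypothetical, the addresses are made up, and
ORC's real generator interface is asynchronous and considerably richer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified model of a JITDylib symbol table.
class ToyDylib {
public:
  // A generator may add a definition for Name to the dylib it is attached to.
  using Generator = std::function<void(ToyDylib &, const std::string &)>;

  void define(const std::string &Name, std::uint64_t Addr) {
    Symbols[Name] = Addr;
  }
  void addGenerator(Generator G) { Generators.push_back(std::move(G)); }

  // Returns true and sets Addr if Name is defined, consulting the attached
  // generators (in order) only when the initial search misses.
  bool lookup(const std::string &Name, std::uint64_t &Addr) {
    auto It = Symbols.find(Name);
    for (std::size_t I = 0; It == Symbols.end() && I < Generators.size(); ++I) {
      Generators[I](*this, Name); // Give the generator a chance to define it.
      It = Symbols.find(Name);    // Then retry the search.
    }
    if (It == Symbols.end())
      return false;
    Addr = It->second;
    return true;
  }

private:
  std::map<std::string, std::uint64_t> Symbols;
  std::vector<Generator> Generators;
};

// Builds a dylib whose generator reflects symbols from a made-up table that
// stands in for the host process.
ToyDylib makeDylibWithProcessSymbols() {
  static const std::map<std::string, std::uint64_t> ProcessSymbols = {
      {"puts", 0x1000}, {"gets", 0x2000}};
  ToyDylib JD;
  JD.define("main", 0x10);
  JD.addGenerator([](ToyDylib &D, const std::string &Name) {
    auto It = ProcessSymbols.find(Name);
    if (It != ProcessSymbols.end())
      D.define(Name, It->second);
  });
  return JD;
}
```

Symbols that no generator can supply still fail to resolve, so the lookup
reports an error exactly as it would without generators.
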
ORC provides the ``DynamicLibrarySearchGenerator`` utility for reflecting symbols
from the process (or a specific dynamic library) for you. For example, to reflect
the whole interface of a runtime library:

  .. code-block:: c++

    const DataLayout &DL = getDataLayout();
    auto &JD = ES.createJITDylib("main");

    if (auto DLSGOrErr =
        DynamicLibrarySearchGenerator::Load("/path/to/lib",
                                            DL.getGlobalPrefix()))
      JD.addGenerator(std::move(*DLSGOrErr));
    else
      return DLSGOrErr.takeError();

    // IR added to JD can now link against all symbols exported by the library
    // at '/path/to/lib'.
    CompileLayer.add(JD, loadModule(...));

The ``DynamicLibrarySearchGenerator`` utility can also be constructed with a
filter function to restrict the set of symbols that may be reflected. For
example, to expose an allowed set of symbols from the main process:

  .. code-block:: c++

    const DataLayout &DL = getDataLayout();
    MangleAndInterner Mangle(ES, DL);

    auto &JD = ES.createJITDylib("main");

    DenseSet<SymbolStringPtr> AllowList({
        Mangle("puts"),
        Mangle("gets")
      });

    // Use GetForCurrentProcess with a predicate function that checks the
    // allowed list.
    JD.addGenerator(cantFail(DynamicLibrarySearchGenerator::GetForCurrentProcess(
          DL.getGlobalPrefix(),
          [&](const SymbolStringPtr &S) { return AllowList.count(S); })));

    // IR added to JD can now link against any symbols exported by the process
    // and contained in the list.
    CompileLayer.add(JD, loadModule(...));

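The effect of the filter can be shown with a small stand-alone model. Everything
here is hypothetical (``SymbolPredicate``, ``searchProcess``, ``isReflected`` and
the symbol table with its addresses); it only illustrates that a rejected name
is never searched for, even when the process exports it.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <set>
#include <string>

using SymbolPredicate = std::function<bool(const std::string &)>;

// Resolves Name against ProcessSymbols, but only if Allow accepts it. An empty
// predicate means "allow everything".
bool searchProcess(const std::map<std::string, std::uint64_t> &ProcessSymbols,
                   const SymbolPredicate &Allow, const std::string &Name,
                   std::uint64_t &Addr) {
  if (Allow && !Allow(Name))
    return false; // Filtered out before any search takes place.
  auto It = ProcessSymbols.find(Name);
  if (It == ProcessSymbols.end())
    return false; // Allowed, but the process does not export it.
  Addr = It->second;
  return true;
}

// Returns true if Name would be reflected under a {puts, gets} allow list.
bool isReflected(const std::string &Name) {
  const std::map<std::string, std::uint64_t> ProcessSymbols = {
      {"puts", 0x1000}, {"gets", 0x2000}, {"system", 0x3000}};
  const std::set<std::string> AllowList = {"puts", "gets"};
  std::uint64_t Addr = 0;
  return searchProcess(
      ProcessSymbols,
      [&](const std::string &S) { return AllowList.count(S) != 0; }, Name,
      Addr);
}
```

Here ``system`` is exported by the model process but stays invisible to JIT'd
code because the predicate rejects it.
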
References to process or library symbols could also be hardcoded into your IR
or object files using the symbols' raw addresses; however, symbolic resolution
using the JIT symbol tables should be preferred: it keeps the IR and objects
readable and reusable in subsequent JIT sessions. Hardcoded addresses are
difficult to read, and usually only good for one session.

Roadmap
=======

ORC is still undergoing active development. Some current and planned work is
listed below.

Current Work
------------

1. **TargetProcessControl: Improvements to in-tree support for out-of-process
   execution**

   The ``TargetProcessControl`` API provides various operations on the JIT
   target process (the one which will execute the JIT'd code), including
   memory allocation, memory writes, function execution, and process queries
   (e.g. for the target triple). By targeting this API new components can be
   developed which will work equally well for in-process and out-of-process
   JITing.

2. **ORC RPC-based TargetProcessControl implementation**

   An ORC RPC-based implementation of the ``TargetProcessControl`` API is
   currently under development to enable easy out-of-process JITing via
   file descriptors / sockets.

3. **Core State Machine Cleanup**

   The core ORC state machine is currently split between ``JITDylib`` and
   ``ExecutionSession``. Methods are slowly being moved to ``ExecutionSession``.
   This will tidy up the code base, and also allow us to support asynchronous
   removal of JITDylibs (in practice deleting an associated state object in
   ExecutionSession and leaving the JITDylib instance in a defunct state until
   all references to it have been released).

Near Future Work
----------------

1. **ORC JIT Runtime Libraries**

   We need a runtime library for JIT'd code. This would include things like
   TLS registration, reentry functions, registration code for language runtimes
   (e.g. Objective-C and Swift) and other JIT-specific runtime code. This should
   be built in a similar manner to compiler-rt (possibly even as part of it).

2. **Remote jit_dlopen / jit_dlclose**

   To more fully mimic the environment that static programs operate in we would
   like JIT'd code to be able to "dlopen" and "dlclose" JITDylibs, running all of
   their initializers/deinitializers on the current thread. This would require
   support from the runtime library described above.

3. **Debugging support**

   ORC currently supports the GDBRegistrationListener API when using RuntimeDyld
   as the underlying JIT linker. We will need a new solution for JITLink-based
   platforms.

Further Future Work
-------------------

1. **Speculative Compilation**

   ORC's support for concurrent compilation allows us to easily enable
   *speculative* JIT compilation: compilation of code that is not needed yet,
   but which we have reason to believe will be needed in the future. This can be
   used to hide compile latency and improve JIT throughput. A proof-of-concept
   example of speculative compilation with ORC has already been developed (see
   ``llvm/examples/SpeculativeJIT``). Future work on this is likely to focus on
   re-using and improving existing profiling support (currently used by PGO) to
   feed speculation decisions, as well as built-in tools to simplify use of
   speculative compilation.

.. [1] Formats/architectures vary in terms of supported features. MachO and
       ELF tend to have better support than COFF. Patches very welcome!

.. [2] The ``LazyEmittingLayer``, ``RemoteObjectClientLayer`` and
       ``RemoteObjectServerLayer`` do not have counterparts in the new
       system. In the case of ``LazyEmittingLayer`` it was simply no longer
       needed: in ORCv2, deferring compilation until symbols are looked up is
       the default. The removal of ``RemoteObjectClientLayer`` and
       ``RemoteObjectServerLayer`` means that JIT stacks can no longer be split
       across processes; however, this functionality appears not to have been
       used.

.. [3] Weak definitions are currently handled correctly within dylibs, but if
       multiple dylibs provide a weak definition of a symbol then each will end
       up with its own definition (similar to how weak definitions are handled
       in Windows DLLs). This will be fixed in the future.
