xref: /netbsd-src/external/apache2/llvm/dist/llvm/docs/GarbageCollection.rst (revision 82d56013d7b633d116a93943de88e08335357a7c)
17330f729Sjoerg=====================================
27330f729SjoergGarbage Collection with LLVM
37330f729Sjoerg=====================================
47330f729Sjoerg
57330f729Sjoerg.. contents::
67330f729Sjoerg   :local:
77330f729Sjoerg
87330f729SjoergAbstract
97330f729Sjoerg========
107330f729Sjoerg
117330f729SjoergThis document covers how to integrate LLVM into a compiler for a language which
127330f729Sjoergsupports garbage collection.  **Note that LLVM itself does not provide a
137330f729Sjoerggarbage collector.**  You must provide your own.
147330f729Sjoerg
157330f729SjoergQuick Start
167330f729Sjoerg============
177330f729Sjoerg
187330f729SjoergFirst, you should pick a collector strategy.  LLVM includes a number of built
197330f729Sjoergin ones, but you can also implement a loadable plugin with a custom definition.
207330f729SjoergNote that the collector strategy is a description of how LLVM should generate
217330f729Sjoergcode such that it interacts with your collector and runtime, not a description
227330f729Sjoergof the collector itself.
237330f729Sjoerg
247330f729SjoergNext, mark your generated functions as using your chosen collector strategy.
257330f729SjoergFrom c++, you can call:
267330f729Sjoerg
277330f729Sjoerg.. code-block:: c++
287330f729Sjoerg
297330f729Sjoerg  F.setGC(<collector description name>);
307330f729Sjoerg
317330f729Sjoerg
327330f729SjoergThis will produce IR like the following fragment:
337330f729Sjoerg
347330f729Sjoerg.. code-block:: llvm
357330f729Sjoerg
367330f729Sjoerg  define void @foo() gc "<collector description name>" { ... }
377330f729Sjoerg
387330f729Sjoerg
397330f729SjoergWhen generating LLVM IR for your functions, you will need to:
407330f729Sjoerg
417330f729Sjoerg* Use ``@llvm.gcread`` and/or ``@llvm.gcwrite`` in place of standard load and
427330f729Sjoerg  store instructions.  These intrinsics are used to represent load and store
437330f729Sjoerg  barriers.  If you collector does not require such barriers, you can skip
447330f729Sjoerg  this step.
457330f729Sjoerg
467330f729Sjoerg* Use the memory allocation routines provided by your garbage collector's
477330f729Sjoerg  runtime library.
487330f729Sjoerg
497330f729Sjoerg* If your collector requires them, generate type maps according to your
507330f729Sjoerg  runtime's binary interface.  LLVM is not involved in the process.  In
517330f729Sjoerg  particular, the LLVM type system is not suitable for conveying such
527330f729Sjoerg  information though the compiler.
537330f729Sjoerg
547330f729Sjoerg* Insert any coordination code required for interacting with your collector.
557330f729Sjoerg  Many collectors require running application code to periodically check a
567330f729Sjoerg  flag and conditionally call a runtime function.  This is often referred to
577330f729Sjoerg  as a safepoint poll.
587330f729Sjoerg
597330f729SjoergYou will need to identify roots (i.e. references to heap objects your collector
607330f729Sjoergneeds to know about) in your generated IR, so that LLVM can encode them into
617330f729Sjoergyour final stack maps.  Depending on the collector strategy chosen, this is
627330f729Sjoergaccomplished by using either the ``@llvm.gcroot`` intrinsics or an
637330f729Sjoerg``gc.statepoint`` relocation sequence.
647330f729Sjoerg
657330f729SjoergDon't forget to create a root for each intermediate value that is generated when
667330f729Sjoergevaluating an expression.  In ``h(f(), g())``, the result of ``f()`` could
677330f729Sjoergeasily be collected if evaluating ``g()`` triggers a collection.
687330f729Sjoerg
697330f729SjoergFinally, you need to link your runtime library with the generated program
707330f729Sjoergexecutable (for a static compiler) or ensure the appropriate symbols are
717330f729Sjoergavailable for the runtime linker (for a JIT compiler).
727330f729Sjoerg
737330f729Sjoerg
747330f729SjoergIntroduction
757330f729Sjoerg============
767330f729Sjoerg
777330f729SjoergWhat is Garbage Collection?
787330f729Sjoerg---------------------------
797330f729Sjoerg
807330f729SjoergGarbage collection is a widely used technique that frees the programmer from
817330f729Sjoerghaving to know the lifetimes of heap objects, making software easier to produce
827330f729Sjoergand maintain.  Many programming languages rely on garbage collection for
837330f729Sjoergautomatic memory management.  There are two primary forms of garbage collection:
847330f729Sjoergconservative and accurate.
857330f729Sjoerg
867330f729SjoergConservative garbage collection often does not require any special support from
877330f729Sjoergeither the language or the compiler: it can handle non-type-safe programming
887330f729Sjoerglanguages (such as C/C++) and does not require any special information from the
897330f729Sjoergcompiler.  The `Boehm collector
90*82d56013Sjoerg<https://hboehm.info/gc/>`__ is an example of a
917330f729Sjoergstate-of-the-art conservative collector.
927330f729Sjoerg
937330f729SjoergAccurate garbage collection requires the ability to identify all pointers in the
947330f729Sjoergprogram at run-time (which requires that the source-language be type-safe in
957330f729Sjoergmost cases).  Identifying pointers at run-time requires compiler support to
967330f729Sjoerglocate all places that hold live pointer variables at run-time, including the
977330f729Sjoerg:ref:`processor stack and registers <gcroot>`.
987330f729Sjoerg
997330f729SjoergConservative garbage collection is attractive because it does not require any
1007330f729Sjoergspecial compiler support, but it does have problems.  In particular, because the
1017330f729Sjoergconservative garbage collector cannot *know* that a particular word in the
1027330f729Sjoergmachine is a pointer, it cannot move live objects in the heap (preventing the
1037330f729Sjoerguse of compacting and generational GC algorithms) and it can occasionally suffer
1047330f729Sjoergfrom memory leaks due to integer values that happen to point to objects in the
1057330f729Sjoergprogram.  In addition, some aggressive compiler transformations can break
1067330f729Sjoergconservative garbage collectors (though these seem rare in practice).
1077330f729Sjoerg
1087330f729SjoergAccurate garbage collectors do not suffer from any of these problems, but they
1097330f729Sjoergcan suffer from degraded scalar optimization of the program.  In particular,
1107330f729Sjoergbecause the runtime must be able to identify and update all pointers active in
1117330f729Sjoergthe program, some optimizations are less effective.  In practice, however, the
1127330f729Sjoerglocality and performance benefits of using aggressive garbage collection
1137330f729Sjoergtechniques dominates any low-level losses.
1147330f729Sjoerg
1157330f729SjoergThis document describes the mechanisms and interfaces provided by LLVM to
1167330f729Sjoergsupport accurate garbage collection.
1177330f729Sjoerg
1187330f729SjoergGoals and non-goals
1197330f729Sjoerg-------------------
1207330f729Sjoerg
1217330f729SjoergLLVM's intermediate representation provides :ref:`garbage collection intrinsics
1227330f729Sjoerg<gc_intrinsics>` that offer support for a broad class of collector models.  For
1237330f729Sjoerginstance, the intrinsics permit:
1247330f729Sjoerg
1257330f729Sjoerg* semi-space collectors
1267330f729Sjoerg
1277330f729Sjoerg* mark-sweep collectors
1287330f729Sjoerg
1297330f729Sjoerg* generational collectors
1307330f729Sjoerg
1317330f729Sjoerg* incremental collectors
1327330f729Sjoerg
1337330f729Sjoerg* concurrent collectors
1347330f729Sjoerg
1357330f729Sjoerg* cooperative collectors
1367330f729Sjoerg
1377330f729Sjoerg* reference counting
1387330f729Sjoerg
1397330f729SjoergWe hope that the support built into the LLVM IR is sufficient to support a
1407330f729Sjoergbroad class of garbage collected languages including Scheme, ML, Java, C#,
1417330f729SjoergPerl, Python, Lua, Ruby, other scripting languages, and more.
1427330f729Sjoerg
1437330f729SjoergNote that LLVM **does not itself provide a garbage collector** --- this should
1447330f729Sjoergbe part of your language's runtime library.  LLVM provides a framework for
1457330f729Sjoergdescribing the garbage collectors requirements to the compiler.  In particular,
1467330f729SjoergLLVM provides support for generating stack maps at call sites, polling for a
1477330f729Sjoergsafepoint, and emitting load and store barriers.  You can also extend LLVM -
1487330f729Sjoergpossibly through a loadable :ref:`code generation plugins <plugin>` - to
1497330f729Sjoerggenerate code and data structures which conforms to the *binary interface*
1507330f729Sjoergspecified by the *runtime library*.  This is similar to the relationship between
1517330f729SjoergLLVM and DWARF debugging info, for example.  The difference primarily lies in
1527330f729Sjoergthe lack of an established standard in the domain of garbage collection --- thus
1537330f729Sjoergthe need for a flexible extension mechanism.
1547330f729Sjoerg
1557330f729SjoergThe aspects of the binary interface with which LLVM's GC support is
1567330f729Sjoergconcerned are:
1577330f729Sjoerg
1587330f729Sjoerg* Creation of GC safepoints within code where collection is allowed to execute
1597330f729Sjoerg  safely.
1607330f729Sjoerg
1617330f729Sjoerg* Computation of the stack map.  For each safe point in the code, object
1627330f729Sjoerg  references within the stack frame must be identified so that the collector may
1637330f729Sjoerg  traverse and perhaps update them.
1647330f729Sjoerg
1657330f729Sjoerg* Write barriers when storing object references to the heap.  These are commonly
1667330f729Sjoerg  used to optimize incremental scans in generational collectors.
1677330f729Sjoerg
1687330f729Sjoerg* Emission of read barriers when loading object references.  These are useful
1697330f729Sjoerg  for interoperating with concurrent collectors.
1707330f729Sjoerg
1717330f729SjoergThere are additional areas that LLVM does not directly address:
1727330f729Sjoerg
1737330f729Sjoerg* Registration of global roots with the runtime.
1747330f729Sjoerg
1757330f729Sjoerg* Registration of stack map entries with the runtime.
1767330f729Sjoerg
1777330f729Sjoerg* The functions used by the program to allocate memory, trigger a collection,
1787330f729Sjoerg  etc.
1797330f729Sjoerg
1807330f729Sjoerg* Computation or compilation of type maps, or registration of them with the
1817330f729Sjoerg  runtime.  These are used to crawl the heap for object references.
1827330f729Sjoerg
1837330f729SjoergIn general, LLVM's support for GC does not include features which can be
1847330f729Sjoergadequately addressed with other features of the IR and does not specify a
1857330f729Sjoergparticular binary interface.  On the plus side, this means that you should be
1867330f729Sjoergable to integrate LLVM with an existing runtime.  On the other hand, it can
1877330f729Sjoerghave the effect of leaving a lot of work for the developer of a novel
1887330f729Sjoerglanguage.  We try to mitigate this by providing built in collector strategy
1897330f729Sjoergdescriptions that can work with many common collector designs and easy
1907330f729Sjoergextension points.  If you don't already have a specific binary interface
1917330f729Sjoergyou need to support, we recommend trying to use one of these built in collector
1927330f729Sjoergstrategies.
1937330f729Sjoerg
1947330f729Sjoerg.. _gc_intrinsics:
1957330f729Sjoerg
1967330f729SjoergLLVM IR Features
1977330f729Sjoerg================
1987330f729Sjoerg
1997330f729SjoergThis section describes the garbage collection facilities provided by the
2007330f729Sjoerg:doc:`LLVM intermediate representation <LangRef>`.  The exact behavior of these
2017330f729SjoergIR features is specified by the selected :ref:`GC strategy description
2027330f729Sjoerg<plugin>`.
2037330f729Sjoerg
2047330f729SjoergSpecifying GC code generation: ``gc "..."``
2057330f729Sjoerg-------------------------------------------
2067330f729Sjoerg
2077330f729Sjoerg.. code-block:: text
2087330f729Sjoerg
2097330f729Sjoerg  define <returntype> @name(...) gc "name" { ... }
2107330f729Sjoerg
2117330f729SjoergThe ``gc`` function attribute is used to specify the desired GC strategy to the
2127330f729Sjoergcompiler.  Its programmatic equivalent is the ``setGC`` method of ``Function``.
2137330f729Sjoerg
2147330f729SjoergSetting ``gc "name"`` on a function triggers a search for a matching subclass
2157330f729Sjoergof GCStrategy.  Some collector strategies are built in.  You can add others
2167330f729Sjoergusing either the loadable plugin mechanism, or by patching your copy of LLVM.
2177330f729SjoergIt is the selected GC strategy which defines the exact nature of the code
2187330f729Sjoerggenerated to support GC.  If none is found, the compiler will raise an error.
2197330f729Sjoerg
2207330f729SjoergSpecifying the GC style on a per-function basis allows LLVM to link together
2217330f729Sjoergprograms that use different garbage collection algorithms (or none at all).
2227330f729Sjoerg
2237330f729Sjoerg.. _gcroot:
2247330f729Sjoerg
2257330f729SjoergIdentifying GC roots on the stack
2267330f729Sjoerg----------------------------------
2277330f729Sjoerg
2287330f729SjoergLLVM currently supports two different mechanisms for describing references in
2297330f729Sjoergcompiled code at safepoints.  ``llvm.gcroot`` is the older mechanism;
2307330f729Sjoerg``gc.statepoint`` has been added more recently.  At the moment, you can choose
2317330f729Sjoergeither implementation (on a per :ref:`GC strategy <plugin>` basis).  Longer
2327330f729Sjoergterm, we will probably either migrate away from ``llvm.gcroot`` entirely, or
2337330f729Sjoergsubstantially merge their implementations. Note that most new development
2347330f729Sjoergwork is focused on ``gc.statepoint``.
2357330f729Sjoerg
2367330f729SjoergUsing ``gc.statepoint``
2377330f729Sjoerg^^^^^^^^^^^^^^^^^^^^^^^^
2387330f729Sjoerg:doc:`This page <Statepoints>` contains detailed documentation for
2397330f729Sjoerg``gc.statepoint``.
2407330f729Sjoerg
2417330f729SjoergUsing ``llvm.gcwrite``
2427330f729Sjoerg^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2437330f729Sjoerg
2447330f729Sjoerg.. code-block:: llvm
2457330f729Sjoerg
2467330f729Sjoerg  void @llvm.gcroot(i8** %ptrloc, i8* %metadata)
2477330f729Sjoerg
2487330f729SjoergThe ``llvm.gcroot`` intrinsic is used to inform LLVM that a stack variable
2497330f729Sjoergreferences an object on the heap and is to be tracked for garbage collection.
2507330f729SjoergThe exact impact on generated code is specified by the Function's selected
2517330f729Sjoerg:ref:`GC strategy <plugin>`.  All calls to ``llvm.gcroot`` **must** reside
2527330f729Sjoerginside the first basic block.
2537330f729Sjoerg
2547330f729SjoergThe first argument **must** be a value referring to an alloca instruction or a
2557330f729Sjoergbitcast of an alloca.  The second contains a pointer to metadata that should be
2567330f729Sjoergassociated with the pointer, and **must** be a constant or global value
2577330f729Sjoergaddress.  If your target collector uses tags, use a null pointer for metadata.
2587330f729Sjoerg
2597330f729SjoergA compiler which performs manual SSA construction **must** ensure that SSA
2607330f729Sjoergvalues representing GC references are stored in to the alloca passed to the
2617330f729Sjoergrespective ``gcroot`` before every call site and reloaded after every call.
2627330f729SjoergA compiler which uses mem2reg to raise imperative code using ``alloca`` into
2637330f729SjoergSSA form need only add a call to ``@llvm.gcroot`` for those variables which
2647330f729Sjoergare pointers into the GC heap.
2657330f729Sjoerg
2667330f729SjoergIt is also important to mark intermediate values with ``llvm.gcroot``.  For
2677330f729Sjoergexample, consider ``h(f(), g())``.  Beware leaking the result of ``f()`` in the
2687330f729Sjoergcase that ``g()`` triggers a collection.  Note, that stack variables must be
2697330f729Sjoerginitialized and marked with ``llvm.gcroot`` in function's prologue.
2707330f729Sjoerg
2717330f729SjoergThe ``%metadata`` argument can be used to avoid requiring heap objects to have
2727330f729Sjoerg'isa' pointers or tag bits. [Appel89_, Goldberg91_, Tolmach94_] If specified,
2737330f729Sjoergits value will be tracked along with the location of the pointer in the stack
2747330f729Sjoergframe.
2757330f729Sjoerg
2767330f729SjoergConsider the following fragment of Java code:
2777330f729Sjoerg
2787330f729Sjoerg.. code-block:: java
2797330f729Sjoerg
2807330f729Sjoerg   {
2817330f729Sjoerg     Object X;   // A null-initialized reference to an object
2827330f729Sjoerg     ...
2837330f729Sjoerg   }
2847330f729Sjoerg
2857330f729SjoergThis block (which may be located in the middle of a function or in a loop nest),
2867330f729Sjoergcould be compiled to this LLVM code:
2877330f729Sjoerg
2887330f729Sjoerg.. code-block:: llvm
2897330f729Sjoerg
2907330f729Sjoerg  Entry:
2917330f729Sjoerg     ;; In the entry block for the function, allocate the
2927330f729Sjoerg     ;; stack space for X, which is an LLVM pointer.
2937330f729Sjoerg     %X = alloca %Object*
2947330f729Sjoerg
2957330f729Sjoerg     ;; Tell LLVM that the stack space is a stack root.
2967330f729Sjoerg     ;; Java has type-tags on objects, so we pass null as metadata.
2977330f729Sjoerg     %tmp = bitcast %Object** %X to i8**
2987330f729Sjoerg     call void @llvm.gcroot(i8** %tmp, i8* null)
2997330f729Sjoerg     ...
3007330f729Sjoerg
3017330f729Sjoerg     ;; "CodeBlock" is the block corresponding to the start
3027330f729Sjoerg     ;;  of the scope above.
3037330f729Sjoerg  CodeBlock:
3047330f729Sjoerg     ;; Java null-initializes pointers.
3057330f729Sjoerg     store %Object* null, %Object** %X
3067330f729Sjoerg
3077330f729Sjoerg     ...
3087330f729Sjoerg
3097330f729Sjoerg     ;; As the pointer goes out of scope, store a null value into
3107330f729Sjoerg     ;; it, to indicate that the value is no longer live.
3117330f729Sjoerg     store %Object* null, %Object** %X
3127330f729Sjoerg     ...
3137330f729Sjoerg
3147330f729SjoergReading and writing references in the heap
3157330f729Sjoerg------------------------------------------
3167330f729Sjoerg
3177330f729SjoergSome collectors need to be informed when the mutator (the program that needs
3187330f729Sjoerggarbage collection) either reads a pointer from or writes a pointer to a field
3197330f729Sjoergof a heap object.  The code fragments inserted at these points are called *read
3207330f729Sjoergbarriers* and *write barriers*, respectively.  The amount of code that needs to
3217330f729Sjoergbe executed is usually quite small and not on the critical path of any
3227330f729Sjoergcomputation, so the overall performance impact of the barrier is tolerable.
3237330f729Sjoerg
3247330f729SjoergBarriers often require access to the *object pointer* rather than the *derived
3257330f729Sjoergpointer* (which is a pointer to the field within the object).  Accordingly,
3267330f729Sjoergthese intrinsics take both pointers as separate arguments for completeness.  In
3277330f729Sjoergthis snippet, ``%object`` is the object pointer, and ``%derived`` is the derived
3287330f729Sjoergpointer:
3297330f729Sjoerg
3307330f729Sjoerg.. code-block:: llvm
3317330f729Sjoerg
3327330f729Sjoerg  ;; An array type.
3337330f729Sjoerg  %class.Array = type { %class.Object, i32, [0 x %class.Object*] }
3347330f729Sjoerg  ...
3357330f729Sjoerg
3367330f729Sjoerg  ;; Load the object pointer from a gcroot.
3377330f729Sjoerg  %object = load %class.Array** %object_addr
3387330f729Sjoerg
3397330f729Sjoerg  ;; Compute the derived pointer.
3407330f729Sjoerg  %derived = getelementptr %object, i32 0, i32 2, i32 %n
3417330f729Sjoerg
3427330f729SjoergLLVM does not enforce this relationship between the object and derived pointer
3437330f729Sjoerg(although a particular :ref:`collector strategy <plugin>` might).  However, it
3447330f729Sjoergwould be an unusual collector that violated it.
3457330f729Sjoerg
3467330f729SjoergThe use of these intrinsics is naturally optional if the target GC does not
3477330f729Sjoergrequire the corresponding barrier.  The GC strategy used with such a collector
3487330f729Sjoergshould replace the intrinsic calls with the corresponding ``load`` or
3497330f729Sjoerg``store`` instruction if they are used.
3507330f729Sjoerg
3517330f729SjoergOne known deficiency with the current design is that the barrier intrinsics do
3527330f729Sjoergnot include the size or alignment of the underlying operation performed.  It is
3537330f729Sjoergcurrently assumed that the operation is of pointer size and the alignment is
3547330f729Sjoergassumed to be the target machine's default alignment.
3557330f729Sjoerg
3567330f729SjoergWrite barrier: ``llvm.gcwrite``
3577330f729Sjoerg^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
3587330f729Sjoerg
3597330f729Sjoerg.. code-block:: llvm
3607330f729Sjoerg
3617330f729Sjoerg  void @llvm.gcwrite(i8* %value, i8* %object, i8** %derived)
3627330f729Sjoerg
3637330f729SjoergFor write barriers, LLVM provides the ``llvm.gcwrite`` intrinsic function.  It
3647330f729Sjoerghas exactly the same semantics as a non-volatile ``store`` to the derived
3657330f729Sjoergpointer (the third argument).  The exact code generated is specified by the
3667330f729SjoergFunction's selected :ref:`GC strategy <plugin>`.
3677330f729Sjoerg
3687330f729SjoergMany important algorithms require write barriers, including generational and
3697330f729Sjoergconcurrent collectors.  Additionally, write barriers could be used to implement
3707330f729Sjoergreference counting.
3717330f729Sjoerg
3727330f729SjoergRead barrier: ``llvm.gcread``
3737330f729Sjoerg^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
3747330f729Sjoerg
3757330f729Sjoerg.. code-block:: llvm
3767330f729Sjoerg
3777330f729Sjoerg  i8* @llvm.gcread(i8* %object, i8** %derived)
3787330f729Sjoerg
3797330f729SjoergFor read barriers, LLVM provides the ``llvm.gcread`` intrinsic function.  It has
3807330f729Sjoergexactly the same semantics as a non-volatile ``load`` from the derived pointer
3817330f729Sjoerg(the second argument).  The exact code generated is specified by the Function's
3827330f729Sjoergselected :ref:`GC strategy <plugin>`.
3837330f729Sjoerg
3847330f729SjoergRead barriers are needed by fewer algorithms than write barriers, and may have a
3857330f729Sjoerggreater performance impact since pointer reads are more frequent than writes.
3867330f729Sjoerg
3877330f729Sjoerg.. _plugin:
3887330f729Sjoerg
3897330f729Sjoerg.. _builtin-gc-strategies:
3907330f729Sjoerg
3917330f729SjoergBuilt In GC Strategies
3927330f729Sjoerg======================
3937330f729Sjoerg
3947330f729SjoergLLVM includes built in support for several varieties of garbage collectors.
3957330f729Sjoerg
3967330f729SjoergThe Shadow Stack GC
3977330f729Sjoerg----------------------
3987330f729Sjoerg
3997330f729SjoergTo use this collector strategy, mark your functions with:
4007330f729Sjoerg
4017330f729Sjoerg.. code-block:: c++
4027330f729Sjoerg
4037330f729Sjoerg  F.setGC("shadow-stack");
4047330f729Sjoerg
4057330f729SjoergUnlike many GC algorithms which rely on a cooperative code generator to compile
4067330f729Sjoergstack maps, this algorithm carefully maintains a linked list of stack roots
4077330f729Sjoerg[:ref:`Henderson2002 <henderson02>`].  This so-called "shadow stack" mirrors the
4087330f729Sjoergmachine stack.  Maintaining this data structure is slower than using a stack map
4097330f729Sjoergcompiled into the executable as constant data, but has a significant portability
4107330f729Sjoergadvantage because it requires no special support from the target code generator,
4117330f729Sjoergand does not require tricky platform-specific code to crawl the machine stack.
4127330f729Sjoerg
4137330f729SjoergThe tradeoff for this simplicity and portability is:
4147330f729Sjoerg
4157330f729Sjoerg* High overhead per function call.
4167330f729Sjoerg
4177330f729Sjoerg* Not thread-safe.
4187330f729Sjoerg
4197330f729SjoergStill, it's an easy way to get started.  After your compiler and runtime are up
4207330f729Sjoergand running, writing a :ref:`plugin <plugin>` will allow you to take advantage
4217330f729Sjoergof :ref:`more advanced GC features <collector-algos>` of LLVM in order to
4227330f729Sjoergimprove performance.
4237330f729Sjoerg
4247330f729Sjoerg
4257330f729SjoergThe shadow stack doesn't imply a memory allocation algorithm.  A semispace
4267330f729Sjoergcollector or building atop ``malloc`` are great places to start, and can be
4277330f729Sjoergimplemented with very little code.
4287330f729Sjoerg
4297330f729SjoergWhen it comes time to collect, however, your runtime needs to traverse the stack
4307330f729Sjoergroots, and for this it needs to integrate with the shadow stack.  Luckily, doing
4317330f729Sjoergso is very simple. (This code is heavily commented to help you understand the
4327330f729Sjoergdata structure, but there are only 20 lines of meaningful code.)
4337330f729Sjoerg
4347330f729Sjoerg.. code-block:: c++
4357330f729Sjoerg
4367330f729Sjoerg  /// The map for a single function's stack frame.  One of these is
4377330f729Sjoerg  ///        compiled as constant data into the executable for each function.
4387330f729Sjoerg  ///
4397330f729Sjoerg  /// Storage of metadata values is elided if the %metadata parameter to
4407330f729Sjoerg  /// @llvm.gcroot is null.
4417330f729Sjoerg  struct FrameMap {
4427330f729Sjoerg    int32_t NumRoots;    //< Number of roots in stack frame.
4437330f729Sjoerg    int32_t NumMeta;     //< Number of metadata entries.  May be < NumRoots.
4447330f729Sjoerg    const void *Meta[0]; //< Metadata for each root.
4457330f729Sjoerg  };
4467330f729Sjoerg
4477330f729Sjoerg  /// A link in the dynamic shadow stack.  One of these is embedded in
4487330f729Sjoerg  ///        the stack frame of each function on the call stack.
4497330f729Sjoerg  struct StackEntry {
4507330f729Sjoerg    StackEntry *Next;    //< Link to next stack entry (the caller's).
4517330f729Sjoerg    const FrameMap *Map; //< Pointer to constant FrameMap.
4527330f729Sjoerg    void *Roots[0];      //< Stack roots (in-place array).
4537330f729Sjoerg  };
4547330f729Sjoerg
4557330f729Sjoerg  /// The head of the singly-linked list of StackEntries.  Functions push
4567330f729Sjoerg  ///        and pop onto this in their prologue and epilogue.
4577330f729Sjoerg  ///
4587330f729Sjoerg  /// Since there is only a global list, this technique is not threadsafe.
4597330f729Sjoerg  StackEntry *llvm_gc_root_chain;
4607330f729Sjoerg
4617330f729Sjoerg  /// Calls Visitor(root, meta) for each GC root on the stack.
4627330f729Sjoerg  ///        root and meta are exactly the values passed to
4637330f729Sjoerg  ///        @llvm.gcroot.
4647330f729Sjoerg  ///
4657330f729Sjoerg  /// Visitor could be a function to recursively mark live objects.  Or it
4667330f729Sjoerg  /// might copy them to another heap or generation.
4677330f729Sjoerg  ///
4687330f729Sjoerg  /// @param Visitor A function to invoke for every GC root on the stack.
4697330f729Sjoerg  void visitGCRoots(void (*Visitor)(void **Root, const void *Meta)) {
4707330f729Sjoerg    for (StackEntry *R = llvm_gc_root_chain; R; R = R->Next) {
4717330f729Sjoerg      unsigned i = 0;
4727330f729Sjoerg
4737330f729Sjoerg      // For roots [0, NumMeta), the metadata pointer is in the FrameMap.
4747330f729Sjoerg      for (unsigned e = R->Map->NumMeta; i != e; ++i)
4757330f729Sjoerg        Visitor(&R->Roots[i], R->Map->Meta[i]);
4767330f729Sjoerg
4777330f729Sjoerg      // For roots [NumMeta, NumRoots), the metadata pointer is null.
4787330f729Sjoerg      for (unsigned e = R->Map->NumRoots; i != e; ++i)
4797330f729Sjoerg        Visitor(&R->Roots[i], NULL);
4807330f729Sjoerg    }
4817330f729Sjoerg  }
4827330f729Sjoerg
4837330f729Sjoerg
4847330f729SjoergThe 'Erlang' and 'Ocaml' GCs
4857330f729Sjoerg-----------------------------
4867330f729Sjoerg
4877330f729SjoergLLVM ships with two example collectors which leverage the ``gcroot``
4887330f729Sjoergmechanisms.  To our knowledge, these are not actually used by any language
4897330f729Sjoergruntime, but they do provide a reasonable starting point for someone interested
4907330f729Sjoergin writing an ``gcroot`` compatible GC plugin.  In particular, these are the
4917330f729Sjoergonly in tree examples of how to produce a custom binary stack map format using
4927330f729Sjoerga ``gcroot`` strategy.
4937330f729Sjoerg
4947330f729SjoergAs there names imply, the binary format produced is intended to model that
4957330f729Sjoergused by the Erlang and OCaml compilers respectively.
4967330f729Sjoerg
4977330f729Sjoerg.. _statepoint_example_gc:
4987330f729Sjoerg
4997330f729SjoergThe Statepoint Example GC
5007330f729Sjoerg-------------------------
5017330f729Sjoerg
5027330f729Sjoerg.. code-block:: c++
5037330f729Sjoerg
5047330f729Sjoerg  F.setGC("statepoint-example");
5057330f729Sjoerg
5067330f729SjoergThis GC provides an example of how one might use the infrastructure provided
5077330f729Sjoergby ``gc.statepoint``. This example GC is compatible with the
5087330f729Sjoerg:ref:`PlaceSafepoints` and :ref:`RewriteStatepointsForGC` utility passes
5097330f729Sjoergwhich simplify ``gc.statepoint`` sequence insertion. If you need to build a
5107330f729Sjoergcustom GC strategy around the ``gc.statepoints`` mechanisms, it is recommended
5117330f729Sjoergthat you use this one as a starting point.
5127330f729Sjoerg
5137330f729SjoergThis GC strategy does not support read or write barriers.  As a result, these
5147330f729Sjoergintrinsics are lowered to normal loads and stores.
5157330f729Sjoerg
5167330f729SjoergThe stack map format generated by this GC strategy can be found in the
5177330f729Sjoerg:ref:`stackmap-section` using a format documented :ref:`here
5187330f729Sjoerg<statepoint-stackmap-format>`. This format is intended to be the standard
5197330f729Sjoergformat supported by LLVM going forward.
5207330f729Sjoerg
5217330f729SjoergThe CoreCLR GC
5227330f729Sjoerg-------------------------
5237330f729Sjoerg
5247330f729Sjoerg.. code-block:: c++
5257330f729Sjoerg
5267330f729Sjoerg  F.setGC("coreclr");
5277330f729Sjoerg
5287330f729SjoergThis GC leverages the ``gc.statepoint`` mechanism to support the
5297330f729Sjoerg`CoreCLR <https://github.com/dotnet/coreclr>`__ runtime.
5307330f729Sjoerg
5317330f729SjoergSupport for this GC strategy is a work in progress. This strategy will
5327330f729Sjoergdiffer from
5337330f729Sjoerg:ref:`statepoint-example GC<statepoint_example_gc>` strategy in
5347330f729Sjoergcertain aspects like:
5357330f729Sjoerg
5367330f729Sjoerg* Base-pointers of interior pointers are not explicitly
5377330f729Sjoerg  tracked and reported.
5387330f729Sjoerg
5397330f729Sjoerg* A different format is used for encoding stack maps.
5407330f729Sjoerg
5417330f729Sjoerg* Safe-point polls are only needed before loop-back edges
5427330f729Sjoerg  and before tail-calls (not needed at function-entry).
5437330f729Sjoerg
5447330f729SjoergCustom GC Strategies
5457330f729Sjoerg====================
5467330f729Sjoerg
5477330f729SjoergIf none of the built in GC strategy descriptions met your needs above, you will
5487330f729Sjoergneed to define a custom GCStrategy and possibly, a custom LLVM pass to perform
5497330f729Sjoerglowering.  Your best example of where to start defining a custom GCStrategy
5507330f729Sjoergwould be to look at one of the built in strategies.
5517330f729Sjoerg
5527330f729SjoergYou may be able to structure this additional code as a loadable plugin library.
5537330f729SjoergLoadable plugins are sufficient if all you need is to enable a different
5547330f729Sjoergcombination of built in functionality, but if you need to provide a custom
5557330f729Sjoerglowering pass, you will need to build a patched version of LLVM.  If you think
5567330f729Sjoergyou need a patched build, please ask for advice on llvm-dev.  There may be an
5577330f729Sjoergeasy way we can extend the support to make it work for your use case without
5587330f729Sjoergrequiring a custom build.
5597330f729Sjoerg
5607330f729SjoergCollector Requirements
5617330f729Sjoerg----------------------
5627330f729Sjoerg
5637330f729SjoergYou should be able to leverage any existing collector library that includes the following elements:
5647330f729Sjoerg
5657330f729Sjoerg#. A memory allocator which exposes an allocation function your compiled
5667330f729Sjoerg   code can call.
5677330f729Sjoerg
5687330f729Sjoerg#. A binary format for the stack map.  A stack map describes the location
5697330f729Sjoerg   of references at a safepoint and is used by precise collectors to identify
5707330f729Sjoerg   references within a stack frame on the machine stack. Note that collectors
5717330f729Sjoerg   which conservatively scan the stack don't require such a structure.
5727330f729Sjoerg
5737330f729Sjoerg#. A stack crawler to discover functions on the call stack, and enumerate the
5747330f729Sjoerg   references listed in the stack map for each call site.
5757330f729Sjoerg
5767330f729Sjoerg#. A mechanism for identifying references in global locations (e.g. global
5777330f729Sjoerg   variables).
5787330f729Sjoerg
5797330f729Sjoerg#. If you collector requires them, an LLVM IR implementation of your collectors
5807330f729Sjoerg   load and store barriers.  Note that since many collectors don't require
5817330f729Sjoerg   barriers at all, LLVM defaults to lowering such barriers to normal loads
5827330f729Sjoerg   and stores unless you arrange otherwise.
5837330f729Sjoerg
5847330f729Sjoerg
5857330f729SjoergImplementing a collector plugin
5867330f729Sjoerg-------------------------------
5877330f729Sjoerg
5887330f729SjoergUser code specifies which GC code generation to use with the ``gc`` function
5897330f729Sjoergattribute or, equivalently, with the ``setGC`` method of ``Function``.
5907330f729Sjoerg
5917330f729SjoergTo implement a GC plugin, it is necessary to subclass ``llvm::GCStrategy``,
5927330f729Sjoergwhich can be accomplished in a few lines of boilerplate code.  LLVM's
5937330f729Sjoerginfrastructure provides access to several important algorithms.  For an
5947330f729Sjoerguncontroversial collector, all that remains may be to compile LLVM's computed
5957330f729Sjoergstack map to assembly code (using the binary representation expected by the
5967330f729Sjoergruntime library).  This can be accomplished in about 100 lines of code.
5977330f729Sjoerg
5987330f729SjoergThis is not the appropriate place to implement a garbage collected heap or a
5997330f729Sjoerggarbage collector itself.  That code should exist in the language's runtime
6007330f729Sjoerglibrary.  The compiler plugin is responsible for generating code which conforms
6017330f729Sjoergto the binary interface defined by library, most essentially the :ref:`stack map
6027330f729Sjoerg<stack-map>`.
6037330f729Sjoerg
6047330f729SjoergTo subclass ``llvm::GCStrategy`` and register it with the compiler:
6057330f729Sjoerg
6067330f729Sjoerg.. code-block:: c++
6077330f729Sjoerg
6087330f729Sjoerg  // lib/MyGC/MyGC.cpp - Example LLVM GC plugin
6097330f729Sjoerg
6107330f729Sjoerg  #include "llvm/CodeGen/GCStrategy.h"
6117330f729Sjoerg  #include "llvm/CodeGen/GCMetadata.h"
6127330f729Sjoerg  #include "llvm/Support/Compiler.h"
6137330f729Sjoerg
6147330f729Sjoerg  using namespace llvm;
6157330f729Sjoerg
6167330f729Sjoerg  namespace {
6177330f729Sjoerg    class LLVM_LIBRARY_VISIBILITY MyGC : public GCStrategy {
6187330f729Sjoerg    public:
6197330f729Sjoerg      MyGC() {}
6207330f729Sjoerg    };
6217330f729Sjoerg
6227330f729Sjoerg    GCRegistry::Add<MyGC>
6237330f729Sjoerg    X("mygc", "My bespoke garbage collector.");
6247330f729Sjoerg  }
6257330f729Sjoerg
6267330f729SjoergThis boilerplate collector does nothing.  More specifically:
6277330f729Sjoerg
6287330f729Sjoerg* ``llvm.gcread`` calls are replaced with the corresponding ``load``
6297330f729Sjoerg  instruction.
6307330f729Sjoerg
6317330f729Sjoerg* ``llvm.gcwrite`` calls are replaced with the corresponding ``store``
6327330f729Sjoerg  instruction.
6337330f729Sjoerg
6347330f729Sjoerg* No safe points are added to the code.
6357330f729Sjoerg
6367330f729Sjoerg* The stack map is not compiled into the executable.
6377330f729Sjoerg
6387330f729SjoergUsing the LLVM makefiles, this code
6397330f729Sjoergcan be compiled as a plugin using a simple makefile:
6407330f729Sjoerg
6417330f729Sjoerg.. code-block:: make
6427330f729Sjoerg
6437330f729Sjoerg  # lib/MyGC/Makefile
6447330f729Sjoerg
6457330f729Sjoerg  LEVEL := ../..
6467330f729Sjoerg  LIBRARYNAME = MyGC
6477330f729Sjoerg  LOADABLE_MODULE = 1
6487330f729Sjoerg
6497330f729Sjoerg  include $(LEVEL)/Makefile.common
6507330f729Sjoerg
6517330f729SjoergOnce the plugin is compiled, code using it may be compiled using ``llc
6527330f729Sjoerg-load=MyGC.so`` (though MyGC.so may have some other platform-specific
6537330f729Sjoergextension):
6547330f729Sjoerg
6557330f729Sjoerg::
6567330f729Sjoerg
6577330f729Sjoerg  $ cat sample.ll
6587330f729Sjoerg  define void @f() gc "mygc" {
6597330f729Sjoerg  entry:
6607330f729Sjoerg    ret void
6617330f729Sjoerg  }
6627330f729Sjoerg  $ llvm-as < sample.ll | llc -load=MyGC.so
6637330f729Sjoerg
6647330f729SjoergIt is also possible to statically link the collector plugin into tools, such as
6657330f729Sjoerga language-specific compiler front-end.
6667330f729Sjoerg
6677330f729Sjoerg.. _collector-algos:
6687330f729Sjoerg
6697330f729SjoergOverview of available features
6707330f729Sjoerg------------------------------
6717330f729Sjoerg
6727330f729Sjoerg``GCStrategy`` provides a range of features through which a plugin may do useful
6737330f729Sjoergwork.  Some of these are callbacks, some are algorithms that can be enabled,
6747330f729Sjoergdisabled, or customized.  This matrix summarizes the supported (and planned)
6757330f729Sjoergfeatures and correlates them with the collection techniques which typically
6767330f729Sjoergrequire them.
6777330f729Sjoerg
6787330f729Sjoerg.. |v| unicode:: 0x2714
6797330f729Sjoerg   :trim:
6807330f729Sjoerg
6817330f729Sjoerg.. |x| unicode:: 0x2718
6827330f729Sjoerg   :trim:
6837330f729Sjoerg
6847330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
6857330f729Sjoerg| Algorithm  | Done | Shadow | refcount | mark- | copying | incremental | threaded | concurrent |
6867330f729Sjoerg|            |      | stack  |          | sweep |         |             |          |            |
6877330f729Sjoerg+============+======+========+==========+=======+=========+=============+==========+============+
6887330f729Sjoerg| stack map  | |v|  |        |          | |x|   | |x|     | |x|         | |x|      | |x|        |
6897330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
6907330f729Sjoerg| initialize | |v|  | |x|    | |x|      | |x|   | |x|     | |x|         | |x|      | |x|        |
6917330f729Sjoerg| roots      |      |        |          |       |         |             |          |            |
6927330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
6937330f729Sjoerg| derived    | NO   |        |          |       |         |             | **N**\*  | **N**\*    |
6947330f729Sjoerg| pointers   |      |        |          |       |         |             |          |            |
6957330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
6967330f729Sjoerg| **custom   | |v|  |        |          |       |         |             |          |            |
6977330f729Sjoerg| lowering** |      |        |          |       |         |             |          |            |
6987330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
6997330f729Sjoerg| *gcroot*   | |v|  | |x|    | |x|      |       |         |             |          |            |
7007330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
7017330f729Sjoerg| *gcwrite*  | |v|  |        | |x|      |       |         | |x|         |          | |x|        |
7027330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
7037330f729Sjoerg| *gcread*   | |v|  |        |          |       |         |             |          | |x|        |
7047330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
7057330f729Sjoerg| **safe     |      |        |          |       |         |             |          |            |
7067330f729Sjoerg| points**   |      |        |          |       |         |             |          |            |
7077330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
7087330f729Sjoerg| *in        | |v|  |        |          | |x|   | |x|     | |x|         | |x|      | |x|        |
7097330f729Sjoerg| calls*     |      |        |          |       |         |             |          |            |
7107330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
7117330f729Sjoerg| *before    | |v|  |        |          |       |         |             | |x|      | |x|        |
7127330f729Sjoerg| calls*     |      |        |          |       |         |             |          |            |
7137330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
7147330f729Sjoerg| *for       | NO   |        |          |       |         |             | **N**    | **N**      |
7157330f729Sjoerg| loops*     |      |        |          |       |         |             |          |            |
7167330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
7177330f729Sjoerg| *before    | |v|  |        |          |       |         |             | |x|      | |x|        |
7187330f729Sjoerg| escape*    |      |        |          |       |         |             |          |            |
7197330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
7207330f729Sjoerg| emit code  | NO   |        |          |       |         |             | **N**    | **N**      |
7217330f729Sjoerg| at safe    |      |        |          |       |         |             |          |            |
7227330f729Sjoerg| points     |      |        |          |       |         |             |          |            |
7237330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
7247330f729Sjoerg| **output** |      |        |          |       |         |             |          |            |
7257330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
7267330f729Sjoerg| *assembly* | |v|  |        |          | |x|   | |x|     | |x|         | |x|      | |x|        |
7277330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
7287330f729Sjoerg| *JIT*      | NO   |        |          | **?** | **?**   | **?**       | **?**    | **?**      |
7297330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
7307330f729Sjoerg| *obj*      | NO   |        |          | **?** | **?**   | **?**       | **?**    | **?**      |
7317330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
7327330f729Sjoerg| live       | NO   |        |          | **?** | **?**   | **?**       | **?**    | **?**      |
7337330f729Sjoerg| analysis   |      |        |          |       |         |             |          |            |
7347330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
7357330f729Sjoerg| register   | NO   |        |          | **?** | **?**   | **?**       | **?**    | **?**      |
7367330f729Sjoerg| map        |      |        |          |       |         |             |          |            |
7377330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
7387330f729Sjoerg| \* Derived pointers only pose a hasard to copying collections.                                |
7397330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
7407330f729Sjoerg| **?** denotes a feature which could be utilized if available.                                 |
7417330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+
7427330f729Sjoerg
7437330f729SjoergTo be clear, the collection techniques above are defined as:
7447330f729Sjoerg
7457330f729SjoergShadow Stack
7467330f729Sjoerg  The mutator carefully maintains a linked list of stack roots.
7477330f729Sjoerg
7487330f729SjoergReference Counting
7497330f729Sjoerg  The mutator maintains a reference count for each object and frees an object
7507330f729Sjoerg  when its count falls to zero.
7517330f729Sjoerg
7527330f729SjoergMark-Sweep
7537330f729Sjoerg  When the heap is exhausted, the collector marks reachable objects starting
7547330f729Sjoerg  from the roots, then deallocates unreachable objects in a sweep phase.
7557330f729Sjoerg
7567330f729SjoergCopying
7577330f729Sjoerg  As reachability analysis proceeds, the collector copies objects from one heap
7587330f729Sjoerg  area to another, compacting them in the process.  Copying collectors enable
7597330f729Sjoerg  highly efficient "bump pointer" allocation and can improve locality of
7607330f729Sjoerg  reference.
7617330f729Sjoerg
7627330f729SjoergIncremental
7637330f729Sjoerg  (Including generational collectors.) Incremental collectors generally have all
7647330f729Sjoerg  the properties of a copying collector (regardless of whether the mature heap
7657330f729Sjoerg  is compacting), but bring the added complexity of requiring write barriers.
7667330f729Sjoerg
7677330f729SjoergThreaded
7687330f729Sjoerg  Denotes a multithreaded mutator; the collector must still stop the mutator
7697330f729Sjoerg  ("stop the world") before beginning reachability analysis.  Stopping a
7707330f729Sjoerg  multithreaded mutator is a complicated problem.  It generally requires highly
7717330f729Sjoerg  platform-specific code in the runtime, and the production of carefully
7727330f729Sjoerg  designed machine code at safe points.
7737330f729Sjoerg
7747330f729SjoergConcurrent
7757330f729Sjoerg  In this technique, the mutator and the collector run concurrently, with the
7767330f729Sjoerg  goal of eliminating pause times.  In a *cooperative* collector, the mutator
7777330f729Sjoerg  further aids with collection should a pause occur, allowing collection to take
7787330f729Sjoerg  advantage of multiprocessor hosts.  The "stop the world" problem of threaded
7797330f729Sjoerg  collectors is generally still present to a limited extent.  Sophisticated
7807330f729Sjoerg  marking algorithms are necessary.  Read barriers may be necessary.
7817330f729Sjoerg
7827330f729SjoergAs the matrix indicates, LLVM's garbage collection infrastructure is already
7837330f729Sjoergsuitable for a wide variety of collectors, but does not currently extend to
7847330f729Sjoergmultithreaded programs.  This will be added in the future as there is
7857330f729Sjoerginterest.
7867330f729Sjoerg
7877330f729Sjoerg.. _stack-map:
7887330f729Sjoerg
7897330f729SjoergComputing stack maps
7907330f729Sjoerg--------------------
7917330f729Sjoerg
7927330f729SjoergLLVM automatically computes a stack map.  One of the most important features
7937330f729Sjoergof a ``GCStrategy`` is to compile this information into the executable in
7947330f729Sjoergthe binary representation expected by the runtime library.
7957330f729Sjoerg
7967330f729SjoergThe stack map consists of the location and identity of each GC root in the
7977330f729Sjoergeach function in the module.  For each root:
7987330f729Sjoerg
7997330f729Sjoerg* ``RootNum``: The index of the root.
8007330f729Sjoerg
8017330f729Sjoerg* ``StackOffset``: The offset of the object relative to the frame pointer.
8027330f729Sjoerg
8037330f729Sjoerg* ``RootMetadata``: The value passed as the ``%metadata`` parameter to the
8047330f729Sjoerg  ``@llvm.gcroot`` intrinsic.
8057330f729Sjoerg
8067330f729SjoergAlso, for the function as a whole:
8077330f729Sjoerg
8087330f729Sjoerg* ``getFrameSize()``: The overall size of the function's initial stack frame,
8097330f729Sjoerg   not accounting for any dynamic allocation.
8107330f729Sjoerg
8117330f729Sjoerg* ``roots_size()``: The count of roots in the function.
8127330f729Sjoerg
8137330f729SjoergTo access the stack map, use ``GCFunctionMetadata::roots_begin()`` and
8147330f729Sjoerg-``end()`` from the :ref:`GCMetadataPrinter <assembly>`:
8157330f729Sjoerg
8167330f729Sjoerg.. code-block:: c++
8177330f729Sjoerg
8187330f729Sjoerg  for (iterator I = begin(), E = end(); I != E; ++I) {
8197330f729Sjoerg    GCFunctionInfo *FI = *I;
8207330f729Sjoerg    unsigned FrameSize = FI->getFrameSize();
8217330f729Sjoerg    size_t RootCount = FI->roots_size();
8227330f729Sjoerg
8237330f729Sjoerg    for (GCFunctionInfo::roots_iterator RI = FI->roots_begin(),
8247330f729Sjoerg                                        RE = FI->roots_end();
8257330f729Sjoerg                                        RI != RE; ++RI) {
8267330f729Sjoerg      int RootNum = RI->Num;
8277330f729Sjoerg      int RootStackOffset = RI->StackOffset;
8287330f729Sjoerg      Constant *RootMetadata = RI->Metadata;
8297330f729Sjoerg    }
8307330f729Sjoerg  }
8317330f729Sjoerg
8327330f729SjoergIf the ``llvm.gcroot`` intrinsic is eliminated before code generation by a
8337330f729Sjoergcustom lowering pass, LLVM will compute an empty stack map.  This may be useful
8347330f729Sjoergfor collector plugins which implement reference counting or a shadow stack.
8357330f729Sjoerg
8367330f729Sjoerg.. _init-roots:
8377330f729Sjoerg
8387330f729SjoergInitializing roots to null
8397330f729Sjoerg---------------------------
8407330f729Sjoerg
8417330f729SjoergIt is recommended that frontends initialize roots explicitly to avoid
8427330f729Sjoergpotentially confusing the optimizer.  This prevents the GC from visiting
8437330f729Sjoerguninitialized pointers, which will almost certainly cause it to crash.
8447330f729Sjoerg
8457330f729SjoergAs a fallback, LLVM will automatically initialize each root to ``null``
8467330f729Sjoergupon entry to the function.  Support for this mode in code generation is
8477330f729Sjoerglargely a legacy detail to keep old collector implementations working.
8487330f729Sjoerg
8497330f729SjoergCustom lowering of intrinsics
8507330f729Sjoerg------------------------------
8517330f729Sjoerg
8527330f729SjoergFor GCs which use barriers or unusual treatment of stack roots, the
8537330f729Sjoergimplementor is responsibly for providing a custom pass to lower the
8547330f729Sjoergintrinsics with the desired semantics.  If you have opted in to custom
8557330f729Sjoerglowering of a particular intrinsic your pass **must** eliminate all
8567330f729Sjoerginstances of the corresponding intrinsic in functions which opt in to
8577330f729Sjoergyour GC.  The best example of such a pass is the ShadowStackGC and it's
8587330f729SjoergShadowStackGCLowering pass.
8597330f729Sjoerg
8607330f729SjoergThere is currently no way to register such a custom lowering pass
8617330f729Sjoergwithout building a custom copy of LLVM.
8627330f729Sjoerg
8637330f729Sjoerg.. _safe-points:
8647330f729Sjoerg
8657330f729SjoergGenerating safe points
8667330f729Sjoerg-----------------------
8677330f729Sjoerg
8687330f729SjoergLLVM provides support for associating stackmaps with the return address of
8697330f729Sjoerga call.  Any loop or return safepoints required by a given collector design
8707330f729Sjoergcan be modeled via calls to runtime routines, or potentially patchable call
8717330f729Sjoergsequences.  Using gcroot, all call instructions are inferred to be possible
8727330f729Sjoergsafepoints and will thus have an associated stackmap.
8737330f729Sjoerg
8747330f729Sjoerg.. _assembly:
8757330f729Sjoerg
8767330f729SjoergEmitting assembly code: ``GCMetadataPrinter``
8777330f729Sjoerg---------------------------------------------
8787330f729Sjoerg
8797330f729SjoergLLVM allows a plugin to print arbitrary assembly code before and after the rest
8807330f729Sjoergof a module's assembly code.  At the end of the module, the GC can compile the
8817330f729SjoergLLVM stack map into assembly code. (At the beginning, this information is not
8827330f729Sjoergyet computed.)
8837330f729Sjoerg
8847330f729SjoergSince AsmWriter and CodeGen are separate components of LLVM, a separate abstract
8857330f729Sjoergbase class and registry is provided for printing assembly code, the
8867330f729Sjoerg``GCMetadaPrinter`` and ``GCMetadataPrinterRegistry``.  The AsmWriter will look
8877330f729Sjoergfor such a subclass if the ``GCStrategy`` sets ``UsesMetadata``:
8887330f729Sjoerg
8897330f729Sjoerg.. code-block:: c++
8907330f729Sjoerg
8917330f729Sjoerg  MyGC::MyGC() {
8927330f729Sjoerg    UsesMetadata = true;
8937330f729Sjoerg  }
8947330f729Sjoerg
8957330f729SjoergThis separation allows JIT-only clients to be smaller.
8967330f729Sjoerg
8977330f729SjoergNote that LLVM does not currently have analogous APIs to support code generation
8987330f729Sjoergin the JIT, nor using the object writers.
8997330f729Sjoerg
9007330f729Sjoerg.. code-block:: c++
9017330f729Sjoerg
9027330f729Sjoerg  // lib/MyGC/MyGCPrinter.cpp - Example LLVM GC printer
9037330f729Sjoerg
9047330f729Sjoerg  #include "llvm/CodeGen/GCMetadataPrinter.h"
9057330f729Sjoerg  #include "llvm/Support/Compiler.h"
9067330f729Sjoerg
9077330f729Sjoerg  using namespace llvm;
9087330f729Sjoerg
9097330f729Sjoerg  namespace {
9107330f729Sjoerg    class LLVM_LIBRARY_VISIBILITY MyGCPrinter : public GCMetadataPrinter {
9117330f729Sjoerg    public:
9127330f729Sjoerg      virtual void beginAssembly(AsmPrinter &AP);
9137330f729Sjoerg
9147330f729Sjoerg      virtual void finishAssembly(AsmPrinter &AP);
9157330f729Sjoerg    };
9167330f729Sjoerg
9177330f729Sjoerg    GCMetadataPrinterRegistry::Add<MyGCPrinter>
9187330f729Sjoerg    X("mygc", "My bespoke garbage collector.");
9197330f729Sjoerg  }
9207330f729Sjoerg
9217330f729SjoergThe collector should use ``AsmPrinter`` to print portable assembly code.  The
9227330f729Sjoergcollector itself contains the stack map for the entire module, and may access
9237330f729Sjoergthe ``GCFunctionInfo`` using its own ``begin()`` and ``end()`` methods.  Here's
9247330f729Sjoerga realistic example:
9257330f729Sjoerg
9267330f729Sjoerg.. code-block:: c++
9277330f729Sjoerg
9287330f729Sjoerg  #include "llvm/CodeGen/AsmPrinter.h"
9297330f729Sjoerg  #include "llvm/IR/Function.h"
9307330f729Sjoerg  #include "llvm/IR/DataLayout.h"
9317330f729Sjoerg  #include "llvm/Target/TargetAsmInfo.h"
9327330f729Sjoerg  #include "llvm/Target/TargetMachine.h"
9337330f729Sjoerg
9347330f729Sjoerg  void MyGCPrinter::beginAssembly(AsmPrinter &AP) {
9357330f729Sjoerg    // Nothing to do.
9367330f729Sjoerg  }
9377330f729Sjoerg
9387330f729Sjoerg  void MyGCPrinter::finishAssembly(AsmPrinter &AP) {
9397330f729Sjoerg    MCStreamer &OS = AP.OutStreamer;
9407330f729Sjoerg    unsigned IntPtrSize = AP.getPointerSize();
9417330f729Sjoerg
9427330f729Sjoerg    // Put this in the data section.
9437330f729Sjoerg    OS.SwitchSection(AP.getObjFileLowering().getDataSection());
9447330f729Sjoerg
9457330f729Sjoerg    // For each function...
9467330f729Sjoerg    for (iterator FI = begin(), FE = end(); FI != FE; ++FI) {
9477330f729Sjoerg      GCFunctionInfo &MD = **FI;
9487330f729Sjoerg
9497330f729Sjoerg      // A compact GC layout. Emit this data structure:
9507330f729Sjoerg      //
9517330f729Sjoerg      // struct {
9527330f729Sjoerg      //   int32_t PointCount;
9537330f729Sjoerg      //   void *SafePointAddress[PointCount];
9547330f729Sjoerg      //   int32_t StackFrameSize; // in words
9557330f729Sjoerg      //   int32_t StackArity;
9567330f729Sjoerg      //   int32_t LiveCount;
9577330f729Sjoerg      //   int32_t LiveOffsets[LiveCount];
9587330f729Sjoerg      // } __gcmap_<FUNCTIONNAME>;
9597330f729Sjoerg
9607330f729Sjoerg      // Align to address width.
961*82d56013Sjoerg      AP.emitAlignment(IntPtrSize == 4 ? 2 : 3);
9627330f729Sjoerg
9637330f729Sjoerg      // Emit PointCount.
9647330f729Sjoerg      OS.AddComment("safe point count");
9657330f729Sjoerg      AP.emitInt32(MD.size());
9667330f729Sjoerg
9677330f729Sjoerg      // And each safe point...
9687330f729Sjoerg      for (GCFunctionInfo::iterator PI = MD.begin(),
9697330f729Sjoerg                                    PE = MD.end(); PI != PE; ++PI) {
9707330f729Sjoerg        // Emit the address of the safe point.
9717330f729Sjoerg        OS.AddComment("safe point address");
9727330f729Sjoerg        MCSymbol *Label = PI->Label;
973*82d56013Sjoerg        AP.emitLabelPlusOffset(Label/*Hi*/, 0/*Offset*/, 4/*Size*/);
9747330f729Sjoerg      }
9757330f729Sjoerg
9767330f729Sjoerg      // Stack information never change in safe points! Only print info from the
9777330f729Sjoerg      // first call-site.
9787330f729Sjoerg      GCFunctionInfo::iterator PI = MD.begin();
9797330f729Sjoerg
9807330f729Sjoerg      // Emit the stack frame size.
9817330f729Sjoerg      OS.AddComment("stack frame size (in words)");
9827330f729Sjoerg      AP.emitInt32(MD.getFrameSize() / IntPtrSize);
9837330f729Sjoerg
9847330f729Sjoerg      // Emit stack arity, i.e. the number of stacked arguments.
9857330f729Sjoerg      unsigned RegisteredArgs = IntPtrSize == 4 ? 5 : 6;
9867330f729Sjoerg      unsigned StackArity = MD.getFunction().arg_size() > RegisteredArgs ?
9877330f729Sjoerg                            MD.getFunction().arg_size() - RegisteredArgs : 0;
9887330f729Sjoerg      OS.AddComment("stack arity");
9897330f729Sjoerg      AP.emitInt32(StackArity);
9907330f729Sjoerg
9917330f729Sjoerg      // Emit the number of live roots in the function.
9927330f729Sjoerg      OS.AddComment("live root count");
9937330f729Sjoerg      AP.emitInt32(MD.live_size(PI));
9947330f729Sjoerg
9957330f729Sjoerg      // And for each live root...
9967330f729Sjoerg      for (GCFunctionInfo::live_iterator LI = MD.live_begin(PI),
9977330f729Sjoerg                                         LE = MD.live_end(PI);
9987330f729Sjoerg                                         LI != LE; ++LI) {
9997330f729Sjoerg        // Emit live root's offset within the stack frame.
10007330f729Sjoerg        OS.AddComment("stack index (offset / wordsize)");
10017330f729Sjoerg        AP.emitInt32(LI->StackOffset);
10027330f729Sjoerg      }
10037330f729Sjoerg    }
10047330f729Sjoerg  }
10057330f729Sjoerg
10067330f729SjoergReferences
10077330f729Sjoerg==========
10087330f729Sjoerg
10097330f729Sjoerg.. _appel89:
10107330f729Sjoerg
10117330f729Sjoerg[Appel89] Runtime Tags Aren't Necessary. Andrew W. Appel. Lisp and Symbolic
10127330f729SjoergComputation 19(7):703-705, July 1989.
10137330f729Sjoerg
10147330f729Sjoerg.. _goldberg91:
10157330f729Sjoerg
10167330f729Sjoerg[Goldberg91] Tag-free garbage collection for strongly typed programming
10177330f729Sjoerglanguages. Benjamin Goldberg. ACM SIGPLAN PLDI'91.
10187330f729Sjoerg
10197330f729Sjoerg.. _tolmach94:
10207330f729Sjoerg
10217330f729Sjoerg[Tolmach94] Tag-free garbage collection using explicit type parameters. Andrew
10227330f729SjoergTolmach. Proceedings of the 1994 ACM conference on LISP and functional
10237330f729Sjoergprogramming.
10247330f729Sjoerg
10257330f729Sjoerg.. _henderson02:
10267330f729Sjoerg
10277330f729Sjoerg[Henderson2002] `Accurate Garbage Collection in an Uncooperative Environment
10287330f729Sjoerg<http://citeseer.ist.psu.edu/henderson02accurate.html>`__
1029