17330f729Sjoerg===================================== 27330f729SjoergGarbage Collection with LLVM 37330f729Sjoerg===================================== 47330f729Sjoerg 57330f729Sjoerg.. contents:: 67330f729Sjoerg :local: 77330f729Sjoerg 87330f729SjoergAbstract 97330f729Sjoerg======== 107330f729Sjoerg 117330f729SjoergThis document covers how to integrate LLVM into a compiler for a language which 127330f729Sjoergsupports garbage collection. **Note that LLVM itself does not provide a 137330f729Sjoerggarbage collector.** You must provide your own. 147330f729Sjoerg 157330f729SjoergQuick Start 167330f729Sjoerg============ 177330f729Sjoerg 187330f729SjoergFirst, you should pick a collector strategy. LLVM includes a number of built 197330f729Sjoergin ones, but you can also implement a loadable plugin with a custom definition. 207330f729SjoergNote that the collector strategy is a description of how LLVM should generate 217330f729Sjoergcode such that it interacts with your collector and runtime, not a description 227330f729Sjoergof the collector itself. 237330f729Sjoerg 247330f729SjoergNext, mark your generated functions as using your chosen collector strategy. 257330f729SjoergFrom c++, you can call: 267330f729Sjoerg 277330f729Sjoerg.. code-block:: c++ 287330f729Sjoerg 297330f729Sjoerg F.setGC(<collector description name>); 307330f729Sjoerg 317330f729Sjoerg 327330f729SjoergThis will produce IR like the following fragment: 337330f729Sjoerg 347330f729Sjoerg.. code-block:: llvm 357330f729Sjoerg 367330f729Sjoerg define void @foo() gc "<collector description name>" { ... } 377330f729Sjoerg 387330f729Sjoerg 397330f729SjoergWhen generating LLVM IR for your functions, you will need to: 407330f729Sjoerg 417330f729Sjoerg* Use ``@llvm.gcread`` and/or ``@llvm.gcwrite`` in place of standard load and 427330f729Sjoerg store instructions. These intrinsics are used to represent load and store 437330f729Sjoerg barriers. If you collector does not require such barriers, you can skip 447330f729Sjoerg this step. 457330f729Sjoerg 467330f729Sjoerg* Use the memory allocation routines provided by your garbage collector's 477330f729Sjoerg runtime library. 487330f729Sjoerg 497330f729Sjoerg* If your collector requires them, generate type maps according to your 507330f729Sjoerg runtime's binary interface. LLVM is not involved in the process. In 517330f729Sjoerg particular, the LLVM type system is not suitable for conveying such 527330f729Sjoerg information though the compiler. 537330f729Sjoerg 547330f729Sjoerg* Insert any coordination code required for interacting with your collector. 557330f729Sjoerg Many collectors require running application code to periodically check a 567330f729Sjoerg flag and conditionally call a runtime function. This is often referred to 577330f729Sjoerg as a safepoint poll. 587330f729Sjoerg 597330f729SjoergYou will need to identify roots (i.e. references to heap objects your collector 607330f729Sjoergneeds to know about) in your generated IR, so that LLVM can encode them into 617330f729Sjoergyour final stack maps. Depending on the collector strategy chosen, this is 627330f729Sjoergaccomplished by using either the ``@llvm.gcroot`` intrinsics or an 637330f729Sjoerg``gc.statepoint`` relocation sequence. 647330f729Sjoerg 657330f729SjoergDon't forget to create a root for each intermediate value that is generated when 667330f729Sjoergevaluating an expression. In ``h(f(), g())``, the result of ``f()`` could 677330f729Sjoergeasily be collected if evaluating ``g()`` triggers a collection. 687330f729Sjoerg 697330f729SjoergFinally, you need to link your runtime library with the generated program 707330f729Sjoergexecutable (for a static compiler) or ensure the appropriate symbols are 717330f729Sjoergavailable for the runtime linker (for a JIT compiler). 727330f729Sjoerg 737330f729Sjoerg 747330f729SjoergIntroduction 757330f729Sjoerg============ 767330f729Sjoerg 777330f729SjoergWhat is Garbage Collection? 787330f729Sjoerg--------------------------- 797330f729Sjoerg 807330f729SjoergGarbage collection is a widely used technique that frees the programmer from 817330f729Sjoerghaving to know the lifetimes of heap objects, making software easier to produce 827330f729Sjoergand maintain. Many programming languages rely on garbage collection for 837330f729Sjoergautomatic memory management. There are two primary forms of garbage collection: 847330f729Sjoergconservative and accurate. 857330f729Sjoerg 867330f729SjoergConservative garbage collection often does not require any special support from 877330f729Sjoergeither the language or the compiler: it can handle non-type-safe programming 887330f729Sjoerglanguages (such as C/C++) and does not require any special information from the 897330f729Sjoergcompiler. The `Boehm collector 90*82d56013Sjoerg<https://hboehm.info/gc/>`__ is an example of a 917330f729Sjoergstate-of-the-art conservative collector. 927330f729Sjoerg 937330f729SjoergAccurate garbage collection requires the ability to identify all pointers in the 947330f729Sjoergprogram at run-time (which requires that the source-language be type-safe in 957330f729Sjoergmost cases). Identifying pointers at run-time requires compiler support to 967330f729Sjoerglocate all places that hold live pointer variables at run-time, including the 977330f729Sjoerg:ref:`processor stack and registers <gcroot>`. 987330f729Sjoerg 997330f729SjoergConservative garbage collection is attractive because it does not require any 1007330f729Sjoergspecial compiler support, but it does have problems. In particular, because the 1017330f729Sjoergconservative garbage collector cannot *know* that a particular word in the 1027330f729Sjoergmachine is a pointer, it cannot move live objects in the heap (preventing the 1037330f729Sjoerguse of compacting and generational GC algorithms) and it can occasionally suffer 1047330f729Sjoergfrom memory leaks due to integer values that happen to point to objects in the 1057330f729Sjoergprogram. In addition, some aggressive compiler transformations can break 1067330f729Sjoergconservative garbage collectors (though these seem rare in practice). 1077330f729Sjoerg 1087330f729SjoergAccurate garbage collectors do not suffer from any of these problems, but they 1097330f729Sjoergcan suffer from degraded scalar optimization of the program. In particular, 1107330f729Sjoergbecause the runtime must be able to identify and update all pointers active in 1117330f729Sjoergthe program, some optimizations are less effective. In practice, however, the 1127330f729Sjoerglocality and performance benefits of using aggressive garbage collection 1137330f729Sjoergtechniques dominates any low-level losses. 1147330f729Sjoerg 1157330f729SjoergThis document describes the mechanisms and interfaces provided by LLVM to 1167330f729Sjoergsupport accurate garbage collection. 1177330f729Sjoerg 1187330f729SjoergGoals and non-goals 1197330f729Sjoerg------------------- 1207330f729Sjoerg 1217330f729SjoergLLVM's intermediate representation provides :ref:`garbage collection intrinsics 1227330f729Sjoerg<gc_intrinsics>` that offer support for a broad class of collector models. For 1237330f729Sjoerginstance, the intrinsics permit: 1247330f729Sjoerg 1257330f729Sjoerg* semi-space collectors 1267330f729Sjoerg 1277330f729Sjoerg* mark-sweep collectors 1287330f729Sjoerg 1297330f729Sjoerg* generational collectors 1307330f729Sjoerg 1317330f729Sjoerg* incremental collectors 1327330f729Sjoerg 1337330f729Sjoerg* concurrent collectors 1347330f729Sjoerg 1357330f729Sjoerg* cooperative collectors 1367330f729Sjoerg 1377330f729Sjoerg* reference counting 1387330f729Sjoerg 1397330f729SjoergWe hope that the support built into the LLVM IR is sufficient to support a 1407330f729Sjoergbroad class of garbage collected languages including Scheme, ML, Java, C#, 1417330f729SjoergPerl, Python, Lua, Ruby, other scripting languages, and more. 1427330f729Sjoerg 1437330f729SjoergNote that LLVM **does not itself provide a garbage collector** --- this should 1447330f729Sjoergbe part of your language's runtime library. LLVM provides a framework for 1457330f729Sjoergdescribing the garbage collectors requirements to the compiler. In particular, 1467330f729SjoergLLVM provides support for generating stack maps at call sites, polling for a 1477330f729Sjoergsafepoint, and emitting load and store barriers. You can also extend LLVM - 1487330f729Sjoergpossibly through a loadable :ref:`code generation plugins <plugin>` - to 1497330f729Sjoerggenerate code and data structures which conforms to the *binary interface* 1507330f729Sjoergspecified by the *runtime library*. This is similar to the relationship between 1517330f729SjoergLLVM and DWARF debugging info, for example. The difference primarily lies in 1527330f729Sjoergthe lack of an established standard in the domain of garbage collection --- thus 1537330f729Sjoergthe need for a flexible extension mechanism. 1547330f729Sjoerg 1557330f729SjoergThe aspects of the binary interface with which LLVM's GC support is 1567330f729Sjoergconcerned are: 1577330f729Sjoerg 1587330f729Sjoerg* Creation of GC safepoints within code where collection is allowed to execute 1597330f729Sjoerg safely. 1607330f729Sjoerg 1617330f729Sjoerg* Computation of the stack map. For each safe point in the code, object 1627330f729Sjoerg references within the stack frame must be identified so that the collector may 1637330f729Sjoerg traverse and perhaps update them. 1647330f729Sjoerg 1657330f729Sjoerg* Write barriers when storing object references to the heap. These are commonly 1667330f729Sjoerg used to optimize incremental scans in generational collectors. 1677330f729Sjoerg 1687330f729Sjoerg* Emission of read barriers when loading object references. These are useful 1697330f729Sjoerg for interoperating with concurrent collectors. 1707330f729Sjoerg 1717330f729SjoergThere are additional areas that LLVM does not directly address: 1727330f729Sjoerg 1737330f729Sjoerg* Registration of global roots with the runtime. 1747330f729Sjoerg 1757330f729Sjoerg* Registration of stack map entries with the runtime. 1767330f729Sjoerg 1777330f729Sjoerg* The functions used by the program to allocate memory, trigger a collection, 1787330f729Sjoerg etc. 1797330f729Sjoerg 1807330f729Sjoerg* Computation or compilation of type maps, or registration of them with the 1817330f729Sjoerg runtime. These are used to crawl the heap for object references. 1827330f729Sjoerg 1837330f729SjoergIn general, LLVM's support for GC does not include features which can be 1847330f729Sjoergadequately addressed with other features of the IR and does not specify a 1857330f729Sjoergparticular binary interface. On the plus side, this means that you should be 1867330f729Sjoergable to integrate LLVM with an existing runtime. On the other hand, it can 1877330f729Sjoerghave the effect of leaving a lot of work for the developer of a novel 1887330f729Sjoerglanguage. We try to mitigate this by providing built in collector strategy 1897330f729Sjoergdescriptions that can work with many common collector designs and easy 1907330f729Sjoergextension points. If you don't already have a specific binary interface 1917330f729Sjoergyou need to support, we recommend trying to use one of these built in collector 1927330f729Sjoergstrategies. 1937330f729Sjoerg 1947330f729Sjoerg.. _gc_intrinsics: 1957330f729Sjoerg 1967330f729SjoergLLVM IR Features 1977330f729Sjoerg================ 1987330f729Sjoerg 1997330f729SjoergThis section describes the garbage collection facilities provided by the 2007330f729Sjoerg:doc:`LLVM intermediate representation <LangRef>`. The exact behavior of these 2017330f729SjoergIR features is specified by the selected :ref:`GC strategy description 2027330f729Sjoerg<plugin>`. 2037330f729Sjoerg 2047330f729SjoergSpecifying GC code generation: ``gc "..."`` 2057330f729Sjoerg------------------------------------------- 2067330f729Sjoerg 2077330f729Sjoerg.. code-block:: text 2087330f729Sjoerg 2097330f729Sjoerg define <returntype> @name(...) gc "name" { ... } 2107330f729Sjoerg 2117330f729SjoergThe ``gc`` function attribute is used to specify the desired GC strategy to the 2127330f729Sjoergcompiler. Its programmatic equivalent is the ``setGC`` method of ``Function``. 2137330f729Sjoerg 2147330f729SjoergSetting ``gc "name"`` on a function triggers a search for a matching subclass 2157330f729Sjoergof GCStrategy. Some collector strategies are built in. You can add others 2167330f729Sjoergusing either the loadable plugin mechanism, or by patching your copy of LLVM. 2177330f729SjoergIt is the selected GC strategy which defines the exact nature of the code 2187330f729Sjoerggenerated to support GC. If none is found, the compiler will raise an error. 2197330f729Sjoerg 2207330f729SjoergSpecifying the GC style on a per-function basis allows LLVM to link together 2217330f729Sjoergprograms that use different garbage collection algorithms (or none at all). 2227330f729Sjoerg 2237330f729Sjoerg.. _gcroot: 2247330f729Sjoerg 2257330f729SjoergIdentifying GC roots on the stack 2267330f729Sjoerg---------------------------------- 2277330f729Sjoerg 2287330f729SjoergLLVM currently supports two different mechanisms for describing references in 2297330f729Sjoergcompiled code at safepoints. ``llvm.gcroot`` is the older mechanism; 2307330f729Sjoerg``gc.statepoint`` has been added more recently. At the moment, you can choose 2317330f729Sjoergeither implementation (on a per :ref:`GC strategy <plugin>` basis). Longer 2327330f729Sjoergterm, we will probably either migrate away from ``llvm.gcroot`` entirely, or 2337330f729Sjoergsubstantially merge their implementations. Note that most new development 2347330f729Sjoergwork is focused on ``gc.statepoint``. 2357330f729Sjoerg 2367330f729SjoergUsing ``gc.statepoint`` 2377330f729Sjoerg^^^^^^^^^^^^^^^^^^^^^^^^ 2387330f729Sjoerg:doc:`This page <Statepoints>` contains detailed documentation for 2397330f729Sjoerg``gc.statepoint``. 2407330f729Sjoerg 2417330f729SjoergUsing ``llvm.gcwrite`` 2427330f729Sjoerg^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 2437330f729Sjoerg 2447330f729Sjoerg.. code-block:: llvm 2457330f729Sjoerg 2467330f729Sjoerg void @llvm.gcroot(i8** %ptrloc, i8* %metadata) 2477330f729Sjoerg 2487330f729SjoergThe ``llvm.gcroot`` intrinsic is used to inform LLVM that a stack variable 2497330f729Sjoergreferences an object on the heap and is to be tracked for garbage collection. 2507330f729SjoergThe exact impact on generated code is specified by the Function's selected 2517330f729Sjoerg:ref:`GC strategy <plugin>`. All calls to ``llvm.gcroot`` **must** reside 2527330f729Sjoerginside the first basic block. 2537330f729Sjoerg 2547330f729SjoergThe first argument **must** be a value referring to an alloca instruction or a 2557330f729Sjoergbitcast of an alloca. The second contains a pointer to metadata that should be 2567330f729Sjoergassociated with the pointer, and **must** be a constant or global value 2577330f729Sjoergaddress. If your target collector uses tags, use a null pointer for metadata. 2587330f729Sjoerg 2597330f729SjoergA compiler which performs manual SSA construction **must** ensure that SSA 2607330f729Sjoergvalues representing GC references are stored in to the alloca passed to the 2617330f729Sjoergrespective ``gcroot`` before every call site and reloaded after every call. 2627330f729SjoergA compiler which uses mem2reg to raise imperative code using ``alloca`` into 2637330f729SjoergSSA form need only add a call to ``@llvm.gcroot`` for those variables which 2647330f729Sjoergare pointers into the GC heap. 2657330f729Sjoerg 2667330f729SjoergIt is also important to mark intermediate values with ``llvm.gcroot``. For 2677330f729Sjoergexample, consider ``h(f(), g())``. Beware leaking the result of ``f()`` in the 2687330f729Sjoergcase that ``g()`` triggers a collection. Note, that stack variables must be 2697330f729Sjoerginitialized and marked with ``llvm.gcroot`` in function's prologue. 2707330f729Sjoerg 2717330f729SjoergThe ``%metadata`` argument can be used to avoid requiring heap objects to have 2727330f729Sjoerg'isa' pointers or tag bits. [Appel89_, Goldberg91_, Tolmach94_] If specified, 2737330f729Sjoergits value will be tracked along with the location of the pointer in the stack 2747330f729Sjoergframe. 2757330f729Sjoerg 2767330f729SjoergConsider the following fragment of Java code: 2777330f729Sjoerg 2787330f729Sjoerg.. code-block:: java 2797330f729Sjoerg 2807330f729Sjoerg { 2817330f729Sjoerg Object X; // A null-initialized reference to an object 2827330f729Sjoerg ... 2837330f729Sjoerg } 2847330f729Sjoerg 2857330f729SjoergThis block (which may be located in the middle of a function or in a loop nest), 2867330f729Sjoergcould be compiled to this LLVM code: 2877330f729Sjoerg 2887330f729Sjoerg.. code-block:: llvm 2897330f729Sjoerg 2907330f729Sjoerg Entry: 2917330f729Sjoerg ;; In the entry block for the function, allocate the 2927330f729Sjoerg ;; stack space for X, which is an LLVM pointer. 2937330f729Sjoerg %X = alloca %Object* 2947330f729Sjoerg 2957330f729Sjoerg ;; Tell LLVM that the stack space is a stack root. 2967330f729Sjoerg ;; Java has type-tags on objects, so we pass null as metadata. 2977330f729Sjoerg %tmp = bitcast %Object** %X to i8** 2987330f729Sjoerg call void @llvm.gcroot(i8** %tmp, i8* null) 2997330f729Sjoerg ... 3007330f729Sjoerg 3017330f729Sjoerg ;; "CodeBlock" is the block corresponding to the start 3027330f729Sjoerg ;; of the scope above. 3037330f729Sjoerg CodeBlock: 3047330f729Sjoerg ;; Java null-initializes pointers. 3057330f729Sjoerg store %Object* null, %Object** %X 3067330f729Sjoerg 3077330f729Sjoerg ... 3087330f729Sjoerg 3097330f729Sjoerg ;; As the pointer goes out of scope, store a null value into 3107330f729Sjoerg ;; it, to indicate that the value is no longer live. 3117330f729Sjoerg store %Object* null, %Object** %X 3127330f729Sjoerg ... 3137330f729Sjoerg 3147330f729SjoergReading and writing references in the heap 3157330f729Sjoerg------------------------------------------ 3167330f729Sjoerg 3177330f729SjoergSome collectors need to be informed when the mutator (the program that needs 3187330f729Sjoerggarbage collection) either reads a pointer from or writes a pointer to a field 3197330f729Sjoergof a heap object. The code fragments inserted at these points are called *read 3207330f729Sjoergbarriers* and *write barriers*, respectively. The amount of code that needs to 3217330f729Sjoergbe executed is usually quite small and not on the critical path of any 3227330f729Sjoergcomputation, so the overall performance impact of the barrier is tolerable. 3237330f729Sjoerg 3247330f729SjoergBarriers often require access to the *object pointer* rather than the *derived 3257330f729Sjoergpointer* (which is a pointer to the field within the object). Accordingly, 3267330f729Sjoergthese intrinsics take both pointers as separate arguments for completeness. In 3277330f729Sjoergthis snippet, ``%object`` is the object pointer, and ``%derived`` is the derived 3287330f729Sjoergpointer: 3297330f729Sjoerg 3307330f729Sjoerg.. code-block:: llvm 3317330f729Sjoerg 3327330f729Sjoerg ;; An array type. 3337330f729Sjoerg %class.Array = type { %class.Object, i32, [0 x %class.Object*] } 3347330f729Sjoerg ... 3357330f729Sjoerg 3367330f729Sjoerg ;; Load the object pointer from a gcroot. 3377330f729Sjoerg %object = load %class.Array** %object_addr 3387330f729Sjoerg 3397330f729Sjoerg ;; Compute the derived pointer. 3407330f729Sjoerg %derived = getelementptr %object, i32 0, i32 2, i32 %n 3417330f729Sjoerg 3427330f729SjoergLLVM does not enforce this relationship between the object and derived pointer 3437330f729Sjoerg(although a particular :ref:`collector strategy <plugin>` might). However, it 3447330f729Sjoergwould be an unusual collector that violated it. 3457330f729Sjoerg 3467330f729SjoergThe use of these intrinsics is naturally optional if the target GC does not 3477330f729Sjoergrequire the corresponding barrier. The GC strategy used with such a collector 3487330f729Sjoergshould replace the intrinsic calls with the corresponding ``load`` or 3497330f729Sjoerg``store`` instruction if they are used. 3507330f729Sjoerg 3517330f729SjoergOne known deficiency with the current design is that the barrier intrinsics do 3527330f729Sjoergnot include the size or alignment of the underlying operation performed. It is 3537330f729Sjoergcurrently assumed that the operation is of pointer size and the alignment is 3547330f729Sjoergassumed to be the target machine's default alignment. 3557330f729Sjoerg 3567330f729SjoergWrite barrier: ``llvm.gcwrite`` 3577330f729Sjoerg^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 3587330f729Sjoerg 3597330f729Sjoerg.. code-block:: llvm 3607330f729Sjoerg 3617330f729Sjoerg void @llvm.gcwrite(i8* %value, i8* %object, i8** %derived) 3627330f729Sjoerg 3637330f729SjoergFor write barriers, LLVM provides the ``llvm.gcwrite`` intrinsic function. It 3647330f729Sjoerghas exactly the same semantics as a non-volatile ``store`` to the derived 3657330f729Sjoergpointer (the third argument). The exact code generated is specified by the 3667330f729SjoergFunction's selected :ref:`GC strategy <plugin>`. 3677330f729Sjoerg 3687330f729SjoergMany important algorithms require write barriers, including generational and 3697330f729Sjoergconcurrent collectors. Additionally, write barriers could be used to implement 3707330f729Sjoergreference counting. 3717330f729Sjoerg 3727330f729SjoergRead barrier: ``llvm.gcread`` 3737330f729Sjoerg^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 3747330f729Sjoerg 3757330f729Sjoerg.. code-block:: llvm 3767330f729Sjoerg 3777330f729Sjoerg i8* @llvm.gcread(i8* %object, i8** %derived) 3787330f729Sjoerg 3797330f729SjoergFor read barriers, LLVM provides the ``llvm.gcread`` intrinsic function. It has 3807330f729Sjoergexactly the same semantics as a non-volatile ``load`` from the derived pointer 3817330f729Sjoerg(the second argument). The exact code generated is specified by the Function's 3827330f729Sjoergselected :ref:`GC strategy <plugin>`. 3837330f729Sjoerg 3847330f729SjoergRead barriers are needed by fewer algorithms than write barriers, and may have a 3857330f729Sjoerggreater performance impact since pointer reads are more frequent than writes. 3867330f729Sjoerg 3877330f729Sjoerg.. _plugin: 3887330f729Sjoerg 3897330f729Sjoerg.. _builtin-gc-strategies: 3907330f729Sjoerg 3917330f729SjoergBuilt In GC Strategies 3927330f729Sjoerg====================== 3937330f729Sjoerg 3947330f729SjoergLLVM includes built in support for several varieties of garbage collectors. 3957330f729Sjoerg 3967330f729SjoergThe Shadow Stack GC 3977330f729Sjoerg---------------------- 3987330f729Sjoerg 3997330f729SjoergTo use this collector strategy, mark your functions with: 4007330f729Sjoerg 4017330f729Sjoerg.. code-block:: c++ 4027330f729Sjoerg 4037330f729Sjoerg F.setGC("shadow-stack"); 4047330f729Sjoerg 4057330f729SjoergUnlike many GC algorithms which rely on a cooperative code generator to compile 4067330f729Sjoergstack maps, this algorithm carefully maintains a linked list of stack roots 4077330f729Sjoerg[:ref:`Henderson2002 <henderson02>`]. This so-called "shadow stack" mirrors the 4087330f729Sjoergmachine stack. Maintaining this data structure is slower than using a stack map 4097330f729Sjoergcompiled into the executable as constant data, but has a significant portability 4107330f729Sjoergadvantage because it requires no special support from the target code generator, 4117330f729Sjoergand does not require tricky platform-specific code to crawl the machine stack. 4127330f729Sjoerg 4137330f729SjoergThe tradeoff for this simplicity and portability is: 4147330f729Sjoerg 4157330f729Sjoerg* High overhead per function call. 4167330f729Sjoerg 4177330f729Sjoerg* Not thread-safe. 4187330f729Sjoerg 4197330f729SjoergStill, it's an easy way to get started. After your compiler and runtime are up 4207330f729Sjoergand running, writing a :ref:`plugin <plugin>` will allow you to take advantage 4217330f729Sjoergof :ref:`more advanced GC features <collector-algos>` of LLVM in order to 4227330f729Sjoergimprove performance. 4237330f729Sjoerg 4247330f729Sjoerg 4257330f729SjoergThe shadow stack doesn't imply a memory allocation algorithm. A semispace 4267330f729Sjoergcollector or building atop ``malloc`` are great places to start, and can be 4277330f729Sjoergimplemented with very little code. 4287330f729Sjoerg 4297330f729SjoergWhen it comes time to collect, however, your runtime needs to traverse the stack 4307330f729Sjoergroots, and for this it needs to integrate with the shadow stack. Luckily, doing 4317330f729Sjoergso is very simple. (This code is heavily commented to help you understand the 4327330f729Sjoergdata structure, but there are only 20 lines of meaningful code.) 4337330f729Sjoerg 4347330f729Sjoerg.. code-block:: c++ 4357330f729Sjoerg 4367330f729Sjoerg /// The map for a single function's stack frame. One of these is 4377330f729Sjoerg /// compiled as constant data into the executable for each function. 4387330f729Sjoerg /// 4397330f729Sjoerg /// Storage of metadata values is elided if the %metadata parameter to 4407330f729Sjoerg /// @llvm.gcroot is null. 4417330f729Sjoerg struct FrameMap { 4427330f729Sjoerg int32_t NumRoots; //< Number of roots in stack frame. 4437330f729Sjoerg int32_t NumMeta; //< Number of metadata entries. May be < NumRoots. 4447330f729Sjoerg const void *Meta[0]; //< Metadata for each root. 4457330f729Sjoerg }; 4467330f729Sjoerg 4477330f729Sjoerg /// A link in the dynamic shadow stack. One of these is embedded in 4487330f729Sjoerg /// the stack frame of each function on the call stack. 4497330f729Sjoerg struct StackEntry { 4507330f729Sjoerg StackEntry *Next; //< Link to next stack entry (the caller's). 4517330f729Sjoerg const FrameMap *Map; //< Pointer to constant FrameMap. 4527330f729Sjoerg void *Roots[0]; //< Stack roots (in-place array). 4537330f729Sjoerg }; 4547330f729Sjoerg 4557330f729Sjoerg /// The head of the singly-linked list of StackEntries. Functions push 4567330f729Sjoerg /// and pop onto this in their prologue and epilogue. 4577330f729Sjoerg /// 4587330f729Sjoerg /// Since there is only a global list, this technique is not threadsafe. 4597330f729Sjoerg StackEntry *llvm_gc_root_chain; 4607330f729Sjoerg 4617330f729Sjoerg /// Calls Visitor(root, meta) for each GC root on the stack. 4627330f729Sjoerg /// root and meta are exactly the values passed to 4637330f729Sjoerg /// @llvm.gcroot. 4647330f729Sjoerg /// 4657330f729Sjoerg /// Visitor could be a function to recursively mark live objects. Or it 4667330f729Sjoerg /// might copy them to another heap or generation. 4677330f729Sjoerg /// 4687330f729Sjoerg /// @param Visitor A function to invoke for every GC root on the stack. 4697330f729Sjoerg void visitGCRoots(void (*Visitor)(void **Root, const void *Meta)) { 4707330f729Sjoerg for (StackEntry *R = llvm_gc_root_chain; R; R = R->Next) { 4717330f729Sjoerg unsigned i = 0; 4727330f729Sjoerg 4737330f729Sjoerg // For roots [0, NumMeta), the metadata pointer is in the FrameMap. 4747330f729Sjoerg for (unsigned e = R->Map->NumMeta; i != e; ++i) 4757330f729Sjoerg Visitor(&R->Roots[i], R->Map->Meta[i]); 4767330f729Sjoerg 4777330f729Sjoerg // For roots [NumMeta, NumRoots), the metadata pointer is null. 4787330f729Sjoerg for (unsigned e = R->Map->NumRoots; i != e; ++i) 4797330f729Sjoerg Visitor(&R->Roots[i], NULL); 4807330f729Sjoerg } 4817330f729Sjoerg } 4827330f729Sjoerg 4837330f729Sjoerg 4847330f729SjoergThe 'Erlang' and 'Ocaml' GCs 4857330f729Sjoerg----------------------------- 4867330f729Sjoerg 4877330f729SjoergLLVM ships with two example collectors which leverage the ``gcroot`` 4887330f729Sjoergmechanisms. To our knowledge, these are not actually used by any language 4897330f729Sjoergruntime, but they do provide a reasonable starting point for someone interested 4907330f729Sjoergin writing an ``gcroot`` compatible GC plugin. In particular, these are the 4917330f729Sjoergonly in tree examples of how to produce a custom binary stack map format using 4927330f729Sjoerga ``gcroot`` strategy. 4937330f729Sjoerg 4947330f729SjoergAs there names imply, the binary format produced is intended to model that 4957330f729Sjoergused by the Erlang and OCaml compilers respectively. 4967330f729Sjoerg 4977330f729Sjoerg.. _statepoint_example_gc: 4987330f729Sjoerg 4997330f729SjoergThe Statepoint Example GC 5007330f729Sjoerg------------------------- 5017330f729Sjoerg 5027330f729Sjoerg.. code-block:: c++ 5037330f729Sjoerg 5047330f729Sjoerg F.setGC("statepoint-example"); 5057330f729Sjoerg 5067330f729SjoergThis GC provides an example of how one might use the infrastructure provided 5077330f729Sjoergby ``gc.statepoint``. This example GC is compatible with the 5087330f729Sjoerg:ref:`PlaceSafepoints` and :ref:`RewriteStatepointsForGC` utility passes 5097330f729Sjoergwhich simplify ``gc.statepoint`` sequence insertion. If you need to build a 5107330f729Sjoergcustom GC strategy around the ``gc.statepoints`` mechanisms, it is recommended 5117330f729Sjoergthat you use this one as a starting point. 5127330f729Sjoerg 5137330f729SjoergThis GC strategy does not support read or write barriers. As a result, these 5147330f729Sjoergintrinsics are lowered to normal loads and stores. 5157330f729Sjoerg 5167330f729SjoergThe stack map format generated by this GC strategy can be found in the 5177330f729Sjoerg:ref:`stackmap-section` using a format documented :ref:`here 5187330f729Sjoerg<statepoint-stackmap-format>`. This format is intended to be the standard 5197330f729Sjoergformat supported by LLVM going forward. 5207330f729Sjoerg 5217330f729SjoergThe CoreCLR GC 5227330f729Sjoerg------------------------- 5237330f729Sjoerg 5247330f729Sjoerg.. code-block:: c++ 5257330f729Sjoerg 5267330f729Sjoerg F.setGC("coreclr"); 5277330f729Sjoerg 5287330f729SjoergThis GC leverages the ``gc.statepoint`` mechanism to support the 5297330f729Sjoerg`CoreCLR <https://github.com/dotnet/coreclr>`__ runtime. 5307330f729Sjoerg 5317330f729SjoergSupport for this GC strategy is a work in progress. This strategy will 5327330f729Sjoergdiffer from 5337330f729Sjoerg:ref:`statepoint-example GC<statepoint_example_gc>` strategy in 5347330f729Sjoergcertain aspects like: 5357330f729Sjoerg 5367330f729Sjoerg* Base-pointers of interior pointers are not explicitly 5377330f729Sjoerg tracked and reported. 5387330f729Sjoerg 5397330f729Sjoerg* A different format is used for encoding stack maps. 5407330f729Sjoerg 5417330f729Sjoerg* Safe-point polls are only needed before loop-back edges 5427330f729Sjoerg and before tail-calls (not needed at function-entry). 5437330f729Sjoerg 5447330f729SjoergCustom GC Strategies 5457330f729Sjoerg==================== 5467330f729Sjoerg 5477330f729SjoergIf none of the built in GC strategy descriptions met your needs above, you will 5487330f729Sjoergneed to define a custom GCStrategy and possibly, a custom LLVM pass to perform 5497330f729Sjoerglowering. Your best example of where to start defining a custom GCStrategy 5507330f729Sjoergwould be to look at one of the built in strategies. 5517330f729Sjoerg 5527330f729SjoergYou may be able to structure this additional code as a loadable plugin library. 5537330f729SjoergLoadable plugins are sufficient if all you need is to enable a different 5547330f729Sjoergcombination of built in functionality, but if you need to provide a custom 5557330f729Sjoerglowering pass, you will need to build a patched version of LLVM. If you think 5567330f729Sjoergyou need a patched build, please ask for advice on llvm-dev. There may be an 5577330f729Sjoergeasy way we can extend the support to make it work for your use case without 5587330f729Sjoergrequiring a custom build. 5597330f729Sjoerg 5607330f729SjoergCollector Requirements 5617330f729Sjoerg---------------------- 5627330f729Sjoerg 5637330f729SjoergYou should be able to leverage any existing collector library that includes the following elements: 5647330f729Sjoerg 5657330f729Sjoerg#. A memory allocator which exposes an allocation function your compiled 5667330f729Sjoerg code can call. 5677330f729Sjoerg 5687330f729Sjoerg#. A binary format for the stack map. A stack map describes the location 5697330f729Sjoerg of references at a safepoint and is used by precise collectors to identify 5707330f729Sjoerg references within a stack frame on the machine stack. Note that collectors 5717330f729Sjoerg which conservatively scan the stack don't require such a structure. 5727330f729Sjoerg 5737330f729Sjoerg#. A stack crawler to discover functions on the call stack, and enumerate the 5747330f729Sjoerg references listed in the stack map for each call site. 5757330f729Sjoerg 5767330f729Sjoerg#. A mechanism for identifying references in global locations (e.g. global 5777330f729Sjoerg variables). 5787330f729Sjoerg 5797330f729Sjoerg#. If you collector requires them, an LLVM IR implementation of your collectors 5807330f729Sjoerg load and store barriers. Note that since many collectors don't require 5817330f729Sjoerg barriers at all, LLVM defaults to lowering such barriers to normal loads 5827330f729Sjoerg and stores unless you arrange otherwise. 5837330f729Sjoerg 5847330f729Sjoerg 5857330f729SjoergImplementing a collector plugin 5867330f729Sjoerg------------------------------- 5877330f729Sjoerg 5887330f729SjoergUser code specifies which GC code generation to use with the ``gc`` function 5897330f729Sjoergattribute or, equivalently, with the ``setGC`` method of ``Function``. 5907330f729Sjoerg 5917330f729SjoergTo implement a GC plugin, it is necessary to subclass ``llvm::GCStrategy``, 5927330f729Sjoergwhich can be accomplished in a few lines of boilerplate code. LLVM's 5937330f729Sjoerginfrastructure provides access to several important algorithms. For an 5947330f729Sjoerguncontroversial collector, all that remains may be to compile LLVM's computed 5957330f729Sjoergstack map to assembly code (using the binary representation expected by the 5967330f729Sjoergruntime library). This can be accomplished in about 100 lines of code. 5977330f729Sjoerg 5987330f729SjoergThis is not the appropriate place to implement a garbage collected heap or a 5997330f729Sjoerggarbage collector itself. That code should exist in the language's runtime 6007330f729Sjoerglibrary. The compiler plugin is responsible for generating code which conforms 6017330f729Sjoergto the binary interface defined by library, most essentially the :ref:`stack map 6027330f729Sjoerg<stack-map>`. 6037330f729Sjoerg 6047330f729SjoergTo subclass ``llvm::GCStrategy`` and register it with the compiler: 6057330f729Sjoerg 6067330f729Sjoerg.. code-block:: c++ 6077330f729Sjoerg 6087330f729Sjoerg // lib/MyGC/MyGC.cpp - Example LLVM GC plugin 6097330f729Sjoerg 6107330f729Sjoerg #include "llvm/CodeGen/GCStrategy.h" 6117330f729Sjoerg #include "llvm/CodeGen/GCMetadata.h" 6127330f729Sjoerg #include "llvm/Support/Compiler.h" 6137330f729Sjoerg 6147330f729Sjoerg using namespace llvm; 6157330f729Sjoerg 6167330f729Sjoerg namespace { 6177330f729Sjoerg class LLVM_LIBRARY_VISIBILITY MyGC : public GCStrategy { 6187330f729Sjoerg public: 6197330f729Sjoerg MyGC() {} 6207330f729Sjoerg }; 6217330f729Sjoerg 6227330f729Sjoerg GCRegistry::Add<MyGC> 6237330f729Sjoerg X("mygc", "My bespoke garbage collector."); 6247330f729Sjoerg } 6257330f729Sjoerg 6267330f729SjoergThis boilerplate collector does nothing. More specifically: 6277330f729Sjoerg 6287330f729Sjoerg* ``llvm.gcread`` calls are replaced with the corresponding ``load`` 6297330f729Sjoerg instruction. 6307330f729Sjoerg 6317330f729Sjoerg* ``llvm.gcwrite`` calls are replaced with the corresponding ``store`` 6327330f729Sjoerg instruction. 6337330f729Sjoerg 6347330f729Sjoerg* No safe points are added to the code. 6357330f729Sjoerg 6367330f729Sjoerg* The stack map is not compiled into the executable. 6377330f729Sjoerg 6387330f729SjoergUsing the LLVM makefiles, this code 6397330f729Sjoergcan be compiled as a plugin using a simple makefile: 6407330f729Sjoerg 6417330f729Sjoerg.. code-block:: make 6427330f729Sjoerg 6437330f729Sjoerg # lib/MyGC/Makefile 6447330f729Sjoerg 6457330f729Sjoerg LEVEL := ../.. 6467330f729Sjoerg LIBRARYNAME = MyGC 6477330f729Sjoerg LOADABLE_MODULE = 1 6487330f729Sjoerg 6497330f729Sjoerg include $(LEVEL)/Makefile.common 6507330f729Sjoerg 6517330f729SjoergOnce the plugin is compiled, code using it may be compiled using ``llc 6527330f729Sjoerg-load=MyGC.so`` (though MyGC.so may have some other platform-specific 6537330f729Sjoergextension): 6547330f729Sjoerg 6557330f729Sjoerg:: 6567330f729Sjoerg 6577330f729Sjoerg $ cat sample.ll 6587330f729Sjoerg define void @f() gc "mygc" { 6597330f729Sjoerg entry: 6607330f729Sjoerg ret void 6617330f729Sjoerg } 6627330f729Sjoerg $ llvm-as < sample.ll | llc -load=MyGC.so 6637330f729Sjoerg 6647330f729SjoergIt is also possible to statically link the collector plugin into tools, such as 6657330f729Sjoerga language-specific compiler front-end. 6667330f729Sjoerg 6677330f729Sjoerg.. _collector-algos: 6687330f729Sjoerg 6697330f729SjoergOverview of available features 6707330f729Sjoerg------------------------------ 6717330f729Sjoerg 6727330f729Sjoerg``GCStrategy`` provides a range of features through which a plugin may do useful 6737330f729Sjoergwork. Some of these are callbacks, some are algorithms that can be enabled, 6747330f729Sjoergdisabled, or customized. This matrix summarizes the supported (and planned) 6757330f729Sjoergfeatures and correlates them with the collection techniques which typically 6767330f729Sjoergrequire them. 6777330f729Sjoerg 6787330f729Sjoerg.. |v| unicode:: 0x2714 6797330f729Sjoerg :trim: 6807330f729Sjoerg 6817330f729Sjoerg.. |x| unicode:: 0x2718 6827330f729Sjoerg :trim: 6837330f729Sjoerg 6847330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 6857330f729Sjoerg| Algorithm | Done | Shadow | refcount | mark- | copying | incremental | threaded | concurrent | 6867330f729Sjoerg| | | stack | | sweep | | | | | 6877330f729Sjoerg+============+======+========+==========+=======+=========+=============+==========+============+ 6887330f729Sjoerg| stack map | |v| | | | |x| | |x| | |x| | |x| | |x| | 6897330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 6907330f729Sjoerg| initialize | |v| | |x| | |x| | |x| | |x| | |x| | |x| | |x| | 6917330f729Sjoerg| roots | | | | | | | | | 6927330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 6937330f729Sjoerg| derived | NO | | | | | | **N**\* | **N**\* | 6947330f729Sjoerg| pointers | | | | | | | | | 6957330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 6967330f729Sjoerg| **custom | |v| | | | | | | | | 6977330f729Sjoerg| lowering** | | | | | | | | | 6987330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 6997330f729Sjoerg| *gcroot* | |v| | |x| | |x| | | | | | | 7007330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 7017330f729Sjoerg| *gcwrite* | |v| | | |x| | | | |x| | | |x| | 7027330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 7037330f729Sjoerg| *gcread* | |v| | | | | | | | |x| | 7047330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 7057330f729Sjoerg| **safe | | | | | | | | | 7067330f729Sjoerg| points** | | | | | | | | | 7077330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 7087330f729Sjoerg| *in | |v| | | | |x| | |x| | |x| | |x| | |x| | 7097330f729Sjoerg| calls* | | | | | | | | | 7107330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 7117330f729Sjoerg| *before | |v| | | | | | | |x| | |x| | 7127330f729Sjoerg| calls* | | | | | | | | | 7137330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 7147330f729Sjoerg| *for | NO | | | | | | **N** | **N** | 7157330f729Sjoerg| loops* | | | | | | | | | 7167330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 7177330f729Sjoerg| *before | |v| | | | | | | |x| | |x| | 7187330f729Sjoerg| escape* | | | | | | | | | 7197330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 7207330f729Sjoerg| emit code | NO | | | | | | **N** | **N** | 7217330f729Sjoerg| at safe | | | | | | | | | 7227330f729Sjoerg| points | | | | | | | | | 7237330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 7247330f729Sjoerg| **output** | | | | | | | | | 7257330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 7267330f729Sjoerg| *assembly* | |v| | | | |x| | |x| | |x| | |x| | |x| | 7277330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 7287330f729Sjoerg| *JIT* | NO | | | **?** | **?** | **?** | **?** | **?** | 7297330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 7307330f729Sjoerg| *obj* | NO | | | **?** | **?** | **?** | **?** | **?** | 7317330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 7327330f729Sjoerg| live | NO | | | **?** | **?** | **?** | **?** | **?** | 7337330f729Sjoerg| analysis | | | | | | | | | 7347330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 7357330f729Sjoerg| register | NO | | | **?** | **?** | **?** | **?** | **?** | 7367330f729Sjoerg| map | | | | | | | | | 7377330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 7387330f729Sjoerg| \* Derived pointers only pose a hasard to copying collections. | 7397330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 7407330f729Sjoerg| **?** denotes a feature which could be utilized if available. | 7417330f729Sjoerg+------------+------+--------+----------+-------+---------+-------------+----------+------------+ 7427330f729Sjoerg 7437330f729SjoergTo be clear, the collection techniques above are defined as: 7447330f729Sjoerg 7457330f729SjoergShadow Stack 7467330f729Sjoerg The mutator carefully maintains a linked list of stack roots. 7477330f729Sjoerg 7487330f729SjoergReference Counting 7497330f729Sjoerg The mutator maintains a reference count for each object and frees an object 7507330f729Sjoerg when its count falls to zero. 7517330f729Sjoerg 7527330f729SjoergMark-Sweep 7537330f729Sjoerg When the heap is exhausted, the collector marks reachable objects starting 7547330f729Sjoerg from the roots, then deallocates unreachable objects in a sweep phase. 7557330f729Sjoerg 7567330f729SjoergCopying 7577330f729Sjoerg As reachability analysis proceeds, the collector copies objects from one heap 7587330f729Sjoerg area to another, compacting them in the process. Copying collectors enable 7597330f729Sjoerg highly efficient "bump pointer" allocation and can improve locality of 7607330f729Sjoerg reference. 7617330f729Sjoerg 7627330f729SjoergIncremental 7637330f729Sjoerg (Including generational collectors.) Incremental collectors generally have all 7647330f729Sjoerg the properties of a copying collector (regardless of whether the mature heap 7657330f729Sjoerg is compacting), but bring the added complexity of requiring write barriers. 7667330f729Sjoerg 7677330f729SjoergThreaded 7687330f729Sjoerg Denotes a multithreaded mutator; the collector must still stop the mutator 7697330f729Sjoerg ("stop the world") before beginning reachability analysis. Stopping a 7707330f729Sjoerg multithreaded mutator is a complicated problem. It generally requires highly 7717330f729Sjoerg platform-specific code in the runtime, and the production of carefully 7727330f729Sjoerg designed machine code at safe points. 7737330f729Sjoerg 7747330f729SjoergConcurrent 7757330f729Sjoerg In this technique, the mutator and the collector run concurrently, with the 7767330f729Sjoerg goal of eliminating pause times. In a *cooperative* collector, the mutator 7777330f729Sjoerg further aids with collection should a pause occur, allowing collection to take 7787330f729Sjoerg advantage of multiprocessor hosts. The "stop the world" problem of threaded 7797330f729Sjoerg collectors is generally still present to a limited extent. Sophisticated 7807330f729Sjoerg marking algorithms are necessary. Read barriers may be necessary. 7817330f729Sjoerg 7827330f729SjoergAs the matrix indicates, LLVM's garbage collection infrastructure is already 7837330f729Sjoergsuitable for a wide variety of collectors, but does not currently extend to 7847330f729Sjoergmultithreaded programs. This will be added in the future as there is 7857330f729Sjoerginterest. 7867330f729Sjoerg 7877330f729Sjoerg.. _stack-map: 7887330f729Sjoerg 7897330f729SjoergComputing stack maps 7907330f729Sjoerg-------------------- 7917330f729Sjoerg 7927330f729SjoergLLVM automatically computes a stack map. One of the most important features 7937330f729Sjoergof a ``GCStrategy`` is to compile this information into the executable in 7947330f729Sjoergthe binary representation expected by the runtime library. 7957330f729Sjoerg 7967330f729SjoergThe stack map consists of the location and identity of each GC root in the 7977330f729Sjoergeach function in the module. For each root: 7987330f729Sjoerg 7997330f729Sjoerg* ``RootNum``: The index of the root. 8007330f729Sjoerg 8017330f729Sjoerg* ``StackOffset``: The offset of the object relative to the frame pointer. 8027330f729Sjoerg 8037330f729Sjoerg* ``RootMetadata``: The value passed as the ``%metadata`` parameter to the 8047330f729Sjoerg ``@llvm.gcroot`` intrinsic. 8057330f729Sjoerg 8067330f729SjoergAlso, for the function as a whole: 8077330f729Sjoerg 8087330f729Sjoerg* ``getFrameSize()``: The overall size of the function's initial stack frame, 8097330f729Sjoerg not accounting for any dynamic allocation. 8107330f729Sjoerg 8117330f729Sjoerg* ``roots_size()``: The count of roots in the function. 8127330f729Sjoerg 8137330f729SjoergTo access the stack map, use ``GCFunctionMetadata::roots_begin()`` and 8147330f729Sjoerg-``end()`` from the :ref:`GCMetadataPrinter <assembly>`: 8157330f729Sjoerg 8167330f729Sjoerg.. code-block:: c++ 8177330f729Sjoerg 8187330f729Sjoerg for (iterator I = begin(), E = end(); I != E; ++I) { 8197330f729Sjoerg GCFunctionInfo *FI = *I; 8207330f729Sjoerg unsigned FrameSize = FI->getFrameSize(); 8217330f729Sjoerg size_t RootCount = FI->roots_size(); 8227330f729Sjoerg 8237330f729Sjoerg for (GCFunctionInfo::roots_iterator RI = FI->roots_begin(), 8247330f729Sjoerg RE = FI->roots_end(); 8257330f729Sjoerg RI != RE; ++RI) { 8267330f729Sjoerg int RootNum = RI->Num; 8277330f729Sjoerg int RootStackOffset = RI->StackOffset; 8287330f729Sjoerg Constant *RootMetadata = RI->Metadata; 8297330f729Sjoerg } 8307330f729Sjoerg } 8317330f729Sjoerg 8327330f729SjoergIf the ``llvm.gcroot`` intrinsic is eliminated before code generation by a 8337330f729Sjoergcustom lowering pass, LLVM will compute an empty stack map. This may be useful 8347330f729Sjoergfor collector plugins which implement reference counting or a shadow stack. 8357330f729Sjoerg 8367330f729Sjoerg.. _init-roots: 8377330f729Sjoerg 8387330f729SjoergInitializing roots to null 8397330f729Sjoerg--------------------------- 8407330f729Sjoerg 8417330f729SjoergIt is recommended that frontends initialize roots explicitly to avoid 8427330f729Sjoergpotentially confusing the optimizer. This prevents the GC from visiting 8437330f729Sjoerguninitialized pointers, which will almost certainly cause it to crash. 8447330f729Sjoerg 8457330f729SjoergAs a fallback, LLVM will automatically initialize each root to ``null`` 8467330f729Sjoergupon entry to the function. Support for this mode in code generation is 8477330f729Sjoerglargely a legacy detail to keep old collector implementations working. 8487330f729Sjoerg 8497330f729SjoergCustom lowering of intrinsics 8507330f729Sjoerg------------------------------ 8517330f729Sjoerg 8527330f729SjoergFor GCs which use barriers or unusual treatment of stack roots, the 8537330f729Sjoergimplementor is responsibly for providing a custom pass to lower the 8547330f729Sjoergintrinsics with the desired semantics. If you have opted in to custom 8557330f729Sjoerglowering of a particular intrinsic your pass **must** eliminate all 8567330f729Sjoerginstances of the corresponding intrinsic in functions which opt in to 8577330f729Sjoergyour GC. The best example of such a pass is the ShadowStackGC and it's 8587330f729SjoergShadowStackGCLowering pass. 8597330f729Sjoerg 8607330f729SjoergThere is currently no way to register such a custom lowering pass 8617330f729Sjoergwithout building a custom copy of LLVM. 8627330f729Sjoerg 8637330f729Sjoerg.. _safe-points: 8647330f729Sjoerg 8657330f729SjoergGenerating safe points 8667330f729Sjoerg----------------------- 8677330f729Sjoerg 8687330f729SjoergLLVM provides support for associating stackmaps with the return address of 8697330f729Sjoerga call. Any loop or return safepoints required by a given collector design 8707330f729Sjoergcan be modeled via calls to runtime routines, or potentially patchable call 8717330f729Sjoergsequences. Using gcroot, all call instructions are inferred to be possible 8727330f729Sjoergsafepoints and will thus have an associated stackmap. 8737330f729Sjoerg 8747330f729Sjoerg.. _assembly: 8757330f729Sjoerg 8767330f729SjoergEmitting assembly code: ``GCMetadataPrinter`` 8777330f729Sjoerg--------------------------------------------- 8787330f729Sjoerg 8797330f729SjoergLLVM allows a plugin to print arbitrary assembly code before and after the rest 8807330f729Sjoergof a module's assembly code. At the end of the module, the GC can compile the 8817330f729SjoergLLVM stack map into assembly code. (At the beginning, this information is not 8827330f729Sjoergyet computed.) 8837330f729Sjoerg 8847330f729SjoergSince AsmWriter and CodeGen are separate components of LLVM, a separate abstract 8857330f729Sjoergbase class and registry is provided for printing assembly code, the 8867330f729Sjoerg``GCMetadaPrinter`` and ``GCMetadataPrinterRegistry``. The AsmWriter will look 8877330f729Sjoergfor such a subclass if the ``GCStrategy`` sets ``UsesMetadata``: 8887330f729Sjoerg 8897330f729Sjoerg.. code-block:: c++ 8907330f729Sjoerg 8917330f729Sjoerg MyGC::MyGC() { 8927330f729Sjoerg UsesMetadata = true; 8937330f729Sjoerg } 8947330f729Sjoerg 8957330f729SjoergThis separation allows JIT-only clients to be smaller. 8967330f729Sjoerg 8977330f729SjoergNote that LLVM does not currently have analogous APIs to support code generation 8987330f729Sjoergin the JIT, nor using the object writers. 8997330f729Sjoerg 9007330f729Sjoerg.. code-block:: c++ 9017330f729Sjoerg 9027330f729Sjoerg // lib/MyGC/MyGCPrinter.cpp - Example LLVM GC printer 9037330f729Sjoerg 9047330f729Sjoerg #include "llvm/CodeGen/GCMetadataPrinter.h" 9057330f729Sjoerg #include "llvm/Support/Compiler.h" 9067330f729Sjoerg 9077330f729Sjoerg using namespace llvm; 9087330f729Sjoerg 9097330f729Sjoerg namespace { 9107330f729Sjoerg class LLVM_LIBRARY_VISIBILITY MyGCPrinter : public GCMetadataPrinter { 9117330f729Sjoerg public: 9127330f729Sjoerg virtual void beginAssembly(AsmPrinter &AP); 9137330f729Sjoerg 9147330f729Sjoerg virtual void finishAssembly(AsmPrinter &AP); 9157330f729Sjoerg }; 9167330f729Sjoerg 9177330f729Sjoerg GCMetadataPrinterRegistry::Add<MyGCPrinter> 9187330f729Sjoerg X("mygc", "My bespoke garbage collector."); 9197330f729Sjoerg } 9207330f729Sjoerg 9217330f729SjoergThe collector should use ``AsmPrinter`` to print portable assembly code. The 9227330f729Sjoergcollector itself contains the stack map for the entire module, and may access 9237330f729Sjoergthe ``GCFunctionInfo`` using its own ``begin()`` and ``end()`` methods. Here's 9247330f729Sjoerga realistic example: 9257330f729Sjoerg 9267330f729Sjoerg.. code-block:: c++ 9277330f729Sjoerg 9287330f729Sjoerg #include "llvm/CodeGen/AsmPrinter.h" 9297330f729Sjoerg #include "llvm/IR/Function.h" 9307330f729Sjoerg #include "llvm/IR/DataLayout.h" 9317330f729Sjoerg #include "llvm/Target/TargetAsmInfo.h" 9327330f729Sjoerg #include "llvm/Target/TargetMachine.h" 9337330f729Sjoerg 9347330f729Sjoerg void MyGCPrinter::beginAssembly(AsmPrinter &AP) { 9357330f729Sjoerg // Nothing to do. 9367330f729Sjoerg } 9377330f729Sjoerg 9387330f729Sjoerg void MyGCPrinter::finishAssembly(AsmPrinter &AP) { 9397330f729Sjoerg MCStreamer &OS = AP.OutStreamer; 9407330f729Sjoerg unsigned IntPtrSize = AP.getPointerSize(); 9417330f729Sjoerg 9427330f729Sjoerg // Put this in the data section. 9437330f729Sjoerg OS.SwitchSection(AP.getObjFileLowering().getDataSection()); 9447330f729Sjoerg 9457330f729Sjoerg // For each function... 9467330f729Sjoerg for (iterator FI = begin(), FE = end(); FI != FE; ++FI) { 9477330f729Sjoerg GCFunctionInfo &MD = **FI; 9487330f729Sjoerg 9497330f729Sjoerg // A compact GC layout. Emit this data structure: 9507330f729Sjoerg // 9517330f729Sjoerg // struct { 9527330f729Sjoerg // int32_t PointCount; 9537330f729Sjoerg // void *SafePointAddress[PointCount]; 9547330f729Sjoerg // int32_t StackFrameSize; // in words 9557330f729Sjoerg // int32_t StackArity; 9567330f729Sjoerg // int32_t LiveCount; 9577330f729Sjoerg // int32_t LiveOffsets[LiveCount]; 9587330f729Sjoerg // } __gcmap_<FUNCTIONNAME>; 9597330f729Sjoerg 9607330f729Sjoerg // Align to address width. 961*82d56013Sjoerg AP.emitAlignment(IntPtrSize == 4 ? 2 : 3); 9627330f729Sjoerg 9637330f729Sjoerg // Emit PointCount. 9647330f729Sjoerg OS.AddComment("safe point count"); 9657330f729Sjoerg AP.emitInt32(MD.size()); 9667330f729Sjoerg 9677330f729Sjoerg // And each safe point... 9687330f729Sjoerg for (GCFunctionInfo::iterator PI = MD.begin(), 9697330f729Sjoerg PE = MD.end(); PI != PE; ++PI) { 9707330f729Sjoerg // Emit the address of the safe point. 9717330f729Sjoerg OS.AddComment("safe point address"); 9727330f729Sjoerg MCSymbol *Label = PI->Label; 973*82d56013Sjoerg AP.emitLabelPlusOffset(Label/*Hi*/, 0/*Offset*/, 4/*Size*/); 9747330f729Sjoerg } 9757330f729Sjoerg 9767330f729Sjoerg // Stack information never change in safe points! Only print info from the 9777330f729Sjoerg // first call-site. 9787330f729Sjoerg GCFunctionInfo::iterator PI = MD.begin(); 9797330f729Sjoerg 9807330f729Sjoerg // Emit the stack frame size. 9817330f729Sjoerg OS.AddComment("stack frame size (in words)"); 9827330f729Sjoerg AP.emitInt32(MD.getFrameSize() / IntPtrSize); 9837330f729Sjoerg 9847330f729Sjoerg // Emit stack arity, i.e. the number of stacked arguments. 9857330f729Sjoerg unsigned RegisteredArgs = IntPtrSize == 4 ? 5 : 6; 9867330f729Sjoerg unsigned StackArity = MD.getFunction().arg_size() > RegisteredArgs ? 9877330f729Sjoerg MD.getFunction().arg_size() - RegisteredArgs : 0; 9887330f729Sjoerg OS.AddComment("stack arity"); 9897330f729Sjoerg AP.emitInt32(StackArity); 9907330f729Sjoerg 9917330f729Sjoerg // Emit the number of live roots in the function. 9927330f729Sjoerg OS.AddComment("live root count"); 9937330f729Sjoerg AP.emitInt32(MD.live_size(PI)); 9947330f729Sjoerg 9957330f729Sjoerg // And for each live root... 9967330f729Sjoerg for (GCFunctionInfo::live_iterator LI = MD.live_begin(PI), 9977330f729Sjoerg LE = MD.live_end(PI); 9987330f729Sjoerg LI != LE; ++LI) { 9997330f729Sjoerg // Emit live root's offset within the stack frame. 10007330f729Sjoerg OS.AddComment("stack index (offset / wordsize)"); 10017330f729Sjoerg AP.emitInt32(LI->StackOffset); 10027330f729Sjoerg } 10037330f729Sjoerg } 10047330f729Sjoerg } 10057330f729Sjoerg 10067330f729SjoergReferences 10077330f729Sjoerg========== 10087330f729Sjoerg 10097330f729Sjoerg.. _appel89: 10107330f729Sjoerg 10117330f729Sjoerg[Appel89] Runtime Tags Aren't Necessary. Andrew W. Appel. Lisp and Symbolic 10127330f729SjoergComputation 19(7):703-705, July 1989. 10137330f729Sjoerg 10147330f729Sjoerg.. _goldberg91: 10157330f729Sjoerg 10167330f729Sjoerg[Goldberg91] Tag-free garbage collection for strongly typed programming 10177330f729Sjoerglanguages. Benjamin Goldberg. ACM SIGPLAN PLDI'91. 10187330f729Sjoerg 10197330f729Sjoerg.. _tolmach94: 10207330f729Sjoerg 10217330f729Sjoerg[Tolmach94] Tag-free garbage collection using explicit type parameters. Andrew 10227330f729SjoergTolmach. Proceedings of the 1994 ACM conference on LISP and functional 10237330f729Sjoergprogramming. 10247330f729Sjoerg 10257330f729Sjoerg.. _henderson02: 10267330f729Sjoerg 10277330f729Sjoerg[Henderson2002] `Accurate Garbage Collection in an Uncooperative Environment 10287330f729Sjoerg<http://citeseer.ist.psu.edu/henderson02accurate.html>`__ 1029