#
6641d29b |
| 12-Oct-2021 |
Lang Hames <lhames@gmail.com> |
Revert "[JITLink][ORC] Major JITLinkMemoryManager refactor."
This reverts commit e50aea58d59c8cfae807a7fee21c4227472c0678 while I investigate bot failures.
|
#
e50aea58 |
| 11-Oct-2021 |
Lang Hames <lhames@gmail.com> |
[JITLink][ORC] Major JITLinkMemoryManager refactor.
This commit substantially refactors the JITLinkMemoryManager API to: (1) add asynchronous versions of key operations, (2) give memory manager impl
[JITLink][ORC] Major JITLinkMemoryManager refactor.
This commit substantially refactors the JITLinkMemoryManager API to: (1) add asynchronous versions of key operations, (2) give memory manager implementations full control over link graph address layout, (3) enable more efficient tracking of allocated memory, and (4) support "allocation actions" and finalize-lifetime memory.
Together these changes provide a more usable API, and enable more powerful and efficient memory manager implementations.
To support these changes the JITLinkMemoryManager::Allocation inner class has been split into two new classes: InFlightAllocation, and FinalizedAllocation. The allocate method returns an InFlightAllocation that tracks memory (both working and executor memory) prior to finalization. The finalize method returns a FinalizedAllocation object, and the InFlightAllocation is discarded. Breaking Allocation into InFlightAllocation and FinalizedAllocation allows InFlightAllocation subclassses to be written more naturally, and FinalizedAlloc to be implemented and used efficiently (see (3) below).
In addition to the memory manager changes this commit also introduces a new MemProt type to represent memory protections (MemProt replaces use of sys::Memory::ProtectionFlags in JITLink), and a new MemDeallocPolicy type that can be used to indicate when a section should be deallocated (see (4) below).
Plugin/pass writers who were using sys::Memory::ProtectionFlags will have to switch to MemProt -- this should be straightworward. Clients with out-of-tree memory managers will need to update their implementations. Clients using in-tree memory managers should mostly be able to ignore it.
Major features:
(1) More asynchrony:
The allocate and deallocate methods are now asynchronous by default, with synchronous convenience wrappers supplied. The asynchronous versions allow clients (including JITLink) to request and deallocate memory without blocking.
(2) Improved control over graph address layout:
Instead of a SegmentRequestMap, JITLinkMemoryManager::allocate now takes a reference to the LinkGraph to be allocated. The memory manager is responsible for calculating the memory requirements for the graph, and laying out the graph (setting working and executor memory addresses) within the allocated memory. This gives memory managers full control over JIT'd memory layout. For clients that don't need or want this degree of control the new "BasicLayout" utility can be used to get a segment-based view of the graph, similar to the one provided by SegmentRequestMap. Once segment addresses are assigned the BasicLayout::apply method can be used to automatically lay out the graph.
(3) Efficient tracking of allocated memory.
The FinalizedAlloc type is a wrapper for an ExecutorAddr and requires only 64-bits to store in the controller. The meaning of the address held by the FinalizedAlloc is left up to the memory manager implementation, but the FinalizedAlloc type enforces a requirement that deallocate be called on any non-default values prior to destruction. The deallocate method takes a vector<FinalizedAlloc>, allowing for bulk deallocation of many allocations in a single call.
Memory manager implementations will typically store the address of some allocation metadata in the executor in the FinalizedAlloc, as holding this metadata in the executor is often cheaper and may allow for clean deallocation even in failure cases where the connection with the controller is lost.
(4) Support for "allocation actions" and finalize-lifetime memory.
Allocation actions are pairs (finalize_act, deallocate_act) of JITTargetAddress triples (fn, arg_buffer_addr, arg_buffer_size), that can be attached to a finalize request. At finalization time, after memory protections have been applied, each of the "finalize_act" elements will be called in order (skipping any elements whose fn value is zero) as
((char*(*)(const char *, size_t))fn)((const char *)arg_buffer_addr, (size_t)arg_buffer_size);
At deallocation time the deallocate elements will be run in reverse order (again skipping any elements where fn is zero).
The returned char * should be null to indicate success, or a non-null heap-allocated string error message to indicate failure.
These actions allow finalization and deallocation to be extended to include operations like registering and deregistering eh-frames, TLS sections, initializer and deinitializers, and language metadata sections. Previously these operations required separate callWrapper invocations. Compared to callWrapper invocations, actions require no extra IPC/RPC, reducing costs and eliminating a potential source of errors.
Finalize lifetime memory can be used to support finalize actions: Sections with finalize lifetime should be destroyed by memory managers immediately after finalization actions have been run. Finalize memory can be used to support finalize actions (e.g. with extra-metadata, or synthesized finalize actions) without incurring permanent memory overhead.
show more ...
|
#
4d7cea3d |
| 10-Oct-2021 |
Lang Hames <lhames@gmail.com> |
[ORC] Add optional RunPolicy to ExecutorProcessControl::callWrapperAsync.
The callWrapperAsync and callSPSWrapperAsync methods take a handler object that is run on the return value of the call when
[ORC] Add optional RunPolicy to ExecutorProcessControl::callWrapperAsync.
The callWrapperAsync and callSPSWrapperAsync methods take a handler object that is run on the return value of the call when it is ready. The new RunPolicy parameters allow clients to control how these handlers are run. If no policy is specified then the handler will be packaged as a GenericNamedTask and dispatched using the ExecutorProcessControl's TaskDispatch member. Callers can use the ExecutorProcessControl::RunInPlace policy to cause the handler to be run directly instead, which may be preferrable for simple handlers, or they can write their own policy object (e.g. to dispatch as some other kind of Task, rather than GenericNamedTask).
show more ...
|
#
f3411616 |
| 09-Oct-2021 |
Lang Hames <lhames@gmail.com> |
[ORC] Add TaskDispatch API and thread it through ExecutorProcessControl.
ExecutorProcessControl objects will now have a TaskDispatcher member which should be used to dispatch work (in particular, ha
[ORC] Add TaskDispatch API and thread it through ExecutorProcessControl.
ExecutorProcessControl objects will now have a TaskDispatcher member which should be used to dispatch work (in particular, handling incoming packets in the implementation of remote EPC implementations like SimpleRemoteEPC).
The GenericNamedTask template can be used to wrap function objects that are callable as 'void()' (along with an optional name to describe the task). The makeGenericNamedTask functions can be used to create GenericNamedTask instances without having to name the function object type.
In a future patch ExecutionSession will be updated to use the ExecutorProcessControl's dispatcher, instead of its DispatchTaskFunction.
show more ...
|
#
da7f993a |
| 10-Oct-2021 |
Lang Hames <lhames@gmail.com> |
[ORC] Reorder callWrapperAsync and callSPSWrapperAsync parameters.
The callee address is now the first parameter and the 'SendResult' function the second. This change improves consistentency with th
[ORC] Reorder callWrapperAsync and callSPSWrapperAsync parameters.
The callee address is now the first parameter and the 'SendResult' function the second. This change improves consistentency with the non-async functions where the callee is the first address and the return value the second.
show more ...
|
#
21a06254 |
| 27-Sep-2021 |
Lang Hames <lhames@gmail.com> |
[ORC] Switch from JITTargetAddress to ExecutorAddr for EPC-call APIs.
Part of the ongoing move to ExecutorAddr.
|
#
6fe2e9a9 |
| 27-Sep-2021 |
Lang Hames <lhames@gmail.com> |
[ORC] Hold shared_ptr<SymbolStringPool> in errors containing SymbolStringPtrs.
This allows these error values to remain valid, even if they tear down the JIT itself.
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
#
ef391df2 |
| 23-Sep-2021 |
Lang Hames <lhames@gmail.com> |
[ORC] Rename ExecutorAddress to ExecutorAddr.
Removing the 'ess' suffix improves the ergonomics without sacrificing clarity. Since this class is likely to be used more frequently in the future it's
[ORC] Rename ExecutorAddress to ExecutorAddr.
Removing the 'ess' suffix improves the ergonomics without sacrificing clarity. Since this class is likely to be used more frequently in the future it's worth some short term pain to fix this now.
show more ...
|
Revision tags: llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2 |
|
#
4290d0fe |
| 20-Aug-2021 |
Lang Hames <lhames@gmail.com> |
[ORC] Add 'Async' suffix to ExecutorProcessControl::MemoryAccess methods.
This prevents the async methods (which shoud be overridden by subclasses) from hiding the blocking helper methods, avoiding
[ORC] Add 'Async' suffix to ExecutorProcessControl::MemoryAccess methods.
This prevents the async methods (which shoud be overridden by subclasses) from hiding the blocking helper methods, avoiding a lot of 'using MemoryAccess::...' boilerplate.
show more ...
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init |
|
#
2487db1f |
| 27-Jul-2021 |
Lang Hames <lhames@gmail.com> |
[ORC] Require ExecutorProcessControl when constructing an ExecutionSession.
Wrapper function call and dispatch handler helpers are moved to ExecutionSession, and existing EPC-based tools are re-writ
[ORC] Require ExecutorProcessControl when constructing an ExecutionSession.
Wrapper function call and dispatch handler helpers are moved to ExecutionSession, and existing EPC-based tools are re-written to take an ExecutionSession argument instead.
Requiring an ExecutorProcessControl instance simplifies existing EPC based utilities (which only need to take an ES now), and should encourage more utilities to use the EPC interface. It also simplifies process termination, since the session can automatically call ExecutorProcessControl::disconnect (previously this had to be done manually, and carefully ordered with the rest of JIT tear-down to work correctly).
show more ...
|
#
bb5f97e3 |
| 14-Jul-2021 |
Lang Hames <lhames@gmail.com> |
[ORC][ORC-RT] Introduce ORC-runtime based MachO-Platform.
Adds support for MachO static initializers/deinitializers and eh-frame registration via the ORC runtime.
This commit introduces cooperative
[ORC][ORC-RT] Introduce ORC-runtime based MachO-Platform.
Adds support for MachO static initializers/deinitializers and eh-frame registration via the ORC runtime.
This commit introduces cooperative support code into the ORC runtime and ORC LLVM libraries (especially the MachOPlatform class) to support macho runtime features for JIT'd code. This commit introduces support for static initializers, static destructors (via cxa_atexit interposition), and eh-frame registration. Near-future commits will add support for MachO native thread-local variables, and language runtime registration (e.g. for Objective-C and Swift).
The llvm-jitlink tool is updated to use the ORC runtime where available, and regression tests for the new MachOPlatform support are added to compiler-rt.
Notable changes on the ORC runtime side:
1. The new macho_platform.h / macho_platform.cpp files contain the bulk of the runtime-side support. This includes eh-frame registration; jit versions of dlopen, dlsym, and dlclose; a cxa_atexit interpose to record static destructors, and an '__orc_rt_macho_run_program' function that defines running a JIT'd MachO program in terms of the jit- dlopen/dlsym/dlclose functions.
2. Replaces JITTargetAddress (and casting operations) with ExecutorAddress (copied from LLVM) to improve type-safety of address management.
3. Adds serialization support for ExecutorAddress and unordered_map types to the runtime-side Simple Packed Serialization code.
4. Adds orc-runtime regression tests to ensure that static initializers and cxa-atexit interposes work as expected.
Notable changes on the LLVM side:
1. The MachOPlatform class is updated to:
1.1. Load the ORC runtime into the ExecutionSession. 1.2. Set up standard aliases for macho-specific runtime functions. E.g. ___cxa_atexit -> ___orc_rt_macho_cxa_atexit. 1.3. Install the MachOPlatformPlugin to scrape LinkGraphs for information needed to support MachO features (e.g. eh-frames, mod-inits), and communicate this information to the runtime. 1.4. Provide entry-points that the runtime can call to request initializers, perform symbol lookup, and request deinitialiers (the latter is implemented as an empty placeholder as macho object deinits are rarely used). 1.5. Create a MachO header object for each JITDylib (defining the __mh_header and __dso_handle symbols).
2. The llvm-jitlink tool (and llvm-jitlink-executor) are updated to use the runtime when available.
3. A `lookupInitSymbolsAsync` method is added to the Platform base class. This can be used to issue an async lookup for initializer symbols. The existing `lookupInitSymbols` method is retained (the GenericIRPlatform code is still using it), but is deprecated and will be removed soon.
4. JIT-dispatch support code is added to ExecutorProcessControl.
The JIT-dispatch system allows handlers in the JIT process to be associated with 'tag' symbols in the executor, and allows the executor to make remote procedure calls back to the JIT process (via __orc_rt_jit_dispatch) using those tags.
The primary use case is ORC runtime code that needs to call bakc to handlers in orc::Platform subclasses. E.g. __orc_rt_macho_jit_dlopen calling back to MachOPlatform::rt_getInitializers using __orc_rt_macho_get_initializers_tag. (The system is generic however, and could be used by non-runtime code).
The new ExecutorProcessControl::JITDispatchInfo struct provides the address (in the executor) of the jit-dispatch function and a jit-dispatch context object, and implementations of the dispatch function are added to SelfExecutorProcessControl and OrcRPCExecutorProcessControl.
5. OrcRPCTPCServer is updated to support JIT-dispatch calls over ORC-RPC.
6. Serialization support for StringMap is added to the LLVM-side Simple Packed Serialization code.
7. A JITLink::allocateBuffer operation is introduced to allocate writable memory attached to the graph. This is used by the MachO header synthesis code, and will be generically useful for other clients who want to create new graph content from scratch.
show more ...
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3 |
|
#
39f64c4c |
| 19-Jun-2021 |
Lang Hames <lhames@gmail.com> |
[ORC] Add wrapper-function support methods to ExecutorProcessControl.
Adds support for both synchronous and asynchronous calls to wrapper functions using SPS (Simple Packed Serialization). Also adds
[ORC] Add wrapper-function support methods to ExecutorProcessControl.
Adds support for both synchronous and asynchronous calls to wrapper functions using SPS (Simple Packed Serialization). Also adds support for wrapping functions on the JIT side in SPS-based wrappers that can be called from the executor.
These new methods simplify calls between the JIT and Executor, and will be used in upcoming ORC runtime patches to enable communication between ORC and the runtime.
show more ...
|
#
662c5544 |
| 01-Jul-2021 |
Lang Hames <lhames@gmail.com> |
[ORC] Rename TargetProcessControl to ExecutorProcessControl. NFC.
This is a first step towards consistently using the term 'executor' for the process that executes JIT'd code. I've opted for 'execut
[ORC] Rename TargetProcessControl to ExecutorProcessControl. NFC.
This is a first step towards consistently using the term 'executor' for the process that executes JIT'd code. I've opted for 'executor' as the preferred term over 'target' as target is already heavily overloaded ("the target machine for the executor" is much clearer than "the target machine for the target").
show more ...
|