===============================
ORC Design and Implementation
===============================

.. contents::
   :local:

Introduction
============

This document aims to provide a high-level overview of the design and
implementation of the ORC JIT APIs. Except where otherwise stated, all discussion
refers to the modern ORCv2 APIs (available since LLVM 7). Clients wishing to
transition from ORCv1 should see Section :ref:`transitioning_orcv1_to_orcv2`.

Use-cases
=========

ORC provides a modular API for building JIT compilers. There are a number
of use cases for such an API. For example:

1. The LLVM tutorials use a simple ORC-based JIT class to execute expressions
   compiled from a toy language: Kaleidoscope.

2. The LLVM debugger, LLDB, uses a cross-compiling JIT for expression
   evaluation. In this use case, cross compilation allows expressions compiled
   in the debugger process to be executed on the debug target process, which may
   be on a different device/architecture.

3. High-performance JITs (e.g. JVMs, Julia) want to make use of LLVM's
   optimizations within an existing JIT infrastructure.

4. Interpreters and REPLs, e.g. Cling (C++) and the Swift interpreter.

By adopting a modular, library-based design we aim to make ORC useful in as many
of these contexts as possible.

Features
========

ORC provides the following features:

**JIT-linking**
  ORC provides APIs to link relocatable object files (COFF, ELF, MachO) [1]_
  into a target process at runtime. The target process may be the same process
  that contains the JIT session object and jit-linker, or may be another process
  (even one running on a different machine or architecture) that communicates
  with the JIT via RPC.

**LLVM IR compilation**
  ORC provides off-the-shelf components (IRCompileLayer, SimpleCompiler,
  ConcurrentIRCompiler) that make it easy to add LLVM IR to a JIT'd process.

**Eager and lazy compilation**
  By default, ORC will compile symbols as soon as they are looked up in the JIT
  session object (``ExecutionSession``). Compiling eagerly by default makes it
  easy to use ORC as an in-memory compiler for an existing JIT (similar to how
  MCJIT is commonly used). However, ORC also provides built-in support for lazy
  compilation via lazy-reexports (see :ref:`Laziness`).

**Support for Custom Compilers and Program Representations**
  Clients can supply custom compilers for each symbol that they define in their
  JIT session. ORC will run the user-supplied compiler when a definition of
  a symbol is needed. ORC is actually fully language agnostic: LLVM IR is not
  treated specially, and is supported via the same wrapper mechanism (the
  ``MaterializationUnit`` class) that is used for custom compilers.

**Concurrent JIT'd code** and **Concurrent Compilation**
  JIT'd code may be executed in multiple threads, may spawn new threads, and may
  re-enter ORC (e.g. to request lazy compilation) concurrently from multiple
  threads. Compilers launched by ORC can run concurrently (provided the client
  sets up an appropriate dispatcher). Built-in dependency tracking ensures that
  ORC does not release pointers to JIT'd code or data until all dependencies
  have also been JIT'd and they are safe to call or use.

**Removable Code**
  Resources for JIT'd program representations

**Orthogonality** and **Composability**
  Each of the features above can be used independently. It is possible to put
  ORC components together to make a non-lazy, in-process, single threaded JIT
  or a lazy, out-of-process, concurrent JIT, or anything in between.

LLJIT and LLLazyJIT
===================

ORC provides two basic JIT classes off-the-shelf. These are useful both as
examples of how to assemble ORC components to make a JIT, and as replacements
for earlier LLVM JIT APIs (e.g. MCJIT).

The LLJIT class uses an IRCompileLayer and RTDyldObjectLinkingLayer to support
compilation of LLVM IR and linking of relocatable object files. All operations
are performed eagerly on symbol lookup (i.e. a symbol's definition is compiled
as soon as you attempt to look up its address). LLJIT is a suitable replacement
for MCJIT in most cases (note: some more advanced features, e.g.
JITEventListeners, are not supported yet).

The LLLazyJIT class extends LLJIT and adds a CompileOnDemandLayer to enable lazy
compilation of LLVM IR. When an LLVM IR module is added via the addLazyIRModule
method, function bodies in that module will not be compiled until they are first
called. LLLazyJIT aims to provide a replacement for LLVM's original (pre-MCJIT)
JIT API.

LLJIT and LLLazyJIT instances can be created using their respective builder
classes: LLJITBuilder and LLLazyJITBuilder. For example, assuming you have a
module ``M`` loaded on a ThreadSafeContext ``Ctx``:

.. code-block:: c++

  // Try to detect the host arch and construct an LLJIT instance.
  // create() returns an Expected<std::unique_ptr<LLJIT>>.
  auto JIT = LLJITBuilder().create();

  // If we could not construct an instance, return an error.
  if (!JIT)
    return JIT.takeError();

  // Add the module.
  if (auto Err = (*JIT)->addIRModule(ThreadSafeModule(std::move(M), Ctx)))
    return Err;

  // Look up the JIT'd code entry point.
  auto EntrySym = (*JIT)->lookup("entry");
  if (!EntrySym)
    return EntrySym.takeError();

  // Cast the entry point address to a function pointer.
  auto *Entry = (void(*)())EntrySym->getAddress();

  // Call into JIT'd code.
  Entry();

The builder classes provide a number of configuration options that can be
specified before the JIT instance is constructed. For example:

.. code-block:: c++

  // Build an LLLazyJIT instance that uses four worker threads for compilation,
  // and jumps to a specific error handler (rather than null) on lazy compile
  // failures.

  void handleLazyCompileFailure() {
    // JIT'd code will jump here if lazy compilation fails, giving us an
    // opportunity to exit or throw an exception into JIT'd code.
    throw JITFailed();
  }

  auto JIT = LLLazyJITBuilder()
               .setNumCompileThreads(4)
               .setLazyCompileFailureAddr(
                   pointerToJITTargetAddress(&handleLazyCompileFailure))
               .create();

  // ...

For users wanting to get started with LLJIT, a minimal example program can be
found at ``llvm/examples/HowToUseLLJIT``.

Design Overview
===============

ORC's JIT program model aims to emulate the linking and symbol resolution
rules used by the static and dynamic linkers. This allows ORC to JIT
arbitrary LLVM IR, including IR produced by an ordinary static compiler (e.g.
clang) that uses constructs like symbol linkage and visibility, and weak [3]_
and common symbol definitions.

To see how this works, imagine a program ``myapp`` which links against a pair
of dynamic libraries: ``libA`` and ``libB``. On the command line, building this
program might look like:

.. code-block:: bash

  $ clang++ -shared -o libA.dylib a1.cpp a2.cpp
  $ clang++ -shared -o libB.dylib b1.cpp b2.cpp
  $ clang++ -o myapp myapp.cpp -L. -lA -lB
  $ ./myapp

In ORC, this would translate into API calls on a hypothetical CXXCompileLayer
(with error checking omitted for brevity) as:

.. code-block:: c++

  ExecutionSession ES;
  RTDyldObjectLinkingLayer ObjLinkingLayer(
      ES, []() { return std::make_unique<SectionMemoryManager>(); });
  CXXCompileLayer CXXLayer(ES, ObjLinkingLayer);

  // Create JITDylib "A" and add code to it using the CXX layer.
  auto &LibA = ES.createJITDylib("A");
  CXXLayer.add(LibA, MemoryBuffer::getFile("a1.cpp"));
  CXXLayer.add(LibA, MemoryBuffer::getFile("a2.cpp"));

  // Create JITDylib "B" and add code to it using the CXX layer.
  auto &LibB = ES.createJITDylib("B");
  CXXLayer.add(LibB, MemoryBuffer::getFile("b1.cpp"));
  CXXLayer.add(LibB, MemoryBuffer::getFile("b2.cpp"));

  // Create and specify the search order for the main JITDylib. This is
  // equivalent to a "links against" relationship in a command-line link.
  auto &MainJD = ES.createJITDylib("main");
  MainJD.addToLinkOrder(LibA);
  MainJD.addToLinkOrder(LibB);
  CXXLayer.add(MainJD, MemoryBuffer::getFile("main.cpp"));

  // Look up the JIT'd main, cast it to a function pointer, then call it.
  auto MainSym = ExitOnErr(ES.lookup({&MainJD}, "main"));
  auto *Main = (int(*)(int, char*[]))MainSym.getAddress();

  int Result = Main(...);

This example tells us nothing about *how* or *when* compilation will happen.
That will depend on the implementation of the hypothetical CXXCompileLayer.
The same linker-based symbol resolution rules will apply regardless of that
implementation, however. For example, if a1.cpp and a2.cpp both define a
function "foo" then ORCv2 will generate a duplicate definition error. On the
other hand, if a1.cpp and b1.cpp both define "foo" there is no error (different
dynamic libraries may define the same symbol). If main.cpp refers to "foo", it
should bind to the definition in LibA rather than the one in LibB, since
main.cpp is part of the "main" dylib, and the main dylib links against LibA
before LibB.

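The resolution rules just described can be sketched with a small toy model.
This is not ORC code: ``ToyDylib`` and its members are hypothetical names,
addresses are plain integers, and only the standard library is used. It is
meant to illustrate the two rules (duplicate definitions within one dylib are
an error, and the link order decides which definition wins), assuming a
simple non-transitive search:

.. code-block:: c++

  #include <cassert>
  #include <map>
  #include <optional>
  #include <stdexcept>
  #include <string>
  #include <vector>

  // Toy stand-in for a JITDylib: a named symbol table plus a link order.
  struct ToyDylib {
    std::string Name;
    std::map<std::string, int> Symbols; // symbol name -> fake address
    std::vector<ToyDylib *> LinkOrder;  // dylibs searched after this one

    // Defining the same symbol twice in one dylib is an error.
    void define(const std::string &Sym, int Addr) {
      if (!Symbols.emplace(Sym, Addr).second)
        throw std::runtime_error("duplicate definition of " + Sym + " in " +
                                 Name);
    }

    // Search this dylib first, then each dylib in the link order, in order.
    std::optional<int> lookup(const std::string &Sym) const {
      auto I = Symbols.find(Sym);
      if (I != Symbols.end())
        return I->second;
      for (const ToyDylib *JD : LinkOrder) {
        auto J = JD->Symbols.find(Sym);
        if (J != JD->Symbols.end())
          return J->second;
      }
      return std::nullopt;
    }
  };

With ``foo`` defined in both a toy ``LibA`` and a toy ``LibB``, and a toy main
dylib whose link order is ``{&LibA, &LibB}``, a lookup of ``foo`` through the
main dylib yields LibA's address, while a second ``define`` of ``foo`` in LibA
throws.
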
Many JIT clients will have no need for this strict adherence to the usual
ahead-of-time linking rules, and should be able to get by just fine by putting
all of their code in a single JITDylib. However, clients who want to JIT code
for languages/projects that traditionally rely on ahead-of-time linking (e.g.
C++) will find that this feature makes life much easier.

Symbol lookup in ORC serves two other important functions, beyond providing
addresses for symbols: (1) It triggers compilation of the symbol(s) searched for
(if they have not been compiled already), and (2) it provides the
synchronization mechanism for concurrent compilation. The pseudo-code for the
lookup process is:

.. code-block:: none

  construct a query object from a query set and query handler
  lock the session
  lodge query against requested symbols, collect required materializers (if any)
  unlock the session
  dispatch materializers (if any)

In this context a materializer is something that provides a working definition
of a symbol upon request. Usually materializers are just wrappers for compilers,
but they may also wrap a jit-linker directly (if the program representation
backing the definitions is an object file), or may even be a class that writes
bits directly into memory (for example, if the definitions are
stubs). Materialization is the blanket term for any actions (compiling, linking,
splatting bits, registering with runtimes, etc.) that are required to generate a
symbol definition that is safe to call or access.

As each materializer completes its work it notifies the JITDylib, which in turn
notifies any query objects that are waiting on the newly materialized
definitions. Each query object maintains a count of the number of symbols that
it is still waiting on, and once this count reaches zero the query object calls
the query handler with a *SymbolMap* (a map of symbol names to addresses)
describing the result. If any symbol fails to materialize the query immediately
calls the query handler with an error.

The collected materialization units are sent to the ExecutionSession to be
dispatched, and the dispatch behavior can be set by the client. By default each
materializer is run on the calling thread. Clients are free to create new
threads to run materializers, or to send the work to a work queue for a thread
pool (this is what LLJIT/LLLazyJIT do).

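As a rough illustration of the lookup pseudo-code and the notification scheme
described above, here is a single-threaded toy model. All names
(``ToySession``, ``ToyQuery``, ``notifyEmitted``) are hypothetical and do not
correspond to ORC classes; addresses are plain integers, materializers run on
the calling thread, and error handling and locking are left out:

.. code-block:: c++

  #include <cassert>
  #include <functional>
  #include <map>
  #include <memory>
  #include <set>
  #include <string>
  #include <utility>
  #include <vector>

  using ToySymbolMap = std::map<std::string, int>; // name -> fake address

  // A query waits on a set of symbols and fires its handler once all are ready.
  struct ToyQuery {
    std::set<std::string> Pending;
    ToySymbolMap Result;
    std::function<void(const ToySymbolMap &)> Handler;
  };

  struct ToySession {
    ToySymbolMap Ready;                                        // materialized
    std::map<std::string, std::function<int()>> Materializers; // not yet run
    std::map<std::string, std::vector<std::shared_ptr<ToyQuery>>> Waiting;

    // Called when a definition becomes available: notify waiting queries.
    void notifyEmitted(const std::string &Name, int Addr) {
      Ready[Name] = Addr;
      auto Queries = std::move(Waiting[Name]);
      Waiting.erase(Name);
      for (auto &Q : Queries) {
        Q->Result[Name] = Addr;
        Q->Pending.erase(Name);
        if (Q->Pending.empty())
          Q->Handler(Q->Result);
      }
    }

    void lookup(const std::vector<std::string> &Names,
                std::function<void(const ToySymbolMap &)> Handler) {
      auto Q = std::make_shared<ToyQuery>();
      Q->Handler = std::move(Handler);
      std::vector<std::pair<std::string, std::function<int()>>> ToRun;

      // A concurrent implementation would lock the session here.
      for (const auto &Name : Names) {
        auto R = Ready.find(Name);
        if (R != Ready.end()) {
          Q->Result[Name] = R->second;
          continue;
        }
        if (!Q->Pending.insert(Name).second)
          continue; // name requested twice in the same query
        Waiting[Name].push_back(Q);
        auto M = Materializers.find(Name);
        if (M != Materializers.end()) {
          ToRun.emplace_back(Name, std::move(M->second));
          Materializers.erase(M);
        }
      }
      // ...and unlock it here, before dispatching any materializers.

      if (Q->Pending.empty()) {
        Q->Handler(Q->Result);
        return;
      }
      for (auto &[Name, Materialize] : ToRun)
        notifyEmitted(Name, Materialize());
    }
  };

In this sketch each materializer runs at most once: a second lookup of an
already materialized symbol is answered straight from ``Ready``. Replacing the
final loop with a hand-off to a thread pool is the toy equivalent of changing
the dispatch behavior.
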
Top Level APIs
==============

Many of ORC's top-level APIs are visible in the example above:

- *ExecutionSession* represents the JIT'd program and provides context for the
  JIT: It contains the JITDylibs, error reporting mechanisms, and dispatches the
  materializers.

- *JITDylibs* provide the symbol tables.

- *Layers* (ObjLinkingLayer and CXXLayer) are wrappers around compilers and
  allow clients to add uncompiled program representations supported by those
  compilers to JITDylibs.

Several other important APIs are used implicitly. JIT clients need not be aware
of them, but Layer authors will use them:

- *MaterializationUnit* - When XXXLayer::add is invoked it wraps the given
  program representation (in this example, C++ source) in a MaterializationUnit,
  which is then stored in the JITDylib. MaterializationUnits are responsible for
  describing the definitions they provide, and for unwrapping the program
  representation and passing it back to the layer when compilation is required
  (this ownership shuffle makes writing thread-safe layers easier, since the
  ownership of the program representation will be passed back on the stack,
  rather than having to be fished out of a Layer member, which would require
  synchronization).

- *MaterializationResponsibility* - When a MaterializationUnit hands a program
  representation back to the layer it comes with an associated
  MaterializationResponsibility object. This object tracks the definitions
  that must be materialized and provides a way to notify the JITDylib once they
  are either successfully materialized or a failure occurs.

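The "ownership shuffle" between a layer and its MaterializationUnits can be
sketched in plain C++. The classes below (``ToyLayer``, ``ToyUnit``) are
hypothetical stand-ins rather than ORC types, the "program representation" is
just a string, and "compiling" returns the source length as a fake address.
The point is only that the layer keeps no per-module state: the representation
travels with the unit and is handed back as an argument.

.. code-block:: c++

  #include <cassert>
  #include <memory>
  #include <string>
  #include <utility>

  struct ToyLayer;

  // Toy stand-in for a MaterializationUnit: it owns the program representation
  // (here a string of source text) until materialization is requested.
  class ToyUnit {
  public:
    ToyUnit(ToyLayer &L, std::string Symbol, std::unique_ptr<std::string> Source)
        : L(L), Symbol(std::move(Symbol)), Source(std::move(Source)) {}

    // The symbol this unit promises to define.
    const std::string &symbol() const { return Symbol; }

    // Hand ownership of the source back to the layer for compilation.
    int materialize();

  private:
    ToyLayer &L;
    std::string Symbol;
    std::unique_ptr<std::string> Source;
  };

  // Toy layer: emit() receives the program representation as an argument, so
  // there is no shared member to look it up in (and nothing to lock).
  struct ToyLayer {
    std::unique_ptr<ToyUnit> add(std::string Symbol, std::string Source) {
      return std::make_unique<ToyUnit>(
          *this, std::move(Symbol),
          std::make_unique<std::string>(std::move(Source)));
    }

    // "Compile" by returning the source length as a fake address.
    int emit(std::unique_ptr<std::string> Source) {
      return static_cast<int>(Source->size());
    }
  };

  inline int ToyUnit::materialize() { return L.emit(std::move(Source)); }

A real layer would also receive a responsibility object alongside the program
representation and use it to report success or failure; that part is omitted
here.
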
Absolute Symbols, Aliases, and Reexports
========================================

ORC makes it easy to define symbols with absolute addresses, or symbols that
are simply aliases of other symbols.

Absolute Symbols
----------------

Absolute symbols are symbols that map directly to addresses without requiring
further materialization, for example: "foo" = 0x1234. One use case for
absolute symbols is allowing resolution of process symbols. E.g.

.. code-block:: c++

  JD.define(absoluteSymbols(SymbolMap({
      { Mangle("printf"),
        { pointerToJITTargetAddress(&printf),
          JITSymbolFlags::Callable } }
    })));

With this mapping established, code added to the JIT can refer to printf
symbolically rather than requiring the address of printf to be "baked in".
This in turn allows cached versions of the JIT'd code (e.g. compiled objects)
to be re-used across JIT sessions as the JIT'd code no longer changes, only the
absolute symbol definition does.

For process and library symbols the DynamicLibrarySearchGenerator utility (see
:ref:`How to Add Process and Library Symbols to JITDylibs
<ProcessAndLibrarySymbols>`) can be used to automatically build absolute
symbol mappings for you. However, the absoluteSymbols function is still useful
for making non-global objects in your JIT visible to JIT'd code. For example,
imagine that your JIT standard library needs access to your JIT object to make
some calls. We could bake the address of your object into the library, but then
it would need to be recompiled for each session:

.. code-block:: c++

  // From standard library for JIT'd code:

  class MyJIT {
  public:
    void log(const char *Msg);
  };

  void log(const char *Msg) { ((MyJIT*)0x1234)->log(Msg); }

We can turn this into a symbolic reference in the JIT standard library:

.. code-block:: c++

  extern MyJIT *__MyJITInstance;

  void log(const char *Msg) { __MyJITInstance->log(Msg); }

And then make our JIT object visible to the JIT standard library with an
absolute symbol definition when the JIT is started:

.. code-block:: c++

  MyJIT J = ...;

  auto &JITStdLibJD = ... ;

  JITStdLibJD.define(absoluteSymbols(SymbolMap({
      { Mangle("__MyJITInstance"),
        { pointerToJITTargetAddress(&J), JITSymbolFlags() } }
    })));

Aliases and Reexports
---------------------

Aliases and reexports allow you to define new symbols that map to existing
symbols. This can be useful for changing linkage relationships between symbols
across sessions without having to recompile code. For example, imagine that
JIT'd code has access to a log function, ``void log(const char*)``, for which
there are two implementations in the JIT standard library: ``log_fast`` and
``log_detailed``. Your JIT can choose which one of these definitions will be
used when the ``log`` symbol is referenced by setting up an alias at JIT startup
time:

.. code-block:: c++

  auto &JITStdLibJD = ... ;

  auto LogImplementationSymbol =
    Verbose ? Mangle("log_detailed") : Mangle("log_fast");

  JITStdLibJD.define(
    symbolAliases(SymbolAliasMap({
        { Mangle("log"),
          { LogImplementationSymbol,
            JITSymbolFlags::Exported | JITSymbolFlags::Callable } }
      })));

The ``symbolAliases`` function allows you to define aliases within a single
JITDylib. The ``reexports`` function provides the same functionality, but
operates across JITDylib boundaries. E.g.

.. code-block:: c++

  auto &JD1 = ... ;
  auto &JD2 = ... ;

  // Make 'bar' in JD2 an alias for 'foo' from JD1.
  JD2.define(
    reexports(JD1, SymbolAliasMap({
        { Mangle("bar"), { Mangle("foo"), JITSymbolFlags::Exported } }
      })));

The reexports utility can be handy for composing a single JITDylib interface by
re-exporting symbols from several other JITDylibs.

.. _Laziness:

Laziness
========

Laziness in ORC is provided by a utility called "lazy reexports". A lazy
reexport is similar to a regular reexport or alias: It provides a new name for
an existing symbol. Unlike regular reexports, however, lookups of lazy reexports
do not trigger immediate materialization of the reexported symbol. Instead, they
only trigger materialization of a function stub. This function stub is
initialized to point at a *lazy call-through*, which provides reentry into the
JIT. If the stub is called at runtime then the lazy call-through will look up
the reexported symbol (triggering materialization for it if necessary), update
the stub (to call directly to the reexported symbol on subsequent calls), and
then return via the reexported symbol. By re-using the existing symbol lookup
mechanism, lazy reexports inherit the same concurrency guarantees: calls to lazy
reexports can be made from multiple threads concurrently, and the reexported
symbol can be in any state of compilation (uncompiled, already in the process of
being compiled, or already compiled) and the call will succeed. This allows
laziness to be safely mixed with features like remote compilation, concurrent
compilation, concurrent JIT'd code, and speculative compilation.

There is one other key difference between regular reexports and lazy reexports
that some clients must be aware of: The address of a lazy reexport will be
*different* from the address of the reexported symbol (whereas a regular
reexport is guaranteed to have the same address as the reexported symbol).
Clients who care about pointer equality will generally want to use the address
of the reexport as the canonical address of the reexported symbol. This will
allow the address to be taken without forcing materialization of the reexported
symbol.

Usage example:

If JITDylib ``JD`` contains definitions for symbols ``foo_body`` and
``bar_body``, we can create lazy entry points ``foo`` and ``bar`` in JITDylib
``JD2`` by calling:

.. code-block:: c++

  auto ReexportFlags = JITSymbolFlags::Exported | JITSymbolFlags::Callable;
  JD2.define(
    lazyReexports(CallThroughMgr, StubsMgr, JD,
                  SymbolAliasMap({
                    { Mangle("foo"), { Mangle("foo_body"), ReexportFlags } },
                    { Mangle("bar"), { Mangle("bar_body"), ReexportFlags } }
                  })));

A full example of how to use lazyReexports with the LLJIT class can be found at
``llvm-project/llvm/examples/LLJITExamples/LLJITWithLazyReexports``.

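The stub and call-through behavior described in this section can be modeled in
a few lines of ordinary C++. This is only a conceptual sketch: ``ToyLazyStub``
and ``fooBody`` are hypothetical names, the "stub" is an object holding a
function pointer rather than patched machine code, and unlike real lazy
reexports it is not safe to call from multiple threads.

.. code-block:: c++

  #include <cassert>
  #include <functional>
  #include <utility>

  // Toy model of a stub plus lazy call-through for functions of type int(int).
  class ToyLazyStub {
  public:
    using Fn = int (*)(int);

    // Resolver stands in for "look up the reexported symbol", which in ORC
    // would trigger materialization if necessary.
    explicit ToyLazyStub(std::function<Fn()> Resolver)
        : Resolver(std::move(Resolver)) {}

    int operator()(int Arg) {
      if (!Target)           // First call: take the call-through path,
        Target = Resolver(); // resolve once and update the stub.
      return Target(Arg);    // Later calls go straight to the body.
    }

    bool isResolved() const { return Target != nullptr; }

  private:
    std::function<Fn()> Resolver;
    Fn Target = nullptr;
  };

  // The "reexported symbol" that the stub lazily binds to.
  inline int fooBody(int X) { return X + 1; }

Constructing a ``ToyLazyStub`` does not run the resolver; only the first call
does. The stub object and ``fooBody`` also live at different addresses, which
mirrors the pointer-equality caveat above.
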
| 461 | Supporting Custom Compilers |
| 462 | =========================== |
| 463 | |
| 464 | TBD. |
| 465 | |
Florian Hahn | 35e461a | 2020-11-13 09:42:36 +0000 | [diff] [blame] | 466 | .. _transitioning_orcv1_to_orcv2: |
Lang Hames | 984e879 | 2020-11-12 10:05:43 +1100 | [diff] [blame] | 467 | |
Lang Hames | a13cca4 | 2019-07-15 15:36:37 +0000 | [diff] [blame] | 468 | Transitioning from ORCv1 to ORCv2 |
| 469 | ================================= |
Lang Hames | 5f36a28 | 2019-05-18 03:08:49 +0000 | [diff] [blame] | 470 | |
Lang Hames | 607cd44 | 2019-07-16 21:34:59 +0000 | [diff] [blame] | 471 | Since LLVM 7.0, new ORC development work has focused on adding support for |
| 472 | concurrent JIT compilation. The new APIs (including new layer interfaces and |
| 473 | implementations, and new utilities) that support concurrency are collectively |
| 474 | referred to as ORCv2, and the original, non-concurrent layers and utilities |
| 475 | are now referred to as ORCv1. |
Lang Hames | a13cca4 | 2019-07-15 15:36:37 +0000 | [diff] [blame] | 476 | |
Lang Hames | 607cd44 | 2019-07-16 21:34:59 +0000 | [diff] [blame] | 477 | The majority of the ORCv1 layers and utilities were renamed with a 'Legacy' |
| 478 | prefix in LLVM 8.0, and have deprecation warnings attached in LLVM 9.0. In LLVM |
Lang Hames | 44da6c2 | 2020-09-14 14:23:20 -0700 | [diff] [blame] | 479 | 12.0 ORCv1 will be removed entirely. |
Lang Hames | a13cca4 | 2019-07-15 15:36:37 +0000 | [diff] [blame] | 480 | |
Lang Hames | 607cd44 | 2019-07-16 21:34:59 +0000 | [diff] [blame] | 481 | Transitioning from ORCv1 to ORCv2 should be easy for most clients. Most of the |
Lang Hames | 001a554 | 2019-07-31 18:07:37 +0000 | [diff] [blame] | 482 | ORCv1 layers and utilities have ORCv2 counterparts [2]_ that can be directly |
Lang Hames | 607cd44 | 2019-07-16 21:34:59 +0000 | [diff] [blame] | 483 | substituted. However there are some design differences between ORCv1 and ORCv2 |
| 484 | to be aware of: |
Lang Hames | a13cca4 | 2019-07-15 15:36:37 +0000 | [diff] [blame] | 485 | |
  1. ORCv2 fully adopts the JIT-as-linker model that began with MCJIT. Modules
     (and other program representations, e.g. Object Files) are no longer added
     directly to JIT classes or layers. Instead, they are added to ``JITDylib``
     instances *by* layers. The ``JITDylib`` determines *where* the definitions
     reside; the layers determine *how* the definitions will be compiled.
     Linkage relationships between ``JITDylibs`` determine how inter-module
     references are resolved, and symbol resolvers are no longer used. See the
     section `Design Overview`_ for more details.

     Unless multiple JITDylibs are needed to model linkage relationships, ORCv1
     clients should place all code in a single JITDylib.
     MCJIT clients should use LLJIT (see `LLJIT and LLLazyJIT`_), and can place
     code in the main JITDylib that LLJIT creates by default (see
     ``LLJIT::getMainJITDylib()``).

  2. All JIT stacks now need an ``ExecutionSession`` instance. ExecutionSession
     manages the string pool, error reporting, synchronization, and symbol
     lookup.

  3. ORCv2 uses uniqued strings (``SymbolStringPtr`` instances) rather than
     string values in order to reduce memory overhead and improve lookup
     performance. See the subsection `How to manage symbol strings`_.

  4. IR layers require ThreadSafeModule instances, rather than
     std::unique_ptr<Module>s. ThreadSafeModule is a wrapper that ensures that
     Modules that use the same LLVMContext are not accessed concurrently.
     See `How to use ThreadSafeModule and ThreadSafeContext`_.

  5. Symbol lookup is no longer handled by layers. Instead, there is a
     ``lookup`` method on ExecutionSession that takes a list of JITDylibs to
     scan.

     .. code-block:: c++

       ExecutionSession ES;
       JITDylib &JD1 = ...;
       JITDylib &JD2 = ...;

       auto Sym = ES.lookup({&JD1, &JD2}, ES.intern("_main"));

  6. Module removal is not yet supported. There is no equivalent of the
     layer concept removeModule/removeObject methods. Work on resource tracking
     and removal in ORCv2 is ongoing.

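The ordered scan described in item 5 can be pictured with a small,
self-contained sketch. This is a toy model built on the standard library only:
``ToyDylib``, ``ToyAddress`` and ``toyLookup`` are hypothetical names, and the
sketch assumes that the first JITDylib in the list that defines the name wins.
It is not how ``ExecutionSession::lookup`` is implemented.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>
  #include <map>
  #include <string>
  #include <vector>

  // Hypothetical stand-ins: a "dylib" is just a name -> address table.
  using ToyAddress = std::uint64_t;
  using ToyDylib = std::map<std::string, ToyAddress>;

  // Scan the dylibs in the order given. The first definition found is
  // written to Result. Returns false if no dylib defines Name.
  bool toyLookup(const std::vector<const ToyDylib *> &SearchOrder,
                 const std::string &Name, ToyAddress &Result) {
    assert(!Name.empty() && "symbol names must be non-empty");
    for (const ToyDylib *JD : SearchOrder) {
      auto It = JD->find(Name);
      if (It != JD->end()) {
        Result = It->second;
        return true;
      }
    }
    return false;
  }

Passing ``{&JD1, &JD2}`` as the search order means a definition in ``JD1``
shadows one with the same name in ``JD2``.
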
For code examples and suggestions of how to use the ORCv2 APIs, please see
the section `How-tos`_.

How-tos
=======

How to manage symbol strings
----------------------------

Symbol strings in ORC are uniqued to improve lookup performance, reduce memory
overhead, and allow symbol names to function as efficient keys. To get the
unique ``SymbolStringPtr`` for a string value, call the
``ExecutionSession::intern`` method:

.. code-block:: c++

  ExecutionSession ES;
  /// ...
  auto MainSymbolName = ES.intern("main");

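To illustrate why uniquing makes symbol names efficient keys, here is a
minimal standalone sketch of a string pool. ``ToyStringPool`` and
``ToySymbolPtr`` are hypothetical names used only for illustration; the sketch
is not ORC's string pool and omits reference counting and thread safety.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <string>
  #include <unordered_set>

  // A uniqued string is represented by the address of its pool entry.
  using ToySymbolPtr = const std::string *;

  class ToyStringPool {
  public:
    // Return the unique pointer for Name, adding Name on first use.
    // std::unordered_set never moves its elements, so the address stays
    // valid as the pool grows.
    ToySymbolPtr intern(const std::string &Name) {
      assert(!Name.empty() && "symbol names must be non-empty");
      return &*Pool.insert(Name).first;
    }

    std::size_t size() const { return Pool.size(); }

  private:
    std::unordered_set<std::string> Pool;
  };

Interning the same text twice yields the same pointer, so equality checks and
hashing can work on a pointer instead of on the characters of the string.
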
If you wish to perform lookup using the C/IR name of a symbol you will also
need to apply the platform linker-mangling before interning the string. On
Linux this mangling is a no-op, but on other platforms it usually involves
adding a prefix to the string (e.g. '_' on Darwin). The mangling scheme is
based on the DataLayout for the target. Given a DataLayout and an
ExecutionSession, you can create a MangleAndInterner function object that
will perform both jobs for you:

.. code-block:: c++

  ExecutionSession ES;
  const DataLayout &DL = ...;
  MangleAndInterner Mangle(ES, DL);

  // ...

  // Portable IR-symbol-name lookup:
  auto Sym = ES.lookup({&MainJD}, Mangle("main"));

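The prefixing step itself can be sketched in a few lines. ``toyMangle`` is a
hypothetical helper, and the sketch assumes the only mangling needed is an
optional one-character global prefix, with ``'\0'`` meaning "no prefix". Real
linker mangling has further cases that this sketch ignores.

.. code-block:: c++

  #include <cassert>
  #include <string>

  // Prepend the global prefix to an IR-level name, if the target has one.
  std::string toyMangle(const std::string &IRName, char GlobalPrefix) {
    assert(!IRName.empty() && "symbol names must be non-empty");
    if (GlobalPrefix == '\0')
      return IRName; // No prefix: the name is unchanged.
    return std::string(1, GlobalPrefix) + IRName;
  }

With a prefix of ``'_'`` the IR name ``main`` becomes ``_main``; with no prefix
it stays ``main``. This is why a lookup written against the mangled name works
on both kinds of platform.
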
How to create JITDylibs and set up linkage relationships
--------------------------------------------------------

In ORC, all symbol definitions reside in JITDylibs. JITDylibs are created by
calling the ``ExecutionSession::createJITDylib`` method with a unique name:

.. code-block:: c++

  ExecutionSession ES;
  auto &JD = ES.createJITDylib("libFoo.dylib");

The JITDylib is owned by the ``ExecutionSession`` instance and will be freed
when it is destroyed.

How to use ThreadSafeModule and ThreadSafeContext
-------------------------------------------------

ThreadSafeModule and ThreadSafeContext are wrappers around Modules and
LLVMContexts respectively. A ThreadSafeModule is a pair of a
std::unique_ptr<Module> and a (possibly shared) ThreadSafeContext value. A
ThreadSafeContext is a pair of a std::unique_ptr<LLVMContext> and a lock.
This design serves two purposes: providing a locking scheme and lifetime
management for LLVMContexts. The ThreadSafeContext may be locked to prevent
accidental concurrent access by two Modules that use the same LLVMContext.
The underlying LLVMContext is freed once all ThreadSafeContext values pointing
to it are destroyed, allowing the context memory to be reclaimed as soon as
the Modules referring to it are destroyed.

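The "context plus lock, with shared lifetime" idea can be modelled without any
LLVM types. The sketch below is an assumption-laden toy: ``ToyContext``,
``ToyThreadSafeContext`` and its nested ``Lock`` are hypothetical names, and a
``std::shared_ptr`` is used to stand in for the shared ownership described
above. It is not the actual implementation.

.. code-block:: c++

  #include <cassert>
  #include <memory>
  #include <mutex>
  #include <utility>

  // Stand-in for LLVMContext. Counts live instances so that destruction
  // can be observed.
  struct ToyContext {
    static int &liveCount() {
      static int N = 0;
      return N;
    }
    ToyContext() { ++liveCount(); }
    ~ToyContext() { --liveCount(); }
  };

  // Copies share one State, so the context is freed when the last copy
  // (or the last outstanding Lock) is destroyed.
  class ToyThreadSafeContext {
    struct State {
      explicit State(std::unique_ptr<ToyContext> C) : Ctx(std::move(C)) {}
      std::unique_ptr<ToyContext> Ctx;
      std::recursive_mutex Mutex;
    };
    std::shared_ptr<State> S;

  public:
    // Holds the mutex, and keeps the State alive while it is held.
    class Lock {
      std::shared_ptr<State> S;
      std::unique_lock<std::recursive_mutex> L;

    public:
      explicit Lock(std::shared_ptr<State> St)
          : S(std::move(St)), L(S->Mutex) {}
    };

    explicit ToyThreadSafeContext(std::unique_ptr<ToyContext> Ctx)
        : S(std::make_shared<State>(std::move(Ctx))) {
      assert(S->Ctx && "context must be non-null");
    }

    ToyContext *getContext() { return S->Ctx.get(); }
    Lock getLock() { return Lock(S); }
  };

Copying a ``ToyThreadSafeContext`` is cheap and does not copy the context: both
copies return the same pointer from ``getContext()``, and the ``ToyContext`` is
destroyed only after every copy has gone out of scope.
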
ThreadSafeContexts can be explicitly constructed from a
std::unique_ptr<LLVMContext>:

.. code-block:: c++

  ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());

ThreadSafeModules can be constructed from a pair of a std::unique_ptr<Module>
and a ThreadSafeContext value. ThreadSafeContext values may be shared between
multiple ThreadSafeModules:

.. code-block:: c++

  ThreadSafeModule TSM1(
    std::make_unique<Module>("M1", *TSCtx.getContext()), TSCtx);

  ThreadSafeModule TSM2(
    std::make_unique<Module>("M2", *TSCtx.getContext()), TSCtx);

Before using a ThreadSafeContext, clients should ensure that either the context
is only accessible on the current thread, or that the context is locked. In the
example above (where the context is never locked) we rely on the fact that
``TSM1``, ``TSM2``, and ``TSCtx`` are all created on one thread. If a context is
going to be shared between threads then it must be locked before accessing or
creating any Modules attached to it. E.g.

.. code-block:: c++

  ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());

  ThreadPool TP(NumThreads);
  JITStack J;

  for (auto &ModulePath : ModulePaths) {
    TP.async(
      [&]() {
        auto Lock = TSCtx.getLock();
        auto M = loadModuleOnContext(ModulePath, TSCtx.getContext());
        J.addModule(ThreadSafeModule(std::move(M), TSCtx));
      });
  }

  TP.wait();

To make exclusive access to Modules easier to manage, the ThreadSafeModule class
provides a convenience function, ``withModuleDo``, that implicitly (1) locks the
associated context, (2) runs a given function object, (3) unlocks the context,
and (4) returns the result generated by the function object. E.g.

.. code-block:: c++

  ThreadSafeModule TSM = getModule(...);

  // Count the functions in the module:
  size_t NumFunctionsInModule =
    TSM.withModuleDo(
      [](Module &M) { // <- Context locked before entering lambda.
        return M.size();
      } // <- Context unlocked after leaving.
    );

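The lock, run, unlock, return sequence can be sketched with a scoped lock
guard. This is a toy model under stated assumptions: ``ToyModule`` and
``ToyThreadSafeModule`` are hypothetical names, and the context lock is
represented by a mutex shared between all modules on the same context. It is
not the actual ``withModuleDo`` implementation.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <memory>
  #include <mutex>
  #include <string>
  #include <utility>
  #include <vector>

  // Stand-in for llvm::Module: just a list of function names.
  struct ToyModule {
    std::vector<std::string> Functions;
  };

  class ToyThreadSafeModule {
    std::unique_ptr<ToyModule> M;
    // Shared by every module that lives on the same (toy) context.
    std::shared_ptr<std::recursive_mutex> CtxMutex;

  public:
    ToyThreadSafeModule(std::unique_ptr<ToyModule> ModuleIn,
                        std::shared_ptr<std::recursive_mutex> MutexIn)
        : M(std::move(ModuleIn)), CtxMutex(std::move(MutexIn)) {
      assert(M && CtxMutex && "module and context lock must be non-null");
    }

    template <typename Func> decltype(auto) withModuleDo(Func &&F) {
      // (1) Lock the context.
      std::lock_guard<std::recursive_mutex> Lock(*CtxMutex);
      // (2) Run the function object while the lock is held; its result
      // becomes the return value (4).
      return std::forward<Func>(F)(*M);
      // (3) Lock is destroyed on the way out, unlocking the context.
    }
  };

Because the guard is a local object, the context is also unlocked if the
function object throws, which is one reason to prefer this shape over manual
lock and unlock calls.
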
Clients wishing to maximize possibilities for concurrent compilation will want
to create every new ThreadSafeModule on a new ThreadSafeContext. For this
reason a convenience constructor for ThreadSafeModule is provided that implicitly
constructs a new ThreadSafeContext value from a std::unique_ptr<LLVMContext>:

.. code-block:: c++

  // Maximize concurrency opportunities by loading every module on a
  // separate context.
  for (const auto &IRPath : IRPaths) {
    auto Ctx = std::make_unique<LLVMContext>();
    auto M = std::make_unique<Module>("M", *Ctx);
    CompileLayer.add(MainJD, ThreadSafeModule(std::move(M), std::move(Ctx)));
  }

Clients who plan to run single-threaded may choose to save memory by loading
all modules on the same context:

.. code-block:: c++

  // Save memory by using one context for all Modules:
  ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());
  for (const auto &IRPath : IRPaths) {
    ThreadSafeModule TSM(parsePath(IRPath, *TSCtx.getContext()), TSCtx);
    CompileLayer.add(MainJD, std::move(TSM));
  }

.. _ProcessAndLibrarySymbols:

How to Add Process and Library Symbols to the JITDylibs
=======================================================

JIT'd code typically needs access to symbols in the host program or in
supporting libraries. References to process symbols can be "baked in" to code
as it is compiled by turning external references into pre-resolved integer
constants; however, this ties the JIT'd code to the current process's virtual
memory layout (meaning that it cannot be cached between runs) and makes
debugging lower level program representations difficult (as all external
references are opaque integer values). A better solution is to maintain symbolic
external references and let the JIT linker bind them for you at runtime. To
allow the JIT linker to find these external definitions their addresses must
be added to a JITDylib that the JIT'd definitions link against.

Adding definitions for external symbols can be done using the absoluteSymbols
function:

.. code-block:: c++

  const DataLayout &DL = getDataLayout();
  MangleAndInterner Mangle(ES, DL);

  auto &JD = ES.createJITDylib("main");

  JD.define(
    absoluteSymbols({
      { Mangle("puts"), pointerToJITTargetAddress(&puts)},
      { Mangle("gets"), pointerToJITTargetAddress(&gets)}
    }));

Manually adding absolute symbols for a large or changing interface is
cumbersome, however, so ORC provides an alternative to generate new definitions
on demand: *definition generators*. If a definition generator is attached to a
JITDylib, then any unsuccessful lookup on that JITDylib will fall back to
calling the definition generator, and the definition generator may choose to
generate a new definition for the missing symbols. Of particular use here is the
``DynamicLibrarySearchGenerator`` utility. This can be used to reflect the whole
exported symbol set of the process or a specific dynamic library, or a subset
of either of these determined by a predicate.

For example, to load the whole interface of a runtime library:

.. code-block:: c++

  const DataLayout &DL = getDataLayout();
  auto &JD = ES.createJITDylib("main");

  JD.addGenerator(DynamicLibrarySearchGenerator::Load("/path/to/lib",
                                                      DL.getGlobalPrefix()));

  // IR added to JD can now link against all symbols exported by the library
  // at '/path/to/lib'.
  CompileLayer.add(JD, loadModule(...));

Or, to expose an allowed set of symbols from the main process:

.. code-block:: c++

  const DataLayout &DL = getDataLayout();
  MangleAndInterner Mangle(ES, DL);

  auto &JD = ES.createJITDylib("main");

  DenseSet<SymbolStringPtr> AllowList({
      Mangle("puts"),
      Mangle("gets")
    });

  // Use GetForCurrentProcess with a predicate function that checks the
  // allowed list.
  JD.addGenerator(
    DynamicLibrarySearchGenerator::GetForCurrentProcess(
      DL.getGlobalPrefix(),
      [&](const SymbolStringPtr &S) { return AllowList.count(S); }));

  // IR added to JD can now link against any symbols exported by the process
  // and contained in the list.
  CompileLayer.add(JD, loadModule(...));

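The fallback behaviour of a definition generator, including an allow-list
predicate, can be modelled in plain C++. Everything in this sketch is a
hypothetical stand-in (``ToyJITDylib``, ``makeAllowListGenerator``, the
``Host`` table that plays the role of the process's exported symbols), and it
assumes that a generated definition is cached in the dylib after the first
successful lookup. It illustrates the control flow only, not ORC's API.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <cstdint>
  #include <functional>
  #include <map>
  #include <set>
  #include <string>
  #include <utility>
  #include <vector>

  class ToyJITDylib {
  public:
    using Address = std::uint64_t;
    // A generator receives a missing name and may supply an address for it.
    using Generator = std::function<bool(const std::string &, Address &)>;

    void define(const std::string &Name, Address Addr) {
      assert(!Name.empty() && "symbol names must be non-empty");
      Symbols[Name] = Addr;
    }

    void addGenerator(Generator G) { Generators.push_back(std::move(G)); }

    // Look in the existing definitions first. Only if that fails, ask each
    // generator in turn and record whatever definition it produces.
    bool lookup(const std::string &Name, Address &Result) {
      auto It = Symbols.find(Name);
      if (It != Symbols.end()) {
        Result = It->second;
        return true;
      }
      for (auto &G : Generators) {
        Address Addr = 0;
        if (G(Name, Addr)) {
          Symbols[Name] = Addr;
          Result = Addr;
          return true;
        }
      }
      return false;
    }

    std::size_t numDefinitions() const { return Symbols.size(); }

  private:
    std::map<std::string, Address> Symbols;
    std::vector<Generator> Generators;
  };

  // Build a generator that serves names from Host, but only if they also
  // appear in Allowed.
  ToyJITDylib::Generator
  makeAllowListGenerator(std::map<std::string, ToyJITDylib::Address> Host,
                         std::set<std::string> Allowed) {
    return [Host = std::move(Host), Allowed = std::move(Allowed)](
               const std::string &Name, ToyJITDylib::Address &Addr) {
      if (!Allowed.count(Name))
        return false;
      auto It = Host.find(Name);
      if (It == Host.end())
        return false;
      Addr = It->second;
      return true;
    };
  }

In this model a lookup for an allowed host symbol succeeds and adds a
definition to the dylib, while a lookup for a host symbol that is not on the
allow-list fails exactly as if the symbol did not exist.
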
Roadmap
=======

ORC is still undergoing active development. Some current and planned work is
listed below.

Current Work
------------

1. **TargetProcessControl: Improvements to in-tree support for out-of-process
   execution**

   The ``TargetProcessControl`` API provides various operations on the JIT
   target process (the one which will execute the JIT'd code), including
   memory allocation, memory writes, function execution, and process queries
   (e.g. for the target triple). By targeting this API new components can be
   developed which will work equally well for in-process and out-of-process
   JITing.

2. **ORC RPC based TargetProcessControl implementation**

   An ORC RPC based implementation of the ``TargetProcessControl`` API is
   currently under development to enable easy out-of-process JITing via
   file descriptors / sockets.

3. **Core State Machine Cleanup**

   The core ORC state machine is currently implemented between JITDylib and
   ExecutionSession. Methods are slowly being moved to ``ExecutionSession``.
   This will tidy up the code base, and also allow us to support asynchronous
   removal of JITDylibs (in practice deleting an associated state object in
   ExecutionSession and leaving the JITDylib instance in a defunct state until
   all references to it have been released).

Near Future Work
----------------

1. **ORC JIT Runtime Libraries**

   We need a runtime library for JIT'd code. This would include things like
   TLS registration, reentry functions, registration code for language runtimes
   (e.g. Objective-C and Swift) and other JIT specific runtime code. This should
   be built in a similar manner to compiler-rt (possibly even as part of it).

2. **Remote jit_dlopen / jit_dlclose**

   To more fully mimic the environment that static programs operate in we would
   like JIT'd code to be able to "dlopen" and "dlclose" JITDylibs, running all of
   their initializers/deinitializers on the current thread. This would require
   support from the runtime library described above.

3. **Debugging support**

   ORC currently supports the GDBRegistrationListener API when using RuntimeDyld
   as the underlying JIT linker. We will need a new solution for JITLink based
   platforms.

Further Future Work
-------------------

1. **Speculative Compilation**

   ORC's support for concurrent compilation allows us to easily enable
   *speculative* JIT compilation: compilation of code that is not needed yet,
   but which we have reason to believe will be needed in the future. This can be
   used to hide compile latency and improve JIT throughput. A proof-of-concept
   example of speculative compilation with ORC has already been developed (see
   ``llvm/examples/SpeculativeJIT``). Future work on this is likely to focus on
   re-using and improving existing profiling support (currently used by PGO) to
   feed speculation decisions, as well as built-in tools to simplify use of
   speculative compilation.

.. [1] Formats/architectures vary in terms of supported features. MachO and
       ELF tend to have better support than COFF. Patches very welcome!

.. [2] The ``LazyEmittingLayer``, ``RemoteObjectClientLayer`` and
       ``RemoteObjectServerLayer`` do not have counterparts in the new
       system. In the case of ``LazyEmittingLayer`` it was simply no longer
       needed: in ORCv2, deferring compilation until symbols are looked up is
       the default. The removal of ``RemoteObjectClientLayer`` and
       ``RemoteObjectServerLayer`` means that JIT stacks can no longer be split
       across processes; however, this functionality appears not to have been
       used.

.. [3] Weak definitions are currently handled correctly within dylibs, but if
       multiple dylibs provide a weak definition of a symbol then each will end
       up with its own definition (similar to how weak definitions are handled
       in Windows DLLs). This will be fixed in the future.