===============================
ORC Design and Implementation
===============================

.. contents::
   :local:

Introduction
============

This document aims to provide a high-level overview of the design and
implementation of the ORC JIT APIs. Except where otherwise stated, all discussion
refers to the modern ORCv2 APIs (available since LLVM 7). Clients wishing to
transition from ORCv1 should see Section :ref:`transitioning_orcv1_to_orcv2`.

Use-cases
=========

ORC provides a modular API for building JIT compilers. There are a number
of use cases for such an API. For example:

1. The LLVM tutorials use a simple ORC-based JIT class to execute expressions
   compiled from a toy language: Kaleidoscope.

2. The LLVM debugger, LLDB, uses a cross-compiling JIT for expression
   evaluation. In this use case, cross compilation allows expressions compiled
   in the debugger process to be executed on the debug target process, which may
   be on a different device/architecture.

3. High-performance JITs (e.g. JVMs, Julia) can make use of LLVM's
   optimizations within an existing JIT infrastructure.

4. Interpreters and REPLs, e.g. Cling (C++) and the Swift interpreter.

By adopting a modular, library-based design we aim to make ORC useful in as many
of these contexts as possible.

Features
========

ORC provides the following features:

**JIT-linking**
  ORC provides APIs to link relocatable object files (COFF, ELF, MachO) [1]_
  into a target process at runtime. The target process may be the same process
  that contains the JIT session object and jit-linker, or may be another process
  (even one running on a different machine or architecture) that communicates
  with the JIT via RPC.

**LLVM IR compilation**
  ORC provides off the shelf components (IRCompileLayer, SimpleCompiler,
  ConcurrentIRCompiler) that make it easy to add LLVM IR to a JIT'd process.

**Eager and lazy compilation**
  By default, ORC will compile symbols as soon as they are looked up in the JIT
  session object (``ExecutionSession``). Compiling eagerly by default makes it
  easy to use ORC as an in-memory compiler for an existing JIT (similar to how
  MCJIT is commonly used). However, ORC also provides built-in support for lazy
  compilation via lazy-reexports (see :ref:`Laziness`).

**Support for Custom Compilers and Program Representations**
  Clients can supply custom compilers for each symbol that they define in their
  JIT session. ORC will run the user-supplied compiler when a definition of
  a symbol is needed. ORC is actually fully language agnostic: LLVM IR is not
  treated specially, and is supported via the same wrapper mechanism (the
  ``MaterializationUnit`` class) that is used for custom compilers.

**Concurrent JIT'd code** and **Concurrent Compilation**
  JIT'd code may be executed in multiple threads, may spawn new threads, and may
  re-enter ORC (e.g. to request lazy compilation) concurrently from multiple
  threads. Compilers launched by ORC can run concurrently (provided the client
  sets up an appropriate dispatcher). Built-in dependency tracking ensures that
  ORC does not release pointers to JIT'd code or data until all dependencies
  have also been JIT'd and they are safe to call or use.

**Removable Code**
  Resources for JIT'd program representations

**Orthogonality** and **Composability**
  Each of the features above can be used independently. It is possible to put
  ORC components together to make a non-lazy, in-process, single threaded JIT
  or a lazy, out-of-process, concurrent JIT, or anything in between.

LLJIT and LLLazyJIT
===================

ORC provides two basic JIT classes off-the-shelf. These are useful both as
examples of how to assemble ORC components to make a JIT, and as replacements
for earlier LLVM JIT APIs (e.g. MCJIT).

The LLJIT class uses an IRCompileLayer and RTDyldObjectLinkingLayer to support
compilation of LLVM IR and linking of relocatable object files. All operations
are performed eagerly on symbol lookup (i.e. a symbol's definition is compiled
as soon as you attempt to look up its address). LLJIT is a suitable replacement
for MCJIT in most cases (note: some more advanced features, e.g.
JITEventListeners, are not supported yet).

The LLLazyJIT class extends LLJIT and adds a CompileOnDemandLayer to enable lazy
compilation of LLVM IR. When an LLVM IR module is added via the addLazyIRModule
method, function bodies in that module will not be compiled until they are first
called. LLLazyJIT aims to provide a replacement for LLVM's original (pre-MCJIT)
JIT API.

LLJIT and LLLazyJIT instances can be created using their respective builder
classes: LLJITBuilder and LLLazyJITBuilder. For example, assuming you have a
module ``M`` loaded on a ThreadSafeContext ``Ctx``:

.. code-block:: c++

  // Try to detect the host arch and construct an LLJIT instance.
  auto JIT = LLJITBuilder().create();

  // If we could not construct an instance, return an error.
  if (!JIT)
    return JIT.takeError();

  // Add the module. JIT is an Expected wrapping a pointer to the LLJIT
  // instance, so dereference it once to reach the instance.
  if (auto Err = (*JIT)->addIRModule(ThreadSafeModule(std::move(M), Ctx)))
    return Err;

  // Look up the JIT'd code entry point.
  auto EntrySym = (*JIT)->lookup("entry");
  if (!EntrySym)
    return EntrySym.takeError();

  // Cast the entry point address to a function pointer.
  auto *Entry = (void(*)())EntrySym->getAddress();

  // Call into JIT'd code.
  Entry();

The builder classes provide a number of configuration options that can be
specified before the JIT instance is constructed. For example:

.. code-block:: c++

  // Build an LLLazyJIT instance that uses four worker threads for compilation,
  // and jumps to a specific error handler (rather than null) on lazy compile
  // failures.

  void handleLazyCompileFailure() {
    // JIT'd code will jump here if lazy compilation fails, giving us an
    // opportunity to exit or throw an exception into JIT'd code.
    throw JITFailed();
  }

  auto JIT = LLLazyJITBuilder()
               .setNumCompileThreads(4)
               .setLazyCompileFailureAddr(
                   pointerToJITTargetAddress(&handleLazyCompileFailure))
               .create();

  // ...

For users wanting to get started with LLJIT, a minimal example program can be
found at ``llvm/examples/HowToUseLLJIT``.

Design Overview
===============

ORC's JIT program model aims to emulate the linking and symbol resolution
rules used by the static and dynamic linkers. This allows ORC to JIT
arbitrary LLVM IR, including IR produced by an ordinary static compiler (e.g.
clang) that uses constructs like symbol linkage and visibility, and weak [3]_
and common symbol definitions.

To see how this works, imagine a program ``myapp`` which links against a pair
of dynamic libraries: ``libA`` and ``libB``. On the command line, building this
program might look like:

.. code-block:: bash

  $ clang++ -shared -o libA.dylib a1.cpp a2.cpp
  $ clang++ -shared -o libB.dylib b1.cpp b2.cpp
  $ clang++ -o myapp main.cpp -L. -lA -lB
  $ ./myapp

In ORC, this would translate into API calls on a hypothetical CXXCompilingLayer
(with error checking omitted for brevity) as:

.. code-block:: c++

  ExecutionSession ES;
  RTDyldObjectLinkingLayer ObjLinkingLayer(
      ES, []() { return std::make_unique<SectionMemoryManager>(); });
  CXXCompilingLayer CXXLayer(ES, ObjLinkingLayer);

  // Create JITDylib "A" and add code to it using the CXX layer.
  auto &LibA = ES.createJITDylib("A");
  CXXLayer.add(LibA, MemoryBuffer::getFile("a1.cpp"));
  CXXLayer.add(LibA, MemoryBuffer::getFile("a2.cpp"));

  // Create JITDylib "B" and add code to it using the CXX layer.
  auto &LibB = ES.createJITDylib("B");
  CXXLayer.add(LibB, MemoryBuffer::getFile("b1.cpp"));
  CXXLayer.add(LibB, MemoryBuffer::getFile("b2.cpp"));

  // Create and specify the search order for the main JITDylib. This is
  // equivalent to a "links against" relationship in a command-line link.
  auto &MainJD = ES.createJITDylib("main");
  MainJD.addToLinkOrder(&LibA);
  MainJD.addToLinkOrder(&LibB);
  CXXLayer.add(MainJD, MemoryBuffer::getFile("main.cpp"));

  // Look up the JIT'd main, cast it to a function pointer, then call it.
  auto MainSym = ExitOnErr(ES.lookup({&MainJD}, "main"));
  auto *Main = (int(*)(int, char*[]))MainSym.getAddress();

  int Result = Main(...);

This example tells us nothing about *how* or *when* compilation will happen.
That will depend on the implementation of the hypothetical CXXCompilingLayer.
The same linker-based symbol resolution rules will apply regardless of that
implementation, however. For example, if a1.cpp and a2.cpp both define a
function "foo" then ORCv2 will generate a duplicate definition error. On the
other hand, if a1.cpp and b1.cpp both define "foo" there is no error (different
dynamic libraries may define the same symbol). If main.cpp refers to "foo", it
should bind to the definition in LibA rather than the one in LibB, since
main.cpp is part of the "main" dylib, and the main dylib links against LibA
before LibB.
222Many JIT clients will have no need for this strict adherence to the usual
Lang Hames607cd442019-07-16 21:34:59 +0000223ahead-of-time linking rules, and should be able to get by just fine by putting
Lang Hames5f36a282019-05-18 03:08:49 +0000224all of their code in a single JITDylib. However, clients who want to JIT code
225for languages/projects that traditionally rely on ahead-of-time linking (e.g.
226C++) will find that this feature makes life much easier.
227
Lang Hames607cd442019-07-16 21:34:59 +0000228Symbol lookup in ORC serves two other important functions, beyond providing
229addresses for symbols: (1) It triggers compilation of the symbol(s) searched for
230(if they have not been compiled already), and (2) it provides the
231synchronization mechanism for concurrent compilation. The pseudo-code for the
232lookup process is:
Lang Hames5f36a282019-05-18 03:08:49 +0000233
Lang Hames4dfa6652019-05-20 21:07:16 +0000234.. code-block:: none
Lang Hames5f36a282019-05-18 03:08:49 +0000235
Lang Hames4dfa6652019-05-20 21:07:16 +0000236 construct a query object from a query set and query handler
237 lock the session
238 lodge query against requested symbols, collect required materializers (if any)
239 unlock the session
240 dispatch materializers (if any)
241
242In this context a materializer is something that provides a working definition
Lang Hames607cd442019-07-16 21:34:59 +0000243of a symbol upon request. Usually materializers are just wrappers for compilers,
244but they may also wrap a jit-linker directly (if the program representation
245backing the definitions is an object file), or may even be a class that writes
246bits directly into memory (for example, if the definitions are
247stubs). Materialization is the blanket term for any actions (compiling, linking,
Nico Weberbb692082019-09-13 14:58:24 +0000248splatting bits, registering with runtimes, etc.) that are required to generate a
Lang Hames607cd442019-07-16 21:34:59 +0000249symbol definition that is safe to call or access.
Lang Hames4dfa6652019-05-20 21:07:16 +0000250
251As each materializer completes its work it notifies the JITDylib, which in turn
252notifies any query objects that are waiting on the newly materialized
253definitions. Each query object maintains a count of the number of symbols that
254it is still waiting on, and once this count reaches zero the query object calls
255the query handler with a *SymbolMap* (a map of symbol names to addresses)
256describing the result. If any symbol fails to materialize the query immediately
257calls the query handler with an error.
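
The bookkeeping described above can be sketched as follows. This is a
simplified, single-threaded model and not the ORC implementation: ``ToyQuery``
and its methods are hypothetical names, the set of outstanding symbols stands
in for the count, and all locking is ignored.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>

// Toy model of a lookup query. Hypothetical, not the ORC class.
class ToyQuery {
public:
  using SymbolMap = std::map<std::string, unsigned long long>;
  using Handler = std::function<void(bool Success, const SymbolMap &)>;

  ToyQuery(std::set<std::string> Symbols, Handler H)
      : Pending(std::move(Symbols)), OnComplete(std::move(H)) {
    assert(!Pending.empty() && "query must request at least one symbol");
  }

  // Called when a materializer finishes a symbol this query is waiting on.
  void notifyResolved(const std::string &Name, unsigned long long Addr) {
    if (Done || Pending.erase(Name) == 0)
      return; // already finished, or not a symbol we were waiting on
    Result[Name] = Addr;
    if (Pending.empty()) { // outstanding count reached zero
      Done = true;
      OnComplete(true, Result);
    }
  }

  // Called when any awaited symbol fails: report the error immediately.
  void notifyFailed() {
    if (Done)
      return;
    Done = true;
    OnComplete(false, SymbolMap());
  }

private:
  std::set<std::string> Pending; // symbols still being waited on
  SymbolMap Result;              // addresses collected so far
  Handler OnComplete;            // the query handler
  bool Done = false;             // the handler runs at most once
};
```

A query for ``foo`` and ``bar`` does not call its handler when only ``foo`` has
been resolved. The handler runs exactly once, either when ``bar`` is resolved as
well or as soon as a failure is reported.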

The collected materialization units are sent to the ExecutionSession to be
dispatched, and the dispatch behavior can be set by the client. By default each
materializer is run on the calling thread. Clients are free to create new
threads to run materializers, or to send the work to a work queue for a thread
pool (this is what LLJIT/LLLazyJIT do).
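
A configurable dispatch policy of this kind might be modelled as below. The
sketch is an assumption-laden illustration, not ORC's API: ``ToySession``,
``ToyWorkQueue``, ``Task`` and ``Dispatcher`` are hypothetical names, and the
queue is drained on the calling thread where a real client would hand tasks to
worker threads.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

using Task = std::function<void()>;           // one materializer's work
using Dispatcher = std::function<void(Task)>; // decides where a task runs

// Toy stand-in for the session's dispatch hook. Hypothetical, not ORC code.
class ToySession {
public:
  // Default policy: run the materializer on the calling thread.
  ToySession() : Dispatch([](Task T) { T(); }) {}

  // Lets the client replace the dispatch policy.
  void setDispatcher(Dispatcher D) {
    assert(D && "dispatcher must be callable");
    Dispatch = std::move(D);
  }

  void dispatch(Task T) { Dispatch(std::move(T)); }

private:
  Dispatcher Dispatch;
};

// A minimal work queue: tasks are stored now and run later.
class ToyWorkQueue {
public:
  void push(Task T) { Tasks.push_back(std::move(T)); }

  // Runs all queued tasks in order and returns how many were run.
  unsigned drain() {
    unsigned N = 0;
    while (!Tasks.empty()) {
      Task T = std::move(Tasks.front());
      Tasks.pop_front();
      T();
      ++N;
    }
    return N;
  }

private:
  std::deque<Task> Tasks;
};
```

Installing a dispatcher that calls ``ToyWorkQueue::push`` defers every
materializer until the queue is drained, whereas the default policy runs each
one before ``dispatch`` returns.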

Top Level APIs
==============

Many of ORC's top-level APIs are visible in the example above:

- *ExecutionSession* represents the JIT'd program and provides context for the
  JIT: It contains the JITDylibs, error reporting mechanisms, and dispatches the
  materializers.

- *JITDylibs* provide the symbol tables.

- *Layers* (ObjLinkingLayer and CXXLayer) are wrappers around compilers and
  allow clients to add uncompiled program representations supported by those
  compilers to JITDylibs.

Several other important APIs are used implicitly. JIT clients need not be aware
of them, but Layer authors will use them:

- *MaterializationUnit* - When XXXLayer::add is invoked it wraps the given
  program representation (in this example, C++ source) in a MaterializationUnit,
  which is then stored in the JITDylib. MaterializationUnits are responsible for
  describing the definitions they provide, and for unwrapping the program
  representation and passing it back to the layer when compilation is required
  (this ownership shuffle makes writing thread-safe layers easier, since the
  ownership of the program representation will be passed back on the stack,
  rather than having to be fished out of a Layer member, which would require
  synchronization).

- *MaterializationResponsibility* - When a MaterializationUnit hands a program
  representation back to the layer it comes with an associated
  MaterializationResponsibility object. This object tracks the definitions
  that must be materialized and provides a way to notify the JITDylib once they
  are either successfully materialized or a failure occurs.

Absolute Symbols, Aliases, and Reexports
========================================

ORC makes it easy to define symbols with absolute addresses, or symbols that
are simply aliases of other symbols:

Absolute Symbols
----------------

Absolute symbols are symbols that map directly to addresses without requiring
further materialization, for example: "foo" = 0x1234. One use case for
absolute symbols is allowing resolution of process symbols. E.g.

.. code-block:: c++

  JD.define(absoluteSymbols(SymbolMap({
      { Mangle("printf"),
        { pointerToJITTargetAddress(&printf),
          JITSymbolFlags::Callable } }
    })));

With this mapping established, code added to the JIT can refer to printf
symbolically rather than requiring the address of printf to be "baked in".
This in turn allows cached versions of the JIT'd code (e.g. compiled objects)
to be re-used across JIT sessions as the JIT'd code no longer changes, only the
absolute symbol definition does.

For process and library symbols the DynamicLibrarySearchGenerator utility (see
:ref:`How to Add Process and Library Symbols to JITDylibs
<ProcessAndLibrarySymbols>`) can be used to automatically build absolute
symbol mappings for you. However the absoluteSymbols function is still useful
for making non-global objects in your JIT visible to JIT'd code. For example,
imagine that your JIT standard library needs access to your JIT object to make
some calls. We could bake the address of your object into the library, but then
it would need to be recompiled for each session:

.. code-block:: c++

  // From standard library for JIT'd code:

  class MyJIT {
  public:
    void log(const char *Msg);
  };

  void log(const char *Msg) { ((MyJIT*)0x1234)->log(Msg); }

We can turn this into a symbolic reference in the JIT standard library:

.. code-block:: c++

  extern MyJIT *__MyJITInstance;

  void log(const char *Msg) { __MyJITInstance->log(Msg); }

And then make our JIT object visible to the JIT standard library with an
absolute symbol definition when the JIT is started:

.. code-block:: c++

  MyJIT J = ...;

  auto &JITStdLibJD = ... ;

  JITStdLibJD.define(absoluteSymbols(SymbolMap({
      { Mangle("__MyJITInstance"),
        { pointerToJITTargetAddress(&J), JITSymbolFlags() } }
    })));

Aliases and Reexports
---------------------

Aliases and reexports allow you to define new symbols that map to existing
symbols. This can be useful for changing linkage relationships between symbols
across sessions without having to recompile code. For example, imagine that
JIT'd code has access to a log function, ``void log(const char*)`` for which
there are two implementations in the JIT standard library: ``log_fast`` and
``log_detailed``. Your JIT can choose which one of these definitions will be
used when the ``log`` symbol is referenced by setting up an alias at JIT startup
time:

.. code-block:: c++

  auto &JITStdLibJD = ... ;

  auto LogImplementationSymbol =
   Verbose ? Mangle("log_detailed") : Mangle("log_fast");

  JITStdLibJD.define(
    symbolAliases(SymbolAliasMap({
        { Mangle("log"),
          { LogImplementationSymbol,
            JITSymbolFlags::Exported | JITSymbolFlags::Callable } }
      })));

The ``symbolAliases`` function allows you to define aliases within a single
JITDylib. The ``reexports`` function provides the same functionality, but
operates across JITDylib boundaries. E.g.

.. code-block:: c++

  auto &JD1 = ... ;
  auto &JD2 = ... ;

  // Make 'bar' in JD2 an alias for 'foo' from JD1.
  JD2.define(
    reexports(JD1, SymbolAliasMap({
        { Mangle("bar"), { Mangle("foo"), JITSymbolFlags::Exported } }
      })));

The reexports utility can be handy for composing a single JITDylib interface by
re-exporting symbols from several other JITDylibs.

.. _Laziness:

Laziness
========

Laziness in ORC is provided by a utility called "lazy reexports". A lazy
reexport is similar to a regular reexport or alias: It provides a new name for
an existing symbol. Unlike regular reexports however, lookups of lazy reexports
do not trigger immediate materialization of the reexported symbol. Instead, they
only trigger materialization of a function stub. This function stub is
initialized to point at a *lazy call-through*, which provides reentry into the
JIT. If the stub is called at runtime then the lazy call-through will look up
the reexported symbol (triggering materialization for it if necessary), update
the stub (to call directly to the reexported symbol on subsequent calls), and
then return via the reexported symbol. By re-using the existing symbol lookup
mechanism, lazy reexports inherit the same concurrency guarantees: calls to lazy
reexports can be made from multiple threads concurrently, and the reexported
symbol can be in any state of compilation (uncompiled, already in the process of
being compiled, or already compiled) and the call will succeed. This allows
laziness to be safely mixed with features like remote compilation, concurrent
compilation, concurrent JIT'd code, and speculative compilation.
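
The stub and call-through mechanism can be illustrated with ordinary function
pointers. The sketch below is a toy model under stated assumptions, not how ORC
implements stubs: every name in it is hypothetical, ``materializeFooBody``
stands in for the symbol lookup, and unlike ORC's lookup it makes no attempt to
deduplicate concurrent first calls.

```cpp
#include <atomic>
#include <cassert>

// Toy model of a lazy reexport. All names are hypothetical.
using BodyFn = int (*)(int);

// Stands in for the reexported symbol's JIT'd body.
int fooBody(int X) { return X + 1; }

// Counts how often the body was "materialized".
std::atomic<int> MaterializeCount{0};

// Stands in for a symbol lookup that triggers compilation if necessary.
BodyFn materializeFooBody() {
  ++MaterializeCount;
  return &fooBody;
}

int fooCallThrough(int X);

// The stub: initially it routes calls to the call-through.
std::atomic<BodyFn> FooStub{&fooCallThrough};

// The call-through re-enters the "JIT", repoints the stub at the body, and
// then forwards the original call to the body.
int fooCallThrough(int X) {
  BodyFn Body = materializeFooBody();
  assert(Body && "materialization must produce a body");
  FooStub.store(Body);
  return Body(X);
}

// What callers of the lazy symbol effectively execute: a jump via the stub.
int foo(int X) { return FooStub.load()(X); }
```

The first call to ``foo`` goes through ``fooCallThrough`` and materializes the
body. Later calls reach ``fooBody`` directly through the updated stub. Note that
``foo`` and ``fooBody`` have different addresses throughout.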

There is one other key difference between regular reexports and lazy reexports
that some clients must be aware of: The address of a lazy reexport will be
*different* from the address of the reexported symbol (whereas a regular
reexport is guaranteed to have the same address as the reexported symbol).
Clients who care about pointer equality will generally want to use the address
of the reexport as the canonical address of the reexported symbol. This will
allow the address to be taken without forcing materialization of the reexported
symbol.

Usage example:

If JITDylib ``JD`` contains definitions for symbols ``foo_body`` and
``bar_body``, we can create lazy entry points ``foo`` and ``bar`` in JITDylib
``JD2`` by calling:

.. code-block:: c++

  auto ReexportFlags = JITSymbolFlags::Exported | JITSymbolFlags::Callable;
  JD2.define(
    lazyReexports(CallThroughMgr, StubsMgr, JD,
                  SymbolAliasMap({
                    { Mangle("foo"), { Mangle("foo_body"), ReexportFlags } },
                    { Mangle("bar"), { Mangle("bar_body"), ReexportFlags } }
                  })));

A full example of how to use lazyReexports with the LLJIT class can be found at
``llvm_project/llvm/examples/LLJITExamples/LLJITWithLazyReexports``.

Supporting Custom Compilers
===========================

TBD.

.. _transitioning_orcv1_to_orcv2:

Transitioning from ORCv1 to ORCv2
=================================

Since LLVM 7.0, new ORC development work has focused on adding support for
concurrent JIT compilation. The new APIs (including new layer interfaces and
implementations, and new utilities) that support concurrency are collectively
referred to as ORCv2, and the original, non-concurrent layers and utilities
are now referred to as ORCv1.

The majority of the ORCv1 layers and utilities were renamed with a 'Legacy'
prefix in LLVM 8.0, and have deprecation warnings attached in LLVM 9.0. In LLVM
12.0 ORCv1 will be removed entirely.

Transitioning from ORCv1 to ORCv2 should be easy for most clients. Most of the
ORCv1 layers and utilities have ORCv2 counterparts [2]_ that can be directly
substituted. However there are some design differences between ORCv1 and ORCv2
to be aware of:

  1. ORCv2 fully adopts the JIT-as-linker model that began with MCJIT. Modules
     (and other program representations, e.g. Object Files) are no longer added
     directly to JIT classes or layers. Instead, they are added to ``JITDylib``
     instances *by* layers. The ``JITDylib`` determines *where* the definitions
     reside, the layers determine *how* the definitions will be compiled.
     Linkage relationships between ``JITDylibs`` determine how inter-module
     references are resolved, and symbol resolvers are no longer used. See the
     section `Design Overview`_ for more details.

     Unless multiple JITDylibs are needed to model linkage relationships, ORCv1
     clients should place all code in a single JITDylib.
     MCJIT clients should use LLJIT (see `LLJIT and LLLazyJIT`_), and can place
     code in LLJIT's default created main JITDylib (see
     ``LLJIT::getMainJITDylib()``).

  2. All JIT stacks now need an ``ExecutionSession`` instance. ExecutionSession
     manages the string pool, error reporting, synchronization, and symbol
     lookup.

  3. ORCv2 uses uniqued strings (``SymbolStringPtr`` instances) rather than
     string values in order to reduce memory overhead and improve lookup
     performance. See the subsection `How to manage symbol strings`_.

  4. IR layers require ThreadSafeModule instances, rather than
     std::unique_ptr<Module>s. ThreadSafeModule is a wrapper that ensures that
     Modules that use the same LLVMContext are not accessed concurrently.
     See `How to use ThreadSafeModule and ThreadSafeContext`_.

  5. Symbol lookup is no longer handled by layers. Instead, there is a
     ``lookup`` method on ``ExecutionSession`` that takes a list of JITDylibs
     to scan.

     .. code-block:: c++

       ExecutionSession ES;
       JITDylib &JD1 = ...;
       JITDylib &JD2 = ...;

       auto Sym = ES.lookup({&JD1, &JD2}, ES.intern("_main"));

  6. Module removal is not yet supported. There is no equivalent of the
     layer concept removeModule/removeObject methods. Work on resource tracking
     and removal in ORCv2 is ongoing.

For code examples and suggestions of how to use the ORCv2 APIs, please see
the section `How-tos`_.

How-tos
=======

How to manage symbol strings
----------------------------

Symbol strings in ORC are uniqued to improve lookup performance, reduce memory
overhead, and allow symbol names to function as efficient keys. To get the
unique ``SymbolStringPtr`` for a string value, call the
``ExecutionSession::intern`` method:

  .. code-block:: c++

    ExecutionSession ES;
    /// ...
    auto MainSymbolName = ES.intern("main");
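
To see why uniquing makes symbol names cheap to compare and to use as keys, the
following self-contained sketch models a string pool. It is a toy illustration
and not ORC's string pool: ``ToyStringPool`` and ``sameSymbol`` are hypothetical
names, and reference counting of pool entries is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Toy string pool. Hypothetical, not ORC's implementation.
class ToyStringPool {
public:
  // Returns a pointer that is identical for equal strings.
  const std::string *intern(const std::string &S) {
    return &*Pool.insert(S).first;
  }

  std::size_t size() const { return Pool.size(); }

private:
  // Pointers to unordered_set elements stay valid when the set rehashes.
  std::unordered_set<std::string> Pool;
};

// Interned names compare by pointer, with no character-by-character scan.
bool sameSymbol(const std::string *A, const std::string *B) {
  assert(A && B && "symbol names must be interned first");
  return A == B;
}
```

Interning ``"main"`` twice yields the same pointer and leaves a single entry in
the pool, so equality checks and hash lookups on interned names only need to
look at the pointer value.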

If you wish to perform lookup using the C/IR name of a symbol you will also
need to apply the platform linker-mangling before interning the string. On
Linux this mangling is a no-op, but on other platforms it usually involves
adding a prefix to the string (e.g. '_' on Darwin). The mangling scheme is
based on the DataLayout for the target. Given a DataLayout and an
ExecutionSession, you can create a MangleAndInterner function object that
will perform both jobs for you:

  .. code-block:: c++

    ExecutionSession ES;
    const DataLayout &DL = ...;
    MangleAndInterner Mangle(ES, DL);

    // ...

    // Portable IR-symbol-name lookup:
    auto Sym = ES.lookup({&MainJD}, Mangle("main"));
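
The prefix-style mangling mentioned above amounts to very little code. The
helper below is a hypothetical illustration only: ``toyMangle`` is not an ORC
or LLVM function, and it assumes that the platform scheme is fully described by
an optional one-character prefix, which in real code would come from the
target's DataLayout.

```cpp
#include <cassert>
#include <string>

// Toy linker mangling: prepend the platform's global prefix, if any.
// A prefix of '\0' means "no prefix". Hypothetical helper, not ORC code.
std::string toyMangle(const std::string &IRName, char GlobalPrefix) {
  assert(!IRName.empty() && "cannot mangle an empty name");
  if (GlobalPrefix == '\0')
    return IRName; // e.g. Linux: the mangled name equals the IR name
  return std::string(1, GlobalPrefix) + IRName; // e.g. Darwin: '_' prefix
}
```

Under these assumptions, the IR name ``main`` stays ``main`` when there is no
prefix and becomes ``_main`` with a ``'_'`` prefix, which is why a lookup using
the unmangled name would not find the symbol on such platforms.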

How to create JITDylibs and set up linkage relationships
--------------------------------------------------------

In ORC, all symbol definitions reside in JITDylibs. JITDylibs are created by
calling the ``ExecutionSession::createJITDylib`` method with a unique name:

  .. code-block:: c++

    ExecutionSession ES;
    auto &JD = ES.createJITDylib("libFoo.dylib");

The JITDylib is owned by the ``ExecutionSession`` instance and will be freed
when it is destroyed.

Lang Hames607cd442019-07-16 21:34:59 +0000582How to use ThreadSafeModule and ThreadSafeContext
Lang Hamesf6d6b982020-01-16 21:46:35 -0800583-------------------------------------------------
Lang Hames607cd442019-07-16 21:34:59 +0000584
585ThreadSafeModule and ThreadSafeContext are wrappers around Modules and
586LLVMContexts respectively. A ThreadSafeModule is a pair of a
587std::unique_ptr<Module> and a (possibly shared) ThreadSafeContext value. A
588ThreadSafeContext is a pair of a std::unique_ptr<LLVMContext> and a lock.
Lang Hames809e9d12019-08-02 15:21:37 +0000589This design serves two purposes: providing a locking scheme and lifetime
Lang Hames607cd442019-07-16 21:34:59 +0000590management for LLVMContexts. The ThreadSafeContext may be locked to prevent
591accidental concurrent access by two Modules that use the same LLVMContext.
592The underlying LLVMContext is freed once all ThreadSafeContext values pointing
593to it are destroyed, allowing the context memory to be reclaimed as soon as
594the Modules referring to it are destroyed.
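
The lifetime rule described above can be illustrated with a small standalone
model that does not depend on LLVM. ``ToyContext`` and ``ToyThreadSafeContext``
are hypothetical names, and the use of ``std::shared_ptr`` and
``std::recursive_mutex`` is an assumption of this sketch, not a statement about
how the real classes are implemented. The point is only that copies of the
handle share one context and one lock, and that the context is freed when the
last copy goes away.

```cpp
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for LLVMContext that records its own destruction.
struct ToyContext {
  explicit ToyContext(bool &Freed) : Freed(Freed) {}
  ~ToyContext() { Freed = true; }
  bool &Freed;
};

// Toy model of a ThreadSafeContext value: a copyable handle to shared state
// that holds the context together with its lock.
class ToyThreadSafeContext {
  struct State {
    explicit State(std::unique_ptr<ToyContext> Ctx) : Ctx(std::move(Ctx)) {}
    std::unique_ptr<ToyContext> Ctx;
    std::recursive_mutex Mutex;
  };
  std::shared_ptr<State> S;

public:
  explicit ToyThreadSafeContext(std::unique_ptr<ToyContext> Ctx)
      : S(std::make_shared<State>(std::move(Ctx))) {}

  ToyContext *getContext() { return S->Ctx.get(); }

  // The returned lock releases the shared mutex when it goes out of scope.
  std::unique_lock<std::recursive_mutex> getLock() {
    return std::unique_lock<std::recursive_mutex>(S->Mutex);
  }
};
```

Copying a ``ToyThreadSafeContext`` only copies the shared pointer, so every
copy returns the same ``ToyContext`` pointer and contends for the same mutex.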

ThreadSafeContexts can be explicitly constructed from a
std::unique_ptr<LLVMContext>:

  .. code-block:: c++

    ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());

ThreadSafeModules can be constructed from a pair of a std::unique_ptr<Module>
and a ThreadSafeContext value. ThreadSafeContext values may be shared between
multiple ThreadSafeModules:

  .. code-block:: c++

    ThreadSafeModule TSM1(
      std::make_unique<Module>("M1", *TSCtx.getContext()), TSCtx);

    ThreadSafeModule TSM2(
      std::make_unique<Module>("M2", *TSCtx.getContext()), TSCtx);

Before using a ThreadSafeContext, clients should ensure that either the context
is only accessible on the current thread, or that the context is locked. In the
example above (where the context is never locked) we rely on the fact that
``TSM1``, ``TSM2`` and ``TSCtx`` are all created on one thread. If a context is
going to be shared between threads then it must be locked before accessing or
creating any Modules attached to it. E.g.

  .. code-block:: c++

    ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());

    ThreadPool TP(NumThreads);
    JITStack J;

    for (auto &ModulePath : ModulePaths) {
      TP.async(
        [&]() {
          auto Lock = TSCtx.getLock();
          auto M = loadModuleOnContext(ModulePath, TSCtx.getContext());
          J.addModule(ThreadSafeModule(std::move(M), TSCtx));
        });
    }

    TP.wait();

To make exclusive access to Modules easier to manage, the ThreadSafeModule class
provides a convenience function, ``withModuleDo``, that implicitly (1) locks the
associated context, (2) runs a given function object, (3) unlocks the context,
and (4) returns the result generated by the function object. E.g.

  .. code-block:: c++

    ThreadSafeModule TSM = getModule(...);

    // Count the functions in the module:
    size_t NumFunctionsInModule =
      TSM.withModuleDo(
        [](Module &M) { // <- Context locked before entering lambda.
          return M.size();
        } // <- Context unlocked after leaving.
      );
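
The lock, run, unlock, return sequence can be sketched without LLVM as follows.
``ToyModule`` and ``ToyThreadSafeModule`` are hypothetical names, and sharing a
plain ``std::mutex`` through a ``std::shared_ptr`` is an assumption made to
keep the sketch short. It is meant to show the shape of the pattern, not the
real implementation.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for llvm::Module: just a list of function names.
struct ToyModule {
  std::vector<const char *> Functions;
  std::size_t size() const { return Functions.size(); }
};

class ToyThreadSafeModule {
public:
  ToyThreadSafeModule(std::unique_ptr<ToyModule> M,
                      std::shared_ptr<std::mutex> CtxLock)
      : M(std::move(M)), CtxLock(std::move(CtxLock)) {}

  // Lock the shared context mutex, run F on the module, and return whatever
  // F returned. The lock guard releases the mutex when the function exits,
  // including when F throws.
  template <typename Func> decltype(auto) withModuleDo(Func &&F) {
    std::lock_guard<std::mutex> Lock(*CtxLock);
    return std::forward<Func>(F)(*M);
  }

private:
  std::unique_ptr<ToyModule> M;
  std::shared_ptr<std::mutex> CtxLock;
};
```

Because the guard lives only for the duration of the call, callers cannot
forget to unlock, and two toy modules built on the same mutex can never run
their function objects at the same time.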

Clients wishing to maximize possibilities for concurrent compilation will want
to create every new ThreadSafeModule on a new ThreadSafeContext. For this
reason a convenience constructor for ThreadSafeModule is provided that implicitly
constructs a new ThreadSafeContext value from a std::unique_ptr<LLVMContext>:

  .. code-block:: c++

    // Maximize concurrency opportunities by loading every module on a
    // separate context.
    for (const auto &IRPath : IRPaths) {
      auto Ctx = std::make_unique<LLVMContext>();
      auto M = std::make_unique<Module>("M", *Ctx);
      CompileLayer.add(MainJD, ThreadSafeModule(std::move(M), std::move(Ctx)));
    }

Clients who plan to run single-threaded may choose to save memory by loading
all modules on the same context:

  .. code-block:: c++

    // Save memory by using one context for all Modules:
    ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());
    for (const auto &IRPath : IRPaths) {
      ThreadSafeModule TSM(parsePath(IRPath, *TSCtx.getContext()), TSCtx);
      CompileLayer.add(MainJD, std::move(TSM));
    }

.. _ProcessAndLibrarySymbols:

How to Add Process and Library Symbols to the JITDylibs
=======================================================

JIT'd code typically needs access to symbols in the host program or in
supporting libraries. References to process symbols can be "baked in" to code
as it is compiled by turning external references into pre-resolved integer
constants. However, this ties the JIT'd code to the current process's virtual
memory layout (meaning that it cannot be cached between runs) and makes
debugging lower level program representations difficult (as all external
references are opaque integer values). A better solution is to maintain symbolic
external references and let the JIT linker bind them for you at runtime. To
allow the JIT linker to find these external definitions, their addresses must
be added to a JITDylib that the JIT'd definitions link against.

Adding definitions for external symbols can be done using the
``absoluteSymbols`` function:

  .. code-block:: c++

    const DataLayout &DL = getDataLayout();
    MangleAndInterner Mangle(ES, DL);

    auto &JD = ES.createJITDylib("main");

    JD.define(
      absoluteSymbols({
        { Mangle("puts"), pointerToJITTargetAddress(&puts)},
        { Mangle("gets"), pointerToJITTargetAddress(&gets)}
      }));

Manually adding absolute symbols for a large or changing interface is
cumbersome, however, so ORC provides an alternative to generate new definitions
on demand: *definition generators*. If a definition generator is attached to a
JITDylib, then any unsuccessful lookup on that JITDylib will fall back to
calling the definition generator, and the definition generator may choose to
generate a new definition for the missing symbols. Of particular use here is the
``DynamicLibrarySearchGenerator`` utility. This can be used to reflect the whole
exported symbol set of the process or a specific dynamic library, or a subset
of either of these determined by a predicate.

For example, to load the whole interface of a runtime library:

  .. code-block:: c++

    const DataLayout &DL = getDataLayout();
    auto &JD = ES.createJITDylib("main");

    JD.addGenerator(DynamicLibrarySearchGenerator::Load("/path/to/lib",
                                                        DL.getGlobalPrefix()));

    // IR added to JD can now link against all symbols exported by the library
    // at '/path/to/lib'.
    CompileLayer.add(JD, loadModule(...));

Or, to expose an allowed set of symbols from the main process:

  .. code-block:: c++

    const DataLayout &DL = getDataLayout();
    MangleAndInterner Mangle(ES, DL);

    auto &JD = ES.createJITDylib("main");

    DenseSet<SymbolStringPtr> AllowList({
        Mangle("puts"),
        Mangle("gets")
      });

    // Use GetForCurrentProcess with a predicate function that checks the
    // allowed list.
    JD.addGenerator(
      DynamicLibrarySearchGenerator::GetForCurrentProcess(
        DL.getGlobalPrefix(),
        [&](const SymbolStringPtr &S) { return AllowList.count(S); }));

    // IR added to JD can now link against any symbols exported by the process
    // and contained in the list.
    CompileLayer.add(JD, loadModule(...));
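
The fallback behaviour of a definition generator can be modelled in a few lines
of standalone C++17. Nothing here is an ORC API: ``ToyDylib`` and
``makeAllowListGenerator`` are hypothetical names, addresses are plain
integers, and the "process" is just a map from names to addresses. The sketch
shows a lookup that first checks existing definitions and only then asks the
generator, which in turn applies an allow-list predicate.

```cpp
#include <functional>
#include <map>
#include <optional>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for a JITDylib: a symbol table plus an optional
// generator that is consulted only when a lookup misses.
class ToyDylib {
public:
  using Address = unsigned long long;
  using Generator = std::function<std::optional<Address>(const std::string &)>;

  void define(const std::string &Name, Address Addr) { Symbols[Name] = Addr; }
  void setGenerator(Generator G) { Gen = std::move(G); }

  std::optional<Address> lookup(const std::string &Name) {
    auto It = Symbols.find(Name);
    if (It != Symbols.end())
      return It->second;
    if (Gen) {
      if (auto Addr = Gen(Name)) {
        Symbols[Name] = *Addr; // Keep the generated definition.
        return Addr;
      }
    }
    return std::nullopt; // Neither defined nor generated.
  }

private:
  std::map<std::string, Address> Symbols;
  Generator Gen;
};

// Build a generator that reflects symbols from a toy "process" table,
// filtered by an allow list.
inline ToyDylib::Generator
makeAllowListGenerator(std::map<std::string, ToyDylib::Address> ProcessSymbols,
                       std::set<std::string> AllowList) {
  return [Proc = std::move(ProcessSymbols), Allow = std::move(AllowList)](
             const std::string &Name) -> std::optional<ToyDylib::Address> {
    if (!Allow.count(Name))
      return std::nullopt;
    auto It = Proc.find(Name);
    if (It == Proc.end())
      return std::nullopt;
    return It->second;
  };
}
```

In this model a symbol that exists in the process table but is missing from
the allow list is never exposed, which mirrors the role of the predicate in
the ``GetForCurrentProcess`` example.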

Roadmap
=======

ORC is still undergoing active development. Some current and planned work is
listed below.

Current Work
------------

1. **TargetProcessControl: Improvements to in-tree support for out-of-process
   execution**

   The ``TargetProcessControl`` API provides various operations on the JIT
   target process (the one which will execute the JIT'd code), including
   memory allocation, memory writes, function execution, and process queries
   (e.g. for the target triple). By targeting this API new components can be
   developed which will work equally well for in-process and out-of-process
   JITing.

2. **ORC RPC based TargetProcessControl implementation**

   An ORC RPC based implementation of the ``TargetProcessControl`` API is
   currently under development to enable easy out-of-process JITing via
   file descriptors / sockets.

3. **Core State Machine Cleanup**

   The core ORC state machine is currently implemented between JITDylib and
   ExecutionSession. Methods are slowly being moved to ``ExecutionSession``. This
   will tidy up the code base, and also allow us to support asynchronous removal
   of JITDylibs (in practice deleting an associated state object in
   ExecutionSession and leaving the JITDylib instance in a defunct state until
   all references to it have been released).

Near Future Work
----------------

1. **ORC JIT Runtime Libraries**

   We need a runtime library for JIT'd code. This would include things like
   TLS registration, reentry functions, registration code for language runtimes
   (e.g. Objective-C and Swift) and other JIT specific runtime code. This should
   be built in a similar manner to compiler-rt (possibly even as part of it).

2. **Remote jit_dlopen / jit_dlclose**

   To more fully mimic the environment that static programs operate in, we would
   like JIT'd code to be able to "dlopen" and "dlclose" JITDylibs, running all of
   their initializers/deinitializers on the current thread. This would require
   support from the runtime library described above.

3. **Debugging support**

   ORC currently supports the GDBRegistrationListener API when using RuntimeDyld
   as the underlying JIT linker. We will need a new solution for JITLink based
   platforms.

Further Future Work
-------------------

1. **Speculative Compilation**

   ORC's support for concurrent compilation allows us to easily enable
   *speculative* JIT compilation: compilation of code that is not needed yet,
   but which we have reason to believe will be needed in the future. This can be
   used to hide compile latency and improve JIT throughput. A proof-of-concept
   example of speculative compilation with ORC has already been developed (see
   ``llvm/examples/SpeculativeJIT``). Future work on this is likely to focus on
   re-using and improving existing profiling support (currently used by PGO) to
   feed speculation decisions, as well as built-in tools to simplify use of
   speculative compilation.

.. [1] Formats/architectures vary in terms of supported features. MachO and
       ELF tend to have better support than COFF. Patches very welcome!

.. [2] The ``LazyEmittingLayer``, ``RemoteObjectClientLayer`` and
       ``RemoteObjectServerLayer`` do not have counterparts in the new
       system. In the case of ``LazyEmittingLayer`` it was simply no longer
       needed: in ORCv2, deferring compilation until symbols are looked up is
       the default. The removal of ``RemoteObjectClientLayer`` and
       ``RemoteObjectServerLayer`` means that JIT stacks can no longer be split
       across processes; however, this functionality appears not to have been
       used.

.. [3] Weak definitions are currently handled correctly within dylibs, but if
       multiple dylibs provide a weak definition of a symbol then each will end
       up with its own definition (similar to how weak definitions are handled
       in Windows DLLs). This will be fixed in the future.