| ============================== |
| LLVM Language Reference Manual |
| ============================== |
| |
| .. contents:: |
| :local: |
| :depth: 3 |
| |
| Abstract |
| ======== |
| |
| This document is a reference manual for the LLVM assembly language. LLVM |
| is a Static Single Assignment (SSA) based representation that provides |
| type safety, low-level operations, flexibility, and the capability of |
| representing 'all' high-level languages cleanly. It is the common code |
| representation used throughout all phases of the LLVM compilation |
| strategy. |
| |
| Introduction |
| ============ |
| |
| The LLVM code representation is designed to be used in three different |
| forms: as an in-memory compiler IR, as an on-disk bitcode representation |
| (suitable for fast loading by a Just-In-Time compiler), and as a human |
| readable assembly language representation. This allows LLVM to provide a |
| powerful intermediate representation for efficient compiler |
| transformations and analysis, while providing a natural means to debug |
| and visualize the transformations. The three different forms of LLVM are |
| all equivalent. This document describes the human readable |
| representation and notation. |
| |
| The LLVM representation aims to be light-weight and low-level while |
| being expressive, typed, and extensible at the same time. It aims to be |
| a "universal IR" of sorts, by being at a low enough level that |
| high-level ideas may be cleanly mapped to it (similar to how |
| microprocessors are "universal IRs", allowing many source languages to |
| be mapped to them). By providing type information, LLVM can be used as |
| the target of optimizations: for example, through pointer analysis, it |
| can be proven that a C automatic variable is never accessed outside of |
| the current function, allowing it to be promoted to a simple SSA value |
| instead of a memory location. |
| |
| .. _wellformed: |
| |
| Well-Formedness |
| --------------- |
| |
| It is important to note that this document describes 'well formed' LLVM |
| assembly language. There is a difference between what the parser accepts |
| and what is considered 'well formed'. For example, the following |
| instruction is syntactically okay, but not well formed: |
| |
| .. code-block:: llvm |
| |
| %x = add i32 1, %x |
| |
| because the definition of ``%x`` does not dominate all of its uses. The |
| LLVM infrastructure provides a verification pass that may be used to |
| verify that an LLVM module is well formed. This pass is automatically |
| run by the parser after parsing input assembly and by the optimizer |
| before it outputs bitcode. The violations pointed out by the verifier |
| pass indicate bugs in transformation passes or input to the parser. |
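| |
| As a contrasting sketch, the following function is well formed even |
| though ``%x`` and ``%x.next`` refer to each other: the cycle goes through a |
| ``phi`` node, whose incoming value only has to be available on the edge |
| from the corresponding predecessor block. The function name ``@count_to`` |
| and the value names are hypothetical and chosen for illustration only: |
| |
| .. code-block:: llvm |
| |
| define i32 @count_to(i32 %n) { |
| entry: |
| br label %loop |
| loop: |
| ; %x.next is defined later in this block, but the phi only needs it |
| ; on the edge coming back from %loop. |
| %x = phi i32 [ 0, %entry ], [ %x.next, %loop ] |
| %x.next = add i32 %x, 1 |
| %done = icmp eq i32 %x.next, %n |
| br i1 %done, label %exit, label %loop |
| exit: |
| ret i32 %x.next |
| } |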
| |
| .. _identifiers: |
| |
| Identifiers |
| =========== |
| |
| LLVM identifiers come in two basic types: global and local. Global |
| identifiers (functions, global variables) begin with the ``'@'`` |
| character. Local identifiers (register names, types) begin with the |
| ``'%'`` character. Additionally, there are three different formats for |
| identifiers, for different purposes: |
| |
| #. Named values are represented as a string of characters with their |
| prefix. For example, ``%foo``, ``@DivisionByZero``, |
| ``%a.really.long.identifier``. The actual regular expression used is |
| '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other |
| characters in their names can be surrounded with quotes. Special |
| characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII |
| code for the character in hexadecimal. In this way, any character can |
| be used in a name value, even quotes themselves. The ``"\01"`` prefix |
| can be used on global values to suppress mangling. |
| #. Unnamed values are represented as an unsigned numeric value with |
| their prefix. For example, ``%12``, ``@2``, ``%44``. |
| #. Constants, which are described in the section Constants_ below. |
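| |
| The formats above can be illustrated with a few global definitions. This |
| is only a minimal sketch; all names are made up for illustration: |
| |
| .. code-block:: llvm |
| |
| @counter = global i32 0 ; named value |
| @"name with spaces" = global i32 0 ; quoted name with other characters |
| @"say\22hi\22" = global i32 0 ; \22 is the hexadecimal escape for a quote |
| @0 = private global i32 7 ; unnamed value; 7 is a constant |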
| |
| LLVM requires that values start with a prefix for two reasons: Compilers |
| don't need to worry about name clashes with reserved words, and the set |
| of reserved words may be expanded in the future without penalty. |
| Additionally, unnamed identifiers allow a compiler to quickly come up |
| with a temporary variable without having to avoid symbol table |
| conflicts. |
| |
| Reserved words in LLVM are very similar to reserved words in other |
| languages. There are keywords for different opcodes ('``add``', |
| '``bitcast``', '``ret``', etc.), for primitive type names ('``void``', |
| '``i32``', etc.), and others. These reserved words cannot conflict |
| with variable names, because none of them start with a prefix character |
| (``'%'`` or ``'@'``). |
| |
| Here is an example of LLVM code to multiply the integer variable |
| '``%X``' by 8: |
| |
| The easy way: |
| |
| .. code-block:: llvm |
| |
| %result = mul i32 %X, 8 |
| |
| After strength reduction: |
| |
| .. code-block:: llvm |
| |
| %result = shl i32 %X, 3 |
| |
| And the hard way: |
| |
| .. code-block:: llvm |
| |
| %0 = add i32 %X, %X ; yields i32:%0 |
| %1 = add i32 %0, %0 ; yields i32:%1 |
| %result = add i32 %1, %1 |
| |
| This last way of multiplying ``%X`` by 8 illustrates several important |
| lexical features of LLVM: |
| |
| #. Comments are delimited with a '``;``' and go until the end of line. |
| #. Unnamed temporaries are created when the result of a computation is |
| not assigned to a named value. |
| #. Unnamed temporaries are numbered sequentially (using a per-function |
| incrementing counter, starting with 0). Note that basic blocks and unnamed |
| function parameters are included in this numbering. For example, if the |
| entry basic block is not given a label name and all function parameters are |
| named, then it will get number 0. |
| |
| It also shows a convention that we follow in this document. When |
| demonstrating instructions, we will follow an instruction with a comment |
| that defines the type and name of the value produced. |
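| |
| The numbering rule can be seen in a small function with one unnamed |
| parameter and an unlabeled entry block. This is a minimal sketch; the |
| function name ``@f`` is hypothetical: |
| |
| .. code-block:: llvm |
| |
| define i32 @f(i32 %a, i32) { |
| ; The unnamed parameter is %0 and the unlabeled entry block is %1, |
| ; so the first unnamed temporary is %2. |
| %2 = add i32 %a, %0 ; yields i32:%2 |
| ret i32 %2 |
| } |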
| |
| High Level Structure |
| ==================== |
| |
| Module Structure |
| ---------------- |
| |
| LLVM programs are composed of ``Module``'s, each of which is a |
| translation unit of the input programs. Each module consists of |
| functions, global variables, and symbol table entries. Modules may be |
| combined together with the LLVM linker, which merges function (and |
| global variable) definitions, resolves forward declarations, and merges |
| symbol table entries. Here is an example of the "hello world" module: |
| |
| .. code-block:: llvm |
| |
| ; Declare the string constant as a global constant. |
| @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00" |
| |
| ; External declaration of the puts function |
| declare i32 @puts(ptr nocapture) nounwind |
| |
| ; Definition of main function |
| define i32 @main() { |
| ; Call puts function to write out the string to stdout. |
| call i32 @puts(ptr @.str) |
| ret i32 0 |
| } |
| |
| ; Named metadata |
| !0 = !{i32 42, null, !"string"} |
| !foo = !{!0} |
| |
| This example is made up of a :ref:`global variable <globalvars>` named |
| "``.str``", an external declaration of the "``puts``" function, a |
| :ref:`function definition <functionstructure>` for "``main``" and |
| :ref:`named metadata <namedmetadatastructure>` "``foo``". |
| |
| In general, a module is made up of a list of global values (where both |
| functions and global variables are global values). Global values are |
| represented by a pointer to a memory location (in this case, a pointer |
| to an array of char, and a pointer to a function), and have one of the |
| following :ref:`linkage types <linkage>`. |
| |
| .. _linkage: |
| |
| Linkage Types |
| ------------- |
| |
| All Global Variables and Functions have one of the following types of |
| linkage: |
| |
| ``private`` |
| Global values with "``private``" linkage are only directly |
| accessible by objects in the current module. In particular, linking |
| code into a module with a private global value may cause the |
| private global to be renamed as necessary to avoid collisions. Because the |
| symbol is private to the module, all references can be updated. A |
| private global doesn't show up in any symbol table in the object file. |
| ``internal`` |
| Similar to private, but the value shows as a local symbol |
| (``STB_LOCAL`` in the case of ELF) in the object file. This |
| corresponds to the notion of the '``static``' keyword in C. |
| ``available_externally`` |
| Globals with "``available_externally``" linkage are never emitted into |
| the object file corresponding to the LLVM module. From the linker's |
| perspective, an ``available_externally`` global is equivalent to |
| an external declaration. They exist to allow inlining and other |
| optimizations to take place given knowledge of the definition of the |
| global, which is known to be somewhere outside the module. Globals |
| with ``available_externally`` linkage are allowed to be discarded at |
| will. This linkage type is only allowed on definitions, not |
| declarations. |
| ``linkonce`` |
| Globals with "``linkonce``" linkage are merged with other globals of |
| the same name when linkage occurs. This can be used to implement |
| some forms of inline functions, templates, or other code which must |
| be generated in each translation unit that uses it, but where the |
| body may be overridden with a more definitive definition later. |
| Unreferenced ``linkonce`` globals are allowed to be discarded. Note |
| that ``linkonce`` linkage does not actually allow the optimizer to |
| inline the body of this function into callers because it doesn't |
| know if this definition of the function is the definitive definition |
| within the program or whether it will be overridden by a stronger |
| definition. To enable inlining and other optimizations, use |
| "``linkonce_odr``" linkage. |
| ``weak`` |
| "``weak``" linkage has the same merging semantics as ``linkonce`` |
| linkage, except that unreferenced globals with ``weak`` linkage may |
| not be discarded. This is used for globals that are declared "weak" |
| in C source code. |
| ``common`` |
| "``common``" linkage is most similar to "``weak``" linkage, but they |
| are used for tentative definitions in C, such as "``int X;``" at |
| global scope. Symbols with "``common``" linkage are merged in the |
| same way as ``weak`` symbols, and they may not be deleted if |
| unreferenced. ``common`` symbols may not have an explicit section, |
| must have a zero initializer, and may not be marked |
| ':ref:`constant <globalvars>`'. Functions and aliases may not have |
| common linkage. |
| |
| .. _linkage_appending: |
| |
| ``appending`` |
| "``appending``" linkage may only be applied to global variables of |
| pointer to array type. When two global variables with appending |
| linkage are linked together, the two global arrays are appended |
| together. This is the LLVM, typesafe, equivalent of having the |
| system linker append together "sections" with identical names when |
| .o files are linked. |
| |
| Unfortunately this doesn't correspond to any feature in .o files, so it |
| can only be used for variables like ``llvm.global_ctors`` which LLVM |
| interprets specially. |
| |
| ``extern_weak`` |
| The semantics of this linkage follow the ELF object file model: the |
| symbol is weak until linked; if it is not linked, the symbol becomes null |
| instead of being an undefined reference. |
| ``linkonce_odr``, ``weak_odr`` |
| Some languages allow differing globals to be merged, such as two |
| functions with different semantics. Other languages, such as |
| ``C++``, ensure that only equivalent globals are ever merged (the |
| "one definition rule" --- "ODR"). Such languages can use the |
| ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the |
| global will only be merged with equivalent globals. These linkage |
| types are otherwise the same as their non-``odr`` versions. |
| ``external`` |
| If none of the above identifiers are used, the global is externally |
| visible, meaning that it participates in linkage and can be used to |
| resolve external symbol references. |
| |
| It is illegal for a global variable or function *declaration* to have any |
| linkage type other than ``external`` or ``extern_weak``. |
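| |
| The linkage keyword is written before ``global``/``constant`` on a |
| variable and directly after ``define``/``declare`` on a function. The |
| following is a minimal sketch of several of the linkage types above; all |
| symbol names are hypothetical: |
| |
| .. code-block:: llvm |
| |
| @secret = private constant i32 42 ; renamed as needed when linking |
| @counter = internal global i32 0 ; like a C 'static' global |
| @tentative = common global i32 0 ; C tentative definition 'int X;' |
| @maybe = extern_weak global i32 ; declaration; null if never defined |
| |
| define linkonce_odr i32 @inline_candidate() { |
| ret i32 1 |
| } |
| |
| define weak i32 @overridable() { |
| ret i32 0 |
| } |
| |
| define available_externally i32 @defined_elsewhere() { |
| ret i32 7 |
| } |
| |
| declare extern_weak void @optional_hook() |
| |
| ; No linkage keyword: external linkage. |
| define i32 @public_entry() { |
| ret i32 0 |
| } |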
| |
| .. _callingconv: |
| |
| Calling Conventions |
| ------------------- |
| |
| LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and |
| :ref:`invokes <i_invoke>` can all have an optional calling convention |
| specified for the call. The calling convention of any pair of dynamic |
| caller/callee must match, or the behavior of the program is undefined. |
| The following calling conventions are supported by LLVM, and more may be |
| added in the future: |
| |
| "``ccc``" - The C calling convention |
| This calling convention (the default if no other calling convention |
| is specified) matches the target C calling conventions. This calling |
| convention supports varargs function calls and tolerates some |
| mismatch in the declared prototype and implemented declaration of |
| the function (as does normal C). |
| "``fastcc``" - The fast calling convention |
| This calling convention attempts to make calls as fast as possible |
| (e.g. by passing things in registers). This calling convention |
| allows the target to use whatever tricks it wants to produce fast |
| code for the target, without having to conform to an externally |
| specified ABI (Application Binary Interface). `Tail calls can only |
| be optimized when this, the tailcc, the GHC or the HiPE convention is |
| used. <CodeGenerator.html#tail-call-optimization>`_ This calling |
| convention does not support varargs and requires the prototype of all |
| callees to exactly match the prototype of the function definition. |
| "``coldcc``" - The cold calling convention |
| This calling convention attempts to make code in the caller as |
| efficient as possible under the assumption that the call is not |
| commonly executed. As such, these calls often preserve all registers |
| so that the call does not break any live ranges in the caller side. |
| This calling convention does not support varargs and requires the |
| prototype of all callees to exactly match the prototype of the |
| function definition. Furthermore, the inliner doesn't consider such function |
| calls for inlining. |
| "``cc 10``" - GHC convention |
| This calling convention has been implemented specifically for use by |
| the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_. |
| It passes everything in registers, going to extremes to achieve this |
| by disabling callee save registers. This calling convention should |
| not be used lightly but only for specific situations such as an |
| alternative to the *register pinning* performance technique often |
| used when implementing functional programming languages. At the |
| moment only X86 supports this convention and it has the following |
| limitations: |
| |
| - On *X86-32*, only up to 4 bit type parameters are supported. No |
| floating-point types are supported. |
| - On *X86-64*, only up to 10 bit type parameters and 6 |
| floating-point parameters are supported. |
| |
| This calling convention supports `tail call |
| optimization <CodeGenerator.html#tail-call-optimization>`_ but requires |
| that both the caller and the callee use it. |
| "``cc 11``" - The HiPE calling convention |
| This calling convention has been implemented specifically for use by |
| the `High-Performance Erlang |
| (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the* |
| native code compiler of `Ericsson's Open Source Erlang/OTP |
| system <http://www.erlang.org/download.shtml>`_. It uses more |
| registers for argument passing than the ordinary C calling |
| convention and defines no callee-saved registers. The calling |
| convention properly supports `tail call |
| optimization <CodeGenerator.html#tail-call-optimization>`_ but requires |
| that both the caller and the callee use it. It uses a *register pinning* |
| mechanism, similar to GHC's convention, for keeping frequently |
| accessed runtime components pinned to specific hardware registers. |
| At the moment only X86 supports this convention (both 32-bit and |
| 64-bit). |
| "``webkit_jscc``" - WebKit's JavaScript calling convention |
| This calling convention has been implemented for `WebKit FTL JIT |
| <https://trac.webkit.org/wiki/FTLJIT>`_. It passes arguments on the |
| stack right to left (as cdecl does), and returns a value in the |
| platform's customary return register. |
| "``anyregcc``" - Dynamic calling convention for code patching |
| This is a special convention that supports patching an arbitrary code |
| sequence in place of a call site. This convention forces the call |
| arguments into registers but allows them to be dynamically |
| allocated. This can currently only be used with calls to |
| ``llvm.experimental.patchpoint`` because only this intrinsic records |
| the location of its arguments in a side table. See :doc:`StackMaps`. |
| "``preserve_mostcc``" - The `PreserveMost` calling convention |
| This calling convention attempts to make the code in the caller as |
| unintrusive as possible. This convention behaves identically to the `C` |
| calling convention on how arguments and return values are passed, but it |
| uses a different set of caller/callee-saved registers. This alleviates the |
| burden of saving and recovering a large register set before and after the |
| call in the caller. If the arguments are passed in callee-saved registers, |
| then they will be preserved by the callee across the call. This doesn't |
| apply to values returned in callee-saved registers. |
| |
| - On X86-64 the callee preserves all general purpose registers, except for |
| R11 and return registers, if any. R11 can be used as a scratch register. |
| Floating-point registers (XMMs/YMMs) are not preserved and need to be |
| saved by the caller. |
| |
| - On AArch64 the callee preserves all general purpose registers, except X0-X8 |
| and X16-X18. |
| |
| The idea behind this convention is to support calls to runtime functions |
| that have a hot path and a cold path. The hot path is usually a small piece |
| of code that doesn't use many registers. The cold path might need to call out to |
| another function and therefore only needs to preserve the caller-saved |
| registers, which haven't already been saved by the caller. The |
| `PreserveMost` calling convention is very similar to the `cold` calling |
| convention in terms of caller/callee-saved registers, but they are used for |
| different types of function calls. `coldcc` is for function calls that are |
| rarely executed, whereas `preserve_mostcc` function calls are intended to be |
| on the hot path and definitely executed a lot. Furthermore `preserve_mostcc` |
| doesn't prevent the inliner from inlining the function call. |
| |
| This calling convention will be used by a future version of the Objective-C |
| runtime and should therefore still be considered experimental at this time. |
| Although this convention was created to optimize certain runtime calls to |
| the Objective-C runtime, it is not limited to this runtime and might be used |
| by other runtimes in the future too. The current implementation only |
| supports X86-64, but the intention is to support more architectures in the |
| future. |
| "``preserve_allcc``" - The `PreserveAll` calling convention |
| This calling convention attempts to make the code in the caller even less |
| intrusive than the `PreserveMost` calling convention. This calling |
| convention also behaves identically to the `C` calling convention on how |
| arguments and return values are passed, but it uses a different set of |
| caller/callee-saved registers. This removes the burden of saving and |
| recovering a large register set before and after the call in the caller. If |
| the arguments are passed in callee-saved registers, then they will be |
| preserved by the callee across the call. This doesn't apply to values |
| returned in callee-saved registers. |
| |
| - On X86-64 the callee preserves all general purpose registers, except for |
| R11. R11 can be used as a scratch register. Furthermore it also preserves |
| all floating-point registers (XMMs/YMMs). |
| |
| - On AArch64 the callee preserves all general purpose registers, except X0-X8 |
| and X16-X18. Furthermore it also preserves the lower 128 bits of the V8-V31 |
| SIMD floating-point registers. |
| |
| The idea behind this convention is to support calls to runtime functions |
| that don't need to call out to any other functions. |
| |
| This calling convention, like the `PreserveMost` calling convention, will be |
| used by a future version of the Objective-C runtime and should be considered |
| experimental at this time. |
| "``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions |
| Clang generates an access function to access C++-style TLS. The access |
| function generally has an entry block, an exit block and an initialization |
| block that is run the first time. The entry and exit blocks can access |
| a few TLS IR variables; each access will be lowered to a platform-specific |
| sequence. |
| |
| This calling convention aims to minimize overhead in the caller by |
| preserving as many registers as possible (all the registers that are |
| preserved on the fast path, composed of the entry and exit blocks). |
| |
| This calling convention behaves identically to the `C` calling convention on |
| how arguments and return values are passed, but it uses a different set of |
| caller/callee-saved registers. |
| |
| Given that each platform has its own lowering sequence, hence its own set |
| of preserved registers, we can't use the existing `PreserveMost`. |
| |
| - On X86-64 the callee preserves all general purpose registers, except for |
| RDI and RAX. |
| "``tailcc``" - Tail callable calling convention |
| This calling convention ensures that calls in tail position will always be |
| tail call optimized. This calling convention is equivalent to fastcc, |
| except for an additional guarantee that tail calls will be produced |
| whenever possible. `Tail calls can only be optimized when this, the fastcc, |
| the GHC or the HiPE convention is used. <CodeGenerator.html#tail-call-optimization>`_ |
| This calling convention does not support varargs and requires the prototype of |
| all callees to exactly match the prototype of the function definition. |
| "``swiftcc``" - This calling convention is used for Swift language. |
| - On X86-64 RCX and R8 are available for additional integer returns, and |
| XMM2 and XMM3 are available for additional FP/vector returns. |
| - On iOS platforms, the AAPCS-VFP calling convention is used. |
| "``swifttailcc``" |
| This calling convention is like ``swiftcc`` in most respects, but also the |
| callee pops the argument area of the stack so that mandatory tail calls are |
| possible as in ``tailcc``. |
| "``cfguard_checkcc``" - Windows Control Flow Guard (Check mechanism) |
| This calling convention is used for the Control Flow Guard check function, |
| calls to which can be inserted before indirect calls to check that the call |
| target is a valid function address. The check function has no return value, |
| but it will trigger an OS-level error if the address is not a valid target. |
| The set of registers preserved by the check function, and the register |
| containing the target address are architecture-specific. |
| |
| - On X86 the target address is passed in ECX. |
| - On ARM the target address is passed in R0. |
| - On AArch64 the target address is passed in X15. |
| "``cc <n>``" - Numbered convention |
| Any calling convention may be specified by number, allowing |
| target-specific calling conventions to be used. Target-specific |
| calling conventions start at 64. |
| |
| More calling conventions can be added/defined on an as-needed basis, to |
| support Pascal conventions or any other well-known target-independent |
| convention. |
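| |
| The calling convention keyword appears on the function definition or |
| declaration, and the convention written at each call site has to match |
| it. The following is a minimal sketch with hypothetical function names, |
| not a recommendation for when to use each convention: |
| |
| .. code-block:: llvm |
| |
| declare coldcc void @report_error(ptr) |
| |
| define fastcc i32 @add_fast(i32 %a, i32 %b) { |
| %sum = add i32 %a, %b |
| ret i32 %sum |
| } |
| |
| define tailcc i32 @countdown(i32 %n) { |
| entry: |
| %done = icmp eq i32 %n, 0 |
| br i1 %done, label %exit, label %recurse |
| recurse: |
| %next = sub i32 %n, 1 |
| ; Call in tail position, using the same convention as the callee. |
| %r = tail call tailcc i32 @countdown(i32 %next) |
| ret i32 %r |
| exit: |
| ret i32 0 |
| } |
| |
| define i32 @caller(i32 %x, ptr %msg) { |
| ; Each call site names the convention of the function it calls. |
| %r = call fastcc i32 @add_fast(i32 %x, i32 1) |
| call coldcc void @report_error(ptr %msg) |
| ret i32 %r |
| } |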
| |
| .. _visibilitystyles: |
| |
| Visibility Styles |
| ----------------- |
| |
| All Global Variables and Functions have one of the following visibility |
| styles: |
| |
| "``default``" - Default style |
| On targets that use the ELF object file format, default visibility |
| means that the declaration is visible to other modules and, in |
| shared libraries, means that the declared entity may be overridden. |
| On Darwin, default visibility means that the declaration is visible |
| to other modules. On XCOFF, default visibility means no explicit |
| visibility bit will be set and whether the symbol is visible |
| (i.e. "exported") to other modules depends primarily on export lists |
| provided to the linker. Default visibility corresponds to "external |
| linkage" in the language. |
| "``hidden``" - Hidden style |
| Two declarations of an object with hidden visibility refer to the |
| same object if they are in the same shared object. Usually, hidden |
| visibility indicates that the symbol will not be placed into the |
| dynamic symbol table, so no other module (executable or shared |
| library) can reference it directly. |
| "``protected``" - Protected style |
| On ELF, protected visibility indicates that the symbol will be |
| placed in the dynamic symbol table, but that references within the |
| defining module will bind to the local symbol. That is, the symbol |
| cannot be overridden by another module. |
| |
| A symbol with ``internal`` or ``private`` linkage must have ``default`` |
| visibility. |
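| |
| A visibility style is written after the optional linkage type. The |
| following is a minimal sketch with hypothetical symbol names: |
| |
| .. code-block:: llvm |
| |
| @shared_state = hidden global i32 0 |
| |
| define hidden void @helper() { |
| ret void |
| } |
| |
| define protected i32 @api_entry() { |
| ret i32 0 |
| } |
| |
| ; No visibility keyword: default visibility. |
| define i32 @exported() { |
| ret i32 0 |
| } |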
| |
| .. _dllstorageclass: |
| |
| DLL Storage Classes |
| ------------------- |
| |
| All Global Variables, Functions and Aliases can have one of the following |
| DLL storage classes: |
| |
| ``dllimport`` |
| "``dllimport``" causes the compiler to reference a function or variable via |
| a global pointer to a pointer that is set up by the DLL exporting the |
| symbol. On Microsoft Windows targets, the pointer name is formed by |
| combining ``__imp_`` and the function or variable name. |
| ``dllexport`` |
| On Microsoft Windows targets, "``dllexport``" causes the compiler to provide |
| a global pointer to a pointer in a DLL, so that it can be referenced with the |
| ``dllimport`` attribute. The pointer name is formed by combining ``__imp_`` |
| and the function or variable name. On XCOFF targets, ``dllexport`` indicates |
| that the symbol will be made visible to other modules using "exported" |
| visibility and thus placed by the linker in the loader section symbol table. |
| Since this storage class exists for defining a dll interface, the compiler, |
| assembler and linker know it is externally referenced and must refrain from |
| deleting the symbol. |
| |
| A symbol with ``internal`` or ``private`` linkage cannot have a DLL storage |
| class. |
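| |
| As a minimal sketch, the storage class is written before ``global`` on a |
| variable and after ``define``/``declare`` on a function. All symbol names |
| below are hypothetical: |
| |
| .. code-block:: llvm |
| |
| ; Symbols provided by another DLL. |
| declare dllimport void @imported_fn() |
| @imported_var = external dllimport global i32 |
| |
| ; Symbols this module makes available to other modules. |
| @exported_var = dllexport global i32 0 |
| |
| define dllexport void @exported_fn() { |
| ret void |
| } |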
| |
| .. _tls_model: |
| |
| Thread Local Storage Models |
| --------------------------- |
| |
| A variable may be defined as ``thread_local``, which means that it will |
| not be shared by threads (each thread will have a separate copy of the |
| variable). Not all targets support thread-local variables. Optionally, a |
| TLS model may be specified: |
| |
| ``localdynamic`` |
| For variables that are only used within the current shared library. |
| ``initialexec`` |
| For variables in modules that will not be loaded dynamically. |
| ``localexec`` |
| For variables defined in the executable and only used within it. |
| |
| If no explicit model is given, the "general dynamic" model is used. |
| |
| The models correspond to the ELF TLS models; see `ELF Handling For |
| Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for |
| more information on the circumstances under which the different models may |
| be used. The target may choose a different TLS model if the specified |
| model is not supported, or if a better choice of model can be made. |
| |
| A model can also be specified in an alias, but then it only governs how |
| the alias is accessed. It will not have any effect on the aliasee. |
| |
| For platforms without linker support for the ELF TLS model, the |
| ``-femulated-tls`` flag can be used to generate GCC-compatible emulated |
| TLS code. |
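| |
| As a minimal sketch, the optional model is written in parentheses after |
| ``thread_local``. The variable names are hypothetical: |
| |
| .. code-block:: llvm |
| |
| @tls_default = thread_local global i32 0 ; general dynamic |
| @tls_ld = thread_local(localdynamic) global i32 0 |
| @tls_ie = thread_local(initialexec) global i32 0 |
| @tls_le = thread_local(localexec) global i32 0 |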
| |
| .. _runtime_preemption_model: |
| |
| Runtime Preemption Specifiers |
| ----------------------------- |
| |
| Global variables, functions and aliases may have an optional runtime preemption |
| specifier. If a preemption specifier isn't given explicitly, then a |
| symbol is assumed to be ``dso_preemptable``. |
| |
| ``dso_preemptable`` |
| Indicates that the function or variable may be replaced by a symbol from |
| outside the linkage unit at runtime. |
| |
| ``dso_local`` |
| The compiler may assume that a function or variable marked as ``dso_local`` |
| will resolve to a symbol within the same linkage unit. Direct access will |
| be generated even if the definition is not within this compilation unit. |
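| |
| The specifier is written after the optional linkage type. The following |
| is a minimal sketch with hypothetical symbol names: |
| |
| .. code-block:: llvm |
| |
| @local_counter = dso_local global i32 0 |
| |
| define dso_local i32 @get_counter() { |
| %v = load i32, ptr @local_counter |
| ret i32 %v |
| } |
| |
| ; Explicitly preemptable, and preemptable by default. |
| declare dso_preemptable void @may_be_interposed() |
| declare void @also_preemptable() |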
| |
| .. _namedtypes: |
| |
| Structure Types |
| --------------- |
| |
| LLVM IR allows you to specify both "identified" and "literal" :ref:`structure |
| types <t_struct>`. Literal types are uniqued structurally, but identified types |
| are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used |
| to forward declare a type that is not yet available. |
| |
| An example of an identified structure specification is: |
| |
| .. code-block:: llvm |
| |
| %mytype = type { ptr, i32 } |
| |
| Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only |
| literal types are uniqued in recent versions of LLVM. |
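| |
| The difference between the kinds of structure type can be sketched as |
| follows, assuming opaque pointers (``ptr``) as in the module example. The |
| type and variable names are hypothetical: |
| |
| .. code-block:: llvm |
| |
| %node = type { ptr, i32 } ; identified type |
| %handle = type opaque ; opaque type, body not yet available |
| |
| ; A global of the identified type. |
| @head = global %node { ptr null, i32 0 } |
| |
| ; A global of a literal type. It has the same layout as %node, but |
| ; its type is the literal { ptr, i32 }, not %node. |
| @other = global { ptr, i32 } { ptr null, i32 0 } |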
| |
| .. _nointptrtype: |
| |
| Non-Integral Pointer Type |
| ------------------------- |
| |
| Note: non-integral pointer types are a work in progress, and they should be |
| considered experimental at this time. |
| |
| LLVM IR optionally allows the frontend to denote pointers in certain address |
| spaces as "non-integral" via the :ref:`datalayout string<langref_datalayout>`. |
| Non-integral pointer types represent pointers that have an *unspecified* bitwise |
| representation; that is, the integral representation may be target dependent or |
| unstable (not backed by a fixed integer). |
| |
| ``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for |
| integral (i.e. normal) pointers in that they convert integers to and from |
| corresponding pointer types, but there are additional implications to be |
| aware of. Because the bit-representation of a non-integral pointer may |
| not be stable, two identical casts of the same operand may or may not |
| return the same value. Said differently, the conversion to or from the |
| non-integral type depends on environmental state in an implementation |
| defined manner. |
| |
| If the frontend wishes to observe a *particular* value following a cast, the |
| generated IR must fence with the underlying environment in an implementation |
| defined manner. (In practice, this tends to require ``noinline`` routines for |
| such operations.) |
| |
| From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for |
| non-integral types are analogous to ones on integral types with one |
| key exception: the optimizer may not, in general, insert new dynamic |
| occurrences of such casts. If a new cast is inserted, the optimizer would |
| need to either ensure that a) all possible values are valid, or b) |
| appropriate fencing is inserted. Since the appropriate fencing is |
| implementation defined, the optimizer can't do the latter. The former is |
| challenging as many commonly expected properties, such as |
| ``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types. |
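| |
| As a rough sketch of the property above: the ``ni:1`` datalayout |
| component used here to mark address space 1 as non-integral comes from |
| the datalayout description rather than from this section, and the |
| function name ``@observe`` is hypothetical: |
| |
| .. code-block:: llvm |
| |
| ; Mark address space 1 as non-integral. |
| target datalayout = "ni:1" |
| |
| define i64 @observe(ptr addrspace(1) %p) { |
| ; Two identical casts of the same operand. For a non-integral |
| ; address space they may or may not return the same value. |
| %a = ptrtoint ptr addrspace(1) %p to i64 |
| %b = ptrtoint ptr addrspace(1) %p to i64 |
| %diff = sub i64 %a, %b ; not guaranteed to be 0 |
| ret i64 %diff |
| } |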
| |
| .. _globalvars: |
| |
| Global Variables |
| ---------------- |
| |
| Global variables define regions of memory allocated at compilation time |
| instead of run-time. |
| |
| Global variable definitions must be initialized. |
| |
| Global variables in other translation units can also be declared, in which |
| case they don't have an initializer. |
| |
| Global variables can optionally specify a :ref:`linkage type <linkage>`. |
| |
| Either global variable definitions or declarations may have an explicit section |
| to be placed in and may have an optional explicit alignment specified. If there |
| is a mismatch between the explicit or inferred section information for the |
| variable declaration and its definition, the resulting behavior is undefined. |
| |
| A variable may be defined as a global ``constant``, which indicates that |
| the contents of the variable will **never** be modified (enabling better |
| optimization, allowing the global data to be placed in the read-only |
| section of an executable, etc). Note that variables that need runtime |
| initialization cannot be marked ``constant`` as there is a store to the |
| variable. |
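| |
| For illustration only (the names are hypothetical), a read-only string next |
| to a piece of mutable state might look like this: |
| |
| .. code-block:: llvm |
| |
| ; Never written after initialization, so it may be placed in read-only data. |
| @.greeting = private constant [6 x i8] c"hello\00" |
| |
| ; Mutable state uses the global keyword instead of constant. |
| @counter = global i32 0 |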
| |
| LLVM explicitly allows *declarations* of global variables to be marked |
| constant, even if the final definition of the global is not. This |
| capability can be used to enable slightly better optimization of the |
| program, but requires the language definition to guarantee that |
| optimizations based on the 'constantness' are valid for the translation |
| units that do not include the definition. |
| |
| As SSA values, global variables define pointer values that are in scope for |
| (i.e. they dominate) all basic blocks in the program. Global variables |
| always define a pointer to their "content" type because they describe a |
| region of memory, and all memory objects in LLVM are accessed through |
| pointers. |
| |
| Global variables can be marked with ``unnamed_addr`` which indicates |
| that the address is not significant, only the content. Constants marked |
| like this can be merged with other constants if they have the same |
| initializer. Note that a constant with significant address *can* be |
| merged with an ``unnamed_addr`` constant, the result being a constant |
| whose address is significant. |
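| |
| A small sketch of two candidates for such merging, using hypothetical names: |
| |
| .. code-block:: llvm |
| |
| ; Same initializer and insignificant addresses: these two constants may |
| ; be merged into a single object. |
| @a = private unnamed_addr constant [4 x i8] c"abc\00" |
| @b = private unnamed_addr constant [4 x i8] c"abc\00" |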
| |
| If the ``local_unnamed_addr`` attribute is given, the address is known to |
| not be significant within the module. |
| |
| A global variable may be declared to reside in a target-specific |
| numbered address space. For targets that support them, address spaces |
| may affect how optimizations are performed and/or what target |
| instructions are used to access the variable. The default address space |
| is zero. The address space qualifier must precede any other attributes. |
| |
| LLVM allows an explicit section to be specified for globals. If the |
| target supports it, it will emit globals to the section specified. |
| Additionally, the global can be placed in a comdat if the target has the |
| necessary support. |
| |
| External declarations may have an explicit section specified. Section |
| information is retained in LLVM IR for targets that make use of this |
| information. Attaching section information to an external declaration is an |
| assertion that its definition is located in the specified section. If the |
| definition is located in a different section, the behavior is undefined. |
| |
| By default, global initializers are optimized by assuming that global |
| variables defined within the module are not modified from their |
| initial values before the start of the global initializer. This is |
| true even for variables potentially accessible from outside the |
| module, including those with external linkage or appearing in |
| ``@llvm.used`` or dllexported variables. This assumption may be suppressed |
| by marking the variable with ``externally_initialized``. |
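| |
| As a sketch with a hypothetical variable name, the marker precedes the |
| ``global`` keyword: |
| |
| .. code-block:: llvm |
| |
| ; Suppresses the assumption that @flag still holds its initial value 0 |
| ; when the global initializers begin to run. |
| @flag = externally_initialized global i32 0 |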
| |
| An explicit alignment may be specified for a global, which must be a |
| power of 2. If not present, or if the alignment is set to zero, the |
| alignment of the global is set by the target to whatever it feels |
| convenient. If an explicit alignment is specified, the global is forced |
| to have exactly that alignment. Targets and optimizers are not allowed |
| to over-align the global if the global has an assigned section. In this |
| case, the extra alignment could be observable: for example, code could |
| assume that the globals are densely packed in their section and try to |
| iterate over them as an array; alignment padding would break this |
| iteration. For TLS variables, the module flag ``MaxTLSAlign``, if present, |
| limits the alignment to the given value. Optimizers are not allowed to |
| impose a stronger alignment on these variables. The maximum alignment |
| is ``1 << 32``. |
| |
| For global variable declarations, as well as definitions that may be |
| replaced at link time (``linkonce``, ``weak``, ``extern_weak`` and ``common`` |
| linkage types), the allocation size and alignment of the definition it resolves |
| to must be greater than or equal to that of the declaration or replaceable |
| definition, otherwise the behavior is undefined. |
| |
| Globals can also have a :ref:`DLL storage class <dllstorageclass>`, |
| an optional :ref:`runtime preemption specifier <runtime_preemption_model>`, |
| optional :ref:`global attributes <glattrs>`, and |
| an optional list of attached :ref:`metadata <metadata>`. |
| |
| Variables and aliases can have a |
| :ref:`Thread Local Storage Model <tls_model>`. |
| |
| Globals cannot be or contain :ref:`Scalable vectors <t_vector>` because their |
| size is unknown at compile time. They are allowed in structs to facilitate |
| intrinsics returning multiple values. Generally, structs containing scalable |
| vectors are not considered "sized" and cannot be used in loads, stores, allocas, |
| or GEPs. The only exception to this rule is for structs that contain scalable |
| vectors of the same type (e.g. ``{<vscale x 2 x i32>, <vscale x 2 x i32>}`` |
| contains the same type while ``{<vscale x 2 x i32>, <vscale x 2 x i64>}`` |
| doesn't). These kinds of structs (we may call them homogeneous scalable vector |
| structs) are considered sized and can be used in loads, stores, allocas, but |
| not GEPs. |
| |
| Syntax:: |
| |
| @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility] |
| [DLLStorageClass] [ThreadLocal] |
| [(unnamed_addr|local_unnamed_addr)] [AddrSpace] |
| [ExternallyInitialized] |
| <global | constant> <Type> [<InitializerConstant>] |
| [, section "name"] [, partition "name"] |
| [, comdat [($name)]] [, align <Alignment>] |
| [, no_sanitize_address] [, no_sanitize_hwaddress] |
| [, sanitize_address_dyninit] [, sanitize_memtag] |
| (, !name !N)* |
| |
| For example, the following defines a global in a numbered address space |
| with an initializer, section, and alignment: |
| |
| .. code-block:: llvm |
| |
| @G = addrspace(5) constant float 1.0, section "foo", align 4 |
| |
| The following example just declares a global variable: |
| |
| .. code-block:: llvm |
| |
| @G = external global i32 |
| |
| The following example defines a thread-local global with the |
| ``initialexec`` TLS model: |
| |
| .. code-block:: llvm |
| |
| @G = thread_local(initialexec) global i32 0, align 4 |
| |
| .. _functionstructure: |
| |
| Functions |
| --------- |
| |
| LLVM function definitions consist of the "``define``" keyword, an |
| optional :ref:`linkage type <linkage>`, an optional :ref:`runtime preemption |
| specifier <runtime_preemption_model>`, an optional :ref:`visibility |
| style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, |
| an optional :ref:`calling convention <callingconv>`, |
| an optional ``unnamed_addr`` attribute, a return type, an optional |
| :ref:`parameter attribute <paramattrs>` for the return type, a function |
| name, a (possibly empty) argument list (each with optional :ref:`parameter |
| attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`, |
| an optional address space, an optional section, an optional partition, |
| an optional alignment, an optional :ref:`comdat <langref_comdats>`, |
| an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`, |
| an optional :ref:`prologue <prologuedata>`, |
| an optional :ref:`personality <personalityfn>`, |
| an optional list of attached :ref:`metadata <metadata>`, |
| an opening curly brace, a list of basic blocks, and a closing curly brace. |
| |
| Syntax:: |
| |
| define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass] |
| [cconv] [ret attrs] |
| <ResultType> @<FunctionName> ([argument list]) |
| [(unnamed_addr|local_unnamed_addr)] [AddrSpace] [fn Attrs] |
| [section "name"] [partition "name"] [comdat [($name)]] [align N] |
| [gc] [prefix Constant] [prologue Constant] [personality Constant] |
| (!name !N)* { ... } |
| |
| The argument list is a comma separated sequence of arguments where each |
| argument is of the following form: |
| |
| Syntax:: |
| |
| <type> [parameter Attrs] [name] |
| |
| LLVM function declarations consist of the "``declare``" keyword, an |
| optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style |
| <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an |
| optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr`` |
| or ``local_unnamed_addr`` attribute, an optional address space, a return type, |
| an optional :ref:`parameter attribute <paramattrs>` for the return type, a function name, a possibly |
| empty list of arguments, an optional alignment, an optional :ref:`garbage |
| collector name <gc>`, an optional :ref:`prefix <prefixdata>`, and an optional |
| :ref:`prologue <prologuedata>`. |
| |
| Syntax:: |
| |
| declare [linkage] [visibility] [DLLStorageClass] |
| [cconv] [ret attrs] |
| <ResultType> @<FunctionName> ([argument list]) |
| [(unnamed_addr|local_unnamed_addr)] [align N] [gc] |
| [prefix Constant] [prologue Constant] |
| |
| A function definition contains a list of basic blocks, forming the CFG (Control |
| Flow Graph) for the function. Each basic block may optionally start with a label |
| (giving the basic block a symbol table entry), contains a list of instructions, |
| and ends with a :ref:`terminator <terminators>` instruction (such as a branch or |
| function return). If an explicit label name is not provided, a block is assigned |
| an implicit numbered label, using the next value from the same counter as used |
| for unnamed temporaries (:ref:`see above<identifiers>`). For example, if a |
| function entry block does not have an explicit label, it will be assigned label |
| "%0", then the first unnamed temporary in that block will be "%1", etc. If a |
| numeric label is explicitly specified, it must match the numeric label that |
| would be used implicitly. |
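| |
| The numbering rule can be seen in a short sketch (the function name is |
| hypothetical). Because the argument is named, the entry block takes the first |
| value of the counter: |
| |
| .. code-block:: llvm |
| |
| define i32 @add_one(i32 %x) { |
| ; No label is written, so the entry block is implicitly %0. |
| %1 = add i32 %x, 1 ; first unnamed temporary in the block |
| ret i32 %1 |
| } |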
| |
| The first basic block in a function is special in two ways: it is |
| immediately executed on entrance to the function, and it is not allowed |
| to have predecessor basic blocks (i.e. there cannot be any branches to |
| the entry block of a function). Because the block can have no |
| predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`. |
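| |
| A common consequence is that a loop needs a header block distinct from the |
| entry block. The following sketch, with hypothetical names, places the PHI |
| node in such a header: |
| |
| .. code-block:: llvm |
| |
| define void @count(i32 %n) { |
| entry: |
| ; The entry block has no predecessors and therefore no PHI nodes. |
| br label %loop |
| loop: |
| %i = phi i32 [ 0, %entry ], [ %next, %loop ] |
| %next = add i32 %i, 1 |
| %done = icmp eq i32 %next, %n |
| br i1 %done, label %exit, label %loop |
| exit: |
| ret void |
| } |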
| |
| LLVM allows an explicit section to be specified for functions. If the |
| target supports it, it will emit functions to the section specified. |
| Additionally, the function can be placed in a COMDAT. |
| |
| An explicit alignment may be specified for a function. If not present, |
| or if the alignment is set to zero, the alignment of the function is set |
| by the target to whatever it feels convenient. If an explicit alignment |
| is specified, the function is forced to have at least that much |
| alignment. All alignments must be a power of 2. |
| |
| If the ``unnamed_addr`` attribute is given, the address is known to not |
| be significant and two identical functions can be merged. |
| |
| If the ``local_unnamed_addr`` attribute is given, the address is known to |
| not be significant within the module. |
| |
| If an explicit address space is not given, it will default to the program |
| address space from the :ref:`datalayout string<langref_datalayout>`. |
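| |
| As a sketch, a definition combining several of these optional properties |
| could be written as follows. The function name, section name, and alignment |
| value are arbitrary, and the target is assumed to support named sections. |
| |
| .. code-block:: llvm |
| |
| ; unnamed_addr follows the argument list; section precedes align. |
| define void @hot_path() unnamed_addr section ".text.hot" align 16 { |
| ret void |
| } |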
| |
| .. _langref_aliases: |
| |
| Aliases |
| ------- |
| |
| Aliases, unlike functions or variables, don't create any new data. They |
| are just a new symbol and metadata for an existing position. |
| |
| Aliases have a name and an aliasee that is either a global value or a |
| constant expression. |
| |
| Aliases may have an optional :ref:`linkage type <linkage>`, an optional |
| :ref:`runtime preemption specifier <runtime_preemption_model>`, an optional |
| :ref:`visibility style <visibility>`, an optional :ref:`DLL storage class |
| <dllstorageclass>` and an optional :ref:`tls model <tls_model>`. |
| |
| Syntax:: |
| |
| @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] [DLLStorageClass] [ThreadLocal] [(unnamed_addr|local_unnamed_addr)] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee> |
| [, partition "name"] |
| |
| The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``, |
| ``linkonce_odr``, ``weak_odr``, ``external``, ``available_externally``. Note |
| that some system linkers might not correctly handle dropping a weak symbol that |
| is aliased. |
| |
| Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as |
| the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point |
| to the same content. |
| |
| If the ``local_unnamed_addr`` attribute is given, the address is known to |
| not be significant within the module. |
| |
| Since aliases are only a second name, some restrictions apply, of which |
| some can only be checked when producing an object file: |
| |
| * The expression defining the aliasee must be computable at assembly |
| time. Since it is just a name, no relocations can be used. |
| |
| * No alias in the expression can be weak as the possibility of the |
| intermediate alias being overridden cannot be represented in an |
| object file. |
| |
| * If the alias has the ``available_externally`` linkage, the aliasee must be an |
| ``available_externally`` global value; otherwise the aliasee can be an |
| expression but no global value in the expression can be a declaration, since |
| that would require a relocation, which is not possible. |
| |
| * If either the alias or the aliasee may be replaced by a symbol outside the |
| module at link time or runtime, no optimization may replace the alias with |
| the aliasee, since the behavior may be different. The alias may be used as a |
| name guaranteed to point to the content in the current module. |
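| |
| A minimal sketch of two aliases follows. All names are hypothetical, and the |
| aliasee is spelled with the opaque ``ptr`` type used in the other examples of |
| this document. |
| |
| .. code-block:: llvm |
| |
| @storage = global i32 0 |
| ; @view is a second symbol for the address of @storage. |
| @view = alias i32, ptr @storage |
| |
| define void @impl() { |
| ret void |
| } |
| ; A weak alias to a function defined in this module. |
| @entry_point = weak alias void (), ptr @impl |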
| |
| .. _langref_ifunc: |
| |
| IFuncs |
| ------- |
| |
| IFuncs, like aliases, don't create any new data or functions. They are just a |
| new symbol that the dynamic linker resolves at runtime by calling a resolver |
| function. |
| |
| IFuncs have a name and a resolver, which is a function called by the dynamic |
| linker that returns the address of another function associated with the name. |
| |
| An IFunc may have an optional :ref:`linkage type <linkage>` and an optional |
| :ref:`visibility style <visibility>`. |
| |
| Syntax:: |
| |
| @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver> |
| [, partition "name"] |
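| |
| The following sketch shows one possible shape of an IFunc and its resolver. |
| The names are hypothetical, and the resolver type is spelled with the opaque |
| ``ptr`` type used in the other examples of this document. |
| |
| .. code-block:: llvm |
| |
| @foo = ifunc i32 (i32), ptr @foo_resolver |
| |
| ; Called by the dynamic linker; returns the implementation bound to @foo. |
| define internal ptr @foo_resolver() { |
| ret ptr @foo_impl |
| } |
| |
| define internal i32 @foo_impl(i32 %x) { |
| ret i32 %x |
| } |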
| |
| |
| .. _langref_comdats: |
| |
| Comdats |
| ------- |
| |
| Comdat IR provides access to object file COMDAT/section group functionality |
| which represents interrelated sections. |
| |
| Comdats have a name which represents the COMDAT key and a selection kind to |
| provide input on how the linker deduplicates comdats with the same key in two |
| different object files. A comdat must be included or omitted as a unit. |
| Discarding the whole comdat is allowed but discarding a subset is not. |
| |
| A global object may be a member of at most one comdat. Aliases are placed in the |
| same COMDAT that their aliasee computes to, if any. |
| |
| Syntax:: |
| |
| $<Name> = comdat SelectionKind |
| |
| For selection kinds other than ``nodeduplicate``, only one of the duplicate |
| comdats may be retained by the linker and the members of the remaining comdats |
| must be discarded. The following selection kinds are supported: |
| |
| ``any`` |
| The linker may choose any COMDAT key; the choice is arbitrary. |
| ``exactmatch`` |
| The linker may choose any COMDAT key but the sections must contain the |
| same data. |
| ``largest`` |
| The linker will choose the section containing the largest COMDAT key. |
| ``nodeduplicate`` |
| No deduplication is performed. |
| ``samesize`` |
| The linker may choose any COMDAT key but the sections must contain the |
| same amount of data. |
| |
| - XCOFF and Mach-O don't support COMDATs. |
| - COFF supports all selection kinds. Non-``nodeduplicate`` selection kinds need |
| a non-local linkage COMDAT symbol. |
| - ELF supports ``any`` and ``nodeduplicate``. |
| - WebAssembly only supports ``any``. |
| |
| Here is an example of a COFF COMDAT where a function will only be selected if |
| the COMDAT key's section is the largest: |
| |
| .. code-block:: text |
| |
| $foo = comdat largest |
| @foo = global i32 2, comdat($foo) |
| |
| define void @bar() comdat($foo) { |
| ret void |
| } |
| |
| In a COFF object file, this will create a COMDAT section with selection kind |
| ``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol |
| and another COMDAT section with selection kind |
| ``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT |
| section and contains the contents of the ``@bar`` symbol. |
| |
| As syntactic sugar, the ``$name`` can be omitted if the name is the same as |
| the global name: |
| |
| .. code-block:: llvm |
| |
| $foo = comdat any |
| @foo = global i32 2, comdat |
| @bar = global i32 3, comdat($foo) |
| |
| There are some restrictions on the properties of the global object. |
| It, or an alias to it, must have the same name as the COMDAT group when |
| targeting COFF. |
| The contents and size of this object may be used during link-time to determine |
| which COMDAT groups get selected depending on the selection kind. |
| Because the name of the object must match the name of the COMDAT group, the |
| linkage of the global object must not be local; local symbols can get renamed |
| if a collision occurs in the symbol table. |
| |
| The combined use of COMDATs and section attributes may yield surprising results. |
| For example: |
| |
| .. code-block:: llvm |
| |
| $foo = comdat any |
| $bar = comdat any |
| @g1 = global i32 42, section "sec", comdat($foo) |
| @g2 = global i32 42, section "sec", comdat($bar) |
| |
| From the object file perspective, this requires the creation of two sections |
| with the same name. This is necessary because both globals belong to different |
| COMDAT groups and COMDATs, at the object file level, are represented by |
| sections. |
| |
| Note that certain IR constructs like global variables and functions may |
| create COMDATs in the object file in addition to any which are specified using |
| COMDAT IR. This arises when the code generator is configured to emit globals |
| in individual sections (e.g. when ``-data-sections`` or ``-function-sections`` |
| is supplied to ``llc``). |
| |
| .. _namedmetadatastructure: |
| |
| Named Metadata |
| -------------- |
| |
| Named metadata is a collection of metadata. :ref:`Metadata |
| nodes <metadata>` (but not metadata strings) are the only valid |
| operands for a named metadata. |
| |
| #. Named metadata are represented as a string of characters with the |
| metadata prefix. The rules for metadata names are the same as for |
| identifiers, but quoted names are not allowed. ``"\xx"`` type escapes |
| are still valid, which allows any character to be part of a name. |
| |
| Syntax:: |
| |
| ; Some unnamed metadata nodes, which are referenced by the named metadata. |
| !0 = !{!"zero"} |
| !1 = !{!"one"} |
| !2 = !{!"two"} |
| ; A named metadata. |
| !name = !{!0, !1, !2} |
| |
| .. _paramattrs: |
| |
| Parameter Attributes |
| -------------------- |
| |
| The return type and each parameter of a function type may have a set of |
| *parameter attributes* associated with them. Parameter attributes are |
| used to communicate additional information about the result or |
| parameters of a function. Parameter attributes are considered to be part |
| of the function, not of the function type, so functions with different |
| parameter attributes can have the same function type. |
| |
| Parameter attributes are simple keywords that follow the type specified. |
| If multiple parameter attributes are needed, they are space separated. |
| For example: |
| |
| .. code-block:: llvm |
| |
| declare i32 @printf(ptr noalias nocapture, ...) |
| declare i32 @atoi(i8 zeroext) |
| declare signext i8 @returns_signed_char() |
| |
| Note that any attributes for the function result (``nonnull``, |
| ``signext``) come before the result type. |
| |
| Currently, only the following parameter attributes are defined: |
| |
| ``zeroext`` |
| This indicates to the code generator that the parameter or return |
| value should be zero-extended to the extent required by the target's |
| ABI by the caller (for a parameter) or the callee (for a return value). |
| ``signext`` |
| This indicates to the code generator that the parameter or return |
| value should be sign-extended to the extent required by the target's |
| ABI (which is usually 32 bits) by the caller (for a parameter) or |
| the callee (for a return value). |
| ``inreg`` |
| This indicates that this parameter or return value should be treated |
| in a special target-dependent fashion while emitting code for |
| a function call or return (usually, by putting it in a register as |
| opposed to memory, though some targets use it to distinguish between |
| two different kinds of registers). Use of this attribute is |
| target-specific. |
| ``byval(<ty>)`` |
| This indicates that the pointer parameter should really be passed by |
| value to the function. The attribute implies that a hidden copy of |
| the pointee is made between the caller and the callee, so the callee |
| is unable to modify the value in the caller. This attribute is only |
| valid on LLVM pointer arguments. It is generally used to pass |
| structs and arrays by value, but is also valid on pointers to |
| scalars. The copy is considered to belong to the caller not the |
| callee (for example, ``readonly`` functions should not write to |
| ``byval`` parameters). This is not a valid attribute for return |
| values. |
| |
| The byval type argument indicates the in-memory value type, and |
| must be the same as the pointee type of the argument. |
| |
| The byval attribute also supports specifying an alignment with the |
| align attribute. It indicates the alignment of the stack slot to |
| form and the known alignment of the pointer specified to the call |
| site. If the alignment is not specified, then the code generator |
| makes a target-specific assumption. |
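| |
| A sketch of a ``byval`` call, with a hypothetical struct type and function |
| names: |
| |
| .. code-block:: llvm |
| |
| %struct.Pair = type { i32, i32 } |
| |
| declare void @consume(ptr byval(%struct.Pair) align 4) |
| |
| define void @caller(ptr %p) { |
| ; A hidden copy of the pointee is passed, so @consume cannot modify |
| ; the value that %p points to in the caller. |
| call void @consume(ptr byval(%struct.Pair) align 4 %p) |
| ret void |
| } |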
| |
| .. _attr_byref: |
| |
| ``byref(<ty>)`` |
| |
| The ``byref`` argument attribute allows specifying the pointee |
| memory type of an argument. This is similar to ``byval``, but does |
| not imply a copy is made anywhere, or that the argument is passed |
| on the stack. This implies the pointer is dereferenceable up to |
| the storage size of the type. |
| |
| It is not generally permissible to introduce a write to a |
| ``byref`` pointer. The pointer may have any address space and may |
| be read only. |
| |
| This is not a valid attribute for return values. |
| |
| The alignment for a ``byref`` parameter can be explicitly |
| specified by combining it with the ``align`` attribute, similar to |
| ``byval``. If the alignment is not specified, then the code generator |
| makes a target-specific assumption. |
| |
| This is intended for representing ABI constraints, and is not |
| intended to be inferred for optimization use. |
| |
| .. _attr_preallocated: |
| |
| ``preallocated(<ty>)`` |
| This indicates that the pointer parameter should really be passed by |
| value to the function, and that the pointer parameter's pointee has |
| already been initialized before the call instruction. This attribute |
| is only valid on LLVM pointer arguments. The argument must be the value |
| returned by the appropriate |
| :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` on non |
| ``musttail`` calls, or the corresponding caller parameter in ``musttail`` |
| calls, although it is ignored during codegen. |
| |
| A non ``musttail`` function call with a ``preallocated`` attribute in |
| any parameter must have a ``"preallocated"`` operand bundle. A ``musttail`` |
| function call cannot have a ``"preallocated"`` operand bundle. |
| |
| The preallocated attribute requires a type argument, which must be |
| the same as the pointee type of the argument. |
| |
| The preallocated attribute also supports specifying an alignment with the |
| align attribute. It indicates the alignment of the stack slot to |
| form and the known alignment of the pointer specified to the call |
| site. If the alignment is not specified, then the code generator |
| makes a target-specific assumption. |
| |
| .. _attr_inalloca: |
| |
| ``inalloca(<ty>)`` |
| |
| The ``inalloca`` argument attribute allows the caller to take the |
| address of outgoing stack arguments. An ``inalloca`` argument must |
| be a pointer to stack memory produced by an ``alloca`` instruction. |
| The alloca, or argument allocation, must also be tagged with the |
| inalloca keyword. Only the last argument may have the ``inalloca`` |
| attribute, and that argument is guaranteed to be passed in memory. |
| |
| An argument allocation may be used by a call at most once because |
| the call may deallocate it. The ``inalloca`` attribute cannot be |
| used in conjunction with other attributes that affect argument |
| storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The |
| ``inalloca`` attribute also disables LLVM's implicit lowering of |
| large aggregate return values, which means that frontend authors |
| must lower them with ``sret`` pointers. |
| |
| When the call site is reached, the argument allocation must have |
| been the most recent stack allocation that is still live, or the |
| behavior is undefined. It is possible to allocate additional stack |
| space after an argument allocation and before its call site, but it |
| must be cleared off with :ref:`llvm.stackrestore |
| <int_stackrestore>`. |
| |
| The inalloca attribute requires a type argument, which must be the |
| same as the pointee type of the argument. |
| |
| See :doc:`InAlloca` for more information on how to use this |
| attribute. |
| |
| ``sret(<ty>)`` |
| This indicates that the pointer parameter specifies the address of a |
| structure that is the return value of the function in the source |
| program. This pointer must be guaranteed by the caller to be valid: |
| loads and stores to the structure may be assumed by the callee not |
| to trap and to be properly aligned. This is not a valid attribute |
| for return values. |
| |
| The sret type argument specifies the in memory type, which must be |
| the same as the pointee type of the argument. |
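| |
| As an illustration with hypothetical names, a caller typically allocates the |
| result slot and passes its address as the ``sret`` argument: |
| |
| .. code-block:: llvm |
| |
| %struct.Big = type { [16 x i64] } |
| |
| declare void @make_big(ptr sret(%struct.Big) align 8) |
| |
| define void @caller() { |
| ; The caller guarantees that the slot is valid and properly aligned. |
| %result = alloca %struct.Big, align 8 |
| call void @make_big(ptr sret(%struct.Big) align 8 %result) |
| ret void |
| } |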
| |
| .. _attr_elementtype: |
| |
| ``elementtype(<ty>)`` |
| |
| The ``elementtype`` argument attribute can be used to specify a pointer |
| element type in a way that is compatible with `opaque pointers |
| <OpaquePointers.html>`__. |
| |
| The ``elementtype`` attribute by itself does not carry any specific |
| semantics. However, certain intrinsics may require this attribute to be |
| present and assign it particular semantics. This will be documented on |
| individual intrinsics. |
| |
| The attribute may only be applied to pointer typed arguments of intrinsic |
| calls. It cannot be applied to non-intrinsic calls, and cannot be applied |
| to parameters on function declarations. For non-opaque pointers, the type |
| passed to ``elementtype`` must match the pointer element type. |
| |
| .. _attr_align: |
| |
| ``align <n>`` or ``align(<n>)`` |
| This indicates that the pointer value or vector of pointers has the |
| specified alignment. If applied to a vector of pointers, *all* pointers |
| (elements) have the specified alignment. If the pointer value does not have |
| the specified alignment, a :ref:`poison value <poisonvalues>` is returned or |
| passed instead. The ``align`` attribute should be combined with the |
| ``noundef`` attribute to ensure a pointer is aligned, or otherwise the |
| behavior is undefined. Note that ``align 1`` has no effect on non-byval, |
| non-preallocated arguments. |
| |
| Note that this attribute has additional semantics when combined with the |
| ``byval`` or ``preallocated`` attribute, which are documented there. |
| |
| .. _noalias: |
| |
| ``noalias`` |
| This indicates that memory locations accessed via pointer values |
| :ref:`based <pointeraliasing>` on the argument or return value are not also |
| accessed, during the execution of the function, via pointer values not |
| *based* on the argument or return value. This guarantee only holds for |
| memory locations that are *modified*, by any means, during the execution of |
| the function. The attribute on a return value also has additional semantics |
| described below. The caller shares the responsibility with the callee for |
| ensuring that these requirements are met. For further details, please see |
| the discussion of the NoAlias response in :ref:`alias analysis <Must, May, |
| or No>`. |
| |
| Note that this definition of ``noalias`` is intentionally similar |
| to the definition of ``restrict`` in C99 for function arguments. |
| |
| For function return values, C99's ``restrict`` is not meaningful, |
| while LLVM's ``noalias`` is. Furthermore, the semantics of the ``noalias`` |
| attribute on return values are stronger than the semantics of the attribute |
| when used on function arguments. On function return values, the ``noalias`` |
| attribute indicates that the function acts like a system memory allocation |
| function, returning a pointer to allocated storage disjoint from the |
| storage for any other object accessible to the caller. |
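| |
| Both uses can be sketched with hypothetical declarations: |
| |
| .. code-block:: llvm |
| |
| ; Acts like an allocator: the result points to storage disjoint from |
| ; any other object accessible to the caller. |
| declare noalias ptr @my_alloc(i64) |
| |
| ; Locations modified during the call and accessed through %dst are not |
| ; also accessed through pointers that are not based on %dst. |
| declare void @copy_words(ptr noalias %dst, ptr noalias %src, i64 %n) |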
| |
| .. _nocapture: |
| |
| ``nocapture`` |
| This indicates that the callee does not :ref:`capture <pointercapture>` the |
| pointer. This is not a valid attribute for return values. |
| This attribute applies only to the particular copy of the pointer passed in |
| this argument. A caller could pass two copies of the same pointer with one |
| being annotated nocapture and the other not, and the callee could validly |
| capture through the non annotated parameter. |
| |
| .. code-block:: llvm |
| |
| define void @f(ptr nocapture %a, ptr %b) { |
| ; (capture %b) |
| } |
| |
| call void @f(ptr @glb, ptr @glb) ; well-defined |
| |
| ``nofree`` |
| This indicates that the callee does not free the pointer argument. This is not |
| a valid attribute for return values. |
| |
| .. _nest: |
| |
| ``nest`` |
| This indicates that the pointer parameter can be excised using the |
| :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid |
| attribute for return values and can only be applied to one parameter. |
| |
| ``returned`` |
| This indicates that the function always returns the argument as its return |
| value. This is a hint to the optimizer and code generator used when |
| generating the caller, allowing value propagation, tail call optimization, |
| and omission of register saves and restores in some cases; it is not |
| checked or enforced when generating the callee. The parameter and the |
| function return type must be valid operands for the |
| :ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for |
| return values and can only be applied to one parameter. |
| |
| ``nonnull`` |
| This indicates that the parameter or return pointer is not null. This |
| attribute may only be applied to pointer-typed parameters. This is not |
| checked or enforced by LLVM; if the parameter or return pointer is null, a |
| :ref:`poison value <poisonvalues>` is returned or passed instead. |
| The ``nonnull`` attribute should be combined with the ``noundef`` attribute |
| to ensure a pointer is not null or otherwise the behavior is undefined. |
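| |
| The difference can be sketched with two hypothetical declarations (the names |
| ``@peek`` and ``@use`` are assumptions made for this example): |
| |
| .. code-block:: llvm |
| |
| ; With nonnull alone, a null argument is turned into a poison value. |
| declare void @peek(ptr nonnull %p) |
| |
| ; With nonnull and noundef together, passing null is undefined behavior. |
| declare void @use(ptr nonnull noundef %p) |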
| |
| ``dereferenceable(<n>)`` |
| This indicates that the parameter or return pointer is dereferenceable. This |
| attribute may only be applied to pointer typed parameters. A pointer that |
| is dereferenceable can be loaded from speculatively without a risk of |
| trapping. The number of bytes known to be dereferenceable must be provided |
| in parentheses. It is legal for the number of bytes to be less than the |
| size of the pointee type. The ``nonnull`` attribute does not imply |
| dereferenceability (consider a pointer to one element past the end of an |
| array), however ``dereferenceable(<n>)`` does imply ``nonnull`` in |
| ``addrspace(0)`` (which is the default address space), except if the |
| ``null_pointer_is_valid`` function attribute is present. |
| ``n`` should be a positive number. The pointer should be well defined, |
| otherwise it is undefined behavior. This means ``dereferenceable(<n>)`` |
| implies ``noundef``. |
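| |
| A minimal sketch of why this matters for speculation, using a hypothetical |
| function ``@get_or_zero``: because 4 bytes at ``%p`` are known to be |
| dereferenceable (and suitably aligned), an optimizer would be allowed to |
| execute the load before the branch. |
| |
| .. code-block:: llvm |
| |
| ; @get_or_zero is a hypothetical name used only for illustration. |
| define i32 @get_or_zero(ptr align 4 dereferenceable(4) %p, i1 %c) { |
| entry: |
| br i1 %c, label %then, label %done |
| then: |
| ; This load cannot trap, so it may be hoisted into %entry. |
| %v = load i32, ptr %p, align 4 |
| br label %done |
| done: |
| %r = phi i32 [ %v, %then ], [ 0, %entry ] |
| ret i32 %r |
| } |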
| |
| ``dereferenceable_or_null(<n>)`` |
| This indicates that the parameter or return value isn't both |
| non-null and non-dereferenceable (up to ``<n>`` bytes) at the same |
| time. All non-null pointers tagged with |
| ``dereferenceable_or_null(<n>)`` are ``dereferenceable(<n>)``. |
| For address space 0 ``dereferenceable_or_null(<n>)`` implies that |
| a pointer is exactly one of ``dereferenceable(<n>)`` or ``null``, |
| and in other address spaces ``dereferenceable_or_null(<n>)`` |
| implies that a pointer is at least one of ``dereferenceable(<n>)`` |
| or ``null`` (i.e. it may be both ``null`` and |
| ``dereferenceable(<n>)``). This attribute may only be applied to |
| pointer typed parameters. |
| |
| ``swiftself`` |
| This indicates that the parameter is the self/context parameter. This is not |
| a valid attribute for return values and can only be applied to one |
| parameter. |
| |
| .. _swiftasync: |
| |
| ``swiftasync`` |
| This indicates that the parameter is the asynchronous context parameter and |
| triggers the creation of a target-specific extended frame record to store |
| this pointer. This is not a valid attribute for return values and can only |
| be applied to one parameter. |
| |
| ``swifterror`` |
| This attribute is intended to model and optimize Swift error handling. It |
| can be applied to a parameter with pointer to pointer type or a |
| pointer-sized alloca. At the call site, the actual argument that corresponds |
| to a ``swifterror`` parameter has to come from a ``swifterror`` alloca or |
| the ``swifterror`` parameter of the caller. A ``swifterror`` value (either |
| the parameter or the alloca) can only be loaded from and stored to, or used as |
| a ``swifterror`` argument. This is not a valid attribute for return values |
| and can only be applied to one parameter. |
| |
| These constraints allow the calling convention to optimize access to |
| ``swifterror`` variables by associating them with a specific register at |
| call boundaries rather than placing them in memory. Since this does change |
| the calling convention, a function which uses the ``swifterror`` attribute |
| on a parameter is not ABI-compatible with one which does not. |
| |
| These constraints also allow LLVM to assume that a ``swifterror`` argument |
| does not alias any other memory visible within a function and that a |
| ``swifterror`` alloca passed as an argument does not escape. |
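| |
| The rules above can be sketched as follows. The function names are |
| hypothetical, and the example only illustrates the permitted uses of a |
| ``swifterror`` value (store, load, and passing it as a ``swifterror`` |
| argument): |
| |
| .. code-block:: llvm |
| |
| ; @may_fail and @caller are hypothetical names. |
| declare void @may_fail(ptr swifterror %err) |
| |
| define void @caller() { |
| ; The argument must come from a swifterror alloca (or swifterror parameter). |
| %err = alloca swifterror ptr |
| store ptr null, ptr %err |
| call void @may_fail(ptr swifterror %err) |
| ; Reading the error slot back is one of the few permitted uses. |
| %e = load ptr, ptr %err |
| ret void |
| } |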
| |
| ``immarg`` |
| This indicates the parameter is required to be an immediate |
| value. This must be a trivial immediate integer or floating-point |
| constant. Undef or constant expressions are not valid. This is |
| only valid on intrinsic declarations and cannot be applied to a |
| call site or arbitrary function. |
| |
| ``noundef`` |
| This attribute applies to parameters and return values. If the value |
| representation contains any undefined or poison bits, the behavior is |
| undefined. Note that this does not refer to padding introduced by the |
| type's storage representation. |
| |
| .. _nofpclass: |
| |
| ``nofpclass(<test mask>)`` |
| This attribute applies to parameters and return values with |
| floating-point and vector of floating-point types, as well as |
| arrays of such types. The test mask has the same format as the |
| second argument to :ref:`llvm.is.fpclass <llvm.is.fpclass>`, |
| and indicates which classes of floating-point values are not |
| permitted for the value. For example, a bitmask of 3 indicates |
| the parameter may not be a NaN. |
| |
| If the value is a floating-point class indicated by the |
| ``nofpclass`` test mask, a :ref:`poison value <poisonvalues>` is |
| passed or returned instead. |
| |
| .. code-block:: text |
| :caption: The following invariants hold |
| |
| @llvm.is.fpclass(nofpclass(test_mask) %x, test_mask) => false |
| @llvm.is.fpclass(nofpclass(test_mask) %x, ~test_mask) => true |
| nofpclass(all) => poison |
| .. |
| |
| In textual IR, various string names are supported for readability |
| and can be combined. For example ``nofpclass(nan pinf nzero)`` |
| evaluates to a mask of 547. |
| |
| This does not depend on the floating-point environment. For |
| example, a function parameter marked ``nofpclass(zero)`` indicates |
| no zero inputs. If this is applied to an argument in a function |
| marked with :ref:`\"denormal-fp-math\" <denormal_fp_math>` |
| indicating zero treatment of input denormals, it does not imply the |
| value cannot be a denormal value which would compare equal to 0. |
| |
| .. table:: Recognized test mask names |
| |
| +-------+----------------------+---------------+ |
| | Name | floating-point class | Bitmask value | |
| +=======+======================+===============+ |
| | nan | Any NaN | 3 | |
| +-------+----------------------+---------------+ |
| | inf | +/- infinity | 516 | |
| +-------+----------------------+---------------+ |
| | norm | +/- normal | 264 | |
| +-------+----------------------+---------------+ |
| | sub | +/- subnormal | 144 | |
| +-------+----------------------+---------------+ |
| | zero | +/- 0 | 96 | |
| +-------+----------------------+---------------+ |
| | all | All values | 1023 | |
| +-------+----------------------+---------------+ |
| | snan | Signaling NaN | 1 | |
| +-------+----------------------+---------------+ |
| | qnan | Quiet NaN | 2 | |
| +-------+----------------------+---------------+ |
| | ninf | Negative infinity | 4 | |
| +-------+----------------------+---------------+ |
| | nnorm | Negative normal | 8 | |
| +-------+----------------------+---------------+ |
| | nsub | Negative subnormal | 16 | |
| +-------+----------------------+---------------+ |
| | nzero | Negative zero | 32 | |
| +-------+----------------------+---------------+ |
| | pzero | Positive zero | 64 | |
| +-------+----------------------+---------------+ |
| | psub | Positive subnormal | 128 | |
| +-------+----------------------+---------------+ |
| | pnorm | Positive normal | 256 | |
| +-------+----------------------+---------------+ |
| | pinf | Positive infinity | 512 | |
| +-------+----------------------+---------------+ |
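| |
| As a minimal sketch, a hypothetical function ``@scale`` whose parameter must |
| be neither a NaN nor an infinity could combine two of the names above, which |
| corresponds to a mask of 3 + 516 = 519: |
| |
| .. code-block:: llvm |
| |
| ; @scale is a hypothetical name used only for illustration. |
| define float @scale(float nofpclass(nan inf) %x) { |
| %r = fmul float %x, 2.000000e+00 |
| ret float %r |
| } |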
| |
| |
| ``alignstack(<n>)`` |
| This indicates the alignment that should be considered by the backend when |
| assigning this parameter to a stack slot during calling convention |
| lowering. The enforcement of the specified alignment is target-dependent, |
| as target-specific calling convention rules may override this value. This |
| attribute serves the purpose of carrying language specific alignment |
| information that is not mapped to base types in the backend (for example, |
| over-alignment specification through language attributes). |
| |
| ``allocalign`` |
| The function parameter marked with this attribute is the alignment in bytes of the |
| newly allocated block returned by this function. The returned value must either have |
| the specified alignment or be the null pointer. The return value MAY be more aligned |
| than the requested alignment, but not less aligned. Invalid (e.g. non-power-of-2) |
| alignments are permitted for the allocalign parameter, so long as the returned pointer |
| is null. This attribute may only be applied to integer parameters. |
| |
| ``allocptr`` |
| The function parameter marked with this attribute is the pointer |
| that will be manipulated by the allocator. For a realloc-like |
| function the pointer will be invalidated upon success (but the |
| same address may be returned), for a free-like function the |
| pointer will always be invalidated. |
| |
| ``readnone`` |
| This attribute indicates that the function does not dereference this |
| pointer argument, even though it may read or write the memory that the |
| pointer points to if accessed through other pointers. |
| |
| If a function reads from or writes to a readnone pointer argument, the |
| behavior is undefined. |
| |
| ``readonly`` |
| This attribute indicates that the function does not write through this |
| pointer argument, even though it may write to the memory that the pointer |
| points to. |
| |
| If a function writes to a readonly pointer argument, the behavior is |
| undefined. |
| |
| ``writeonly`` |
| This attribute indicates that the function may write to, but does not read |
| through this pointer argument (even though it may read from the memory that |
| the pointer points to). |
| |
| If a function reads from a writeonly pointer argument, the behavior is |
| undefined. |
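| |
| For illustration only, a hypothetical memcpy-like declaration might annotate |
| its pointer parameters as follows (``@copy_bytes`` is an assumed name, not an |
| LLVM function): |
| |
| .. code-block:: llvm |
| |
| ; The destination is only written through %dst, the source only read through %src. |
| declare void @copy_bytes(ptr writeonly %dst, ptr readonly %src, i64 %n) |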
| |
| .. _gc: |
| |
| Garbage Collector Strategy Names |
| -------------------------------- |
| |
| Each function may specify a garbage collector strategy name, which is simply a |
| string: |
| |
| .. code-block:: llvm |
| |
| define void @f() gc "name" { ... } |
| |
| The supported values of *name* include those :ref:`built in to LLVM |
| <builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC |
| strategy will cause the compiler to alter its output in order to support the |
| named garbage collection algorithm. Note that LLVM itself does not contain a |
| garbage collector; this functionality is restricted to generating machine code |
| which can interoperate with a collector provided externally. |
| |
| .. _prefixdata: |
| |
| Prefix Data |
| ----------- |
| |
| Prefix data is data associated with a function which the code |
| generator will emit immediately before the function's entrypoint. |
| The purpose of this feature is to allow frontends to associate |
| language-specific runtime metadata with specific functions and make it |
| available through the function pointer while still allowing the |
| function pointer to be called. |
| |
| To access the data for a given function, a program may bitcast the |
| function pointer to a pointer to the constant's type and dereference |
| index -1. This implies that the IR symbol points just past the end of |
| the prefix data. For instance, take the example of a function annotated |
| with a single ``i32``, |
| |
| .. code-block:: llvm |
| |
| define void @f() prefix i32 123 { ... } |
| |
| The prefix data can be referenced as, |
| |
| .. code-block:: llvm |
| |
| %a = getelementptr inbounds i32, ptr @f, i32 -1 |
| %b = load i32, ptr %a |
| |
| Prefix data is laid out as if it were an initializer for a global variable |
| of the prefix data's type. The function will be placed such that the |
| beginning of the prefix data is aligned. This means that if the size |
| of the prefix data is not a multiple of the alignment size, the |
| function's entrypoint will not be aligned. If alignment of the |
| function's entrypoint is desired, padding must be added to the prefix |
| data. |
| |
| A function may have prefix data but no body. This has similar semantics |
| to the ``available_externally`` linkage in that the data may be used by the |
| optimizers but will not be emitted in the object file. |
| |
| .. _prologuedata: |
| |
| Prologue Data |
| ------------- |
| |
| The ``prologue`` attribute allows arbitrary code (encoded as bytes) to |
| be inserted prior to the function body. This can be used for enabling |
| function hot-patching and instrumentation. |
| |
| To maintain the semantics of ordinary function calls, the prologue data must |
| have a particular format. Specifically, it must begin with a sequence of |
| bytes which decode to a sequence of machine instructions, valid for the |
| module's target, which transfer control to the point immediately succeeding |
| the prologue data, without performing any other visible action. This allows |
| the inliner and other passes to reason about the semantics of the function |
| definition without needing to reason about the prologue data. Obviously this |
| makes the format of the prologue data highly target dependent. |
| |
| A trivial example of valid prologue data for the x86 architecture is ``i8 144``, |
| which encodes the ``nop`` instruction: |
| |
| .. code-block:: text |
| |
| define void @f() prologue i8 144 { ... } |
| |
| Generally prologue data can be formed by encoding a relative branch instruction |
| which skips the metadata, as in this example of valid prologue data for the |
| x86_64 architecture, where the first two bytes encode ``jmp .+10``: |
| |
| .. code-block:: text |
| |
| %0 = type <{ i8, i8, ptr }> |
| |
| define void @f() prologue %0 <{ i8 235, i8 8, ptr @md}> { ... } |
| |
| A function may have prologue data but no body. This has similar semantics |
| to the ``available_externally`` linkage in that the data may be used by the |
| optimizers but will not be emitted in the object file. |
| |
| .. _personalityfn: |
| |
| Personality Function |
| -------------------- |
| |
| The ``personality`` attribute permits functions to specify what function |
| to use for exception handling. |
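| |
| A minimal sketch, assuming the Itanium C++ ABI personality routine |
| ``__gxx_personality_v0`` and a hypothetical callee ``@may_throw``: |
| |
| .. code-block:: llvm |
| |
| declare i32 @__gxx_personality_v0(...) |
| declare void @may_throw() ; hypothetical callee |
| |
| define void @f() personality ptr @__gxx_personality_v0 { |
| entry: |
| invoke void @may_throw() to label %cont unwind label %lpad |
| cont: |
| ret void |
| lpad: |
| ; Run cleanups, then continue unwinding. |
| %lp = landingpad { ptr, i32 } cleanup |
| resume { ptr, i32 } %lp |
| } |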
| |
| .. _attrgrp: |
| |
| Attribute Groups |
| ---------------- |
| |
| Attribute groups are groups of attributes that are referenced by objects within |
| the IR. They are important for keeping ``.ll`` files readable, because a lot of |
| functions will use the same set of attributes. In the degenerate case of a |
| ``.ll`` file that corresponds to a single ``.c`` file, the single attribute |
| group will capture the important command line flags used to build that file. |
| |
| An attribute group is a module-level object. To use an attribute group, an |
| object references the attribute group's ID (e.g. ``#37``). An object may refer |
| to more than one attribute group. In that situation, the attributes from the |
| different groups are merged. |
| |
| Here is an example of attribute groups for a function that should always be |
| inlined, has a stack alignment of 4, and which shouldn't use SSE instructions: |
| |
| .. code-block:: llvm |
| |
| ; Target-independent attributes: |
| attributes #0 = { alwaysinline alignstack=4 } |
| |
| ; Target-dependent attributes: |
| attributes #1 = { "no-sse" } |
| |
| ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse". |
| define void @f() #0 #1 { ... } |
| |
| .. _fnattrs: |
| |
| Function Attributes |
| ------------------- |
| |
| Function attributes are set to communicate additional information about |
| a function. Function attributes are considered to be part of the |
| function, not of the function type, so functions with different function |
| attributes can have the same function type. |
| |
| Function attributes are simple keywords that follow the type specified. |
| If multiple attributes are needed, they are space separated. For |
| example: |
| |
| .. code-block:: llvm |
| |
| define void @f() noinline { ... } |
| define void @f() alwaysinline { ... } |
| define void @f() alwaysinline optsize { ... } |
| define void @f() optsize { ... } |
| |
| ``alignstack(<n>)`` |
| This attribute indicates that, when emitting the prologue and |
| epilogue, the backend should forcibly align the stack pointer. |
| Specify the desired alignment, which must be a power of two, in |
| parentheses. |
| ``"alloc-family"="FAMILY"`` |
| This indicates which "family" an allocator function is part of. To avoid |
| collisions, the family name should match the mangled name of the primary |
| allocator function, that is "malloc" for malloc/calloc/realloc/free, |
| "_Znwm" for ``::operator new`` and ``::operator delete``, and |
| "_ZnwmSt11align_val_t" for aligned ``::operator new`` and |
| ``::operator delete``. Matching malloc/realloc/free calls within a family |
| can be optimized, but mismatched ones will be left alone. |
| ``allockind("KIND")`` |
| Describes the behavior of an allocation function. The KIND string contains comma |
| separated entries from the following options: |
| |
| * "alloc": the function returns a new block of memory or null. |
| * "realloc": the function returns a new block of memory or null. If the |
| result is non-null the memory contents from the start of the block up to |
| the smaller of the original allocation size and the new allocation size |
| will match that of the ``allocptr`` argument and the ``allocptr`` |
| argument is invalidated, even if the function returns the same address. |
| * "free": the function frees the block of memory specified by ``allocptr``. |
| Functions marked as "free" ``allockind`` must return void. |
| * "uninitialized": Any newly-allocated memory (either a new block from |
| a "alloc" function or the enlarged capacity from a "realloc" function) |
| will be uninitialized. |
| * "zeroed": Any newly-allocated memory (either a new block from a "alloc" |
| function or the enlarged capacity from a "realloc" function) will be |
| zeroed. |
| * "aligned": the function returns memory aligned according to the |
| ``allocalign`` parameter. |
| |
| The first three options are mutually exclusive, and the remaining options |
| describe more details of how the function behaves. The remaining options |
| are invalid for "free"-type functions. |
| ``allocsize(<EltSizeParam>[, <NumEltsParam>])`` |
| This attribute indicates that the annotated function will always return at |
| least a given number of bytes (or null). Its arguments are zero-indexed |
| parameter numbers; if one argument is provided, then it's assumed that at |
| least ``CallSite.Args[EltSizeParam]`` bytes will be available at the |
| returned pointer. If two are provided, then it's assumed that |
| ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are |
| available. The referenced parameters must be integer types. No assumptions |
| are made about the contents of the returned block of memory. |
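| |
| The allocator attributes above are typically used together. As a sketch, a |
| hypothetical allocator pair (the function names and the family name |
| ``"my_alloc"`` are assumptions made for this example) might be declared as: |
| |
| .. code-block:: llvm |
| |
| ; Returns at least %size bytes (parameter index 1), aligned to %align, or null. |
| declare noalias ptr @my_aligned_alloc(i64 allocalign %align, i64 %size) allockind("alloc,uninitialized,aligned") allocsize(1) "alloc-family"="my_alloc" |
| |
| ; Frees the block passed in %p; "free" functions must return void. |
| declare void @my_free(ptr allocptr %p) allockind("free") "alloc-family"="my_alloc" |
| |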
| ``alwaysinline`` |
| This attribute indicates that the inliner should attempt to inline |
| this function into callers whenever possible, ignoring any active |
| inlining size threshold for this caller. |
| ``builtin`` |
| This indicates that the callee function at a call site should be |
| recognized as a built-in function, even though the function's declaration |
| uses the ``nobuiltin`` attribute. This is only valid at call sites for |
| direct calls to functions that are declared with the ``nobuiltin`` |
| attribute. |
| ``cold`` |
| This attribute indicates that this function is rarely called. When |
| computing edge weights, basic blocks post-dominated by a cold |
| function call are also considered to be cold; and, thus, given low |
| weight. |
| |
| .. _attr_convergent: |
| |
| ``convergent`` |
| This attribute indicates that this function is convergent. |
| When it appears on a call/invoke, the convergent attribute |
| indicates that we should treat the call as though we’re calling a |
| convergent function. This is particularly useful on indirect |
| calls; without this we may treat such calls as though the target |
| is non-convergent. |
| |
| See :doc:`ConvergentOperations` for further details. |
| |
| It is an error to call :ref:`llvm.experimental.convergence.entry |
| <llvm.experimental.convergence.entry>` from a function that |
| does not have this attribute. |
| ``disable_sanitizer_instrumentation`` |
| When instrumenting code with sanitizers, it can be important to skip certain |
| functions to ensure no instrumentation is applied to them. |
| |
| This attribute is not always similar to absent ``sanitize_<name>`` |
| attributes: depending on the specific sanitizer, code can be inserted into |
| functions regardless of the ``sanitize_<name>`` attribute to prevent false |
| positive reports. |
| |
| ``disable_sanitizer_instrumentation`` disables all kinds of instrumentation, |
| taking precedence over the ``sanitize_<name>`` attributes and other compiler |
| flags. |
| ``"dontcall-error"`` |
| This attribute denotes that an error diagnostic should be emitted when a |
| call of a function with this attribute is not eliminated via optimization. |
| Front ends can provide optional ``srcloc`` metadata nodes on call sites of |
| such callees to attach information about where in the source language such a |
| call came from. A string value can be provided as a note. |
| ``"dontcall-warn"`` |
| This attribute denotes that a warning diagnostic should be emitted when a |
| call of a function with this attribute is not eliminated via optimization. |
| Front ends can provide optional ``srcloc`` metadata nodes on call sites of |
| such callees to attach information about where in the source language such a |
| call came from. A string value can be provided as a note. |
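| |
| As a sketch, a hypothetical deprecated function could carry the note as the |
| attribute's string value (the function name and message text are assumptions): |
| |
| .. code-block:: llvm |
| |
| ; A warning is emitted for any call to this function that survives optimization. |
| declare void @deprecated_api() "dontcall-warn"="use new_api instead" |
| |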
| ``fn_ret_thunk_extern`` |
| This attribute tells the code generator that returns from functions should |
| be replaced with jumps to externally-defined architecture-specific symbols. |
| For X86, this symbol's identifier is ``__x86_return_thunk``. |
| ``"frame-pointer"`` |
| This attribute tells the code generator whether the function |
| should keep the frame pointer. The code generator may emit the frame pointer |
| even if this attribute says the frame pointer can be eliminated. |
| The allowed string values are: |
| |
| * ``"none"`` (default) - the frame pointer can be eliminated. |
| * ``"non-leaf"`` - the frame pointer should be kept if the function calls |
| other functions. |
| * ``"all"`` - the frame pointer should be kept. |
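| |
| For example, a function that should always keep its frame pointer could be |
| written as in this minimal sketch: |
| |
| .. code-block:: llvm |
| |
| define void @f() "frame-pointer"="all" { |
| ret void |
| } |
| |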
| ``hot`` |
| This attribute indicates that this function is a hot spot of the program |
| execution. The function will be optimized more aggressively and will be |
| placed into a special subsection of the text section to improve locality. |
| |
| When profile feedback is enabled, this attribute takes precedence over |
| the profile information. By marking a function ``hot``, users can work |
| around the cases where the training input does not have good coverage |
| on all the hot functions. |
| ``inlinehint`` |
| This attribute indicates that the source code contained a hint that |
| inlining this function is desirable (such as the "inline" keyword in |
| C/C++). It is just a hint; it imposes no requirements on the |
| inliner. |
| ``jumptable`` |
| This attribute indicates that the function should be added to a |
| jump-instruction table at code-generation time, and that all address-taken |
| references to this function should be replaced with a reference to the |
| appropriate jump-instruction-table function pointer. Note that this creates |
| a new pointer for the original function, which means that code that depends |
| on function-pointer identity can break. So, any function annotated with |
| ``jumptable`` must also be ``unnamed_addr``. |
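| |
| A minimal sketch of the required combination, using a hypothetical function |
| ``@handler``: |
| |
| .. code-block:: llvm |
| |
| ; jumptable is only accepted together with unnamed_addr. |
| define void @handler() unnamed_addr jumptable { |
| ret void |
| } |
| |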
| ``memory(...)`` |
| This attribute specifies the possible memory effects of the call-site or |
| function. It allows specifying the possible access kinds (``none``, |
| ``read``, ``write``, or ``readwrite``) for the possible memory location |
| kinds (``argmem``, ``inaccessiblemem``, as well as a default). It is best |
| understood by example: |
| |
| - ``memory(none)``: Does not access any memory. |
| - ``memory(read)``: May read (but not write) any memory. |
| - ``memory(write)``: May write (but not read) any memory. |
| - ``memory(readwrite)``: May read or write any memory. |
| - ``memory(argmem: read)``: May only read argument memory. |
| - ``memory(argmem: read, inaccessiblemem: write)``: May only read argument |
| memory and only write inaccessible memory. |
| - ``memory(read, argmem: readwrite)``: May read any memory (default mode) |
| and additionally write argument memory. |
| - ``memory(readwrite, argmem: none)``: May access any memory apart from |
| argument memory. |
| |
| The supported memory location kinds are: |
| |
| - ``argmem``: This refers to accesses that are based on pointer arguments |
| to the function. |
| - ``inaccessiblemem``: This refers to accesses to memory which is not |
| accessible by the current module (before return from the function -- an |
| allocator function may return newly accessible memory while only |
| accessing inaccessible memory itself). Inaccessible memory is often used |
| to model control dependencies of intrinsics. |
| - The default access kind (specified without a location prefix) applies to |
| all locations that haven't been specified explicitly, including those that |
| don't currently have a dedicated location kind (e.g. accesses to globals |
| or captured pointers). |
| |
| If the ``memory`` attribute is not specified, then ``memory(readwrite)`` |
| is implied (all memory effects are possible). |
| |
| The memory effects of a call can be computed as |
| ``CallSiteEffects & (FunctionEffects | OperandBundleEffects)``. Thus, the |
| call-site annotation takes precedence over the potential effects described |
| by either the function annotation or the operand bundles. |
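| |
| As a sketch of this intersection, consider a hypothetical function ``@fill`` |
| and a call site that carries its own ``memory`` annotation: |
| |
| .. code-block:: llvm |
| |
| ; @fill may read and write memory reachable through its pointer argument. |
| declare void @fill(ptr %p) memory(argmem: readwrite) |
| |
| define void @caller(ptr %p) { |
| ; memory(read) & memory(argmem: readwrite) = memory(argmem: read), |
| ; so this particular call may only read argument memory. |
| call void @fill(ptr %p) memory(read) |
| ret void |
| } |
| |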
| ``minsize`` |
| This attribute suggests that optimization passes and code generator |
| passes make choices that keep the code size of this function as small |
| as possible and perform optimizations that may sacrifice runtime |
| performance in order to minimize the size of the generated code. |
| ``naked`` |
| This attribute disables prologue / epilogue emission for the |
| function. This can have very system-specific consequences. |
| ``"no-inline-line-tables"`` |
| When this attribute is set to true, the inliner discards source locations |
| when inlining code and instead uses the source location of the call site. |
| Breakpoints set on code that was inlined into the current function will |
| not fire during the execution of the inlined call sites. If the debugger |
| stops inside an inlined call site, it will appear to be stopped at the |
| outermost inlined call site. |
| ``"no-jump-tables"`` |
| When this attribute is set to true, the jump tables and lookup tables that |
| can be generated from a switch case lowering are disabled. |
| ``nobuiltin`` |
| This indicates that the callee function at a call site is not recognized as |
| a built-in function. LLVM will retain the original call and not replace it |
| with equivalent code based on the semantics of the built-in function, unless |
| the call site uses the ``builtin`` attribute. This is valid at call sites |
| and on function declarations and definitions. |
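| |
| The interaction between ``nobuiltin`` and ``builtin`` can be sketched as |
| follows. Here ``@_Znwm`` is the mangled name of ``::operator new`` on common |
| 64-bit targets, and ``@make`` is a hypothetical caller: |
| |
| .. code-block:: llvm |
| |
| ; By default, calls to this declaration are not treated as built-in calls. |
| declare ptr @_Znwm(i64) nobuiltin |
| |
| define ptr @make() { |
| ; This direct call opts back in to built-in treatment. |
| %p = call ptr @_Znwm(i64 8) builtin |
| ret ptr %p |
| } |
| |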
| ``nocallback`` |
| This attribute indicates that the function is only allowed to jump back into |
| the caller's module by a return or an exception, and is not allowed to jump back |
| by invoking a callback function, a direct, possibly transitive, external |
| function call, use of ``longjmp``, or other means. It is a compiler hint that |
| is used at module level to improve dataflow analysis, dropped during linking, |
| and has no effect on functions defined in the current module. |
| ``noduplicate`` |
| This attribute indicates that calls to the function cannot be |
| duplicated. A call to a ``noduplicate`` function may be moved |
| within its parent function, but may not be duplicated within |
| its parent function. |
| |
| A function containing a ``noduplicate`` call may still |
| be an inlining candidate, provided that the call is not |
| duplicated by inlining. That implies that the function has |
| internal linkage and only has one call site, so the original |
| call is dead after inlining. |
| ``nofree`` |
| This function attribute indicates that the function does not, directly or |
| transitively, call a memory-deallocation function (``free``, for example) |
| on a memory allocation which existed before the call. |
| |
| As a result, uncaptured pointers that are known to be dereferenceable |
| prior to a call to a function with the ``nofree`` attribute are still |
| known to be dereferenceable after the call. The capturing condition is |
| necessary in environments where the function might communicate the |
| pointer to another thread which then deallocates the memory. Alternatively, |
| ``nosync`` would ensure such communication cannot happen and even captured |
| pointers cannot be freed by the function. |
| |
| A ``nofree`` function is explicitly allowed to free memory which it |
| allocated or (if not ``nosync``) arrange for another thread to free |
| memory on its behalf. As a result, perhaps surprisingly, a ``nofree`` |
| function can return a pointer to a previously deallocated memory object. |
| ``noimplicitfloat`` |
| Disallows implicit floating-point code. This inhibits optimizations that |
| use floating-point code and floating-point registers for operations that are |
| not nominally floating-point. LLVM instructions that perform floating-point |
| operations or require access to floating-point registers may still cause |
| floating-point code to be generated. |
| |
| Also inhibits optimizations that create SIMD/vector code and registers from |
| scalar code such as vectorization or memcpy/memset optimization. This |
| includes integer vectors. Vector instructions present in IR may still cause |
| vector code to be generated. |
| ``noinline`` |
| This attribute indicates that the inliner should never inline this |
| function in any situation. This attribute may not be used together |
| with the ``alwaysinline`` attribute. |
| ``nomerge`` |
| This attribute indicates that calls to this function should never be merged |
| during optimization. For example, it will prevent tail merging otherwise |
| identical code sequences that raise an exception or terminate the program. |
| Tail merging normally reduces the precision of source location information, |
| making stack traces less useful for debugging. This attribute gives the |
| user control over the tradeoff between code size and debug information |
| precision. |
| ``nonlazybind`` |
| This attribute suppresses lazy symbol binding for the function. This |
| may make calls to the function faster, at the cost of extra program |
| startup time if the function is not called during program startup. |
| ``noprofile`` |
| This function attribute prevents instrumentation based profiling, used for |
| coverage or profile based optimization, from being added to a function. It |
| also blocks inlining if the caller and callee have different values of this |
| attribute. |
| ``skipprofile`` |
| This function attribute prevents instrumentation based profiling, used for |
| coverage or profile based optimization, from being added to a function. This |
| attribute does not restrict inlining, so instrumented instructions could end |
| up in this function. |
| ``noredzone`` |
| This attribute indicates that the code generator should not use a |
| red zone, even if the target-specific ABI normally permits it. |
| ``"indirect-tls-seg-refs"`` |
| This attribute indicates that the code generator should not use |
| direct TLS access through segment registers, even if the |
| target-specific ABI normally permits it. |
| ``noreturn`` |
| This function attribute indicates that the function never returns |
| normally, i.e., through a return instruction. This produces undefined |
| behavior at runtime if the function ever does dynamically return. Annotated |
| functions may still raise an exception, i.e., ``nounwind`` is not implied. |
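| |
| As a minimal sketch, a hypothetical wrapper ``@fail`` around the C library |
| function ``abort`` never reaches a return instruction: |
| |
| .. code-block:: llvm |
| |
| declare void @abort() noreturn nounwind |
| |
| ; @fail is a hypothetical name used only for illustration. |
| define void @fail() noreturn { |
| call void @abort() |
| unreachable |
| } |
| |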
| ``norecurse`` |
| This function attribute indicates that the function does not call itself |
| either directly or indirectly down any possible call path. This produces |
| undefined behavior at runtime if the function ever does recurse. |
| |
| .. _langref_willreturn: |
| |
| ``willreturn`` |
| This function attribute indicates that a call of this function will |
| either exhibit undefined behavior or come back and continue execution |
| at a point in the existing call stack that includes the current invocation. |
| Annotated functions may still raise an exception, i.e., ``nounwind`` is not implied. |
| If an invocation of an annotated function does not return control back |
| to a point in the call stack, the behavior is undefined. |
| ``nosync`` |
| This function attribute indicates that the function does not communicate |
| (synchronize) with another thread through memory or other well-defined means. |
| Synchronization is considered possible in the presence of `atomic` accesses |
| that enforce an order (that is, accesses other than "unordered" and |
| "monotonic"), `volatile` accesses, as well as `convergent` function calls. |
| |
| Note that `convergent` operations can involve communication that is |
| considered to be not through memory and does not necessarily imply an |
| ordering between threads for the purposes of the memory model. Therefore, |
| an operation can be both `convergent` and `nosync`. |
| |
| If a `nosync` function does ever synchronize with another thread, |
| the behavior is undefined. |
| ``nounwind`` |
| This function attribute indicates that the function never raises an |
| exception. If the function does raise an exception, its runtime |
| behavior is undefined. However, functions marked nounwind may still |
| trap or generate asynchronous exceptions. Exception handling schemes |
| that are recognized by LLVM to handle asynchronous exceptions, such |
| as SEH, will still provide their implementation defined semantics. |
| ``nosanitize_bounds`` |
| This attribute indicates that bounds checking sanitizer instrumentation |
| is disabled for this function. |
| ``nosanitize_coverage`` |
| This attribute indicates that SanitizerCoverage instrumentation is disabled |
| for this function. |
| ``null_pointer_is_valid`` |
| If ``null_pointer_is_valid`` is set, then the ``null`` address |
| in address-space 0 is considered to be a valid address for memory loads and |
| stores. Any analysis or optimization should not treat dereferencing a |
| pointer to ``null`` as undefined behavior in this function. |
| Note: Comparing the address of a global variable to ``null`` may still |
| evaluate to false because of a limitation in querying this attribute inside |
| constant expressions. |
| ``optforfuzzing`` |
| This attribute indicates that this function should be optimized |
| for maximum fuzzing signal. |
| ``optnone`` |
| This function attribute indicates that most optimization passes will skip |
| this function, with the exception of interprocedural optimization passes. |
| Code generation defaults to the "fast" instruction selector. |
| This attribute cannot be used together with the ``alwaysinline`` |
| attribute; this attribute is also incompatible |
| with the ``minsize`` attribute and the ``optsize`` attribute. |
| |
| This attribute requires the ``noinline`` attribute to be specified on |
| the function as well, so the function is never inlined into any caller. |
| Only functions with the ``alwaysinline`` attribute are valid |
| candidates for inlining into the body of this function. |
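| |
| A minimal sketch of the required pairing with ``noinline``. The function |
| name ``@debug_me`` is hypothetical. |
| |
| .. code-block:: llvm |
| |
| ; optnone must be accompanied by noinline. |
| define i32 @debug_me(i32 %x) noinline optnone { |
| %r = add i32 %x, 1 |
| ret i32 %r |
| } |
| |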
| ``optsize`` |
| This attribute suggests that optimization passes and code generator |
| passes make choices that keep the code size of this function low, |
| and otherwise do optimizations specifically to reduce code size as |
| long as they do not significantly impact runtime performance. |
| ``"patchable-function"`` |
| This attribute tells the code generator that the code |
| generated for this function needs to follow certain conventions that |
| make it possible for a runtime function to patch over it later. |
| The exact effect of this attribute depends on its string value, |
| for which there currently is one legal possibility: |
| |
| * ``"prologue-short-redirect"`` - This style of patchable |
| function is intended to support patching a function prologue to |
| redirect control away from the function in a thread safe |
| manner. It guarantees that the first instruction of the |
| function will be large enough to accommodate a short jump |
| instruction, and will be sufficiently aligned to allow being |
| fully changed via an atomic compare-and-swap instruction. |
| While the first requirement can be satisfied by inserting a large |
| enough NOP, LLVM can and will try to re-purpose an existing |
| instruction (i.e. one that would have to be emitted anyway) as |
| the patchable instruction, provided it is larger than a short jump. |
| |
| ``"prologue-short-redirect"`` is currently only supported on |
| x86-64. |
| |
| This attribute by itself does not imply restrictions on |
| inter-procedural optimizations. All of the semantic effects the |
| patching may have must be separately conveyed via the linkage type. |
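| |
| As a sketch, the attribute is written as a string key/value pair on the |
| function. The function name ``@hot_patch_target`` is hypothetical. |
| |
| .. code-block:: llvm |
| |
| ; The string value selects the patching style described above. |
| define void @hot_patch_target() "patchable-function"="prologue-short-redirect" { |
| ret void |
| } |
| |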
| ``"probe-stack"`` |
| This attribute indicates that the function will trigger a guard region |
| at the end of the stack. It ensures that each access to the stack is |
| no further than the size of the guard region from a previous |
| access of the stack. It takes one required string value, the name of |
| the stack probing function that will be called. |
| |
| If a function that has a ``"probe-stack"`` attribute is inlined into |
| a function with another ``"probe-stack"`` attribute, the resulting |
| function has the ``"probe-stack"`` attribute of the caller. If a |
| function that has a ``"probe-stack"`` attribute is inlined into a |
| function that has no ``"probe-stack"`` attribute at all, the resulting |
| function has the ``"probe-stack"`` attribute of the callee. |
| ``"stack-probe-size"`` |
| This attribute controls the behavior of stack probes: either |
| the ``"probe-stack"`` attribute, or ABI-required stack probes, if any. |
| It defines the size of the guard region. It ensures that if the function |
| may use more stack space than the size of the guard region, a stack probing |
| sequence will be emitted. It takes one required integer value, which |
| is 4096 by default. |
| |
| If a function that has a ``"stack-probe-size"`` attribute is inlined into |
| a function with another ``"stack-probe-size"`` attribute, the resulting |
| function has the ``"stack-probe-size"`` attribute that has the lower |
| numeric value. If a function that has a ``"stack-probe-size"`` attribute is |
| inlined into a function that has no ``"stack-probe-size"`` attribute |
| at all, the resulting function has the ``"stack-probe-size"`` attribute |
| of the callee. |
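| |
| A minimal sketch combining ``"probe-stack"`` and ``"stack-probe-size"``. |
| The probe function name ``__probestack``, the function name ``@big_frame``, |
| and the size ``8192`` are assumptions made only for this example. |
| |
| .. code-block:: llvm |
| |
| ; Assumed probe function name and guard size; both are given as strings. |
| define void @big_frame() "probe-stack"="__probestack" "stack-probe-size"="8192" { |
| %buf = alloca [16384 x i8] ; frame larger than the assumed guard size |
| store volatile i8 0, ptr %buf |
| ret void |
| } |
| |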
| ``"no-stack-arg-probe"`` |
| This attribute disables ABI-required stack probes, if any. |
| ``returns_twice`` |
| This attribute indicates that this function can return twice. The C |
| ``setjmp`` is an example of such a function. The compiler disables |
| some optimizations (like tail calls) in the caller of these |
| functions. |
| ``safestack`` |
| This attribute indicates that |
| `SafeStack <https://clang.llvm.org/docs/SafeStack.html>`_ |
| protection is enabled for this function. |
| |
| If a function that has a ``safestack`` attribute is inlined into a |
| function that doesn't have a ``safestack`` attribute or which has an |
| ``ssp``, ``sspstrong`` or ``sspreq`` attribute, then the resulting |
| function will have a ``safestack`` attribute. |
| ``sanitize_address`` |
| This attribute indicates that AddressSanitizer checks |
| (dynamic address safety analysis) are enabled for this function. |
| ``sanitize_memory`` |
| This attribute indicates that MemorySanitizer checks (dynamic detection |
| of accesses to uninitialized memory) are enabled for this function. |
| ``sanitize_thread`` |
| This attribute indicates that ThreadSanitizer checks |
| (dynamic thread safety analysis) are enabled for this function. |
| ``sanitize_hwaddress`` |
| This attribute indicates that HWAddressSanitizer checks |
| (dynamic address safety analysis based on tagged pointers) are enabled for |
| this function. |
| ``sanitize_memtag`` |
| This attribute indicates that MemTagSanitizer checks |
| (dynamic address safety analysis based on Armv8 MTE) are enabled for |
| this function. |
| ``speculative_load_hardening`` |
| This attribute indicates that |
| `Speculative Load Hardening <https://llvm.org/docs/SpeculativeLoadHardening.html>`_ |
| should be enabled for the function body. |
| |
| Speculative Load Hardening is a best-effort mitigation against |
| information leak attacks that make use of control flow |
| misspeculation, specifically misspeculation of whether a branch |
| is taken or not. Typically vulnerabilities enabling such attacks are |
| classified as "Spectre variant #1". Notably, this does not attempt to |
| mitigate against misspeculation of branch target, classified as |
| "Spectre variant #2" vulnerabilities. |
| |
| When inlining, the attribute is sticky. Inlining a function that carries |
| this attribute will cause the caller to gain the attribute. This is intended |
| to provide a maximally conservative model where the code in a function |
| annotated with this attribute will always (even after inlining) end up |
| hardened. |
| ``speculatable`` |
| This function attribute indicates that the function does not have any |
| effects besides calculating its result and does not have undefined behavior. |
| Note that ``speculatable`` is not enough to conclude that along any |
| particular execution path the number of calls to this function will not be |
| externally observable. This attribute is only valid on functions |
| and declarations, not on individual call sites. If a function is |
| incorrectly marked as speculatable and really does exhibit |
| undefined behavior, the undefined behavior may be observed even |
| if the call site is dead code. |
| |
| ``ssp`` |
| This attribute indicates that the function should emit a stack |
| smashing protector. It is in the form of a "canary" --- a random value |
| placed on the stack before the local variables that's checked upon |
| return from the function to see if it has been overwritten. A |
| heuristic is used to determine if a function needs stack protectors |
| or not. The heuristic used will enable protectors for functions with: |
| |
| - Character arrays larger than ``ssp-buffer-size`` (default 8). |
| - Aggregates containing character arrays larger than ``ssp-buffer-size``. |
| - Calls to alloca() with variable sizes or constant sizes greater than |
| ``ssp-buffer-size``. |
| |
| Variables that are identified as requiring a protector will be arranged |
| on the stack such that they are adjacent to the stack protector guard. |
| |
| If a function with an ``ssp`` attribute is inlined into a calling function, |
| the attribute is not carried over to the calling function. |
| |
| ``sspstrong`` |
| This attribute indicates that the function should emit a stack smashing |
| protector. This attribute causes a strong heuristic to be used when |
| determining if a function needs stack protectors. The strong heuristic |
| will enable protectors for functions with: |
| |
| - Arrays of any size and type |
| - Aggregates containing an array of any size and type. |
| - Calls to alloca(). |
| - Local variables that have had their address taken. |
| |
| Variables that are identified as requiring a protector will be arranged |
| on the stack such that they are adjacent to the stack protector guard. |
| The specific layout rules are: |
| |
| #. Large arrays and structures containing large arrays |
| (``>= ssp-buffer-size``) are closest to the stack protector. |
| #. Small arrays and structures containing small arrays |
| (``< ssp-buffer-size``) are 2nd closest to the protector. |
| #. Variables that have had their address taken are 3rd closest to the |
| protector. |
| |
| This overrides the ``ssp`` function attribute. |
| |
| If a function with an ``sspstrong`` attribute is inlined into a calling |
| function which has an ``ssp`` attribute, the calling function's attribute |
| will be upgraded to ``sspstrong``. |
| |
| ``sspreq`` |
| This attribute indicates that the function should *always* emit a stack |
| smashing protector. This overrides the ``ssp`` and ``sspstrong`` function |
| attributes. |
| |
| Variables that are identified as requiring a protector will be arranged |
| on the stack such that they are adjacent to the stack protector guard. |
| The specific layout rules are: |
| |
| #. Large arrays and structures containing large arrays |
| (``>= ssp-buffer-size``) are closest to the stack protector. |
| #. Small arrays and structures containing small arrays |
| (``< ssp-buffer-size``) are 2nd closest to the protector. |
| #. Variables that have had their address taken are 3rd closest to the |
| protector. |
| |
| If a function with an ``sspreq`` attribute is inlined into a calling |
| function which has an ``ssp`` or ``sspstrong`` attribute, the calling |
| function's attribute will be upgraded to ``sspreq``. |
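| |
| The inlining rule above can be sketched as follows. All function names |
| here are hypothetical. |
| |
| .. code-block:: llvm |
| |
| declare void @use(ptr) |
| |
| ; Hypothetical callee that always requests a stack protector. |
| define void @callee() sspreq { |
| %buf = alloca [4 x i8] |
| call void @use(ptr %buf) |
| ret void |
| } |
| |
| ; The caller starts with the weaker ssp attribute. If @callee is inlined |
| ; here, the attribute on @caller is upgraded to sspreq. |
| define void @caller() ssp { |
| call void @callee() |
| ret void |
| } |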
| |
| ``strictfp`` |
| This attribute indicates that the function was called from a scope that |
| requires strict floating-point semantics. LLVM will not attempt any |
| optimizations that require assumptions about the floating-point rounding |
| mode or that might alter the state of floating-point status flags that |
| might otherwise be set or cleared by calling this function. LLVM will |
| not introduce any new floating-point instructions that may trap. |
| |
| .. _denormal_fp_math: |
| |
| ``"denormal-fp-math"`` |
| This indicates the denormal (subnormal) handling that may be |
| assumed for the default floating-point environment. This is a |
| comma separated pair. The elements may be one of ``"ieee"``, |
| ``"preserve-sign"``, ``"positive-zero"``, or ``"dynamic"``. The |
| first entry indicates the flushing mode for the result of floating |
| point operations. The second indicates the handling of denormal inputs |
| to floating point instructions. For compatibility with older |
| bitcode, if the second value is omitted, both input and output |
| modes will assume the same mode. |
| |
| If this attribute is not specified, the default is ``"ieee,ieee"``. |
| |
| If the output mode is ``"preserve-sign"``, or ``"positive-zero"``, |
| denormal outputs may be flushed to zero by standard floating-point |
| operations. It is not mandated that flushing to zero occurs, but if |
| a denormal output is flushed to zero, it must respect the sign |
| mode. Not all targets support all modes. |
| |
| If the mode is ``"dynamic"``, the behavior is derived from the |
| dynamic state of the floating-point environment. Transformations |
| which depend on the behavior of denormal values should not be |
| performed. |
| |
| While this indicates the expected floating point mode the function |
| will be executed with, this does not make any attempt to ensure |
| the mode is consistent. User or platform code is expected to set |
| the floating point mode appropriately before function entry. |
| |
| If the input mode is ``"preserve-sign"``, or ``"positive-zero"``, |
| a floating-point operation must treat any input denormal value as |
| zero. In some situations, if an instruction does not respect this |
| mode, the input may need to be converted to 0 as if by |
| ``@llvm.canonicalize`` during lowering for correctness. |
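| |
| A minimal sketch of the comma separated pair, with the output mode first |
| and the input mode second. The function name ``@halve`` is hypothetical. |
| |
| .. code-block:: llvm |
| |
| ; Output denormals may be flushed to sign-preserving zero; |
| ; input denormals are handled with IEEE semantics. |
| define float @halve(float %x) "denormal-fp-math"="preserve-sign,ieee" { |
| %r = fmul float %x, 5.000000e-01 |
| ret float %r |
| } |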
| |
| ``"denormal-fp-math-f32"`` |
| Same as ``"denormal-fp-math"``, but only controls the behavior of |
| the 32-bit float type (or vectors of 32-bit floats). If both |
| are present, this overrides ``"denormal-fp-math"``. Not all targets |
| support separately setting the denormal mode per type, and no |
| attempt is made to diagnose unsupported uses. Currently this |
| attribute is respected by the AMDGPU and NVPTX backends. |
| |
| ``"thunk"`` |
| This attribute indicates that the function will delegate to some other |
| function with a tail call. The prototype of a thunk should not be used for |
| optimization purposes. The caller is expected to cast the thunk prototype to |
| match the thunk target prototype. |
| |
| ``"tls-load-hoist"`` |
| This attribute indicates that the function will try to reduce redundant |
| TLS address calculation by hoisting TLS variables. |
| |
| ``uwtable[(sync|async)]`` |
| This attribute indicates that the ABI being targeted requires that |
| an unwind table entry be produced for this function even if we can |
| show that no exceptions pass through it. This is normally the case for |
| the ELF x86-64 ABI, but it can be disabled for some compilation |
| units. The optional parameter describes what kind of unwind tables |
| to generate: ``sync`` for normal unwind tables, ``async`` for asynchronous |
| (instruction precise) unwind tables. Without the parameter, the attribute |
| ``uwtable`` is equivalent to ``uwtable(async)``. |
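| |
| A short sketch of the two spellings. The function names are hypothetical. |
| |
| .. code-block:: llvm |
| |
| ; Synchronous unwind tables, requested even though the function is nounwind. |
| define void @f() nounwind uwtable(sync) { |
| ret void |
| } |
| |
| ; No parameter: equivalent to uwtable(async). |
| define void @g() nounwind uwtable { |
| ret void |
| } |
| |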
| ``nocf_check`` |
| This attribute indicates that no control-flow check will be performed on |
| the attributed entity. It disables ``-fcf-protection=<>`` for a specific |
| entity, allowing fine-grained control over the hardware control-flow |
| protection mechanism. The flag is target independent and currently |
| applies to a function or function pointer. |
| ``shadowcallstack`` |
| This attribute indicates that the ShadowCallStack checks are enabled for |
| the function. The instrumentation checks that the return address for the |
| function has not changed between the function prolog and epilog. It is |
| currently x86_64-specific. |
| |
| .. _langref_mustprogress: |
| |
| ``mustprogress`` |
| This attribute indicates that the function is required to return, unwind, |
| or interact with the environment in an observable way e.g. via a volatile |
| memory access, I/O, or other synchronization. The ``mustprogress`` |
| attribute is intended to model the requirements of the first section of |
| [intro.progress] of the C++ Standard. As a consequence, a loop in a |
| function with the `mustprogress` attribute can be assumed to terminate if |
| it does not interact with the environment in an observable way, and |
| terminating loops without side-effects can be removed. If a `mustprogress` |
| function does not satisfy this contract, the behavior is undefined. This |
| attribute does not apply transitively to callees, but does apply to call |
| sites within the function. Note that `willreturn` implies `mustprogress`. |
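| |
| As a sketch of the consequence for loops, consider the hypothetical |
| function below. The loop has no observable effects and might never exit |
| (for example when ``%n`` is odd), yet under ``mustprogress`` it can be |
| assumed to terminate and may therefore be removed. |
| |
| .. code-block:: llvm |
| |
| define void @spin(i32 %n) mustprogress { |
| entry: |
| br label %loop |
| loop: |
| %i = phi i32 [ 0, %entry ], [ %i.next, %loop ] |
| %i.next = add i32 %i, 2 |
| %done = icmp eq i32 %i.next, %n |
| br i1 %done, label %exit, label %loop ; no side effects inside the loop |
| exit: |
| ret void |
| } |
| |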
| ``"warn-stack-size"="<threshold>"`` |
| This attribute sets a threshold to emit diagnostics once the frame size is |
| known should the frame size exceed the specified value. It takes one |
| required integer value, which should be a non-negative integer, and less |
| than `UINT_MAX`. It's unspecified which threshold will be used when |
| duplicate definitions are linked together with differing values. |
| ``vscale_range(<min>[, <max>])`` |
| This function attribute indicates `vscale` is a power-of-two within a |
| specified range. `min` must be a power-of-two that is greater than 0. When |
| specified, `max` must be a power-of-two greater-than-or-equal to `min` or 0 |
| to signify an unbounded maximum. The syntax `vscale_range(<val>)` can be |
| used to set both `min` and `max` to the same value. Functions that don't |
| include this attribute make no assumptions about the value of `vscale`. |
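| |
| A minimal sketch of both syntaxes. The function names and the chosen |
| bounds are hypothetical. |
| |
| .. code-block:: llvm |
| |
| ; vscale is assumed to be a power of two between 1 and 16. |
| define <vscale x 4 x i32> @vadd(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) vscale_range(1,16) { |
| %r = add <vscale x 4 x i32> %a, %b |
| ret <vscale x 4 x i32> %r |
| } |
| |
| ; Single-value form: min and max are both 2. |
| define <vscale x 4 x i32> @vsub(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) vscale_range(2) { |
| %r = sub <vscale x 4 x i32> %a, %b |
| ret <vscale x 4 x i32> %r |
| } |
| |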
| ``"nooutline"`` |
| This attribute indicates that outlining passes should not modify the |
| function. |
| |
| Call Site Attributes |
| -------------------- |
| |
| In addition to function attributes, the following call-site-only |
| attributes are supported: |
| |
| ``vector-function-abi-variant`` |
| This attribute can be attached to a :ref:`call <i_call>` to list |
| the vector functions associated with the function. Notice that the |
| attribute cannot be attached to an :ref:`invoke <i_invoke>` or a |
| :ref:`callbr <i_callbr>` instruction. The attribute consists of a |
| comma separated list of mangled names. The order of the list does |
| not imply preference (it is logically a set). The compiler is free |
| to pick any listed vector function of its choosing. |
| |
| The syntax for the mangled names is as follows:: |
| |
| _ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)] |
| |
| When present, the attribute informs the compiler that the function |
| ``<scalar_name>`` has a corresponding vector variant that can be |
| used to perform the concurrent invocation of ``<scalar_name>`` on |
| vectors. The shape of the vector function is described by the |
| tokens between the prefix ``_ZGV`` and the ``<scalar_name>`` |
| token. The standard name of the vector function is |
| ``_ZGV<isa><mask><vlen><parameters>_<scalar_name>``. When present, |
| the optional token ``(<vector_redirection>)`` informs the compiler |
| that a custom name is provided in addition to the standard one |
| (custom names can be provided for example via the use of ``declare |
| variant`` in OpenMP 5.0). The declaration of the variant must be |
| present in the IR Module. The signature of the vector variant is |
| determined by the rules of the Vector Function ABI (VFABI) |
| specifications of the target. For Arm and X86, the VFABI can be |
| found at https://github.com/ARM-software/abi-aa and |
| https://software.intel.com/content/www/us/en/develop/download/vector-simd-function-abi.html, |
| respectively. |
| |
| For X86 and Arm targets, the values of the tokens in the standard |
| name are those that are defined in the VFABI. LLVM has an internal |
| ``<isa>`` token that can be used to create scalar-to-vector |
| mappings for functions that are not directly associated to any of |
| the target ISAs (for example, some of the mappings stored in the |
| TargetLibraryInfo). Valid values for the ``<isa>`` token are:: |
| |
| <isa>:= b | c | d | e -> X86 SSE, AVX, AVX2, AVX512 |
| | n | s -> Armv8 Advanced SIMD, SVE |
| | __LLVM__ -> Internal LLVM Vector ISA |
| |
| For all targets currently supported (x86, Arm and Internal LLVM), |
| the remaining tokens can have the following values:: |
| |
| <mask>:= M | N -> mask | no mask |
| |
| <vlen>:= number -> number of lanes |
| | x -> VLA (Vector Length Agnostic) |
| |
| <parameters>:= v -> vector |
| | l | l <number> -> linear |
| | R | R <number> -> linear with ref modifier |
| | L | L <number> -> linear with val modifier |
| | U | U <number> -> linear with uval modifier |
| | ls <pos> -> runtime linear |
| | Rs <pos> -> runtime linear with ref modifier |
| | Ls <pos> -> runtime linear with val modifier |
| | Us <pos> -> runtime linear with uval modifier |
| | u -> uniform |
| |
| <scalar_name>:= name of the scalar function |
| |
| <vector_redirection>:= optional, custom name of the vector function |
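| |
| Putting the pieces together, here is a minimal sketch of a call carrying the |
| attribute. The scalar function ``@foo`` and the custom vector name |
| ``@vector_foo`` are hypothetical. The mangled name selects Armv8 Advanced |
| SIMD (``n``), no mask (``N``), two lanes, and one vector parameter (``v``). |
| |
| .. code-block:: llvm |
| |
| declare double @foo(double) |
| ; The variant named in the redirection must be declared in the module. |
| declare <2 x double> @vector_foo(<2 x double>) |
| |
| define double @caller(double %x) { |
| %r = call double @foo(double %x) #0 |
| ret double %r |
| } |
| |
| attributes #0 = { "vector-function-abi-variant"="_ZGVnN2v_foo(vector_foo)" } |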
| |
| ``preallocated(<ty>)`` |
| This attribute is required on calls to ``llvm.call.preallocated.arg`` |
| and cannot be used on any other call. See |
| :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` for more |
| details. |
| |
| .. _glattrs: |
| |
| Global Attributes |
| ----------------- |
| |
| Attributes may be set to communicate additional information about a global variable. |
| Unlike :ref:`function attributes <fnattrs>`, attributes on a global variable |
| are grouped into a single :ref:`attribute group <attrgrp>`. |
| |
| ``no_sanitize_address`` |
| This attribute indicates that the global variable should not have |
| AddressSanitizer instrumentation applied to it, because it was annotated |
| with `__attribute__((no_sanitize("address")))`, |
| `__attribute__((disable_sanitizer_instrumentation))`, or included in the |
| `-fsanitize-ignorelist` file. |
| ``no_sanitize_hwaddress`` |
| This attribute indicates that the global variable should not have |
| HWAddressSanitizer instrumentation applied to it, because it was annotated |
| with `__attribute__((no_sanitize("hwaddress")))`, |
| `__attribute__((disable_sanitizer_instrumentation))`, or included in the |
| `-fsanitize-ignorelist` file. |
| ``sanitize_memtag`` |
| This attribute indicates that the global variable should have AArch64 memory |
| tags (MTE) instrumentation applied to it. This attribute causes the |
| suppression of certain optimisations, like GlobalMerge, as well as ensuring |
| extra directives are emitted in the assembly and extra bits of metadata are |
| placed in the object file so that the linker can ensure the accesses are |
| protected by MTE. This attribute is added by clang when |
| `-fsanitize=memtag-globals` is provided, as long as the global is not marked |
| with `__attribute__((no_sanitize("memtag")))`, |
| `__attribute__((disable_sanitizer_instrumentation))`, or included in the |
| `-fsanitize-ignorelist` file. The AArch64 Globals Tagging pass may remove |
| this attribute when it's not possible to tag the global (e.g. it's a TLS |
| variable). |
| ``sanitize_address_dyninit`` |
| This attribute indicates that the global variable, when instrumented with |
| AddressSanitizer, should be checked for ODR violations. This attribute is |
| applied to global variables that are dynamically initialized according to |
| C++ rules. |
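| |
| A minimal sketch of how these attributes might be written on global |
| variable definitions. The global names are hypothetical. |
| |
| .. code-block:: llvm |
| |
| ; Excluded from AddressSanitizer instrumentation. |
| @raw_counter = global i32 0, no_sanitize_address |
| |
| ; Dynamically initialized global, checked under AddressSanitizer. |
| @table = global [4 x i32] zeroinitializer, sanitize_address_dyninit |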
| |
| .. _opbundles: |
| |
| Operand Bundles |
| --------------- |
| |
| Operand bundles are tagged sets of SSA values that can be associated |
| with certain LLVM instructions (currently only ``call`` s and |
| ``invoke`` s). In a way they are like metadata, but dropping them is |
| incorrect and will change program semantics. |
| |
| Syntax:: |
| |
| operand bundle set ::= '[' operand bundle (, operand bundle )* ']' |
| operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')' |
| bundle operand ::= SSA value |
| tag ::= string constant |
| |
| Operand bundles are **not** part of a function's signature, and a |
| given function may be called from multiple places with different kinds |
| of operand bundles. This reflects the fact that the operand bundles |
| are conceptually a part of the ``call`` (or ``invoke``), not the |
| callee being dispatched to. |
| |
| Operand bundles are a generic mechanism intended to support |
| runtime-introspection-like functionality for managed languages. While |
| the exact semantics of an operand bundle depend on the bundle tag, |
| there are certain limitations to how much the presence of an operand |
| bundle can influence the semantics of a program. These restrictions |
| are described as the semantics of an "unknown" operand bundle. As |
| long as the behavior of an operand bundle is describable within these |
| restrictions, LLVM does not need to have special knowledge of the |
| operand bundle to not miscompile programs containing it. |
| |
| - The bundle operands for an unknown operand bundle escape in unknown |
| ways before control is transferred to the callee or invokee. |
| - Calls and invokes with operand bundles have unknown read / write |
| effect on the heap on entry and exit (even if the call target specifies |
| a ``memory`` attribute), unless they're overridden with |
| callsite specific attributes. |
| - An operand bundle at a call site cannot change the implementation |
| of the called function. Inter-procedural optimizations work as |
| usual as long as they take into account the first two properties. |
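| |
| To illustrate the syntax and the "unknown" bundle semantics, here is a |
| minimal sketch. The tags ``"example-tag"`` and ``"other-tag"`` and the |
| function names are hypothetical and carry no meaning known to LLVM. |
| |
| .. code-block:: llvm |
| |
| declare void @callee(i32) |
| |
| define void @caller(ptr %p, i32 %v) { |
| ; %p is treated as escaping in unknown ways before control reaches @callee. |
| call void @callee(i32 %v) [ "example-tag"(ptr %p, i32 42), "other-tag"() ] |
| ret void |
| } |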
| |
| More specific types of operand bundles are described below. |
| |
| .. _deopt_opbundles: |
| |
| Deoptimization Operand Bundles |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| Deoptimization operand bundles are characterized by the ``"deopt"`` |
| operand bundle tag. These operand bundles represent an alternate |
| "safe" continuation for the call site they're attached to, and can be |
| used by a suitable runtime to deoptimize the compiled frame at the |
| specified call site. There can be at most one ``"deopt"`` operand |
| bundle attached to a call site. Exact details of deoptimization are |
| out of scope for the language reference, but it usually involves |
| rewriting a compiled frame into a set of interpreted frames. |
| |
| From the compiler's perspective, deoptimization operand bundles make |
| the call sites they're attached to at least ``readonly``. They read |
| through all of their pointer typed operands (even if they're not |
| otherwise escaped) and the entire visible heap. Deoptimization |
| operand bundles do not capture their operands except during |
| deoptimization, in which case control will not be returned to the |
| compiled frame. |
| |
| The inliner knows how to inline through calls that have deoptimization |
| operand bundles. Just like inlining through a normal call site |
| involves composing the normal and exceptional continuations, inlining |
| through a call site with a deoptimization operand bundle needs to |
| appropriately compose the "safe" deoptimization continuation. The |
| inliner does this by prepending the parent's deoptimization |
| continuation to every deoptimization continuation in the inlined body. |
| E.g. inlining ``@f`` into ``@g`` in the following example |
| |
| .. code-block:: llvm |
| |
| define void @f() { |
| call void @x() ;; no deopt state |
| call void @y() [ "deopt"(i32 10) ] |
| call void @y() [ "deopt"(i32 10), "unknown"(ptr null) ] |
| ret void |
| } |
| |
| define void @g() { |
| call void @f() [ "deopt"(i32 20) ] |
| ret void |
| } |
| |
| will result in |
| |
| .. code-block:: llvm |
| |
| define void @g() { |
| call void @x() ;; still no deopt state |
| call void @y() [ "deopt"(i32 20, i32 10) ] |
| call void @y() [ "deopt"(i32 20, i32 10), "unknown"(ptr null) ] |
| ret void |
| } |
| |
| It is the frontend's responsibility to structure or encode the |
| deoptimization state in a way that syntactically prepending the |
| caller's deoptimization state to the callee's deoptimization state is |
| semantically equivalent to composing the caller's deoptimization |
| continuation after the callee's deoptimization continuation. |
| |
| .. _ob_funclet: |
| |
| Funclet Operand Bundles |
| ^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| Funclet operand bundles are characterized by the ``"funclet"`` |
| operand bundle tag. These operand bundles indicate that a call site |
| is within a particular funclet. There can be at most one |
| ``"funclet"`` operand bundle attached to a call site and it must have |
| exactly one bundle operand. |
| |
| If any funclet EH pads have been "entered" but not "exited" (per the |
| `description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_), |
| it is undefined behavior to execute a ``call`` or ``invoke`` which: |
| |
| * does not have a ``"funclet"`` bundle and is not a ``call`` to a nounwind |
| intrinsic, or |
| * has a ``"funclet"`` bundle whose operand is not the most-recently-entered |
| not-yet-exited funclet EH pad. |
| |
| Similarly, if no funclet EH pads have been entered-but-not-yet-exited, |
| executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior. |
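| |
| A minimal sketch of a call inside a cleanup funclet. The function names |
| ``@may_throw`` and ``@cleanup_fn`` are hypothetical, and the choice of |
| ``@__CxxFrameHandler3`` as the personality is an assumption made for this |
| example. |
| |
| .. code-block:: llvm |
| |
| declare void @may_throw() |
| declare void @cleanup_fn() |
| declare i32 @__CxxFrameHandler3(...) |
| |
| define void @f() personality ptr @__CxxFrameHandler3 { |
| entry: |
| invoke void @may_throw() to label %done unwind label %cleanup |
| cleanup: |
| %pad = cleanuppad within none [] |
| ; The bundle operand names the most recently entered, not yet exited pad. |
| call void @cleanup_fn() [ "funclet"(token %pad) ] |
| cleanupret from %pad unwind to caller |
| done: |
| ret void |
| } |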
| |
| GC Transition Operand Bundles |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| GC transition operand bundles are characterized by the |
| ``"gc-transition"`` operand bundle tag. These operand bundles mark a |
| call as a transition between a function with one GC strategy to a |
| function with a different GC strategy. If coordinating the transition |
| between GC strategies requires additional code generation at the call |
| site, these bundles may contain any values that are needed by the |
| generated code. For more details, see :ref:`GC Transitions |
| <gc_transition_args>`. |
| |
| The bundle contains an arbitrary list of Values that need to be passed |
| to GC transition code. They will be lowered and passed as operands to |
| the appropriate GC_TRANSITION nodes in the selection DAG. It is assumed |
| that these arguments must be available before and after (but not |
| necessarily during) the execution of the callee. |
| |
| .. _assume_opbundles: |
| |
| Assume Operand Bundles |
| ^^^^^^^^^^^^^^^^^^^^^^ |
| |
| Operand bundles on an :ref:`llvm.assume <int_assume>` allow representing |
| assumptions, such as that a :ref:`parameter attribute <paramattrs>` or a |
| :ref:`function attribute <fnattrs>` holds for a certain value at a certain |
| location. Operand bundles enable assumptions that are either hard or impossible |
| to represent as a boolean argument of an :ref:`llvm.assume <int_assume>`. |
| |
| An assume operand bundle has the form: |
| |
| :: |
| |
| "<tag>"([ <arguments> ]) |
| |
| In the case of function or parameter attributes, the operand bundle has the |
| restricted form: |
| |
| :: |
| |
| "<tag>"([ <holds for value> [, <attribute argument>] ]) |
| |
| * The tag of the operand bundle is usually the name of the attribute that can be |
| assumed to hold. It can also be ``ignore``; this tag doesn't contain any |
| information and should be ignored. |
| * The first argument, if present, is the value for which the attribute holds. |
| * The second argument, if present, is an argument of the attribute. |
| |
| If there are no arguments the attribute is a property of the call location. |
| |
| For example: |
| |
| .. code-block:: llvm |
| |
| call void @llvm.assume(i1 true) ["align"(ptr %val, i32 8)] |
| |
| allows the optimizer to assume that at the location of the call to |
| :ref:`llvm.assume <int_assume>` ``%val`` has an alignment of at least 8. |
| |
| .. code-block:: llvm |
| |
| call void @llvm.assume(i1 %cond) ["cold"(), "nonnull"(ptr %val)] |
| |
| allows the optimizer to assume that the :ref:`llvm.assume <int_assume>` |
| call location is cold and that ``%val`` is not null. |
| |
| Just like for the argument of :ref:`llvm.assume <int_assume>`, if any of the |
| provided guarantees are violated at runtime the behavior is undefined. |
| |
| While attributes expect constant arguments, assume operand bundles may be |
| provided a dynamic value, for example: |
| |
| .. code-block:: llvm |
| |
| call void @llvm.assume(i1 true) ["align"(ptr %val, i32 %align)] |
| |
| If the operand bundle value violates any requirements on the attribute value, |
| the behavior is undefined, unless one of the following exceptions applies: |
| |
| * ``"align"`` operand bundles may specify a non-power-of-two alignment |
| (including a zero alignment). If this is the case, then the pointer value |
| must be a null pointer, otherwise the behavior is undefined. |
| |
| In addition to allowing operand bundles encoding function and parameter |
| attributes, an assume operand bundle may also encode a ``separate_storage`` |
| operand bundle. This has the form: |
| |
| .. code-block:: llvm |
| |
| separate_storage(<val1>, <val2>) |
| |
| This indicates that no pointer :ref:`based <pointeraliasing>` on one of its |
| arguments can alias any pointer based on the other. |
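| |
| A minimal sketch of how such a bundle might be attached (the function ``@f`` |
| and its arguments are hypothetical): |
| |
| .. code-block:: llvm |
| |
| declare void @llvm.assume(i1) |
| |
| define void @f(ptr %p, ptr %q) { |
| ; Assume that nothing based on %p aliases anything based on %q. |
| call void @llvm.assume(i1 true) ["separate_storage"(ptr %p, ptr %q)] |
| store i8 1, ptr %p |
| store i8 2, ptr %q |
| ret void |
| } |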
| |
| Even if the assumed property can be encoded as a boolean value, like |
| ``nonnull``, using operand bundles to express the property can still have |
| benefits: |
| |
| * Attributes that can be expressed via operand bundles are directly the |
| property that the optimizer uses and cares about. Encoding attributes as |
| operand bundles removes the need for an instruction sequence that represents |
| the property (e.g., ``icmp ne ptr %p, null`` for ``nonnull``) and for the |
| optimizer to deduce the property from that instruction sequence. |
| * Expressing the property using operand bundles makes it easy to identify the |
| use of the value as a use in an :ref:`llvm.assume <int_assume>`. This then |
| simplifies and improves heuristics, e.g., for "use-sensitive" |
| optimizations. |
| |
| .. _ob_preallocated: |
| |
| Preallocated Operand Bundles |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| Preallocated operand bundles are characterized by the ``"preallocated"`` |
| operand bundle tag. These operand bundles allow separation of the allocation |
| of the call argument memory from the call site. This is necessary to pass |
| non-trivially copyable objects by value in a way that is compatible with MSVC |
| on some targets. There can be at most one ``"preallocated"`` operand bundle |
| attached to a call site and it must have exactly one bundle operand, which is |
| a token generated by ``@llvm.call.preallocated.setup``. A call with this |
| operand bundle should not adjust the stack before entering the function, as |
| that will have been done by one of the ``@llvm.call.preallocated.*`` intrinsics. |
| |
| .. code-block:: llvm |
| |
| %foo = type { i64, i32 } |
| |
| ... |
| |
| %t = call token @llvm.call.preallocated.setup(i32 1) |
| %a = call ptr @llvm.call.preallocated.arg(token %t, i32 0) preallocated(%foo) |
| ; initialize %a |
| call void @bar(i32 42, ptr preallocated(%foo) %a) ["preallocated"(token %t)] |
| |
| .. _ob_gc_live: |
| |
| GC Live Operand Bundles |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| A "gc-live" operand bundle is only valid on a :ref:`gc.statepoint <gc_statepoint>` |
| intrinsic. The operand bundle must contain every pointer to a garbage collected |
| object which potentially needs to be updated by the garbage collector. |
| |
| When lowered, any relocated value will be recorded in the corresponding |
| :ref:`stackmap entry <statepoint-stackmap-format>`. See the intrinsic description |
| for further details. |
| |
| ObjC ARC Attached Call Operand Bundles |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| A ``"clang.arc.attachedcall"`` operand bundle on a call indicates the call is |
| implicitly followed by a marker instruction and a call to an ObjC runtime |
| function that uses the result of the call. The operand bundle takes a mandatory |
| pointer to the runtime function (``@objc_retainAutoreleasedReturnValue`` or |
| ``@objc_unsafeClaimAutoreleasedReturnValue``). |
| The return value of a call with this bundle is used by a call to |
| ``@llvm.objc.clang.arc.noop.use`` unless the called function's return type is |
| void, in which case the operand bundle is ignored. |
| |
| .. code-block:: llvm |
| |
| ; The marker instruction and a runtime function call are inserted after the call |
| ; to @foo. |
| call ptr @foo() [ "clang.arc.attachedcall"(ptr @objc_retainAutoreleasedReturnValue) ] |
| call ptr @foo() [ "clang.arc.attachedcall"(ptr @objc_unsafeClaimAutoreleasedReturnValue) ] |
| |
| The operand bundle is needed to ensure the call is immediately followed by the |
| marker instruction and the ObjC runtime call in the final output. |
| |
| .. _ob_ptrauth: |
| |
| Pointer Authentication Operand Bundles |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| Pointer Authentication operand bundles are characterized by the |
| ``"ptrauth"`` operand bundle tag. They are described in the |
| `Pointer Authentication <PointerAuth.html#operand-bundle>`__ document. |
| |
| .. _ob_kcfi: |
| |
| KCFI Operand Bundles |
| ^^^^^^^^^^^^^^^^^^^^ |
| |
| A ``"kcfi"`` operand bundle on an indirect call indicates that the call will |
| be preceded by a runtime type check, which validates that the call target is |
| prefixed with a :ref:`type identifier<md_kcfi_type>` that matches the operand |
| bundle attribute. For example: |
| |
| .. code-block:: llvm |
| |
| call void %0() ["kcfi"(i32 1234)] |
| |
| Clang emits KCFI operand bundles and the necessary metadata with |
| ``-fsanitize=kcfi``. |
| |
| .. _convergencectrl: |
| |
| Convergence Control Operand Bundles |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| A "convergencectrl" operand bundle is only valid on a ``convergent`` operation. |
| When present, the operand bundle must contain exactly one value of token type. |
| See the :doc:`ConvergentOperations` document for details. |
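| |
| A minimal sketch, assuming the experimental convergence control intrinsics |
| described in that document (the functions ``@f`` and ``@g`` are hypothetical): |
| |
| .. code-block:: llvm |
| |
| declare token @llvm.experimental.convergence.entry() |
| declare void @g() convergent |
| |
| define void @f() convergent { |
| entry: |
| %tok = call token @llvm.experimental.convergence.entry() |
| ; Exactly one token operand on a call to a convergent function. |
| call void @g() [ "convergencectrl"(token %tok) ] |
| ret void |
| } |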
| |
| .. _moduleasm: |
| |
| Module-Level Inline Assembly |
| ---------------------------- |
| |
| Modules may contain "module-level inline asm" blocks, which correspond |
| to the GCC "file scope inline asm" blocks. These blocks are internally |
| concatenated by LLVM and treated as a single unit, but may be separated |
| in the ``.ll`` file if desired. The syntax is very simple: |
| |
| .. code-block:: llvm |
| |
| module asm "inline asm code goes here" |
| module asm "more can go here" |
| |
| The strings can contain any character by escaping non-printable |
| characters. The escape sequence used is simply "\\xx" where "xx" is the |
| two digit hex code for the number. |
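| |
| For instance, in the following sketch ``\09`` encodes a tab character (hex |
| code 09). The symbol name ``demo_sym`` is hypothetical, and the directive |
| assumes a target whose assembler accepts ``.globl``: |
| |
| .. code-block:: llvm |
| |
| module asm "\09.globl demo_sym" |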
| |
| Note that the assembly string *must* be parseable by LLVM's integrated assembler |
| (unless it is disabled), even when emitting a ``.s`` file. |
| |
| .. _langref_datalayout: |
| |
| Data Layout |
| ----------- |
| |
| A module may specify a target specific data layout string that specifies |
| how data is to be laid out in memory. The syntax for the data layout is |
| simply: |
| |
| .. code-block:: llvm |
| |
| target datalayout = "layout specification" |
| |
| The *layout specification* consists of a list of specifications |
| separated by the minus sign character ('-'). Each specification starts |
| with a letter and may include other information after the letter to |
| define some aspect of the data layout. The specifications accepted are |
| as follows: |
| |
| ``E`` |
| Specifies that the target lays out data in big-endian form. That is, |
| the bits with the most significance have the lowest address |
| location. |
| ``e`` |
| Specifies that the target lays out data in little-endian form. That |
| is, the bits with the least significance have the lowest address |
| location. |
| ``S<size>`` |
| Specifies the natural alignment of the stack in bits. Alignment |
| promotion of stack variables is limited to the natural stack |
| alignment to avoid dynamic stack realignment. The stack alignment |
| must be a multiple of 8-bits. If omitted, the natural stack |
| alignment defaults to "unspecified", which does not prevent any |
| alignment promotions. |
| ``P<address space>`` |
| Specifies the address space that corresponds to program memory. |
| Harvard architectures can use this to specify what space LLVM |
| should place things such as functions into. If omitted, the |
| program memory space defaults to the default address space of 0, |
| which corresponds to a Von Neumann architecture that has code |
| and data in the same space. |
| ``G<address space>`` |
| Specifies the address space to be used by default when creating global |
| variables. If omitted, the globals address space defaults to the default |
| address space 0. |
| Note: variable declarations without an address space are always created in |
| address space 0, this property only affects the default value to be used |
| when creating globals without additional contextual information (e.g. in |
| LLVM passes). |
| |
| .. _alloca_addrspace: |
| |
| ``A<address space>`` |
| Specifies the address space of objects created by '``alloca``'. |
| Defaults to the default address space of 0. |
| ``p[n]:<size>:<abi>[:<pref>][:<idx>]`` |
| This specifies the *size* of a pointer and its ``<abi>`` and |
| ``<pref>``\erred alignments for address space ``n``. ``<pref>`` is optional |
| and defaults to ``<abi>``. The fourth parameter ``<idx>`` is the size of the |
| index that is used for address calculation. If not |
| specified, the default index size is equal to the pointer size. All sizes |
| are in bits. The address space, ``n``, is optional, and if not specified, |
| denotes the default address space 0. The value of ``n`` must be |
| in the range [1,2^24). |
| ``i<size>:<abi>[:<pref>]`` |
| This specifies the alignment for an integer type of a given bit |
| ``<size>``. The value of ``<size>`` must be in the range [1,2^24). |
| ``<pref>`` is optional and defaults to ``<abi>``. |
| For ``i8``, the ``<abi>`` value must equal 8, |
| that is, ``i8`` must be naturally aligned. |
| ``v<size>:<abi>[:<pref>]`` |
| This specifies the alignment for a vector type of a given bit |
| ``<size>``. The value of ``<size>`` must be in the range [1,2^24). |
| ``<pref>`` is optional and defaults to ``<abi>``. |
| ``f<size>:<abi>[:<pref>]`` |
| This specifies the alignment for a floating-point type of a given bit |
| ``<size>``. Only values of ``<size>`` that are supported by the target |
| will work. 32 (float) and 64 (double) are supported on all targets; 80 |
| or 128 (different flavors of long double) are also supported on some |
| targets. The value of ``<size>`` must be in the range [1,2^24). |
| ``<pref>`` is optional and defaults to ``<abi>``. |
| ``a:<abi>[:<pref>]`` |
| This specifies the alignment for an object of aggregate type. |
| ``<pref>`` is optional and defaults to ``<abi>``. |
| ``F<type><abi>`` |
| This specifies the alignment for function pointers. |
| The options for ``<type>`` are: |
| |
| * ``i``: The alignment of function pointers is independent of the alignment |
| of functions, and is a multiple of ``<abi>``. |
| * ``n``: The alignment of function pointers is a multiple of the explicit |
| alignment specified on the function, and is a multiple of ``<abi>``. |
| ``m:<mangling>`` |
| If present, specifies that LLVM names are mangled in the output. Symbols |
| prefixed with the mangling escape character ``\01`` are passed through |
| directly to the assembler without the escape character. The mangling style |
| options are: |
| |
| * ``e``: ELF mangling: Private symbols get a ``.L`` prefix. |
| * ``l``: GOFF mangling: Private symbols get a ``@`` prefix. |
| * ``m``: Mips mangling: Private symbols get a ``$`` prefix. |
| * ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other |
| symbols get a ``_`` prefix. |
| * ``x``: Windows x86 COFF mangling: Private symbols get the usual prefix. |
| Regular C symbols get a ``_`` prefix. Functions with ``__stdcall``, |
| ``__fastcall``, and ``__vectorcall`` have custom mangling that appends |
| ``@N`` where N is the number of bytes used to pass parameters. C++ symbols |
| starting with ``?`` are not mangled in any way. |
| * ``w``: Windows COFF mangling: Similar to ``x``, except that normal C |
| symbols do not receive a ``_`` prefix. |
| * ``a``: XCOFF mangling: Private symbols get an ``L..`` prefix. |
| ``n<size1>:<size2>:<size3>...`` |
| This specifies a set of native integer widths for the target CPU in |
| bits. For example, it might contain ``n32`` for 32-bit PowerPC, |
| ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of |
| this set are considered to support most general arithmetic operations |
| efficiently. |
| ``ni:<address space0>:<address space1>:<address space2>...`` |
| This specifies pointer types with the specified address spaces |
| as :ref:`Non-Integral Pointer Types <nointptrtype>`. The ``0`` |
| address space cannot be specified as non-integral. |
| |
| On every specification that takes a ``<abi>:<pref>``, specifying the |
| ``<pref>`` alignment is optional. If omitted, the preceding ``:`` |
| should be omitted too and ``<pref>`` will be equal to ``<abi>``. |
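| |
| As an illustration of how the specifications combine, the following string is |
| a hypothetical layout assembled for this example; it is not claimed to be the |
| layout of any particular target: |
| |
| .. code-block:: llvm |
| |
| ; e little-endian |
| ; m:e ELF mangling |
| ; p:64:64 64-bit pointers in address space 0, 64-bit ABI alignment; |
| ; <pref> is omitted, so it equals <abi> |
| ; i64:64 i64 has 64-bit ABI and preferred alignment |
| ; n8:16:32:64 native integer widths |
| ; S128 natural stack alignment of 128 bits |
| target datalayout = "e-m:e-p:64:64-i64:64-n8:16:32:64-S128" |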
| |
| When constructing the data layout for a given target, LLVM starts with a |
| default set of specifications which are then (possibly) overridden by |
| the specifications in the ``datalayout`` keyword. The default |
| specifications are given in this list: |
| |
| - ``e`` - little endian |
| - ``p:64:64:64`` - 64-bit pointers with 64-bit alignment. |
| - ``p[n]:64:64:64`` - Other address spaces are assumed to be the |
| same as the default address space. |
| - ``S0`` - natural stack alignment is unspecified |
| - ``i1:8:8`` - i1 is 8-bit (byte) aligned |
| - ``i8:8:8`` - i8 is 8-bit (byte) aligned as mandated |
| - ``i16:16:16`` - i16 is 16-bit aligned |
| - ``i32:32:32`` - i32 is 32-bit aligned |
| - ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred |
| alignment of 64-bits |
| - ``f16:16:16`` - half is 16-bit aligned |
| - ``f32:32:32`` - float is 32-bit aligned |
| - ``f64:64:64`` - double is 64-bit aligned |
| - ``f128:128:128`` - quad is 128-bit aligned |
| - ``v64:64:64`` - 64-bit vector is 64-bit aligned |
| - ``v128:128:128`` - 128-bit vector is 128-bit aligned |
| - ``a:0:64`` - aggregates have an ABI alignment of 0 and a preferred |
| alignment of 64 bits |
| |
| When LLVM is determining the alignment for a given type, it uses the |
| following rules: |
| |
| #. If the type sought is an exact match for one of the specifications, |
| that specification is used. |
| #. If no match is found, and the type sought is an integer type, then |
| the smallest integer type that is larger than the bitwidth of the |
| sought type is used. If none of the specifications are larger than |
| the bitwidth then the largest integer type is used. For example, |
| given the default specifications above, the i7 type will use the |
| alignment of i8 (next largest) while both i65 and i256 will use the |
| alignment of i64 (largest specified). |
| |
| The function of the data layout string may not be what you expect. |
| Notably, this is not a specification from the frontend of what alignment |
| the code generator should use. |
| |
| Instead, if specified, the target data layout is required to match what |
| the ultimate *code generator* expects. This string is used by the |
| mid-level optimizers to improve code, and this only works if it matches |
| what the ultimate code generator uses. There is no way to generate IR |
| that does not embed this target-specific detail into the IR. If you |
| don't specify the string, the default specifications will be used to |
| generate a Data Layout and the optimization phases will operate |
| accordingly and introduce target specificity into the IR with respect to |
| these default specifications. |
| |
| .. _langref_triple: |
| |
| Target Triple |
| ------------- |
| |
| A module may specify a target triple string that describes the target |
| host. The syntax for the target triple is simply: |
| |
| .. code-block:: llvm |
| |
| target triple = "x86_64-apple-macosx10.7.0" |
| |
| The *target triple* string consists of a series of identifiers delimited |
| by the minus sign character ('-'). The canonical forms are: |
| |
| :: |
| |
| ARCHITECTURE-VENDOR-OPERATING_SYSTEM |
| ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT |
| |
| This information is passed along to the backend so that it generates |
| code for the proper architecture. It's possible to override this on the |
| command line with the ``-mtriple`` command line option. |
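| |
| For illustration, a triple in the four-component form might look like the |
| following, where ``x86_64`` is the architecture, ``unknown`` the vendor, |
| ``linux`` the operating system, and ``gnu`` the environment: |
| |
| .. code-block:: llvm |
| |
| target triple = "x86_64-unknown-linux-gnu" |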
| |
| .. _objectlifetime: |
| |
| Object Lifetime |
| ---------------------- |
| |
| A memory object, or simply object, is a region of a memory space that is |
| reserved by a memory allocation such as :ref:`alloca <i_alloca>`, heap |
| allocation calls, and global variable definitions. |
| Once it is allocated, the bytes stored in the region can only be read or written |
| through a pointer that is :ref:`based on <pointeraliasing>` the allocation |
| value. |
| If a pointer that is not based on the object tries to read or write to the |
| object, it is undefined behavior. |
| |
| The lifetime of a memory object is a property that determines its accessibility. |
| Unless stated otherwise, a memory object is alive from its allocation until its |
| deallocation, and dead afterwards. |
| It is undefined behavior to access a memory object that isn't alive, but |
| operations that don't dereference it such as |
| :ref:`getelementptr <i_getelementptr>`, :ref:`ptrtoint <i_ptrtoint>` and |
| :ref:`icmp <i_icmp>` return a valid result. |
| This explains code motion of these instructions across operations that |
| impact the object's lifetime. |
| A stack object's lifetime can be explicitly specified using |
| :ref:`llvm.lifetime.start <int_lifestart>` and |
| :ref:`llvm.lifetime.end <int_lifeend>` intrinsic function calls. |
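| |
| A minimal sketch of explicit lifetime markers on a stack object follows. It |
| assumes the intrinsic form that takes a size operand followed by the pointer; |
| the exact signature depends on the LLVM version, so consult the intrinsic |
| descriptions linked above. The function ``@f`` is hypothetical: |
| |
| .. code-block:: llvm |
| |
| declare void @llvm.lifetime.start.p0(i64, ptr) |
| declare void @llvm.lifetime.end.p0(i64, ptr) |
| |
| define i32 @f() { |
| %slot = alloca i32 |
| call void @llvm.lifetime.start.p0(i64 4, ptr %slot) ; %slot becomes alive |
| store i32 7, ptr %slot |
| %v = load i32, ptr %slot |
| call void @llvm.lifetime.end.p0(i64 4, ptr %slot) ; %slot is dead from here on |
| ret i32 %v |
| } |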
| |
| .. _pointeraliasing: |
| |
| Pointer Aliasing Rules |
| ---------------------- |
| |
| Any memory access must be done through a pointer value associated with |
| an address range of the memory access, otherwise the behavior is |
| undefined. Pointer values are associated with address ranges according |
| to the following rules: |
| |
| - A pointer value is associated with the addresses associated with any |
| value it is *based* on. |
| - An address of a global variable is associated with the address range |
| of the variable's storage. |
| - The result value of an allocation instruction is associated with the |
| address range of the allocated storage. |
| - A null pointer in the default address-space is associated with no |
| address. |
| - An :ref:`undef value <undefvalues>` in *any* address-space is |
| associated with no address. |
| - An integer constant other than zero or a pointer value returned from |
| a function not defined within LLVM may be associated with address |
| ranges allocated through mechanisms other than those provided by |
| LLVM. Such ranges shall not overlap with any ranges of addresses |
| allocated by mechanisms provided by LLVM. |
| |
| A pointer value is *based* on another pointer value according to the |
| following rules: |
| |
| - A pointer value formed from a scalar ``getelementptr`` operation is *based* on |
| the pointer-typed operand of the ``getelementptr``. |
| - The pointer in lane *l* of the result of a vector ``getelementptr`` operation |
| is *based* on the pointer in lane *l* of the vector-of-pointers-typed operand |
| of the ``getelementptr``. |
| - The result value of a ``bitcast`` is *based* on the operand of the |
| ``bitcast``. |
| - A pointer value formed by an ``inttoptr`` is *based* on all pointer |
| values that contribute (directly or indirectly) to the computation of |
| the pointer's value. |
| - The "*based* on" relationship is transitive. |
| |
| Note that this definition of *"based"* is intentionally similar to the |
| definition of *"based"* in C99, though it is slightly weaker. |
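| |
| The rules above can be sketched on a small, hypothetical function: |
| |
| .. code-block:: llvm |
| |
| define i8 @f(ptr %p, i64 %i) { |
| %q = getelementptr i8, ptr %p, i64 %i ; %q is based on %p |
| %n = ptrtoint ptr %q to i64 |
| %r = inttoptr i64 %n to ptr ; %r is based on %q, which contributed to %n |
| %v = load i8, ptr %r ; by transitivity, %r is also based on %p |
| ret i8 %v |
| } |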
| |
| LLVM IR does not associate types with memory. The result type of a |
| ``load`` merely indicates the size and alignment of the memory from |
| which to load, as well as the interpretation of the value. The first |
| operand type of a ``store`` similarly only indicates the size and |
| alignment of the store. |
| |
| Consequently, type-based alias analysis, aka TBAA, aka |
| ``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR. |
| :ref:`Metadata <metadata>` may be used to encode additional information |
| which specialized optimization passes may use to implement type-based |
| alias analysis. |
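| |
| As a sketch of what such metadata can look like, the following hypothetical |
| function tags two accesses with ``!tbaa`` nodes. The type names mimic what a C |
| frontend might emit but are otherwise arbitrary, and the metadata format |
| itself is specified in the metadata section: |
| |
| .. code-block:: llvm |
| |
| define i32 @f(ptr %ip, ptr %fp) { |
| store float 1.0, ptr %fp, !tbaa !5 ; tagged as a "float" access |
| %v = load i32, ptr %ip, !tbaa !3 ; tagged as an "int" access |
| ret i32 %v |
| } |
| |
| !0 = !{!"Simple C/C++ TBAA"} ; root of the type tree |
| !1 = !{!"omnipotent char", !0, i64 0} |
| !2 = !{!"int", !1, i64 0} |
| !3 = !{!2, !2, i64 0} ; access tag: base type, access type, offset |
| !4 = !{!"float", !1, i64 0} |
| !5 = !{!4, !4, i64 0} |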
| |
| .. _pointercapture: |
| |
| Pointer Capture |
| --------------- |
| |
| Given a function call and a pointer that is passed as an argument or stored in |
| the memory before the call, a pointer is *captured* by the call if it makes a |
| copy of any part of the pointer that outlives the call. |
| To be precise, a pointer is captured if one or more of the following conditions |
| hold: |
| |
| 1. The call stores any bit of the pointer carrying information into a place, |
| and the stored bits can be read from the place by the caller after this call |
| exits. |
| |
| .. code-block:: llvm |
| |
| @glb = global ptr null |
| @glb2 = global ptr null |
| @glb3 = global ptr null |
| @glbi = global i32 0 |
| |
| define ptr @f(ptr %a, ptr %b, ptr %c, ptr %d, ptr %e) { |
| store ptr %a, ptr @glb ; %a is captured by this call |
| |
| store ptr %b, ptr @glb2 ; %b isn't captured because the stored value is overwritten by the store below |
| store ptr null, ptr @glb2 |
| |
| store ptr %c, ptr @glb3 |
| call void @g() ; If @g makes a copy of %c that outlives this call (@f), %c is captured |
| store ptr null, ptr @glb3 |
| |
| %i = ptrtoint ptr %d to i64 |
| %j = trunc i64 %i to i32 |
| store i32 %j, ptr @glbi ; %d is captured |
| |
| ret ptr %e ; %e is captured |
| } |
| |
| 2. The call stores any bit of the pointer carrying information into a place, |
| and the stored bits can be safely read from the place by another thread via |
| synchronization. |
| |
| .. code-block:: llvm |
| |
| @lock = global i1 true |
| |
| define void @f(ptr %a) { |
| store ptr %a, ptr @glb |
| store atomic i1 false, ptr @lock release ; %a is captured because another thread can safely read @glb |
| store ptr null, ptr @glb |
| ret void |
| } |
| |
| 3. The call's behavior depends on any bit of the pointer carrying information. |
| |
| .. code-block:: llvm |
| |
| @glb = global i8 0 |
| |
| define void @f(ptr %a) { |
| %c = icmp eq ptr %a, @glb |
| br i1 %c, label %BB_EXIT, label %BB_CONTINUE ; captures %a |
| BB_EXIT: |
| call void @exit() |
| unreachable |
| BB_CONTINUE: |
| ret void |
| } |
| |
| 4. The pointer is used in a volatile access as its address. |
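| |
| For example, in this sketch (the function ``@f`` is illustrative): |
| |
| .. code-block:: llvm |
| |
| define i8 @f(ptr %a) { |
| %v = load volatile i8, ptr %a ; %a is captured: it is the address of a volatile access |
| ret i8 %v |
| } |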
| |
| |
| .. _volatile: |
| |
| Volatile Memory Accesses |
| ------------------------ |
| |
| Certain memory accesses, such as :ref:`load <i_load>`'s, |
| :ref:`store <i_store>`'s, and :ref:`llvm.memcpy <int_memcpy>`'s may be |
| marked ``volatile``. The optimizers must not change the number of |
| volatile operations or change their order of execution relative to other |
| volatile operations. The optimizers *may* change the order of volatile |
| operations relative to non-volatile operations. This is not Java's |
| "volatile" and has no cross-thread synchronization behavior. |
| |
| A volatile load or store may have additional target-specific semantics. |
| Any volatile operation can have side effects, and any volatile operation |
| can read and/or modify state which is not accessible via a regular load |
| or store in this module. Volatile operations may use addresses which do |
| not point to memory (like MMIO registers). This means the compiler may |
| not use a volatile operation to prove a non-volatile access to that |
| address has defined behavior. |
| |
| The allowed side-effects for volatile accesses are limited. If a |
| non-volatile store to a given address would be legal, a volatile |
| operation may modify the memory at that address. A volatile operation |
| may not modify any other memory accessible by the module being compiled. |
| A volatile operation may not call any code in the current module. |
| |
| In general (without target specific context), the address space of a |
| volatile operation may not be changed. Different address spaces may |
| have different trapping behavior when dereferencing an invalid |
| pointer. |
| |
| The compiler may assume execution will continue after a volatile operation, |
| so operations which modify memory or may have undefined behavior can be |
| hoisted past a volatile operation. |
| |
| As an exception to the preceding rule, the compiler may not assume execution |
| will continue after a volatile store operation. This restriction is necessary |
| to support the somewhat common pattern in C of intentionally storing to an |
| invalid pointer to crash the program. In the future, it might make sense to |
| allow frontends to control this behavior. |
| |
| IR-level volatile loads and stores cannot safely be optimized into llvm.memcpy |
| or llvm.memmove intrinsics even when those intrinsics are flagged volatile. |
| Likewise, the backend should never split or merge target-legal volatile |
| load/store instructions. Similarly, IR-level volatile loads and stores cannot |
| change from integer to floating-point or vice versa. |
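| |
| As a minimal sketch of the requirement that the number and relative order of |
| volatile operations be preserved, consider the hypothetical function ``@poll`` |
| below. Both loads must be kept, in this order, even though a non-volatile |
| version could reuse the first result: |
| |
| .. code-block:: llvm |
| |
| define i32 @poll(ptr %reg) { |
| %a = load volatile i32, ptr %reg |
| %b = load volatile i32, ptr %reg ; may not be replaced by %a |
| %sum = add i32 %a, %b |
| ret i32 %sum |
| } |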
| |
| .. admonition:: Rationale |
| |
| Platforms may rely on volatile loads and stores of natively supported |
| data width to be executed as a single instruction. For example, in C |
| this holds for an l-value of volatile primitive type with native |
| hardware support, but not necessarily for aggregate types. The |
| frontend upholds these expectations, which are intentionally |
| unspecified in the IR. The rules above ensure that IR transformations |
| do not violate the frontend's contract with the language. |
| |
| .. _memmodel: |
| |
| Memory Model for Concurrent Operations |
| -------------------------------------- |
| |
| The LLVM IR does not define any way to start parallel threads of |
| execution or to register signal handlers. Nonetheless, there are |
| platform-specific ways to create them, and we define LLVM IR's behavior |
| in their presence. This model is inspired by the C++0x memory model. |
| |
| For a more informal introduction to this model, see the :doc:`Atomics`. |
| |
| We define a *happens-before* partial order as the least partial order |
| that |
| |
| - Is a superset of single-thread program order, and |
| - When a *synchronizes-with* ``b``, includes an edge from ``a`` to |
| ``b``. *Synchronizes-with* pairs are introduced by platform-specific |
| techniques, like pthread locks, thread creation, thread joining, |
| etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering |
| Constraints <ordering>`). |
| |
| Note that program order does not introduce *happens-before* edges |
| between a thread and signals executing inside that thread. |
| |
| Every (defined) read operation (load instructions, memcpy, atomic |
| loads/read-modify-writes, etc.) R reads a series of bytes written by |
| (defined) write operations (store instructions, atomic |
| stores/read-modify-writes, memcpy, etc.). For the purposes of this |
| section, initialized globals are considered to have a write of the |
| initializer which is atomic and happens before any other read or write |
| of the memory in question. For each byte of a read R, R\ :sub:`byte` |
| may see any write to the same byte, except: |
| |
| - If write\ :sub:`1` happens before write\ :sub:`2`, and |
| write\ :sub:`2` happens before R\ :sub:`byte`, then |
| R\ :sub:`byte` does not see write\ :sub:`1`. |
| - If R\ :sub:`byte` happens before write\ :sub:`3`, then |
| R\ :sub:`byte` does not see write\ :sub:`3`. |
| |
| Given that definition, R\ :sub:`byte` is defined as follows: |
| |
| - If R is volatile, the result is target-dependent. (Volatile is |
| supposed to give guarantees which can support ``sig_atomic_t`` in |
| C/C++, and may be used for accesses to addresses that do not behave |
| like normal memory. It does not generally provide cross-thread |
| synchronization.) |
| - Otherwise, if there is no write to the same byte that happens before |
| R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte. |
| - Otherwise, if R\ :sub:`byte` may see exactly one write, |
| R\ :sub:`byte` returns the value written by that write. |
| - Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may |
| see are atomic, it chooses one of the values written. See the :ref:`Atomic |
| Memory Ordering Constraints <ordering>` section for additional |
| constraints on how the choice is made. |
| - Otherwise R\ :sub:`byte` returns ``undef``. |
| |
| R returns the value composed of the series of bytes it read. This |
| implies that some bytes within the value may be ``undef`` **without** |
| the entire value being ``undef``. Note that this only defines the |
| semantics of the operation; it doesn't mean that targets will emit more |
| than one instruction to read the series of bytes. |
| |
| Note that in cases where none of the atomic intrinsics are used, this |
| model places only one restriction on IR transformations on top of what |
| is required for single-threaded execution: introducing a store to a byte |
| which might not otherwise be stored is not allowed in general. |
| (Specifically, in the case where another thread might write to and read |
| from an address, introducing a store can change a load that may see |
| exactly one write into a load that may see multiple writes.) |
| |
| .. _ordering: |
| |
| Atomic Memory Ordering Constraints |
| ---------------------------------- |
| |
| Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`, |
| :ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`, |
| :ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take |
| ordering parameters that determine which other atomic instructions on |
| the same address they *synchronize with*. These semantics are borrowed |
| from Java and C++0x, but are somewhat more colloquial. If these |
| descriptions aren't precise enough, check those specs (see spec |
| references in the :doc:`atomics guide <Atomics>`). |
| :ref:`fence <i_fence>` instructions treat these orderings somewhat |
| differently since they don't take an address. See that instruction's |
| documentation for details. |
| |
| For a simpler introduction to the ordering constraints, see the |
| :doc:`Atomics`. |
| |
| ``unordered`` |
| The set of values that can be read is governed by the happens-before |
| partial order. A value cannot be read unless some operation wrote |
| it. This is intended to provide a guarantee strong enough to model |
| Java's non-volatile shared variables. This ordering cannot be |
| specified for read-modify-write operations; it is not strong enough |
| to make them atomic in any interesting way. |
| ``monotonic`` |
| In addition to the guarantees of ``unordered``, there is a single |
| total order for modifications by ``monotonic`` operations on each |
| address. All modification orders must be compatible with the |
| happens-before order. There is no guarantee that the modification |
| orders can be combined to a global total order for the whole program |
| (and this often will not be possible). The read in an atomic |
| read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and |
| :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification |
| order immediately before the value it writes. If one atomic read |
| happens before another atomic read of the same address, the later |
| read must see the same value or a later value in the address's |
| modification order. This disallows reordering of ``monotonic`` (or |
| stronger) operations on the same address. If an address is written |
| ``monotonic``-ally by one thread, and other threads ``monotonic``-ally |
| read that address repeatedly, the other threads must eventually see |
| the write. This corresponds to the C++11/C11 |
| ``memory_order_relaxed``. |
| ``acquire`` |
| In addition to the guarantees of ``monotonic``, a |
| *synchronizes-with* edge may be formed with a ``release`` operation. |
| This is intended to model C++'s ``memory_order_acquire``. |
| ``release`` |
| In addition to the guarantees of ``monotonic``, if this operation |
| writes a value which is subsequently read by an ``acquire`` |
| operation, it *synchronizes-with* that operation. (This isn't a |
| complete description; see the C++11 definition of a release |
| sequence.) This corresponds to the C++11/C11 |
| ``memory_order_release``. |
| ``acq_rel`` (acquire+release) |
| Acts as both an ``acquire`` and ``release`` operation on its |
| address. This corresponds to the C++11/C11 ``memory_order_acq_rel``. |
| ``seq_cst`` (sequentially consistent) |
| In addition to the guarantees of ``acq_rel`` (``acquire`` for an |
| operation that only reads, ``release`` for an operation that only |
| writes), there is a global total order on all |
| sequentially-consistent operations on all addresses, which is |
| consistent with the *happens-before* partial order and with the |
| modification orders of all the affected addresses. Each |
| sequentially-consistent read sees the last preceding write to the |
| same address in this global order. This corresponds to the C++11/C11 |
| ``memory_order_seq_cst`` and Java volatile. |
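| |
| The ``release``/``acquire`` pairing described above can be sketched as a simple |
| message-passing pattern. The globals and function names here are hypothetical, |
| chosen only for illustration: |
| |
| .. code-block:: llvm |
| |
| @data = global i32 0 |
| @flag = global i32 0 |
| |
| define void @producer() { |
| entry: |
| ; Plain (non-atomic) store of the payload. |
| store i32 42, ptr @data |
| ; The release store publishes everything written before it. |
| store atomic i32 1, ptr @flag release, align 4 |
| ret void |
| } |
| |
| define i32 @consumer() { |
| entry: |
| br label %spin |
| spin: |
| ; The acquire load synchronizes with the release store once it reads 1. |
| %f = load atomic i32, ptr @flag acquire, align 4 |
| %ready = icmp eq i32 %f, 1 |
| br i1 %ready, label %done, label %spin |
| done: |
| ; Ordered after the producer's store to @data, so this reads 42. |
| %v = load i32, ptr @data |
| ret i32 %v |
| } |
| |
| If both atomic accesses were only ``monotonic``, no *synchronizes-with* edge |
| would form, and the plain load of ``@data`` would race with the producer's |
| store. |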
| |
| .. _syncscope: |
| |
| If an atomic operation is marked ``syncscope("singlethread")``, it only |
| *synchronizes with* and only participates in the seq\_cst total orderings of |
| other operations running in the same thread (for example, in signal handlers). |
| |
| If an atomic operation is marked ``syncscope("<target-scope>")``, where |
| ``<target-scope>`` is a target specific synchronization scope, then it is target |
| dependent if it *synchronizes with* and participates in the seq\_cst total |
| orderings of other operations. |
| |
| Otherwise, an atomic operation that is not marked ``syncscope("singlethread")`` |
| or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the |
| seq\_cst total orderings of other operations that are not marked |
| ``syncscope("singlethread")`` or ``syncscope("<target-scope>")``. |
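| |
| As a small sketch of where the scope is written (the function name is |
| hypothetical), the ``syncscope`` marker appears just before the ordering: |
| |
| .. code-block:: llvm |
| |
| define void @scopes(ptr %p) { |
| entry: |
| ; Ordered only with respect to code running in the same thread, |
| ; such as a signal handler. |
| store atomic i32 1, ptr %p syncscope("singlethread") seq_cst, align 4 |
| fence syncscope("singlethread") seq_cst |
| ; No syncscope marker: takes part in the cross-thread seq_cst order. |
| store atomic i32 2, ptr %p seq_cst, align 4 |
| ret void |
| } |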
| |
| .. _floatenv: |
| |
| Floating-Point Environment |
| -------------------------- |
| |
| The default LLVM floating-point environment assumes that traps are disabled and |
| status flags are not observable. Therefore, floating-point math operations do |
| not have side effects and may be speculated freely. Results assume the |
| round-to-nearest rounding mode. |
| |
| Floating-point math operations are allowed to treat all NaNs as if they were |
| quiet NaNs. For example, "pow(1.0, SNaN)" may be simplified to 1.0. This also |
| means that SNaN may be passed through a math operation without quieting. For |
| example, "fmul SNaN, 1.0" may be simplified to SNaN rather than QNaN. However, |
| SNaN values are never created by math operations. They may only occur when |
| provided as a program input value. |
| |
| Code that requires different behavior than this should use the |
| :ref:`Constrained Floating-Point Intrinsics <constrainedfp>`. |
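| |
| A minimal sketch of the difference, assuming a function that must honor the |
| dynamic rounding mode and treat exceptions as observable (the function names |
| are hypothetical): |
| |
| .. code-block:: llvm |
| |
| declare float @llvm.experimental.constrained.fadd.f32(float, float, metadata, metadata) |
| |
| ; Default environment: round-to-nearest, no observable status flags. |
| define float @plain_add(float %a, float %b) { |
| %r = fadd float %a, %b |
| ret float %r |
| } |
| |
| ; Constrained form: rounding mode read at run time, exceptions strict. |
| define float @strict_add(float %a, float %b) strictfp { |
| %r = call float @llvm.experimental.constrained.fadd.f32( |
| float %a, float %b, |
| metadata !"round.dynamic", |
| metadata !"fpexcept.strict") strictfp |
| ret float %r |
| } |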
| |
| .. _fastmath: |
| |
| Fast-Math Flags |
| --------------- |
| |
| LLVM IR floating-point operations (:ref:`fneg <i_fneg>`, :ref:`fadd <i_fadd>`, |
| :ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`, |
| :ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`), :ref:`phi <i_phi>`, |
| :ref:`select <i_select>` and :ref:`call <i_call>` |
| may use the following flags to enable otherwise unsafe |
| floating-point transformations. |
| |
| ``nnan`` |
| No NaNs - Allow optimizations to assume the arguments and result are not |
| NaN. If an argument is a NaN, or the result would be a NaN, it produces |
| a :ref:`poison value <poisonvalues>` instead. |
| |
| ``ninf`` |
| No Infs - Allow optimizations to assume the arguments and result are not |
| +/-Inf. If an argument is +/-Inf, or the result would be +/-Inf, it |
| produces a :ref:`poison value <poisonvalues>` instead. |
| |
| ``nsz`` |
| No Signed Zeros - Allow optimizations to treat the sign of a zero |
| argument or zero result as insignificant. This does not imply that -0.0 |
| is poison and/or guaranteed to not exist in the operation. |
| |
| ``arcp`` |
| Allow Reciprocal - Allow optimizations to use the reciprocal of an |
| argument rather than perform division. |
| |
| ``contract`` |
| Allow floating-point contraction (e.g. fusing a multiply followed by an |
| addition into a fused multiply-and-add). This does not enable reassociating |
| to form arbitrary contractions. For example, ``(a*b) + (c*d) + e`` cannot |
| be transformed into ``(a*b) + ((c*d) + e)`` to create two fma operations. |
| |
| .. _fastmath_afn: |
| |
| ``afn`` |
| Approximate functions - Allow substitution of approximate calculations for |
| functions (sin, log, sqrt, etc.). See floating-point intrinsic definitions |
| for places where this can apply to LLVM's intrinsic math functions. |
| |
| ``reassoc`` |
| Allow reassociation transformations for floating-point instructions. |
| This may dramatically change results in floating-point. |
| |
| ``fast`` |
| This flag implies all of the others. |
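| |
| The flags are written between the opcode and the type. A short sketch with |
| hypothetical function names: |
| |
| .. code-block:: llvm |
| |
| declare float @llvm.sqrt.f32(float) |
| |
| ; Both operations carry 'contract', so they may be fused into one fma. |
| define float @mul_add(float %a, float %b, float %c) { |
| %mul = fmul contract float %a, %b |
| %sum = fadd contract float %mul, %c |
| ret float %sum |
| } |
| |
| ; 'arcp' permits rewriting the division as %x * (1.0 / %y). |
| define float @reciprocal_div(float %x, float %y) { |
| %q = fdiv arcp float %x, %y |
| ret float %q |
| } |
| |
| ; Flags may also be placed on calls that return floating-point values. |
| define float @approx_sqrt(float %x) { |
| %r = call afn nnan float @llvm.sqrt.f32(float %x) |
| ret float %r |
| } |
| |
| ; 'fast' is shorthand for all of the flags above. |
| define float @all_flags(float %x, float %y) { |
| %s = fadd fast float %x, %y |
| ret float %s |
| } |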
| |
| .. _uselistorder: |
| |
| Use-list Order Directives |
| ------------------------- |
| |
| Use-list directives encode the in-memory order of each use-list, allowing the |
| order to be recreated. ``<order-indexes>`` is a comma-separated list of |
| indexes that are assigned to the referenced value's uses. The referenced |
| value's use-list is immediately sorted by these indexes. |
| |
| Use-list directives may appear at function scope or global scope. They are not |
| instructions, and have no effect on the semantics of the IR. When they're at |
| function scope, they must appear after the terminator of the final basic block. |
| |
| If basic blocks have their address taken via ``blockaddress()`` expressions, |
| ``uselistorder_bb`` can be used to reorder their use-lists from outside their |
| function's scope. |
| |
| :Syntax: |
| |
| :: |
| |
| uselistorder <ty> <value>, { <order-indexes> } |
| uselistorder_bb @function, %block, { <order-indexes> } |
| |
| :Examples: |
| |
| :: |
| |
| define void @foo(i32 %arg1, i32 %arg2) { |
| entry: |
| ; ... instructions ... |
| bb: |
| ; ... instructions ... |
| |
| ; At function scope. |
| uselistorder i32 %arg1, { 1, 0, 2 } |
| uselistorder label %bb, { 1, 0 } |
| } |
| |
| ; At global scope. |
| uselistorder ptr @global, { 1, 2, 0 } |
| uselistorder i32 7, { 1, 0 } |
| uselistorder i32 (i32) @bar, { 1, 0 } |
| uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 } |
| |
| .. _source_filename: |
| |
| Source Filename |
| --------------- |
| |
| The *source filename* string is set to the original module identifier, |
| which will be the name of the compiled source file when compiling from |
| source through the clang front end, for example. It is then preserved through |
| the IR and bitcode. |
| |
| This is currently necessary to generate a consistent unique global |
| identifier for local functions used in profile data, which prepends the |
| source file name to the local function name. |
| |
| The syntax for the source file name is simply: |
| |
| .. code-block:: text |
| |
| source_filename = "/path/to/source.c" |
| |
| .. _typesystem: |
| |
| Type System |
| =========== |
| |
| The LLVM type system is one of the most important features of the |
| intermediate representation. Being typed enables a number of |
| optimizations to be performed on the intermediate representation |
| directly, without having to do extra analyses on the side before the |
| transformation. A strong type system makes it easier to read the |
| generated code and enables novel analyses and transformations that are |
| not feasible to perform on normal three address code representations. |
| |
| .. _t_void: |
| |
| Void Type |
| --------- |
| |
| :Overview: |
| |
| |
| The void type does not represent any value and has no size. |
| |
| :Syntax: |
| |
| |
| :: |
| |
| void |
| |
| |
| .. _t_function: |
| |
| Function Type |
| ------------- |
| |
| :Overview: |
| |
| |
| The function type can be thought of as a function signature. It consists of a |
| return type and a list of formal parameter types. The return type of a function |
| type is a void type or first class type --- except for :ref:`label <t_label>` |
| and :ref:`metadata <t_metadata>` types. |
| |
| :Syntax: |
| |
| :: |
| |
| <returntype> (<parameter list>) |
| |
| ...where '``<parameter list>``' is a comma-separated list of type |
| specifiers. Optionally, the parameter list may include a type ``...``, which |
| indicates that the function takes a variable number of arguments. Variable |
| argument functions can access their arguments with the :ref:`variable argument |
| handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type |
| except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`. |
| |
| :Examples: |
| |
| +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| | ``i32 (i32)`` | A function taking an ``i32``, returning an ``i32`` | |
| +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| | ``i32 (ptr, ...)`` | A vararg function that takes at least one :ref:`pointer <t_pointer>` argument and returns an integer. This is the signature for ``printf`` in LLVM. | |
| +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| | ``{i32, i32} (i32)`` | A function taking an ``i32``, returning a :ref:`structure <t_struct>` containing two ``i32`` values | |
| +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
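| |
| As a sketch of how the vararg signature from the table is used, a call to a |
| variable-argument function spells out the full function type at the call |
| site. The format string global here is a hypothetical example: |
| |
| .. code-block:: llvm |
| |
| @.str = private unnamed_addr constant [4 x i8] c"%d\0A\00" |
| |
| ; Function type: i32 (ptr, ...) |
| declare i32 @printf(ptr, ...) |
| |
| define i32 @main() { |
| entry: |
| ; The callee's function type is written explicitly for vararg calls. |
| %r = call i32 (ptr, ...) @printf(ptr @.str, i32 42) |
| ret i32 0 |
| } |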
| |
| .. _t_firstclass: |
| |
| First Class Types |
| ----------------- |
| |
| The :ref:`first class <t_firstclass>` types are perhaps the most important. |
| Values of these types are the only ones which can be produced by |
| instructions. |
| |
| .. _t_single_value: |
| |
| Single Value Types |
| ^^^^^^^^^^^^^^^^^^ |
| |
| These are the types that are valid in registers from CodeGen's perspective. |
| |
| .. _t_integer: |
| |
| Integer Type |
| """""""""""" |
| |
| :Overview: |
| |
| The integer type specifies an arbitrary bit width for the desired integer. |
| Any bit width from 1 bit to 2\ :sup:`23`\ (about 8 million) can be |
| specified. |
| |
| :Syntax: |
| |
| :: |
| |
| iN |
| |
| The number of bits the integer will occupy is specified by the ``N`` |
| value. |
| |
| :Examples: |
| |
| +----------------+------------------------------------------------+ |
| | ``i1`` | a single-bit integer. | |
| +----------------+------------------------------------------------+ |
| | ``i32`` | a 32-bit integer. | |
| +----------------+------------------------------------------------+ |
| | ``i1942652`` | a really big integer of over 1 million bits. | |
| +----------------+------------------------------------------------+ |
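| |
| A brief sketch of integers of different widths in use; the function names are |
| hypothetical: |
| |
| .. code-block:: llvm |
| |
| ; Comparisons produce an i1. |
| define i1 @is_negative(i32 %x) { |
| %neg = icmp slt i32 %x, 0 |
| ret i1 %neg |
| } |
| |
| ; Widths need not be powers of two; this add wraps modulo 2^7. |
| define i7 @add_i7(i7 %a, i7 %b) { |
| %sum = add i7 %a, %b |
| ret i7 %sum |
| } |
| |
| ; Converting between widths is explicit. |
| define i64 @widen(i7 %v) { |
| %w = zext i7 %v to i64 |
| ret i64 %w |
| } |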
| |
| .. _t_floating: |
| |
| Floating-Point Types |
| """""""""""""""""""" |
| |
| .. list-table:: |
| :header-rows: 1 |
| |
| * - Type |
| - Description |
| |
| * - ``half`` |
| - 16-bit floating-point value |
| |
| * - ``bfloat`` |
| - 16-bit "brain" floating-point value (7-bit significand). Provides the |
| same number of exponent bits as ``float``, so that it matches its dynamic |
| range, but with greatly reduced precision. Used in Intel's AVX-512 BF16 |
| extensions and Arm's ARMv8.6-A extensions, among others. |
| |
| * - ``float`` |
| - 32-bit floating-point value |
| |
| * - ``double`` |
| - 64-bit floating-point value |
| |
| * - ``fp128`` |
| - 128-bit floating-point value (113-bit significand) |
| |
| * - ``x86_fp80`` |
| - 80-bit floating-point value (X87) |
| |
| * - ``ppc_fp128`` |
| - 128-bit floating-point value (two 64-bit values) |
| |
| The binary formats of half, float, double, and fp128 correspond to the |
| IEEE-754-2008 specifications for binary16, binary32, binary64, and binary128 |
| respectively. |
| |
| X86_amx Type |
| """""""""""" |
| |
| :Overview: |
| |
| The x86_amx type represents a value held in an AMX tile register on an x86 |
| machine. The operations allowed on it are quite limited. Only a few intrinsics |
| are allowed: stride load and store, zero, and dot product. No instructions are |
| allowed for this type. There are no arguments, arrays, pointers, vectors |
| or constants of this type. |
| |
| :Syntax: |
| |
| :: |
| |
| x86_amx |
| |
| |
| X86_mmx Type |
| """""""""""" |
| |
| :Overview: |
| |
| The x86_mmx type represents a value held in an MMX register on an x86 |
| machine. The operations allowed on it are quite limited: parameters and |
| return values, load and store, and bitcast. User-specified MMX |
| instructions are represented as intrinsic or asm calls with arguments |
| and/or results of this type. There are no arrays, vectors or constants |
| of this type. |
| |
| :Syntax: |
| |
| :: |
| |
| x86_mmx |
| |
| |
| .. _t_pointer: |
| |
| Pointer Type |
| """""""""""" |
| |
| :Overview: |
| |
| The pointer type ``ptr`` is used to specify memory locations. Pointers are |
| commonly used to reference objects in memory. |
| |
| Pointer types may have an optional address space attribute defining |
| the numbered address space where the pointed-to object resides. For |
| example, ``ptr addrspace(5)`` is a pointer to address space 5. |
| In addition to integer constants, ``addrspace`` can also reference one of the |
| address spaces defined in the :ref:`datalayout string<langref_datalayout>`. |
| ``addrspace("A")`` will use the alloca address space, ``addrspace("G")`` |
| the default globals address space and ``addrspace("P")`` the program address |
| space. |
| |
| The default address space is number zero. |
| |
| The semantics of non-zero address spaces are target-specific. Memory |
| access through a non-dereferenceable pointer is undefined behavior in |
| any address space. Pointers with the bit-value 0 are only assumed to |
| be non-dereferenceable in address space 0, unless the function is |
| marked with the ``null_pointer_is_valid`` attribute. |
| |
| If an object can be proven accessible through a pointer with a |
| different address space, the access may be modified to use that |
| address space. Exceptions apply if the operation is ``volatile``. |
| |
| Prior to LLVM 15, pointer types also specified a pointee type, such as |
| ``i8*``, ``[4 x i32]*`` or ``i32 (i32*)*``. In LLVM 15, such "typed |
| pointers" are still supported under non-default options. See the |
| `opaque pointers document <OpaquePointers.html>`__ for more information. |
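| |
| A minimal sketch of pointers in the default and a non-default address space. |
| The choice of address space 1 and all names are assumptions for illustration; |
| what a non-zero address space means depends on the target: |
| |
| .. code-block:: llvm |
| |
| @g = addrspace(1) global i32 0 |
| |
| ; Load through a pointer into address space 1. |
| define i32 @load_as1() { |
| %v = load i32, ptr addrspace(1) @g |
| ret i32 %v |
| } |
| |
| ; Converting between address spaces requires an explicit addrspacecast. |
| define ptr @to_default(ptr addrspace(1) %p) { |
| %q = addrspacecast ptr addrspace(1) %p to ptr |
| ret ptr %q |
| } |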
| |
| .. _t_target_type: |
| |
| Target Extension Type |
| """"""""""""""""""""" |
| |
| :Overview: |
| |
| Target extension types represent types that must be preserved through |
| optimization, but are otherwise generally opaque to the compiler. They may be |
| used as function parameters or arguments, and in :ref:`phi <i_phi>` or |
| :ref:`select <i_select>` instructions. Some types may be also used in |
| :ref:`alloca <i_alloca>` instructions or as global values, and correspondingly |
| it is legal to use :ref:`load <i_load>` and :ref:`store <i_store>` instructions |
| on them. Full semantics for these types are defined by the target. |
| |
| The only constants that target extension types may have are ``zeroinitializer``, |
| ``undef``, and ``poison``. Other possible values for target extension types may |
| arise from target-specific intrinsics and functions. |
| |
| These types cannot be converted to other types. As such, it is not legal to use |
| them in :ref:`bitcast <i_bitcast>` instructions (as a source or target type), |
| nor is it legal to use them in :ref:`ptrtoint <i_ptrtoint>` or |
| :ref:`inttoptr <i_inttoptr>` instructions. Similarly, they are not legal to use |
| in an :ref:`icmp <i_icmp>` instruction. |
| |
| Target extension types have a name and optional type or integer parameters. The |
| meanings of name and parameters are defined by the target. When being defined in |
| LLVM IR, all of the type parameters must precede all of the integer parameters. |
| |
| Specific target extension types are registered with LLVM as having specific |
| properties. These properties can be used to restrict the type from appearing in |
| certain contexts, such as being the type of a global variable or having a |
| ``zeroinitializer`` constant be valid. A complete list of type properties may be |
| found in the documentation for ``llvm::TargetExtType::Property`` (`doxygen |
| <https://llvm.org/doxygen/classllvm_1_1TargetExtType.html>`_). |
| |
| :Syntax: |
| |
| .. code-block:: llvm |
| |
| target("label") |
| target("label", void) |
| target("label", void, i32) |
| target("label", 0, 1, 2) |
| target("label", void, i32, 0, 1, 2) |
| |
| |
| .. _t_vector: |
| |
| Vector Type |
| """"""""""" |
| |
| :Overview: |
| |
| A vector type is a simple derived type that represents a vector of |
| elements. Vector types are used when multiple primitive data are |
| operated on in parallel using a single instruction (SIMD). A vector type |
| requires a size (number of elements), an underlying primitive data type, |
| and a scalable property to represent vectors where the exact hardware |
| vector length is unknown at compile time. Vector types are considered |
| :ref:`first class <t_firstclass>`. |
| |
| :Memory Layout: |
| |
| In general vector elements are laid out in memory in the same way as |
| :ref:`array types <t_array>`. Such an analogy works fine as long as the vector |
| elements are byte sized. However, when the elements of the vector aren't byte |
| sized it gets a bit more complicated. One way to describe the layout is by |
| describing what happens when a vector such as ``<N x iM>`` is bitcast to an |
| integer type with N*M bits, and then following the rules for storing such an |
| integer to memory. |
| |
| A bitcast from a vector type to a scalar integer type will see the elements |
| being packed together (without padding). The order in which elements are |
| inserted in the integer depends on endianness. For little endian element zero |
| is put in the least significant bits of the integer, and for big endian |
| element zero is put in the most significant bits. |
| |
| Using a vector such as ``<i4 1, i4 2, i4 3, i4 5>`` as an example, together |
| with the analogy that we can replace a vector store by a bitcast followed by |
| an integer store, we get this for big endian: |
| |
| .. code-block:: llvm |
| |
| %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16 |
| |
| ; Bitcasting from a vector to an integral type can be seen as |
| ; concatenating the values: |
| ; %val now has the hexadecimal value 0x1235. |
| |
| store i16 %val, ptr %ptr |
| |
| ; In memory the content will be (8-bit addressing): |
| ; |
| ; [%ptr + 0]: 00010010 (0x12) |
| ; [%ptr + 1]: 00110101 (0x35) |
| |
| The same example for little endian: |
| |
| .. code-block:: llvm |
| |
| %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16 |
| |
| ; Bitcasting from a vector to an integral type can be seen as |
| ; concatenating the values: |
| ; %val now has the hexadecimal value 0x5321. |
| |
| store i16 %val, ptr %ptr |
| |
| ; In memory the content will be (8-bit addressing): |
| ; |
| ; [%ptr + 0]: 00100001 (0x21) |
| ; [%ptr + 1]: 01010011 (0x53) |
| |
| When ``N*M`` isn't evenly divisible by the byte size, the exact memory layout |
| is unspecified (just like it is for an integral type of the same size). This |
| is because different targets could put the padding at different positions when |
| the type size is smaller than the type's store size. |
| |
| :Syntax: |
| |
| :: |
| |
| < <# elements> x <elementtype> > ; Fixed-length vector |
| < vscale x <# elements> x <elementtype> > ; Scalable vector |
| |
| The number of elements is a constant integer value larger than 0; |
| elementtype may be any integer, floating-point or pointer type. Vectors |
| of size zero are not allowed. For scalable vectors, the total number of |
| elements is a constant multiple (called vscale) of the specified number |
| of elements; vscale is a positive integer that is unknown at compile time |
| and the same hardware-dependent constant for all scalable vectors at run |
| time. The size of a specific scalable vector type is thus constant within |
| IR, even if the exact size in bytes cannot be determined until run time. |
| |
| :Examples: |
| |
| +------------------------+----------------------------------------------------+ |
| | ``<4 x i32>`` | Vector of 4 32-bit integer values. | |
| +------------------------+----------------------------------------------------+ |
| | ``<8 x float>`` | Vector of 8 32-bit floating-point values. | |
| +------------------------+----------------------------------------------------+ |
| | ``<2 x i64>`` | Vector of 2 64-bit integer values. | |
| +------------------------+----------------------------------------------------+ |
| | ``<4 x ptr>`` | Vector of 4 pointers. | |
| +------------------------+----------------------------------------------------+ |
| | ``<vscale x 4 x i32>`` | Vector with a multiple of 4 32-bit integer values. | |
| +------------------------+----------------------------------------------------+ |
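| |
| As a sketch, the same element-wise operation can be written for fixed-length |
| and scalable vectors (function names are hypothetical): |
| |
| .. code-block:: llvm |
| |
| ; Adds four i32 lanes at once. |
| define <4 x i32> @vadd(<4 x i32> %a, <4 x i32> %b) { |
| %sum = add <4 x i32> %a, %b |
| ret <4 x i32> %sum |
| } |
| |
| ; Adds vscale * 4 lanes; the lane count is fixed only at run time. |
| define <vscale x 4 x i32> @svadd(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) { |
| %sum = add <vscale x 4 x i32> %a, %b |
| ret <vscale x 4 x i32> %sum |
| } |
| |
| ; Reads lane 0 out of a vector value. |
| define i32 @first_lane(<4 x i32> %v) { |
| %e = extractelement <4 x i32> %v, i32 0 |
| ret i32 %e |
| } |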
| |
| .. _t_label: |
| |
| Label Type |
| ^^^^^^^^^^ |
| |
| :Overview: |
| |
| The label type represents code labels. |
| |
| :Syntax: |
| |
| :: |
| |
| label |
| |
| .. _t_token: |
| |
| Token Type |
| ^^^^^^^^^^ |
| |
| :Overview: |
| |
| The token type is used when a value is associated with an instruction |
| but all uses of the value must not attempt to introspect or obscure it. |
| As such, it is not appropriate to have a :ref:`phi <i_phi>` or |
| :ref:`select <i_select>` of type token. |
| |
| :Syntax: |
| |
| :: |
| |
| token |
| |
| |
| |
| .. _t_metadata: |
| |
| Metadata Type |
| ^^^^^^^^^^^^^ |
| |
| :Overview: |
| |
| The metadata type represents embedded metadata. No derived types may be |
| created from metadata except for :ref:`function <t_function>` arguments. |
| |
| :Syntax: |
| |
| :: |
| |
| metadata |
| |
| .. _t_aggregate: |
| |
| Aggregate Types |
| ^^^^^^^^^^^^^^^ |
| |
| Aggregate Types are a subset of derived types that can contain multiple |
| member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are |
| aggregate types. :ref:`Vectors <t_vector>` are not considered to be |
| aggregate types. |
| |
| .. _t_array: |
| |
| Array Type |
| """""""""" |
| |
| :Overview: |
| |
| The array type is a very simple derived type that arranges elements |
| sequentially in memory. The array type requires a size (number of |
| elements) and an underlying data type. |
| |
| :Syntax: |
| |
| :: |
| |
| [<# elements> x <elementtype>] |
| |
| The number of elements is a constant integer value; ``elementtype`` may |
| be any type with a size. |
| |
| :Examples: |
| |
| +------------------+--------------------------------------+ |
| | ``[40 x i32]`` | Array of 40 32-bit integer values. | |
| +------------------+--------------------------------------+ |
| | ``[41 x i32]`` | Array of 41 32-bit integer values. | |
| +------------------+--------------------------------------+ |
| | ``[4 x i8]`` | Array of 4 8-bit integer values. | |
| +------------------+--------------------------------------+ |
| |
| Here are some examples of multidimensional arrays: |
| |
| +-----------------------------+----------------------------------------------------------+ |
| | ``[3 x [4 x i32]]`` | 3x4 array of 32-bit integer values. | |
| +-----------------------------+----------------------------------------------------------+ |
| | ``[12 x [10 x float]]`` | 12x10 array of single precision floating-point values. | |
| +-----------------------------+----------------------------------------------------------+ |
| | ``[2 x [3 x [4 x i16]]]`` | 2x3x4 array of 16-bit integer values. | |
| +-----------------------------+----------------------------------------------------------+ |
| |
| There is no restriction on indexing beyond the end of the array implied |
| by a static type (though there are restrictions on indexing beyond the |
| bounds of an allocated object in some cases). This means that |
| single-dimension 'variable sized array' addressing can be implemented in |
| LLVM with a zero length array type. An implementation of 'Pascal-style |
| arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for |
| example. |
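| |
| A sketch of that idea, assuming the ``i32`` field holds the element count and |
| the caller guarantees the index lies within the allocated object (the type and |
| function names are hypothetical): |
| |
| .. code-block:: llvm |
| |
| %pascal = type { i32, [0 x float] } |
| |
| ; Reads the stored length from field 0. |
| define i32 @length(ptr %arr) { |
| %len.addr = getelementptr %pascal, ptr %arr, i64 0, i32 0 |
| %len = load i32, ptr %len.addr |
| ret i32 %len |
| } |
| |
| ; Indexes past the static bound of the [0 x float] field. |
| define float @element(ptr %arr, i64 %i) { |
| %elt.addr = getelementptr %pascal, ptr %arr, i64 0, i32 1, i64 %i |
| %elt = load float, ptr %elt.addr |
| ret float %elt |
| } |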
| |
| .. _t_struct: |
| |
| Structure Type |
| """""""""""""" |
| |
| :Overview: |
| |
| The structure type is used to represent a collection of data members |
| together in memory. The elements of a structure may be any type that has |
| a size. |
| |
| Structures in memory are accessed using '``load``' and '``store``' by |
| getting a pointer to a field with the '``getelementptr``' instruction. |
| Structures in registers are accessed using the '``extractvalue``' and |
| '``insertvalue``' instructions. |
| |
| Structures may optionally be "packed" structures, which indicate that |
| the alignment of the struct is one byte, and that there is no padding |
| between the elements. In non-packed structs, padding between field types |
| is inserted as defined by the DataLayout string in the module, which is |
| required to match what the underlying code generator expects. |
| |
| Structures can either be "literal" or "identified". A literal structure |
| is defined inline with other types (e.g. ``[2 x {i32, i32}]``) whereas |
| identified types are always defined at the top level with a name. |
| Literal types are uniqued by their contents and can never be recursive |
| or opaque since there is no way to write one. Identified types can be |
| recursive, can be opaque, and are never uniqued. |
| |
| :Syntax: |
| |
| :: |
| |
| %T1 = type { <type list> } ; Identified normal struct type |
| %T2 = type <{ <type list> }> ; Identified packed struct type |
| |
| :Examples: |
| |
| +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| | ``{ i32, i32, i32 }`` | A triple of three ``i32`` values | |
| +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| | ``{ float, ptr }`` | A pair, where the first element is a ``float`` and the second element is a :ref:`pointer <t_pointer>`. | |
| +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| | ``<{ i8, i32 }>`` | A packed struct known to be 5 bytes in size. | |
| +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
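| |
| The two access styles described in the overview can be sketched as follows, |
| using a hypothetical identified type ``%pair``: |
| |
| .. code-block:: llvm |
| |
| %pair = type { i32, float } |
| |
| ; In memory: compute the field address, then load from it. |
| define float @second_in_memory(ptr %p) { |
| %addr = getelementptr inbounds %pair, ptr %p, i32 0, i32 1 |
| %v = load float, ptr %addr |
| ret float %v |
| } |
| |
| ; In registers: read a field directly from the aggregate value. |
| define float @second_in_register(%pair %agg) { |
| %v = extractvalue %pair %agg, 1 |
| ret float %v |
| } |
| |
| ; In registers: produce a new aggregate with field 0 replaced. |
| define %pair @set_first(%pair %agg, i32 %x) { |
| %r = insertvalue %pair %agg, i32 %x, 0 |
| ret %pair %r |
| } |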
| |
| .. _t_opaque: |
| |
| Opaque Structure Types |
| """""""""""""""""""""" |
| |
| :Overview: |
| |
| Opaque structure types are used to represent structure types that |
| do not have a body specified. This corresponds (for example) to the C |
| notion of a forward declared structure. They can be named (``%X``) or |
| unnamed (``%52``). |
| |
| :Syntax: |
| |
| :: |
| |
| %X = type opaque |
| %52 = type opaque |
| |
| :Examples: |
| |
| +--------------+-------------------+ |
| | ``opaque`` | An opaque type. | |
| +--------------+-------------------+ |
| |
| .. _constants: |
| |
| Constants |
| ========= |
| |
| LLVM has several different basic types of constants. This section |
| describes them all and their syntax. |
| |
| Simple Constants |
| ---------------- |
| |
| **Boolean constants** |
| The two strings '``true``' and '``false``' are both valid constants |
| of the ``i1`` type. |
| **Integer constants** |
| Standard integers (such as '4') are constants of the :ref:`integer |
| <t_integer>` type. They can be either decimal or |
| hexadecimal. Decimal integers can be prefixed with - to represent |
| negative integers, e.g. '``-1234``'. Hexadecimal integers must be |
| prefixed with either u or s to indicate whether they are unsigned |
| or signed respectively, e.g. '``u0x8000``' gives 32768, whilst |
| '``s0x8000``' gives -32768. |
| |
| Note that hexadecimal integers are sign extended from the number |
| of active bits, i.e. the bit width minus the number of leading |
| zeros. So '``s0x0001``' of type '``i16``' will be -1, not 1. |
| **Floating-point constants** |
| Floating-point constants use standard decimal notation (e.g. |
| 123.421), exponential notation (e.g. 1.23421e+2), or a more precise |
| hexadecimal notation (see below). The assembler requires the exact |
| decimal value of a floating-point constant. For example, the |
| assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating |
| decimal in binary. Floating-point constants must have a |
| :ref:`floating-point <t_floating>` type. |
| **Null pointer constants** |
| The identifier '``null``' is recognized as a null pointer constant |
| and must be of :ref:`pointer type <t_pointer>`. |
| **Token constants** |
| The identifier '``none``' is recognized as an empty token constant |
| and must be of :ref:`token type <t_token>`. |
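| |
| A few of these simple constants used as global initializers, as a sketch; the |
| global names are hypothetical: |
| |
| .. code-block:: llvm |
| |
| @enabled = global i1 true ; boolean constant |
| @offset = global i32 -1234 ; negative decimal integer |
| @mask = global i32 u0x8000 ; unsigned hexadecimal integer, 32768 |
| @ratio = global float 1.25 ; exactly representable decimal |
| @scaled = global double 1.23421e+2 ; exponential notation |
| @nothing = global ptr null ; null pointer constant |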
| |
| The one non-intuitive notation for constants is the hexadecimal form of |
| floating-point constants. For example, the form |
|