| .. _amdgpu-dwarf-extensions-for-heterogeneous-debugging: |
| |
| ******************************************** |
| DWARF Extensions For Heterogeneous Debugging |
| ******************************************** |
| |
| .. contents:: |
| :local: |
| |
| .. warning:: |
| |
| This document describes **provisional extensions** to DWARF Version 5 |
| [:ref:`DWARF <amdgpu-dwarf-DWARF>`] to support heterogeneous debugging. It is |
| not currently fully implemented and is subject to change. |
| |
| .. _amdgpu-dwarf-introduction: |
| |
| 1. Introduction |
| =============== |
| |
| AMD [:ref:`AMD <amdgpu-dwarf-AMD>`] has been working on supporting heterogeneous |
| computing. A heterogeneous computing program can be written in a high level |
| language such as C++ or Fortran with OpenMP pragmas, OpenCL, or HIP (a portable |
| C++ programming environment for heterogeneous computing [:ref:`HIP |
| <amdgpu-dwarf-HIP>`]). A heterogeneous compiler and runtime allows a program to |
| execute on multiple devices within the same native process. Devices could |
| include CPUs, GPUs, DSPs, FPGAs, or other special purpose accelerators. |
| Currently HIP programs execute on systems with CPUs and GPUs. |
| |
| The AMD [:ref:`AMD <amdgpu-dwarf-AMD>`] ROCm platform [:ref:`AMD-ROCm |
| <amdgpu-dwarf-AMD-ROCm>`] is an implementation of the industry standard for |
| heterogeneous computing devices defined by the Heterogeneous System Architecture |
| (HSA) Foundation [:ref:`HSA <amdgpu-dwarf-HSA>`]. It is open sourced and |
| includes contributions to open source projects such as LLVM [:ref:`LLVM |
| <amdgpu-dwarf-LLVM>`] for compilation and GDB for debugging [:ref:`GDB |
| <amdgpu-dwarf-GDB>`]. |
| |
| The LLVM compiler has upstream support for commercially available AMD GPU |
| hardware (AMDGPU) [:ref:`AMDGPU-LLVM <amdgpu-dwarf-AMDGPU-LLVM>`]. The open |
| source ROCgdb [:ref:`AMD-ROCgdb <amdgpu-dwarf-AMD-ROCgdb>`] GDB based debugger |
| also has support for AMDGPU which is being upstreamed. Support for AMDGPU is |
| also being added by third parties to the GCC [:ref:`GCC <amdgpu-dwarf-GCC>`] |
| compiler and the Perforce TotalView HPC Debugger [:ref:`Perforce-TotalView |
| <amdgpu-dwarf-Perforce-TotalView>`]. |
| |
| To support debugging heterogeneous programs, several features that are not |
| provided by current DWARF Version 5 [:ref:`DWARF <amdgpu-dwarf-DWARF>`] have |
| been identified. The :ref:`amdgpu-dwarf-extensions` section gives an overview of |
| the extensions devised to address the missing features. The extensions seek to |
| be general in nature and backwards compatible with DWARF Version 5. Their goal |
| is to be applicable to meeting the needs of any heterogeneous system and not be |
| vendor or architecture specific. That is followed by appendix |
| :ref:`amdgpu-dwarf-changes-relative-to-dwarf-version-5` which contains the |
| textual changes for the extensions relative to the DWARF Version 5 standard. |
| There are a number of notes included that raise open questions, or provide |
| alternative approaches that may be worth considering. Then appendix |
| :ref:`amdgpu-dwarf-further-examples` links to the AMD GPU specific usage of the |
| extensions that includes an example. Finally, appendix |
| :ref:`amdgpu-dwarf-references` provides references to further information. |
| |
| .. _amdgpu-dwarf-extensions: |
| |
| 2. Extensions |
| ============= |
| |
| The extensions continue to evolve through collaboration with many individuals and |
| active prototyping within the GDB debugger and LLVM compiler. Input has also |
| been very much appreciated from the developers working on the Perforce TotalView |
| HPC Debugger and GCC compiler. |
| |
| The inputs provided and insights gained so far have been incorporated into this |
| current version. The plan is to participate in upstreaming the work and |
| addressing any feedback. If there is general interest then some or all of these |
| extensions could be submitted as future DWARF standard proposals. |
| |
| The general principles in designing the extensions have been: |
| |
| 1. Be backwards compatible with the DWARF Version 5 [:ref:`DWARF |
| <amdgpu-dwarf-DWARF>`] standard. |
| |
| 2. Be vendor and architecture neutral. They are intended to apply to other |
| heterogeneous hardware devices including GPUs, DSPs, FPGAs, and other |
| specialized hardware. These collectively include similar characteristics and |
| requirements as AMDGPU devices. |
| |
| 3. Provide improved optimization support for non-GPU code. For example, some |
| extensions apply to traditional CPU hardware that supports large vector |
| registers. Compilers can map source languages, and source language |
| extensions, that describe large scale parallel execution, onto the lanes of |
| the vector registers. This is common in programming languages used in ML and |
| HPC. |
| |
| 4. Fully define well-formed DWARF in a consistent style based on the DWARF |
| Version 5 specification. |
| |
| It is possible that some of the generalizations may also benefit other DWARF |
| issues that have been raised. |
| |
| The remainder of this section enumerates the extensions and provides motivation |
| for each in terms of heterogeneous debugging. |
| |
| .. _amdgpu-dwarf-allow-location-description-on-the-dwarf-evaluation-stack: |
| |
| 2.1 Allow Location Description on the DWARF Expression Stack |
| ------------------------------------------------------------ |
| |
| DWARF Version 5 does not allow location descriptions to be entries on the DWARF |
| expression stack. They can only be the final result of the evaluation of a DWARF |
| expression. However, by allowing a location description to be a first-class |
| entry on the DWARF expression stack it becomes possible to compose expressions |
| containing both values and location descriptions naturally. It allows objects to |
| be located in any kind of memory address space, in registers, be implicit |
| values, be undefined, or a composite of any of these. |
| |
| By extending DWARF carefully, all existing DWARF expressions can retain their |
| current semantic meaning. DWARF has implicit conversions that convert from a |
| value that represents an address in the default address space to a memory |
| location description. This can be extended to allow a default address space |
| memory location description to be implicitly converted back to its address |
| value. This allows all DWARF Version 5 expressions to retain their same meaning, |
| while enabling the ability to explicitly create memory location descriptions in |
| non-default address spaces and generalizing the power of composite location |
| descriptions to any kind of location description. |
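| |
| As a rough illustration of the implicit conversions described above, the |
| following Python sketch models stack entries that are either values or memory |
| location descriptions. The names ``Value``, ``MemoryLocation``, |
| ``as_location``, ``as_value``, and ``DEFAULT_ASPACE`` are hypothetical and are |
| not part of DWARF or of any consumer implementation. |
| |

```python
from dataclasses import dataclass
from typing import Union

DEFAULT_ASPACE = 0  # assumed identifier for the default address space


@dataclass(frozen=True)
class Value:
    """A stack entry holding a generic-type value."""
    bits: int


@dataclass(frozen=True)
class MemoryLocation:
    """A memory location description: an address within an address space."""
    address_space: int
    address: int


Entry = Union[Value, MemoryLocation]


def as_location(entry: Entry) -> MemoryLocation:
    """Treat a value as an address in the default address space."""
    if isinstance(entry, MemoryLocation):
        return entry
    return MemoryLocation(DEFAULT_ASPACE, entry.bits)


def as_value(entry: Entry) -> Value:
    """Convert a default address space memory location back to its address."""
    if isinstance(entry, Value):
        return entry
    if entry.address_space != DEFAULT_ASPACE:
        # Only the default address space converts implicitly in this sketch.
        raise ValueError("no implicit conversion for a non-default address space")
    return Value(entry.address)
```

| |
| In this sketch an existing expression that pushes an address and then uses it |
| as a location behaves as before, while a location in a non-default address |
| space has to be handled explicitly. |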
| |
| For those familiar with the definition of location descriptions in DWARF Version |
| 5, the definitions in these extensions are presented differently, but do in |
| fact define the same concept with the same fundamental semantics. However, they |
| do so in a way that allows the concept to extend to support address spaces, |
| bit addressing, the ability for composite location descriptions to be composed |
| of any kind of location description, and the ability to support objects located |
| at multiple places. Collectively these changes expand the set of architectures |
| that can be supported and improve support for optimized code. |
| |
| Several approaches were considered, and the one presented, together with the |
| extensions it enables, appears to be the simplest and cleanest one that offers |
| the greatest improvement of DWARF's ability to support debugging optimized GPU |
| and non-GPU code. Examining the GDB debugger and LLVM compiler, it appears only |
| to require modest changes as they both already have to support general use of |
| location descriptions. It is anticipated that this will also be the case for |
| other debuggers and compilers. |
| |
| GDB has been modified to evaluate DWARF Version 5 expressions with location |
| descriptions as stack entries and with implicit conversions. All GDB tests have |
| passed, except one that turned out to be an invalid test case by DWARF Version 5 |
| rules. The code in GDB actually became simpler as all evaluation is done on a |
| single stack and there was no longer a need to maintain a separate structure for |
| the location description results. This gives confidence in backwards |
| compatibility. |
| |
| See :ref:`amdgpu-dwarf-expressions` and nested sections. |
| |
| This extension is separately described at *Allow Location Descriptions on the |
| DWARF Expression Stack* [:ref:`AMDGPU-DWARF-LOC |
| <amdgpu-dwarf-AMDGPU-DWARF-LOC>`]. |
| |
| 2.2 Generalize CFI to Allow Any Location Description Kind |
| --------------------------------------------------------- |
| |
| CFI describes restoring callee saved registers that are spilled. Currently CFI |
| only allows a location description that is a register, memory address, or |
| implicit location description. AMDGPU optimized code may spill scalar registers |
| into portions of vector registers. This requires extending CFI to allow any |
| location description kind to be supported. |
| |
| See :ref:`amdgpu-dwarf-call-frame-information`. |
| |
| 2.3 Generalize DWARF Operation Expressions to Support Multiple Places |
| --------------------------------------------------------------------- |
| |
| In DWARF Version 5 a location description is defined as a single location |
| description or a location list. A location list is defined as either |
| effectively an undefined location description or as one or more single |
| location descriptions to describe an object with multiple places. |
| |
| With |
| :ref:`amdgpu-dwarf-allow-location-description-on-the-dwarf-evaluation-stack`, |
| the ``DW_OP_push_object_address`` and ``DW_OP_call*`` operations can put a |
| location description on the stack. Furthermore, debugging information entry |
| attributes such as ``DW_AT_data_member_location``, ``DW_AT_use_location``, and |
| ``DW_AT_vtable_elem_location`` are defined as pushing a location description on |
| the expression stack before evaluating the expression. |
| |
| DWARF Version 5 only allows the stack to contain values and so only a single |
| memory address can be on the stack. This makes these operations and attributes |
| incapable of handling location descriptions with multiple places, or places |
| other than memory. |
| |
| Since |
| :ref:`amdgpu-dwarf-allow-location-description-on-the-dwarf-evaluation-stack` |
| allows the stack to contain location descriptions, the operations are |
| generalized to support location descriptions that can have multiple places. This |
| is backwards compatible with DWARF Version 5 and allows objects with multiple |
| places to be supported. For example, the expression that describes how to access |
| the field of an object can be evaluated with a location description that has |
| multiple places and will result in a location description with multiple places. |
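| |
| The field access example can be pictured with a small Python sketch in which a |
| location description is modelled as a list of places. The representation and |
| the name ``member_location`` are assumptions made for illustration only. |
| |

```python
from typing import List, Tuple

# A single place: (name of the storage, byte offset within that storage).
SinglePlace = Tuple[str, int]


def member_location(object_places: List[SinglePlace],
                    member_offset: int) -> List[SinglePlace]:
    """Apply a member offset to every place the object currently occupies."""
    return [(storage, offset + member_offset)
            for storage, offset in object_places]
```

| |
| An object that lives both in memory and in a register yields a member location |
| that also has two places, one in each kind of storage. |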
| |
| With this change, the separate DWARF Version 5 sections that described DWARF |
| expressions and location lists are unified into a single section that describes |
| DWARF expressions in general. This unification is a natural consequence of, and |
| a necessity of, allowing location descriptions to be part of the evaluation |
| stack. |
| |
| See :ref:`amdgpu-dwarf-location-description`. |
| |
| 2.4 Generalize Offsetting of Location Descriptions |
| -------------------------------------------------- |
| |
| The ``DW_OP_plus`` and ``DW_OP_minus`` operations can be defined to operate on a |
| memory location description in the default target architecture specific address |
| space and a generic type value to produce an updated memory location |
| description. This allows them to continue to be used to offset an address. |
| |
| To generalize offsetting to any location description, including location |
| descriptions that describe when bytes are in registers, are implicit, or a |
| composite of these, the ``DW_OP_LLVM_offset``, ``DW_OP_LLVM_offset_uconst``, and |
| ``DW_OP_LLVM_bit_offset`` offset operations are added. |
| |
| The offset operations can operate on location storage of any size. For example, |
| implicit location storage could be any number of bits in size. It is simpler to |
| define offsets that exceed the size of the location storage as being an |
| evaluation error, than having to force an implementation to support potentially |
| infinite precision offsets to allow it to correctly track a series of positive |
| and negative offsets that may transiently overflow or underflow, but end up in |
| range. This is simple for the arithmetic operations as they are defined in terms |
| of two's complement arithmetic on a base type of a fixed size. Therefore, the |
| offset operations define that integer overflow is ill-formed. This is in |
| contrast to the ``DW_OP_plus``, ``DW_OP_plus_uconst``, and ``DW_OP_minus`` |
| arithmetic operations which define that it causes wrap-around. |
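| |
| The contrast between the two behaviours can be sketched in Python. The function |
| names, the 64-bit generic type size, and the exact bounds check are assumptions |
| chosen for illustration and are not taken from the normative text. |
| |

```python
GENERIC_BITS = 64  # assumed size of the generic type


def arithmetic_plus(lhs: int, rhs: int) -> int:
    """Fixed-size two's complement addition: overflow wraps around."""
    return (lhs + rhs) % (1 << GENERIC_BITS)


def offset_location(storage_bits: int, bit_offset: int, delta_bits: int) -> int:
    """Offset within location storage: leaving the storage is rejected."""
    new_offset = bit_offset + delta_bits
    if not 0 <= new_offset < storage_bits:
        raise ValueError("offset falls outside the location storage")
    return new_offset
```

| |
| Because each offset is checked against the storage size as it is applied, a |
| consumer never needs to track an out-of-range intermediate offset. |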
| |
| Having the offset operations allows ``DW_OP_push_object_address`` to push a |
| location description that may be in a register, or be an implicit value. The |
| DWARF expression of ``DW_TAG_ptr_to_member_type`` can use the offset operations |
| without regard to what kind of location description was pushed. |
| |
| Since |
| :ref:`amdgpu-dwarf-allow-location-description-on-the-dwarf-evaluation-stack` has |
| generalized location storage to be bit indexable, ``DW_OP_LLVM_bit_offset`` |
| generalizes DWARF to work with bit fields. This is generally not possible in |
| DWARF Version 5. |
| |
| The ``DW_OP_*piece`` operations only allow literal indices. A way to use a |
| computed offset of an arbitrary location description (such as a vector register) |
| is required. The offset operations provide this ability since they can be used |
| to compute a location description on the stack. |
| |
| It could be possible to define ``DW_OP_plus``, ``DW_OP_plus_uconst``, and |
| ``DW_OP_minus`` to operate on location descriptions to avoid needing |
| ``DW_OP_LLVM_offset`` and ``DW_OP_LLVM_offset_uconst``. However, this is not |
| proposed since currently the arithmetic operations are defined to require values |
| of the same base type and produce a result with the same base type. Allowing |
| these operations to act on location descriptions would permit the first operand |
| to be a location description and the second operand to be an integral value |
| type, or vice versa, and return a location description. This complicates the |
| rules for implicit conversions between default address space memory location |
| descriptions and generic base type values. Currently the rules would convert |
| such a location description to the memory address value and then perform two's |
| complement wrap-around arithmetic. If the result was used as a location |
| description, it would be implicitly converted back to a default address space |
| memory location description. This is different to the overflow rules on location |
| descriptions. To allow control, an operation that converts a memory location |
| description to an address integral type value would be required. Keeping a |
| separation of location description operations and arithmetic operations avoids |
| this semantic complexity. |
| |
| See ``DW_OP_LLVM_offset``, ``DW_OP_LLVM_offset_uconst``, and |
| ``DW_OP_LLVM_bit_offset`` in |
| :ref:`amdgpu-dwarf-general-location-description-operations`. |
| |
| 2.5 Generalize Creation of Undefined Location Descriptions |
| ---------------------------------------------------------- |
| |
| Current DWARF uses an empty expression to indicate an undefined location |
| description. Since |
| :ref:`amdgpu-dwarf-allow-location-description-on-the-dwarf-evaluation-stack` |
| allows location descriptions to be created on the stack, it is necessary to have |
| an explicit way to specify an undefined location description. |
| |
| For example, the ``DW_OP_LLVM_select_bit_piece`` (see |
| :ref:`amdgpu-dwarf-support-for-divergent-control-flow-of-simt-hardware`) |
| operation takes more than one location description on the stack. Without this |
| ability, it is not possible to specify that a particular one of the input |
| location descriptions is undefined. |
| |
| See the ``DW_OP_LLVM_undefined`` operation in |
| :ref:`amdgpu-dwarf-undefined-location-description-operations`. |
| |
| 2.6 Generalize Creation of Composite Location Descriptions |
| ---------------------------------------------------------- |
| |
| To allow composition of composite location descriptions, an explicit operation |
| that indicates the end of the definition of a composite location description is |
| required. This can be implied if the end of a DWARF expression is reached, |
| allowing current DWARF expressions to remain legal. |
| |
| See ``DW_OP_LLVM_piece_end`` in |
| :ref:`amdgpu-dwarf-composite-location-description-operations`. |
| |
| 2.7 Generalize DWARF Base Objects to Allow Any Location Description Kind |
| ------------------------------------------------------------------------ |
| |
| The number of registers and the cost of memory operations are much higher for |
| AMDGPU than a typical CPU. The compiler attempts to optimize whole variables and |
| arrays into registers. |
| |
| Currently DWARF only allows ``DW_OP_push_object_address`` and related operations |
| to work with a global memory location. To support AMDGPU optimized code it is |
| required to generalize DWARF to allow any location description to be used. This |
| allows registers, or composite location descriptions that may be a mixture of |
| memory, registers, or even implicit values. |
| |
| See ``DW_OP_push_object_address`` in |
| :ref:`amdgpu-dwarf-general-location-description-operations`. |
| |
| 2.8 General Support for Address Spaces |
| -------------------------------------- |
| |
| AMDGPU needs to be able to describe addresses that are in different kinds of |
| memory. Optimized code may need to describe a variable that resides in pieces |
| that are in different kinds of storage which may include parts of registers, |
| memory that is in a mixture of memory kinds, implicit values, or be undefined. |
| |
| DWARF has the concept of segment addresses. However, the segment cannot be |
| specified within a DWARF expression, which is only able to specify the offset |
| portion of a segment address. The segment index is only provided by the entity |
| that specifies the DWARF expression. Therefore, the segment index is a property |
| that can only be put on complete objects, such as a variable. That makes it only |
| suitable for describing an entity (such as variable or subprogram code) that is |
| in a single kind of memory. |
| |
| AMDGPU uses multiple address spaces. For example, a variable may be allocated in |
| a register that is partially spilled to the call stack which is in the private |
| address space, and partially spilled to the local address space. DWARF mentions |
| address spaces, for example as an argument to the ``DW_OP_xderef*`` operations. |
| A new section that defines address spaces is added (see |
| :ref:`amdgpu-dwarf-address-spaces`). |
| |
| A new attribute ``DW_AT_LLVM_address_space`` is added to pointer and reference |
| types (see :ref:`amdgpu-dwarf-type-modifier-entries`). This allows the compiler |
| to specify which address space is being used to represent the pointer or |
| reference type. |
| |
| DWARF uses the concept of an address in many expression operations but does not |
| define how it relates to address spaces. For example, |
| ``DW_OP_push_object_address`` pushes the address of an object. Other contexts |
| implicitly push an address on the stack before evaluating an expression. For |
| example, the ``DW_AT_use_location`` attribute of the |
| ``DW_TAG_ptr_to_member_type``. The expression belongs to a source language type |
| which may apply to objects allocated in different kinds of storage. Therefore, |
| it is desirable that the expression that uses the address can do so without |
| regard to what kind of storage it specifies, including the address space of a |
| memory location description. For example, a pointer to member value may want to |
| be applied to an object that may reside in any address space. |
| |
| The DWARF ``DW_OP_xderef*`` operations allow a value to be converted into an |
| address of a specified address space which is then read. But it provides no |
| way to create a memory location description for an address in the non-default |
| address space. For example, AMDGPU variables can be allocated in the local |
| address space at a fixed address. |
| |
| The ``DW_OP_LLVM_form_aspace_address`` (see |
| :ref:`amdgpu-dwarf-memory-location-description-operations`) operation is defined |
| to create a memory location description from an address and address space. It |
| can be used to specify the location of a variable that is allocated in a |
| specific address space. This allows the size of addresses in an address space to |
| be larger than the generic type. It also allows a consumer great implementation |
| freedom. It allows the implicit conversion back to a value to be limited only to |
| the default address space to maintain compatibility with DWARF Version 5. For |
| other address spaces the producer can use the new operations that explicitly |
| specify the address space. |
| |
| In contrast, if the ``DW_OP_LLVM_form_aspace_address`` operation had been |
| defined to produce a value, and an implicit conversion to a memory location |
| description was defined, then it would be limited to the size of the generic |
| type (which matches the size of the default address space). An implementation |
| would likely have to use *reserved ranges* of values to represent different |
| address spaces. Such a value would likely not match any address value in the |
| actual hardware. That would require the consumer to have special treatment for |
| such values. |
| |
| ``DW_OP_breg*`` treats the register as containing an address in the default |
| address space. A ``DW_OP_LLVM_aspace_bregx`` (see |
| :ref:`amdgpu-dwarf-memory-location-description-operations`) operation is added |
| to allow the address space of the address held in a register to be specified. |
| |
| Similarly, ``DW_OP_implicit_pointer`` treats its implicit pointer value as being |
| in the default address space. A ``DW_OP_LLVM_aspace_implicit_pointer`` |
| (:ref:`amdgpu-dwarf-implicit-location-description-operations`) operation is |
| added to allow the address space to be specified. |
| |
| Almost all uses of addresses in DWARF are limited to defining location |
| descriptions, or to be dereferenced to read memory. The exception is |
| ``DW_CFA_val_offset`` which uses the address to set the value of a register. In |
| order to support address spaces, the CFA DWARF expression is defined to be a |
| memory location description. This allows it to specify an address space which is |
| used to convert the offset address back to an address in that address space. See |
| :ref:`amdgpu-dwarf-call-frame-information`. |
| |
| This approach of extending memory location descriptions to support address |
| spaces allows all existing DWARF Version 5 expressions to have the identical |
| semantics. It allows the compiler to explicitly specify the address space it is |
| using. For example, a compiler could choose to access private memory in a |
| swizzled manner when mapping a source language thread to the lane of a wavefront |
| in a SIMT manner. Or a compiler could choose to access it in an unswizzled |
| manner if mapping the same language with the wavefront being the thread. |
| |
| It also allows the compiler to mix the address space it uses to access private |
| memory. For example, for SIMT it can still spill entire vector registers in an |
| unswizzled manner, while using a swizzled private memory for SIMT variable |
| access. |
| |
| This approach also allows memory location descriptions for different address |
| spaces to be combined using the regular ``DW_OP_*piece`` operations. |
| |
| Location descriptions are an abstraction of storage. They give freedom to the |
| consumer on how to implement them. They allow the address space to encode lane |
| information so they can be used to read memory with only the memory location |
| description and no extra information. The same set of operations can operate on |
| locations independent of their kind of storage. The ``DW_OP_deref*`` operations |
| therefore can be used on any storage kind, including memory location |
| descriptions of different address spaces. Therefore, the ``DW_OP_xderef*`` |
| operations are unnecessary, except to become a more compact way to encode a |
| non-default address space address followed by dereferencing it. See |
| :ref:`amdgpu-dwarf-general-operations`. |
| |
| 2.9 Support for Vector Base Types |
| --------------------------------- |
| |
| The vector registers of the AMDGPU are represented as their full wavefront |
| size, meaning the wavefront size times the dword size. This reflects the |
| actual hardware and allows the compiler to generate DWARF for languages that |
| map a thread to the complete wavefront. It also allows more efficient DWARF to |
| be generated to describe the CFI as only a single expression is required for |
| the whole vector register, rather than a separate expression for each lane's |
| dword of the vector register. It also allows the compiler to produce DWARF |
| that indexes the vector register if it spills scalar registers into portions |
| of a vector register. |
| |
| Since DWARF stack value entries have a base type and AMDGPU registers are a |
| vector of dwords, the ability to specify that a base type is a vector is |
| required. |
| |
| See ``DW_AT_LLVM_vector_size`` in :ref:`amdgpu-dwarf-base-type-entries`. |
| |
| .. _amdgpu-dwarf-operation-to-create-vector-composite-location-descriptions: |
| |
| 2.10 DWARF Operations to Create Vector Composite Location Descriptions |
| ---------------------------------------------------------------------- |
| |
| AMDGPU optimized code may spill vector registers to non-global address space |
| memory, and this spilling may be done only for SIMT lanes that are active on |
| entry to the subprogram. |
| |
| To support this, a composite location description that can be created as a |
| masked select is required. In addition, an operation that creates a composite |
| location description that is a vector on another location description is needed. |
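| |
| A masked select can be pictured with the following Python sketch, which builds |
| a list of per-lane parts from a lane mask. The function name, the operand |
| order, and the ``(source, bit offset)`` representation of a part are |
| assumptions made for illustration; the normative definition is in the section |
| referenced below. |
| |

```python
from typing import List, Tuple

# One part of the composite: (name of the source location, bit offset into it).
Part = Tuple[str, int]


def select_by_mask(mask: int, zero_loc: str, one_loc: str,
                   part_bits: int, count: int) -> List[Part]:
    """Pick each lane's part from one_loc if its mask bit is set, else zero_loc."""
    parts = []
    for lane in range(count):
        source = one_loc if (mask >> lane) & 1 else zero_loc
        parts.append((source, lane * part_bits))
    return parts
```

| |
| For example, with a mask of ``0b0101`` and four 32-bit lanes, lanes 0 and 2 are |
| taken from the spill memory and lanes 1 and 3 from the vector register. |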
| |
| An example that uses these operations is referenced in the |
| :ref:`amdgpu-dwarf-further-examples` appendix. |
| |
| See ``DW_OP_LLVM_select_bit_piece`` and ``DW_OP_LLVM_extend`` in |
| :ref:`amdgpu-dwarf-composite-location-description-operations`. |
| |
| 2.11 DWARF Operation to Access Call Frame Entry Registers |
| --------------------------------------------------------- |
| |
| As described in |
| :ref:`amdgpu-dwarf-operation-to-create-vector-composite-location-descriptions`, |
| a DWARF expression involving the set of SIMT lanes active on entry to a |
| subprogram is required. The SIMT active lane mask may be held in a register that |
| is modified as the subprogram executes. However, its value may be saved on entry |
| to the subprogram. |
| |
| The Call Frame Information (CFI) already encodes such register saving, so it is |
| more efficient to provide an operation to return the location of a saved |
| register than have to generate a loclist to describe the same information. This |
| is now possible since |
| :ref:`amdgpu-dwarf-allow-location-description-on-the-dwarf-evaluation-stack` |
| allows location descriptions on the stack. |
| |
| See ``DW_OP_LLVM_call_frame_entry_reg`` in |
| :ref:`amdgpu-dwarf-general-location-description-operations` and |
| :ref:`amdgpu-dwarf-call-frame-information`. |
| |
| 2.12 Support for Source Languages Mapped to SIMT Hardware |
| --------------------------------------------------------- |
| |
| If the source language is mapped onto the AMDGPU wavefronts in a SIMT manner, |
| then the variable DWARF location expressions must compute the location for a |
| single lane of the wavefront. Therefore, a DWARF operation is required to denote |
| the current lane, much like ``DW_OP_push_object_address`` denotes the current |
| object. See ``DW_OP_LLVM_push_lane`` in :ref:`amdgpu-dwarf-literal-operations`. |
| |
| In addition, a way is needed for the compiler to communicate how many source |
| language threads of execution are mapped to a target architecture thread's SIMT |
| lanes. See ``DW_AT_LLVM_lanes`` in :ref:`amdgpu-dwarf-low-level-information`. |
| |
| .. _amdgpu-dwarf-support-for-divergent-control-flow-of-simt-hardware: |
| |
| 2.13 Support for Divergent Control Flow of SIMT Hardware |
| -------------------------------------------------------- |
| |
| If the source language is mapped onto the AMDGPU wavefronts in a SIMT manner the |
| compiler can use the AMDGPU execution mask register to control which lanes are |
| active. To describe the conceptual location of non-active lanes requires an |
| attribute that has an expression that computes the source location PC for each |
| lane. |
| |
| For efficiency, the expression calculates the source location for the wavefront |
| as a whole. This can be done using the ``DW_OP_LLVM_select_bit_piece`` (see |
| :ref:`amdgpu-dwarf-operation-to-create-vector-composite-location-descriptions`) |
| operation. |
| |
| The AMDGPU may update the execution mask to perform whole wavefront operations. |
| Therefore, there is a need for an attribute that computes the current active |
| lane mask. This can have an expression that may evaluate to the SIMT active lane |
| mask register or to a saved mask when in whole wavefront execution mode. |
| |
| An example that uses these attributes is referenced in the |
| :ref:`amdgpu-dwarf-further-examples` appendix. |
| |
| See ``DW_AT_LLVM_lane_pc`` and ``DW_AT_LLVM_active_lane`` in |
| :ref:`amdgpu-dwarf-composite-location-description-operations`. |
| |
| 2.14 Define Source Language Memory Classes |
| ------------------------------------------- |
| |
| AMDGPU supports languages, such as OpenCL [:ref:`OpenCL <amdgpu-dwarf-OpenCL>`], |
| that define source language memory classes. Support is added to define language |
| specific memory spaces so they can be used in a consistent way by consumers. |
| |
| Support for using memory spaces in defining source language types and data |
| object allocation is also added. |
| |
| See :ref:`amdgpu-dwarf-memory-spaces`. |
| |
| 2.15 Define Augmentation Strings to Support Multiple Extensions |
| --------------------------------------------------------------- |
| |
| A ``DW_AT_LLVM_augmentation`` attribute is added to a compilation unit debugging |
| information entry to indicate that there is additional target architecture |
| specific information in the debugging information entries of that compilation |
| unit. This allows a consumer to know what extensions are present in the |
| debugging information entries as is possible with the augmentation string of |
| other sections. |
| |
| The format that should be used for an augmentation string is also recommended. |
| This allows a consumer to parse the string when it contains information from |
| multiple vendors. Augmentation strings occur in the ``DW_AT_LLVM_augmentation`` |
| attribute, in the lookup by name table, and in the CFI Common Information Entry |
| (CIE). |
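| |
| The recommended format itself is not reproduced in this section. Purely as an |
| illustration of how a consumer might split a string that carries entries from |
| several vendors, the Python sketch below assumes a bracketed |
| ``[vendor:vX.Y:options]`` form with the options part optional. The format, the |
| ``Augmentation`` type, and ``parse_augmentation`` are all assumptions here. |
| |

```python
import re
from typing import List, NamedTuple


class Augmentation(NamedTuple):
    vendor: str
    major: int
    minor: int
    options: str


# Assumed entry shape: [vendor:vX.Y] or [vendor:vX.Y:options]
_ENTRY = re.compile(r"\[([^\[\]:]+):v(\d+)\.(\d+)(?::([^\[\]]*))?\]")


def parse_augmentation(text: str) -> List[Augmentation]:
    """Split an augmentation string into its per-vendor entries."""
    entries = []
    pos = 0
    while pos < len(text):
        match = _ENTRY.match(text, pos)
        if match is None:
            raise ValueError(f"malformed augmentation string at offset {pos}")
        vendor, major, minor, options = match.groups()
        entries.append(Augmentation(vendor, int(major), int(minor), options or ""))
        pos = match.end()
    return entries
```

| |
| A consumer written this way can skip entries from vendors it does not |
| recognize while still using the ones it does. |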
| |
| See :ref:`amdgpu-dwarf-full-and-partial-compilation-unit-entries`, |
| :ref:`amdgpu-dwarf-name-index-section-header`, and |
| :ref:`amdgpu-dwarf-structure_of-call-frame-information`. |
| |
| 2.16 Support Embedding Source Text for Online Compilation |
| --------------------------------------------------------- |
| |
| AMDGPU supports programming languages that include online compilation where the |
| source text may be created at runtime. For example, the OpenCL and HIP language |
| runtimes support online compilation. To support this, a way to embed the source |
| text in the debug information is provided. |
| |
| See :ref:`amdgpu-dwarf-line-number-information`. |
| |
| 2.17 Allow MD5 Checksums to be Optionally Present |
| ------------------------------------------------- |
| |
| In DWARF Version 5 the file timestamp and file size can be optional, but if the |
| MD5 checksum is present it must be valid for all files. This is a problem if |
| using link time optimization to combine compilation units where some have MD5 |
| checksums and some do not. Therefore, support to allow MD5 checksums to be |
| optionally present in the line table is added. |
| |
| See :ref:`amdgpu-dwarf-line-number-information`. |
| |
| 2.18 Add the HIP Programming Language |
| ------------------------------------- |
| |
| The HIP programming language [:ref:`HIP <amdgpu-dwarf-HIP>`], which is supported |
| by the AMDGPU, is added. |
| |
| See :ref:`amdgpu-dwarf-language-names-table`. |
| |
| 2.19 Support for Source Language Optimizations that Result in Concurrent Iteration Execution |
| -------------------------------------------------------------------------------------------- |
| |
| A compiler can perform loop optimizations that result in the generated code |
| executing multiple iterations concurrently. For example, software pipelining |
| schedules multiple iterations in an interleaved fashion to allow the |
| instructions of one iteration to hide the latencies of the instructions of |
| another iteration. Another example is vectorization that can exploit SIMD |
| hardware to allow a single instruction to execute multiple iterations using |
| vector registers. |
| |
| Note that although this is similar to SIMT execution, the way a client debugger |
| uses the information is fundamentally different. In SIMT execution the debugger |
| needs to present the concurrent execution as distinct source language threads |
| that the user can list and switch focus between. With iteration concurrency |
| optimizations, such as software pipelining and vectorized SIMD, the debugger |
| must not present the concurrency as distinct source language threads. Instead, |
| it must inform the user that multiple loop iterations are executing in parallel |
| and allow the user to select between them. |
| |
| In general, SIMT execution fixes the number of concurrent executions per target |
| architecture thread. However, both software pipelining and SIMD vectorization |
| may vary the number of concurrent iterations for different loops executed by a |
| single source language thread. |
| |
| It is possible for the compiler to use both SIMT concurrency and iteration |
| concurrency techniques in the code of a single source language thread. |
| |
| Therefore, a DWARF operation is required to denote the current concurrent |
| iteration instance, much like ``DW_OP_push_object_address`` denotes the current |
| object. See ``DW_OP_LLVM_push_iteration`` in |
| :ref:`amdgpu-dwarf-literal-operations`. |
| |
| In addition, a way is needed for the compiler to communicate how many source |
| language loop iterations are executing concurrently. See |
| ``DW_AT_LLVM_iterations`` in :ref:`amdgpu-dwarf-low-level-information`. |
| |
| 2.20 DWARF Operation to Create Runtime Overlay Composite Location Description |
| ----------------------------------------------------------------------------- |
| |
| It is common in SIMD vectorization for the compiler to generate code that |
| promotes portions of an array into vector registers. For example, if the |
| hardware has vector registers with 8 elements, and 8 wide SIMD instructions, the |
| compiler may vectorize a loop so that it executes 8 iterations concurrently for |
| each vectorized loop iteration. |
| |
| On the first iteration of the generated vectorized loop, iterations 0 to 7 of |
| the source language loop will be executed using SIMD instructions. Then on the |
| next iteration of the generated vectorized loop, iterations 8 to 15 will be |
| executed, and so on. |
| |
| If the source language loop accesses an array element based on the loop |
| iteration index, the compiler may read the element into a register for the |
| duration of that iteration. On the next iteration it will read the next element into |
| the register, and so on. With SIMD, this generalizes to the compiler reading |
| array elements 0 to 7 into a vector register on the first vectorized loop |
| iteration, then array elements 8 to 15 on the next iteration, and so on. |
| |
| The DWARF location description for the array needs to express that all elements |
| are in memory, except the slice that has been promoted to the vector register. |
| The starting position of the slice is a runtime value based on the iteration |
| index modulo the vectorization size. This cannot be expressed by ``DW_OP_piece`` |
| and ``DW_OP_bit_piece`` which only allow constant offsets to be expressed. |
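| |
| *As a rough illustration of why the offset is a runtime value, the Python |
| sketch below computes which source language loop iterations, and which byte |
| range of the array, would be held in the vector register for a given source |
| iteration index. The helper name* ``promoted_slice`` *and its parameters are |
| hypothetical and are not part of these extensions. A vector width of 8 and |
| 4-byte elements are assumed.* |
| |

```python
def promoted_slice(source_iteration, vector_width=8, element_size=4):
    """Return (first_iter, last_iter, byte_offset, byte_size) for the slice of
    the array that is held in the vector register (hypothetical helper)."""
    # Round the iteration index down to the start of its vectorized group.
    first_iter = source_iteration - (source_iteration % vector_width)
    last_iter = first_iter + vector_width - 1
    # The slice position depends on the runtime iteration index, so it cannot
    # be written as a constant offset in the location description.
    byte_offset = first_iter * element_size
    byte_size = vector_width * element_size
    return first_iter, last_iter, byte_offset, byte_size


# Source iteration 3 belongs to the first vectorized iteration, and source
# iteration 10 belongs to the second one.
first_group = promoted_slice(3)
second_group = promoted_slice(10)
```

| |
| *The byte offset differs between the two groups, which is the property that a |
| constant-offset piece operation cannot capture.* |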
| |
| Therefore, a new operator is defined that takes two location descriptions, an |
| offset and a size, and creates a composite that effectively uses the second |
| location description as an overlay of the first, positioned according to the |
| offset and size. See ``DW_OP_LLVM_overlay`` and ``DW_OP_LLVM_bit_overlay`` in |
| :ref:`amdgpu-dwarf-composite-location-description-operations`. |
| |
| Consider an array that has been partially registerized such that the currently |
| processed elements are held in registers, whereas the remainder of the array |
| remains in memory. Consider the loop in this C function, for example: |
| |
| .. code:: |
|    :number-lines: |
| |
|    #include <stdint.h> |
| |
|    extern void foo(uint32_t dst[], uint32_t src[], int len) { |
|      for (int i = 0; i < len; ++i) |
|        dst[i] += src[i]; |
|    } |
| |
| Inside the loop body, the machine code loads ``src[i]`` and ``dst[i]`` into |
| registers, adds them, and stores the result back into ``dst[i]``. |
| |
| Considering the location of ``dst`` and ``src`` in the loop body, the elements |
| ``dst[i]`` and ``src[i]`` would be located in registers, all other elements are |
| located in memory. Let register ``R0`` contain the base address of ``dst``, |
| register ``R1`` contain ``i``, and register ``R2`` contain the registerized |
| ``dst[i]`` element. We can describe the location of ``dst`` as a memory location |
| with a register location overlaid at a runtime offset involving ``i``: |
| |
| .. code:: |
| :number-lines: |
| |
| // 1. Memory location description of dst elements located in memory: |
| DW_OP_breg0 0 |
| |
| // 2. Register location description of element dst[i], which is located in R2: |
| DW_OP_reg2 |
| |
| // 3. Offset of the register within the memory of dst: |
| DW_OP_breg1 0 |
| DW_OP_lit4 |
| DW_OP_mul |
| |
| // 4. The size of the register element: |
| DW_OP_lit4 |
| |
| // 5. Make a composite location description for dst that is the memory #1 with |
| // the register #2 positioned as an overlay at offset #3 of size #4: |
| DW_OP_LLVM_overlay |
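| |
| *The effect of the expression above can be modelled informally. The Python |
| sketch below is not an implementation of* ``DW_OP_LLVM_overlay``\ *; it only |
| shows how a consumer reading* ``dst`` *through the composite would see the |
| bytes at the overlay offset come from the register and all other bytes come |
| from memory. The function* ``read_overlay`` *and its byte-buffer arguments are |
| hypothetical, and the requirement that the overlay fits inside the base is an |
| assumption of this simplified model.* |
| |

```python
def read_overlay(base, overlay, offset, size):
    """Return the bytes a reader would see for a composite made of `base`
    with `overlay[0:size]` positioned at byte `offset` (illustrative only)."""
    # Assumption of this sketch: the overlay must fit inside the base.
    if offset < 0 or size < 0 or offset + size > len(base) or size > len(overlay):
        raise ValueError("overlay does not fit in this simplified model")
    return base[:offset] + overlay[:size] + base[offset + size:]


# dst has four 4-byte elements in memory. R2 holds the current dst[i] and
# i == 2, so the overlay offset is i * 4 and the overlay size is 4.
memory = bytes(range(16))
r2 = b"\xaa\xbb\xcc\xdd"
i = 2
view = read_overlay(memory, r2, offset=i * 4, size=4)
```

| |
| *In this model, bytes 8 to 11 of* ``view`` *come from the register and the |
| remaining bytes are unchanged from memory.* |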
| |
| 2.21 Support for Source Language Memory Spaces |
| ---------------------------------------------- |
| |
| AMDGPU supports languages, such as OpenCL, that define source language memory |
| spaces. Support is added to define language specific memory spaces so they can |
| be used in a consistent way by consumers. See :ref:`amdgpu-dwarf-memory-spaces`. |
| |
| A new attribute ``DW_AT_LLVM_memory_space`` is added to support using memory |
| spaces in defining source language pointer and reference types (see |
| :ref:`amdgpu-dwarf-type-modifier-entries`) and data object allocation (see |
| :ref:`amdgpu-dwarf-data-object-entries`). |
| |
| 2.22 Expression Operation Vendor Extensibility Opcode |
| ----------------------------------------------------- |
| |
| The vendor extension encoding space for DWARF expression operations |
| accommodates only 32 unique operations. In practice, the lack of a central |
| registry and a desire for backwards compatibility means vendor extensions are |
| never retired, even when standard versions are accepted into DWARF proper. This |
| has produced a situation where the effective encoding space available for new |
| vendor extensions is minuscule today. |
| |
| To expand this encoding space a new DWARF operation ``DW_OP_LLVM_user`` is |
| added which acts as a "prefix" for vendor extensions. It is followed by a |
| ULEB128 encoded vendor extension opcode, which is then followed by the operands |
| of the corresponding vendor extension operation. |
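| |
| *The byte layout described above can be sketched in Python. The ULEB128 |
| encoder follows the standard DWARF definition. The opcode constants below are |
| placeholders chosen for illustration and should not be read as the assigned |
| encodings, and* ``encode_prefixed_op`` *is a hypothetical helper name.* |
| |

```python
def uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("ULEB128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


# Placeholder values for illustration only (assumptions of this sketch).
HYPOTHETICAL_DW_OP_LLVM_USER = 0xE9  # a value inside the vendor range 0xe0-0xff
HYPOTHETICAL_SUB_OPCODE = 0x12C      # a sub-opcode that needs two ULEB128 bytes


def encode_prefixed_op(sub_opcode, operands=b""):
    """Emit the prefix opcode, the ULEB128 sub-opcode, then the operands."""
    return bytes([HYPOTHETICAL_DW_OP_LLVM_USER]) + uleb128(sub_opcode) + operands


encoded = encode_prefixed_op(HYPOTHETICAL_SUB_OPCODE)
```

| |
| *Because the sub-opcode is ULEB128 encoded, the number of operations that can |
| follow the prefix is not limited to what fits in one byte.* |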
| |
| This approach allows all remaining operations defined in these extensions to be |
| encoded without conflicting with existing vendor extensions. |
| |
| See ``DW_OP_LLVM_user`` in :ref:`amdgpu-dwarf-vendor-extensions-operations`. |
| |
| .. _amdgpu-dwarf-changes-relative-to-dwarf-version-5: |
| |
| A. Changes Relative to DWARF Version 5 |
| ====================================== |
| |
| .. note:: |
| |
| This appendix provides changes relative to DWARF Version 5. It has been |
| defined such that it is backwards compatible with DWARF Version 5. |
| Non-normative text is shown in *italics*. The section numbers generally |
| correspond to those in the DWARF Version 5 standard unless specified |
| otherwise. Definitions are given for the additional operations, as well as |
| clarifying how existing expression operations, CFI operations, and attributes |
| behave with respect to generalized location descriptions that support address |
| spaces and multiple places. |
| |
| The names for the new operations, attributes, and constants include "\ |
| ``LLVM``\ " and are encoded with vendor specific codes so these extensions |
| can be implemented as an LLVM vendor extension to DWARF Version 5. New |
| operations other than ``DW_OP_LLVM_user`` are "prefixed" by |
| ``DW_OP_LLVM_user`` to make enough encoding space available for their |
| implementation. |
| |
| .. note:: |
| |
| Notes are included to describe how the changes are to be applied to the |
| DWARF Version 5 standard. They also describe rationale and issues that may |
| need further consideration. |
| |
| A.2 General Description |
| ----------------------- |
| |
| A.2.2 Attribute Types |
| ~~~~~~~~~~~~~~~~~~~~~ |
| |
| .. note:: |
| |
| This augments DWARF Version 5 section 2.2 and Table 2.2. |
| |
| The following table provides the additional attributes. |
| |
| .. table:: Attribute names |
|    :name: amdgpu-dwarf-attribute-names-table |
| |
|    ============================ ==================================== |
|    Attribute                    Usage |
|    ============================ ==================================== |
|    ``DW_AT_LLVM_active_lane``   SIMT active lanes (see :ref:`amdgpu-dwarf-low-level-information`) |
|    ``DW_AT_LLVM_augmentation``  Compilation unit augmentation string (see :ref:`amdgpu-dwarf-full-and-partial-compilation-unit-entries`) |
|    ``DW_AT_LLVM_lane_pc``       SIMT lane program location (see :ref:`amdgpu-dwarf-low-level-information`) |
|    ``DW_AT_LLVM_lanes``         SIMT lane count (see :ref:`amdgpu-dwarf-low-level-information`) |
|    ``DW_AT_LLVM_iterations``    Concurrent iteration count (see :ref:`amdgpu-dwarf-low-level-information`) |
|    ``DW_AT_LLVM_vector_size``   Base type vector size (see :ref:`amdgpu-dwarf-base-type-entries`) |
|    ``DW_AT_LLVM_address_space`` Architecture specific address space (see :ref:`amdgpu-dwarf-address-spaces`) |
|    ``DW_AT_LLVM_memory_space``  Pointer or reference types (see 5.3 "Type Modifier Entries") |
|                                 Data objects (see 4.1 "Data Object Entries") |
|    ============================ ==================================== |
| |
| .. _amdgpu-dwarf-expressions: |
| |
| A.2.5 DWARF Expressions |
| ~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| .. note:: |
| |
| This section, and its nested sections, replaces DWARF Version 5 section 2.5 |
| and section 2.6. The new DWARF expression operation extensions are defined as |
| well as clarifying the extensions to already existing DWARF Version 5 |
| operations. It is based on the text of the existing DWARF Version 5 standard. |
| |
| DWARF expressions describe how to compute a value or specify a location. |
| |
| *The evaluation of a DWARF expression can provide the location of an object, the |
| value of an array bound, the length of a dynamic string, the desired value |
| itself, and so on.* |
| |
| If the evaluation of a DWARF expression does not encounter an error, then it can |
| either result in a value (see :ref:`amdgpu-dwarf-expression-value`) or a |
| location description (see :ref:`amdgpu-dwarf-location-description`). When a |
| DWARF expression is evaluated, it may be specified whether a value or location |
| description is required as the result kind. |
| |
| If a result kind is specified, and the result of the evaluation does not match |
| the specified result kind, then the implicit conversions described in |
| :ref:`amdgpu-dwarf-memory-location-description-operations` are performed if |
| valid. Otherwise, the DWARF expression is ill-formed. |
| |
| If the evaluation of a DWARF expression encounters an evaluation error, then the |
| result is an evaluation error. |
| |
| .. note:: |
| |
| Decided to define the concept of an evaluation error. An alternative is to |
| introduce an undefined value base type in a similar way to location |
| descriptions having an undefined location description. Then operations that |
| encounter an evaluation error can return the undefined location description or |
| value with an undefined base type. |
| |
| All operations that act on values would return an undefined entity if given an |
| undefined value. The expression would then always evaluate to completion, and |
| can be tested to determine if it is an undefined entity. |
| |
| However, this would add considerable additional complexity and does not match |
| the behavior of GDB, which throws an exception when these evaluation errors |
| occur. |
| |
| If a DWARF expression is ill-formed, then the result is undefined. |
| |
| The following sections detail the rules for when a DWARF expression is |
| ill-formed or results in an evaluation error. |
| |
| A DWARF expression can either be encoded as an operation expression (see |
| :ref:`amdgpu-dwarf-operation-expressions`), or as a location list expression |
| (see :ref:`amdgpu-dwarf-location-list-expressions`). |
| |
| .. _amdgpu-dwarf-expression-evaluation-context: |
| |
| A.2.5.1 DWARF Expression Evaluation Context |
| +++++++++++++++++++++++++++++++++++++++++++ |
| |
| A DWARF expression is evaluated in a context that can include a number of |
| context elements. If multiple context elements are specified then they must be |
| self consistent or the result of the evaluation is undefined. The context |
| elements that can be specified are: |
| |
| *A current result kind* |
| |
| The kind of result required by the DWARF expression evaluation. If specified |
| it can be a location description or a value. |
| |
| *A current thread* |
| |
| The target architecture thread identifier. For source languages that are not |
| implemented using a SIMT execution model, this corresponds to the source |
| program thread of execution for which a user presented expression is currently |
| being evaluated. For source languages that are implemented using a SIMT |
| execution model, this together with the current lane corresponds to the source |
| program thread of execution for which a user presented expression is currently |
| being evaluated. |
| |
| It is required for operations that are related to target architecture threads. |
| |
| *For example, the* ``DW_OP_regval_type`` *operation, or the* |
| ``DW_OP_form_tls_address`` *and* ``DW_OP_LLVM_form_aspace_address`` |
| *operations when given an address space that is target architecture thread |
| specific.* |
| |
| *A current lane* |
| |
| The 0 based SIMT lane identifier to be used in evaluating a user presented |
| expression. This applies to source languages that are implemented for a target |
| architecture using a SIMT execution model. These implementations map source |
| language threads of execution to lanes of the target architecture threads. |
| |
| It is required for operations that are related to SIMT lanes. |
| |
| *For example, the* ``DW_OP_LLVM_push_lane`` *operation and* |
| ``DW_OP_LLVM_form_aspace_address`` *operation when given an address space that |
| is SIMT lane specific.* |
| |
| If specified, it must be consistent with the value of the ``DW_AT_LLVM_lanes`` |
| attribute of the subprogram corresponding to the context's frame and program |
| location. It is consistent if the value is greater than or equal to 0 and less |
| than the, possibly default, value of the ``DW_AT_LLVM_lanes`` attribute. |
| Otherwise the result is undefined. |
| |
| *A current iteration* |
| |
| The 0 based source language iteration instance to be used in evaluating a user |
| presented expression. This applies to target architectures that support |
| optimizations that result in executing multiple source language loop iterations |
| concurrently. |
| |
| *For example, software pipelining and SIMD vectorization.* |
| |
| It is required for operations that are related to source language loop |
| iterations. |
| |
| *For example, the* ``DW_OP_LLVM_push_iteration`` *operation.* |
| |
| If specified, it must be consistent with the value of the |
| ``DW_AT_LLVM_iterations`` attribute of the subprogram corresponding to |
| the context's frame and program location. It is consistent if the value is greater |
| than or equal to 0 and less than the, possibly default, value of the |
| ``DW_AT_LLVM_iterations`` attribute. Otherwise the result is undefined. |
| |
| *A current call frame* |
| |
| The target architecture call frame identifier. It identifies a call frame that |
| corresponds to an active invocation of a subprogram in the current thread. It |
| is identified by its address on the call stack. The address is referred to as |
| the Canonical Frame Address (CFA). The call frame information is used to |
| determine the CFA for the call frames of the current thread's call stack (see |
| :ref:`amdgpu-dwarf-call-frame-information`). |
| |
| It is required for operations that specify target architecture registers to |
| support virtual unwinding of the call stack. |
| |
| *For example, the* ``DW_OP_*reg*`` *operations.* |
| |
| If specified, it must be an active call frame in the current thread. If the |
| current lane is specified, then that lane must have been active on entry to |
| the call frame (see the ``DW_AT_LLVM_lane_pc`` attribute). Otherwise the |
| result is undefined. |
| |
| If it is the currently executing call frame, then it is termed the top call |
| frame. |
| |
| *A current program location* |
| |
| The target architecture program location corresponding to the current call |
| frame of the current thread. |
| |
| The program location of the top call frame is the target architecture program |
| counter for the current thread. The call frame information is used to obtain |
| the value of the return address register to determine the program location of |
| the other call frames (see :ref:`amdgpu-dwarf-call-frame-information`). |
| |
| It is required for the evaluation of location list expressions to select |
| amongst multiple program location ranges. It is required for operations that |
| specify target architecture registers to support virtual unwinding of the call |
| stack (see :ref:`amdgpu-dwarf-call-frame-information`). |
| |
| If specified: |
| |
| * If the current lane is not specified: |
| |
| * If the current call frame is the top call frame, it must be the current |
| target architecture program location. |
| |
| * If the current call frame F is not the top call frame, it must be the |
| program location associated with the call site in the current caller frame |
| F that invoked the callee frame. |
| |
| * If the current lane is specified and the architecture program location LPC |
| computed by the ``DW_AT_LLVM_lane_pc`` attribute for the current lane is not |
| the undefined location description (indicating the lane was not active on |
| entry to the call frame), it must be LPC. |
| |
| * Otherwise the result is undefined. |
| |
| *A current compilation unit* |
| |
| The compilation unit debug information entry that contains the DWARF expression |
| being evaluated. |
| |
| It is required for operations that reference debug information associated with |
| the same compilation unit, including indicating if such references use the |
| 32-bit or 64-bit DWARF format. It can also provide the default address space |
| address size if no current target architecture is specified. |
| |
| *For example, the* ``DW_OP_constx`` *and* ``DW_OP_addrx`` *operations.* |
| |
| *Note that this compilation unit may not be the same as the compilation unit |
| determined from the loaded code object corresponding to the current program |
| location. For example, the evaluation of the expression E associated with a* |
| ``DW_AT_location`` *attribute of the debug information entry operand of the* |
| ``DW_OP_call*`` *operations is evaluated with the compilation unit that |
| contains E and not the one that contains the* ``DW_OP_call*`` *operation |
| expression.* |
| |
| *A current target architecture* |
| |
| The target architecture. |
| |
| It is required for operations that specify target architecture specific |
| entities. |
| |
| *For example, target architecture specific entities include DWARF register |
| identifiers, DWARF lane identifiers, DWARF address space identifiers, the |
| default address space, and the address space address sizes.* |
| |
| If specified: |
| |
| * If the current frame is specified, then the current target architecture must |
| be the same as the target architecture of the current frame. |
| |
| * If the current frame is specified and is the top frame, and if the current |
| thread is specified, then the current target architecture must be the same |
| as the target architecture of the current thread. |
| |
| * If the current compilation unit is specified, then the current target |
| architecture default address space address size must be the same as the |
| ``address_size`` field in the header of the current compilation unit and any |
| associated entry in the ``.debug_aranges`` section. |
| |
| * If the current program location is specified, then the current target |
| architecture must be the same as the target architecture of any line number |
| information entry (see :ref:`amdgpu-dwarf-line-number-information`) |
| corresponding to the current program location. |
| |
| * If the current program location is specified, then the current target |
| architecture default address space address size must be the same as the |
| ``address_size`` field in the header of any entry corresponding to the |
| current program location in the ``.debug_addr``, ``.debug_line``, |
| ``.debug_rnglists``, ``.debug_rnglists.dwo``, ``.debug_loclists``, and |
| ``.debug_loclists.dwo`` sections. |
| |
| * Otherwise the result is undefined. |
| |
| *A current object* |
| |
| The location description of a program object. |
| |
| It is required for the ``DW_OP_push_object_address`` operation. |
| |
| *For example, the* ``DW_AT_data_location`` *attribute on type debug |
| information entries specifies the program object corresponding to a runtime |
| descriptor as the current object when it evaluates its associated expression.* |
| |
| The result is undefined if the location description is invalid (see |
| :ref:`amdgpu-dwarf-location-description`). |
| |
| *An initial stack* |
| |
| This is a list of values or location descriptions that will be pushed on the |
| operation expression evaluation stack in the order provided before evaluation |
| of an operation expression starts. |
| |
| Some debugger information entries have attributes that evaluate their DWARF |
| expression value with initial stack entries. In all other cases the initial |
| stack is empty. |
| |
| The result is undefined if any location descriptions are invalid (see |
| :ref:`amdgpu-dwarf-location-description`). |
| |
| If the evaluation requires a context element that is not specified, then the |
| result of the evaluation is an error. |
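| |
| *One way a consumer might model optional context elements is sketched below |
| in Python. It is an illustration under stated assumptions, not a prescribed |
| design: the class and method names are hypothetical, only a few context |
| elements are shown, and the* ``lanes`` *field is a stand-in for the value a |
| consumer would obtain from the* ``DW_AT_LLVM_lanes`` *attribute.* |
| |

```python
from dataclasses import dataclass
from typing import Optional


class EvaluationError(Exception):
    """Raised when evaluation needs a context element that was not specified."""


@dataclass
class EvalContext:
    # None means the context element is not specified.
    thread: Optional[int] = None
    lane: Optional[int] = None
    iteration: Optional[int] = None
    # Assumed stand-in for the value taken from DW_AT_LLVM_lanes.
    lanes: int = 1

    def require(self, name):
        """Return the named context element, or report an evaluation error."""
        value = getattr(self, name)
        if value is None:
            raise EvaluationError(f"context element '{name}' is not specified")
        return value

    def lane_is_consistent(self):
        """Check 0 <= lane < lanes; an unspecified lane is trivially consistent.
        The text leaves the result undefined when this check fails."""
        return self.lane is None or 0 <= self.lane < self.lanes


ctx = EvalContext(thread=7, lane=3, lanes=64)
current_thread = ctx.require("thread")
```

| |
| *An operation that needs the current iteration would call* |
| ``ctx.require("iteration")`` *and, in this sketch, fail with an evaluation |
| error because that element was not specified.* |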
| |
| *A DWARF expression for a location description may be able to be evaluated |
| without a thread, lane, call frame, program location, or architecture context. |
| For example, the location of a global variable may be able to be evaluated |
| without such context. If the expression evaluates with an error then it may |
| indicate the variable has been optimized and so requires more context.* |
| |
| *The DWARF expressions for call frame information (see* |
| :ref:`amdgpu-dwarf-call-frame-information`\ *) operations are restricted to |
| those that do not require the compilation unit context to be specified.* |
| |
| The DWARF is ill-formed if all the ``address_size`` fields in the headers of all |
| the entries in the ``.debug_info``, ``.debug_addr``, ``.debug_line``, |
| ``.debug_rnglists``, ``.debug_rnglists.dwo``, ``.debug_loclists``, and |
| ``.debug_loclists.dwo`` sections corresponding to any given program location do |
| not match. |
| |
| .. _amdgpu-dwarf-expression-value: |
| |
| A.2.5.2 DWARF Expression Value |
| ++++++++++++++++++++++++++++++ |
| |
| A value has a type and a literal value. It can represent a literal value of any |
| supported base type of the target architecture. The base type specifies the |
| size, encoding, and endianity of the literal value. |
| |
| .. note:: |
| |
| It may be desirable to add an implicit pointer base type encoding. It would be |
| used for the type of the value that is produced when the ``DW_OP_deref*`` |
| operation retrieves the full contents of an implicit pointer location storage |
| created by the ``DW_OP_implicit_pointer`` or |
| ``DW_OP_LLVM_aspace_implicit_pointer`` operations. The literal value would |
| record the debugging information entry and byte displacement specified by the |
| associated ``DW_OP_implicit_pointer`` or |
| ``DW_OP_LLVM_aspace_implicit_pointer`` operations. |
| |
| There is a distinguished base type termed the generic type, which is an integral |
| type that has the size of an address in the target architecture default address |
| space, a target architecture defined endianity, and unspecified signedness. |
| |
| *The generic type is the same as the unspecified type used for stack operations |
| defined in DWARF Version 4 and before.* |
| |
| An integral type is a base type that has an encoding of ``DW_ATE_signed``, |
| ``DW_ATE_signed_char``, ``DW_ATE_unsigned``, ``DW_ATE_unsigned_char``, |
| ``DW_ATE_boolean``, or any target architecture defined integral encoding in the |
| inclusive range ``DW_ATE_lo_user`` to ``DW_ATE_hi_user``. |
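| |
| *The definition above can be expressed as a small predicate. The Python |
| sketch below uses the base type encoding values from DWARF Version 5. The |
| function name* ``is_integral`` *and the* ``target_integral`` *parameter, which |
| stands for whatever user-range encodings a target architecture defines as |
| integral, are hypothetical.* |
| |

```python
# Base type encoding values from DWARF Version 5.
DW_ATE_boolean = 0x02
DW_ATE_signed = 0x05
DW_ATE_signed_char = 0x06
DW_ATE_unsigned = 0x07
DW_ATE_unsigned_char = 0x08
DW_ATE_lo_user = 0x80
DW_ATE_hi_user = 0xFF

STANDARD_INTEGRAL = frozenset({
    DW_ATE_signed,
    DW_ATE_signed_char,
    DW_ATE_unsigned,
    DW_ATE_unsigned_char,
    DW_ATE_boolean,
})


def is_integral(encoding, target_integral=frozenset()):
    """Return True if `encoding` is integral under the definition above.

    `target_integral` is a hypothetical set of user-range encodings that the
    target architecture defines as integral."""
    if encoding in STANDARD_INTEGRAL:
        return True
    in_user_range = DW_ATE_lo_user <= encoding <= DW_ATE_hi_user
    return in_user_range and encoding in target_integral
```

| |
| *This sketch follows only the encodings listed above, so it does not treat* |
| ``DW_ATE_address`` *as integral.* |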
| |
| .. note:: |
| |
| It is unclear if ``DW_ATE_address`` is an integral type. GDB does not seem to |
| consider it as integral. |
| |
| .. _amdgpu-dwarf-location-description: |
| |
| A.2.5.3 DWARF Location Description |
| ++++++++++++++++++++++++++++++++++ |
| |
| *Debugging information must provide consumers a way to find the location of |
| program variables, determine the bounds of dynamic arrays and strings, and |
| possibly to find the base address of a subprogram’s call frame or the return |
| address of a subprogram. Furthermore, to meet the needs of recent computer |
| architectures and optimization techniques, debugging information must be able to |
| describe the location of an object whose location changes over the object’s |
| lifetime, and may reside at multiple locations simultaneously during parts of an |
| object's lifetime.* |
| |
| Information about the location of program objects is provided by location |
| descriptions. |
| |
| Location descriptions can consist of one or more single location descriptions. |
| |
| A single location description specifies the location storage that holds a |
| program object and a position within the location storage where the program |
| object starts. The position within the location storage is expressed as a bit |
| offset relative to the start of the location storage. |
| |
| A location storage is a linear stream of bits that can hold values. Each |
| location storage has a size in bits and can be accessed using a zero-based bit |
| offset. The ordering of bits within a location storage uses the bit numbering |
| and direction conventions that are appropriate to the current language on the |
| target architecture. |
| |
| There are five kinds of location storage: |
| |
| *memory location storage* |
| Corresponds to the target architecture memory address spaces. |
| |
| *register location storage* |
| Corresponds to the target architecture registers. |
| |
| *implicit location storage* |
| Corresponds to fixed values that can only be read. |
| |
| *undefined location storage* |
| Indicates no value is available and therefore cannot be read or written. |
| |
| *composite location storage* |
| Allows a mixture of these where some bits come from one location storage and |
| some from another location storage, or from disjoint parts of the same |
| location storage. |
| |
| .. note:: |
| |
| It may be better to add an implicit pointer location storage kind used by the |
| ``DW_OP_implicit_pointer`` and ``DW_OP_LLVM_aspace_implicit_pointer`` |
| operations. It would specify the debugger information entry and byte offset |
| provided by the operations. |
| |
| *Location descriptions are a language independent representation of addressing |
| rules.* |
| |
| * *They can be the result of evaluating a debugger information entry attribute |
| that specifies an operation expression of arbitrary complexity. In this usage |
| they can describe the location of an object as long as its lifetime is either |
| static or the same as the lexical block (see |
| :ref:`amdgpu-dwarf-lexical-block-entries`) that owns it, and it does not move |
| during its lifetime.* |
| |
| * *They can be the result of evaluating a debugger information entry attribute |
| that specifies a location list expression. In this usage they can describe the |
| location of an object that has a limited lifetime, changes its location during |
| its lifetime, or has multiple locations over part or all of its lifetime.* |
| |
| If a location description has more than one single location description, the |
| DWARF expression is ill-formed if the object value held in each single location |
| description's position within the associated location storage is not the same |
| value, except for the parts of the value that are uninitialized. |
| |
| *A location description that has more than one single location description can |
| only be created by a location list expression that has overlapping program |
| location ranges, or certain expression operations that act on a location |
| description that has more than one single location description. There are no |
| operation expression operations that can directly create a location description |
| with more than one single location description.* |
| |
| *A location description with more than one single location description can be |
| used to describe objects that reside in more than one piece of storage at the |
| same time. An object may have more than one location as a result of |
| optimization. For example, a value that is only read may be promoted from memory |
| to a register for some region of code, but later code may revert to reading the |
| value from memory as the register may be used for other purposes. For the code |
| region where the value is in a register, any change to the object value must be |
| made in both the register and the memory so both regions of code will read the |
| updated value.* |
| |
| *A consumer of a location description with more than one single location |
| description can read the object's value from any of the single location |
| descriptions (since they all refer to location storage that has the same value), |
| but must write any changed value to all the single location descriptions.* |
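| |
| *The read-any, write-all rule can be sketched as follows. This Python model is |
| a simplification made for illustration: each location storage is a* |
| ``bytearray``\ *, positions are byte offsets rather than bit offsets, and the |
| function names are hypothetical.* |
| |

```python
def read_object(locations, size):
    """Read the object from any one single location description.
    Each entry of `locations` is a (storage, byte_offset) pair."""
    storage, offset = locations[0]
    return bytes(storage[offset:offset + size])


def write_object(locations, data):
    """Write the object to every single location description."""
    for storage, offset in locations:
        storage[offset:offset + len(data)] = data


# An object that lives both in memory (at byte 8) and in a register.
memory = bytearray(16)
register = bytearray(4)
locations = [(memory, 8), (register, 0)]

write_object(locations, b"\x01\x02\x03\x04")
value = read_object(locations, 4)
```

| |
| *After the write, both storages hold the same bytes, so a later read through |
| either single location description returns the updated value.* |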
| |
| The evaluation of an expression may require context elements to create a |
| location description. If such a location description is accessed, the storage it |
| denotes is that associated with the context element values specified when the |
| location description was created, which may differ from the context at the time |
| it is accessed. |
| |
| *For example, creating a register location description requires the thread |
| context: the location storage is for the specified register of that thread. |
| Creating a memory location description for an address space may require a |
| thread and a lane context: the location storage is the memory associated with |
| that thread and lane.* |
| |
| If any of the context elements required to create a location description change, |
| the location description becomes invalid and accessing it is undefined. |
| |
| *Examples of context that can invalidate a location description are:* |
| |
| * *The thread context is required and execution causes the thread to terminate.* |
| * *The call frame context is required and further execution causes the call |
| frame to return to the calling frame.* |
| * *The program location is required and further execution of the thread occurs. |
| That could change the location list entry or call frame information entry that |
| applies.* |
| * *An operation uses call frame information:* |
| |
| * *Any of the frames used in the virtual call frame unwinding return.* |
| * *The top call frame is used, the program location is used to select the call |
| frame information entry, and further execution of the thread occurs.* |
| |
| *A DWARF expression can be used to compute a location description for an object. |
| A subsequent DWARF expression evaluation can be given the object location |
| description as the object context or initial stack context to compute a |
| component of the object. The final result is undefined if the object location |
| description becomes invalid between the two expression evaluations.* |
| |
| A change of a thread's program location may not make a location description |
| invalid, yet may still render it no longer meaningful. Accessing such a |
| location description, or using it as the object context or initial stack context |
| of an expression evaluation, may produce an undefined result. |
| |
| *For example, a location description may specify a register that no longer holds |
| the intended program object after a program location change. One way to avoid |
| such problems is to recompute location descriptions associated with threads when |
| their program locations change.* |
| |
| .. _amdgpu-dwarf-operation-expressions: |
| |
| A.2.5.4 DWARF Operation Expressions |
| +++++++++++++++++++++++++++++++++++ |
| |
| An operation expression is a stream of operations, each consisting |
| of an opcode followed by zero or more operands. The number of operands is |
| implied by the opcode. |
| |
| Operations represent a postfix operation on a simple stack machine. Each stack |
| entry can hold either a value or a location description. Operations can act on |
| entries on the stack, including adding entries and removing entries. If the kind |
| of a stack entry does not match the kind required by the operation and is not |
| implicitly convertible to the required kind (see |
| :ref:`amdgpu-dwarf-memory-location-description-operations`), then the DWARF |
| operation expression is ill-formed. |
| |
| Evaluation of an operation expression starts with an empty stack on which the |
| entries from the initial stack provided by the context are pushed in the order |
| provided. Then the operations are evaluated, starting with the first operation |
| of the stream. Evaluation continues until either an operation has an evaluation |
| error, or until one past the last operation of the stream is reached. |
| |
| The result of the evaluation is: |
| |
| * If an operation has an evaluation error, or an operation evaluates an |
| expression that has an evaluation error, then the result is an evaluation |
| error. |
| |
| * If the current result kind specifies a location description, then: |
| |
| * If the stack is empty, the result is a location description with one |
| undefined location description. |
| |
| *This rule is for backwards compatibility with DWARF Version 5 which has no |
| explicit operation to create an undefined location description, and uses an |
| empty operation expression for this purpose.* |
| |
| * If the top stack entry is a location description, or can be converted |
| to one (see :ref:`amdgpu-dwarf-memory-location-description-operations`), |
| then the result is that, possibly converted, location description. Any other |
| entries on the stack are discarded. |
| |
| * Otherwise the DWARF expression is ill-formed. |
| |
| .. note:: |
| |
| Could define this case as returning an implicit location description as |
| if the ``DW_OP_implicit`` operation is performed. |
| |
| * If the current result kind specifies a value, then: |
| |
| * If the top stack entry is a value, or can be converted to one (see |
| :ref:`amdgpu-dwarf-memory-location-description-operations`), then the result |
| is that, possibly converted, value. Any other entries on the stack are |
| discarded. |
| |
| * Otherwise the DWARF expression is ill-formed. |
| |
| * If the current result kind is not specified, then: |
| |
| * If the stack is empty, the result is a location description with one |
| undefined location description. |
| |
| *This rule is for backwards compatibility with DWARF Version 5 which has no |
| explicit operation to create an undefined location description, and uses an |
| empty operation expression for this purpose.* |
| |
| .. note:: |
| |
| This rule is consistent with the rule above for when a location |
| description is requested. However, GDB appears to report this as an error |
| and no GDB tests appear to cause an empty stack for this case. |
| |
| * Otherwise, the top stack entry is returned. Any other entries on the stack |
| are discarded. |
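| |
| *The rules above for selecting the result of an evaluation that finished |
| without an evaluation error can be sketched in Python as follows. The names* |
| ``Kind``\ *,* ``evaluation_result``\ *, and the* ``convert`` *hook are |
| hypothetical; the hook stands in for the implicit conversion rules referenced |
| above and is not an implementation of them.* |
| |
```python
from enum import Enum, auto


class Kind(Enum):
    VALUE = auto()
    LOCATION = auto()


class IllFormed(Exception):
    """Raised when the DWARF expression is ill-formed."""


# Hypothetical stand-in for a location description with one undefined
# location description.
UNDEFINED_LOCATION = (Kind.LOCATION, "undefined")


def evaluation_result(stack, result_kind=None, convert=lambda entry, kind: None):
    """Select the result once evaluation has finished without error.

    stack holds (Kind, payload) entries with the top of the stack last.
    result_kind is Kind.LOCATION, Kind.VALUE, or None when not specified.
    convert is a hypothetical hook for implicit conversion; it returns the
    converted entry, or None when no conversion applies.
    """
    if not stack:
        if result_kind is Kind.VALUE:
            raise IllFormed("a value was requested but the stack is empty")
        return UNDEFINED_LOCATION
    top = stack[-1]  # Any other entries on the stack are discarded.
    if result_kind is None or top[0] is result_kind:
        return top
    converted = convert(top, result_kind)
    if converted is None:
        raise IllFormed(f"top entry is not convertible to {result_kind.name}")
    return converted
```
| |
| *With an empty stack, a request for a location description (or an unspecified |
| result kind) yields the undefined location description, while a request for a |
| value is ill-formed.* |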
| |
| An operation expression is encoded as a byte block with some form of prefix that |
| specifies the byte count. It can be used: |
| |
| * as the value of a debugging information entry attribute that is encoded using |
| class ``exprloc`` (see :ref:`amdgpu-dwarf-classes-and-forms`), |
| |
| * as the operand to certain operation expression operations, |
| |
| * as the operand to certain call frame information operations (see |
| :ref:`amdgpu-dwarf-call-frame-information`), |
| |
| * and in location list entries (see |
| :ref:`amdgpu-dwarf-location-list-expressions`). |
| |
| .. _amdgpu-dwarf-vendor-extensions-operations: |
| |
| A.2.5.4.0 Vendor Extension Operations |
| ##################################### |
| |
| 1. ``DW_OP_LLVM_user`` |
| |
| ``DW_OP_LLVM_user`` encodes a vendor extension operation. It has at least one |
| operand: a ULEB128 constant identifying a vendor extension operation. The |
| remaining operands are defined by the vendor extension. The vendor extension |
| opcode 0 is reserved and cannot be used by any vendor extension. |
| |
| *The DW_OP_LLVM_user encoding space can be understood to supplement the space |
| defined by DW_OP_lo_user and DW_OP_hi_user that is allocated by the standard |
| for the same purpose.* |
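| |
| *A minimal Python sketch of reading the first operand of* |
| ``DW_OP_LLVM_user``\ *. The function names are hypothetical, and the sketch |
| assumes the operand bytes that follow the opcode are already available as a |
| byte string.* |
| |
```python
def decode_uleb128(data: bytes, pos: int = 0):
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # Low 7 bits carry the payload.
        if not byte & 0x80:               # High bit clear marks the last byte.
            return result, pos
        shift += 7


def read_vendor_opcode(operands: bytes):
    """Split the operands into the vendor extension opcode and the rest."""
    sub_opcode, pos = decode_uleb128(operands)
    if sub_opcode == 0:
        raise ValueError("vendor extension opcode 0 is reserved")
    return sub_opcode, operands[pos:]


sub_opcode, remaining = read_vendor_opcode(b"\x81\x01\xff")
```
| |
| *The remaining bytes are left uninterpreted because their meaning is defined |
| by the vendor extension.* |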
| |
| .. _amdgpu-dwarf-stack-operations: |
| |
| A.2.5.4.1 Stack Operations |
| ########################## |
| |
| .. note:: |
| |
| This section replaces DWARF Version 5 section 2.5.1.3. |
| |
| The following operations manipulate the DWARF stack. Operations that index the |
| stack assume that the top of the stack (most recently added entry) has index 0. |
| They allow the stack entries to be either a value or location description. |
| |
| If any stack entry accessed by a stack operation is an incomplete composite |
| location description (see |
| :ref:`amdgpu-dwarf-composite-location-description-operations`), then the DWARF |
| expression is ill-formed. |
| |
| .. note:: |
| |
| These operations now support stack entries that are values and location |
| descriptions. |
| |
| .. note:: |
| |
| To also make these operations work with incomplete composite location |
| descriptions, it would be necessary to define that the composite location |
| storage specified by the incomplete composite location description is also |
| replicated when a copy is pushed. This ensures that each copy of the |
| incomplete composite location description can update the composite location |
| storage it specifies independently. |
| |
| 1. ``DW_OP_dup`` |
| |
| ``DW_OP_dup`` duplicates the stack entry at the top of the stack. |
| |
| 2. ``DW_OP_drop`` |
| |
| ``DW_OP_drop`` pops the stack entry at the top of the stack and discards it. |
| |
| 3. ``DW_OP_pick`` |
| |
| ``DW_OP_pick`` has a single unsigned 1-byte operand that represents an index |
| I. A copy of the stack entry with index I is pushed onto the stack. |
| |
| 4. ``DW_OP_over`` |
| |
| ``DW_OP_over`` pushes a copy of the entry with index 1. |
| |
| *This is equivalent to a* ``DW_OP_pick 1`` *operation.* |
| |
| 5. ``DW_OP_swap`` |
| |
| ``DW_OP_swap`` swaps the top two stack entries. The entry at the top of the |
| stack becomes the second stack entry, and the second stack entry becomes the |
| top of the stack. |
| |
| 6. ``DW_OP_rot`` |
| |
| ``DW_OP_rot`` rotates the first three stack entries. The entry at the top of |
| the stack becomes the third stack entry, the second entry becomes the top of |
| the stack, and the third entry becomes the second entry. |
| |
| *Examples illustrating many of these stack operations are found in DWARF |
| Version 5 Appendix D.1.2 on page 289.* |
| |
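| *The stack operations above can be modeled on a Python list whose last element |
| is the top of the stack (index 0). This is an illustrative sketch: the function |
| names are hypothetical, entries may be any object standing in for a value or a |
| location description, and the check for incomplete composite location |
| descriptions is not modeled.* |
| |
```python
def dup(stack):
    """DW_OP_dup: duplicate the top entry."""
    stack.append(stack[-1])


def drop(stack):
    """DW_OP_drop: pop the top entry and discard it."""
    stack.pop()


def pick(stack, index):
    """DW_OP_pick: push a copy of the entry with the given index."""
    stack.append(stack[-1 - index])


def over(stack):
    """DW_OP_over: equivalent to DW_OP_pick 1."""
    pick(stack, 1)


def swap(stack):
    """DW_OP_swap: exchange the top two entries."""
    stack[-1], stack[-2] = stack[-2], stack[-1]


def rot(stack):
    """DW_OP_rot: top becomes third, second becomes top, third becomes second."""
    top, second, third = stack[-1], stack[-2], stack[-3]
    stack[-3:] = [top, third, second]


stack = ["third", "second", "top"]
rot(stack)
```
| |
| *After the rotation the list reads, from bottom to top, the old top entry, the |
| old third entry, and the old second entry.* |
| |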
| .. _amdgpu-dwarf-control-flow-operations: |
| |
| A.2.5.4.2 Control Flow Operations |
| ################################# |
| |
| .. note:: |
| |
| This section replaces DWARF Version 5 section 2.5.1.5. |
| |
| The following operations provide simple control of the flow of a DWARF operation |
| expression. |
| |
| 1. ``DW_OP_nop`` |
| |
| ``DW_OP_nop`` is a placeholder. It has no effect on the DWARF stack |
| entries. |
| |
| 2. ``DW_OP_le``, ``DW_OP_ge``, ``DW_OP_eq``, ``DW_OP_lt``, ``DW_OP_gt``, |
| ``DW_OP_ne`` |
| |
| .. note:: |
| |
| The same as in DWARF Version 5 section 2.5.1.5. |
| |
| 3. ``DW_OP_skip`` |
| |
| ``DW_OP_skip`` is an unconditional branch. Its single operand is a 2-byte |
| signed integer constant. The 2-byte constant is the number of bytes of the |
| DWARF expression to skip forward or backward from the current operation, |
| beginning after the 2-byte constant. |
| |
| If the updated position is at one past the end of the last operation, then |
| the operation expression evaluation is complete. |
| |
| Otherwise, the DWARF expression is ill-formed if the updated operation |
| position is not in the range of the first to last operation inclusive, or |
| not at the start of an operation. |
| |
| 4. ``DW_OP_bra`` |
| |
| ``DW_OP_bra`` is a conditional branch. Its single operand is a 2-byte signed |
| integer constant. This operation pops the top of stack. If the value popped |
| is not the constant 0, the 2-byte constant operand is the number of bytes of |
| the DWARF operation expression to skip forward or backward from the current |
| operation, beginning after the 2-byte constant. |
| |
| If the updated position is at one past the end of the last operation, then |
| the operation expression evaluation is complete. |
| |
| Otherwise, the DWARF expression is ill-formed if the updated operation |
| position is not in the range of the first to last operation inclusive, or |
| not at the start of an operation. |
| |
| 5. ``DW_OP_call2``, ``DW_OP_call4``, ``DW_OP_call_ref`` |
| |
| ``DW_OP_call2``, ``DW_OP_call4``, and ``DW_OP_call_ref`` perform DWARF |
| procedure calls during evaluation of a DWARF operation expression. |
| |
| ``DW_OP_call2`` and ``DW_OP_call4`` have one operand that is, respectively, |
| a 2-byte or 4-byte unsigned offset DR that represents the byte offset of a |
| debugging information entry D relative to the beginning of the current |
| compilation unit. |
| |
| ``DW_OP_call_ref`` has one operand that is a 4-byte unsigned value in the |
| 32-bit DWARF format, or an 8-byte unsigned value in the 64-bit DWARF format, |
| that represents the byte offset DR of a debugging information entry D |
| relative to the beginning of the ``.debug_info`` section that contains the |
| current compilation unit. D may not be in the current compilation unit. |
| |
| .. note:: |
| |
| DWARF Version 5 states that DR can be an offset in a ``.debug_info`` |
| section other than the one that contains the current compilation unit. It |
| states that relocation of references from one executable or shared object |
| file to another must be performed by the consumer. But given that DR is |
| defined as an offset in a ``.debug_info`` section this seems impossible. |
| If DR were defined as an implementation-defined value, then the consumer |
| could choose to interpret the value in an implementation-defined manner to |
| reference debugging information in another executable or shared object. |
| |
| In ELF the ``.debug_info`` section is in a non-\ ``PT_LOAD`` segment so |
| standard dynamic relocations cannot be used. But even if they were loaded |
| segments and dynamic relocations were used, DR would need to be the |
| address of D, not an offset in a ``.debug_info`` section. That would also |
| need DR to be the size of a global address. So it would not be possible to |
| use the 32-bit DWARF format in a 64-bit global address space. In addition, |
| the consumer would need to determine what executable or shared object the |
| relocated address was in so it could determine the containing compilation |
| unit. |
| |
| GDB only interprets DR as an offset in the ``.debug_info`` section that |
| contains the current compilation unit. |
| |
| This comment also applies to ``DW_OP_implicit_pointer`` and |
| ``DW_OP_LLVM_aspace_implicit_pointer``. |
| |
| *Operand interpretation of* ``DW_OP_call2``\ *,* ``DW_OP_call4``\ *, and* |
| ``DW_OP_call_ref`` *is exactly like that for* ``DW_FORM_ref2``\ *,* |
| ``DW_FORM_ref4``\ *, and* ``DW_FORM_ref_addr``\ *, respectively.* |
| |
| The call operation is evaluated by: |
| |
| * If D has a ``DW_AT_location`` attribute that is encoded as an ``exprloc`` |
| that specifies an operation expression E, then execution of the current |
| operation expression continues from the first operation of E. Execution |
| continues until one past the last operation of E is reached, at which |
| point execution continues with the operation following the call operation. |
| The operations of E are evaluated with the same current context, except the |
| current compilation unit is the one that contains D and the stack is the |
| same as that being used by the call operation. After the call operation |
| has been evaluated, the stack is therefore as it is left by the evaluation |
| of the operations of E. Since E is evaluated on the same stack as the call |
| operation, E can use and/or remove entries already on the stack, and can |
| add new entries to the stack. |
| |
| *Values on the stack at the time of the call may be used as parameters by |
| the called expression and values left on the stack by the called expression |
| may be used as return values by prior agreement between the calling and |
| called expressions.* |
| |
| * If D has a ``DW_AT_location`` attribute that is encoded as a ``loclist`` or |
| ``loclistsptr``, then the specified location list expression E is |
| evaluated. The evaluation of E uses the current context, except the result |
| kind is a location description, the compilation unit is the one that |
| contains D, and the initial stack is empty. The location description |
| result is pushed on the stack. |
| |
| .. note:: |
| |
| This rule avoids having to define how to execute a matched location list |
| entry operation expression on the same stack as the call when there are |
| multiple matches. But it allows the call to obtain the location |
| description for a variable or formal parameter which may use a location |
| list expression. |
| |
| An alternative is to treat the case when D has a ``DW_AT_location`` |
| attribute that is encoded as a ``loclist`` or ``loclistsptr``, and the |
| specified location list expression E' matches a single location list |
| entry with operation expression E, the same as the ``exprloc`` case and |
| evaluate on the same stack. |
| |
| But this is not attractive: if the attribute is for a variable whose |
| expression happens to end with a non-singleton stack, it will not simply |
| put a location description on the stack. Presumably the intent of using |
| ``DW_OP_call*`` on a variable or formal parameter debugging information |
| entry is to push just one location description on the stack. That |
| location description may have more than one single location description. |
| |
| The previous rule for ``exprloc`` also has the same problem, as normally |
| a variable or formal parameter location expression may leave multiple |
| entries on the stack and only return the top entry. |
| |
| GDB implements ``DW_OP_call*`` by always executing E on the same stack. |
| If the location list has multiple matching entries, it simply picks the |
| first one and ignores the rest. This seems fundamentally at odds with |
| the desire to support multiple places for variables. |
| |
| So, it feels like ``DW_OP_call*`` should both support pushing a location |
| description on the stack for a variable or formal parameter, and also |
| support being able to execute an operation expression on the same stack. |
| Being able to specify a different operation expression for different |
| program locations seems a desirable feature to retain. |
| |
| A solution to that is to have a distinct ``DW_AT_LLVM_proc`` attribute |
| for the ``DW_TAG_dwarf_procedure`` debugging information entry. Then the |
| ``DW_AT_location`` attribute expression is always executed separately |
| and pushes a location description (that may have multiple single |
| location descriptions), and the ``DW_AT_LLVM_proc`` attribute expression |
| is always executed on the same stack and can leave anything on the |
| stack. |
| |
| The ``DW_AT_LLVM_proc`` attribute could have the new classes |
| ``exprproc``, ``loclistproc``, and ``loclistsptrproc`` to indicate that |
| the expression is executed on the same stack. ``exprproc`` is the same |
| encoding as ``exprloc``. ``loclistproc`` and ``loclistsptrproc`` are the |
| same encoding as their non-\ ``proc`` counterparts, except the DWARF is |
| ill-formed if the location list does not match exactly one location list |
| entry and a default entry is required. These forms indicate explicitly |
| that the matched single operation expression must be executed on the |
| same stack. This is better than ad hoc special rules for ``loclistproc`` |
| and ``loclistsptrproc`` which are currently clearly defined to always |
| return a location description. The producer then explicitly indicates |
| the intent through the attribute classes. |
| |
| Such a change would be a breaking change for how GDB implements |
| ``DW_OP_call*``. However, are the breaking cases actually occurring in |
| practice? GDB could implement the current approach for DWARF Version 5, |
| and the new semantics for DWARF Version 6, as has been done for some |
| other features. |
| |
| Another option is to limit the execution to be on the same stack only to |
| the evaluation of an expression E that is the value of a |
| ``DW_AT_location`` attribute of a ``DW_TAG_dwarf_procedure`` debugging |
| information entry. The DWARF would be ill-formed if E is a location list |
| expression that does not match exactly one location list entry. In all |
| other cases the evaluation of an expression E that is the value of a |
| ``DW_AT_location`` attribute would evaluate E with the current context, |
| except the result kind is a location description, the compilation unit |
| is the one that contains D, and the initial stack is empty. The location |
| description result is pushed on the stack. |
| |
| * If D has a ``DW_AT_const_value`` attribute with a value V, then it is as |
| if a ``DW_OP_implicit_value V`` operation was executed. |
| |
| *This allows a call operation to be used to compute the location |
| description for any variable or formal parameter regardless of whether the |
| producer has optimized it to a constant. This is consistent with the* |
| ``DW_OP_implicit_pointer`` *operation.* |
| |
| .. note:: |
| |
| Alternatively, could deprecate using ``DW_AT_const_value`` for |
| ``DW_TAG_variable`` and ``DW_TAG_formal_parameter`` debugging information |
| entries that are constants and instead use ``DW_AT_location`` with an |
| operation expression that results in a location description with one |
| implicit location description. Then this rule would not be required. |
| |
| * Otherwise, there is no effect and no changes are made to the stack. |
| |
| .. note:: |
| |
| In DWARF Version 5, if D does not have a ``DW_AT_location`` then |
| ``DW_OP_call*`` is defined to have no effect. It is unclear that this is |
| the right definition as a producer should be able to rely on using |
| ``DW_OP_call*`` to get a location description for any non-\ |
| ``DW_TAG_dwarf_procedure`` debugging information entries. Also, the |
| producer should not be creating DWARF with ``DW_OP_call*`` to a |
| ``DW_TAG_dwarf_procedure`` that does not have a ``DW_AT_location`` |
| attribute. So, should this case be defined as an ill-formed DWARF |
| expression? |
| |
| *The* ``DW_TAG_dwarf_procedure`` *debugging information entry can be used to |
| define DWARF procedures that can be called.* |
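| |
| *The branch target rules for* ``DW_OP_skip`` *and* ``DW_OP_bra`` *above can be |
| sketched in Python as follows. The helper names are hypothetical, and |
| little-endian encoding of the 2-byte operand is an assumption made for the |
| example.* |
| |
```python
class IllFormed(Exception):
    """Raised when the DWARF expression is ill-formed."""


def branch_offset(expr: bytes, op_pos: int, byteorder: str = "little") -> int:
    """Decode the signed 2-byte operand that follows the opcode at op_pos."""
    return int.from_bytes(expr[op_pos + 1:op_pos + 3], byteorder, signed=True)


def branch_target(expr: bytes, op_pos: int, op_starts: set) -> int:
    """Return the position reached by DW_OP_skip, or by a taken DW_OP_bra."""
    # The offset is relative to the byte after the 2-byte constant:
    # 1 opcode byte plus 2 operand bytes.
    target = op_pos + 3 + branch_offset(expr, op_pos)
    if target == len(expr):
        return target  # One past the last operation: evaluation is complete.
    if target not in op_starts:
        raise IllFormed(f"branch to {target} is not the start of an operation")
    return target


# DW_OP_lit1; DW_OP_bra +1; DW_OP_nop; DW_OP_lit2
expr = bytes([0x31, 0x28, 0x01, 0x00, 0x96, 0x32])
op_starts = {0, 1, 4, 5}  # Byte positions at which an operation starts.
target = branch_target(expr, 1, op_starts)
```
| |
| *Here the taken branch skips the* ``DW_OP_nop`` *and continues at the* |
| ``DW_OP_lit2`` *operation. A target that lands inside the 2-byte operand, or |
| outside the expression, is rejected as ill-formed.* |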
| |
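| *The difference between calling an* ``exprloc`` *and calling a location list |
| can be sketched in Python as follows. Operations are modeled as hypothetical |
| Python callables acting on a list, and integers and strings stand in for |
| values and location descriptions; none of these names come from DWARF.* |
| |
```python
def evaluate(operations, stack):
    """Evaluate each operation in order on the given stack."""
    for operation in operations:
        operation(stack)
    return stack


def push(entry):
    """Build an operation that pushes the given entry."""
    return lambda stack: stack.append(entry)


def add(stack):
    """Pop two entries and push their sum."""
    right, left = stack.pop(), stack.pop()
    stack.append(left + right)


def call_exprloc(callee_operations):
    """Call where D has an exprloc: E runs on the caller's own stack."""
    return lambda stack: evaluate(callee_operations, stack)


def call_loclist(evaluate_location_list):
    """Call where D has a location list: E is evaluated separately with an
    empty initial stack, and its location description result is pushed."""
    return lambda stack: stack.append(evaluate_location_list([]))


# A callee that adds 10 to whatever the caller left on top of the stack.
add_ten = [push(10), add]
same_stack = evaluate([push(32), call_exprloc(add_ten)], [])

# A location list evaluation contributes exactly one entry to the caller.
separate = evaluate([push(1), call_loclist(lambda initial: "location of x")], [])
```
| |
| *In the first case the callee consumes the caller's entry as a parameter and |
| leaves its result in place. In the second case the caller's entries are |
| untouched and a single location description is pushed.* |
| |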
| .. _amdgpu-dwarf-value-operations: |
| |
| A.2.5.4.3 Value Operations |
| ########################## |
| |
| This section describes the operations that push values on the stack. |
| |
| Each value stack entry has a type and a literal value. It can represent a |
| literal value of any supported base type of the target architecture. The base |
| type specifies the size, encoding, and endianity of the literal value. |
| |
| The base type of value stack entries can be the distinguished generic type. |
| |
| .. _amdgpu-dwarf-literal-operations: |
| |
| A.2.5.4.3.1 Literal Operations |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| .. note:: |
| |
| This section replaces DWARF Version 5 section 2.5.1.1. |
| |
| The following operations all push a literal value onto the DWARF stack. |
| |
| Operations other than ``DW_OP_const_type`` push a value V with the generic type. |
| If V is larger than the generic type, then V is truncated to the generic type |
| size and the low-order bits used. |
| |
| 1. ``DW_OP_lit0``, ``DW_OP_lit1``, ..., ``DW_OP_lit31`` |
| |
| ``DW_OP_lit<N>`` operations encode an unsigned literal value N from 0 |
| through 31, inclusive. They push the value N with the generic type. |
| |
| 2. ``DW_OP_const1u``, ``DW_OP_const2u``, ``DW_OP_const4u``, ``DW_OP_const8u`` |
| |
| ``DW_OP_const<N>u`` operations have a single operand that is a 1, 2, 4, or |
| 8-byte unsigned integer constant U, respectively. They push the value U with |
| the generic type. |
| |
| 3. ``DW_OP_const1s``, ``DW_OP_const2s``, ``DW_OP_const4s``, ``DW_OP_const8s`` |
| |
| ``DW_OP_const<N>s`` operations have a single operand that is a 1, 2, 4, or |
| 8-byte signed integer constant S, respectively. They push the value S with |
| the generic type. |
| |
| 4. ``DW_OP_constu`` |
| |
| ``DW_OP_constu`` has a single unsigned LEB128 integer operand N. It pushes |
| the value N with the generic type. |
| |
| 5. ``DW_OP_consts`` |
| |
| ``DW_OP_consts`` has a single signed LEB128 integer operand N. It pushes the |
| value N with the generic type. |
| |
| 6. ``DW_OP_constx`` |
| |
| ``DW_OP_constx`` has a single unsigned LEB128 integer operand that |
| represents a zero-based index into the ``.debug_addr`` section relative to |
| the value of the ``DW_AT_addr_base`` attribute of the associated compilation |
| unit. The value N in the ``.debug_addr`` section has the size of the generic |
| type. It pushes the value N with the generic type. |
| |
| *The* ``DW_OP_constx`` *operation is provided for constants that require |
| link-time relocation but should not be interpreted by the consumer as a |
| relocatable address (for example, offsets to thread-local storage).* |
| |
| 7. ``DW_OP_const_type`` |
| |
| ``DW_OP_const_type`` has three operands. The first is an unsigned LEB128 |
| integer DR that represents the byte offset of a debugging information entry |
| D relative to the beginning of the current compilation unit, that provides |
| the type T of the constant value. The second is a 1-byte unsigned integral |
| constant S. The third is a block of bytes B, with a length equal to S. |
| |
| TS is the bit size of the type T. The least significant TS bits of B are |
| interpreted as a value V of the type T. It pushes the value V with the type |
| T. |
| |
| The DWARF is ill-formed if D is not a ``DW_TAG_base_type`` debugging |
| information entry in the current compilation unit, or if TS divided by 8 |
| (the byte size) and rounded up to a whole number is not equal to S. |
| |
| *While the size of the byte block B can be inferred from the type D |
| definition, it is encoded explicitly into the operation so that the |
| operation can be parsed easily without reference to the* ``.debug_info`` |
| *section.* |
| |
| 8. ``DW_OP_LLVM_push_lane`` *New* |
| |
| ``DW_OP_LLVM_push_lane`` pushes the current lane as a value with the generic |
| type. |
| |
| *For source languages that are implemented using a SIMT execution model, |
| this is the zero-based lane number that corresponds to the source language |
| thread of execution upon which the user is focused.* |
| |
| The value must be greater than or equal to 0 and less than the value of the |
| ``DW_AT_LLVM_lanes`` attribute, otherwise the DWARF expression is |
| ill-formed. See :ref:`amdgpu-dwarf-low-level-information`. |
| |
| 9. ``DW_OP_LLVM_push_iteration`` *New* |
| |
| ``DW_OP_LLVM_push_iteration`` pushes the current iteration as a value with |
| the generic type. |
| |
| *For source language implementations with optimizations that cause multiple |
| loop iterations to execute concurrently, this is the zero-based iteration |
| number that corresponds to the source language concurrent loop iteration |
| upon which the user is focused.* |
| |
| The value must be greater than or equal to 0 and less than the value of the |
| ``DW_AT_LLVM_iterations`` attribute, otherwise the DWARF expression is |
| ill-formed. See :ref:`amdgpu-dwarf-low-level-information`. |
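| |
| *The truncation and size rules above can be sketched in Python as follows. The |
| function names, the 64-bit generic type, the little-endian byte order, and the |
| 12-bit type used in the example are all assumptions made for illustration.* |
| |
```python
GENERIC_BITS = 64  # Assumption: the generic type of the target is 64 bits.


def push_generic(stack, v, generic_bits=GENERIC_BITS):
    """Push V with the generic type, keeping only the low-order bits."""
    stack.append(("generic", v & ((1 << generic_bits) - 1)))


def const_type(stack, type_name, ts, s, block, byteorder="little"):
    """Model DW_OP_const_type for a base type of TS bits and an S-byte block B."""
    if len(block) != s or (ts + 7) // 8 != s:
        raise ValueError("ill-formed: S must equal TS / 8 rounded up")
    raw = int.from_bytes(block, byteorder)
    # Keep the least significant TS bits; decoding per the type's encoding
    # is omitted from this sketch.
    stack.append((type_name, raw & ((1 << ts) - 1)))


def push_lane(stack, lane, lanes):
    """Model DW_OP_LLVM_push_lane given the DW_AT_LLVM_lanes value."""
    if not 0 <= lane < lanes:
        raise ValueError("ill-formed: lane is out of range")
    push_generic(stack, lane)


stack = []
push_generic(stack, -1)  # As an operation such as DW_OP_consts -1 might push.
const_type(stack, "uint12", 12, 2, b"\xff\xff")
push_lane(stack, 3, lanes=64)
```
| |
| *The same range check as for the lane applies to the current iteration and the |
| value of the* ``DW_AT_LLVM_iterations`` *attribute.* |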
| |
| .. _amdgpu-dwarf-arithmetic-logical-operations: |
| |
| A.2.5.4.3.2 Arithmetic and Logical Operations |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| .. note:: |
| |
| This section is the same as DWARF Version 5 section 2.5.1.4. |
| |
| .. _amdgpu-dwarf-type-conversions-operations: |
| |
| A.2.5.4.3.3 Type Conversion Operations |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| .. note:: |
| |
| This section is the same as DWARF Version 5 section 2.5.1.6. |
| |
| .. _amdgpu-dwarf-general-operations: |
| |
| A.2.5.4.3.4 Special Value Operations |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| .. note:: |
| |
| This section replaces parts of DWARF Version 5 sections 2.5.1.2, 2.5.1.3, and |
| 2.5.1.7. |
| |
| There are these special value operations currently defined: |
| |
| 1. ``DW_OP_regval_type`` |
| |
| ``DW_OP_regval_type`` has two operands. The first is an unsigned LEB128 |
| integer that represents a register number R. The second is an unsigned |
| LEB128 integer DR that represents the byte offset of a debugging information |
| entry D relative to the beginning of the current compilation unit, that |
| provides the type T of the register value. |
| |
| The operation is equivalent to performing ``DW_OP_regx R; DW_OP_deref_type |
| DR``. |
| |
| .. note:: |
| |
| Should DWARF allow the type T to be a larger size than the size of the |
| register R? Restricting a larger bit size avoids any issue of conversion |
| as the, possibly truncated, bit contents of the register is simply |
| interpreted as a value of T. If a conversion is wanted it can be done |
| explicitly using a ``DW_OP_convert`` operation. |
| |
| GDB has a per register hook that allows a target specific conversion on a |
| register by register basis. It defaults to truncation of bigger registers. |
| Removing use of the target hook does not cause any test failures in common |
| architectures. If the compiler for a target architecture did want some |
| form of conversion, including a larger result type, it could always |
| explicitly use the ``DW_OP_convert`` operation. |
| |
| If T is a larger type than the register size, then the default GDB |
| register hook reads bytes from the next register (or reads out of bounds |
| for the last register!). Removing use of the target hook does not cause |
| any test failures in common architectures (except an illegal hand written |
| assembly test). If a target architecture requires this behavior, these |
| extensions allow a composite location description to be used to combine |
| multiple registers. |
| |
| 2. ``DW_OP_deref`` |
| |
| S is the bit size of the generic type divided by 8 (the byte size) and |
| rounded up to a whole number. DR is the offset of a hypothetical debug |
| information entry D in the current compilation unit for a base type of the |
| generic type. |
| |
| The operation is equivalent to performing ``DW_OP_deref_type S, DR``. |
| |
| 3. ``DW_OP_deref_size`` |
| |
| ``DW_OP_deref_size`` has a single 1-byte unsigned integral constant that |
| represents a byte result size S. |
| |
| TS is the smaller of the generic type bit size and S scaled by 8 (the byte |
| size). If TS is smaller than the generic type bit size then T is an unsigned |
| integral type of bit size TS, otherwise T is the generic type. DR is the |
| offset of a hypothetical debug information entry D in the current |
| compilation unit for a base type T. |
| |
| .. note:: |
| |
| Truncating the value when S is larger than the generic type matches what |
| GDB does. This allows the generic type size to not be an integral byte |
| size. It does allow S to be arbitrarily large. Should S be restricted to |
| the size of the generic type rounded up to a multiple of 8? |
| |
| The operation is equivalent to performing ``DW_OP_deref_type S, DR``, except |
| if T is not the generic type, the value V pushed is zero-extended to the |
| generic type bit size and its type changed to the generic type. |
| |
| 4. ``DW_OP_deref_type`` |
| |
| ``DW_OP_deref_type`` has two operands. The first is a 1-byte unsigned |
| integral constant S. The second is an unsigned LEB128 integer DR that |
| represents the byte offset of a debugging information entry D relative to |
| the beginning of the current compilation unit, that provides the type T of |
| the result value. |
| |
| TS is the bit size of the type T. |
| |
| *While the size of the pushed value V can be inferred from the type T, it is |
| encoded explicitly as the operand S so that the operation can be parsed |
| easily without reference to the* ``.debug_info`` *section.* |
| |
| .. note:: |
| |
| It is unclear why the operand S is needed. Unlike ``DW_OP_const_type``, |
| the size is not needed for parsing. Any evaluation needs to get the base |
| type T to push with the value to know its encoding and bit size. |
| |
| It pops one stack entry that must be a location description L. |
| |
| A value V of TS bits is retrieved from the location storage LS specified by |
| one of the single location descriptions SL of L. |
| |
| *If L, or the location description of any composite location description |
| part that is a subcomponent of L, has more than one single location |
| description, then any one of them can be selected as they are required to |
| all have the same value. For any single location description SL, bits are |
| retrieved from the associated storage location starting at the bit offset |
| specified by SL. For a composite location description, the retrieved bits |
| are the concatenation of the N bits from each composite location part PL, |
| where N is limited to the size of PL.* |
| |
| V is pushed on the stack with the type T. |
| |
| .. note:: |
| |
| This definition makes it an evaluation error if L is a register location |
| description that has less than TS bits remaining in the register storage. |
| Particularly since these extensions extend location descriptions to have |
| a bit offset, it would be odd to define this as performing sign extension |
| based on the type, or be target architecture dependent, as the number of |
| remaining bits could be any number. This matches the GDB implementation |
| for ``DW_OP_deref_type``. |
| |
| These extensions define ``DW_OP_*breg*`` in terms of |
| ``DW_OP_regval_type``. ``DW_OP_regval_type`` is defined in terms of |
| ``DW_OP_regx``, which uses a 0 bit offset, and ``DW_OP_deref_type``. |
| Therefore, it requires the register size to be greater than or equal to the
| address size of the address space. This matches the GDB implementation for |
| ``DW_OP_*breg*``. |
| |
| The DWARF is ill-formed if D is not in the current compilation unit, D is |
| not a ``DW_TAG_base_type`` debugging information entry, or if TS divided by |
| 8 (the byte size) and rounded up to a whole number is not equal to S. |
| |
| .. note:: |
| |
| This definition allows the base type to be a bit size since there seems to be
| no reason to restrict it.
| |
| It is an evaluation error if any bit of the value is retrieved from the |
| undefined location storage or the offset of any bit exceeds the size of the |
| location storage LS specified by any single location description SL of L. |
| |
| See :ref:`amdgpu-dwarf-implicit-location-description-operations` for special |
| rules concerning implicit location descriptions created by the |
| ``DW_OP_implicit_pointer`` and ``DW_OP_LLVM_aspace_implicit_pointer`` |
| operations. |
| |
| 5. ``DW_OP_xderef`` *Deprecated* |
| |
| ``DW_OP_xderef`` pops two stack entries. The first must be an integral type |
| value that represents an address A. The second must be an integral type |
| value that represents a target architecture specific address space |
| identifier AS. |
| |
| The operation is equivalent to performing ``DW_OP_swap; |
| DW_OP_LLVM_form_aspace_address; DW_OP_deref``. The value V retrieved is left |
| on the stack with the generic type. |
| |
| *This operation is deprecated as the* ``DW_OP_LLVM_form_aspace_address`` |
| *operation can be used and provides greater expressiveness.* |
| |
| 6. ``DW_OP_xderef_size`` *Deprecated* |
| |
| ``DW_OP_xderef_size`` has a single 1-byte unsigned integral constant that |
| represents a byte result size S. |
| |
| It pops two stack entries. The first must be an integral type value that |
| represents an address A. The second must be an integral type value that |
| represents a target architecture specific address space identifier AS. |
| |
| The operation is equivalent to performing ``DW_OP_swap; |
| DW_OP_LLVM_form_aspace_address; DW_OP_deref_size S``. The zero-extended |
| value V retrieved is left on the stack with the generic type. |
| |
| *This operation is deprecated as the* ``DW_OP_LLVM_form_aspace_address`` |
| *operation can be used and provides greater expressiveness.* |
| |
| 7. ``DW_OP_xderef_type`` *Deprecated* |
| |
| ``DW_OP_xderef_type`` has two operands. The first is a 1-byte unsigned |
| integral constant S. The second operand is an unsigned LEB128 integer DR |
| that represents the byte offset of a debugging information entry D relative |
| to the beginning of the current compilation unit, that provides the type T |
| of the result value. |
| |
| It pops two stack entries. The first must be an integral type value that |
| represents an address A. The second must be an integral type value that |
| represents a target architecture specific address space identifier AS. |
| |
| The operation is equivalent to performing ``DW_OP_swap; |
| DW_OP_LLVM_form_aspace_address; DW_OP_deref_type S DR``. The value V |
| retrieved is left on the stack with the type T. |
| |
| *This operation is deprecated as the* ``DW_OP_LLVM_form_aspace_address`` |
| *operation can be used and provides greater expressiveness.* |
| |
| 8. ``DW_OP_entry_value`` *Deprecated* |
| |
| ``DW_OP_entry_value`` pushes the value of an expression that is evaluated in |
| the context of the calling frame. |
| |
| *It may be used to determine the value of arguments on entry to the current |
| call frame provided they are not clobbered.* |
| |
| It has two operands. The first is an unsigned LEB128 integer S. The second |
| is a block of bytes, with a length equal to S, interpreted as a DWARF
| operation expression E. |
| |
| E is evaluated with the current context, except the result kind is |
| unspecified, the call frame is the one that called the current frame, the |
| program location is the call site in the calling frame, the object is |
| unspecified, and the initial stack is empty. The calling frame information |
| is obtained by virtually unwinding the current call frame using the call |
| frame information (see :ref:`amdgpu-dwarf-call-frame-information`). |
| |
| If the result of E is a location description L (see |
| :ref:`amdgpu-dwarf-register-location-description-operations`), and the last |
| operation executed by E is a ``DW_OP_reg*`` for register R with a target |
| architecture specific base type of T, then the contents of the register are |
| retrieved as if a ``DW_OP_deref_type DR`` operation was performed where DR |
| is the offset of a hypothetical debug information entry in the current |
| compilation unit for T. The resulting value V is pushed on the stack.
| |
| *Using* ``DW_OP_reg*`` *provides a more compact form for the case where the |
| value was in a register on entry to the subprogram.* |
| |
| .. note:: |
| |
| It is unclear how this provides a more compact expression, as |
| ``DW_OP_regval_type`` could be used which is marginally larger. |
| |
| If the result of E is a value V, then V is pushed on the stack. |
| |
| Otherwise, the DWARF expression is ill-formed. |
| |
| *The* ``DW_OP_entry_value`` *operation is deprecated as its main usage is |
| provided by other means. DWARF Version 5 added the* |
| ``DW_TAG_call_site_parameter`` *debugging information entry for call sites
| that has* ``DW_AT_call_value``\ *,* ``DW_AT_call_data_location``\ *, and* |
| ``DW_AT_call_data_value`` *attributes that provide DWARF expressions to |
| compute actual parameter values at the time of the call, and requires the |
| producer to ensure the expressions are valid to evaluate even when virtually |
| unwound. The* ``DW_OP_LLVM_call_frame_entry_reg`` *operation provides access |
| to registers in the virtually unwound calling frame.* |
| |
| .. note:: |
| |
| GDB only implements ``DW_OP_entry_value`` when E is exactly |
| ``DW_OP_reg*`` or ``DW_OP_breg*; DW_OP_deref*``. |
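|
| The well-formedness rule given earlier in this list for ``DW_OP_deref_type``
| (S must equal TS divided by 8 and rounded up) is a ceiling division. The
| following is a minimal Python sketch of that check only. The function names
| are hypothetical and are not part of any DWARF consumer API.
|
```python
def byte_size_for_bits(ts_bits: int) -> int:
    """Return TS divided by 8 (the byte size), rounded up to a whole number."""
    if ts_bits < 0:
        raise ValueError("type bit size cannot be negative")
    return (ts_bits + 7) // 8


def deref_type_operands_well_formed(s: int, ts_bits: int) -> bool:
    """Check that the S operand matches the byte size implied by type T.

    s       -- the 1-byte unsigned operand S of DW_OP_deref_type
    ts_bits -- TS, the bit size of the base type T
    """
    return s == byte_size_for_bits(ts_bits)
```
|
| For example, a 33-bit base type requires S to be 5, so an S of 4 would make
| the DWARF ill-formed under this rule.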
| |
| .. _amdgpu-dwarf-location-description-operations: |
| |
| A.2.5.4.4 Location Description Operations |
| ######################################### |
| |
| This section describes the operations that push location descriptions on the |
| stack. |
| |
| .. _amdgpu-dwarf-general-location-description-operations: |
| |
| A.2.5.4.4.1 General Location Description Operations |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| .. note:: |
| |
| This section replaces part of DWARF Version 5 section 2.5.1.3. |
| |
| 1. ``DW_OP_LLVM_offset`` *New* |
| |
| ``DW_OP_LLVM_offset`` pops two stack entries. The first must be an integral |
| type value that represents a byte displacement B. The second must be a |
| location description L. |
| |
| It adds the value of B scaled by 8 (the byte size) to the bit offset of each |
| single location description SL of L, and pushes the updated L. |
| |
| It is an evaluation error if the updated bit offset of any SL is less than 0 |
| or greater than or equal to the size of the location storage specified by |
| SL. |
| |
| 2. ``DW_OP_LLVM_offset_uconst`` *New* |
| |
| ``DW_OP_LLVM_offset_uconst`` has a single unsigned LEB128 integer operand |
| that represents a byte displacement B. |
| |
| The operation is equivalent to performing ``DW_OP_constu B; |
| DW_OP_LLVM_offset``. |
| |
| *This operation is supplied specifically to be able to encode more field |
| displacements in two bytes than can be done with* ``DW_OP_lit*; |
| DW_OP_LLVM_offset``\ *.* |
| |
| .. note:: |
| |
| Should this be named ``DW_OP_LLVM_offset_uconst`` to match |
| ``DW_OP_plus_uconst``, or ``DW_OP_LLVM_offset_constu`` to match |
| ``DW_OP_constu``? |
| |
| 3. ``DW_OP_LLVM_bit_offset`` *New* |
| |
| ``DW_OP_LLVM_bit_offset`` pops two stack entries. The first must be an |
| integral type value that represents a bit displacement B. The second must be |
| a location description L. |
| |
| It adds the value of B to the bit offset of each single location description |
| SL of L, and pushes the updated L. |
| |
| It is an evaluation error if the updated bit offset of any SL is less than 0 |
| or greater than or equal to the size of the location storage specified by |
| SL. |
| |
| 4. ``DW_OP_push_object_address`` |
| |
| ``DW_OP_push_object_address`` pushes the location description L of the |
| current object. |
| |
| *This object may correspond to an independent variable that is part of a |
| user presented expression that is being evaluated. The object location |
| description may be determined from the variable's own debugging information |
| entry or it may be a component of an array, structure, or class whose |
| address has been dynamically determined by an earlier step during user |
| expression evaluation.* |
| |
| *This operation provides explicit functionality (especially for arrays |
| involving descriptors) that is analogous to the implicit push of the base |
| location description of a structure prior to evaluation of a* |
| ``DW_AT_data_member_location`` *to access a data member of a structure.* |
| |
| .. note:: |
| |
| This operation could be removed and the object location description |
| specified as the initial stack as for ``DW_AT_data_member_location``. |
| |
| Or this operation could be used instead of needing to specify an initial |
| stack. The latter approach is more composable as access to the object may |
| be needed at any point of the expression, and passing it as the initial |
| stack requires the entire expression to be aware where on the stack it is. |
| If this were done, ``DW_AT_use_location`` would require a |
| ``DW_OP_push_object2_address`` operation for the second object. |
| |
| Or a more general mechanism could pass in an arbitrary number of arguments,
| with an operation such as ``DW_OP_arg N`` to get the Nth one. A vector of
| arguments would then be passed in the expression context rather than an
| initial stack. This could also resolve the issues with ``DW_OP_call*`` by
| allowing a specific number of arguments passed in and returned to be
| specified. The ``DW_OP_call*`` operation could then always execute on a
| separate stack: the number of arguments would be specified in a new call
| operation and taken from the caller's stack, and similarly the number of
| return results specified and copied from the called stack back to the
| caller's stack when the called expression was complete.
| |
| The only attribute that specifies a current object is |
| ``DW_AT_data_location`` so the non-normative text seems to overstate how |
| this is being used. Or are there other attributes that need to state they |
| pass an object? |
| |
| 5. ``DW_OP_LLVM_call_frame_entry_reg`` *New* |
| |
| ``DW_OP_LLVM_call_frame_entry_reg`` has a single unsigned LEB128 integer |
| operand that represents a target architecture register number R. |
| |
| It pushes a location description L that holds the value of register R on |
| entry to the current subprogram as defined by the call frame information |
| (see :ref:`amdgpu-dwarf-call-frame-information`). |
| |
| *If there is no call frame information defined, then the default rules for |
| the target architecture are used. If the register rule is* undefined\ *, then |
| the undefined location description is pushed. If the register rule is* same |
| value\ *, then a register location description for R is pushed.* |
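|
| The bit offset update performed by ``DW_OP_LLVM_offset`` and
| ``DW_OP_LLVM_bit_offset`` can be sketched as follows. This is an illustrative
| Python model only: the ``SingleLocation`` class, the ``EvaluationError``
| exception, and the function names are assumptions made for the example, and a
| location description is modelled simply as a list of single location
| descriptions that each record their storage size and bit offset. Undefined
| location descriptions, which the operations leave unchanged, are not modelled.
|
```python
from dataclasses import dataclass, replace
from typing import List


class EvaluationError(Exception):
    """Raised where the text calls for an evaluation error."""


@dataclass(frozen=True)
class SingleLocation:
    storage_bits: int  # size of the location storage, in bits
    bit_offset: int    # bit offset into that storage


def apply_bit_offset(loc: List[SingleLocation], b_bits: int) -> List[SingleLocation]:
    """Model DW_OP_LLVM_bit_offset: add B bits to every SL of L."""
    updated = []
    for sl in loc:
        new_offset = sl.bit_offset + b_bits
        # The updated offset must stay inside the location storage.
        if new_offset < 0 or new_offset >= sl.storage_bits:
            raise EvaluationError("bit offset outside the location storage")
        updated.append(replace(sl, bit_offset=new_offset))
    return updated


def apply_byte_offset(loc: List[SingleLocation], b_bytes: int) -> List[SingleLocation]:
    """Model DW_OP_LLVM_offset: B is a byte displacement, scaled by 8."""
    return apply_bit_offset(loc, b_bytes * 8)
```
|
| With a 64-bit storage, a byte displacement of 2 gives a bit offset of 16,
| while a byte displacement of 8 is rejected because the resulting offset
| equals the storage size.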
| |
| .. _amdgpu-dwarf-undefined-location-description-operations: |
| |
| A.2.5.4.4.2 Undefined Location Description Operations |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| .. note:: |
| |
| This section replaces DWARF Version 5 section 2.6.1.1.1. |
| |
| *The undefined location storage represents a piece or all of an object that is |
| present in the source but not in the object code (perhaps due to optimization). |
| Neither reading nor writing to the undefined location storage is meaningful.* |
| |
| An undefined location description specifies the undefined location storage. |
| There is no concept of the size of the undefined location storage, nor of a bit |
| offset for an undefined location description. The ``DW_OP_LLVM_*offset`` |
| operations leave an undefined location description unchanged. The |
| ``DW_OP_*piece`` operations can explicitly or implicitly specify an undefined |
| location description, allowing any size and offset to be specified, and results |
| in a part with all undefined bits. |
| |
| 1. ``DW_OP_LLVM_undefined`` *New* |
| |
| ``DW_OP_LLVM_undefined`` pushes a location description L that comprises one |
| undefined location description SL. |
| |
| .. _amdgpu-dwarf-memory-location-description-operations: |
| |
| A.2.5.4.4.3 Memory Location Description Operations |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| .. note:: |
| |
| This section replaces parts of DWARF Version 5 section 2.5.1.1, 2.5.1.2, |
| 2.5.1.3, and 2.6.1.1.2. |
| |
| Each of the target architecture specific address spaces has a corresponding |
| memory location storage that denotes the linear addressable memory of that |
| address space. The size of each memory location storage corresponds to the range |
| of the addresses in the corresponding address space. |
| |
| *It is target architecture defined how address space location storage maps to |
| target architecture physical memory. For example, they may be independent |
| memory, or more than one location storage may alias the same physical memory |
| possibly at different offsets and with different interleaving. The mapping may |
| also be dictated by the source language address classes.* |
| |
| A memory location description specifies a memory location storage. The bit |
| offset corresponds to a bit position within a byte of the memory. Bits accessed
| using a memory location description access the corresponding target
| architecture memory starting at the bit position within the byte specified by |
| the bit offset. |
| |
| A memory location description that has a bit offset that is a multiple of 8 (the |
| byte size) is defined to be a byte address memory location description. It has a |
| memory byte address A that is equal to the bit offset divided by 8. |
| |
| A memory location description that does not have a bit offset that is a multiple |
| of 8 (the byte size) is defined to be a bit field memory location description. |
| It has a bit position B equal to the bit offset modulo 8, and a memory byte |
| address A equal to the bit offset minus B that is then divided by 8. |
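|
| The two definitions above reduce to a division with remainder. A minimal
| Python sketch, in which the function names are hypothetical:
|
```python
from typing import Tuple


def split_bit_offset(bit_offset: int) -> Tuple[int, int]:
    """Return (A, B): the memory byte address and the bit position in that byte."""
    b = bit_offset % 8            # bit position B within the byte
    a = (bit_offset - b) // 8     # memory byte address A
    return a, b


def is_byte_address(bit_offset: int) -> bool:
    """True for a byte address memory location description (B is 0)."""
    return bit_offset % 8 == 0
```
|
| A bit offset of 35 therefore denotes bit position 3 of the byte at address 4,
| which is a bit field memory location description.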
| |
| The address space AS of a memory location description is defined to be the |
| address space that corresponds to the memory location storage associated with |
| the memory location description. |
| |
| A location description that comprises one byte address memory location
| description SL is defined to be a memory byte address location description. It |
| has a byte address equal to A and an address space equal to AS of the |
| corresponding SL. |
| |
| ``DW_ASPACE_LLVM_none`` is defined as the target architecture default address |
| space. See :ref:`amdgpu-dwarf-address-spaces`. |
| |
| If a stack entry is required to be a location description, but it is a value V |
| with the generic type, then it is implicitly converted to a location description |
| L with one memory location description SL. SL specifies the memory location |
| storage that corresponds to the target architecture default address space with a |
| bit offset equal to V scaled by 8 (the byte size). |
| |
| .. note:: |
| |
| To allow any integral type value to be implicitly converted to a memory
| location description in the target architecture default address space, the
| rule could instead be:
| |
| If a stack entry is required to be a location description, but is a value V |
| with an integral type, then it is implicitly converted to a location |
| description L with one memory location description SL. If the type size of
| V is less than the generic type size, then the value V is zero extended to |
| the size of the generic type. The least significant generic type size bits |
| are treated as an unsigned value to be used as an address A. SL specifies |
| memory location storage corresponding to the target architecture default |
| address space with a bit offset equal to A scaled by 8 (the byte size). |
| |
| The implicit conversion could also be defined as target architecture specific. |
| For example, GDB checks if V is an integral type. If it is not, it gives an
| error. Otherwise, GDB zero-extends V to 64 bits. If the GDB target defines a |
| hook function, then it is called. The target specific hook function can modify |
| the 64-bit value, possibly sign extending based on the original value type. |
| Finally, GDB treats the 64-bit value V as a memory location address. |
| |
| If a stack entry is required to be a location description, but it is an implicit |
| pointer value IPV with the target architecture default address space, then it is |
| implicitly converted to a location description with one single location |
| description specified by IPV. See |
| :ref:`amdgpu-dwarf-implicit-location-description-operations`. |
| |
| .. note:: |
| |
| Is this rule required for DWARF Version 5 backwards compatibility? If not, it |
| can be eliminated, and the producer can use |
| ``DW_OP_LLVM_form_aspace_address``. |
| |
| If a stack entry is required to be a value, but it is a location description L |
| with one memory location description SL in the target architecture default |
| address space with a bit offset B that is a multiple of 8, then it is implicitly |
| converted to a value equal to B divided by 8 (the byte size) with the generic |
| type. |
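|
| The implicit conversions between a generic type value and a memory location
| description in the default address space are inverses of each other. The
| Python sketch below illustrates this under simplifying assumptions: the
| ``MemoryLocation`` class and the function names are hypothetical, the address
| space is represented by a string, and only the generic type value case (not
| the implicit pointer value case) is modelled.
|
```python
from dataclasses import dataclass

# Name taken from the text; representing it as a string is an assumption.
DEFAULT_ASPACE = "DW_ASPACE_LLVM_none"


@dataclass(frozen=True)
class MemoryLocation:
    address_space: str
    bit_offset: int


def value_to_location(v: int) -> MemoryLocation:
    """Generic type value V -> memory location with bit offset V scaled by 8."""
    return MemoryLocation(DEFAULT_ASPACE, v * 8)


def location_to_value(sl: MemoryLocation) -> int:
    """Byte-aligned default address space location -> value B divided by 8."""
    if sl.address_space != DEFAULT_ASPACE or sl.bit_offset % 8 != 0:
        raise ValueError("no implicit conversion to a value is defined")
    return sl.bit_offset // 8
```
|
| Converting the value 0x1000 to a location description and back yields 0x1000
| again, whereas a location with a bit offset that is not a multiple of 8 has
| no implicit conversion to a value.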
| |
| 1. ``DW_OP_addr`` |
| |
| ``DW_OP_addr`` has a single byte constant value operand, which has the size |
| of the generic type, that represents an address A. |
| |
| It pushes a location description L with one memory location description SL |
| on the stack. SL specifies the memory location storage corresponding to the |
| target architecture default address space with a bit offset equal to A |
| scaled by 8 (the byte size). |
| |
| *If the DWARF is part of a code object, then A may need to be relocated. For |
| example, in the ELF code object format, A must be adjusted by the difference |
| between the ELF segment virtual address and the virtual address at which the |
| segment is loaded.* |
| |
| 2. ``DW_OP_addrx`` |
| |
| ``DW_OP_addrx`` has a single unsigned LEB128 integer operand that represents |
| a zero-based index into the ``.debug_addr`` section relative to the value of |
| the ``DW_AT_addr_base`` attribute of the associated compilation unit. The |
| address value A in the ``.debug_addr`` section has the size of the generic |
| type. |
| |
| It pushes a location description L with one memory location description SL |
| on the stack. SL specifies the memory location storage corresponding to the |
| target architecture default address space with a bit offset equal to A |
| scaled by 8 (the byte size). |
| |
| *If the DWARF is part of a code object, then A may need to be relocated. For |
| example, in the ELF code object format, A must be adjusted by the difference |
| between the ELF segment virtual address and the virtual address at which the |
| segment is loaded.* |
| |
| 3. ``DW_OP_LLVM_form_aspace_address`` *New* |
| |
| ``DW_OP_LLVM_form_aspace_address`` pops the top two stack entries. The first
| must be an integral type value that represents a target architecture |
| specific address space identifier AS. The second must be an integral type |
| value that represents an address A. |
| |
| The address size S is defined as the address bit size of the target |
| architecture specific address space that corresponds to AS. |
| |
| A is adjusted to S bits by zero extending if necessary, and then treating |
| the least significant S bits as an unsigned value A'. |
| |
| It pushes a location description L with one memory location description SL |
| on the stack. SL specifies the memory location storage LS that corresponds |
| to AS with a bit offset equal to A' scaled by 8 (the byte size). |
| |
| If AS is an address space that is specific to context elements, then LS |
| corresponds to the location storage associated with the current context. |
| |
| *For example, if AS is for per thread storage then LS is the location |
| storage for the current thread. For languages that are implemented using a |
| SIMT execution model, if AS is for per lane storage then LS is the
| location storage for the current lane of the current thread. Therefore, if L |
| is accessed by an operation, the location storage selected when the location |
| description was created is accessed, and not the location storage associated |
| with the current context of the access operation.* |
| |
| The DWARF expression is ill-formed if AS is not one of the values defined by |
| the target architecture specific ``DW_ASPACE_LLVM_*`` values. |
| |
| See :ref:`amdgpu-dwarf-implicit-location-description-operations` for special |
| rules concerning implicit pointer values produced by dereferencing implicit |
| location descriptions created by the ``DW_OP_implicit_pointer`` and |
| ``DW_OP_LLVM_aspace_implicit_pointer`` operations. |
| |
| 4. ``DW_OP_form_tls_address`` |
| |
| ``DW_OP_form_tls_address`` pops one stack entry that must be an integral |
| type value and treats it as a thread-local storage address TA. |
| |
| It pushes a location description L with one memory location description SL |
| on the stack. SL is the target architecture specific memory location |
| description that corresponds to the thread-local storage address TA. |
| |
| The meaning of the thread-local storage address TA is defined by the |
| run-time environment. If the run-time environment supports multiple |
| thread-local storage blocks for a single thread, then the block |
| corresponding to the executable or shared library containing this DWARF |
| expression is used. |
| |
| *Some implementations of C, C++, Fortran, and other languages, support a |
| thread-local storage class. Variables with this storage class have distinct |
| values and addresses in distinct threads, much as automatic variables have |
| distinct values and addresses in each subprogram invocation. Typically, |
| there is a single block of storage containing all thread-local variables |
| declared in the main executable, and a separate block for the variables |
| declared in each shared library. Each thread-local variable can then be |
| accessed in its block using an identifier. This identifier is typically a |
| byte offset into the block and pushed onto the DWARF stack by one of the* |
| ``DW_OP_const*`` *operations prior to the* ``DW_OP_form_tls_address`` |
| *operation. Computing the address of the appropriate block can be complex |
| (in some cases, the compiler emits a function call to do it), and difficult |
| to describe using ordinary DWARF location descriptions. Instead of forcing |
| complex thread-local storage calculations into the DWARF expressions, the* |
| ``DW_OP_form_tls_address`` *allows the consumer to perform the computation |
| based on the target architecture specific run-time environment.* |
| |
| 5. ``DW_OP_call_frame_cfa`` |
| |
| ``DW_OP_call_frame_cfa`` pushes the location description L of the Canonical |
| Frame Address (CFA) of the current subprogram, obtained from the call frame |
| information on the stack. See :ref:`amdgpu-dwarf-call-frame-information`. |
| |
| *Although the value of the* ``DW_AT_frame_base`` *attribute of the debugging
| information entry corresponding to the current subprogram can be computed |
| using a location list expression, in some cases this would require an |
| extensive location list because the values of the registers used in |
| computing the CFA change during a subprogram execution. If the call frame |
| information is present, then it already encodes such changes, and it is |
| space efficient to reference that using the* ``DW_OP_call_frame_cfa`` |
| *operation.* |
| |
| 6. ``DW_OP_fbreg`` |
| |
| ``DW_OP_fbreg`` has a single signed LEB128 integer operand that represents a |
| byte displacement B. |
| |
| The location description L for the *frame base* of the current subprogram is |
| obtained from the ``DW_AT_frame_base`` attribute of the debugging information
| entry corresponding to the current subprogram as described in |
| :ref:`amdgpu-dwarf-low-level-information`. |
| |
| The location description L is updated as if the ``DW_OP_LLVM_offset_uconst |
| B`` operation was applied. The updated L is pushed on the stack. |
| |
| 7. ``DW_OP_breg0``, ``DW_OP_breg1``, ..., ``DW_OP_breg31`` |
| |
| The ``DW_OP_breg<N>`` operations encode the numbers of up to 32 registers, |
| numbered from 0 through 31, inclusive. The register number R corresponds to |
| the N in the operation name. |
| |
| They have a single signed LEB128 integer operand that represents a byte |
| displacement B. |
| |
| The address space identifier AS is defined as the one corresponding to the |
| target architecture specific default address space. |
| |
| The address size S is defined as the address bit size of the target |
| architecture specific address space corresponding to AS. |
| |
| The contents of the register specified by R are retrieved as if a |
| ``DW_OP_regval_type R, DR`` operation was performed where DR is the offset |
| of a hypothetical debug information entry in the current compilation unit |
| for an unsigned integral base type of size S bits. B is added and the least |
| significant S bits are treated as an unsigned value to be used as an address |
| A. |
| |
| They push a location description L comprising one memory location
| description SL on the stack. SL specifies the memory location storage that
| corresponds to AS with a bit offset equal to A scaled by 8 (the byte size).
| |
| 8. ``DW_OP_bregx`` |
| |
| ``DW_OP_bregx`` has two operands. The first is an unsigned LEB128 integer |
| that represents a register number R. The second is a signed LEB128 |
| integer that represents a byte displacement B. |
| |
| The action is the same as for ``DW_OP_breg<N>``, except that R is used as |
| the register number and B is used as the byte displacement. |
| |
| 9. ``DW_OP_LLVM_aspace_bregx`` *New* |
| |
| ``DW_OP_LLVM_aspace_bregx`` has two operands. The first is an unsigned |
| LEB128 integer that represents a register number R. The second is a signed |
| LEB128 integer that represents a byte displacement B. It pops one stack |
| entry that is required to be an integral type value that represents a target |
| architecture specific address space identifier AS. |
| |
| The action is the same as for ``DW_OP_breg<N>``, except that R is used as |
| the register number, B is used as the byte displacement, and AS is used as |
| the address space identifier. |
| |
| The DWARF expression is ill-formed if AS is not one of the values defined by |
| the target architecture specific ``DW_ASPACE_LLVM_*`` values. |
| |
| .. note:: |
| |
| Could also consider adding ``DW_OP_LLVM_aspace_breg0, |
| DW_OP_LLVM_aspace_breg1, ..., DW_OP_LLVM_aspace_breg31`` which would save
| encoding size. |
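|
| The address arithmetic used by ``DW_OP_LLVM_form_aspace_address`` and by the
| ``DW_OP_breg*`` family can be sketched in Python as below. The function names
| are hypothetical. The sketch assumes the popped value A and the register
| contents are supplied as non-negative raw bit patterns, and that the address
| bit size S of the address space is already known to the caller.
|
```python
def truncate_to_address(a: int, s_bits: int) -> int:
    """Keep the least significant S bits of A as the unsigned value A'."""
    return a & ((1 << s_bits) - 1)


def form_aspace_bit_offset(a: int, s_bits: int) -> int:
    """Bit offset of the pushed memory location: A' scaled by 8."""
    return truncate_to_address(a, s_bits) * 8


def breg_address(reg_value: int, b: int, s_bits: int) -> int:
    """Address A for DW_OP_breg*: register contents plus the signed
    displacement B, with the least significant S bits taken as unsigned."""
    return truncate_to_address(reg_value + b, s_bits)
```
|
| Because only the least significant S bits are kept, a negative displacement
| that takes the sum below zero wraps around within the address space rather
| than producing a negative address.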
| |
| .. _amdgpu-dwarf-register-location-description-operations: |
| |
| A.2.5.4.4.4 Register Location Description Operations |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| .. note:: |
| |
| This section replaces DWARF Version 5 section 2.6.1.1.3. |
| |
| There is a register location storage that corresponds to each of the target |
| architecture registers. The size of each register location storage corresponds |
| to the size of the corresponding target architecture register. |
| |
| A register location description specifies a register location storage. The bit |
| offset corresponds to a bit position within the register. Bits accessed using a |
| register location description access the corresponding target architecture |
| register starting at the specified bit offset. |
| |
| 1. ``DW_OP_reg0``, ``DW_OP_reg1``, ..., ``DW_OP_reg31`` |
| |
| ``DW_OP_reg<N>`` operations encode the numbers of up to 32 registers, |
| numbered from 0 through 31, inclusive. The target architecture register |
| number R corresponds to the N in the operation name. |
| |
| The operation is equivalent to performing ``DW_OP_regx R``. |
| |
| 2. ``DW_OP_regx`` |
| |
| ``DW_OP_regx`` has a single unsigned LEB128 integer operand that represents |
| a target architecture register number R. |
| |
| If the current call frame is the top call frame, it pushes a location |
| description L that specifies one register location description SL on the |
| stack. SL specifies the register location storage that corresponds to R with |
| a bit offset of 0 for the current thread. |
| |
| If the current call frame is not the top call frame, call frame information |
| (see :ref:`amdgpu-dwarf-call-frame-information`) is used to determine the |
| location description that holds the register for the current call frame and |
| current program location of the current thread. The resulting location |
| description L is pushed. |
| |
| *Note that if call frame information is used, the resulting location |
| description may be register, memory, or undefined.* |
| |
| *An implementation may evaluate the call frame information immediately, or |
| may defer evaluation until L is accessed by an operation. If evaluation is |
| deferred, R and the current context can be recorded in L. When accessed, the |
| recorded context is used to evaluate the call frame information, not the |
| current context of the access operation.* |
| |
| *These operations obtain a register location. To fetch the contents of a |
| register, it is necessary to use* ``DW_OP_regval_type``\ *, use one of the* |
| ``DW_OP_breg*`` *register-based addressing operations, or use* ``DW_OP_deref*`` |
| *on a register location description.* |
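|
| The equivalence between ``DW_OP_reg<N>`` and ``DW_OP_regx R`` can be seen in
| how a consumer might decode the register number. The Python sketch below is
| illustrative only, with hypothetical function names. The opcode values 0x50
| (``DW_OP_reg0``) and 0x90 (``DW_OP_regx``) are taken from the DWARF Version 5
| operation encodings rather than from this section.
|
```python
from typing import Tuple

DW_OP_REG0 = 0x50  # DW_OP_reg0 ... DW_OP_reg31 occupy 0x50 through 0x6f
DW_OP_REGX = 0x90


def decode_uleb128(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def register_number(expr: bytes, pos: int = 0) -> Tuple[int, int]:
    """Return (R, next position) for a DW_OP_reg<N> or DW_OP_regx operation."""
    opcode = expr[pos]
    if DW_OP_REG0 <= opcode <= DW_OP_REG0 + 31:
        # R is the N encoded in the opcode itself.
        return opcode - DW_OP_REG0, pos + 1
    if opcode == DW_OP_REGX:
        # R follows as an unsigned LEB128 operand.
        return decode_uleb128(expr, pos + 1)
    raise ValueError("not a DW_OP_reg<N> or DW_OP_regx operation")
```
|
| Both encodings of register 5 yield the same R, differing only in the number
| of bytes consumed.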
| |
| .. _amdgpu-dwarf-implicit-location-description-operations: |
| |
| A.2.5.4.4.5 Implicit Location Description Operations |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| .. note:: |
| |
| This section replaces DWARF Version 5 section 2.6.1.1.4. |
| |
| Implicit location storage represents a piece or all of an object which has no
| actual location in the program but whose contents are nonetheless known, either
| as a constant or as a value that can be computed from other locations and values
| in the program.
| |
| An implicit location description specifies an implicit location storage. The bit |
| offset corresponds to a bit position within the implicit location storage. Bits
| accessed using an implicit location description access the corresponding
| implicit storage value starting at the bit offset. |
| |
| 1. ``DW_OP_implicit_value`` |
| |
| ``DW_OP_implicit_value`` has two operands. The first is an unsigned LEB128 |
| integer that represents a byte size S. The second is a block of bytes with a |
| length equal to S treated as a literal value V. |
| |
| An implicit location storage LS is created with the literal value V and a |
| size of S. |
| |
| It pushes location description L with one implicit location description SL |
| on the stack. SL specifies LS with a bit offset of 0. |
| |
| 2. ``DW_OP_stack_value`` |
| |
| ``DW_OP_stack_value`` pops one stack entry that must be a value V. |
| |
| An implicit location storage LS is created with the literal value V using |
| the size, encoding, and endianity specified by V's base type. |
| |
| It pushes a location description L with one implicit location description SL |
| on the stack. SL specifies LS with a bit offset of 0. |
| |
| *The* ``DW_OP_stack_value`` *operation specifies that the object does not |
| exist in memory, but its value is nonetheless known. In this form, the |
| location description specifies the actual value of the object, rather than |
| specifying the memory or register storage that holds the value.* |
| |
| See ``DW_OP_implicit_pointer`` (following) for special rules concerning |
| implicit pointer values produced by dereferencing implicit location |
| descriptions created by the ``DW_OP_implicit_pointer`` and |
| ``DW_OP_LLVM_aspace_implicit_pointer`` operations. |
| |
| Note: Since location descriptions are allowed on the stack, the |
| ``DW_OP_stack_value`` operation no longer terminates the DWARF operation |
| expression execution as in DWARF Version 5. |
| |
| 3. ``DW_OP_implicit_pointer`` |
| |
| *An optimizing compiler may eliminate a pointer, while still retaining the |
| value that the pointer addressed.* ``DW_OP_implicit_pointer`` *allows a |
| producer to describe this value.* |
| |
| ``DW_OP_implicit_pointer`` *specifies that an object is a pointer to the target |
| architecture default address space that cannot be represented as a real |
| pointer, even though the value it would point to can be described. In this |
| form, the location description specifies a debugging information entry that |
| represents the actual location description of the object to which the |
| pointer would point. Thus, a consumer of the debug information would be able |
| to access the dereferenced pointer, even when it cannot access the pointer |
| itself.* |
| |
| ``DW_OP_implicit_pointer`` has two operands. The first operand is a 4-byte |
| unsigned value in the 32-bit DWARF format, or an 8-byte unsigned value in |
| the 64-bit DWARF format, that represents the byte offset DR of a debugging |
| information entry D relative to the beginning of the ``.debug_info`` section |
| that contains the current compilation unit. The second operand is a signed |
| LEB128 integer that represents a byte displacement B. |
| |
| *Note that D might not be in the current compilation unit.* |
| |
| *The first operand interpretation is exactly like that for* |
| ``DW_FORM_ref_addr``\ *.* |
| |
| The address space identifier AS is defined as the one corresponding to the |
| target architecture specific default address space. |
| |
| The address size S is defined as the address bit size of the target |
| architecture specific address space corresponding to AS. |
| |
| An implicit location storage LS is created with the debugging information |
| entry D, address space AS, and size of S. |
| |
| It pushes a location description L that comprises one implicit location |
| description SL on the stack. SL specifies LS with a bit offset of 0. |
| |
| It is an evaluation error if a ``DW_OP_deref*`` operation pops a location |
| description L', and retrieves S bits, such that any retrieved bits come from |
| an implicit location storage that is the same as LS, unless both the |
| following conditions are met: |
| |
| 1. All retrieved bits come from an implicit location description that |
| refers to an implicit location storage that is the same as LS. |
| |
| *Note that all bits do not have to come from the same implicit location |
| description, as L' may involve composite location descriptions.* |
| |
| 2. The bits come from consecutive ascending offsets within their respective |
| implicit location storage. |
| |
| *These rules are equivalent to retrieving the complete contents of LS.* |
| |
| If both the above conditions are met, then the value V pushed by the |
| ``DW_OP_deref*`` operation is an implicit pointer value IPV with a target |
| architecture specific address space of AS, a debugging information entry of |
| D, and a base type of T. If AS is the target architecture default address |
| space, then T is the generic type. Otherwise, T is a target architecture |
| specific integral type with a bit size equal to S. |
| |
| If IPV is either implicitly converted to a location description (only done |
| if AS is the target architecture default address space) or used by |
| ``DW_OP_LLVM_form_aspace_address`` (only done if the address space popped by |
| ``DW_OP_LLVM_form_aspace_address`` is AS), then the resulting location |
| description RL is: |
| |
| * If D has a ``DW_AT_location`` attribute, the DWARF expression E from the |
| ``DW_AT_location`` attribute is evaluated with the current context, except |
| that the result kind is a location description, the compilation unit is |
| the one that contains D, the object is unspecified, and the initial stack |
| is empty. RL is the expression result. |
| |
| *Note that E is evaluated with the context of the expression accessing |
| IPV, and not the context of the expression that contained the* |
| ``DW_OP_implicit_pointer`` *or* ``DW_OP_LLVM_aspace_implicit_pointer`` |
| *operation that created L.* |
| |
| * If D has a ``DW_AT_const_value`` attribute, then an implicit location |
| storage RLS is created from the ``DW_AT_const_value`` attribute's value |
| with a size matching the size of the ``DW_AT_const_value`` attribute's |
| value. RL comprises one implicit location description SRL. SRL specifies |
| RLS with a bit offset of 0. |
| |
| .. note:: |
| |
| If using ``DW_AT_const_value`` for variables and formal parameters is |
| deprecated and instead ``DW_AT_location`` is used with an implicit |
| location description, then this rule would not be required. |
| |
| * Otherwise, it is an evaluation error. |
| |
| The bit offset of RL is updated as if the ``DW_OP_LLVM_offset_uconst B`` |
| operation was applied. |
| |
| If a ``DW_OP_stack_value`` operation pops a value that is the same as IPV, |
| then it pushes a location description that is the same as L. |
| |
| It is an evaluation error if LS or IPV is accessed in any other manner. |
| |
| *The restrictions on how an implicit pointer location description created |
| by* ``DW_OP_implicit_pointer`` *and* ``DW_OP_LLVM_aspace_implicit_pointer`` |
| *can be used are to simplify the DWARF consumer. Similarly, for an implicit |
| pointer value created by* ``DW_OP_deref*`` *and* ``DW_OP_stack_value``\ *.* |
| |
| 4. ``DW_OP_LLVM_aspace_implicit_pointer`` *New* |
| |
| ``DW_OP_LLVM_aspace_implicit_pointer`` has two operands that are the same as |
| for ``DW_OP_implicit_pointer``. |
| |
| It pops one stack entry that must be an integral type value that represents |
| a target architecture specific address space identifier AS. |
| |
| The location description L that is pushed on the stack is the same as for |
| ``DW_OP_implicit_pointer``, except that the address space identifier used is |
| AS. |
| |
| The DWARF expression is ill-formed if AS is not one of the values defined by |
| the target architecture specific ``DW_ASPACE_LLVM_*`` values. |
| |
| .. note:: |
| |
| This definition of ``DW_OP_LLVM_aspace_implicit_pointer`` may change when |
| full support for address classes is added as required for languages such |
| as OpenCL/SYCL. |
| |
| *Typically a* ``DW_OP_implicit_pointer`` *or* |
| ``DW_OP_LLVM_aspace_implicit_pointer`` *operation is used in a DWARF expression |
| E*\ :sub:`1` *of a* ``DW_TAG_variable`` *or* ``DW_TAG_formal_parameter`` |
| *debugging information entry D*\ :sub:`1`\ *'s* ``DW_AT_location`` *attribute. |
| The debugging information entry referenced by the* ``DW_OP_implicit_pointer`` |
| *or* ``DW_OP_LLVM_aspace_implicit_pointer`` *operations is typically itself a* |
| ``DW_TAG_variable`` *or* ``DW_TAG_formal_parameter`` *debugging information |
| entry D*\ :sub:`2` *whose* ``DW_AT_location`` *attribute gives a second DWARF |
| expression E*\ :sub:`2`\ *.* |
| |
| *D*\ :sub:`1` *and E*\ :sub:`1` *are describing the location of a pointer type |
| object. D*\ :sub:`2` *and E*\ :sub:`2` *are describing the location of the |
| object pointed to by that pointer object.* |
| |
| *However, D*\ :sub:`2` *may be any debugging information entry that contains a* |
| ``DW_AT_location`` *or* ``DW_AT_const_value`` *attribute (for example,* |
| ``DW_TAG_dwarf_procedure``\ *). By using E*\ :sub:`2`\ *, a consumer can |
| reconstruct the value of the object when asked to dereference the pointer |
| described by E*\ :sub:`1` *which contains the* ``DW_OP_implicit_pointer`` *or* |
| ``DW_OP_LLVM_aspace_implicit_pointer`` *operation.* |
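| |
| The way a consumer might resolve an implicit pointer value to a location |
| description can be sketched in Python as follows. This is a simplified, |
| non-normative model: the dictionary that stands in for the debugging information |
| entry D, the ``resolve_implicit_pointer`` function, and the tuple representation |
| of a location description are all hypothetical. The sketch also assumes that |
| the byte displacement B is applied as 8 bits per byte. |
| |

```python
from typing import Any, Dict, Tuple

Location = Tuple[str, int]  # (storage description, bit offset); illustrative only


def resolve_implicit_pointer(die: Dict[str, Any], displacement: int) -> Location:
    """Resolve an implicit pointer that refers to `die` with byte displacement B."""
    if "DW_AT_location" in die:
        # Evaluate the expression E of D; modelled here as a plain callable.
        storage, bit_offset = die["DW_AT_location"]()
    elif "DW_AT_const_value" in die:
        # Create an implicit storage from the constant, with a bit offset of 0.
        storage, bit_offset = "implicit:" + die["DW_AT_const_value"].hex(), 0
    else:
        raise ValueError("evaluation error: D has no location or constant value")
    # Apply the byte displacement B to the bit offset (assumed 8 bits per byte).
    return storage, bit_offset + displacement * 8


d2_in_register = {"DW_AT_location": lambda: ("register r5", 0)}
d2_constant = {"DW_AT_const_value": b"\x2a"}

from_register = resolve_implicit_pointer(d2_in_register, 4)
from_constant = resolve_implicit_pointer(d2_constant, 0)
```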
| |
| .. _amdgpu-dwarf-composite-location-description-operations: |
| |
| A.2.5.4.4.6 Composite Location Description Operations |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| .. note:: |
| |
| This section replaces DWARF Version 5 section 2.6.1.2. |
| |
| A composite location storage represents an object or value which may be |
| contained in part of another location storage or contained in parts of more |
| than one location storage. |
| |
| Each part has a part location description L and a part bit size S. L can have |
| one or more single location descriptions SL. If there is more than one SL, then |
| that indicates that the part is located in more than one place. The bits of each |
| place of the part comprise S contiguous bits from the location storage LS |
| specified by SL starting at the bit offset specified by SL. All the bits must |
| be within the size of LS or the DWARF expression is ill-formed. |
| |
| A composite location storage can have zero or more parts. The parts are |
| contiguous such that the zero-based location storage bit index will range over |
| each part with no gaps between them. Therefore, the size of a composite location |
| storage is the sum of the size of its parts. The DWARF expression is ill-formed |
| if the size of the composite location storage is larger than the size of the |
| memory location storage corresponding to the largest target architecture |
| specific address space. |
| |
| A composite location description specifies a composite location storage. The bit |
| offset corresponds to a bit position within the composite location storage. |
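| |
| The relationship between parts and the zero-based storage bit index can be |
| illustrated with a short, non-normative Python sketch. The ``Part`` and |
| ``CompositeStorage`` names are hypothetical, and a part location description is |
| reduced to a descriptive string for brevity. |
| |

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Part:
    location: str  # stand-in for the part location description L
    bit_size: int  # part bit size S


@dataclass
class CompositeStorage:
    parts: List[Part] = field(default_factory=list)

    @property
    def bit_size(self) -> int:
        # Parts are contiguous, so the size is the sum of the part sizes.
        return sum(part.bit_size for part in self.parts)

    def locate(self, bit_index: int) -> Tuple[Part, int]:
        """Map a storage bit index to (part, bit index within that part)."""
        if bit_index < 0:
            raise IndexError("bit index is negative")
        start = 0
        for part in self.parts:
            if bit_index < start + part.bit_size:
                return part, bit_index - start
            start += part.bit_size
        raise IndexError("bit index is outside the composite storage")


storage = CompositeStorage([Part("register r0", 32), Part("memory", 16)])
total_bits = storage.bit_size
part, offset_in_part = storage.locate(40)
```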
| |
| There are operations that create a composite location storage. |
| |
| There are other operations that allow a composite location storage to be |
| incrementally created. Each part is created by a separate operation. There may |
| be one or more operations to create the final composite location storage. A |
| series of such operations describes the parts of the composite location storage |
| in the order in which the associated part operations are executed. |
| |
| To support incremental creation, a composite location storage can be in an |
| incomplete state. When an incremental operation operates on an incomplete |
| composite location storage, it adds a new part, otherwise it creates a new |
| composite location storage. The ``DW_OP_LLVM_piece_end`` operation explicitly |
| makes an incomplete composite location storage complete. |
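| |
| The incomplete and complete states can be pictured with the following |
| non-normative Python sketch. It is deliberately simplified: it does not model |
| the expression stack, part locations are plain strings, and the ``Composite``, |
| ``add_part``, and ``piece_end`` names are hypothetical. |
| |

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Composite:
    parts: List[Tuple[str, int]] = field(default_factory=list)  # (location, bits)
    complete: bool = False


def add_part(top: Optional[Composite], location: str, bit_size: int) -> Composite:
    """Append to `top` if it is incomplete; otherwise start a new composite."""
    if top is None or top.complete:
        top = Composite()
    top.parts.append((location, bit_size))
    return top


def piece_end(top: Optional[Composite]) -> Composite:
    """Model of DW_OP_LLVM_piece_end: only valid on an incomplete composite."""
    if top is None or top.complete:
        raise ValueError("ill-formed: no incomplete composite to complete")
    top.complete = True
    return top


first = add_part(None, "register r0", 32)
first = add_part(first, "register r1", 32)  # appended to the same storage
piece_end(first)
second = add_part(first, "memory", 64)      # `first` is complete: new storage
```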
| |
| A composite location description that specifies a composite location storage |
| that is incomplete is termed an incomplete composite location description. A |
| composite location description that specifies a composite location storage that |
| is complete is termed a complete composite location description. |
| |
| If the top stack entry is a location description that has one incomplete |
| composite location description SL after the execution of an operation expression |
| has completed, SL is converted to a complete composite location description. |
| |
| *Note that this conversion does not happen after the completion of an operation |
| expression that is evaluated on the same stack by the* ``DW_OP_call*`` |
| *operations. Such executions are not a separate evaluation of an operation |
| expression, but rather the continued evaluation of the same operation expression |
| that contains the* ``DW_OP_call*`` *operation.* |
| |
| If a stack entry is required to be a location description L, but L has an |
| incomplete composite location description, then the DWARF expression is |
| ill-formed. The exception is for the operations involved in incrementally |
| creating a composite location description as described below. |
| |
| *Note that a DWARF operation expression may arbitrarily compose composite |
| location descriptions from any other location description, including those that |
| have multiple single location descriptions, and those that have composite |
| location descriptions.* |
| |
| *The incremental composite location description operations are defined to be |
| compatible with the definitions in DWARF Version 5.* |
| |
| 1. ``DW_OP_piece`` |
| |
| ``DW_OP_piece`` has a single unsigned LEB128 integer that represents a byte |
| size S. |
| |
| The action is based on the context: |
| |
| * If the stack is empty, then a location description L comprised of one |
| incomplete composite location description SL is pushed on the stack. |
| |
| An incomplete composite location storage LS is created with a single part |
| P. P specifies a location description PL and has a bit size of S scaled by |
| 8 (the byte size). PL is comprised of one undefined location description |
| PSL. |
| |
| SL specifies LS with a bit offset of 0. |
| |
| * Otherwise, if the top stack entry is a location description L comprised of |
| one incomplete composite location description SL, then the incomplete |
| composite location storage LS that SL specifies is updated to append a new |
| part P. P specifies a location description PL and has a bit size of S |
| scaled by 8 (the byte size). PL is comprised of one undefined location |
| description PSL. L is left on the stack. |
| |
| * Otherwise, if the top stack entry is a location description or can be |
| converted to one, then it is popped and treated as a part location |
| description PL. Then: |
| |
| * If the top stack entry (after popping PL) is a location description L |
| comprised of one incomplete composite location description SL, then the |
| incomplete composite location storage LS that SL specifies is updated to |
| append a new part P. P specifies the location description PL and has a |
| bit size of S scaled by 8 (the byte size). L is left on the stack. |
| |
| * Otherwise, a location description L comprised of one incomplete |
| composite location description SL is pushed on the stack. |
| |
| An incomplete composite location storage LS is created with a single |
| part P. P specifies the location description PL and has a bit size of S |
| scaled by 8 (the byte size). |
| |
| SL specifies LS with a bit offset of 0. |
| |
| * Otherwise, the DWARF expression is ill-formed. |
| |
| *Many compilers store a single variable in sets of registers or store a |
| variable partially in memory and partially in registers.* ``DW_OP_piece`` |
| *provides a way of describing where a part of a variable is located.* |
| |
| *If a non-0 byte displacement is required, the* ``DW_OP_LLVM_offset`` |
| *operation can be used to update the location description before using it as |
| the part location description of a* ``DW_OP_piece`` *operation.* |
| |
| *The evaluation rules for the* ``DW_OP_piece`` *operation allow it to be |
| compatible with the DWARF Version 5 definition.* |
| |
| .. note:: |
| |
| Since these extensions allow location descriptions to be entries on the |
| stack, a simpler operation to create composite location descriptions could |
| be defined. For example, just one operation that specifies how many parts, |
| and pops pairs of stack entries for the part size and location |
| description. Not only would this be a simpler operation and avoid the |
| complexities of incomplete composite location descriptions, but it may |
| also have a smaller encoding in practice. However, the desire for |
| compatibility with DWARF Version 5 is likely a stronger consideration. |
| |
| 2. ``DW_OP_bit_piece`` |
| |
| ``DW_OP_bit_piece`` has two operands. The first is an unsigned LEB128 |
| integer that represents the part bit size S. The second is an unsigned |
| LEB128 integer that represents a bit displacement B. |
| |
| The action is the same as for ``DW_OP_piece``, except that any part created |
| has the bit size S, and the location description PL of any created part is |
| updated as if the ``DW_OP_constu B; DW_OP_LLVM_bit_offset`` operations were |
| applied. |
| |
| ``DW_OP_bit_piece`` *is used instead of* ``DW_OP_piece`` *when the piece to |
| be assembled is not byte-sized or is not at the start of the part location |
| description.* |
| |
| *If a computed bit displacement is required, the* ``DW_OP_LLVM_bit_offset`` |
| *operation can be used to update the location description before using it as |
| the part location description of a* ``DW_OP_bit_piece`` *operation.* |
| |
| .. note:: |
| |
| The bit offset operand is not needed as ``DW_OP_LLVM_bit_offset`` can be |
| used on the part's location description. |
| |
| 3. ``DW_OP_LLVM_piece_end`` *New* |
| |
| If the top stack entry is not a location description L comprised of one |
| incomplete composite location description SL, then the DWARF expression is |
| ill-formed. |
| |
| Otherwise, the incomplete composite location storage LS specified by SL is |
| updated to be a complete composite location storage with the same parts. |
| |
| 4. ``DW_OP_LLVM_extend`` *New* |
| |
| ``DW_OP_LLVM_extend`` has two operands. The first is an unsigned LEB128 |
| integer that represents the element bit size S. The second is an unsigned |
| LEB128 integer that represents a count C. |
| |
| It pops one stack entry that must be a location description and is treated |
| as the part location description PL. |
| |
| A location description L comprised of one complete composite location |
| description SL is pushed on the stack. |
| |
| A complete composite location storage LS is created with C identical parts |
| P. Each P specifies PL and has a bit size of S. |
| |
| SL specifies LS with a bit offset of 0. |
| |
| The DWARF expression is ill-formed if the element bit size or the count is 0. |
| |
| 5. ``DW_OP_LLVM_select_bit_piece`` *New* |
| |
| ``DW_OP_LLVM_select_bit_piece`` has two operands. The first is an unsigned |
| LEB128 integer that represents the element bit size S. The second is an |
| unsigned LEB128 integer that represents a count C. |
| |
| It pops three stack entries. The first must be an integral type value that |
| represents a bit mask value M. The second must be a location description |
| that represents the one-location description L1. The third must be a |
| location description that represents the zero-location description L0. |
| |
| A complete composite location storage LS is created with C parts P\ :sub:`N` |
| ordered in ascending N from 0 to C-1 inclusive. Each P\ :sub:`N` specifies |
| location description PL\ :sub:`N` and has a bit size of S. |
| |
| PL\ :sub:`N` is as if the ``DW_OP_LLVM_bit_offset N*S`` operation was |
| applied to PLX\ :sub:`N`\ . |
| |
| PLX\ :sub:`N` is the same as L0 if the N\ :sup:`th` least significant bit of |
| M is a zero, otherwise it is the same as L1. |
| |
| A location description L comprised of one complete composite location |
| description SL is pushed on the stack. SL specifies LS with a bit offset of |
| 0. |
| |
| The DWARF expression is ill-formed if S or C is 0, or if the bit size of M |
| is less than C. |
| |
| .. note:: |
| |
| Should the count operand for ``DW_OP_LLVM_extend`` and |
| ``DW_OP_LLVM_select_bit_piece`` be changed to get the count value off the |
| stack? This would allow support for architectures that have variable length |
| vector instructions such as ARM and RISC-V. |
| |
| 6. ``DW_OP_LLVM_overlay`` *New* |
| |
| ``DW_OP_LLVM_overlay`` pops four stack entries. The first must be an |
| integral type value that represents the overlay byte size value S. The |
| second must be an integral type value that represents the overlay byte |
| offset value O. The third must be a location description that represents the |
| overlay location description OL. The fourth must be a location description |
| that represents the base location description BL. |
| |
| The action is the same as for ``DW_OP_LLVM_bit_overlay``, except that the |
| overlay bit size BS and overlay bit offset BO used are S and O respectively |
| scaled by 8 (the byte size). |
| |
| 7. ``DW_OP_LLVM_bit_overlay`` *New* |
| |
| ``DW_OP_LLVM_bit_overlay`` pops four stack entries. The first must be an |
| integral type value that represents the overlay bit size value BS. The |
| second must be an integral type value that represents the overlay bit offset |
| value BO. The third must be a location description that represents the |
| overlay location description OL. The fourth must be a location description |
| that represents the base location description BL. |
| |
| The DWARF expression is ill-formed if BS or BO are negative values. |
| |
| *rbss(L)* is the minimum remaining bit storage size of L which is defined as |
| follows. LS is the location storage and LO is the location bit offset |
| specified by a single location description SL of L. The remaining bit |
| storage size RBSS of SL is the bit size of LS minus LO. *rbss(L)* is the |
| minimum RBSS of each single location description SL of L. |
| |
| The DWARF expression is ill-formed if *rbss(BL)* is less than BO plus BS. |
| |
| If BS is 0, then the operation pushes BL. |
| |
| If BO is 0 and BS equals *rbss(BL)*, then the operation pushes OL. |
| |
| Otherwise, the operation is equivalent to performing the following steps to |
| push a composite location description. |
| |
| *The composite location description is conceptually the base location |
| description BL with the overlay location description OL positioned as an |
| overlay starting at the overlay offset BO and covering overlay bit size BS.* |
| |
| 1. If BO is not 0 then push BL followed by performing the ``DW_OP_bit_piece |
| BO, 0`` operation. |
| 2. Push OL followed by performing the ``DW_OP_bit_piece BS, 0`` operation. |
| 3. If *rbss(BL)* is greater than BO plus BS, push BL followed by performing |
| the ``DW_OP_bit_piece (rbss(BL) - BO - BS), (BO + BS)`` operation. |
| 4. Perform the ``DW_OP_LLVM_piece_end`` operation. |
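| |
| Two of the operations in the list above, ``DW_OP_LLVM_select_bit_piece`` and |
| ``DW_OP_LLVM_bit_overlay``, can be sketched in Python as follows. This is a |
| non-normative model under several assumptions: every location description has |
| exactly one single location description, a location is a hypothetical |
| ``(storage name, bit offset)`` tuple, storage sizes are supplied in a |
| dictionary, and the check on the bit size of the mask M is omitted because |
| Python integers are unbounded. |
| |

```python
from typing import Dict, List, Tuple, Union

Location = Tuple[str, int]   # (storage name, bit offset); single location only
Part = Tuple[Location, int]  # (part location, part bit size)


def bit_piece(loc: Location, size: int, displacement: int) -> Part:
    """Part as created by DW_OP_bit_piece: `size` bits at `loc` plus displacement."""
    name, offset = loc
    return (name, offset + displacement), size


def select_bit_piece(l0: Location, l1: Location, mask: int,
                     size: int, count: int) -> List[Part]:
    """Part N comes from l1 if bit N of the mask is set, else from l0."""
    if size == 0 or count == 0:
        raise ValueError("ill-formed: element bit size or count is 0")
    return [bit_piece(l1 if (mask >> n) & 1 else l0, size, n * size)
            for n in range(count)]


def bit_overlay(bl: Location, ol: Location, bo: int, bs: int,
                storage_bits: Dict[str, int]) -> Union[Location, List[Part]]:
    """Overlay `bs` bits of `ol` onto `bl` at bit offset `bo`."""
    if bs < 0 or bo < 0:
        raise ValueError("ill-formed: negative overlay size or offset")
    base_bits = storage_bits[bl[0]] - bl[1]  # rbss(BL) for a single location
    if base_bits < bo + bs:
        raise ValueError("ill-formed: overlay does not fit in the base")
    if bs == 0:
        return bl
    if bo == 0 and bs == base_bits:
        return ol
    parts: List[Part] = []
    if bo != 0:
        parts.append(bit_piece(bl, bo, 0))                        # step 1
    parts.append(bit_piece(ol, bs, 0))                            # step 2
    if base_bits > bo + bs:
        parts.append(bit_piece(bl, base_bits - bo - bs, bo + bs))  # step 3
    return parts                                                  # step 4 completes


sizes = {"base": 64, "over": 16}
overlaid = bit_overlay(("base", 0), ("over", 0), 8, 16, sizes)
selected = select_bit_piece(("v0", 0), ("v1", 0), 0b0101, 32, 4)
```

| |
| In this example the overlay yields three parts of 8, 16, and 40 bits, so the |
| resulting composite has the same 64-bit size as the base. |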
| |
| .. _amdgpu-dwarf-location-list-expressions: |
| |
| A.2.5.5 DWARF Location List Expressions |
| +++++++++++++++++++++++++++++++++++++++ |
| |
| .. note:: |
| |
| This section replaces DWARF Version 5 section 2.6.2. |
| |
| *To meet the needs of recent computer architectures and optimization techniques, |
| debugging information must be able to describe the location of an object whose |
| location changes over the object's lifetime, and may reside at multiple |
| locations during parts of an object's lifetime. Location list expressions are |
| used in place of operation expressions whenever the object whose location is |
| being described has these requirements.* |
| |
| A location list expression consists of a series of location list entries. Each |
| location list entry is one of the following kinds: |
| |
| *Bounded location description* |
| |
| This kind of location list entry provides an operation expression that |
| evaluates to the location description of an object that is valid over a |
| lifetime bounded by a starting and ending address. The starting address is the |
| lowest address of the address range over which the location is valid. The |
| ending address is the address of the first location past the highest address |
| of the address range. |
| |
| The location list entry matches when the current program location is within |
| the given range. |
| |
| There are several kinds of bounded location description entries which differ |
| in the way that they specify the starting and ending addresses. |
| |
| *Default location description* |
| |
| This kind of location list entry provides an operation expression that |
| evaluates to the location description of an object that is valid when no |
| bounded location description entry applies. |
| |
| The location list entry matches when the current program location is not |
| within the range of any bounded location description entry. |
| |
| *Base address* |
| |
| This kind of location list entry provides an address to be used as the base |
| address for beginning and ending address offsets given in certain kinds of |
| bounded location description entries. The applicable base address of a bounded |
| location description entry is the address specified by the closest preceding |
| base address entry in the same location list. If there is no preceding base |
| address entry, then the applicable base address defaults to the base address |
| of the compilation unit (see DWARF Version 5 section 3.1.1). |
| |
| In the case of a compilation unit where all of the machine code is contained |
| in a single contiguous section, no base address entry is needed. |
| |
| *End-of-list* |
| |
| This kind of location list entry marks the end of the location list |
| expression. |
| |
| The address ranges defined by the bounded location description entries of a |
| location list expression may overlap. When they do, they describe a situation in |
| which an object exists simultaneously in more than one place. |
| |
| If all of the address ranges in a given location list expression do not |
| collectively cover the entire range over which the object in question is |
| defined, and there is no following default location description entry, it is |
| assumed that the object is not available for the portion of the range that is |
| not covered. |
| |
| The result of the evaluation of a DWARF location list expression is: |
| |
| * If the current program location is not specified, then it is an evaluation |
| error. |
| |
| .. note:: |
| |
| If the location list only has a single default entry, should that be |
| considered a match if there is no program location? If there are non-default |
| entries then it seems it has to be an evaluation error when there is no |
| program location as that indicates the location depends on the program |
| location which is not known. |
| |
| * If there are no matching location list entries, then the result is a location |
| description that comprises one undefined location description. |
| |
| * Otherwise, the operation expression E of each matching location list entry is |
| evaluated with the current context, except that the result kind is a location |
| description, the object is unspecified, and the initial stack is empty. The |
| location list entry result is the location description returned by the |
| evaluation of E. |
| |
| The result is a location description that is comprised of the union of the |
| single location descriptions of the location description result of each |
| matching location list entry. |
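| |
| The evaluation rules above can be sketched in Python as follows. This is a |
| non-normative illustration that assumes all starting and ending addresses have |
| already been resolved against the applicable base address. The ``Entry`` class, |
| the ``evaluate_loclist`` function, and the use of strings for single location |
| descriptions are hypothetical simplifications. |
| |

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

UNDEFINED = "undefined"


@dataclass
class Entry:
    expr: Callable[[], List[str]]  # evaluates to single location descriptions
    start: Optional[int] = None    # None marks a default location description
    end: Optional[int] = None      # first address past the range


def evaluate_loclist(entries: List[Entry], pc: Optional[int]) -> List[str]:
    if pc is None:
        raise ValueError("evaluation error: no current program location")
    bounded = [e for e in entries if e.start is not None]
    matches = [e for e in bounded if e.start <= pc < e.end]
    if not matches:
        # A default entry matches only when no bounded entry covers pc.
        matches = [e for e in entries if e.start is None]
    if not matches:
        return [UNDEFINED]
    result: List[str] = []
    for entry in matches:
        for single in entry.expr():  # union of the single location descriptions
            if single not in result:
                result.append(single)
    return result


entries = [
    Entry(lambda: ["register r0"], 0x100, 0x200),
    Entry(lambda: ["memory 0x8000"], 0x180, 0x280),
    Entry(lambda: ["register r1"]),  # default location description
]
overlapping = evaluate_loclist(entries, 0x190)
fallback = evaluate_loclist(entries, 0x300)
uncovered = evaluate_loclist(entries[:2], 0x300)
```

| |
| At address 0x190 both bounded entries match, so the result holds two single |
| location descriptions, which corresponds to an object that exists in more than |
| one place at once. |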
| |
| A location list expression can only be used as the value of a debugging |
| information entry attribute that is encoded using class ``loclist`` or |
| ``loclistsptr`` (see :ref:`amdgpu-dwarf-classes-and-forms`). The value of the |
| attribute provides an index into a separate object file section called |
| ``.debug_loclists`` or ``.debug_loclists.dwo`` (for split DWARF object files) |
| that contains the location list entries. |
| |
| The ``DW_OP_call*`` and ``DW_OP_implicit_pointer`` operations can be used to |
| specify a debugging information entry attribute that has a location list |
| expression. Several debugging information entry attributes allow DWARF |
| expressions that are evaluated with an initial stack that includes a location |
| description that may originate from the evaluation of a location list |
| expression. |
| |
| *This location list representation, the* ``loclist`` *and* ``loclistsptr`` |
| *class, and the related* ``DW_AT_loclists_base`` *attribute are new in DWARF |
| Version 5. Together they eliminate most or all of the code object relocations |
| previously needed for location list expressions.* |
| |
| .. note:: |
| |
| The rest of this section is the same as DWARF Version 5 section 2.6.2. |
| |
| .. _amdgpu-dwarf-address-spaces: |
| |
| A.2.13 Address Spaces |
| ~~~~~~~~~~~~~~~~~~~~~ |
| |
| .. note:: |
| |
| This is a new section after DWARF Version 5 section 2.12 Segmented Addresses. |
| |
| DWARF address spaces correspond to target architecture specific linear |
| addressable memory areas. They are used in DWARF expression location |
| descriptions to describe in which target architecture specific memory area data |
| resides. |
| |
| *Target architecture specific DWARF address spaces may correspond to hardware |
| supported facilities such as memory utilizing base address registers, scratchpad |
| memory, and memory with special interleaving. The size of addresses in these |
| address spaces may vary. Their access and allocation may be hardware managed |
| with each thread or group of threads having access to independent storage. For |
| these reasons they may have properties that do not allow them to be viewed as |
| part of the unified global virtual address space accessible by all threads.* |
| |
| *It is target architecture specific whether multiple DWARF address spaces are |
| supported and how source language memory spaces map to target architecture |
| specific DWARF address spaces. A target architecture may map multiple source |
| language memory spaces to the same target architecture specific DWARF address |
| space. Optimization may determine that a variable's lifetime and access pattern |
| allow it to be allocated in faster scratchpad memory represented by a |
| different DWARF address space than the default for the source language memory |
| space.* |
| |
| Although DWARF address space identifiers are target architecture specific, |
| ``DW_ASPACE_LLVM_none`` is a common address space supported by all target |
| architectures, and defined as the target architecture default address space. |
| |
| DWARF address space identifiers are used by: |
| |
| * The ``DW_AT_LLVM_address_space`` attribute. |
| |
| * The DWARF expression operations: ``DW_OP_LLVM_aspace_bregx``, |
| ``DW_OP_LLVM_form_aspace_address``, ``DW_OP_LLVM_aspace_implicit_pointer``, and |
| ``DW_OP_xderef*``. |
| |
| * The CFI instructions: ``DW_CFA_LLVM_def_aspace_cfa`` and |
| ``DW_CFA_LLVM_def_aspace_cfa_sf``. |
| |
| .. note:: |
| |
| Currently, DWARF defines address class values as being target architecture |
| specific, and defines a DW_AT_address_class attribute. With the removal of |
| DW_AT_segment in DWARF 6, it is unclear how the address class is intended to |
| be used as the term is not used elsewhere. Should these be replaced by this |
| proposal's more complete address space? Or are they intended to represent |
| source language memory spaces such as in OpenCL? |
| |
| .. _amdgpu-dwarf-memory-spaces: |
| |
| A.2.14 Memory Spaces |
| ~~~~~~~~~~~~~~~~~~~~ |
| |
| .. note:: |
| |
| This is a new section after DWARF Version 5 section 2.12 Segmented Addresses. |
| |
| DWARF memory spaces are used for source languages that have the concept of |
| memory spaces. They are used in the ``DW_AT_LLVM_memory_space`` attribute for |
| pointer type, reference type, variable, formal parameter, and constant debugging |
| information entries. |
| |
| Each DWARF memory space is conceptually a separate source language memory space |
| with its own lifetime and aliasing rules. DWARF memory spaces are used to |
| specify the source language memory spaces to which pointer type and reference |
| type values refer, and to specify the source language memory space in which |
| variables are allocated. |
| |
| Although DWARF memory space identifiers are source language specific, |
| ``DW_MSPACE_LLVM_none`` is a common memory space supported by all source |
| languages, and defined as the source language default memory space. |
| |
| The set of currently defined DWARF memory spaces, together with source language |
| mappings, is given in :ref:`amdgpu-dwarf-source-language-memory-spaces-table`. |
| |
| Vendor defined source language memory spaces may be defined using codes in the |
| range ``DW_MSPACE_LLVM_lo_user`` to ``DW_MSPACE_LLVM_hi_user``. |
| |
| .. table:: Source language memory spaces |
| :name: amdgpu-dwarf-source-language-memory-spaces-table |
| |
| =========================== ============ ============== ============== ============== |
| Memory Space Name           Meaning      C/C++          OpenCL         CUDA/HIP |
| =========================== ============ ============== ============== ============== |
| ``DW_MSPACE_LLVM_none``     generic      *default*      generic        *default* |
| ``DW_MSPACE_LLVM_global``   global                      global |
| ``DW_MSPACE_LLVM_constant`` constant                    constant       constant |
| ``DW_MSPACE_LLVM_group``    thread-group                local          shared |
| ``DW_MSPACE_LLVM_private``  thread                      private |
| ``DW_MSPACE_LLVM_lo_user`` |
| ``DW_MSPACE_LLVM_hi_user`` |
| =========================== ============ ============== ============== ============== |
| |
| .. note:: |
| |
| The approach presented in |
| :ref:`amdgpu-dwarf-source-language-memory-spaces-table` is to define the |
| default ``DW_MSPACE_LLVM_none`` to be the generic address class and not the |
| global address class. This matches how CLANG and LLVM have added support for |
| CUDA-like languages on top of existing C++ language support. This allows all |
| addresses to be generic by default which matches CUDA-like languages. |
| |
| An alternative approach is to define ``DW_MSPACE_LLVM_none`` as being the |
| global memory space and then change ``DW_MSPACE_LLVM_global`` to |
| ``DW_MSPACE_LLVM_generic``. This would match the reality that languages that |
| do not support multiple memory spaces only have one default global memory |
| space. Generally, in these languages if they expose that the target |
| architecture supports multiple memory spaces, the default one is still the |
| global memory space. Then a language that does support multiple memory spaces |
| has to explicitly indicate which pointers have the added ability to reference |
| more than the global memory space. However, compilers generating DWARF for |
| CUDA-like languages would then have to define every CUDA-like language pointer |
| type or reference type with a ``DW_AT_LLVM_memory_space`` attribute of |
| ``DW_MSPACE_LLVM_generic`` to match the language semantics. |
| |
| A.3 Program Scope Entries |
| ------------------------- |
| |
| .. note:: |
| |
| This section provides changes to existing debugger information entry |
| attributes. These would be incorporated into the corresponding DWARF Version 5 |
| chapter 3 sections. |
| |
| A.3.1 Unit Entries |
| ~~~~~~~~~~~~~~~~~~ |
| |
| .. _amdgpu-dwarf-full-and-partial-compilation-unit-entries: |
| |
| A.3.1.1 Full and Partial Compilation Unit Entries |
| +++++++++++++++++++++++++++++++++++++++++++++++++ |
| |
| .. note:: |
| |
| This augments DWARF Version 5 section 3.1.1 and Table 3.1. |
| |
| Additional language codes for use with the ``DW_AT_language`` attribute are |
| defined in :ref:`amdgpu-dwarf-language-names-table`. |
| |
| .. table:: Language Names |
| :name: amdgpu-dwarf-language-names-table |
| |
| ==================== ============================= |
| Language Name Meaning |
| ==================== ============================= |
| ``DW_LANG_LLVM_HIP`` HIP Language. |
| ==================== ============================= |
| |
| The HIP language [:ref:`HIP <amdgpu-dwarf-HIP>`] can be supported by extending |
| the C++ language. |
| |
| .. note:: |
| |
| The following new attribute is added. |
| |
| 1. A ``DW_TAG_compile_unit`` debugger information entry for a compilation unit |
| may have a ``DW_AT_LLVM_augmentation`` attribute, whose value is an |
| augmentation string. |
| |
| *The augmentation string allows producers to indicate that there is |
| additional vendor or target specific information in the debugging |
| information entries. For example, this might be information about the |
| version of vendor specific extensions that are being used.* |
| |
| If not present, or if the string is empty, then the compilation unit has no |
| augmentation string. |
| |
| The format for the augmentation string is: |
| |
| | ``[``\ *vendor*\ ``:v``\ *X*\ ``.``\ *Y*\ [\ ``:``\ *options*\ ]\ ``]``\ * |
| |
| Where *vendor* is the producer, ``vX.Y`` specifies the major X and minor Y |
| version number of the extensions used, and *options* is an optional string |
| providing additional information about the extensions. The version number |
| must conform to semantic versioning [:ref:`SEMVER <amdgpu-dwarf-SEMVER>`]. |
| The *options* string must not contain the "\ ``]``\ " character. |
| |
| For example: |
| |
| :: |
| |
| [abc:v0.0][def:v1.2:feature-a=on,feature-b=3] |
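
The augmentation string format above can be sketched as a small parser. This
is a minimal illustration only: ``parse_augmentation`` is a hypothetical helper
and not part of any DWARF library, and it assumes that *vendor* contains
neither ``:`` nor ``]`` and that X and Y are plain decimal integers.

```python
import re

# Hypothetical helper: parses the "[vendor:vX.Y[:options]]*" format.
# Assumption: the vendor name contains neither ":" nor "]".
_ENTRY = re.compile(r"\[([^:\]]+):v(\d+)\.(\d+)(?::([^\]]*))?\]")


def parse_augmentation(text):
    """Return a list of (vendor, major, minor, options) tuples.

    options is None when the optional ":options" part is absent.
    """
    entries = []
    pos = 0
    while pos < len(text):
        match = _ENTRY.match(text, pos)
        if match is None:
            raise ValueError(f"malformed augmentation string at offset {pos}")
        vendor, major, minor, options = match.groups()
        entries.append((vendor, int(major), int(minor), options))
        pos = match.end()
    # An empty string yields an empty list, matching "no augmentation string".
    return entries
```

Applied to the example string above, the sketch returns
``[("abc", 0, 0, None), ("def", 1, 2, "feature-a=on,feature-b=3")]``.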
| |
| A.3.3 Subroutine and Entry Point Entries |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| .. _amdgpu-dwarf-low-level-information: |
| |
| A.3.3.5 Low-Level Information |
| +++++++++++++++++++++++++++++ |
| |
| 1. A ``DW_TAG_subprogram``, ``DW_TAG_inlined_subroutine``, or |
| ``DW_TAG_entry_point`` debugger information entry may have a |
| ``DW_AT_return_addr`` attribute, whose value is a DWARF expression E. |
| |
| The result of the attribute is obtained by evaluating E with a context that |
| has a result kind of a location description, an unspecified object, the |
| compilation unit that contains E, an empty initial stack, and other context |
| elements corresponding to the source language thread of execution upon which |
| the user is focused, if any. The result of the evaluation is the location |
| description L of the place where the return address for the current call |
| frame's subprogram or entry point is stored. |
| |
| The DWARF is ill-formed if L is not comprised of one memory location |
| description for one of the target architecture specific address spaces. |
| |
| .. note:: |
| |
| It is unclear why ``DW_TAG_inlined_subroutine`` has a |
| ``DW_AT_return_addr`` attribute but not a ``DW_AT_frame_base`` or |
| ``DW_AT_static_link`` attribute. It seems it should have either all of them |
| or none. Since inlined subprograms do not have a call frame, it seems they |
| would have none of these attributes. |
| |
| 2. A ``DW_TAG_subprogram`` or ``DW_TAG_entry_point`` debugger information entry |
| may have a ``DW_AT_frame_base`` attribute, whose value is a DWARF expression |
| E. |
| |
| The result of the attribute is obtained by evaluating E with a context that |
| has a result kind of a location description, an unspecified object, the |
| compilation unit that contains E, an empty initial stack, and other context |
| elements corresponding to the source language thread of execution upon which |
| the user is focused, if any. |
| |
| The DWARF is ill-formed if E contains a ``DW_OP_fbreg`` operation, or the |
| resulting location description L is not comprised of one single location |
| description SL. |
| |
| If SL is a register location description for register R, then L is replaced |
| with the result of evaluating a ``DW_OP_bregx R, 0`` operation. This |
| computes the frame base memory location description in the target |
| architecture default address space. |
| |
| *This allows the more compact* ``DW_OP_reg*`` *to be used instead of* |
| ``DW_OP_breg* 0``\ *.* |
| |
| .. note:: |
| |
| This rule could be removed and require the producer to create the required |
| location description directly using ``DW_OP_call_frame_cfa``, |
| ``DW_OP_breg*``, or ``DW_OP_LLVM_aspace_bregx``. This would also then |
| allow a target to implement the call frames within a large register. |
| |
| Otherwise, the DWARF is ill-formed if SL is not a memory location |
| description in any of the target architecture specific address spaces. |
| |
| The resulting L is the *frame base* for the subprogram or entry point. |
| |
| *Typically, E will use the* ``DW_OP_call_frame_cfa`` *operation or be a |
| stack pointer register plus or minus some offset.* |
| |
| *The frame base for a subprogram is typically an address relative to the |
| first unit of storage allocated for the subprogram's stack frame. The* |
| ``DW_AT_frame_base`` *attribute can be used in several ways:* |
| |
| 1. *In subprograms that need location lists to locate local variables, the* |
| ``DW_AT_frame_base`` *can hold the needed location list, while all |
| variables' location descriptions can be simpler ones involving the frame |
| base.* |
| |
| 2. *It can be used in resolving "up-level" addressing within |
| nested routines. (See also* ``DW_AT_static_link``\ *, below)* |
| |
| *Some languages support nested subroutines. In such languages, it is |
| possible to reference the local variables of an outer subroutine from within |
| an inner subroutine. The* ``DW_AT_static_link`` *and* ``DW_AT_frame_base`` |
| *attributes allow debuggers to support this same kind of referencing.* |
| |
| 3. If a ``DW_TAG_subprogram`` or ``DW_TAG_entry_point`` debugger information |
| entry is lexically nested, it may have a ``DW_AT_static_link`` attribute, |
| whose value is a DWARF expression E. |
| |
| The result of the attribute is obtained by evaluating E with a context that |
| has a result kind of a location description, an unspecified object, the |
| compilation unit that contains E, an empty initial stack, and other context |
| elements corresponding to the source language thread of execution upon which |
| the user is focused, if any. The result of the evaluation is the location |
| description L of the *canonical frame address* (see |
| :ref:`amdgpu-dwarf-call-frame-information`) of the relevant call frame of |
| the subprogram instance that immediately lexically encloses the current call |
| frame's subprogram or entry point. |
| |
| The DWARF is ill-formed if L is not comprised of one memory location |
| description for one of the target architecture specific address spaces. |
| |
| In the context of supporting nested subroutines, the ``DW_AT_frame_base`` |
| attribute value obeys the following constraints: |
| |
| 1. It computes a value that does not change during the life of the |
| subprogram, and |
| |
| 2. The computed value is unique among instances of the same subroutine. |
| |
| *For typical* ``DW_AT_frame_base`` *use, this means that a recursive |
| subroutine's stack frame must have non-zero size.* |
| |
| *If a debugger is attempting to resolve an up-level reference to a variable, |
| it uses the nesting structure of DWARF to determine which subroutine is the |
| lexical parent and the* ``DW_AT_static_link`` *value to identify the |
| appropriate active frame of the parent. It can then attempt to find the |
| reference within the context of the parent.* |
| |
| .. note:: |
| |
| The following new attributes are added. |
| |
| 4. For languages that are implemented using a SIMT execution model, a |
| ``DW_TAG_subprogram``, ``DW_TAG_inlined_subroutine``, or |
| ``DW_TAG_entry_point`` debugger information entry may have a |
| ``DW_AT_LLVM_lanes`` attribute whose value is an integer constant that is |
| the number of source language threads of execution per target architecture |
| thread. |
| |
| *For example, a compiler may map source language threads of execution onto |
| lanes of a target architecture thread using a SIMT execution model.* |
| |
| It is the static number of source language threads of execution per target |
| architecture thread. It is not the dynamic number of source language threads |
| of execution with which the target architecture thread was initiated, for |
| example, due to smaller or partial work-groups. |
| |
| If not present, the default value of 1 is used. |
| |
| The DWARF is ill-formed if the value is less than or equal to 0. |
| |
| 5. For source languages that are implemented using a SIMT execution model, a |
| ``DW_TAG_subprogram``, ``DW_TAG_inlined_subroutine``, or |
| ``DW_TAG_entry_point`` debugging information entry may have a |
| ``DW_AT_LLVM_lane_pc`` attribute whose value is a DWARF expression E. |
| |
| The result of the attribute is obtained by evaluating E with a context that |
| has a result kind of a location description, an unspecified object, the |
| compilation unit that contains E, an empty initial stack, and other context |
| elements corresponding to the source language thread of execution upon which |
| the user is focused, if any. |
| |
| The resulting location description L is for a lane count sized vector of |
| generic type elements. The lane count is the value of the |
| ``DW_AT_LLVM_lanes`` attribute. Each element holds the conceptual program |
| location of the corresponding lane. If the lane was not active when the |
| current subprogram was called, its element is an undefined location |
| description. |
| |
| The DWARF is ill-formed if L does not have exactly one single location |
| description. |
| |
| ``DW_AT_LLVM_lane_pc`` *allows the compiler to indicate conceptually where |
| each SIMT lane of a target architecture thread is positioned even when it is |
| in divergent control flow that is not active.* |
| |
| *Typically, the result is a location description with one composite location |
| description with each part being a location description with either one |
| undefined location description or one memory location description.* |
| |
| If not present, the target architecture thread is not being used in a SIMT |
| manner, and the thread's current program location is used. |
| |
| 6. For languages that are implemented using a SIMT execution model, a |
| ``DW_TAG_subprogram``, ``DW_TAG_inlined_subroutine``, or |
| ``DW_TAG_entry_point`` debugger information entry may have a |
| ``DW_AT_LLVM_active_lane`` attribute whose value is a DWARF expression E. |
| |
| E is evaluated with a context that has a result kind of a location |
| description, an unspecified object, the compilation unit that contains E, an |
| empty initial stack, and other context elements corresponding to the source |
| language thread of execution upon which the user is focused, if any. |
| |
| The DWARF is ill-formed if the resulting location description L does not have |
| exactly one single location description SL. |
| |
| The active lane bit mask V for the current program location is obtained by |
| reading from SL using a target architecture specific integral base type T |
| that has a bit size equal to the value of the ``DW_AT_LLVM_lanes`` attribute |
| of the subprogram corresponding to the context's frame and program location. |
| The N\ :sup:`th` least significant bit of the mask corresponds to the N\ |
| :sup:`th` lane. If the bit is 1 the lane is active, otherwise it is |
| inactive. The result of the attribute is the value V. |
| |
| *Some targets may update the target architecture execution mask for regions |
| of code that must execute with different sets of lanes than the current |
| active lanes. For example, some code must execute with all lanes made |
| temporarily active.* ``DW_AT_LLVM_active_lane`` *allows the compiler to |
| provide the means to determine the source language active lanes at any |
| program location. Typically, this attribute will use a loclist to express |
| different locations of the active lane mask at different program locations.* |
| |
| If not present and ``DW_AT_LLVM_lanes`` is greater than 1, then the target |
| architecture execution mask is used. |
| |
| 7. A ``DW_TAG_subprogram``, ``DW_TAG_inlined_subroutine``, or |
| ``DW_TAG_entry_point`` debugger information entry may have a |
| ``DW_AT_LLVM_iterations`` attribute whose value is an integer constant or a |
| DWARF expression E. Its value is the number of source language loop |
| iterations executing concurrently by the target architecture for a single |
| source language thread of execution. |
| |
| *A compiler may generate code that executes more than one iteration of a |
| source language loop concurrently using optimization techniques such as |
| software pipelining or SIMD vectorization. The number of concurrent |
| iterations may vary for different loop nests in the same subprogram. |
| Typically, this attribute will use a loclist to express different values at |
| different program locations.* |
| |
| If the attribute is an integer constant, then the value is the constant. The |
| DWARF is ill-formed if the constant is less than or equal to 0. |
| |
| Otherwise, E is evaluated with a context that has a result kind of a |
| location description, an unspecified object, the compilation unit that |
| contains E, an empty initial stack, and other context elements corresponding |
| to the source language thread of execution upon which the user is focused, |
| if any. The DWARF is ill-formed if the result is not a location description |
| comprised of one implicit location description, or if reading it as the |
| generic type results in a value V that is less than or equal to 0. The |
| result of the attribute is the value V. |
| |
| If not present, the default value of 1 is used. |
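
The active lane bit mask described for ``DW_AT_LLVM_active_lane`` above can be
decoded as in the following sketch. It is illustrative only: ``active_lanes``
is a hypothetical helper, the mask is assumed to have already been read from
the location SL as an integer, and lanes are assumed to be numbered from 0 so
that bit 0 is the least significant bit.

```python
def active_lanes(mask, lane_count):
    """List the lane indices whose bit is set in the active lane mask.

    mask: integer value V read using the integral type T (hypothetical input).
    lane_count: the value of DW_AT_LLVM_lanes for the subprogram.
    """
    if lane_count <= 0:
        raise ValueError("DW_AT_LLVM_lanes must be greater than 0")
    # Only the low lane_count bits are meaningful: T has exactly that many bits.
    mask &= (1 << lane_count) - 1
    # Assumption: lanes are numbered from 0, so bit N corresponds to lane N.
    return [lane for lane in range(lane_count) if (mask >> lane) & 1]
```

For example, ``active_lanes(0b1011, 4)`` returns ``[0, 1, 3]``: lanes 0, 1 and
3 are active and lane 2 is inactive.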
| |
| A.3.4 Call Site Entries and Parameters |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| A.3.4.2 Call Site Parameters |
| ++++++++++++++++++++++++++++ |
| |
| 1. The call site entry may own ``DW_TAG_call_site_parameter`` debugging |
| information entries representing the parameters passed to the call. Call |
| site parameter entries occur in the same order as the corresponding |
| parameters in the source. Each such entry has a ``DW_AT_location`` attribute |
| which is a location description. This location description describes where |
| the parameter is passed (usually either some register, or a memory location |
| expressible as the contents of the stack register plus some offset). |
| |
| 2. A ``DW_TAG_call_site_parameter`` debugger information entry may have a |
| ``DW_AT_call_value`` attribute, whose value is a DWARF operation expression |
| E\ :sub:`1`\ . |
| |
| The result of the ``DW_AT_call_value`` attribute is obtained by evaluating |
| E\ :sub:`1` with a context that has a result kind of a value, an unspecified |
| object, the compilation unit that contains E\ :sub:`1`\ , an empty initial |
| stack, and other context elements corresponding to the source language |
| thread of execution upon which the user is focused, if any. The resulting |
| value V\ :sub:`1` is the value of the parameter at the time of the call made |
| by the call site. |
| |
| For parameters passed by reference, where the code passes a pointer to a |
| location which contains the parameter, or for reference type parameters, the |
| ``DW_TAG_call_site_parameter`` debugger information entry may also have a |
| ``DW_AT_call_data_location`` attribute whose value is a DWARF operation |
| expression E\ :sub:`2`\ , and a ``DW_AT_call_data_value`` attribute whose |
| value is a DWARF operation expression E\ :sub:`3`\ . |
| |
| The value of the ``DW_AT_call_data_location`` attribute is obtained by |
| evaluating E\ :sub:`2` with a context that has a result kind of a location |
| description, an unspecified object, the compilation unit that contains E\ |
| :sub:`2`\ , an empty initial stack, and other context elements corresponding |
| to the source language thread of execution upon which the user is focused, if |
| any. The resulting location description L\ :sub:`2` is the location where the |
| referenced parameter lives during the call made by the call site. If E\ |
| :sub:`2` would just be a ``DW_OP_push_object_address``, then the |
| ``DW_AT_call_data_location`` attribute may be omitted. |
| |
| .. note:: |
| |
| DWARF Version 5 implies that ``DW_OP_push_object_address`` may be used but |
| does not state what object must be specified in the context. Either |
| ``DW_OP_push_object_address`` cannot be used, or the object to be passed |
| in the context must be defined. |
| |
| The value of the ``DW_AT_call_data_value`` attribute is obtained by |
| evaluating E\ :sub:`3` with a context that has a result kind of a value, an |
| unspecified object, the compilation unit that contains E\ :sub:`3`\ , an |
| empty initial stack, and other context elements corresponding to the source |
| language thread of execution upon which the user is focused, if any. The |
| resulting value V\ :sub:`3` is the value in L\ :sub:`2` at the time of the |
| call made by the call site. |
| |
| The result of these attributes is undefined if the current call frame is not |
| for the subprogram containing the ``DW_TAG_call_site_parameter`` debugger |
| information entry or the current program location is not for the call site |
| containing the ``DW_TAG_call_site_parameter`` debugger information entry in |
| the current call frame. |
| |
| *The consumer may have to virtually unwind to the call site (see* |
| :ref:`amdgpu-dwarf-call-frame-information`\ *) in order to evaluate these |
| attributes. This will ensure the source language thread of execution upon |
| which the user is focused corresponds to the call site needed to evaluate |
| the expression.* |
| |
| If the expressions of these attributes cannot avoid accessing registers or |
| memory locations that might be clobbered by the subprogram being called by |
| the call site, then the associated attribute should not be provided. |
| |
| *The reason for the restriction is that the parameter may need to be |
| accessed during the execution of the callee. The consumer may virtually |
| unwind from the called subprogram back to the caller and then evaluate the |
| attribute expressions. The call frame information (see* |
| :ref:`amdgpu-dwarf-call-frame-information`\ *) will not be able to restore |
| registers that have been clobbered, and clobbered memory will no longer have |
| the value at the time of the call.* |
| |
| 3. Each call site parameter entry may also have a ``DW_AT_call_parameter`` |
| attribute which contains a reference to a ``DW_TAG_formal_parameter`` entry, |
| a ``DW_AT_type`` attribute referencing the type of the parameter, or a |
| ``DW_AT_name`` attribute describing the parameter's name. |
| |
| *Examples using call site entries and related attributes are found in DWARF |
| Version 5 Appendix D.15.* |
| |
| .. _amdgpu-dwarf-lexical-block-entries: |
| |
| A.3.5 Lexical Block Entries |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| .. note:: |
| |
| This section is the same as DWARF Version 5 section 3.5. |
| |
| A.4 Data Object and Object List Entries |
| --------------------------------------- |
| |
| .. note:: |
| |
| This section provides changes to existing debugger information entry |
| attributes. These would be incorporated into the corresponding DWARF Version 5 |
| chapter 4 sections. |
| |
| .. _amdgpu-dwarf-data-object-entries: |
| |
| A.4.1 Data Object Entries |
| ~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| Program variables, formal parameters and constants are represented by debugging |
| information entries with the tags ``DW_TAG_variable``, |
| ``DW_TAG_formal_parameter`` and ``DW_TAG_constant``, respectively. |
| |
| *The tag* ``DW_TAG_constant`` *is used for languages that have true named |
| constants.* |
| |
| The debugging information entry for a program variable, formal parameter or |
| constant may have the following attributes: |
| |
| 1. A ``DW_AT_location`` attribute, whose value is a DWARF expression E that |
| describes the location of a variable or parameter at run-time. |
| |
| The result of the attribute is obtained by evaluating E with a context that |
| has a result kind of a location description, an unspecified object, the |
| compilation unit that contains E, an empty initial stack, and other context |
| elements corresponding to the source language thread of execution upon which |
| the user is focused, if any. The result of the evaluation is the location |
| description of the base of the data object. |
| |
| See :ref:`amdgpu-dwarf-control-flow-operations` for special evaluation rules |
| used by the ``DW_OP_call*`` operations. |
| |
| .. note:: |
| |
| Delete the description of how the ``DW_OP_call*`` operations evaluate a |
| ``DW_AT_location`` attribute as that is now described in the operations. |
| |
| .. note:: |
| |
| See the discussion about the ``DW_AT_location`` attribute in the |
| ``DW_OP_call*`` operation. Having each attribute only have a single |
| purpose and single execution semantics seems desirable. It makes it easier |
| for the consumer, which no longer has to track the context. It makes it |
| easier for the producer as it can rely on a single semantics for each |
| attribute. |
| |
| For that reason, limiting the ``DW_AT_location`` attribute to only |
| supporting evaluating the location description of an object, and using a |
| different attribute and encoding class for the evaluation of DWARF |
| expression *procedures* on the same operation expression stack seems |
| desirable. |
| |
| 2. ``DW_AT_const_value`` |
| |
| .. note:: |
| |
| Could deprecate using the ``DW_AT_const_value`` attribute for |
| ``DW_TAG_variable`` or ``DW_TAG_formal_parameter`` debugger information |
| entries that have been optimized to a constant. Instead, |
| ``DW_AT_location`` could be used with a DWARF expression that produces an |
| implicit location description now that any location description can be |
| used within a DWARF expression. This allows the ``DW_OP_call*`` operations |
| to be used to push the location description of any variable regardless of |
| how it is optimized. |
| |
| 3. ``DW_AT_LLVM_memory_space`` |
| |
| A ``DW_AT_LLVM_memory_space`` attribute with a constant value representing a |
| source language specific DWARF memory space (see |
| :ref:`amdgpu-dwarf-memory-spaces`). If omitted, defaults to |
| ``DW_MSPACE_LLVM_none``. |
| |
| |
| A.4.2 Common Block Entries |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| A common block entry also has a ``DW_AT_location`` attribute whose value is a |
| DWARF expression E that describes the location of the common block at run-time. |
| The result of the attribute is obtained by evaluating E with a context that has |
| a result kind of a location description, an unspecified object, the compilation |
| unit that contains E, an empty initial stack, and other context elements |
| corresponding to the source language thread of execution upon which the user is |
| focused, if any. The result of the evaluation is the location description of the |
| base of the common block. See :ref:`amdgpu-dwarf-control-flow-operations` for |
| special evaluation rules used by the ``DW_OP_call*`` operations. |
| |
| A.5 Type Entries |
| ---------------- |
| |
| .. note:: |
| |
| This section provides changes to existing debugger information entry |
| attributes. These would be incorporated into the corresponding DWARF Version 5 |
| chapter 5 sections. |
| |
| .. _amdgpu-dwarf-base-type-entries: |
| |
| A.5.1 Base Type Entries |
| ~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| .. note:: |
| |
| The following new attribute is added. |
| |
| 1. A ``DW_TAG_base_type`` debugger information entry for a base type T may have |
| a ``DW_AT_LLVM_vector_size`` attribute whose value is an integer constant |
| that is the vector type size N. |
| |
| The representation of a vector base type is as N contiguous elements, each |
| one having the representation of a base type T' that is the same as T |
| without the ``DW_AT_LLVM_vector_size`` attribute. |
| |
| If a ``DW_TAG_base_type`` debugger information entry does not have a |
| ``DW_AT_LLVM_vector_size`` attribute, then the base type is not a vector |
| type. |
| |
| The DWARF is ill-formed if N is not greater than 0. |
| |
| .. note:: |
| |
| LLVM has mention of a non-upstreamed debugger information entry that is |
| intended to support vector types. However, that was not for a base type so |
| would not be suitable as the type of a stack value entry. But perhaps that |
| could be replaced by using this attribute. |
| |
| .. note:: |
| |
| Compare this with the ``DW_AT_GNU_vector`` extension supported by GNU. Is |
| it better to add an attribute to the existing ``DW_TAG_base_type`` debug |
| entry, or allow some forms of ``DW_TAG_array_type`` (those that have the |
| ``DW_AT_GNU_vector`` attribute) to be used as stack entry value types? |
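
The representation of a vector base type as N contiguous elements can be
sketched as follows. This is an illustration only: ``split_vector`` is a
hypothetical helper, and describing the element type T' with a Python
``struct`` format string is an assumption made for the sketch.

```python
import struct


def split_vector(data, element_format, vector_size):
    """Split the bytes of a vector base type value into its N elements.

    element_format: struct format for the element type T', for example "<i"
                    (little-endian 32-bit integer; an assumption here).
    vector_size: N, the value of DW_AT_LLVM_vector_size.
    """
    if vector_size <= 0:
        raise ValueError("DW_AT_LLVM_vector_size must be greater than 0")
    element_size = struct.calcsize(element_format)
    if len(data) != element_size * vector_size:
        raise ValueError("data size does not match N contiguous elements")
    # Element i starts at byte offset i * element_size.
    return [
        struct.unpack_from(element_format, data, i * element_size)[0]
        for i in range(vector_size)
    ]
```

For example, ``split_vector(struct.pack("<4i", 1, 2, 3, 4), "<i", 4)`` returns
``[1, 2, 3, 4]``.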
| |
| .. _amdgpu-dwarf-type-modifier-entries: |
| |
| A.5.3 Type Modifier Entries |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| .. note:: |
| |
| This section augments DWARF Version 5 section 5.3. |
| |
| A modified type entry describing a pointer or reference type (using |
| ``DW_TAG_pointer_type``, ``DW_TAG_reference_type`` or |
| ``DW_TAG_rvalue_reference_type``\ ) may have a ``DW_AT_LLVM_memory_space`` |
| attribute with a constant value representing a source language specific DWARF |
| memory space (see :ref:`amdgpu-dwarf-memory-spaces`). If omitted, defaults to |
| ``DW_MSPACE_LLVM_none``. |
| |
| A modified type entry describing a pointer or reference type (using |
| ``DW_TAG_pointer_type``, ``DW_TAG_reference_type`` or |
| ``DW_TAG_rvalue_reference_type``\ ) may have a ``DW_AT_LLVM_address_space`` |
| attribute with a constant value AS representing an architecture specific DWARF |
| address space (see :ref:`amdgpu-dwarf-address-spaces`). If omitted, defaults to |
| ``DW_ASPACE_LLVM_none``. DR is the offset of a hypothetical debug information |
| entry D in the current compilation unit for an integral base type matching the |
| address size of AS. An object P having the given pointer or reference type is |
| dereferenced as if the ``DW_OP_push_object_address; DW_OP_deref_type DR; |
| DW_OP_constu AS; DW_OP_LLVM_form_aspace_address`` operation expression was |
| evaluated with the current context except: the result kind is location |
| description; the initial stack is empty; and the object is the location |
| description of P. |
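
The dereference rule above can be modeled with a toy sketch. Everything here is
an assumption made for illustration: memory is a dictionary from address space
name to bytes, locations are ``(address_space, byte_address)`` pairs, the
pointer is stored little-endian, and ``dereference_pointer`` is a hypothetical
helper rather than a real debugger API.

```python
def dereference_pointer(memory, pointer_location, pointer_size, target_space):
    """Model dereferencing a pointer object P into address space AS.

    memory: dict mapping an address space name to a bytes-like object
            (a toy stand-in for target memory).
    pointer_location: (address_space, byte_address) where P is stored.
    pointer_size: size in bytes of the integral type matching AS.
    target_space: the address space AS named by the pointer type.
    """
    space, address = pointer_location
    raw = memory[space][address:address + pointer_size]
    if len(raw) != pointer_size:
        raise ValueError("pointer object extends past the modeled memory")
    # Reading P as an integer models the typed dereference of the object;
    # pairing the value with AS models forming an address in that space.
    value = int.from_bytes(raw, "little")  # little-endian is an assumption
    return (target_space, value)
```

With ``memory = {"default": bytes([0] * 8 + [0x34, 0x12, 0, 0])}``, the call
``dereference_pointer(memory, ("default", 8), 4, "group")`` returns
``("group", 0x1234)``.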
| |
| .. note:: |
| |
| What if the current context does not have a current target architecture |
| defined? |
| |
| .. note:: |
| |
| With the expanded support for DWARF address spaces, it may be worth examining |
| if they can be used for what was formerly supported by DWARF 5 segments. That |
| would include specifying the address space of all code addresses (compilation |
| units, subprograms, subprogram entries, labels, subprogram types, etc.). |
| Either the code address attributes could be extended to allow an exprloc form |
| (so that ``DW_OP_form_aspace_address`` can be used) or the |
| ``DW_AT_LLVM_address_space`` attribute be allowed on all DIEs that allow |
| ``DW_AT_segment``. |
| |
| A.5.7 Structure, Union, Class and Interface Type Entries |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| A.5.7.3 Derived or Extended Structures, Classes and Interfaces |
| ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| |
| 1. For a ``DW_AT_data_member_location`` attribute there are two cases: |
| |
| 1. If the attribute is an integer constant B, it provides the offset in |
| bytes from the beginning of the containing entity. |
| |
| The result of the attribute is obtained by evaluating a |
| ``DW_OP_LLVM_offset B`` operation with an initial stack comprising the |
| location description of the beginning of the containing entity. The |
| result of the evaluation is the location description of the base of the |
| member entry. |
| |
| *If the beginning of the containing entity is not byte aligned, then the |
| beginning of the member entry has the same bit displacement within a |
| byte.* |
| |
| 2. Otherwise, the attribute must be a DWARF expression E which is evaluated |
| with a context that has a result kind of a location description, an |
| unspecified object, the compilation unit that contains E, an initial |
| stack comprising the location description of the beginning of the |
| containing entity, and other context elements corresponding to the |
| source language thread of execution upon which the user is focused, if |
| any. The result of the evaluation is the location description of the |
| base of the member entry. |
| |
| .. note:: |
| |
| The beginning of the containing entity can now be any location |
| description, including those with more than one single location |
| description, and those with single location descriptions that are of any |
| kind and have any bit offset. |
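|
| As a minimal, non-normative sketch of the integer constant case, the following |
| Python models a location description as a tuple of single location |
| descriptions, each with a bit offset. The ``SingleLocation`` class, the storage |
| names, and the helper functions are hypothetical and are not defined by these |
| extensions. |
|
```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SingleLocation:
    """Hypothetical model of one single location description."""
    storage: str     # illustrative storage name, e.g. "memory" or "register:v0"
    bit_offset: int  # bit offset within that storage


# A location description may hold more than one single location description.
LocationDescription = Tuple[SingleLocation, ...]


def llvm_offset(loc: LocationDescription, byte_displacement: int) -> LocationDescription:
    """Add a byte displacement, scaled to bits, to every single location."""
    return tuple(
        SingleLocation(sl.storage, sl.bit_offset + 8 * byte_displacement)
        for sl in loc
    )


def member_location_from_constant(containing: LocationDescription, b: int) -> LocationDescription:
    """Constant form: start with the containing entity on the stack, apply the offset."""
    stack = [containing]
    stack.append(llvm_offset(stack.pop(), b))
    return stack.pop()


# A containing entity that starts 3 bits into the byte at address 0x1000.
base = (SingleLocation("memory", 0x1000 * 8 + 3),)
member = member_location_from_constant(base, 4)
```
|
| In this sketch the member keeps the same 3-bit displacement within a byte as |
| the containing entity, matching the non-normative remark above. |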
| |
| A.5.7.8 Member Function Entries |
| +++++++++++++++++++++++++++++++ |
| |
| 1. An entry for a virtual function also has a ``DW_AT_vtable_elem_location`` |
| attribute whose value is a DWARF expression E. |
| |
| The result of the attribute is obtained by evaluating E with a context that |
| has a result kind of a location description, an unspecified object, the |
| compilation unit that contains E, an initial stack comprising the location |
| description of the object of the enclosing type, and other context elements |
| corresponding to the source language thread of execution upon which the user |
| is focused, if any. The result of the evaluation is the location description |
| of the slot for the function within the virtual function table for the |
| enclosing class. |
| |
| A.5.14 Pointer to Member Type Entries |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| 1. The ``DW_TAG_ptr_to_member_type`` debugging information entry has a |
| ``DW_AT_use_location`` attribute whose value is a DWARF expression E. It is |
| used to compute the location description of the member of the class to which |
| the pointer to member entry points. |
| |
| *The method used to find the location description of a given member of a |
| class, structure, or union is common to any instance of that class, |
| structure, or union and to any instance of the pointer to member type. The |
| method is thus associated with the pointer to member type, rather than with |
| each object that has a pointer to member type.* |
| |
| The ``DW_AT_use_location`` DWARF expression is used in conjunction with the |
| location description for a particular object of the given pointer to member |
| type and for a particular structure or class instance. |
| |
| The result of the attribute is obtained by evaluating E with a context that |
| has a result kind of a location description, an unspecified object, the |
| compilation unit that contains E, an initial stack comprising two entries, |
| and other context elements corresponding to the source language thread of |
| execution upon which the user is focused, if any. The first stack entry is |
| the value of the pointer to member object itself. The second stack entry is |
| the location description of the base of the entire class, structure, or |
| union instance containing the member whose location is being calculated. The |
| result of the evaluation is the location description of the member of the |
| class to which the pointer to member entry points. |
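|
| The two-entry initial stack can be illustrated with a small, non-normative |
| Python sketch. It assumes that the first stack entry is pushed first (so the |
| second entry is on top), that the pointer to member value is a byte offset, |
| and that a producer emits ``DW_OP_swap`` followed by ``DW_OP_LLVM_offset``. |
| These are assumptions made for illustration only; the evaluator, its tuple |
| model of a location description, and the function name are hypothetical. |
|
```python
def evaluate_use_location(ops, member_pointer_value, base_location):
    """Evaluate a tiny subset of operations for a DW_AT_use_location sketch.

    base_location is modelled as a (storage, bit_offset) pair.
    """
    # Initial stack: the pointer to member value first, the base location second.
    stack = [member_pointer_value, base_location]
    for op in ops:
        if op == "DW_OP_swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "DW_OP_LLVM_offset":
            displacement = stack.pop()         # top entry: byte displacement
            storage, bit_offset = stack.pop()  # next entry: location description
            stack.append((storage, bit_offset + 8 * displacement))
        else:
            raise ValueError(f"operation not modelled in this sketch: {op}")
    return stack[-1]


# A member 8 bytes into an instance whose base is at bit offset 800 of "memory".
result = evaluate_use_location(
    ["DW_OP_swap", "DW_OP_LLVM_offset"], 8, ("memory", 800)
)
```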
| |
| A.5.18 Dynamic Properties of Types |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| A.5.18.1 Data Location |
| ++++++++++++++++++++++ |
| |
| *Some languages may represent objects using descriptors to hold information, |
| including a location and/or run-time parameters, about the data that represents |
| the value for that object.* |
| |
| 1. The ``DW_AT_data_location`` attribute may be used with any type that |
| provides one or more levels of hidden indirection and/or run-time parameters |
| in its representation. Its value is a DWARF operation expression E which |
| computes the location description of the data for an object. When this |
| attribute is omitted, the location description of the data is the same as |
| the location description of the object. |
| |
| The result of the attribute is obtained by evaluating E with a context that |
| has a result kind of a location description, an object that is the location |
| description of the data descriptor, the compilation unit that contains E, an |
| empty initial stack, and other context elements corresponding to the source |
| language thread of execution upon which the user is focused, if any. The |
| result of the evaluation is the location description of the data for the |
| object. |
| |
| *E will typically involve an operation expression that begins with a* |
| ``DW_OP_push_object_address`` *operation which loads the location |
| description of the object which can then serve as a descriptor in subsequent |
| calculation.* |
| |
| .. note:: |
| |
| Since ``DW_AT_data_member_location``, ``DW_AT_use_location``, and |
| ``DW_AT_vtable_elem_location`` allow both operation expressions and |
| location list expressions, why does ``DW_AT_data_location`` not allow |
| both? In all cases they apply to data objects, so it is less likely that |
| optimization would cause different operation expressions for different |
| program location ranges. But if location list expressions are supported |
| for some of these attributes, then they should be supported for all. |
| |
| It seems odd that this attribute differs from |
| ``DW_AT_data_member_location`` in not having an initial stack containing |
| the location description of the object, since the expression needs it. |
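|
| As a non-normative illustration of a ``DW_AT_data_location`` expression that |
| begins with ``DW_OP_push_object_address``, the following Python sketch models a |
| descriptor whose first field is a pointer to the data. The evaluator, the |
| 8-byte little-endian address size, and the conversion of a final value to a |
| memory location are assumptions of this sketch. |
|
```python
def evaluate_data_location(ops, memory, object_address, address_size=8):
    """Evaluate a tiny subset of operations with an empty initial stack.

    Locations are modelled as ("memory", byte_address) pairs; memory is a
    bytes object indexed by byte address (both hypothetical simplifications).
    """
    stack = []
    for op in ops:
        if op == "DW_OP_push_object_address":
            # The context object is the location of the data descriptor.
            stack.append(("memory", object_address))
        elif op == "DW_OP_deref":
            _, address = stack.pop()
            raw = memory[address:address + address_size]
            stack.append(int.from_bytes(raw, "little"))
        else:
            raise ValueError(f"operation not modelled in this sketch: {op}")
    top = stack[-1]
    # The result kind is a location description, so a plain value is treated
    # here as an address in memory.
    return ("memory", top) if isinstance(top, int) else top


# A descriptor at address 8 whose first field points to data at address 0x20.
memory = (0).to_bytes(8, "little") + (0x20).to_bytes(8, "little")
data_location = evaluate_data_location(
    ["DW_OP_push_object_address", "DW_OP_deref"], memory, 8
)
```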
| |
| A.6 Other Debugging Information |
| ------------------------------- |
| |
| .. note:: |
| |
| This section provides changes to existing debugging information entry |
| attributes. These would be incorporated into the corresponding DWARF Version 5 |
| chapter 6 sections. |
| |
| A.6.1 Accelerated Access |
| ~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| .. _amdgpu-dwarf-lookup-by-name: |
| |
| A.6.1.1 Lookup By Name |
| ++++++++++++++++++++++ |
| |
| A.6.1.1.1 Contents of the Name Index |
| #################################### |
| |
| .. note:: |
| |
| The following provides changes to DWARF Version 5 section 6.1.1.1. |
| |
| The rule for debugging information entries included in the name index in the |
| optional ``.debug_names`` section is extended to also include named |
| ``DW_TAG_variable`` debugging information entries with a ``DW_AT_location`` |
| attribute that includes a ``DW_OP_LLVM_form_aspace_address`` operation. |
| |
| The name index must contain an entry for each debugging information entry that |
| defines a named subprogram, label, variable, type, or namespace, subject to the |
| following rules: |
| |
| * ``DW_TAG_variable`` debugging information entries with a ``DW_AT_location`` |
| attribute that includes a ``DW_OP_addr``, ``DW_OP_LLVM_form_aspace_address``, |
| or ``DW_OP_form_tls_address`` operation are included; otherwise, they are |
| excluded. |
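|
| The variable rule above can be sketched as a simple predicate. This is a |
| non-normative illustration: the function name and the representation of a |
| ``DW_AT_location`` expression as a list of operation names are hypothetical. |
|
```python
# Operations named by the rule for DW_TAG_variable entries.
ADDRESS_FORMING_OPS = frozenset({
    "DW_OP_addr",
    "DW_OP_LLVM_form_aspace_address",
    "DW_OP_form_tls_address",
})


def variable_belongs_in_name_index(name, location_ops):
    """Return True if a DW_TAG_variable entry would get a name index entry.

    name is the entry's name, or None if unnamed. location_ops is the list of
    operation names in its DW_AT_location, or None if it has no such attribute.
    """
    if not name or location_ops is None:
        return False
    return any(op in ADDRESS_FORMING_OPS for op in location_ops)
```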
| |
| A.6.1.1.4 Data Representation of the Name Index |
| ############################################### |
| |
| .. _amdgpu-dwarf-name-index-section-header: |
| |
| |
| A.6.1.1.4.1 Section Header |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| .. note:: |
| |
| The following provides an addition to DWARF Version 5 section 6.1.1.4.1 item |
| 14 ``augmentation_string``. |
| |
| A null-terminated UTF-8 vendor specific augmentation string, which provides |
| additional information about the contents of this index. If provided, the |
| recommended format for the augmentation string is: |
| |
| | ``[``\ *vendor*\ ``:v``\ *X*\ ``.``\ *Y*\ [\ ``:``\ *options*\ ]\ ``]``\ * |
| |
| Where *vendor* is the producer, ``vX.Y`` specifies the major X and minor Y |
| version number of the extensions used in the DWARF of the compilation unit, and |
| *options* is an optional string providing additional information about the |
| extensions. The version number must conform to semantic versioning [:ref:`SEMVER |
| <amdgpu-dwarf-SEMVER>`]. The *options* string must not contain the "\ ``]``\ " |
| character. |
| |
| For example: |
| |
| :: |
| |
| [abc:v0.0][def:v1.2:feature-a=on,feature-b=3] |
| |
| .. note:: |
| |
| This differs from the definition in DWARF Version 5 but is consistent with |
| the other augmentation strings and allows multiple vendor extensions to be |
| supported. |
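|
| A consumer could split an augmentation string in the recommended format into |
| its vendor entries as in the following non-normative Python sketch. The |
| ``VendorExtension`` type and the ``parse_augmentation_string`` function are |
| hypothetical names, and rejecting any string that does not follow the |
| recommended format is a choice of this sketch, not a requirement. |
|
```python
import re
from typing import List, NamedTuple, Optional


class VendorExtension(NamedTuple):
    vendor: str
    major: int
    minor: int
    options: Optional[str]  # None when no options string is present


# One "[vendor:vX.Y[:options]]" entry; options must not contain "]".
_ENTRY = re.compile(r"\[([^:\[\]]+):v(\d+)\.(\d+)(?::([^\]]*))?\]")


def parse_augmentation_string(text: str) -> List[VendorExtension]:
    """Parse zero or more concatenated vendor entries."""
    entries = []
    pos = 0
    while pos < len(text):
        match = _ENTRY.match(text, pos)
        if match is None:
            raise ValueError(f"unrecognized augmentation text at offset {pos}")
        vendor, major, minor, options = match.groups()
        entries.append(VendorExtension(vendor, int(major), int(minor), options))
        pos = match.end()
    return entries


parsed = parse_augmentation_string("[abc:v0.0][def:v1.2:feature-a=on,feature-b=3]")
```
|
| For the example string above, ``parsed`` holds one entry for ``abc`` at |
| version 0.0 with no options and one entry for ``def`` at version 1.2 with the |
| options string ``feature-a=on,feature-b=3``. |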
| |
| .. _amdgpu-dwarf-line-number-information: |
| |
| A.6.2 Line Number Information |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| A.6.2.4 The Line Number Program Header |
| ++++++++++++++++++++++++++++++++++++++ |
| |
| A.6.2.4.1 Standard Content Descriptions |
| ####################################### |
| |
| .. note:: |
| |
| This augments DWARF Version 5 section 6.2.4.1. |
| |
| .. _amdgpu-dwarf-line-number-information-dw-lnct-llvm-source: |
| |
| 1. ``DW_LNCT_LLVM_source`` |
| |
| The component is a null-terminated UTF-8 source text string with "\ ``\n``\ |
| " line endings. This content code is paired with the same forms as |
| ``DW_LNCT_path``. It can be used for file name entries. |
| |
| The value is an empty null-terminated string if no source is available. If |
| the source is available but is an empty file then the value is a |
| null-terminated single "\ ``\n``\ ". |
| |
| *When the source field is present, consumers can use the embedded source |
| instead of attempting to discover the source on disk using the file path |
| provided by the* ``DW_LNCT_path`` *field. When the source field is absent, |
| consumers can access the file to get the source text.* |
| |
| *This is particularly useful for programming languages that support runtime |
| compilation and runtime generation of source text. In these cases, the |
| source text does not reside in any permanent file. For example, the OpenCL |
| language [:ref:`OpenCL <amdgpu-dwarf-OpenCL>`] supports online compilation.* |
| |
| 2. ``DW_LNCT_LLVM_is_MD5`` |
| |
| ``DW_LNCT_LLVM_is_MD5`` indicates if the ``DW_LNCT_MD5`` content kind, if |
| present, is valid: when 0 it is not valid and when 1 it is valid. If |
| ``DW_LNCT_LLVM_is_MD5`` content kind is not present, and ``DW_LNCT_MD5`` |
| content kind is present, then the MD5 checksum is valid. |
| |
| ``DW_LNCT_LLVM_is_MD5`` is always paired with the ``DW_FORM_udata`` form. |
| |
| *This allows a compilation unit to have a mixture of files with and without |
| MD5 checksums. This can happen when multiple relocatable files are linked |
| together.* |
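|
| The rules for these two content codes can be summarized in a short, |
| non-normative Python sketch. The function names and the use of ``None`` to |
| represent an absent field are hypothetical. Treating a |
| ``DW_LNCT_LLVM_is_MD5`` value other than 0 or 1 as not valid is an assumption |
| of this sketch. |
|
```python
from typing import List, Optional


def md5_is_valid(has_md5: bool, is_md5: Optional[int]) -> bool:
    """Decide whether a file entry's DW_LNCT_MD5 checksum can be trusted.

    is_md5 is the DW_LNCT_LLVM_is_MD5 value, or None if that field is absent.
    """
    if not has_md5:
        return False
    if is_md5 is None:
        return True  # absent flag with MD5 present: the checksum is valid
    return is_md5 == 1


def embedded_source_lines(source: Optional[str]) -> Optional[List[str]]:
    """Return the embedded source as lines, or None to fall back to the path.

    source is the DW_LNCT_LLVM_source string, or None if the field is absent.
    """
    if source is None or source == "":
        return None  # no embedded source available
    if source == "\n":
        return []    # a lone newline encodes an empty file
    lines = source.split("\n")
    if lines[-1] == "":
        lines.pop()  # drop the empty piece after the final line ending
    return lines
```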
| |
| .. _amdgpu-dwarf-call-frame-information: |
| |
| A.6.4 Call Frame Information |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| .. note:: |
| |
| This section provides changes to existing call frame information and defines |
| instructions added by these extensions. Additional support is added for |
| address spaces. Register unwind DWARF expressions are generalized to allow any |
| location description, including those with composite and implicit location |
| descriptions. |
| |
| These changes would be incorporated into the DWARF Version 5 section 6.4. |
| |
| .. _amdgpu-dwarf-structure_of-call-frame-information: |
| |
| A.6.4.1 Structure of Call Frame Information |
| +++++++++++++++++++++++++++++++++++++++++++ |
| |
| The register rules are: |
| |
| *undefined* |
| A register that has this rule has no recoverable value in the previous frame. |
| The previous value of this register is the undefined location description (see |
| :ref:`amdgpu-dwarf-undefined-location-description-operations`). |
| |
| *By convention, the register is not preserved by a callee.* |
| |
| *same value* |
| This register has not been modified from the previous caller frame. |
| |
| If the current frame is the top frame, then the previous value of this |
| register is the location description L that specifies one register location |
| description SL. SL specifies the register location storage that corresponds to |
| the register with a bit offset of 0 for the current thread. |
| |
| If the current frame is not the top frame, then the previous value of this |
| register is the location description obtained using the call frame information |
| for the callee frame and callee program location invoked by the current caller |
| frame for the same register. |
| |
| *By convention, the register i
|