[DebugInfo] Emit negative DW_AT_bit_offset in explicit signed form (#87994)

Before this patch, the value of DW_AT_bit_offset, used for bitfields
before DWARF version 4, was always emitted as an unsigned integer using
the form DW_FORM_data<n>. If the value was originally a signed integer,
for instance in the case of negative offsets, it was up to debug
information consumers to re-cast it to a signed integer.

This is problematic since the burden of deciding if the value should be
read as signed or unsigned was put onto the debug info consumers: the
DWARF specification doesn't define DW_AT_bit_offset's underlying type.
If a debugger decided to interpret this attribute in the form data<n> as
unsigned, then negative offsets would be completely broken.

The DWARF specification version 3 mentions in the Data Representation
section, page 127:

> If one of the DW_FORM_data<n> forms is used to represent a signed or
unsigned integer, it can be hard for a consumer to discover the context
necessary to determine which interpretation is intended. Producers are
therefore strongly encouraged to use DW_FORM_sdata or DW_FORM_udata for
signed and unsigned integers respectively, rather than DW_FORM_data<n>.

Therefore, the proposal is to use DW_FORM_sdata, which is explicitly
signed. This is an indication to consumers that the offset must be
parsed unambiguously as a signed integer.

Finally, gcc already uses DW_FORM_sdata for negative offsets, fixing the
potential ambiguity altogether.

This patch mimics gcc's behaviour by emitting negative values of
DW_AT_bit_offset using the DW_FORM_sdata form. This eliminates any
potential misinterpretation.

One could argue that all values should use DW_FORM_sdata, but for the
sake of parity with gcc, it is safe to restrict the change to negative
values.
4 files changed
tree: 7a998e70cca12f184bb5dabdd886e27c2c900a6f
  1. .ci/
  2. .github/
  3. bolt/
  4. clang/
  5. clang-tools-extra/
  6. cmake/
  7. compiler-rt/
  8. cross-project-tests/
  9. flang/
  10. libc/
  11. libclc/
  12. libcxx/
  13. libcxxabi/
  14. libunwind/
  15. lld/
  16. lldb/
  17. llvm/
  18. llvm-libgcc/
  19. mlir/
  20. offload/
  21. openmp/
  22. polly/
  23. pstl/
  24. runtimes/
  25. third-party/
  26. utils/
  27. .clang-format
  28. .clang-tidy
  29. .git-blame-ignore-revs
  30. .gitattributes
  31. .gitignore
  32. .mailmap
  33. CODE_OF_CONDUCT.md
  34. CONTRIBUTING.md
  35. LICENSE.TXT
  36. pyproject.toml
  37. README.md
  38. SECURITY.md
README.md

The LLVM Compiler Infrastructure

OpenSSF Scorecard OpenSSF Best Practices libc++

Welcome to the LLVM project!

This repository contains the source code for LLVM, a toolkit for the construction of highly optimized compilers, optimizers, and run-time environments.

The LLVM project has multiple components. The core of the project is itself called “LLVM”. This contains all of the tools, libraries, and header files needed to process intermediate representations and convert them into object files. Tools include an assembler, disassembler, bitcode analyzer, and bitcode optimizer.

C-like languages use the Clang frontend. This component compiles C, C++, Objective-C, and Objective-C++ code into LLVM bitcode -- and from there into object files, using LLVM.

Other components include: the libc++ C++ standard library, the LLD linker, and more.

Getting the Source Code and Building LLVM

Consult the Getting Started with LLVM page for information on building and running LLVM.

For information on how to contribute to the LLVM project, please take a look at the Contributing to LLVM guide.

Getting in touch

Join the LLVM Discourse forums, Discord chat, LLVM Office Hours or Regular sync-ups.

The LLVM project has adopted a code of conduct for participants to all modes of communication within the project.