Stateful variable-location annotations in Disassembler::PrintInstructions() (follow-up to #147460) (#152887)

**Context**  
Follow-up to
[#147460](https://github.com/llvm/llvm-project/pull/147460), which added
the ability to surface register-resident variable locations.
This PR moves the annotation logic out of `Instruction::Dump()` and into
`Disassembler::PrintInstructions()`, and adds lightweight state tracking
so we only print changes at range starts and when variables go out of
scope.

---

## What this does

While iterating the instructions for a function, we maintain a “live
variable map” keyed by `lldb::user_id_t` (the `Variable`’s ID) to
remember each variable’s last emitted location string. For each
instruction:

- **New (or newly visible) variable** → print `name = <location>` once
at the start of its DWARF location range, cache it.
- **Location changed** (e.g., DWARF range switched to a different
register/const) → print the updated mapping.
- **Out of scope** (was tracked previously but not found for the current
PC) → print `name = <undef>` and drop it.

This produces **concise, stateful annotations** that highlight variable
lifetime transitions without spamming every line.

---

## Why in `PrintInstructions()`?

- Keeps `Instruction` stateless and avoids changing the
`Instruction::Dump()` virtual API.
- Makes it straightforward to diff state across instructions (`prev →
current`) inside the single driver loop.

---

## How it works (high-level)

1. For the current PC, get in-scope variables via
`StackFrame::GetInScopeVariableList(/*get_parent=*/true)`.
2. For each `Variable`, query
`DWARFExpressionList::GetExpressionEntryAtAddress(func_load_addr,
current_pc)` (added in #144238).
3. If the entry exists, call `DumpLocation(..., eDescriptionLevelBrief,
abi)` to get a short, ABI-aware location string (e.g., `DW_OP_reg3 RBX →
RBX`).
4. Compare against the last emitted location in the live map:
   - If not present → emit `name = <location>` and record it.
   - If different → emit updated mapping and record it.
5. After processing current in-scope variables, compute the set
difference vs. the previous map and emit `name = <undef>` for any that
disappeared.

Internally:
- We respect file↔load address translation already provided by
`DWARFExpressionList`.
- We reuse the ABI to map LLVM register numbers to arch register names.

---

## Example output (x86_64, simplified)

```
->  0x55c6f5f6a140 <+0>:  cmpl   $0x2, %edi                                                         ; argc = RDI, argv = RSI
    0x55c6f5f6a143 <+3>:  jl     0x55c6f5f6a176            ; <+54> at d_original_example.c:6:3
    0x55c6f5f6a145 <+5>:  pushq  %r15
    0x55c6f5f6a147 <+7>:  pushq  %r14
    0x55c6f5f6a149 <+9>:  pushq  %rbx
    0x55c6f5f6a14a <+10>: movq   %rsi, %rbx
    0x55c6f5f6a14d <+13>: movl   %edi, %r14d
    0x55c6f5f6a150 <+16>: movl   $0x1, %r15d                                                        ; argc = R14
    0x55c6f5f6a156 <+22>: nopw   %cs:(%rax,%rax)                                                    ; i = R15, argv = RBX
    0x55c6f5f6a160 <+32>: movq   (%rbx,%r15,8), %rdi
    0x55c6f5f6a164 <+36>: callq  0x55c6f5f6a030            ; symbol stub for: puts
    0x55c6f5f6a169 <+41>: incq   %r15
    0x55c6f5f6a16c <+44>: cmpq   %r15, %r14
    0x55c6f5f6a16f <+47>: jne    0x55c6f5f6a160            ; <+32> at d_original_example.c:5:10
    0x55c6f5f6a171 <+49>: popq   %rbx                                                               ; i = <undef>
    0x55c6f5f6a172 <+50>: popq   %r14                                                               ; argv = RSI
    0x55c6f5f6a174 <+52>: popq   %r15                                                               ; argc = RDI
    0x55c6f5f6a176 <+54>: xorl   %eax, %eax
    0x55c6f5f6a178 <+56>: retq  
```

Only transitions are shown: the start of a location, changes, and
end-of-lifetime.

---

## Scope & limitations (by design)

- Handles **simple locations** first (registers, const-in-register cases
surfaced by `DumpLocation`).
- **Memory/composite locations** are out of scope for this PR.
- Annotations appear **only at range boundaries** (start/change/end) to
minimize noise.
- Output is **target-independent**; register names come from the target
ABI.

## Implementation notes

- All annotation printing now happens in
`Disassembler::PrintInstructions()`.
- Uses `std::unordered_map<lldb::user_id_t, std::string>` as the live
map.
- No persistent state across calls; the map is rebuilt while walking
instruction by instruction.
- **No changes** to the `Instruction` interface.

---

## Requested feedback

- Placement and wording of the `<undef>` marker.
- Whether we should optionally gate this behind a setting (currently
always on when disassembling with an `ExecutionContext`).
- Preference for immediate inclusion of tests vs. follow-up patch.

---

Thanks for reviewing! Happy to adjust behavior/format based on feedback.

---------

Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
Co-authored-by: Adrian Prantl <adrian.prantl@gmail.com>
19 files changed
tree: 7c69a5d7fedbbbbea72dbf42590e6937576e5b71
  1. .ci/
  2. .github/
  3. bolt/
  4. clang/
  5. clang-tools-extra/
  6. cmake/
  7. compiler-rt/
  8. cross-project-tests/
  9. flang/
  10. flang-rt/
  11. libc/
  12. libclc/
  13. libcxx/
  14. libcxxabi/
  15. libsycl/
  16. libunwind/
  17. lld/
  18. lldb/
  19. llvm/
  20. llvm-libgcc/
  21. mlir/
  22. offload/
  23. openmp/
  24. orc-rt/
  25. polly/
  26. runtimes/
  27. third-party/
  28. utils/
  29. .clang-format
  30. .clang-format-ignore
  31. .clang-tidy
  32. .git-blame-ignore-revs
  33. .gitattributes
  34. .gitignore
  35. .mailmap
  36. CODE_OF_CONDUCT.md
  37. CONTRIBUTING.md
  38. LICENSE.TXT
  39. pyproject.toml
  40. README.md
  41. SECURITY.md
README.md

The LLVM Compiler Infrastructure

OpenSSF Scorecard OpenSSF Best Practices libc++

Welcome to the LLVM project!

This repository contains the source code for LLVM, a toolkit for the construction of highly optimized compilers, optimizers, and run-time environments.

The LLVM project has multiple components. The core of the project is itself called “LLVM”. This contains all of the tools, libraries, and header files needed to process intermediate representations and convert them into object files. Tools include an assembler, disassembler, bitcode analyzer, and bitcode optimizer.

C-like languages use the Clang frontend. This component compiles C, C++, Objective-C, and Objective-C++ code into LLVM bitcode -- and from there into object files, using LLVM.

Other components include: the libc++ C++ standard library, the LLD linker, and more.

Getting the Source Code and Building LLVM

Consult the Getting Started with LLVM page for information on building and running LLVM.

For information on how to contribute to the LLVM project, please take a look at the Contributing to LLVM guide.

Getting in touch

Join the LLVM Discourse forums, Discord chat, LLVM Office Hours or Regular sync-ups.

The LLVM project has adopted a code of conduct for participants to all modes of communication within the project.