[lldb] Improve mid-function epilogue scanning for x86 (#110965)

The x86 assembly instruction scanner creates incorrect UnwindPlans when
a mid-function epilogue has a non-epilogue instruction in it.

The x86 instruction analysis which creates an UnwindPlan handles
mid-function epilogues by tracking "epilogue instructions" (register
loads from stack, stack pointer increasing, etc) and any UnwindPlan
updates which are NOT epilogue instructions update the "prologue
UnwindPlan" saved row. It detects a LEAVE/RET/unconditional JMP out of
the function and after that instruction, re-instates the "prologue Row".

There's a parallel piece of data tracked across the duration of the
function, current_sp_bytes_offset_from_fa, and we reflect the "value
after prologue instructions" in
prologue_completed_sp_bytes_offset_from_cfa. When the CFA is calculated
in terms of the frame pointer ($ebp/$rbp), we don't add changes to the
stack pointer to the UnwindPlan, so this separate mechanism is used for
the "current value" and the "last value at prologue setup".

(the x86 UnwindPlan generated writes "sp=CFA+0" as a register rule which
is formally correct, but it could also track the stack pointer value as
sp=$rsp+<x> and update this register rule every time $rsp is modified.)

This leads to a bug when there is an instruction in an epilogue which
isn't recognzied as an epilogue instruction.
prologue_completed_sp_bytes_offset_from_cfa is always set to the value
of current_sp_bytes_offset_from_fa unless the current instruction is an
epilogue instruction. With a non-epilogue instruction in the middle of
the epilogue, we suddenly copy a current_sp_bytes_offset_from_fa value
from the middle of the epilogue into this
prologue_completed_sp_bytes_offset_from_cfa. Once the epilogue is
finished, we restore the "prologue Row" and
prologue_completed_sp_bytes_offset_from_cfa. But now $rsp has a very
incorrect value in it.

This patch tracks when we've updated current_sp_bytes_offset_from_fa in
the current instruction analysis. If it was updated looking at an
epilogue instruction, `is_epilogue` will be set correctly. Otherwise
it's a "prologue" instruction and we should update
prologue_completed_sp_bytes_offset_from_cfa. Any instruction that is
unrecognized will leave prologue_completed_sp_bytes_offset_from_cfa
unmodified.

The actual instruction we hit this with was a BTRQ but I added a NOP to
the unit test which is only 1 byte and made the update to the unit test
a little simpler. This bug is hit with a NOP just as well.

UnwindAssemblyInstEmulation has a much better algorithm for handling
mid-function epilogues, which "forward" the current unwind state Row
when it sees branches within the function, to the target instruction
offset. This avoids detecting prologue/epilogue instructions altogether.

rdar://137153323
2 files changed
tree: 6cfa4d982434abba723653b91a228a30ed77ce0c
  1. .ci/
  2. .github/
  3. bolt/
  4. clang/
  5. clang-tools-extra/
  6. cmake/
  7. compiler-rt/
  8. cross-project-tests/
  9. flang/
  10. libc/
  11. libclc/
  12. libcxx/
  13. libcxxabi/
  14. libunwind/
  15. lld/
  16. lldb/
  17. llvm/
  18. llvm-libgcc/
  19. mlir/
  20. offload/
  21. openmp/
  22. polly/
  23. pstl/
  24. runtimes/
  25. third-party/
  26. utils/
  27. .clang-format
  28. .clang-tidy
  29. .git-blame-ignore-revs
  30. .gitattributes
  31. .gitignore
  32. .mailmap
  33. CODE_OF_CONDUCT.md
  34. CONTRIBUTING.md
  35. LICENSE.TXT
  36. pyproject.toml
  37. README.md
  38. SECURITY.md
README.md

The LLVM Compiler Infrastructure

OpenSSF Scorecard OpenSSF Best Practices libc++

Welcome to the LLVM project!

This repository contains the source code for LLVM, a toolkit for the construction of highly optimized compilers, optimizers, and run-time environments.

The LLVM project has multiple components. The core of the project is itself called “LLVM”. This contains all of the tools, libraries, and header files needed to process intermediate representations and convert them into object files. Tools include an assembler, disassembler, bitcode analyzer, and bitcode optimizer.

C-like languages use the Clang frontend. This component compiles C, C++, Objective-C, and Objective-C++ code into LLVM bitcode -- and from there into object files, using LLVM.

Other components include: the libc++ C++ standard library, the LLD linker, and more.

Getting the Source Code and Building LLVM

Consult the Getting Started with LLVM page for information on building and running LLVM.

For information on how to contribute to the LLVM project, please take a look at the Contributing to LLVM guide.

Getting in touch

Join the LLVM Discourse forums, Discord chat, LLVM Office Hours or Regular sync-ups.

The LLVM project has adopted a code of conduct for participants to all modes of communication within the project.