[lldb][AArch64] Fix expression evaluation with Guarded Control Stacks (#123918) When the Guarded Control Stack (GCS) is enabled, returns cause the processor to validate that the address at the location pointed to by gcspr_el0 matches the one in the link register. ``` ret (lr=A) << pc | GCS | +=====+ | A | | B | << gcspr_el0 Fault: tried to return to A when you should have returned to B. ``` Therefore when an expression wrapper function tries to return to the expression return address (usually `_start` if there is a libc), it would fault. ``` ret (lr=_start) << pc | GCS | +============+ | user_func1 | | user_func2 | << gcspr_el0 Fault: tried to return to _start when you should have returned to user_func2. ``` To fix this we must push that return address to the GCS in PrepareTrivialCall. This value is then consumed by the final return and the expression completes as expected. If for some reason that fails, we will manually restore the value of gcspr_el0, because it turns out that PrepareTrivialCall does not restore registers if it fails at all. So for now I am handling gcspr_el0 specifically, but I have filed https://github.com/llvm/llvm-project/issues/124269 to address the general problem. (the other things PrepareTrivialCall does are exceedingly likely to not fail, so we have never noticed this) ``` ret (lr=_start) << pc | GCS | +============+ | user_func1 | | user_func2 | | _start | << gcspr_el0 No fault, we return to _start as normal. ``` The gcspr_el0 register will be restored after expression evaluation so that the program can continue correctly. However, due to restrictions in the Linux GCS ABI, we will not restore the enable bit of gcs_features_enabled. Re-enabling GCS via ptrace is not supported because it requires memory to be allocated by the kernel. We could disable GCS if the expression enabled GCS, however this would use up that state transition that the program might later rely on. And generally it is cleaner to ignore the enable bit, rather than one state transition of it. We will also not restore the GCS entry that was overwritten with the expression's return address. On the grounds that: * This entry will never be used by the program. If the program branches, the entry will be overwritten. If the program returns, gcspr_el0 will point to the entry before the expression return address and that entry will instead be validated. * Any expression that calls functions will overwrite even more entries, so the user needs to be aware of that anyway if they want to preserve the contents of the GCS for inspection. * An expression could leave the program in a state where restoring the value makes the situation worse. Especially if we ever support this in bare metal debugging. I will later document all this on https://lldb.llvm.org/use/aarch64-linux.html. Tests have been added for: * A function call that does not interact with GCS. * A call that does, and disables it (we do not re-enable it). * A call that does, and enables it (we do not disable it again). * Failure to push an entry to the GCS stack.
Welcome to the LLVM project!
This repository contains the source code for LLVM, a toolkit for the construction of highly optimized compilers, optimizers, and run-time environments.
The LLVM project has multiple components. The core of the project is itself called “LLVM”. This contains all of the tools, libraries, and header files needed to process intermediate representations and convert them into object files. Tools include an assembler, disassembler, bitcode analyzer, and bitcode optimizer.
C-like languages use the Clang frontend. This component compiles C, C++, Objective-C, and Objective-C++ code into LLVM bitcode -- and from there into object files, using LLVM.
Other components include: the libc++ C++ standard library, the LLD linker, and more.
Consult the Getting Started with LLVM page for information on building and running LLVM.
For information on how to contribute to the LLVM project, please take a look at the Contributing to LLVM guide.
Join the LLVM Discourse forums, Discord chat, LLVM Office Hours or Regular sync-ups.
The LLVM project has adopted a code of conduct for participants to all modes of communication within the project.