[asan] Speed up ASan ODR indicator-based checking (#100923) **Summary**: When ASan checks for a potential ODR violation on a global it loops over a linked list of all globals to find those with the matching value of an indicator. With the default setting 'detect_odr_violation=1', ASan doesn't report violations on same-size globals but it still has to traverse the list. For larger binaries with a ton of shared libs and globals (and a non-trivial volume of same-sized duplicates) this gets extremely expensive. This patch adds an indicator indexed (multi-)map of globals to speed up the search. > Note: asan used to use a map to store globals a while ago which was replaced with a list when the codebase [moved off of STL](https://github.com/llvm/llvm-project/commit/e4bada2c946e5399fc37bd67421de01c0047ad38). Internally we see many examples where ODR checking takes *seconds* (even double digits). With this patch it's practically free and `__asan_register_globals` doesn't show up prominently in the perf profile anymore. There are several high-level questions: 1. I understand that the intent is that we hit the slow path rarely, ideally once before the process dies with an error. But in practice we hit the slow path a lot. It feels reasonable to keep the amount of work bounded even in the worst case, even if it requires a bit of extra memory. But if not, it'd be great to learn about the tradeoffs. 2. Poisoning based ODR checking remains on the slow path. Internally we build everything with `-fsanitize-address-use-odr-indicator` so I'm not sure if poisoning-based check would exhibit the same behavior (looking at the code, the shape looks very similar, so it might?). 3. Globals with an ODR indicator of `-1` need to be skipped for the purposes of ODR checking (cf. https://github.com/llvm/llvm-project/commit/a257639a6935a2c63377784c5c9c3b73864a2582). But they are still getting added to the list of globals and hence take up space and slow down the iteration over the list of globals. It would be a good saving if we could avoid adding them to the globals list. 4. Any reason to use a linked list instead of e.g. a vector to store globals? **Test Plan**: * `cmake --build build --target check-asan` looks good * Perf-wise things look good when linking against this version of compiler-rt. --------- Co-authored-by: Vitaly Buka <vitalybuka@google.com>
Welcome to the LLVM project!
This repository contains the source code for LLVM, a toolkit for the construction of highly optimized compilers, optimizers, and run-time environments.
The LLVM project has multiple components. The core of the project is itself called “LLVM”. This contains all of the tools, libraries, and header files needed to process intermediate representations and convert them into object files. Tools include an assembler, disassembler, bitcode analyzer, and bitcode optimizer.
C-like languages use the Clang frontend. This component compiles C, C++, Objective-C, and Objective-C++ code into LLVM bitcode -- and from there into object files, using LLVM.
Other components include: the libc++ C++ standard library, the LLD linker, and more.
Consult the Getting Started with LLVM page for information on building and running LLVM.
For information on how to contribute to the LLVM project, please take a look at the Contributing to LLVM guide.
Join the LLVM Discourse forums, Discord chat, LLVM Office Hours or Regular sync-ups.
The LLVM project has adopted a code of conduct for participants to all modes of communication within the project.