[Hashing] Replace CityHash mixers with xxh3 (#194567)

Replace the CityHash-style mixer in hash_combine and (transitively) hash_value(std::basic_string), hash_value(StringRef), and therefore DenseMap<StringRef, X> lookups, with a flatten-and-call into xxh3_64bits, a modern hash superior to CityHash.

hash_value(int) / hash_value(ptr) keep the existing Murmur-style hash_16_bytes mixer; those are the dominant DenseMap key paths and a fully-inline 16-byte mix beats inlining xxh3's larger 0..16-byte short path.

To break dependency cycle: xxHash64, xxh3_64bits, and xxh3_128bits ArrayRef/StringRef overloads move from llvm/Support/xxhash.h to inline overloads in llvm/ADT/ArrayRef.h and llvm/ADT/StringRef.h, so xxhash.h has no ADT dependencies.

A variant that inlined xxh3's 0..16-byte fast path at every combine_bytes call site (vs. always calling out-of-line xxh3_64bits) showed no measurable compile-time improvement on the tracker, so combine_bytes is a one-liner over the out-of-line entry point.

llvm-compile-time-tracker.com (CTMark, instructions:u)

```
stage1-O0-g            -1.76%  (sqlite3 -3.78%)
stage1-aarch64-O0-g    -1.40%  (sqlite3 -2.86%)
stage1-ReleaseLTO-g    -1.13%
stage1-ReleaseThinLTO  -0.45%
stage1-O3              -0.43%
stage1-aarch64-O3      -0.42%
stage2-O0-g            -0.42%
stage2-O3              -0.15%
clang build            -0.71%  (wall -0.42%)
```

DenseMap-of-pointer paths (dominant at -O3) are untouched, so higher-optimization configs see smaller wins as expected. opt's .text shrinks ~92 KB.

Subsumes the StringRef-only carve-out proposed in #191115.

Notes on properties not introduced by this patch:

- Endianness: hash_combine over native integers was already not cross-host stable. memcpy of a native integer into the buffer is host-encoded; fetch32 normalized the read but not the underlying bytes, so on LE vs BE the value fed to the mixer already differed. xxh3 inherits the same property: same byte stream, different mixer.
- Process seed: combine_bytes XORs get_execution_seed into the result, which cancels under hash_combine(x) ^ hash_combine(y). The pre-patch short/state paths fed the seed through hash_16_bytes / shift_mix non-linearly, so this is a regression in seed effectiveness under that pattern. Default seed is constant, so this only matters under LLVM_ENABLE_ABI_BREAKING_CHECKS. Follow-up: add a seeded xxh3 entry point in libSupport.

Aided by Claude opus 4.7
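The seed-cancellation note above can be illustrated with a small, self-contained sketch. It is not LLVM's code: `toy_mix` is a stand-in mixer (the splitmix64 finalizer, not xxh3), and `hash_seed_xored`, `hash_seed_mixed`, and the `xor_pair_*` helpers are hypothetical names used only to model the two ways a seed can enter a hash. The point is that a seed XORed into the final result drops out of `h(x) ^ h(y)`, whereas a seed that goes through the mixer generally does not.

```cpp
#include <cstdint>

// Toy unseeded 64-bit mixer (the splitmix64 finalizer). It only stands in
// for an unseeded hash such as xxh3_64bits; it is NOT the xxh3 algorithm.
constexpr std::uint64_t toy_mix(std::uint64_t x) {
  x ^= x >> 30;
  x *= 0xbf58476d1ce4e5b9ULL;
  x ^= x >> 27;
  x *= 0x94d049bb133111ebULL;
  x ^= x >> 31;
  return x;
}

// Hypothetical model of the post-patch shape: hash first, then XOR the
// seed into the result.
constexpr std::uint64_t hash_seed_xored(std::uint64_t x, std::uint64_t seed) {
  return toy_mix(x) ^ seed;
}

// Hypothetical model of a non-linear use of the seed: the seed is combined
// with the input before mixing, so it passes through the mixer.
constexpr std::uint64_t hash_seed_mixed(std::uint64_t x, std::uint64_t seed) {
  return toy_mix(x ^ seed);
}

// XOR of two hashes that share a seed, for each model.
constexpr std::uint64_t xor_pair_seed_xored(std::uint64_t x, std::uint64_t y,
                                            std::uint64_t seed) {
  return hash_seed_xored(x, seed) ^ hash_seed_xored(y, seed);
}
constexpr std::uint64_t xor_pair_seed_mixed(std::uint64_t x, std::uint64_t y,
                                            std::uint64_t seed) {
  return hash_seed_mixed(x, seed) ^ hash_seed_mixed(y, seed);
}

// (m(x) ^ s) ^ (m(y) ^ s) == m(x) ^ m(y): the seed cancels for every s.
static_assert(xor_pair_seed_xored(1, 2, 0) ==
                  xor_pair_seed_xored(1, 2, 0x9e3779b97f4a7c15ULL),
              "XORed-in seed cancels under h(x) ^ h(y)");
static_assert(xor_pair_seed_xored(1, 2, 42) == (toy_mix(1) ^ toy_mix(2)),
              "result equals the unseeded XOR");
```

Calling `xor_pair_seed_mixed(1, 2, s)` with different values of `s` is a quick way to see that the second model's output generally depends on the seed, which is the property a seeded xxh3 entry point would restore.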
Welcome to the LLVM project!
This repository contains the source code for LLVM, a toolkit for the construction of highly optimized compilers, optimizers, and run-time environments.
The LLVM project has multiple components. The core of the project is itself called “LLVM”. This contains all of the tools, libraries, and header files needed to process intermediate representations and convert them into object files. Tools include an assembler, disassembler, bitcode analyzer, and bitcode optimizer.
C-like languages use the Clang frontend. This component compiles C, C++, Objective-C, and Objective-C++ code into LLVM bitcode -- and from there into object files, using LLVM.
Other components include: the libc++ C++ standard library, the LLD linker, and more.
Consult the Getting Started with LLVM page for information on building and running LLVM.
For information on how to contribute to the LLVM project, please take a look at the Contributing to LLVM guide.
Join the LLVM Discourse forums, Discord chat, LLVM Office Hours or Regular sync-ups.
The LLVM project has adopted a code of conduct for participants to all modes of communication within the project.