[Hashing] Replace CityHash mixers with xxh3 (#194567)

Replace the CityHash-style mixer in hash_combine with a flatten-and-call
into xxh3_64bits, a modern hash superior to CityHash. The change applies
transitively to hash_value(std::basic_string) and hash_value(StringRef),
and therefore to DenseMap<StringRef, X> lookups.

hash_value(int) / hash_value(ptr) keep the existing Murmur-style
hash_16_bytes mixer; those are the dominant DenseMap key paths and a
fully-inline 16-byte mix beats inlining xxh3's larger 0..16-byte short
path.
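
As a rough illustration of the flatten-and-call shape (a sketch, not the
patch's code): the arguments' object representations are copied into one
contiguous buffer, which is then hashed with a single call. The names
sketch_hash_combine, append_bytes, and placeholder_hash_bytes are
hypothetical, the fixed 64-byte buffer is an assumption, and the
placeholder is FNV-1a rather than xxh3 so the example builds without LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Placeholder for the out-of-line byte hash. This is FNV-1a, NOT xxh3; it
// only stands in so the sketch compiles without LLVM.
inline uint64_t placeholder_hash_bytes(const unsigned char *data, size_t len) {
  uint64_t h = 0xcbf29ce484222325ULL;
  for (size_t i = 0; i < len; ++i) {
    h ^= data[i];
    h *= 0x100000001b3ULL;
  }
  return h;
}

// Hypothetical helper: append the object representation of one value.
template <typename T>
size_t append_bytes(unsigned char *buf, size_t pos, size_t cap, const T &v) {
  static_assert(std::is_trivially_copyable_v<T>);
  assert(pos + sizeof(T) <= cap && "sketch uses a fixed-size buffer");
  std::memcpy(buf + pos, &v, sizeof(T));
  return pos + sizeof(T);
}

// Flatten all arguments into one buffer, then hash it in a single call.
template <typename... Ts>
uint64_t sketch_hash_combine(const Ts &...vs) {
  unsigned char buf[64];
  size_t pos = 0;
  ((pos = append_bytes(buf, pos, sizeof(buf), vs)), ...);
  return placeholder_hash_bytes(buf, pos);
}
```

Because the bytes are copied with memcpy, the buffer holds host-encoded
integers, so a sketch like this is order-sensitive but not stable across
hosts of different endianness.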

To break the dependency cycle, the xxHash64, xxh3_64bits, and
xxh3_128bits ArrayRef/StringRef overloads move from
llvm/Support/xxhash.h to inline overloads in llvm/ADT/ArrayRef.h and
llvm/ADT/StringRef.h, so xxhash.h has no ADT dependencies.
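
The layering can be pictured with a single-file sketch, under stated
assumptions: the low-level hash entry point takes only a raw pointer and
a length, and the container-side header adds a thin inline overload on
top of it. The sketch namespace, the hash64 name and signature, and the
use of std::string_view in place of StringRef are all hypothetical; the
body is FNV-1a, not xxh3.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// ---- Stand-in for the low-level hash header: raw pointer + length only. ----
namespace sketch {
// Placeholder body (FNV-1a, NOT xxh3) so the example builds without LLVM.
inline uint64_t hash64(const unsigned char *data, size_t len) {
  assert((data != nullptr || len == 0) && "null data requires len == 0");
  uint64_t h = 0xcbf29ce484222325ULL;
  for (size_t i = 0; i < len; ++i) {
    h ^= data[i];
    h *= 0x100000001b3ULL;
  }
  return h;
}
} // namespace sketch

// ---- Stand-in for the container header: it includes the hash header and
// ---- adds a thin inline overload, so the dependency points one way only.
namespace sketch {
inline uint64_t hash64(std::string_view s) {
  return hash64(reinterpret_cast<const unsigned char *>(s.data()), s.size());
}
} // namespace sketch
```

With this arrangement the hash header needs no knowledge of the
container types, while callers that already include the container
header keep the convenient overload.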

A variant that inlined xxh3's 0..16-byte fast path at every
combine_bytes call site (vs. always calling out-of-line xxh3_64bits)
showed no measurable compile-time improvement on the tracker, so
combine_bytes is a one-liner over the out-of-line entry point.

llvm-compile-time-tracker.com (CTMark, instructions:u)
```
  stage1-O0-g           -1.76%   (sqlite3 -3.78%)
  stage1-aarch64-O0-g   -1.40%   (sqlite3 -2.86%)
  stage1-ReleaseLTO-g   -1.13%
  stage1-ReleaseThinLTO -0.45%
  stage1-O3             -0.43%
  stage1-aarch64-O3     -0.42%
  stage2-O0-g           -0.42%
  stage2-O3             -0.15%
  clang build           -0.71%   (wall -0.42%)
```

DenseMap-of-pointer paths (dominant at -O3) are untouched, so higher-
optimization configs see smaller wins as expected. opt's .text shrinks
~92 KB. Subsumes the StringRef-only carve-out proposed in #191115.

Notes on properties not introduced by this patch:

- Endianness: hash_combine over native integers was already not
  cross-host stable. memcpy of a native integer into the buffer is
  host-encoded; fetch32 normalized the read but not the underlying
  bytes, so on LE vs BE the value fed to the mixer already differed.
  xxh3 inherits the same property: same byte stream, different mixer.

- Process seed: combine_bytes XORs get_execution_seed into the result,
  which cancels under hash_combine(x) ^ hash_combine(y). The pre-patch
  short/state paths fed the seed through hash_16_bytes / shift_mix
  non-linearly, so this is a regression in seed effectiveness under that
  pattern. Default seed is constant, so this only matters under
  LLVM_ENABLE_ABI_BREAKING_CHECKS. Follow-up: add a seeded xxh3 entry
  point in libSupport.
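
The seed cancellation described in the second note can be reproduced
with toy functions. This is only a sketch: toy_mix is a splitmix64-style
finalizer standing in for the real mixers, and hash_seed_xored,
hash_seed_mixed, and check_seed_cancellation are hypothetical names,
not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// splitmix64-style finalizer used as a toy bijective mixer (not an LLVM
// function).
inline uint64_t toy_mix(uint64_t x) {
  x ^= x >> 30;
  x *= 0xbf58476d1ce4e5b9ULL;
  x ^= x >> 27;
  x *= 0x94d049bb133111ebULL;
  x ^= x >> 31;
  return x;
}

// Linear seeding: the seed is XORed into the finished hash.
inline uint64_t hash_seed_xored(uint64_t v, uint64_t seed) {
  return toy_mix(v) ^ seed;
}

// Non-linear seeding: the seed perturbs the input before mixing.
inline uint64_t hash_seed_mixed(uint64_t v, uint64_t seed) {
  return toy_mix(v ^ toy_mix(seed));
}

// XOR-combining two linearly seeded hashes erases the seed entirely:
// (m(x) ^ s) ^ (m(y) ^ s) == m(x) ^ m(y) for every seed s.
inline void check_seed_cancellation(uint64_t x, uint64_t y, uint64_t s1,
                                    uint64_t s2) {
  uint64_t a = hash_seed_xored(x, s1) ^ hash_seed_xored(y, s1);
  uint64_t b = hash_seed_xored(x, s2) ^ hash_seed_xored(y, s2);
  assert(a == b && a == (toy_mix(x) ^ toy_mix(y)));
}
```

Calling check_seed_cancellation(10, 20, 1, 2) passes for any choice of
seeds, whereas the same XOR built from hash_seed_mixed still depends on
the seed, which is the property a seeded xxh3 entry point would restore.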

Aided by Claude opus 4.7
2 files changed
tree: 0599dd58f5fa64365c8baedd56d853619f46a2fe