[lld][macho] Support order cstrings with -order_file (#140307)

Expand `-order_file` to also accept cstrings to order.
The purpose is to order hot cstrings for performance (implemented in
this diff); later on we can also order cold cstrings to reduce
compressed size.

Cstring literals have no symbol names, so they cannot be listed in the
order file the way existing `-order_file` entries list symbols. Instead,
we expect `<hash of cstring literal content>` to identify each cstring.

```
// An order file has one entry per line, in the following format:
//
//   <cpu>:<object file>:[<symbol name> | CStringEntryPrefix <cstring hash>]
//
// <cpu> and <object file> are optional.
// If not specified, then that entry tries to match either,
//
// 1) any symbol of the <symbol name>;
// Parsing this format is not quite straightforward because the symbol name
// itself can contain colons, so when encountering a colon, we consider the
// preceding characters to decide if it can be a valid CPU type or file path.
// If a symbol is matched by multiple entries, then it takes the
// lowest-ordered entry (the one nearest to the front of the list.)
//
// or 2) any cstring literal with the given hash, if the entry has the
// CStringEntryPrefix prefix defined below in the file. <cstring hash> is the
// hash of cstring literal content.
//
// Cstring literals are not symbolized, so we can't identify them by name.
// However, cstrings are deduplicated, hence unique, so we use the hash of
// the content of cstring literals to identify them and assign priority to it.
// We use the same hash as used in StringPiece, i.e. 31 bits:
// xxh3_64bits(string) & 0x7fffffff
//
```
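As a rough illustration of the identification scheme described in the comment above, the sketch below masks a 64-bit hash down to 31 bits and formats an order-file entry from it. Several things here are assumptions, not taken from the commit: the FNV-1a function stands in for `xxh3_64bits` only to keep the example dependency-free (so its values will not match real lld output), the prefix value `CSTR;` is a placeholder for whatever `CStringEntryPrefix` is actually defined as, the decimal formatting of the hash is a guess, and the helper names are hypothetical.

```cpp
#include <cstdint>
#include <string>
#include <string_view>

// Stand-in 64-bit hash (FNV-1a). The commit uses xxh3_64bits; this substitute
// only keeps the sketch self-contained and produces different values.
inline uint64_t standInHash64(std::string_view s) {
  uint64_t h = 0xcbf29ce484222325ULL; // FNV-1a 64-bit offset basis
  for (unsigned char c : s) {
    h ^= c;
    h *= 0x100000001b3ULL; // FNV-1a 64-bit prime
  }
  return h;
}

// Keep only the low 31 bits, mirroring `xxh3_64bits(string) & 0x7fffffff`.
inline uint32_t cstringKey(std::string_view literal) {
  return static_cast<uint32_t>(standInHash64(literal) & 0x7fffffff);
}

// Assumed prefix value; the real CStringEntryPrefix is defined in lld.
inline const std::string kAssumedPrefix = "CSTR;";

// Build one order-file line for a cstring literal (decimal hash is assumed).
inline std::string makeOrderFileEntry(std::string_view literal) {
  return kAssumedPrefix + std::to_string(cstringKey(literal));
}
```

A tool that generates order files from profile data could call something like `makeOrderFileEntry("hot string")` for each hot literal, provided it uses the same hash function as the linker.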

The cstrings must be ordered before, or while, the cstring section's
contents are finalized in the `finalizeContents()` function, which
runs before the writer does.
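To show why the ordering has to come first, here is a minimal sketch, not lld's actual implementation: the offsets of the strings are fixed when the section is laid out, so any reordering must already have been applied by then. The function name `orderThenLayout`, the `PlacedString` struct, the rank map, and the `keyOf` callback are all hypothetical, and alignment padding is ignored.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <limits>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct PlacedString {
  std::string text;
  std::size_t offset; // byte offset of the string within the section
};

// Sort deduplicated literals by order-file rank (lower rank = earlier), then
// assign offsets. Literals without a rank keep their relative order and are
// placed after all ranked ones.
std::vector<PlacedString>
orderThenLayout(std::vector<std::string> literals,
                const std::unordered_map<uint32_t, int> &rankByKey,
                const std::function<uint32_t(const std::string &)> &keyOf) {
  constexpr int kUnranked = std::numeric_limits<int>::max();
  auto rankOf = [&](const std::string &s) {
    auto it = rankByKey.find(keyOf(s));
    return it == rankByKey.end() ? kUnranked : it->second;
  };
  std::stable_sort(literals.begin(), literals.end(),
                   [&](const std::string &a, const std::string &b) {
                     return rankOf(a) < rankOf(b);
                   });

  // Layout step: once offsets are assigned, the order can no longer change.
  std::vector<PlacedString> placed;
  std::size_t offset = 0;
  for (std::string &s : literals) {
    std::size_t size = s.size() + 1; // +1 for the terminating NUL
    placed.push_back({std::move(s), offset});
    offset += size;
  }
  return placed;
}
```

Using a stable sort means strings that the order file does not mention stay in their input order, which keeps the output deterministic for a given input.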

---------

Co-authored-by: Sharon Xu <sharonxu@fb.com>
4 files changed
tree: 177b506cc48cd4a3d35e26f101655f3393ede964
README.md

The LLVM Compiler Infrastructure


Welcome to the LLVM project!

This repository contains the source code for LLVM, a toolkit for the construction of highly optimized compilers, optimizers, and run-time environments.

The LLVM project has multiple components. The core of the project is itself called “LLVM”. This contains all of the tools, libraries, and header files needed to process intermediate representations and convert them into object files. Tools include an assembler, disassembler, bitcode analyzer, and bitcode optimizer.

C-like languages use the Clang frontend. This component compiles C, C++, Objective-C, and Objective-C++ code into LLVM bitcode -- and from there into object files, using LLVM.

Other components include: the libc++ C++ standard library, the LLD linker, and more.

Getting the Source Code and Building LLVM

Consult the Getting Started with LLVM page for information on building and running LLVM.

For information on how to contribute to the LLVM project, please take a look at the Contributing to LLVM guide.

Getting in touch

Join the LLVM Discourse forums, Discord chat, LLVM Office Hours or Regular sync-ups.

The LLVM project has adopted a code of conduct for participants to all modes of communication within the project.