Warn when unique objects might be duplicated in shared libraries (#125526)

This is attempt 2 to merge this, the first one is #117622. This properly
disables the tests when building for playstation, since the warning is
disabled there.

When a hidden object is built into multiple shared libraries, each
instance of the library will get its own copy. If
the object was supposed to be globally unique (e.g. a global variable or
static data member), this can cause very subtle bugs.

An object might be incorrectly duplicated if it:

Is defined in a header (so it might appear in multiple TUs), and
Has external linkage (otherwise it's supposed to be duplicated), and
Has hidden visibility (or else the dynamic linker will handle it)
The duplication is only a problem semantically if one of the following
is true:

The object is mutable (the copies won't be in sync), or
Its initialization has side effects (it may now run more than once), or
The value of its address is used (different copies have different
addresses).
To detect this, we add a new -Wunique-object-duplication warning. It
warns on cases (1) and (2) above. To be conservative, we only warn in
case (2) if we are certain the initializer has side effects, and we
don't warn on new because the only side effect is some extra memory
usage.

We don't currently warn on case (3) because doing so is prone to false
positives: there are many reasons for taking the address which aren't
inherently problematic (e.g. passing to a function that expects a
pointer). We only run into problems if the code inspects the value of
the address.

The check is currently disabled for windows, which uses its own analogue
of visibility (declimport/declexport). The check is also disabled inside
templates, since it can give false positives if a template is never
instantiated.

Resolving the warning
The warning can be fixed in several ways:

If the object in question doesn't need to be mutable, it should be made
const. Note that the variable must be completely immutable, e.g. we'll
warn on const int* p because the pointer itself is mutable. To silence
the warning, it should instead be const int* const p.
If the object must be mutable, it (or the enclosing function, in the
case of static local variables) should be made visible using
__attribute((visibility("default")))
If the object is supposed to be duplicated, it should be be given
internal linkage.
Testing
I've tested the warning by running it on clang itself, as well as on
chromium. Compiling clang resulted in [10 warnings across 6
files](https://github.com/user-attachments/files/17908069/clang-warnings.txt),
while Chromium resulted in [160 warnings across 85
files](https://github.com/user-attachments/files/17908072/chromium-warnings.txt),
mostly in third-party code. Almost all warnings were due to mutable
variables.

I evaluated the warnings by manual inspection. I believe all the
resulting warnings are true positives, i.e. they represent
potentially-problematic code where duplication might cause a problem.
For the clang warnings, I also validated them by either adding const or
visibility annotations as appropriate.

Limitations
I am aware of four main limitations with the current warning:

We do not warn when the address of a duplicated object is taken, since
doing so is prone to false positives. I'm hopeful that we can create a
refined version in the future, however.
We only warn for side-effectful initialization if we are certain side
effects exist. Warning on potential side effects produced a huge number
of false positives; I don't expect there's much that can be done about
this in modern C++ code bases, since proving a lack of side effects is
difficult.
Windows uses a different system (declexport/import) instead of
visibility. From manual testing, it seems to behave analogously to the
visibility system for the purposes of this warning, but to keep things
simple the warning is disabled on windows for now.
We don't warn on code inside templates. This is unfortuate, since it
masks many real issues, e.g. a templated variable which is implicitly
instantiated the same way in multiple TUs should be globally unique, but
may accidentally be duplicated. Unfortunately, we found some potential
false positives during testing that caused us to disable the warning for
now.
7 files changed
tree: 6d18c7f92b9f567d21b2e52e8d3f160d65bc1259
  1. .ci/
  2. .github/
  3. bolt/
  4. clang/
  5. clang-tools-extra/
  6. cmake/
  7. compiler-rt/
  8. cross-project-tests/
  9. flang/
  10. libc/
  11. libclc/
  12. libcxx/
  13. libcxxabi/
  14. libunwind/
  15. lld/
  16. lldb/
  17. llvm/
  18. llvm-libgcc/
  19. mlir/
  20. offload/
  21. openmp/
  22. polly/
  23. pstl/
  24. runtimes/
  25. third-party/
  26. utils/
  27. .clang-format
  28. .clang-tidy
  29. .git-blame-ignore-revs
  30. .gitattributes
  31. .gitignore
  32. .mailmap
  33. CODE_OF_CONDUCT.md
  34. CONTRIBUTING.md
  35. LICENSE.TXT
  36. pyproject.toml
  37. README.md
  38. SECURITY.md
README.md

The LLVM Compiler Infrastructure

OpenSSF Scorecard OpenSSF Best Practices libc++

Welcome to the LLVM project!

This repository contains the source code for LLVM, a toolkit for the construction of highly optimized compilers, optimizers, and run-time environments.

The LLVM project has multiple components. The core of the project is itself called “LLVM”. This contains all of the tools, libraries, and header files needed to process intermediate representations and convert them into object files. Tools include an assembler, disassembler, bitcode analyzer, and bitcode optimizer.

C-like languages use the Clang frontend. This component compiles C, C++, Objective-C, and Objective-C++ code into LLVM bitcode -- and from there into object files, using LLVM.

Other components include: the libc++ C++ standard library, the LLD linker, and more.

Getting the Source Code and Building LLVM

Consult the Getting Started with LLVM page for information on building and running LLVM.

For information on how to contribute to the LLVM project, please take a look at the Contributing to LLVM guide.

Getting in touch

Join the LLVM Discourse forums, Discord chat, LLVM Office Hours or Regular sync-ups.

The LLVM project has adopted a code of conduct for participants to all modes of communication within the project.