[analyzer] Never create Regions wrapping reference TypedValueRegions (NFCI) (#118096)

Like in the test case:
```c++
struct String {
  String(const String &) {}
};

struct MatchComponent {
  unsigned numbers[2];
  String prerelease;
  MatchComponent(MatchComponent const &) = default;
};

MatchComponent get();
void consume(MatchComponent const &);

MatchComponent parseMatchComponent() {
  MatchComponent component = get();
  component.numbers[0] = 10;
  component.numbers[1] = 20;
  return component; // We should have no stack addr escape warning here.
}

void top() {
  consume(parseMatchComponent());
}
```

When calling `consume(parseMatchComponent())` the
`parseMatchComponent()` would return a copy of a temporary of
`component`. That copy would invoke the
`MatchComponent::MatchComponent(const MatchComponent &)` ctor.

That ctor would have a (reference typed) ParamVarRegion, holding the
location (lvalue) of the object we are about to copy (&component). So
far so good, but just before evaluating the binding operation for
initializing the `numbers` field of the temporary, we evaluate the
ArrayInitLoopExpr representing the by-value elementwise copy of the
array `component.numbers`. This is represented by a LazyCompoundVal,
because we (usually) don't just copy large arrays and bind many
individual direct bindings. Rather, we take a snapshot by using a LCV.

However, notice that the LCV representing this copy would look like
this:
  lazyCompoundVal{ParamVarRegion{"reference param of cctor"}.numbers}

Notice that it refers to the numbers field of a reference. It would be
much better to desugar the reference to the actual object, thus it
should be: `lazyCompoundVal{component.numbers}`

Actually, when binding the result of the ArrayInitLoopExpr to the
`temp_object.numbers` in the compiler-generated member initializer of
the cctor, we should have two individual direct bindings because this is
a "small array":
```
  binding &Element{temp_object.numbers, 0} <- loadFrom(&Element{component.numbers, 0})
  binding &Element{temp_object.numbers, 1} <- loadFrom(&Element{component.numbers, 1})
```
Where `loadFrom(...)` would be:
```
  loadFrom(&Element{component.numbers, 0}): 10 U32b
  loadFrom(&Element{component.numbers, 1}): 20 U32b
```
So the store should look like this, after PostInitializer of
`temp_object.numbers`:
```
  temp_object at offset  0: 10 U32b
  temp_object at offset 32: 20 U32b
```

The lesson is that it's okay to have TypedValueRegions of references as
long as we don't form subregions of those. If we ever want to refer to a
subregion of a "reference" we actually meant to "desugar" the reference
and slice a subregion of the pointee of the reference instead.

Once this canonicalization is in place, we can also drop the special
handling of references in `ProcessInitializer`, because now reference
TypedValueRegions are eagerly desugared into their referee region when
forming a subregion of it.

There should be no practical differences, but there are of course bugs
that this patch may surface.
5 files changed
tree: 153378feb381113a853fe92e96259bff5c55dc3c
  1. .ci/
  2. .github/
  3. bolt/
  4. clang/
  5. clang-tools-extra/
  6. cmake/
  7. compiler-rt/
  8. cross-project-tests/
  9. flang/
  10. libc/
  11. libclc/
  12. libcxx/
  13. libcxxabi/
  14. libunwind/
  15. lld/
  16. lldb/
  17. llvm/
  18. llvm-libgcc/
  19. mlir/
  20. offload/
  21. openmp/
  22. polly/
  23. pstl/
  24. runtimes/
  25. third-party/
  26. utils/
  27. .clang-format
  28. .clang-tidy
  29. .git-blame-ignore-revs
  30. .gitattributes
  31. .gitignore
  32. .mailmap
  33. CODE_OF_CONDUCT.md
  34. CONTRIBUTING.md
  35. LICENSE.TXT
  36. pyproject.toml
  37. README.md
  38. SECURITY.md
README.md

The LLVM Compiler Infrastructure

OpenSSF Scorecard OpenSSF Best Practices libc++

Welcome to the LLVM project!

This repository contains the source code for LLVM, a toolkit for the construction of highly optimized compilers, optimizers, and run-time environments.

The LLVM project has multiple components. The core of the project is itself called “LLVM”. This contains all of the tools, libraries, and header files needed to process intermediate representations and convert them into object files. Tools include an assembler, disassembler, bitcode analyzer, and bitcode optimizer.

C-like languages use the Clang frontend. This component compiles C, C++, Objective-C, and Objective-C++ code into LLVM bitcode -- and from there into object files, using LLVM.

Other components include: the libc++ C++ standard library, the LLD linker, and more.

Getting the Source Code and Building LLVM

Consult the Getting Started with LLVM page for information on building and running LLVM.

For information on how to contribute to the LLVM project, please take a look at the Contributing to LLVM guide.

Getting in touch

Join the LLVM Discourse forums, Discord chat, LLVM Office Hours or Regular sync-ups.

The LLVM project has adopted a code of conduct for participants to all modes of communication within the project.