[libc++] Optimize {std,ranges}::distance for segmented iterators (#133612)

This patch enhances the performance of `std::distance` and
`std::ranges::distance` for non-random-access segmented iterators, e.g.,
`std::join_view` iterators. The original implementation operates in
linear time, `O(n)`, where `n` is the total number of elements. The
optimized version reduces this to approximately `O(n / segment_size)` by
leveraging segmented structure, where `segment_size` is the average size
of each segment.

The table below summarizes the peak performance improvements observed
across different segment sizes, with the total element count `n` ranging
up to `1 << 20` (1,048,576 elements), based on benchmark results.

```
----------------------------------------------------------------------------------------
Container/n/segment_size                          std::distance     std::ranges::distance
----------------------------------------------------------------------------------------
join_view(vector<vector<int>>)/1048576/256	         401.6x	              422.9x
join_view(deque<deque<int>>)/1048576/256 	         112.1x	              132.6x
join_view(vector<vector<int>>)/1048576/1024         1669.2x              1559.1x
join_view(deque<deque<int>>)/1048576/1024            487.7x               497.4x
```

## Benchmarks


#### Segment size = 1024

```
-----------------------------------------------------------------------------------------
Benchmark                                                    Before      After    Speedup
-----------------------------------------------------------------------------------------
std::distance(join_view(vector<vector<int>>))/50            38.8 ns     1.01 ns     38.4x
std::distance(join_view(vector<vector<int>>))/1024           660 ns     1.02 ns    647.1x
std::distance(join_view(vector<vector<int>>))/4096          2934 ns     1.98 ns   1481.8x
std::distance(join_view(vector<vector<int>>))/8192          5751 ns     3.92 ns   1466.8x
std::distance(join_view(vector<vector<int>>))/16384        11520 ns     7.06 ns   1631.7x
std::distance(join_view(vector<vector<int>>))/65536        46367 ns     32.2 ns   1440.6x
std::distance(join_view(vector<vector<int>>))/262144      182611 ns      114 ns   1601.9x
std::distance(join_view(vector<vector<int>>))/1048576     737785 ns      442 ns   1669.2x
std::distance(join_view(deque<deque<int>>))/50              53.1 ns     6.13 ns      8.7x
std::distance(join_view(deque<deque<int>>))/1024             854 ns     7.53 ns    113.4x
std::distance(join_view(deque<deque<int>>))/4096            3507 ns     14.7 ns    238.6x
std::distance(join_view(deque<deque<int>>))/8192            7114 ns     17.6 ns    404.2x
std::distance(join_view(deque<deque<int>>))/16384          13997 ns     30.7 ns    455.9x
std::distance(join_view(deque<deque<int>>))/65536          55598 ns      114 ns    487.7x
std::distance(join_view(deque<deque<int>>))/262144        214293 ns      480 ns    446.4x
std::distance(join_view(deque<deque<int>>))/1048576       833000 ns     2183 ns    381.6x
rng::distance(join_view(vector<vector<int>>))/50            39.1 ns     1.10 ns     35.5x
rng::distance(join_view(vector<vector<int>>))/1024           689 ns     1.14 ns    604.4x
rng::distance(join_view(vector<vector<int>>))/4096          2753 ns     2.15 ns   1280.5x
rng::distance(join_view(vector<vector<int>>))/8192          5530 ns     4.61 ns   1199.6x
rng::distance(join_view(vector<vector<int>>))/16384        10968 ns     7.97 ns   1376.2x
rng::distance(join_view(vector<vector<int>>))/65536        46009 ns     35.3 ns   1303.4x
rng::distance(join_view(vector<vector<int>>))/262144      190569 ns      124 ns   1536.9x
rng::distance(join_view(vector<vector<int>>))/1048576     746724 ns      479 ns   1559.1x
rng::distance(join_view(deque<deque<int>>))/50              51.6 ns     6.57 ns      7.9x
rng::distance(join_view(deque<deque<int>>))/1024             826 ns     6.50 ns    127.1x
rng::distance(join_view(deque<deque<int>>))/4096            3323 ns     12.5 ns    265.8x
rng::distance(join_view(deque<deque<int>>))/8192            6619 ns     19.1 ns    346.5x
rng::distance(join_view(deque<deque<int>>))/16384          13495 ns     33.2 ns    406.5x
rng::distance(join_view(deque<deque<int>>))/65536          53668 ns      114 ns    470.8x
rng::distance(join_view(deque<deque<int>>))/262144        236277 ns      475 ns    497.4x
rng::distance(join_view(deque<deque<int>>))/1048576       914177 ns     2157 ns    423.8x
-----------------------------------------------------------------------------------------
```



#### Segment size = 256

```
-----------------------------------------------------------------------------------------
Benchmark                                                    Before      After    Speedup
-----------------------------------------------------------------------------------------
std::distance(join_view(vector<vector<int>>))/50            38.1 ns     1.02 ns     37.4x
std::distance(join_view(vector<vector<int>>))/1024           689 ns     2.06 ns    334.5x
std::distance(join_view(vector<vector<int>>))/4096          2815 ns     7.01 ns    401.6x
std::distance(join_view(vector<vector<int>>))/8192          5507 ns     14.3 ns    385.1x
std::distance(join_view(vector<vector<int>>))/16384        11050 ns     33.7 ns    327.9x
std::distance(join_view(vector<vector<int>>))/65536        44197 ns      118 ns    374.6x
std::distance(join_view(vector<vector<int>>))/262144      175793 ns      449 ns    391.5x
std::distance(join_view(vector<vector<int>>))/1048576     703242 ns     2140 ns    328.7x
std::distance(join_view(deque<deque<int>>))/50              50.2 ns     6.12 ns      8.2x
std::distance(join_view(deque<deque<int>>))/1024             835 ns     11.4 ns     73.2x
std::distance(join_view(deque<deque<int>>))/4096            3353 ns     32.9 ns    101.9x
std::distance(join_view(deque<deque<int>>))/8192            6711 ns     64.2 ns    104.5x
std::distance(join_view(deque<deque<int>>))/16384          13231 ns      118 ns    112.1x
std::distance(join_view(deque<deque<int>>))/65536          53523 ns      556 ns     96.3x
std::distance(join_view(deque<deque<int>>))/262144        219101 ns     2166 ns    101.2x
std::distance(join_view(deque<deque<int>>))/1048576       880277 ns    15852 ns     55.5x
rng::distance(join_view(vector<vector<int>>))/50            37.7 ns     1.13 ns     33.4x
rng::distance(join_view(vector<vector<int>>))/1024           697 ns     2.14 ns    325.7x
rng::distance(join_view(vector<vector<int>>))/4096          2804 ns     7.52 ns    373.0x
rng::distance(join_view(vector<vector<int>>))/8192          5749 ns     15.2 ns    378.2x
rng::distance(join_view(vector<vector<int>>))/16384        11742 ns     34.8 ns    337.4x
rng::distance(join_view(vector<vector<int>>))/65536        47274 ns      116 ns    407.7x
rng::distance(join_view(vector<vector<int>>))/262144      187774 ns      444 ns    422.9x
rng::distance(join_view(vector<vector<int>>))/1048576     749724 ns     2109 ns    355.5x
rng::distance(join_view(deque<deque<int>>))/50              53.0 ns     6.09 ns      8.7x
rng::distance(join_view(deque<deque<int>>))/1024             895 ns     11.0 ns     81.4x
rng::distance(join_view(deque<deque<int>>))/4096            3825 ns     30.6 ns    125.0x
rng::distance(join_view(deque<deque<int>>))/8192            7550 ns     60.5 ns    124.8x
rng::distance(join_view(deque<deque<int>>))/16384          14847 ns      112 ns    132.6x
rng::distance(join_view(deque<deque<int>>))/65536          56888 ns      453 ns    125.6x
rng::distance(join_view(deque<deque<int>>))/262144        231395 ns     2034 ns    113.8x
rng::distance(join_view(deque<deque<int>>))/1048576       933093 ns    15012 ns     62.2x
-----------------------------------------------------------------------------------------
```

Addresses a subtask of #102817.

---------

Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
Co-authored-by: A. Jiang <de34@live.cn>
5 files changed
tree: 9abf813327427b30fcfe1c63e0436ad99f273037
  1. .ci/
  2. .github/
  3. bolt/
  4. clang/
  5. clang-tools-extra/
  6. cmake/
  7. compiler-rt/
  8. cross-project-tests/
  9. flang/
  10. flang-rt/
  11. libc/
  12. libclc/
  13. libcxx/
  14. libcxxabi/
  15. libsycl/
  16. libunwind/
  17. lld/
  18. lldb/
  19. llvm/
  20. llvm-libgcc/
  21. mlir/
  22. offload/
  23. openmp/
  24. orc-rt/
  25. polly/
  26. runtimes/
  27. third-party/
  28. utils/
  29. .clang-format
  30. .clang-format-ignore
  31. .clang-tidy
  32. .git-blame-ignore-revs
  33. .gitattributes
  34. .gitignore
  35. .mailmap
  36. CODE_OF_CONDUCT.md
  37. CONTRIBUTING.md
  38. LICENSE.TXT
  39. pyproject.toml
  40. README.md
  41. SECURITY.md
README.md

The LLVM Compiler Infrastructure

OpenSSF Scorecard OpenSSF Best Practices libc++

Welcome to the LLVM project!

This repository contains the source code for LLVM, a toolkit for the construction of highly optimized compilers, optimizers, and run-time environments.

The LLVM project has multiple components. The core of the project is itself called “LLVM”. This contains all of the tools, libraries, and header files needed to process intermediate representations and convert them into object files. Tools include an assembler, disassembler, bitcode analyzer, and bitcode optimizer.

C-like languages use the Clang frontend. This component compiles C, C++, Objective-C, and Objective-C++ code into LLVM bitcode -- and from there into object files, using LLVM.

Other components include: the libc++ C++ standard library, the LLD linker, and more.

Getting the Source Code and Building LLVM

Consult the Getting Started with LLVM page for information on building and running LLVM.

For information on how to contribute to the LLVM project, please take a look at the Contributing to LLVM guide.

Getting in touch

Join the LLVM Discourse forums, Discord chat, LLVM Office Hours or Regular sync-ups.

The LLVM project has adopted a code of conduct for participants to all modes of communication within the project.