e217da741e023bec7a3d4359c0af1037776bcbd0 - llvm-project/libc

commit: e217da741e023bec7a3d4359c0af1037776bcbd0
author: Joseph Huber <huberjn@outlook.com>
Sun Feb 22 18:23:03 2026 -0600
committer: Copybara-Service <copybara-worker@google.com>
Mon Feb 23 05:05:01 2026 -0800
tree: 023b5ea8cba05f04588f2fa9eabcbd68525e1e2a
parent: 6a1feca7fc997364b99c0d8c312979452579a1b9

[libc] Update the GPU allocator to work under post-Volta ITS

Summary:
There were several gaps that caused the allocator not to work under
NVIDIA's independent thread scheduling (ITS) model. The problems I know
of are fixed in this commit. Generally this required using correct masks,
synchronizing before a few dependent operations, and overhauling the
allocate function to stick with the existing mask instead of re-querying
it.

The general idiom here is that at the start we obtain a single mask and
opportunistically use it. Every use must specifically sync this subset.
In other words, query the mask a single time and never change it.
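
As a rough illustration of the idiom (not the code in allocator.cpp), the
CUDA sketch below queries the lane mask once and passes that same mask to
every warp-level operation. The names `reserve_slot` and `kernel` are
hypothetical and exist only for this example; the libc allocator uses its
own GPU helper functions rather than raw CUDA intrinsics.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: reserve one slot per participating lane from a
// shared counter using a single warp-aggregated atomic.
__device__ unsigned reserve_slot(unsigned *counter) {
  // Query the set of converged lanes exactly once.
  const unsigned mask = __activemask();
  const unsigned lane = threadIdx.x % warpSize;
  const unsigned leader = __ffs(mask) - 1; // lowest lane set in the mask

  unsigned base = 0;
  if (lane == leader)
    base = atomicAdd(counter, static_cast<unsigned>(__popc(mask)));

  // Every collective names the same mask; __activemask() is not called
  // again, because under ITS it could return a different subset here.
  base = __shfl_sync(mask, base, leader);

  // Rank of this lane among the lanes in the mask.
  const unsigned rank = __popc(mask & ((1u << lane) - 1));

  // Explicitly reconverge the same subset before returning.
  __syncwarp(mask);
  return base + rank;
}

__global__ void kernel(unsigned *counter, unsigned *slots) {
  // Diverge on purpose: only odd threads take the allocation path.
  if (threadIdx.x % 2 == 1)
    slots[threadIdx.x] = reserve_slot(counter);
}

int main() {
  unsigned *counter = nullptr, *slots = nullptr;
  cudaMallocManaged(&counter, sizeof(unsigned));
  cudaMallocManaged(&slots, 64 * sizeof(unsigned));
  *counter = 0;

  kernel<<<1, 64>>>(counter, slots);
  cudaDeviceSynchronize();

  printf("slots reserved: %u\n", *counter);
  cudaFree(counter);
  cudaFree(slots);
  return 0;
}
```

The point of the sketch is that the mask is a value captured once and
threaded through each `*_sync` call, so the group of cooperating lanes
stays fixed even if the hardware schedules them independently in between.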

This passes most tests; however, I have encountered two issues:
1. A bug in `nvlink` failing to link symbols called in `free`
2. A deadlock under heavy divergence caused by IPSCCP altering control
   flow.

I will address these later, but for now this makes the *source* correct
so it can be enabled by anyone else if they need it.

GitOrigin-RevId: eac18e783f034bd294a82cd0e69a7abf73583d28

src/__support/GPU/allocator.cpp

1 file changed