[scudo][standalone] Change the release loop for efficiency purposes

Summary:
On 32-b, the release algo loops multiple times over the freelist for a size
class, which lead to a decrease in performance when there were a lot of free
blocks.

This changes the release functions to loop only once over the freelist, at the
cost of using a little bit more memory for the release process: instead of
working on one region at a time, we pass the whole memory area covered by all
the regions for a given size class, and work on sub-areas of `RegionSize` in
this large area. For 64-b, we just have 1 sub-area encompassing the whole
region. Of course, not all the sub-areas within that large memory area will
belong to the class id we are working on, but those will just be left untouched
(which will not add to the RSS during the release process).

Reviewers: pcc, cferris, hctim, eugenis

Subscribers: llvm-commits, #sanitizers

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D83993

GitOrigin-RevId: 998334da2b1536e7c8f11c560770c8d4cfacb354
4 files changed