[libc] Handle differing wavefront sizes correctly in the AMDHSA loader (#117788)

Summary:
The AMDGPU backend can handle wavefront sizes of 32 and 64, with the
native hardware preferring one or the other. The user can override the
hardware with `-mwavefrontsize64` or `-mwavefrontsize32` which
previously wasn't handled. We need to know the wavefront size to know
how much memory to allocate and how to index the RPC buffer. There isn't
a good way to do this with ROCm so we just use the LLVM support for
offloading to check this from the image.
GitOrigin-RevId: 38049dc8eef0dca7e82c25e4012228a9a135e255
5 files changed