[OpenMP][NVPTX] Rewrite CUDA intrinsics with NVVM intrinsics

This patch prepares for dropping CUDA when compiling `deviceRTLs`.
CUDA intrinsics are replaced by NVVM intrinsics, modeled on the code in
`__clang_cuda_intrinsics.h`. We don't want to include that header directly
because in the near future we're going to switch to OpenMP, at which point the
header can no longer be used.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D95327

GitOrigin-RevId: 27cc4a8138d819f78bc4fc028e39772bbda84dbd
1 file changed