[OpenMP][NVPTX] Added forward declaration to pave the way for building deviceRTLs with OpenMP

Once we switch to building deviceRTLs with OpenMP, primitives and CUDA
intrinsics can no longer be used directly because `__device__` is not recognized
by the OpenMP compiler. To avoid pulling in all the CUDA internal headers we have
in `clang`, we forward declare these functions. Eventually they will be
transformed into the right LLVM intrinsics.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D95058

GitOrigin-RevId: 33a5d212c6198af2bd902bb8e4cfd0f0bec0114f
1 file changed