[OpenMP][libomptarget] Enable usage of shared memory slots
Summary:
Allow the runtime to use the existing, statically allocated shared memory slots.
When a variable is globalized, the underlying memory can be either shared or global memory (both have block-wide visibility). In this case, we allow the storage to use a limited amount of shared memory that has already been statically allocated. Only if shared memory proves insufficient do we invoke malloc() to create a new global memory slot.
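The shared-first, malloc-fallback policy described above can be sketched as a host-side C++ model. This is only an illustration, not the runtime's code: `Slot`, `Capacity`, `pushStack`, and the `FromSlot` flag are hypothetical names, the 256-byte capacity is an assumed value, and a plain array stands in for the statically allocated shared memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for one statically allocated shared memory slot.
struct Slot {
  static constexpr std::size_t Capacity = 256; // assumed size, for illustration
  alignas(16) char Data[Capacity];
  char *StackPtr = Data; // starts at the beginning of Data, so the slot is used first
  char *DataEnd() { return Data + Capacity; }
};

// Returns storage for DataSize bytes. FromSlot reports whether the storage
// came from the pre-allocated slot (true) or from malloc (false).
void *pushStack(Slot &S, std::size_t DataSize, bool &FromSlot) {
  // Invariant: the stack pointer always stays inside the slot's data region.
  assert(S.StackPtr >= S.Data && S.StackPtr <= S.DataEnd());

  std::size_t Remaining = static_cast<std::size_t>(S.DataEnd() - S.StackPtr);
  if (DataSize <= Remaining) {
    // Enough room left: carve the frame out of the pre-allocated slot.
    void *Frame = S.StackPtr;
    S.StackPtr += DataSize;
    FromSlot = true;
    return Frame;
  }

  // The slot is exhausted: fall back to a fresh heap allocation, which
  // models creating a new global memory slot.
  FromSlot = false;
  return std::malloc(DataSize);
}
```

The sketch ignores alignment padding, popping, and the linking of slots into a list. It also shows why the initial stack pointer matters: if it started at the end of the data region, `Remaining` would be zero and every request would take the malloc path.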
Reviewers: ABataev, carlo.bertolli, grokos, caomhin
Reviewed By: grokos
Subscribers: guansong, openmp-commits
Differential Revision: https://reviews.llvm.org/D44486
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@327639 91177308-0d34-0410-b5e6-96231b3b80d8
diff --git a/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu b/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu
index 41976f6..e739ca9 100644
--- a/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu
+++ b/libomptarget/deviceRTLs/nvptx/src/data_sharing.cu
@@ -342,16 +342,7 @@
DataSharingState.SlotPtr[WID] = RootS;
DataSharingState.TailPtr[WID] = RootS;
-
- // Initialize the stack pointer to be equal to the end of
- // the shared memory slot. This way we ensure that the global
- // version of the stack will be used.
- // TODO: remove this:
- DataSharingState.StackPtr[WID] = RootS->DataEnd;
-
- // TODO: When the use of shared memory is enabled we will have to
- // initialize this with the start of the Data region like so:
- // DataSharingState.StackPtr[WID] = (void *)&RootS->Data[0];
+ DataSharingState.StackPtr[WID] = (void *)&RootS->Data[0];
// We initialize the list of references to arguments here.
omptarget_nvptx_globalArgs.Init();
@@ -368,11 +359,6 @@
// Called by: master, TODO: call by workers
EXTERN void* __kmpc_data_sharing_push_stack(size_t DataSize,
int16_t UseSharedMemory) {
- // TODO: Add shared memory support. For now, use global memory only for
- // storing the data sharing slots so ignore the pre-allocated
- // shared memory slot.
-
- // Use global memory for storing the stack.
if (IsMasterThread()) {
unsigned WID = getWarpId();