[OpenMP][DeviceRTL] Fix an issue that causes a hang in SU3
The synchronization at the end of the parallel region does not guarantee that
all threads have exited the scope. As a result, the assertions right after it
might fire, and the `state::assumeInitialState(IsSPMD)` in
`__kmpc_target_deinit` may not hold either. We can either add a synchronization
right after the parallel region, or remove the assertions and assumptions. Here
we choose the former, as those assertions and assumptions can help
optimizations.
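
The reasoning above can be illustrated with a minimal host-side sketch. It is
not the DeviceRTL code: `Barrier`, `ScopedValue`, and
`countThreadsSeeingRestoredValue` are hypothetical names, and the sketch
assumes the scope holds an RAII guard through which one thread sets shared
state and restores it on scope exit. A barrier inside the scope does not order
that restore before the other threads' checks; a barrier after the scope does.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Reusable barrier for a fixed number of threads. Hypothetical stand-in for
// synchronize::threadsAligned(); not the DeviceRTL implementation.
class Barrier {
public:
  explicit Barrier(int Count) : Threshold(Count), Waiting(0), Generation(0) {}

  void wait() {
    std::unique_lock<std::mutex> Lock(M);
    unsigned Gen = Generation;
    if (++Waiting == Threshold) {
      // Last arrival releases everyone and starts a new generation.
      Waiting = 0;
      ++Generation;
      CV.notify_all();
    } else {
      CV.wait(Lock, [&] { return Gen != Generation; });
    }
  }

private:
  std::mutex M;
  std::condition_variable CV;
  int Threshold;
  int Waiting;
  unsigned Generation;
};

// Hypothetical RAII guard: the active thread sets a shared value on entry and
// restores the old value when the scope is left.
struct ScopedValue {
  std::atomic<unsigned> &Ref;
  unsigned Old;
  bool Active;

  ScopedValue(std::atomic<unsigned> &R, unsigned New, unsigned OldValue,
              bool IsActive)
      : Ref(R), Old(OldValue), Active(IsActive) {
    if (Active)
      Ref.store(New);
  }
  ~ScopedValue() {
    if (Active)
      Ref.store(Old);
  }
};

// Returns how many threads observe the restored value (1) after the region.
int countThreadsSeeingRestoredValue(int NumThreads) {
  assert(NumThreads > 0);
  std::atomic<unsigned> TeamSize(1u);
  std::atomic<int> Ok(0);
  Barrier B(NumThreads);
  std::vector<std::thread> Threads;

  for (int TId = 0; TId < NumThreads; ++TId) {
    Threads.emplace_back([&, TId] {
      B.wait(); // Everyone arrives before thread 0 changes the state.
      {
        ScopedValue Guard(TeamSize, static_cast<unsigned>(NumThreads), 1u,
                          TId == 0);
        B.wait(); // The new value is visible to all threads.
        // ... body of the parallel region would run here ...
        B.wait(); // End-of-region barrier, still inside the scope.
      }
      // Trailing barrier: thread 0 only arrives after its guard has restored
      // the value, so no thread can run the check below too early.
      B.wait();
      if (TeamSize.load() == 1u)
        ++Ok;
    });
  }
  for (auto &T : Threads)
    T.join();
  return Ok.load();
}
```

In this model, dropping the trailing `B.wait()` would let a thread other than
thread 0 reach the check before the guard's destructor has run, which is the
kind of race the added synchronization is meant to close.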
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D112861
GitOrigin-RevId: 025f5492401489269ab980910f4fda98f5b06bd1
diff --git a/libomptarget/DeviceRTL/src/Parallelism.cpp b/libomptarget/DeviceRTL/src/Parallelism.cpp
index 8dcda21..ae7df3f 100644
--- a/libomptarget/DeviceRTL/src/Parallelism.cpp
+++ b/libomptarget/DeviceRTL/src/Parallelism.cpp
@@ -123,6 +123,11 @@
synchronize::threadsAligned();
}
+ // Synchronize all threads to make sure every thread exits the scope above;
+ // otherwise the following assertions and the assumption in
+ // __kmpc_target_deinit may not hold.
+ synchronize::threadsAligned();
+
ASSERT(state::ParallelTeamSize == 1u);
ASSERT(icv::ActiveLevel == 0u);
ASSERT(icv::Level == 0u);