.. _omp131:

Rewriting generic-mode kernel with a customized state machine. [OMP131]
=======================================================================

This optimization remark indicates that a generic-mode kernel on the device was
specialized for the given target region. When offloading in generic mode, a
state machine is required to schedule work among the parallel worker threads.
This optimization specializes the state machine when the kernel contains a known
number of parallel regions. A much simpler state machine can be used if it is
known that there is no nested parallelism and the number of regions to schedule
is fixed at compile time.
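
As a rough illustration of the difference, the following host-side C++ sketch
contrasts a generic worker loop with a customized one. It is a conceptual model
only, not the code the compiler emits or the device runtime's implementation.
The names ``RegionFn``, ``region_a``, ``region_b``, ``generic_worker``, and
``customized_worker`` are hypothetical, a plain loop over a vector stands in for
the barriers at which real worker threads wait for work, and the sketch assumes
the specialization takes the form of a chain of comparisons against the known
regions.

.. code-block:: c++

  #include <cassert>
  #include <vector>

  // Hypothetical stand-ins for the outlined bodies of two parallel regions.
  using RegionFn = void (*)(int &);
  void region_a(int &acc) { acc += 1; }
  void region_b(int &acc) { acc += 10; }

  // Generic worker loop: each iteration models one wake-up of a worker thread.
  // The worker calls whatever function pointer the main thread published; a
  // null pointer signals that the kernel is finished.
  int generic_worker(const std::vector<RegionFn> &published) {
    int acc = 0;
    for (RegionFn fn : published) {
      if (fn == nullptr)
        break;
      fn(acc); // indirect call: the target is not known at this call site
    }
    return acc;
  }

  // Customized worker loop: every parallel region is known in advance, so the
  // indirect call becomes a chain of comparisons followed by direct calls.
  int customized_worker(const std::vector<RegionFn> &published) {
    int acc = 0;
    for (RegionFn fn : published) {
      if (fn == nullptr)
        break;
      if (fn == region_a) {
        region_a(acc); // direct call
      } else {
        assert(fn == region_b && "only two regions exist in this sketch");
        region_b(acc); // direct call
      }
    }
    return acc;
  }

For a schedule of ``region_a``, ``region_b``, ``region_a`` followed by the null
terminator, both functions return 12. The customized loop reaches that result
without any indirect call, so each call target is visible at its call site.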

Examples
--------

This optimization should occur on any generic-mode kernel whose parallel regions
are all visible to the optimizer, that cannot be moved to SPMD mode, and that
has no nested parallelism.

.. code-block:: c++

  #pragma omp declare target
  int TID;
  #pragma omp end declare target

  void foo() {
  #pragma omp target
    {
      TID = omp_get_thread_num();
  #pragma omp parallel
      {
        work();
      }
    }
  }

.. code-block:: console

  $ clang++ -fopenmp -fopenmp-targets=nvptx64 -O2 -Rpass=openmp-opt omp131.cpp
  omp131.cpp:8:1: remark: Rewriting generic-mode kernel with a customized state machine. [OMP131]
  #pragma omp target
  ^

Diagnostic Scope
----------------

OpenMP target offloading optimization remark.