==========
KernelInfo
==========

.. contents::
   :local:

Introduction
============

This LLVM IR pass reports various statistics for code compiled for GPUs. The
goal of these statistics is to help identify bad code patterns and ways to
mitigate them. The pass operates at the LLVM IR level so that it can, in
theory, support any LLVM-based compiler for programming languages supporting
GPUs.

By default, the pass runs at the end of LTO, and options like
``-Rpass=kernel-info`` enable its remarks. Example ``opt`` and ``clang``
command lines appear in the next section.

Remarks include summary statistics (e.g., total size of static allocas) and
individual occurrences (e.g., source location of each alloca). Examples of the
output appear in tests in ``llvm/test/Analysis/KernelInfo``.

Example Command Lines
=====================

To analyze a C program as it appears to an LLVM GPU backend at the end of LTO:

.. code-block:: shell

  $ clang -O2 -g -fopenmp --offload-arch=native test.c -foffload-lto \
      -Rpass=kernel-info
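
The commands in this section assume a ``test.c`` that contains at least one
OpenMP offload region. As a minimal sketch, the following snippet writes such a
file. The program is hypothetical and is not taken from the LLVM test suite.
Whether its small local array survives optimization as a static alloca depends
on the compiler version and the target.

.. code-block:: shell

  # Write a hypothetical test.c with a single OpenMP target region.
  # The file contents are an illustrative assumption, not an LLVM test input.
  printf '%s\n' \
    '#include <stdio.h>' \
    '' \
    'int main(void) {' \
    '  int sum = 0;' \
    '#pragma omp target teams distribute parallel for reduction(+ : sum) map(tofrom : sum)' \
    '  for (int i = 0; i < 1024; ++i) {' \
    '    int scratch[4] = {i, i + 1, i + 2, i + 3}; /* fixed-size local array */' \
    '    sum += scratch[i % 4];' \
    '  }' \
    '  printf("sum = %d\n", sum);' \
    '  return 0;' \
    '}' > test.c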

To analyze specified LLVM IR, perhaps previously generated by something like
``clang -save-temps -g -fopenmp --offload-arch=native test.c``:

.. code-block:: shell

  $ opt -disable-output test-openmp-nvptx64-nvidia-cuda-sm_70.bc \
      -pass-remarks=kernel-info -passes=kernel-info
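
The bitcode file name above depends on the offload target, so a build may
leave a differently named file, or several. As a rough sketch, the hypothetical
helper below (``kernel_info_all`` is not part of LLVM) runs the same ``opt``
command on every file in a directory that matches the ``*-openmp-*.bc`` naming
seen above. That glob pattern is an assumption about what ``-save-temps``
produces and may need adjusting.

.. code-block:: shell

  # Hypothetical helper: run kernel-info on each device bitcode file found in
  # a directory (default: the current one). Assumes the files were left there
  # by an earlier `clang -save-temps ...` run and match *-openmp-*.bc.
  kernel_info_all() {
    dir=${1:-.}
    for bc in "$dir"/*-openmp-*.bc; do
      [ -e "$bc" ] || continue   # the glob matched nothing
      echo "== $bc =="
      opt -disable-output "$bc" \
        -pass-remarks=kernel-info -passes=kernel-info
    done
  }

Calling ``kernel_info_all .`` after the ``clang -save-temps`` command would
then print the remarks for each matching file in turn.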

When specifying an LLVM pass pipeline on the command line, ``kernel-info`` still
runs at the end of LTO by default. ``-no-kernel-info-end-lto`` disables that
behavior so you can position ``kernel-info`` explicitly:

.. code-block:: shell

  $ clang -O2 -g -fopenmp --offload-arch=native test.c -foffload-lto \
      -Rpass=kernel-info \
      -Xoffload-linker --lto-newpm-passes='lto<O2>'

  $ clang -O2 -g -fopenmp --offload-arch=native test.c -foffload-lto \
      -Rpass=kernel-info -mllvm -no-kernel-info-end-lto \
      -Xoffload-linker --lto-newpm-passes='module(kernel-info),lto<O2>'

  $ opt -disable-output test-openmp-nvptx64-nvidia-cuda-sm_70.bc \
      -pass-remarks=kernel-info \
      -passes='lto<O2>'

  $ opt -disable-output test-openmp-nvptx64-nvidia-cuda-sm_70.bc \
      -pass-remarks=kernel-info -no-kernel-info-end-lto \
      -passes='module(kernel-info),lto<O2>'
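
Both ``clang`` and ``opt`` print remarks as diagnostics on standard error, so
the output of any command above can be saved and filtered with ordinary shell
tools. The sketch below assumes that each remark line contains the text
``remark:``. ``save_remarks`` is a hypothetical helper, not an LLVM tool.

.. code-block:: shell

  # Hypothetical helper: run a command, save its stderr to a log file, and
  # print only the lines that look like remarks.
  # Usage: save_remarks <log-file> <command> [args...]
  save_remarks() {
    log=$1
    shift
    "$@" 2> "$log"          # diagnostics go to the log; stdout is untouched
    grep 'remark:' "$log"   # assumption: remark lines contain "remark:"
  }

For example, ``save_remarks kernel-info.log opt -disable-output
test-openmp-nvptx64-nvidia-cuda-sm_70.bc -pass-remarks=kernel-info
-passes=kernel-info`` would keep the full diagnostics in ``kernel-info.log``
and print just the remark lines.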