==========
KernelInfo
==========

.. contents::
   :local:

Introduction
============

This LLVM IR pass reports various statistics for codes compiled for GPUs. The
goal of these statistics is to help identify bad code patterns and ways to
mitigate them. The pass operates at the LLVM IR level so that it can, in
theory, support any LLVM-based compiler for programming languages supporting
GPUs.

By default, the pass runs at the end of LTO, and options like
``-Rpass=kernel-info`` enable its remarks. Example ``opt`` and ``clang``
command lines appear in the next section.

Remarks include summary statistics (e.g., total size of static allocas) and
individual occurrences (e.g., source location of each alloca). Examples of the
output appear in tests in `llvm/test/Analysis/KernelInfo`.

Example Command Lines
=====================

To analyze a C program as it appears to an LLVM GPU backend at the end of LTO:

.. code-block:: shell

  $ clang -O2 -g -fopenmp --offload-arch=native test.c -foffload-lto \
      -Rpass=kernel-info

To analyze specified LLVM IR, perhaps previously generated by something like
``clang -save-temps -g -fopenmp --offload-arch=native test.c``:

.. code-block:: shell

  $ opt -disable-output test-openmp-nvptx64-nvidia-cuda-sm_70.bc \
      -pass-remarks=kernel-info -passes=kernel-info

When specifying an LLVM pass pipeline on the command line, ``kernel-info`` still
runs at the end of LTO by default. ``-no-kernel-info-end-lto`` disables that
behavior so you can position ``kernel-info`` explicitly:

.. code-block:: shell

  $ clang -O2 -g -fopenmp --offload-arch=native test.c -foffload-lto \
      -Rpass=kernel-info \
      -Xoffload-linker --lto-newpm-passes='lto<O2>'

  $ clang -O2 -g -fopenmp --offload-arch=native test.c -foffload-lto \
      -Rpass=kernel-info -mllvm -no-kernel-info-end-lto \
      -Xoffload-linker --lto-newpm-passes='module(kernel-info),lto<O2>'

  $ opt -disable-output test-openmp-nvptx64-nvidia-cuda-sm_70.bc \
      -pass-remarks=kernel-info \
      -passes='lto<O2>'

  $ opt -disable-output test-openmp-nvptx64-nvidia-cuda-sm_70.bc \
      -pass-remarks=kernel-info -no-kernel-info-end-lto \
      -passes='module(kernel-info),lto<O2>'