==========
KernelInfo
==========

.. contents::
   :local:

Introduction
============

This LLVM IR pass reports various statistics for code compiled for GPUs.  The
goal of these statistics is to help identify bad code patterns and ways to
mitigate them.  The pass operates at the LLVM IR level so that it can, in
theory, support any LLVM-based compiler for programming languages supporting
GPUs.

By default, the pass runs at the end of LTO, and options like
``-Rpass=kernel-info`` enable its remarks.  Example ``opt`` and ``clang``
command lines appear in the next section.

Remarks include summary statistics (e.g., total size of static allocas) and
individual occurrences (e.g., source location of each alloca).  Examples of the
output appear in the tests in ``llvm/test/Analysis/KernelInfo``.

Example Command Lines
=====================

To analyze a C program as it appears to an LLVM GPU backend at the end of LTO:

.. code-block:: shell

  $ clang -O2 -g -fopenmp --offload-arch=native test.c -foffload-lto \
      -Rpass=kernel-info

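The command lines in this section assume an input file named ``test.c``.  As a
minimal sketch, such a file might contain a single OpenMP target region.  The
function name ``scale_and_sum``, the array size, and the computation itself are
hypothetical and chosen only for illustration; they are not taken from the LLVM
test suite:

.. code-block:: c

  #include <stdio.h>

  #define N 1024

  /* Hypothetical example: doubles the first n elements (n <= N) of a local
     array inside an offloaded region and returns their sum. */
  static double scale_and_sum(int n) {
    double a[N];
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
      a[i] = (double)i;

    /* With -fopenmp and --offload-arch, this region is compiled as a GPU
       kernel.  Without OpenMP enabled, the pragma is ignored and the loop
       simply runs on the host. */
    #pragma omp target teams distribute parallel for \
        map(tofrom: a[0:n]) map(tofrom: sum) reduction(+: sum)
    for (int i = 0; i < n; ++i) {
      a[i] *= 2.0;
      sum += a[i];
    }
    return sum;
  }

  int main(void) {
    printf("sum = %.1f\n", scale_and_sum(N));
    return 0;
  }

Any source file with GPU offload code works equally well as input; this one
only keeps the example small enough to read the resulting remarks by hand.
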
To analyze specified LLVM IR, perhaps previously generated by something like
``clang -save-temps -g -fopenmp --offload-arch=native test.c``:

.. code-block:: shell

  $ opt -disable-output test-openmp-nvptx64-nvidia-cuda-sm_70.bc \
      -pass-remarks=kernel-info -passes=kernel-info

When specifying an LLVM pass pipeline on the command line, ``kernel-info`` still
runs at the end of LTO by default.  ``-no-kernel-info-end-lto`` disables that
behavior so you can position ``kernel-info`` explicitly:

.. code-block:: shell

  $ clang -O2 -g -fopenmp --offload-arch=native test.c -foffload-lto \
      -Rpass=kernel-info \
      -Xoffload-linker --lto-newpm-passes='lto<O2>'

  $ clang -O2 -g -fopenmp --offload-arch=native test.c -foffload-lto \
      -Rpass=kernel-info -mllvm -no-kernel-info-end-lto \
      -Xoffload-linker --lto-newpm-passes='module(kernel-info),lto<O2>'

  $ opt -disable-output test-openmp-nvptx64-nvidia-cuda-sm_70.bc \
      -pass-remarks=kernel-info \
      -passes='lto<O2>'

  $ opt -disable-output test-openmp-nvptx64-nvidia-cuda-sm_70.bc \
      -pass-remarks=kernel-info -no-kernel-info-end-lto \
      -passes='module(kernel-info),lto<O2>'