=====================================
Cross Translation Unit (CTU) Analysis
=====================================

Normally, static analysis works within the boundary of one translation unit (TU).
However, with additional steps and configuration we can enable the analysis to inline the definition of a function from
another TU.

.. contents::
   :local:

Overview
________
CTU analysis can be used in a variety of ways. External TU definitions can be imported either from pre-dumped PCH
files or by generating the necessary AST structure on demand, during the analysis of the main TU. Driving the static
analysis can also be implemented in multiple ways. The most direct way is to specify the necessary command-line options
of the Clang frontend manually (and generate the prerequisite dependencies of the specific import method by hand). This
process can be automated by other tools, like `CodeChecker <https://github.com/Ericsson/codechecker>`_ and scan-build-py
(the former is preferred).

PCH-based analysis
__________________
The analysis needs the PCH dumps of all the translation units used in the project.
These can be generated by the Clang Frontend itself, and must be arranged in a specific way in the filesystem.
The index, which maps symbols' USR names to the PCH dumps containing them, must also be generated, using the
`clang-extdef-mapping` tool. Entries in the index *must* have an `.ast` suffix if the goal
is to use PCH-based analysis, as the lack of that extension signals that the entry is to be used as a source file and parsed on demand.
The `clang-extdef-mapping` tool uses a :doc:`compilation database <../../JSONCompilationDatabase>` to
determine the compilation flags used.
The analysis invocation must be provided with the directory which contains the dumps and the mapping files.

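The suffix rule above can be sketched in a few lines of Python. `loading_mode` is a hypothetical helper, not part of
Clang or CodeChecker; it only mirrors the rule as stated in this document and is not the analyzer's implementation.

```python
def loading_mode(index_path):
    """Classify one index entry by its file suffix (illustrative only)."""
    # Per the rule above: ".ast" marks a pre-dumped PCH file; any other
    # entry is treated as a source file that is parsed on demand.
    return "pch" if index_path.endswith(".ast") else "on-demand"


print(loading_mode("foo.cpp.ast"))
print(loading_mode("/path/to/your/project/foo.cpp"))
```

A check like this can help catch an index that was only partially rewritten before the analysis is started.
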

Manual CTU Analysis
###################
Let's consider these source files in our minimal example:

.. code-block:: cpp

  // main.cpp
  int foo();

  int main() {
    return 3 / foo();
  }

.. code-block:: cpp

  // foo.cpp
  int foo() {
    return 0;
  }

And a compilation database:

.. code-block:: json

  [
    {
      "directory": "/path/to/your/project",
      "command": "clang++ -c foo.cpp -o foo.o",
      "file": "foo.cpp"
    },
    {
      "directory": "/path/to/your/project",
      "command": "clang++ -c main.cpp -o main.o",
      "file": "main.cpp"
    }
  ]

We'd like to analyze `main.cpp` and discover the division by zero bug.
To inline the definition of `foo` from `foo.cpp`, we first have to generate the `AST` (or `PCH`) file
of `foo.cpp`:

.. code-block:: bash

  $ pwd
  /path/to/your/project
  $ clang++ -emit-ast -o foo.cpp.ast foo.cpp
  $ # Check that the .ast file is generated:
  $ ls
  compile_commands.json  foo.cpp.ast  foo.cpp  main.cpp
  $

The next step is to create a CTU index file which holds the `USR` name and location of external definitions in the
source files, in the format `<USR-Length>:<USR> <File-Path>`:

.. code-block:: bash

  $ clang-extdef-mapping -p . foo.cpp
  9:c:@F@foo# /path/to/your/project/foo.cpp
  $ clang-extdef-mapping -p . foo.cpp > externalDefMap.txt

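To make the `<USR-Length>:<USR> <File-Path>` format concrete, here is a minimal Python sketch that splits one index
line into its parts. `parse_index_line` is a hypothetical name used only for illustration; the parsing logic is
inferred from the format string above, not taken from Clang's sources.

```python
def parse_index_line(line):
    """Split '<USR-Length>:<USR> <File-Path>' into (usr, path)."""
    length_text, _, rest = line.rstrip("\n").partition(":")
    usr_length = int(length_text)
    # The length prefix says how many characters belong to the USR, so the
    # split stays unambiguous even if the USR itself contained a space.
    usr = rest[:usr_length]
    if rest[usr_length:usr_length + 1] != " ":
        raise ValueError("expected a space after the USR: %r" % line)
    return usr, rest[usr_length + 1:]


usr, path = parse_index_line("9:c:@F@foo# /path/to/your/project/foo.cpp")
print(usr)   # c:@F@foo#
print(path)  # /path/to/your/project/foo.cpp
```

In the sample line, the USR `c:@F@foo#` is exactly 9 characters long, which matches the leading `9`.
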
We have to modify `externalDefMap.txt` to contain the name of the `.ast` files instead of the source files:

.. code-block:: bash

  $ sed -i -e 's/\.cpp$/.cpp.ast/' externalDefMap.txt

We also have to modify the `externalDefMap.txt` file so that it contains relative paths:

.. code-block:: bash

  $ sed -i -e "s|$(pwd)/||g" externalDefMap.txt

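The two `sed` steps above can also be expressed as a single transformation. The following Python sketch is an
illustration under stated assumptions: `to_pch_entry` is a hypothetical helper, paths are POSIX-style, and every
`.ast` file sits next to its source file, as in this example.

```python
import posixpath


def to_pch_entry(line, project_dir):
    """Rewrite one index line to point at a relative '.ast' file."""
    length_text, _, rest = line.rstrip("\n").partition(":")
    usr_length = int(length_text)
    usr, path = rest[:usr_length], rest[usr_length + 1:]
    # Make the path relative to the project directory and append ".ast",
    # mirroring the combined effect of the two sed commands.
    relative = posixpath.relpath(path, project_dir)
    return "%d:%s %s.ast" % (usr_length, usr, relative)


line = "9:c:@F@foo# /path/to/your/project/foo.cpp"
print(to_pch_entry(line, "/path/to/your/project"))
# 9:c:@F@foo# foo.cpp.ast
```

Working on the parsed path rather than on the whole line avoids accidentally rewriting text inside the USR.
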
Now everything is available for the CTU analysis.
We have to pass CTU-specific extra arguments to Clang:

.. code-block:: bash

  $ pwd
  /path/to/your/project
  $ clang++ --analyze \
      -Xclang -analyzer-config -Xclang experimental-enable-naive-ctu-analysis=true \
      -Xclang -analyzer-config -Xclang ctu-dir=. \
      -Xclang -analyzer-output=plist-multi-file \
      main.cpp
  main.cpp:5:12: warning: Division by zero
    return 3 / foo();
           ~~^~~~~~~
  1 warning generated.
  $ # The plist file with the result is generated.
  $ ls -F
  compile_commands.json  externalDefMap.txt  foo.cpp  foo.cpp.ast  main.cpp  main.plist
  $

This manual procedure is error-prone and does not scale; to analyze real projects, it is recommended to use
`CodeChecker` or `scan-build-py`.

Automated CTU Analysis with CodeChecker
#######################################
The `CodeChecker <https://github.com/Ericsson/codechecker>`_ project fully supports automated CTU analysis with Clang.
Once the `PATH` environment variable is set up and the Python `venv` is activated, a single command is all it takes:

.. code-block:: bash

  $ CodeChecker analyze --ctu compile_commands.json -o reports
  $ ls -F
  compile_commands.json  foo.cpp  foo.cpp.ast  main.cpp  reports/
  $ tree reports
  reports
  ├── compile_cmd.json
  ├── compiler_info.json
  ├── foo.cpp_53f6fbf7ab7ec9931301524b551959e2.plist
  ├── main.cpp_23db3d8df52ff0812e6e5a03071c8337.plist
  ├── metadata.json
  └── unique_compile_commands.json

  0 directories, 6 files
  $

The `plist` files contain the results of the analysis, which may be viewed with the regular analysis tools.
For example, one may use `CodeChecker parse` to view the results on the command line:

.. code-block:: bash

  $ CodeChecker parse reports
  [HIGH] /home/egbomrt/ctu_mini_raw_project/main.cpp:5:12: Division by zero [core.DivideZero]
    return 3 / foo();
             ^

  Found 1 defect(s) in main.cpp


  ----==== Summary ====----
  -----------------------
  Filename | Report count
  -----------------------
  main.cpp |            1
  -----------------------
  -----------------------
  Severity | Report count
  -----------------------
  HIGH     |            1
  -----------------------
  ----=================----
  Total number of reports: 1
  ----=================----

Or we can use `CodeChecker parse -e html` to export the results into HTML format:

.. code-block:: bash

  $ CodeChecker parse -e html -o html_out reports
  $ firefox html_out/index.html

Automated CTU Analysis with scan-build-py (don't do it)
#############################################################
We actively develop CTU with CodeChecker as the driver for this feature; `scan-build-py` is not actively developed for CTU.
`scan-build-py` has various errors and issues; expect it to work only with very basic projects.

Example usage of scan-build-py:

.. code-block:: bash

  $ /your/path/to/llvm-project/clang/tools/scan-build-py/bin/analyze-build --ctu
  analyze-build: Run 'scan-view /tmp/scan-build-2019-07-17-17-53-33-810365-7fqgWk' to examine bug reports.
  $ /your/path/to/llvm-project/clang/tools/scan-view/bin/scan-view /tmp/scan-build-2019-07-17-17-53-33-810365-7fqgWk
  Starting scan-view at: http://127.0.0.1:8181
    Use Ctrl-C to exit.
  [6336:6431:0717/175357.633914:ERROR:browser_process_sub_thread.cc(209)] Waited 5 ms for network service
  Opening in existing browser session.
  ^C
  $

.. _ctu-on-demand:

On-demand analysis
__________________
The analysis produces the necessary AST structure of external TUs during analysis. This requires the
exact compiler invocations for each TU, which can be generated by hand, or by tools driving the analyzer.
The compiler invocation is a shell command that could be used to compile the TU's main source file.
The mapping from the absolute source file path of a TU to the list of compilation command segments used to
compile that TU is given in a YAML file referred to as the `invocation list`, which must be passed as an
analyzer-config argument.
The index, which maps function USR names to the source files containing them, must also be generated, using the
`clang-extdef-mapping` tool. Entries in the index must *not* have an `.ast` suffix if the goal
is to use on-demand analysis, as that extension signals that the entry is to be used as a PCH dump.
The mapping of external definitions implicitly uses a
:doc:`compilation database <../../JSONCompilationDatabase>` to determine the compilation flags used.
The analysis invocation must be provided with the directory which contains the mapping
files, and the `invocation list` which is used to determine compiler flags.


Manual CTU Analysis
###################

Let's consider these source files in our minimal example:

.. code-block:: cpp

  // main.cpp
  int foo();

  int main() {
    return 3 / foo();
  }

.. code-block:: cpp

  // foo.cpp
  int foo() {
    return 0;
  }

The compilation database:

.. code-block:: json

  [
    {
      "directory": "/path/to/your/project",
      "command": "clang++ -c foo.cpp -o foo.o",
      "file": "foo.cpp"
    },
    {
      "directory": "/path/to/your/project",
      "command": "clang++ -c main.cpp -o main.o",
      "file": "main.cpp"
    }
  ]

The `invocation list` (passed to the analyzer as `invocations.yaml` in this example):

.. code-block:: yaml

  "/path/to/your/project/foo.cpp":
    - "clang++"
    - "-c"
    - "/path/to/your/project/foo.cpp"
    - "-o"
    - "/path/to/your/project/foo.o"

  "/path/to/your/project/main.cpp":
    - "clang++"
    - "-c"
    - "/path/to/your/project/main.cpp"
    - "-o"
    - "/path/to/your/project/main.o"

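An invocation list like the one above can be derived from the compilation database. The Python sketch below is one
possible approach, not the method used by CodeChecker: the function names are hypothetical, entries are assumed to use
the `command` field, and only the source file and the argument of `-o` are assumed to be relative paths.

```python
import json
import posixpath
import shlex


def invocation_list(compilation_database):
    """Map each absolute source path to its list of command segments."""
    invocations = {}
    for entry in compilation_database:
        directory = entry["directory"]
        args = shlex.split(entry["command"])
        absolute_args = []
        for index, arg in enumerate(args):
            # Assumption: only the source file and the argument of -o are
            # relative paths; real commands may need more cases (e.g. -I).
            follows_output_flag = index > 0 and args[index - 1] == "-o"
            if arg == entry["file"] or follows_output_flag:
                arg = posixpath.join(directory, arg)
            absolute_args.append(arg)
        invocations[posixpath.join(directory, entry["file"])] = absolute_args
    return invocations


def to_yaml(invocations):
    """Render the mapping with JSON-style double-quoted strings."""
    lines = []
    for source, args in invocations.items():
        lines.append(json.dumps(source) + ":")
        lines.extend("  - " + json.dumps(arg) for arg in args)
        lines.append("")
    return "\n".join(lines)


database = [
    {
        "directory": "/path/to/your/project",
        "command": "clang++ -c foo.cpp -o foo.o",
        "file": "foo.cpp",
    },
]
print(to_yaml(invocation_list(database)))
```

The output is written with JSON-style quoting, which YAML parsers generally accept for double-quoted strings, so no
YAML library is needed for this simple shape.
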
We'd like to analyze `main.cpp` and discover the division by zero bug.
As we are using on-demand mode, we only need to create a CTU index file which holds the `USR` name and location of
external definitions in the source files, in the format `<USR-Length>:<USR> <File-Path>`:

.. code-block:: bash

  $ clang-extdef-mapping -p . foo.cpp
  9:c:@F@foo# /path/to/your/project/foo.cpp
  $ clang-extdef-mapping -p . foo.cpp > externalDefMap.txt

Now everything is available for the CTU analysis.
We have to pass CTU-specific extra arguments to Clang:

.. code-block:: bash

  $ pwd
  /path/to/your/project
  $ clang++ --analyze \
      -Xclang -analyzer-config -Xclang experimental-enable-naive-ctu-analysis=true \
      -Xclang -analyzer-config -Xclang ctu-dir=. \
      -Xclang -analyzer-config -Xclang ctu-invocation-list=invocations.yaml \
      -Xclang -analyzer-output=plist-multi-file \
      main.cpp
  main.cpp:5:12: warning: Division by zero
    return 3 / foo();
           ~~^~~~~~~
  1 warning generated.
  $ # The plist file with the result is generated.
  $ ls -F
  compile_commands.json  externalDefMap.txt  foo.cpp  main.cpp  main.plist
  $

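When the analyzer is driven from a script, the repetitive `-Xclang -analyzer-config -Xclang key=value` pattern can be
assembled programmatically. The sketch below uses only the options shown in this document; `ctu_analyzer_command` is a
hypothetical helper, and it builds the argument list without running Clang.

```python
def ctu_analyzer_command(source, ctu_dir, invocation_list=None):
    """Build the clang++ argument list for a manual CTU analysis run."""
    config = {
        "experimental-enable-naive-ctu-analysis": "true",
        "ctu-dir": ctu_dir,
    }
    # The invocation list is only needed for on-demand analysis.
    if invocation_list is not None:
        config["ctu-invocation-list"] = invocation_list
    command = ["clang++", "--analyze"]
    for key, value in config.items():
        command += ["-Xclang", "-analyzer-config", "-Xclang", key + "=" + value]
    command += ["-Xclang", "-analyzer-output=plist-multi-file", source]
    return command


print(" ".join(ctu_analyzer_command("main.cpp", ".", "invocations.yaml")))
```

Leaving out the `invocation_list` argument yields the command used in the PCH-based example earlier in this document.
The resulting list can be handed to `subprocess.run` if the analysis should be started from Python.
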
This manual procedure is error-prone and does not scale; to analyze real projects, it is recommended to use
`CodeChecker` or `scan-build-py`.

Automated CTU Analysis with CodeChecker
#######################################
The `CodeChecker <https://github.com/Ericsson/codechecker>`_ project fully supports automated CTU analysis with Clang.
Once the `PATH` environment variable is set up and the Python `venv` is activated, a single command is all it takes:

.. code-block:: bash

  $ CodeChecker analyze --ctu --ctu-ast-loading-mode on-demand compile_commands.json -o reports
  $ ls -F
  compile_commands.json  foo.cpp  main.cpp  reports/
  $ tree reports
  reports
  ├── compile_cmd.json
  ├── compiler_info.json
  ├── foo.cpp_53f6fbf7ab7ec9931301524b551959e2.plist
  ├── main.cpp_23db3d8df52ff0812e6e5a03071c8337.plist
  ├── metadata.json
  └── unique_compile_commands.json

  0 directories, 6 files
  $

The `plist` files contain the results of the analysis, which may be viewed with the regular analysis tools.
For example, one may use `CodeChecker parse` to view the results on the command line:

.. code-block:: bash

  $ CodeChecker parse reports
  [HIGH] /home/egbomrt/ctu_mini_raw_project/main.cpp:5:12: Division by zero [core.DivideZero]
    return 3 / foo();
             ^

  Found 1 defect(s) in main.cpp


  ----==== Summary ====----
  -----------------------
  Filename | Report count
  -----------------------
  main.cpp |            1
  -----------------------
  -----------------------
  Severity | Report count
  -----------------------
  HIGH     |            1
  -----------------------
  ----=================----
  Total number of reports: 1
  ----=================----

Or we can use `CodeChecker parse -e html` to export the results into HTML format:

.. code-block:: bash

  $ CodeChecker parse -e html -o html_out reports
  $ firefox html_out/index.html

Automated CTU Analysis with scan-build-py (don't do it)
#######################################################
We actively develop CTU with CodeChecker as the driver for this feature; `scan-build-py` is not actively developed for CTU.
`scan-build-py` has various errors and issues; expect it to work only with very basic projects.

Currently, on-demand analysis is not supported with `scan-build-py`.

377