=====================================
Cross Translation Unit (CTU) Analysis
=====================================

Normally, static analysis works within the boundary of one translation unit (TU).
However, with additional steps and configuration we can enable the analysis to inline the definition of a function from
another TU.

.. contents::
   :local:

Overview
________
CTU analysis can be used in a variety of ways. External TU definitions can be imported either from pre-dumped PCH
files or by generating the necessary AST structure on demand, during the analysis of the main TU. Driving the static
analysis can also be implemented in multiple ways. The most direct way is to specify the necessary command-line options
of the Clang frontend manually (and generate the prerequisite dependencies of the specific import method by hand). This
process can be automated by other tools, like `CodeChecker <https://github.com/Ericsson/codechecker>`_ and scan-build-py
(the former is preferred).

PCH-based analysis
__________________
The analysis needs the PCH dumps of all the translation units used in the project.
These can be generated by the Clang frontend itself, and must be arranged in a specific way in the filesystem.
The index, which maps symbols' USR names to the PCH dumps containing them, must also be generated, using the
`clang-extdef-mapping` tool. Entries in the index *must* have an `.ast` suffix if the goal
is to use PCH-based analysis, as the lack of that extension signals that the entry is to be used as a source file and parsed on demand.
This tool uses a :doc:`compilation database <../../JSONCompilationDatabase>` to
determine the compilation flags used.
The analysis invocation must be provided with the directory which contains the dumps and the mapping files.


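The suffix convention can be illustrated with a small Python sketch. It is not part of Clang: the function name
`parse_index_entry` is hypothetical, and the sketch assumes that each index line has the form `<USR> <path>` with no
space inside the USR.

.. code-block:: python

  def parse_index_entry(line):
      """Split one index line into (usr, path, mode).

      Assumption: the USR and the path are separated by the first space.
      """
      usr, path = line.rstrip("\n").split(" ", 1)
      # An `.ast` suffix marks a PCH dump; anything else is treated as a
      # source file that has to be parsed on demand.
      mode = "pch" if path.endswith(".ast") else "on-demand"
      return usr, path, mode

Under these assumptions an entry that points to `foo.cpp.ast` is classified as `pch`, while an entry that points to
`foo.cpp` is classified as `on-demand`.
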
Manual CTU Analysis
###################
Let's consider these source files in our minimal example:

.. code-block:: cpp

  // main.cpp
  int foo();

  int main() {
    return 3 / foo();
  }

.. code-block:: cpp

  // foo.cpp
  int foo() {
    return 0;
  }

And a compilation database (`compile_commands.json`):

.. code-block:: json

  [
    {
      "directory": "/path/to/your/project",
      "command": "clang++ -c foo.cpp -o foo.o",
      "file": "foo.cpp"
    },
    {
      "directory": "/path/to/your/project",
      "command": "clang++ -c main.cpp -o main.o",
      "file": "main.cpp"
    }
  ]

We'd like to analyze `main.cpp` and discover the division by zero bug.
In order to be able to inline the definition of `foo` from `foo.cpp`, we first have to generate the `AST` (or `PCH`) file
of `foo.cpp`:

.. code-block:: bash

  $ pwd
  /path/to/your/project
  $ clang++ -emit-ast -o foo.cpp.ast foo.cpp
  $ # Check that the .ast file is generated:
  $ ls
  compile_commands.json  foo.cpp.ast  foo.cpp  main.cpp
  $

The next step is to create a CTU index file which holds the `USR` name and location of external definitions in the
source files:

.. code-block:: bash

  $ clang-extdef-mapping -p . foo.cpp
  c:@F@foo# /path/to/your/project/foo.cpp
  $ clang-extdef-mapping -p . foo.cpp > externalDefMap.txt

We have to modify `externalDefMap.txt` to contain the names of the `.ast` files instead of the source files:

.. code-block:: bash

  $ sed -i -e 's/\.cpp$/.cpp.ast/' externalDefMap.txt

We also have to modify the `externalDefMap.txt` file to contain relative paths:

.. code-block:: bash

  $ sed -i -e "s|$(pwd)/||g" externalDefMap.txt

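The two `sed` steps can also be expressed as a short Python sketch, which may be easier to adapt when a project has
many source files. The helper name `to_pch_index` is hypothetical, and the sketch assumes that every index line has the
form `<USR> <absolute source path>` and that each dump is named after its source file with `.ast` appended.

.. code-block:: python

  import os


  def to_pch_index(lines, project_dir):
      """Rewrite index lines to point at relative `.ast` dumps."""
      # os.path.join(dir, "") yields the directory with a trailing separator.
      prefix = os.path.join(project_dir, "")
      rewritten = []
      for line in lines:
          usr, path = line.rstrip("\n").split(" ", 1)
          # Make the path relative to the project directory (the ctu-dir).
          if path.startswith(prefix):
              path = path[len(prefix):]
          # Point the entry at the PCH dump instead of the source file.
          rewritten.append(f"{usr} {path}.ast")
      return rewritten

For the example above, the single line `c:@F@foo# /path/to/your/project/foo.cpp` becomes `c:@F@foo# foo.cpp.ast`.
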
Now everything is available for the CTU analysis.
We have to pass CTU-specific extra arguments to Clang:

.. code-block:: bash

  $ pwd
  /path/to/your/project
  $ clang++ --analyze \
      -Xclang -analyzer-config -Xclang experimental-enable-naive-ctu-analysis=true \
      -Xclang -analyzer-config -Xclang ctu-dir=. \
      -Xclang -analyzer-output=plist-multi-file \
      main.cpp
  main.cpp:5:12: warning: Division by zero
    return 3 / foo();
           ~~^~~~~~~
  1 warning generated.
  $ # The plist file with the result is generated.
  $ ls -F
  compile_commands.json  externalDefMap.txt  foo.cpp  foo.cpp.ast  main.cpp  main.plist
  $

This manual procedure is error-prone and does not scale, so for analyzing real projects it is recommended to use
`CodeChecker` or `scan-build-py`.

Automated CTU Analysis with CodeChecker
#######################################
The `CodeChecker <https://github.com/Ericsson/codechecker>`_ project fully supports automated CTU analysis with Clang.
Once we have set up the `PATH` environment variable and activated the Python `venv`, this is all it takes:

.. code-block:: bash

  $ CodeChecker analyze --ctu compile_commands.json -o reports
  $ ls -F
  compile_commands.json  foo.cpp  foo.cpp.ast  main.cpp  reports/
  $ tree reports
  reports
  ├── compile_cmd.json
  ├── compiler_info.json
  ├── foo.cpp_53f6fbf7ab7ec9931301524b551959e2.plist
  ├── main.cpp_23db3d8df52ff0812e6e5a03071c8337.plist
  ├── metadata.json
  └── unique_compile_commands.json

  0 directories, 6 files
  $

The `plist` files contain the results of the analysis, which may be viewed with the regular analysis tools.
E.g. one may use `CodeChecker parse` to view the results on the command line:

.. code-block:: bash

  $ CodeChecker parse reports
  [HIGH] /home/egbomrt/ctu_mini_raw_project/main.cpp:5:12: Division by zero [core.DivideZero]
    return 3 / foo();
             ^

  Found 1 defect(s) in main.cpp


  ----==== Summary ====----
  -----------------------
  Filename | Report count
  -----------------------
  main.cpp |            1
  -----------------------
  -----------------------
  Severity | Report count
  -----------------------
  HIGH     |            1
  -----------------------
  ----=================----
  Total number of reports: 1
  ----=================----

Or we can use `CodeChecker parse -e html` to export the results into HTML format:

.. code-block:: bash

  $ CodeChecker parse -e html -o html_out reports
  $ firefox html_out/index.html

Automated CTU Analysis with scan-build-py (don't do it)
#######################################################
We actively develop CTU with CodeChecker as the driver for this feature; `scan-build-py` is not actively developed for CTU.
`scan-build-py` has various errors and issues; expect it to work only with very basic projects.

Example usage of scan-build-py:

.. code-block:: bash

  $ /your/path/to/llvm-project/clang/tools/scan-build-py/bin/analyze-build --ctu
  analyze-build: Run 'scan-view /tmp/scan-build-2019-07-17-17-53-33-810365-7fqgWk' to examine bug reports.
  $ /your/path/to/llvm-project/clang/tools/scan-view/bin/scan-view /tmp/scan-build-2019-07-17-17-53-33-810365-7fqgWk
  Starting scan-view at: http://127.0.0.1:8181
    Use Ctrl-C to exit.
  [6336:6431:0717/175357.633914:ERROR:browser_process_sub_thread.cc(209)] Waited 5 ms for network service
  Opening in existing browser session.
  ^C
  $

.. _ctu-on-demand:

On-demand analysis
__________________
The analysis produces the necessary AST structure of external TUs during analysis. This requires the
exact compiler invocations for each TU, which can be generated by hand, or by tools driving the analyzer.
The compiler invocation is a shell command that could be used to compile the TU's main source file.
The mapping from the absolute source file path of each TU to the list of compilation command segments used to
compile said TU is given in YAML format, referred to as the `invocation list`, and must be passed as an
analyzer-config argument.
The index, which maps function USR names to the source files containing them, must also be generated, using the
`clang-extdef-mapping` tool. Entries in the index must *not* have an `.ast` suffix if the goal
is to use On-demand analysis, as that extension signals that the entry is to be used as a PCH dump.
The mapping of external definitions implicitly uses a
:doc:`compilation database <../../JSONCompilationDatabase>` to determine the compilation flags used.
The analysis invocation must be provided with the directory which contains the mapping
files, and the `invocation list` which is used to determine compiler flags.


Manual CTU Analysis
###################

Let's consider these source files in our minimal example:

.. code-block:: cpp

  // main.cpp
  int foo();

  int main() {
    return 3 / foo();
  }

.. code-block:: cpp

  // foo.cpp
  int foo() {
    return 0;
  }

The compilation database (`compile_commands.json`):

.. code-block:: json

  [
    {
      "directory": "/path/to/your/project",
      "command": "clang++ -c foo.cpp -o foo.o",
      "file": "foo.cpp"
    },
    {
      "directory": "/path/to/your/project",
      "command": "clang++ -c main.cpp -o main.o",
      "file": "main.cpp"
    }
  ]

The `invocation list` (saved here as `invocations.yaml`):

.. code-block:: yaml

  "/path/to/your/project/foo.cpp":
    - "clang++"
    - "-c"
    - "/path/to/your/project/foo.cpp"
    - "-o"
    - "/path/to/your/project/foo.o"

  "/path/to/your/project/main.cpp":
    - "clang++"
    - "-c"
    - "/path/to/your/project/main.cpp"
    - "-o"
    - "/path/to/your/project/main.o"

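For larger projects, an `invocation list` of this shape could be derived from the compilation database instead of being
written by hand. The following Python function is a minimal sketch, not a tool shipped with Clang. The name
`invocation_list_yaml` is hypothetical. It assumes that every entry uses the `command` field (rather than `arguments`),
and that only the source file and the `-o` target need to be made absolute.

.. code-block:: python

  import json
  import os
  import shlex


  def _absolute(directory, path):
      # Resolve `path` against the entry's working directory.
      return os.path.normpath(os.path.join(directory, path))


  def invocation_list_yaml(entries):
      """Render compilation database entries as invocation list YAML text."""
      blocks = []
      for entry in entries:
          directory = entry["directory"]
          source = _absolute(directory, entry["file"])
          args = shlex.split(entry["command"])
          # JSON string literals are also valid YAML double-quoted scalars.
          lines = [json.dumps(source) + ":"]
          for i, arg in enumerate(args):
              # Assumption: only the source file and the "-o" target are paths.
              if arg == entry["file"] or (i > 0 and args[i - 1] == "-o"):
                  arg = _absolute(directory, arg)
              lines.append("  - " + json.dumps(arg))
          blocks.append("\n".join(lines))
      return "\n\n".join(blocks) + "\n"

Calling it with the list loaded from `compile_commands.json` by `json.load` yields text in the same shape as the
hand-written list. Include paths and other path-valued flags would need similar treatment in a real project.
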
We'd like to analyze `main.cpp` and discover the division by zero bug.
As we are using On-demand mode, we only need to create a CTU index file which holds the `USR` name and location of
external definitions in the source files:

.. code-block:: bash

  $ clang-extdef-mapping -p . foo.cpp
  c:@F@foo# /path/to/your/project/foo.cpp
  $ clang-extdef-mapping -p . foo.cpp > externalDefMap.txt

Now everything is available for the CTU analysis.
We have to pass CTU-specific extra arguments to Clang:

.. code-block:: bash

  $ pwd
  /path/to/your/project
  $ clang++ --analyze \
      -Xclang -analyzer-config -Xclang experimental-enable-naive-ctu-analysis=true \
      -Xclang -analyzer-config -Xclang ctu-dir=. \
      -Xclang -analyzer-config -Xclang ctu-invocation-list=invocations.yaml \
      -Xclang -analyzer-output=plist-multi-file \
      main.cpp
  main.cpp:5:12: warning: Division by zero
    return 3 / foo();
           ~~^~~~~~~
  1 warning generated.
  $ # The plist file with the result is generated.
  $ ls -F
  compile_commands.json  externalDefMap.txt  foo.cpp  invocations.yaml  main.cpp  main.plist
  $

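When the analyzer is driven from a script, the same argument list can be assembled programmatically. The sketch below
only uses the options shown in this document. The function name `ctu_analyze_command` and its parameters are
hypothetical.

.. code-block:: python

  def ctu_analyze_command(source, ctu_dir=".", invocation_list=None,
                          compiler="clang++"):
      """Build the argument list for a manual CTU analysis run."""
      configs = [
          "experimental-enable-naive-ctu-analysis=true",
          f"ctu-dir={ctu_dir}",
      ]
      # The invocation list is only needed for On-demand analysis.
      if invocation_list is not None:
          configs.append(f"ctu-invocation-list={invocation_list}")
      cmd = [compiler, "--analyze"]
      for config in configs:
          cmd += ["-Xclang", "-analyzer-config", "-Xclang", config]
      cmd += ["-Xclang", "-analyzer-output=plist-multi-file", source]
      return cmd

The returned list can be passed to `subprocess.run` from the project directory. Leaving `invocation_list` unset
produces the PCH-based invocation instead.
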
This manual procedure is error-prone and does not scale, so for analyzing real projects it is recommended to use
`CodeChecker`.

Automated CTU Analysis with CodeChecker
#######################################
The `CodeChecker <https://github.com/Ericsson/codechecker>`_ project fully supports automated CTU analysis with Clang.
Once we have set up the `PATH` environment variable and activated the Python `venv`, this is all it takes:

.. code-block:: bash

  $ CodeChecker analyze --ctu --ctu-ast-loading-mode on-demand compile_commands.json -o reports
  $ ls -F
  compile_commands.json  foo.cpp  main.cpp  reports/
  $ tree reports
  reports
  ├── compile_cmd.json
  ├── compiler_info.json
  ├── foo.cpp_53f6fbf7ab7ec9931301524b551959e2.plist
  ├── main.cpp_23db3d8df52ff0812e6e5a03071c8337.plist
  ├── metadata.json
  └── unique_compile_commands.json

  0 directories, 6 files
  $

The `plist` files contain the results of the analysis, which may be viewed with the regular analysis tools.
E.g. one may use `CodeChecker parse` to view the results on the command line:

.. code-block:: bash

  $ CodeChecker parse reports
  [HIGH] /home/egbomrt/ctu_mini_raw_project/main.cpp:5:12: Division by zero [core.DivideZero]
    return 3 / foo();
             ^

  Found 1 defect(s) in main.cpp


  ----==== Summary ====----
  -----------------------
  Filename | Report count
  -----------------------
  main.cpp |            1
  -----------------------
  -----------------------
  Severity | Report count
  -----------------------
  HIGH     |            1
  -----------------------
  ----=================----
  Total number of reports: 1
  ----=================----

Or we can use `CodeChecker parse -e html` to export the results into HTML format:

.. code-block:: bash

  $ CodeChecker parse -e html -o html_out reports
  $ firefox html_out/index.html

Automated CTU Analysis with scan-build-py (don't do it)
#######################################################
We actively develop CTU with CodeChecker as the driver for this feature; `scan-build-py` is not actively developed for CTU.
`scan-build-py` has various errors and issues; expect it to work only with very basic projects.

Currently, On-demand analysis is not supported with `scan-build-py`.