=====================================
Cross Translation Unit (CTU) Analysis
=====================================

Normally, static analysis works within the boundary of a single translation unit (TU).
However, with additional steps and configuration we can enable the analysis to inline the definition of a function from
another TU.

.. contents::
   :local:

Overview
________
CTU analysis can be used in a variety of ways. The importing of external TU definitions can work with pre-dumped PCH
files or by generating the necessary AST structure on demand, during the analysis of the main TU. Driving the static
analysis can also be implemented in multiple ways. The most direct way is to specify the necessary command-line options
of the Clang frontend manually (and generate the prerequisite dependencies of the specific import method by hand). This
process can be automated by other tools, like `CodeChecker <https://github.com/Ericsson/codechecker>`_ and scan-build-py
(the former is preferred).

PCH-based analysis
__________________
The analysis needs the PCH dumps of all the translation units used in the project.
These can be generated by the Clang frontend itself, and must be arranged in a specific way in the filesystem.
The index, which maps symbols' USR names to the PCH dumps containing them, must also be generated by the
`clang-extdef-mapping` tool. Entries in the index *must* have an `.ast` suffix if the goal
is to use PCH-based analysis, as the lack of that extension signals that the entry is to be used as a source file and
parsed on demand.
`clang-extdef-mapping` uses a :doc:`compilation database <../../JSONCompilationDatabase>` to
determine the compilation flags used.
The analysis invocation must be provided with the directory which contains the dumps and the mapping files.


Manual CTU Analysis
###################
Let's consider these source files in our minimal example:

.. code-block:: cpp

  // main.cpp
  int foo();

  int main() {
    return 3 / foo();
  }

.. code-block:: cpp

  // foo.cpp
  int foo() {
    return 0;
  }

And a compilation database:

.. code-block:: json

  [
    {
      "directory": "/path/to/your/project",
      "command": "clang++ -c foo.cpp -o foo.o",
      "file": "foo.cpp"
    },
    {
      "directory": "/path/to/your/project",
      "command": "clang++ -c main.cpp -o main.o",
      "file": "main.cpp"
    }
  ]

We'd like to analyze `main.cpp` and discover the division by zero bug.
To be able to inline the definition of `foo` from `foo.cpp`, we first have to generate the `AST` (or `PCH`) file
of `foo.cpp`:

.. code-block:: bash

  $ pwd
  /path/to/your/project
  $ clang++ -emit-ast -o foo.cpp.ast foo.cpp
  $ # Check that the .ast file is generated:
  $ ls
  compile_commands.json  foo.cpp.ast  foo.cpp  main.cpp
  $

The next step is to create a CTU index file which holds the `USR` name and location of external definitions in the
source files, in the format `<USR-Length>:<USR> <File-Path>`:

.. code-block:: bash

  $ clang-extdef-mapping -p . foo.cpp
  9:c:@F@foo# /path/to/your/project/foo.cpp
  $ clang-extdef-mapping -p . foo.cpp > externalDefMap.txt

We have to modify `externalDefMap.txt` to contain the names of the `.ast` files instead of the source files:

.. code-block:: bash

  $ sed -i -e 's/\.cpp$/.cpp.ast/' externalDefMap.txt

We also have to modify the `externalDefMap.txt` file to contain relative paths:

.. code-block:: bash

  $ sed -i -e "s|$(pwd)/||g" externalDefMap.txt

Now everything is available for the CTU analysis.
We have to pass the CTU-specific extra arguments to Clang:

.. code-block:: bash

  $ pwd
  /path/to/your/project
  $ clang++ --analyze \
      -Xclang -analyzer-config -Xclang experimental-enable-naive-ctu-analysis=true \
      -Xclang -analyzer-config -Xclang ctu-dir=. \
      -Xclang -analyzer-output=plist-multi-file \
      main.cpp
  main.cpp:5:12: warning: Division by zero
    return 3 / foo();
           ~~^~~~~~~
  1 warning generated.
  $ # The plist file with the result is generated.
  $ ls -F
  compile_commands.json  externalDefMap.txt  foo.cpp  foo.cpp.ast  main.cpp  main.plist
  $

This manual procedure is error-prone and does not scale; to analyze real projects, it is recommended to use
`CodeChecker` or `scan-build-py`.

Automated CTU Analysis with CodeChecker
#######################################
The `CodeChecker <https://github.com/Ericsson/codechecker>`_ project fully supports automated CTU analysis with Clang.
Once the `PATH` environment variable is set up and the Python `venv` is activated, a single command is all it takes:

.. code-block:: bash

  $ CodeChecker analyze --ctu compile_commands.json -o reports
  $ ls -F
  compile_commands.json  foo.cpp  foo.cpp.ast  main.cpp  reports/
  $ tree reports
  reports
  ├── compile_cmd.json
  ├── compiler_info.json
  ├── foo.cpp_53f6fbf7ab7ec9931301524b551959e2.plist
  ├── main.cpp_23db3d8df52ff0812e6e5a03071c8337.plist
  ├── metadata.json
  └── unique_compile_commands.json

  0 directories, 6 files
  $

The `plist` files contain the results of the analysis, which may be viewed with the regular analysis tools.
For example, one may use `CodeChecker parse` to view the results on the command line:

.. code-block:: bash

  $ CodeChecker parse reports
  [HIGH] /home/egbomrt/ctu_mini_raw_project/main.cpp:5:12: Division by zero [core.DivideZero]
    return 3 / foo();
             ^

  Found 1 defect(s) in main.cpp


  ----==== Summary ====----
  -----------------------
  Filename | Report count
  -----------------------
  main.cpp |            1
  -----------------------
  -----------------------
  Severity | Report count
  -----------------------
  HIGH     |            1
  -----------------------
  ----=================----
  Total number of reports: 1
  ----=================----

Or we can use `CodeChecker parse -e html` to export the results into HTML format:

.. code-block:: bash

  $ CodeChecker parse -e html -o html_out reports
  $ firefox html_out/index.html

Automated CTU Analysis with scan-build-py (don't do it)
#######################################################
We actively develop CTU with CodeChecker as the driver for this feature; `scan-build-py` is not actively developed for CTU.
`scan-build-py` has various errors and issues, so expect it to work only with very basic projects.

Example usage of scan-build-py:

.. code-block:: bash

  $ /your/path/to/llvm-project/clang/tools/scan-build-py/bin/analyze-build --ctu
  analyze-build: Run 'scan-view /tmp/scan-build-2019-07-17-17-53-33-810365-7fqgWk' to examine bug reports.
  $ /your/path/to/llvm-project/clang/tools/scan-view/bin/scan-view /tmp/scan-build-2019-07-17-17-53-33-810365-7fqgWk
  Starting scan-view at: http://127.0.0.1:8181
    Use Ctrl-C to exit.
  [6336:6431:0717/175357.633914:ERROR:browser_process_sub_thread.cc(209)] Waited 5 ms for network service
  Opening in existing browser session.
  ^C
  $

.. _ctu-on-demand:

On-demand analysis
__________________
The analysis produces the necessary AST structure of external TUs during analysis. This requires the
exact compiler invocations for each TU, which can be generated by hand, or by tools driving the analyzer.
The compiler invocation is a shell command that could be used to compile the TU's main source file.
The mapping from the absolute source file path of each TU to the list of compilation command segments used to
compile that TU is given in YAML format, referred to as the `invocation list`, and must be passed as an
analyzer-config argument.
The index, which maps function USR names to the source files containing them, must also be generated by the
`clang-extdef-mapping` tool. Entries in the index must *not* have an `.ast` suffix if the goal
is to use on-demand analysis, as that extension signals that the entry is to be used as a PCH dump.
The mapping of external definitions implicitly uses a
:doc:`compilation database <../../JSONCompilationDatabase>` to determine the compilation flags used.
The analysis invocation must be provided with the directory which contains the mapping
files, and the `invocation list` which is used to determine compiler flags.


Manual CTU Analysis
###################

Let's consider these source files in our minimal example:

.. code-block:: cpp

  // main.cpp
  int foo();

  int main() {
    return 3 / foo();
  }

.. code-block:: cpp

  // foo.cpp
  int foo() {
    return 0;
  }

The compilation database:

.. code-block:: json

  [
    {
      "directory": "/path/to/your/project",
      "command": "clang++ -c foo.cpp -o foo.o",
      "file": "foo.cpp"
    },
    {
      "directory": "/path/to/your/project",
      "command": "clang++ -c main.cpp -o main.o",
      "file": "main.cpp"
    }
  ]

The `invocation list`, saved as `invocations.yaml`:

.. code-block:: yaml

  "/path/to/your/project/foo.cpp":
    - "clang++"
    - "-c"
    - "/path/to/your/project/foo.cpp"
    - "-o"
    - "/path/to/your/project/foo.o"

  "/path/to/your/project/main.cpp":
    - "clang++"
    - "-c"
    - "/path/to/your/project/main.cpp"
    - "-o"
    - "/path/to/your/project/main.o"

We'd like to analyze `main.cpp` and discover the division by zero bug.
As we are using on-demand mode, we only need to create a CTU index file which holds the `USR` name and location of
external definitions in the source files, in the format `<USR-Length>:<USR> <File-Path>`:

.. code-block:: bash

  $ clang-extdef-mapping -p . foo.cpp
  9:c:@F@foo# /path/to/your/project/foo.cpp
  $ clang-extdef-mapping -p . foo.cpp > externalDefMap.txt

Now everything is available for the CTU analysis.
We have to pass the CTU-specific extra arguments to Clang:

.. code-block:: bash

  $ pwd
  /path/to/your/project
  $ clang++ --analyze \
      -Xclang -analyzer-config -Xclang experimental-enable-naive-ctu-analysis=true \
      -Xclang -analyzer-config -Xclang ctu-dir=. \
      -Xclang -analyzer-config -Xclang ctu-invocation-list=invocations.yaml \
      -Xclang -analyzer-output=plist-multi-file \
      main.cpp
  main.cpp:5:12: warning: Division by zero
    return 3 / foo();
           ~~^~~~~~~
  1 warning generated.
  $ # The plist file with the result is generated.
  $ ls -F
  compile_commands.json  externalDefMap.txt  foo.cpp  invocations.yaml  main.cpp  main.plist
  $

This manual procedure is error-prone and does not scale; to analyze real projects, it is recommended to use
`CodeChecker` or `scan-build-py`.

Automated CTU Analysis with CodeChecker
#######################################
The `CodeChecker <https://github.com/Ericsson/codechecker>`_ project fully supports automated CTU analysis with Clang.
Once the `PATH` environment variable is set up and the Python `venv` is activated, a single command is all it takes:

.. code-block:: bash

  $ CodeChecker analyze --ctu --ctu-ast-loading-mode on-demand compile_commands.json -o reports
  $ ls -F
  compile_commands.json  foo.cpp  main.cpp  reports/
  $ tree reports
  reports
  ├── compile_cmd.json
  ├── compiler_info.json
  ├── foo.cpp_53f6fbf7ab7ec9931301524b551959e2.plist
  ├── main.cpp_23db3d8df52ff0812e6e5a03071c8337.plist
  ├── metadata.json
  └── unique_compile_commands.json

  0 directories, 6 files
  $

The `plist` files contain the results of the analysis, which may be viewed with the regular analysis tools.
For example, one may use `CodeChecker parse` to view the results on the command line:

.. code-block:: bash

  $ CodeChecker parse reports
  [HIGH] /home/egbomrt/ctu_mini_raw_project/main.cpp:5:12: Division by zero [core.DivideZero]
    return 3 / foo();
             ^

  Found 1 defect(s) in main.cpp


  ----==== Summary ====----
  -----------------------
  Filename | Report count
  -----------------------
  main.cpp |            1
  -----------------------
  -----------------------
  Severity | Report count
  -----------------------
  HIGH     |            1
  -----------------------
  ----=================----
  Total number of reports: 1
  ----=================----

Or we can use `CodeChecker parse -e html` to export the results into HTML format:

.. code-block:: bash

  $ CodeChecker parse -e html -o html_out reports
  $ firefox html_out/index.html

Automated CTU Analysis with scan-build-py (don't do it)
#######################################################
We actively develop CTU with CodeChecker as the driver for this feature; `scan-build-py` is not actively developed for CTU.
`scan-build-py` has various errors and issues, so expect it to work only with very basic projects.

Currently, on-demand analysis is not supported with `scan-build-py`.
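
For the manual on-demand workflow described earlier, the `invocation list` therefore has to be written by hand or
produced by a script. The following is a minimal sketch of such a script, not a tool shipped with Clang or CodeChecker.
The helper name `invocation_list_yaml` is hypothetical, and the sketch assumes that the entries of the compilation
database have already been loaded (for example with `json.load`) and that paths are POSIX-style. Only the source file
argument is rewritten to an absolute path; other relative paths, such as the `-o` output, are left as they are, unlike
in the hand-written list.

.. code-block:: python

  import json
  import os
  import shlex


  def invocation_list_yaml(entries):
      """Render compilation database entries as an invocation list in YAML.

      Hypothetical helper for illustration; `entries` is the parsed content
      of compile_commands.json (a list of dictionaries).
      """
      blocks = []
      for entry in entries:
          # Key of the mapping: absolute path of the TU's main source file.
          source = os.path.normpath(
              os.path.join(entry["directory"], entry["file"]))
          # An entry carries either an "arguments" list or a "command" string.
          args = entry.get("arguments") or shlex.split(entry["command"])
          # Spell the source file with its absolute path in the invocation.
          args = [source if arg == entry["file"] else arg for arg in args]
          # json.dumps yields double-quoted strings, which YAML also accepts.
          lines = [json.dumps(source) + ":"]
          lines += ["  - " + json.dumps(arg) for arg in args]
          blocks.append("\n".join(lines))
      return "\n\n".join(blocks) + "\n"

Writing the returned text to `invocations.yaml` would give a file of the kind passed through the
`ctu-invocation-list` analyzer-config option in the manual on-demand example.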