test-suite Guide
================

Quickstart
----------

1. The lit test runner is required to run the tests. You can either use one
   from an LLVM build:

   ```bash
   % <path to llvm build>/bin/llvm-lit --version
   lit 20.0.0dev
   ```

   An alternative is installing it as a Python package in a Python virtual
   environment:

   ```bash
   % python3 -m venv .venv
   % . .venv/bin/activate
   % pip install git+https://github.com/llvm/llvm-project.git#subdirectory=llvm/utils/lit
   % lit --version
   lit 20.0.0dev
   ```

   Installing the official Python release of lit in a Python virtual
   environment also works; this installs the most recent release of lit:

   ```bash
   % python3 -m venv .venv
   % . .venv/bin/activate
   % pip install lit
   % lit --version
   lit 18.1.8
   ```

   Please note that recent tests may rely on features not yet present in the
   latest lit release. If in doubt, use one of the previous methods.

2. Check out the `test-suite` module with:

   ```bash
   % git clone https://github.com/llvm/llvm-test-suite.git test-suite
   ```

3. Create a build directory and use CMake to configure the suite. Use the
   `CMAKE_C_COMPILER` option to specify the compiler to test. Use a cache file
   to choose a typical build configuration:

   ```bash
   % mkdir test-suite-build
   % cd test-suite-build
   % cmake -DCMAKE_C_COMPILER=<path to llvm build>/bin/clang \
           -C../test-suite/cmake/caches/O3.cmake \
           ../test-suite
   ```

**NOTE!** If you are using your own built clang and want to build and run the
MicroBenchmarks/XRay microbenchmarks, you need to add `compiler-rt` to the
`LLVM_ENABLE_RUNTIMES` CMake flag of your LLVM build.

4. Build the benchmarks:

   ```text
   % make
   Scanning dependencies of target timeit-target
   [ 0%] Building C object tools/CMakeFiles/timeit-target.dir/timeit.c.o
   [ 0%] Linking C executable timeit-target
   ...
   ```

5. Run the tests with lit:

   ```text
   % llvm-lit -v -j 1 -o results.json .
   -- Testing: 474 tests, 1 threads --
   PASS: test-suite :: MultiSource/Applications/ALAC/decode/alacconvert-decode.test (1 of 474)
   ********** TEST 'test-suite :: MultiSource/Applications/ALAC/decode/alacconvert-decode.test' RESULTS **********
   compile_time: 0.2192
   exec_time: 0.0462
   hash: "59620e187c6ac38b36382685ccd2b63b"
   size: 83348
   **********
   PASS: test-suite :: MultiSource/Applications/ALAC/encode/alacconvert-encode.test (2 of 474)
   ...
   ```

**NOTE!** Even if you only want the compile-time results (code size, LLVM
statistics, etc.), you still need to run the tests with the above `llvm-lit`
command. In that case, the *results.json* file will contain the compile-time
metrics.

6. Show and compare result files (optional):

   ```bash
   # Make sure pandas and scipy are installed. Prepend `sudo` if necessary.
   % pip install pandas scipy
   # Show a single result file:
   % test-suite/utils/compare.py results.json
   # Compare two result files:
   % test-suite/utils/compare.py results_a.json results_b.json
   ```
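
The lit runner does not have to process the whole build tree on every run; it
also accepts paths to subdirectories of the build directory and a `--filter`
regular expression, which is convenient when iterating on a small subset of
the suite. A short sketch (the directory and the filter expression are only
examples):

```bash
# Run only the tests below one subdirectory of the build tree:
% llvm-lit -v -o subset.json MultiSource/Benchmarks
# Run only the tests whose name matches a regular expression:
% llvm-lit -v --filter 'Vectorizer' -o vectorizer.json .
```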

Structure
---------

The test-suite contains benchmark and test programs. The programs come with
reference outputs so that their correctness can be checked. The suite comes
with tools to collect metrics such as benchmark runtime, compilation time and
code size.

The test-suite is divided into several directories:

- `SingleSource/`

  Contains test programs that are only a single source file in size. A
  subdirectory may contain several programs.

- `MultiSource/`

  Contains subdirectories which hold entire programs with multiple source
  files. Large benchmarks and whole applications go here.

- `MicroBenchmarks/`

  Programs using the [google-benchmark](https://github.com/google/benchmark)
  library. The programs define functions that are run multiple times until the
  measurement results are statistically significant.

- `External/`

  Contains descriptions and test data for code that cannot be directly
  distributed with the test-suite. The most prominent members of this
  directory are the SPEC CPU benchmark suites.
  See [External Suites](#external-suites).

- `Bitcode/`

  These tests are mostly written in LLVM bitcode.

- `CTMark/`

  Contains symbolic links to other benchmarks forming a representative sample
  for compilation performance measurements.

### Benchmarks

Every program can work as a correctness test. Some programs are unsuitable for
performance measurements. Setting the `TEST_SUITE_BENCHMARKING_ONLY` CMake
option to `ON` will disable them.
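
A benchmarking-only configuration can be selected when configuring the build,
or by re-running CMake in an existing build directory. A minimal sketch,
reusing the compiler path and cache file from the Quickstart:

```bash
% cmake -DCMAKE_C_COMPILER=<path to llvm build>/bin/clang \
        -DTEST_SUITE_BENCHMARKING_ONLY=ON \
        -C../test-suite/cmake/caches/O3.cmake \
        ../test-suite
```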

The MultiSource benchmarks consist of the following apps and benchmarks:

| MultiSource       | Language | Application Area                  | Remark               |
|-------------------|----------|-----------------------------------|----------------------|
| 7zip              | C/C++    | Compression/Decompression         |                      |
| ASCI_Purple       | C        | SMG2000 benchmark and solver      | Memory intensive app |
| ASC_Sequoia       | C        | Simulation and solver             |                      |
| BitBench          | C        | uudecode/uuencode utility         | Bit Stream benchmark for functional compilers |
| Bullet            | C++      | Bullet 2.75 physics engine        |                      |
| DOE-ProxyApps-C++ | C++      | HPC/scientific apps               | Small applications, representative of our larger DOE workloads |
| DOE-ProxyApps-C   | C        | HPC/scientific apps               | "                    |
| Fhourstones       | C        | Game/solver                       | Integer benchmark that efficiently solves positions in the game of Connect-4 |
| Fhourstones-3.1   | C        | Game/solver                       | "                    |
| FreeBench         | C        | Benchmark suite                   | Raytracer, four in a row, neural network, file compressor, Fast Fourier/Cosine/Sine Transform |
| llubenchmark      | C        | Linked-list micro-benchmark       |                      |
| mafft             | C        | Bioinformatics                    | A multiple sequence alignment program |
| MallocBench       | C        | Benchmark suite                   | cfrac, espresso, gawk, gs, make, p2c, perl |
| McCat             | C        | Benchmark suite                   | Quicksort, bubblesort, eigenvalues |
| mediabench        | C        | Benchmark suite                   | adpcm, g721, gsm, jpeg, mpeg2 |
| MiBench           | C        | Embedded benchmark suite          | Automotive, consumer, office, security, telecom apps |
| nbench            | C        |                                   | BYTE Magazine's BYTEmark benchmark program |
| NPB-serial        | C        | Parallel computing                | Serial version of the NPB IS code |
| Olden             | C        | Data Structures                   | SGI version of the Olden benchmark |
| OptimizerEval     | C        | Solver                            | Preston Briggs' optimizer evaluation framework |
| PAQ8p             | C++      | Data compression                  |                      |
| Prolangs-C++      | C++      | Benchmark suite                   | city, employ, life, NP, ocean, primes, simul, vcirc |
| Prolangs-C        | C        | Benchmark suite                   | agrep, archie-client, bison, gnugo, unix-smail |
| Ptrdist           | C        | Pointer-Intensive Benchmark Suite |                      |
| Rodinia           | C        | Scientific apps                   | backprop, pathfinder, srad |
| SciMark2-C        | C        | Scientific apps                   | FFT, LU, Montecarlo, sparse matmul |
| sim               | C        | Dynamic programming               | A Time-Efficient, Linear-Space Local Similarity Algorithm |
| tramp3d-v4        | C++      | Numerical analysis                | Template-intensive numerical program based on FreePOOMA |
| Trimaran          | C        | Encryption                        | 3des, md5, crc |
| TSVC              | C        | Vectorization benchmark           | Test Suite for Vectorizing Compilers (TSVC) |
| VersaBench        | C        | Benchmark suite                   | 8b10b, beamformer, bmm, dbms, ecbdes |

All MultiSource applications are suitable for performance measurements and
still run when the `TEST_SUITE_BENCHMARKING_ONLY` CMake option is set.

Configuration
-------------

The test-suite has configuration options to customize building and running the
benchmarks. CMake can print a list of them:

```bash
% cd test-suite-build
# Print basic options:
% cmake -LH
# Print all options:
% cmake -LAH
```

### Common Configuration Options

- `CMAKE_C_FLAGS`

  Specify extra flags to be passed to C compiler invocations. The flags are
  also passed to the C++ compiler and linker invocations. See
  [https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_FLAGS.html](https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_FLAGS.html)

- `CMAKE_C_COMPILER`

  Select the C compiler executable to be used. Note that the C++ compiler is
  inferred automatically, i.e. when specifying `path/to/clang` CMake will
  automatically use `path/to/clang++` as the C++ compiler. See
  [https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_COMPILER.html](https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_COMPILER.html)

- `CMAKE_Fortran_COMPILER`

  Select the Fortran compiler executable to be used. Not set by default and not
  required unless running the Fortran Test Suite.

- `CMAKE_BUILD_TYPE`

  Select a build type like `OPTIMIZE` or `DEBUG`, which selects a set of
  predefined compiler flags. These flags are applied regardless of the
  `CMAKE_C_FLAGS` option and may be changed by modifying
  `CMAKE_C_FLAGS_OPTIMIZE` etc. See
  [https://cmake.org/cmake/help/latest/variable/CMAKE_BUILD_TYPE.html](https://cmake.org/cmake/help/latest/variable/CMAKE_BUILD_TYPE.html)

- `TEST_SUITE_FORTRAN`

  Activate the Fortran tests. This is a work in progress. More information can
  be found in the
  [Flang documentation](https://flang.llvm.org/docs/FortranLLVMTestSuite.html)

- `TEST_SUITE_RUN_UNDER`

  Prefix test invocations with the given tool. This is typically used to run
  cross-compiled tests within a simulator tool.

- `TEST_SUITE_BENCHMARKING_ONLY`

  Disable tests that are unsuitable for performance measurements. The disabled
  tests either run for a very short time or are dominated by I/O performance,
  making them unsuitable as compiler performance tests.

- `TEST_SUITE_SUBDIRS`

  Semicolon-separated list of directories to include. This can be used to only
  build parts of the test-suite or to include external suites (see the example
  after this list). This option does not work reliably with deeper
  subdirectories as it skips intermediate `CMakeLists.txt` files which may be
  required.

- `TEST_SUITE_COLLECT_STATS`

  Collect internal LLVM statistics. Appends `-save-stats=obj` when invoking the
  compiler and makes the lit runner collect and merge the statistic files.

- `TEST_SUITE_RUN_BENCHMARKS`

  If this is set to `OFF` then lit will not actually run the tests but just
  collect build statistics like compile time and code size.

- `TEST_SUITE_USE_PERF`

  Use the `perf` tool for time measurement instead of the `timeit` tool that
  comes with the test-suite. The `perf` tool is usually available on Linux
  systems.

- `TEST_SUITE_SPEC2000_ROOT`, `TEST_SUITE_SPEC2006_ROOT`, `TEST_SUITE_SPEC2017_ROOT`, ...

  Specify installation directories of external benchmark suites. You can find
  more information about expected versions or usage in the README files in the
  `External` directory (such as `External/SPEC/README`).
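
Several of these options are typically combined in a single configure
invocation. The following sketch builds only two of the top-level directories
and additionally collects LLVM statistics; the directory selection and the
compiler path are just examples:

```bash
% cmake -DCMAKE_C_COMPILER=path/to/clang \
        -DTEST_SUITE_SUBDIRS="SingleSource;MicroBenchmarks" \
        -DTEST_SUITE_COLLECT_STATS=ON \
        ../test-suite
```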

### Common CMake Flags

- `-GNinja`

  Generate build files for the ninja build tool.

- `-Ctest-suite/cmake/caches/<cachefile.cmake>`

  Use a CMake cache. The test-suite comes with several CMake caches which
  predefine common or tricky build configurations.


Displaying and Analyzing Results
--------------------------------

The `compare.py` script displays and compares result files. A result file is
produced when invoking lit with the `-o filename.json` flag.

Example usage:

- Basic Usage:

  ```text
  % test-suite/utils/compare.py baseline.json
  Warning: 'test-suite :: External/SPEC/CINT2006/403.gcc/403.gcc.test' has No metrics!
  Tests: 508
  Metric: exec_time

  Program                                        baseline

  INT2006/456.hmmer/456.hmmer                     1222.90
  INT2006/464.h264ref/464.h264ref                  928.70
  ...
             baseline
  count    506.000000
  mean      20.563098
  std      111.423325
  min        0.003400
  25%        0.011200
  50%        0.339450
  75%        4.067200
  max     1222.896800
  ```

- Show compile_time or text segment size metrics:

  ```bash
  % test-suite/utils/compare.py -m compile_time baseline.json
  % test-suite/utils/compare.py -m size.__text baseline.json
  ```

- Compare two result files and filter short-running tests:

  ```bash
  % test-suite/utils/compare.py --filter-short baseline.json experiment.json
  ...
  Program                                        baseline  experiment  diff

  SingleSour.../Benchmarks/Linpack/linpack-pc    5.16      4.30        -16.5%
  MultiSourc...erolling-dbl/LoopRerolling-dbl    7.01      7.86         12.2%
  SingleSour...UnitTests/Vectorizer/gcc-loops    3.89      3.54         -9.0%
  ...
  ```

- Merge multiple baseline and experiment result files by taking the minimum
  runtime of each benchmark:

  ```bash
  % test-suite/utils/compare.py base0.json base1.json base2.json vs exp0.json exp1.json exp2.json
  ```

### Continuous Tracking with LNT

LNT is a set of client and server tools for continuously monitoring
performance. You can find more information at
[https://llvm.org/docs/lnt](https://llvm.org/docs/lnt). The official LNT instance
of the LLVM project is hosted at [http://lnt.llvm.org](http://lnt.llvm.org).


External Suites
---------------

External suites such as SPEC can be enabled by either

- placing (or linking) them into the `test-suite/test-suite-externals/xxx`
  directory (example: `test-suite/test-suite-externals/speccpu2000`)
- using a configuration option such as
  `-D TEST_SUITE_SPEC2000_ROOT=path/to/speccpu2000`

You can find further information in the respective README files such as
`test-suite/External/SPEC/README`.

For the SPEC benchmarks you can switch between the `test`, `train` and
`ref` input datasets via the `TEST_SUITE_RUN_TYPE` configuration option.
The `train` dataset is used by default.
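
For example, a SPEC CPU 2000 installation can be hooked up in either of the
two ways described above; the installation path below is a placeholder, and
`TEST_SUITE_RUN_TYPE` selects the input dataset:

```bash
# Either link the installation into the externals directory
# (create the directory first if it does not exist yet) ...
% mkdir -p test-suite/test-suite-externals
% ln -s /path/to/speccpu2000 test-suite/test-suite-externals/speccpu2000
# ... or point the configuration option at it and pick the input dataset:
% cmake -DTEST_SUITE_SPEC2000_ROOT=/path/to/speccpu2000 \
        -DTEST_SUITE_RUN_TYPE=ref \
        ../test-suite
```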

In addition to SPEC, the multimedia frameworks ffmpeg and dav1d can also be
hooked up as external projects in the same way. Including them in
llvm-test-suite compiles a lot of additional, potentially vectorizable code,
which can catch compiler bugs merely by triggering code generation asserts.
It also adds small correctness tests that compare the output of
compiler-generated functions against handwritten assembly functions. (On x86,
building the assembly requires the nasm tool.) The integration into
llvm-test-suite does not run the projects' full test suites, though. The
projects also contain microbenchmarks for measuring the performance of some
functions. See the `README.md` files in the respective `ffmpeg` and `dav1d`
directories under `llvm-test-suite/External` for further details.


Custom Suites
-------------

You can build custom suites using the test-suite infrastructure. A custom suite
has a `CMakeLists.txt` file at the top directory. The `CMakeLists.txt` will be
picked up automatically if placed into a subdirectory of the test-suite or when
setting the `TEST_SUITE_SUBDIRS` variable:

```bash
% cmake -DTEST_SUITE_SUBDIRS=path/to/my/benchmark-suite ../test-suite
```


Profile Guided Optimization
---------------------------

Profile guided optimization requires compiling and running twice: first the
benchmarks are compiled with profile generation instrumentation enabled and
set up to use training data. The lit runner will merge the profile files
using `llvm-profdata` so they can be used by the second compilation run.

Example:
```bash
# Profile generation run using LLVM IR PGO:
% cmake -DTEST_SUITE_PROFILE_GENERATE=ON \
        -DTEST_SUITE_USE_IR_PGO=ON \
        -DTEST_SUITE_RUN_TYPE=train \
        ../test-suite
% make
% llvm-lit .
# Use the profile data for compilation and actual benchmark run:
% cmake -DTEST_SUITE_PROFILE_GENERATE=OFF \
        -DTEST_SUITE_PROFILE_USE=ON \
        -DTEST_SUITE_RUN_TYPE=ref \
        .
% make
% llvm-lit -o result.json .
```

To use Clang frontend's PGO instead of LLVM IR PGO, set `-DTEST_SUITE_USE_IR_PGO=OFF`.

The `TEST_SUITE_RUN_TYPE` setting only affects the SPEC benchmark suites.


Cross Compilation and External Devices
--------------------------------------

### Compilation

CMake allows cross compiling to a different target via toolchain files. More
information can be found here:

- [https://llvm.org/docs/lnt/tests.html#cross-compiling](https://llvm.org/docs/lnt/tests.html#cross-compiling)

- [https://cmake.org/cmake/help/latest/manual/cmake-toolchains.7.html](https://cmake.org/cmake/help/latest/manual/cmake-toolchains.7.html)

Cross compilation from macOS to iOS is possible with the
`test-suite/cmake/caches/target-*-iphoneos-internal.cmake` CMake cache
files; this requires an internal iOS SDK.
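
As a sketch, a cross build for AArch64 Linux could be configured as follows;
the toolchain file and the qemu sysroot path are placeholders for your own
setup, and `TEST_SUITE_RUN_UNDER` (described in the configuration options
above and in the next subsection) lets the benchmarks execute under user-mode
emulation on the build host:

```bash
% cmake -G Ninja \
        -DCMAKE_TOOLCHAIN_FILE=path/to/aarch64-linux-gnu.cmake \
        -DTEST_SUITE_RUN_UNDER="qemu-aarch64 -L /usr/aarch64-linux-gnu" \
        -C ../test-suite/cmake/caches/O3.cmake \
        ../test-suite
% ninja
% llvm-lit -j1 -o results.json .
```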

### Running

There are two ways to run the tests in a cross compilation setting:

- Via SSH connection to an external device: The `TEST_SUITE_REMOTE_HOST` option
  should be set to the SSH hostname. The executables and data files need to be
  transferred to the device after compilation. This is typically done via the
  `rsync` make target. After this, the lit runner can be used on the host
  machine. It will prefix the benchmark and verification command lines with an
  `ssh` command.

  Example:

  ```bash
  % cmake -G Ninja -D CMAKE_C_COMPILER=path/to/clang \
          -C ../test-suite/cmake/caches/target-arm64-iphoneos-internal.cmake \
          -D CMAKE_BUILD_TYPE=Release \
          -D TEST_SUITE_REMOTE_HOST=mydevice \
          ../test-suite
  % ninja
  % ninja rsync
  % llvm-lit -j1 -o result.json .
  ```

- You can specify a simulator for the target machine with the
  `TEST_SUITE_RUN_UNDER` setting. The lit runner will prefix all benchmark
  invocations with it.


Running the test-suite via LNT
------------------------------

The LNT tool can run the test-suite. Use this when submitting test results to
an LNT instance. See
[https://llvm.org/docs/lnt/tests.html#llvm-cmake-test-suite](https://llvm.org/docs/lnt/tests.html#llvm-cmake-test-suite)
for details.

Running the test-suite via Makefiles (deprecated)
-------------------------------------------------

**Note**: The test-suite comes with a set of Makefiles that are considered
deprecated. They do not support newer testing modes like `Bitcode` or
`Microbenchmarks` and are harder to use.

Old documentation is available in the
[test-suite Makefile Guide](TestSuiteMakefileGuide).