=============================
Advanced Build Configurations
=============================

.. contents::
   :local:

Introduction
============

`CMake <http://www.cmake.org/>`_ is a cross-platform build-generator tool. CMake
does not build the project; it generates the files needed by your build tool
(GNU make, Visual Studio, etc.) for building LLVM.

If **you are a new contributor**, please start with the :doc:`GettingStarted` or
:doc:`CMake` pages. This page is intended for users doing more complex builds.

Many of the examples below are written assuming specific CMake Generators.
Unless otherwise explicitly called out, these commands should work with any CMake
generator.

Many of the build configurations mentioned on this documentation page can be
set up using a CMake cache. A CMake cache is essentially a configuration
file that sets the necessary flags for a specific build configuration. The caches
for Clang are located in :code:`/clang/cmake/caches` within the monorepo. They
can be passed to CMake using the :code:`-C` flag as demonstrated in the examples
below along with additional configuration flags.

Bootstrap Builds
================

The Clang CMake build system supports bootstrap (aka multi-stage) builds. At a
high level a multi-stage build is a chain of builds that pass data from one
stage into the next. The most common and simple version of this is a traditional
bootstrap build.

In a simple two-stage bootstrap build, we build clang using the system compiler,
then use that just-built clang to build clang again. In CMake this simplest form
of a bootstrap build can be configured with a single option,
CLANG_ENABLE_BOOTSTRAP.

.. code-block:: console

  $ cmake -G Ninja -DCMAKE_BUILD_TYPE=Release \
      -DCLANG_ENABLE_BOOTSTRAP=On \
      -DLLVM_ENABLE_PROJECTS="clang" \
      <path to source>/llvm
  $ ninja stage2

This command itself isn't terribly useful because it assumes default
configurations for each stage. The next series of examples uses CMake cache
scripts to provide more complex options.

By default, only a few CMake options will be passed between stages.
The list, called _BOOTSTRAP_DEFAULT_PASSTHROUGH, is defined in clang/CMakeLists.txt.
To pass additional variables between stages, use the -DCLANG_BOOTSTRAP_PASSTHROUGH
CMake option, with the variables separated by a ";". For example:

.. code-block:: console

  $ cmake -G Ninja -DCMAKE_BUILD_TYPE=Release \
      -DCLANG_ENABLE_BOOTSTRAP=On \
      -DCLANG_BOOTSTRAP_PASSTHROUGH="CMAKE_INSTALL_PREFIX;CMAKE_VERBOSE_MAKEFILE" \
      -DLLVM_ENABLE_PROJECTS="clang" \
      <path to source>/llvm
  $ ninja stage2

CMake options starting with ``BOOTSTRAP_`` will be passed only to the stage2 build.
This gives the opportunity to use Clang-specific build flags.
For example, the following CMake call will enable '-fno-addrsig' only during
the stage2 build for C and C++.

.. code-block:: console

  $ cmake [..] -DBOOTSTRAP_CMAKE_CXX_FLAGS='-fno-addrsig' -DBOOTSTRAP_CMAKE_C_FLAGS='-fno-addrsig' [..]

The clang build system refers to builds as stages. A stage1 build is a standard
build using the compiler installed on the host, and a stage2 build is built
using the stage1 compiler. This nomenclature holds up to more stages too. In
general a stage\ *n* build is built using the output from stage\ *n-1*.

Apple Clang Builds (A More Complex Bootstrap)
=============================================

Apple's Clang builds are a slightly more complicated example of the simple
bootstrapping scenario. Apple Clang is built using a 2-stage build.

The stage1 compiler is a host-only compiler with some options set. The stage1
compiler is a balance of optimization vs build time because it is a throwaway.
The stage2 compiler is the fully optimized compiler intended to ship to users.

Setting up these compilers requires a lot of options. To simplify the
configuration, the Apple Clang build settings are contained in CMake Cache files.
You can build an Apple Clang compiler using the following commands:

.. code-block:: console

  $ cmake -G Ninja -C <path to source>/clang/cmake/caches/Apple-stage1.cmake <path to source>/llvm
  $ ninja stage2-distribution

This CMake invocation configures the stage1 host compiler, and sets
CLANG_BOOTSTRAP_CMAKE_ARGS to pass the Apple-stage2.cmake cache script to the
stage2 configuration step.

When you build the stage2-distribution target it builds the minimal stage1
compiler and required tools, then configures and builds the stage2 compiler
based on the settings in Apple-stage2.cmake.

This pattern of using cache scripts to set complex settings, and specifically to
make later stage builds include cache scripts, is common in our more advanced
build configurations.

Multi-stage PGO
===============

Profile-Guided Optimization (PGO) is a really great way to optimize the code
clang generates. Our multi-stage PGO builds are a workflow for generating PGO
profiles that can be used to optimize clang.

At a high level, the way PGO works is that you build an instrumented compiler,
then you run the instrumented compiler against sample source files. While the
instrumented compiler runs it will output a bunch of files containing
performance counters (.profraw files). After generating all the profraw files
you use llvm-profdata to merge the files into a single profdata file that you
can feed into the LLVM_PROFDATA_FILE option.

Our PGO.cmake cache automates that whole process.
You can use it to
configure the build with the following command:

.. code-block:: console

  $ cmake -G Ninja -C <path to source>/clang/cmake/caches/PGO.cmake \
      <path to source>/llvm

There are several additional options that the cache file also accepts to modify
the build, particularly the PGO_INSTRUMENT_LTO option. Setting this option to
Thin or Full will enable ThinLTO or full LTO respectively, further enhancing
the performance gains from a PGO build by enabling interprocedural
optimizations. For example, to run a CMake configuration for a PGO build
that also enables ThinLTO, use the following command:

.. code-block:: console

  $ cmake -G Ninja -C <path to source>/clang/cmake/caches/PGO.cmake \
      -DPGO_INSTRUMENT_LTO=Thin \
      <path to source>/llvm

By default, clang will generate profile data by compiling a simple
hello world program. You can also tell clang to use an external
project for generating profile data that may be a better fit for your
use case. The project you specify must either be a lit test suite
(use the CLANG_PGO_TRAINING_DATA option) or a CMake project (use the
CLANG_PGO_TRAINING_DATA_SOURCE_DIR option).

For example, if you wanted to use the
`LLVM Test Suite <https://github.com/llvm/llvm-test-suite/>`_ to generate
profile data you would use the following command:

.. code-block:: console

  $ cmake -G Ninja -C <path to source>/clang/cmake/caches/PGO.cmake \
      -DBOOTSTRAP_CLANG_PGO_TRAINING_DATA_SOURCE_DIR=<path to llvm-test-suite> \
      -DBOOTSTRAP_CLANG_PGO_TRAINING_DEPS=runtimes \
      <path to source>/llvm

The ``BOOTSTRAP_`` prefix tells CMake to pass the variables on to the instrumented
stage two build. The CLANG_PGO_TRAINING_DEPS option lets you specify
additional build targets to build before building the external project. The
LLVM Test Suite requires compiler-rt to build, so we need to add the
``runtimes`` target as a dependency.
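
The lit-based option mentioned above can presumably be forwarded in the same
way. The following is only a sketch: it assumes that CLANG_PGO_TRAINING_DATA
is passed to the instrumented stage with the ``BOOTSTRAP_`` prefix just like
the variables in the previous command, and ``<path to lit test suite>`` is a
placeholder for a training directory of your own.

.. code-block:: console

  $ cmake -G Ninja -C <path to source>/clang/cmake/caches/PGO.cmake \
      -DBOOTSTRAP_CLANG_PGO_TRAINING_DATA=<path to lit test suite> \
      <path to source>/llvm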

After configuration, building the stage2-instrumented-generate-profdata target
will automatically build the stage1 compiler, build the instrumented compiler
with the stage1 compiler, and then run the instrumented compiler against the
perf training data:

.. code-block:: console

  $ ninja stage2-instrumented-generate-profdata

If you let that run for a few hours or so, it will place a profdata file in your
build directory. This takes a really long time because it builds clang twice,
and you *must* have compiler-rt in your build tree.

This process uses any source files under the perf-training directory as training
data as long as the source files are marked up with LIT-style RUN lines.

After it finishes, you can use :code:`find . -name clang.profdata` to find the
file; it should be at a path something like:

.. code-block:: console

  <build dir>/tools/clang/stage2-instrumented-bins/utils/perf-training/clang.profdata

You can feed that file into the LLVM_PROFDATA_FILE option when you build your
optimized compiler.

It may be necessary to build additional targets before running perf training, such as
builtins and runtime libraries. You can use the :code:`CLANG_PGO_TRAINING_DEPS` CMake
variable for that purpose:

.. code-block:: cmake

  set(CLANG_PGO_TRAINING_DEPS builtins runtimes CACHE STRING "")

The PGO cache has a slightly different stage naming scheme than other
multi-stage builds. It generates three stages: stage1, stage2-instrumented, and
stage2. Both of the stage2 builds are built using the stage1 compiler.

The PGO cache generates the following additional targets:

**stage2-instrumented**
  Builds a stage1 compiler, runtime, and required tools (llvm-config,
  llvm-profdata), then uses that compiler to build an instrumented stage2 compiler.

**stage2-instrumented-generate-profdata**
  Depends on stage2-instrumented and will use the instrumented compiler to
  generate profdata based on the training files in clang/utils/perf-training.

**stage2**
  Depends on stage2-instrumented-generate-profdata and will use the stage1
  compiler with the stage2 profdata to build a PGO-optimized compiler.

**stage2-check-llvm**
  Depends on stage2 and runs check-llvm using the stage2 compiler.

**stage2-check-clang**
  Depends on stage2 and runs check-clang using the stage2 compiler.

**stage2-check-all**
  Depends on stage2 and runs check-all using the stage2 compiler.

**stage2-test-suite**
  Depends on stage2 and runs the test-suite using the stage2 compiler (requires
  in-tree test-suite).

BOLT
====

`BOLT <https://github.com/llvm/llvm-project/blob/main/bolt/README.md>`_
(Binary Optimization and Layout Tool) is a tool that optimizes binaries
post-link by profiling them at runtime and then using that information to
optimize the layout of the final binary, among other optimizations performed
at the binary level. There are also CMake caches available to build
LLVM/Clang with BOLT.

To configure a single-stage build that builds LLVM/Clang and then optimizes
it with BOLT, use the following CMake configuration:

.. code-block:: console

  $ cmake -G Ninja <path to source>/llvm -C <path to source>/clang/cmake/caches/BOLT.cmake

Then, build the BOLT-optimized binary by running the following ninja command:

.. code-block:: console

  $ ninja clang-bolt

If you're seeing errors in the build process, try building with a recent
version of Clang/LLVM by setting the CMAKE_C_COMPILER and
CMAKE_CXX_COMPILER options to the appropriate values.

It is also possible to use BOLT on top of PGO and (Thin)LTO for an even more
significant runtime speedup.
To configure a three-stage PGO build with ThinLTO
that optimizes the resulting binary with BOLT, use the following CMake
configuration command:

.. code-block:: console

  $ cmake -G Ninja <path to source>/llvm \
      -C <path to source>/clang/cmake/caches/BOLT-PGO.cmake \
      -DBOOTSTRAP_LLVM_ENABLE_LLD=ON \
      -DBOOTSTRAP_BOOTSTRAP_LLVM_ENABLE_LLD=ON \
      -DPGO_INSTRUMENT_LTO=Thin

Then, to build the final optimized binary, build the stage2-clang-bolt target:

.. code-block:: console

  $ ninja stage2-clang-bolt

3-Stage Non-Determinism
=======================

In the ancient lore of compilers non-determinism is like the multi-headed hydra.
Whenever its head pops up, terror and chaos ensue.

Historically one of the tests to verify that a compiler was deterministic would
be a three-stage build. The idea of a three-stage build is you take your sources
and build a compiler (stage1), then use that compiler to rebuild the sources
(stage2), then you use that compiler to rebuild the sources a third time
(stage3) with an identical configuration to the stage2 build. At the end of
this, you have a stage2 and stage3 compiler that should be bit-for-bit
identical.

You can perform one of these 3-stage builds with LLVM and clang using the
following commands:

.. code-block:: console

  $ cmake -G Ninja -C <path to source>/clang/cmake/caches/3-stage.cmake <path to source>/llvm
  $ ninja stage3

After the build you can compare the stage2 and stage3 compilers.
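
One way to make that comparison is a byte-for-byte check of the two compiler
binaries. The following is a minimal sketch and not part of the LLVM build
system: ``compare_stages`` is a hypothetical helper name, and the two paths you
pass to it depend on where your build tree placed the stage2 and stage3
outputs.

.. code-block:: bash

  # compare_stages: hypothetical helper, not provided by LLVM.
  # Prints "identical" if the two files have the same bytes; otherwise
  # prints "different" (also when a file cannot be read) and returns 1.
  compare_stages() {
    if cmp -s "$1" "$2"; then
      echo "identical"
    else
      echo "different"
      return 1
    fi
  }

For example, ``compare_stages <stage2 build dir>/bin/clang <stage3 build dir>/bin/clang``,
where both directories are placeholders for the locations in your own build
tree.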