=============================
Advanced Build Configurations
=============================

.. contents::
   :local:

Introduction
============

`CMake <http://www.cmake.org/>`_ is a cross-platform build-generator tool. CMake
does not build the project; it generates the files needed by your build tool
(GNU make, Visual Studio, etc.) for building LLVM.

If **you are a new contributor**, please start with the :doc:`GettingStarted` or
:doc:`CMake` pages. This page is intended for users doing more complex builds.

Many of the examples below are written assuming specific CMake Generators.
Unless otherwise explicitly called out, these commands should work with any CMake
generator.

Many of the build configurations mentioned on this documentation page can be
set up using a CMake cache. A CMake cache is essentially a configuration
file that sets the necessary flags for a specific build configuration. The caches
for Clang are located in :code:`/clang/cmake/caches` within the monorepo. They
can be passed to CMake using the :code:`-C` flag, as demonstrated in the examples
below, along with additional configuration flags.

Bootstrap Builds
================

The Clang CMake build system supports bootstrap (aka multi-stage) builds. At a
high level, a multi-stage build is a chain of builds that pass data from one
stage into the next. The most common and simple version of this is a traditional
bootstrap build.

In a simple two-stage bootstrap build, we build clang using the system compiler,
then use that just-built clang to build clang again. In CMake this simplest form
of a bootstrap build can be configured with a single option,
CLANG_ENABLE_BOOTSTRAP.

.. code-block:: console

  $ cmake -G Ninja -DCMAKE_BUILD_TYPE=Release -DCLANG_ENABLE_BOOTSTRAP=On <path to source>
  $ ninja stage2

This command itself isn't terribly useful because it assumes default
configurations for each stage.
The next series of examples use CMake cache
scripts to provide more complex options.

By default, only a few CMake options will be passed between stages.
The list, called _BOOTSTRAP_DEFAULT_PASSTHROUGH, is defined in clang/CMakeLists.txt.
To force the passing of variables between stages, use the -DCLANG_BOOTSTRAP_PASSTHROUGH
CMake option, with each variable separated by a ";". For example:

.. code-block:: console

  $ cmake -G Ninja -DCMAKE_BUILD_TYPE=Release -DCLANG_ENABLE_BOOTSTRAP=On -DCLANG_BOOTSTRAP_PASSTHROUGH="CMAKE_INSTALL_PREFIX;CMAKE_VERBOSE_MAKEFILE" <path to source>
  $ ninja stage2

CMake options starting with ``BOOTSTRAP_`` will be passed only to the stage2 build.
This gives the opportunity to use Clang-specific build flags.
For example, the following CMake call will enable '-fno-addrsig' only during
the stage2 build for C and C++.

.. code-block:: console

  $ cmake [..] -DBOOTSTRAP_CMAKE_CXX_FLAGS='-fno-addrsig' -DBOOTSTRAP_CMAKE_C_FLAGS='-fno-addrsig' [..]

The clang build system refers to builds as stages. A stage1 build is a standard
build using the compiler installed on the host, and a stage2 build is built
using the stage1 compiler. This nomenclature holds up to more stages too. In
general a stage*n* build is built using the output from stage*n-1*.

Apple Clang Builds (A More Complex Bootstrap)
=============================================

Apple's Clang builds are a slightly more complicated example of the simple
bootstrapping scenario. Apple Clang is built using a 2-stage build.

The stage1 compiler is a host-only compiler with some options set. The stage1
compiler is a balance of optimization vs build time because it is a throwaway.
The stage2 compiler is the fully optimized compiler intended to ship to users.

Setting up these compilers requires a lot of options. To simplify the
configuration, the Apple Clang build settings are contained in CMake Cache files.
You can build an Apple Clang compiler using the following commands:

.. code-block:: console

  $ cmake -G Ninja -C <path to source>/clang/cmake/caches/Apple-stage1.cmake <path to source>
  $ ninja stage2-distribution

This CMake invocation configures the stage1 host compiler, and sets
CLANG_BOOTSTRAP_CMAKE_ARGS to pass the Apple-stage2.cmake cache script to the
stage2 configuration step.

When you build the stage2-distribution target, it builds the minimal stage1
compiler and required tools, then configures and builds the stage2 compiler
based on the settings in Apple-stage2.cmake.

This pattern of using cache scripts to set complex settings, and specifically to
make later stage builds include cache scripts, is common in our more advanced
build configurations.

Multi-stage PGO
===============

Profile-Guided Optimization (PGO) is an effective way to optimize the code
clang generates. Our multi-stage PGO builds are a workflow for generating PGO
profiles that can be used to optimize clang.

At a high level, the way PGO works is that you build an instrumented compiler,
then you run the instrumented compiler against sample source files. While the
instrumented compiler runs, it outputs a set of files containing
performance counters (.profraw files). After generating all the profraw files,
you use llvm-profdata to merge the files into a single profdata file that you
can feed into the LLVM_PROFDATA_FILE option.

Our PGO.cmake cache automates that whole process. You can configure a build
with it using the following command:

.. code-block:: console

  $ cmake -G Ninja -C <path to source>/clang/cmake/caches/PGO.cmake \
      <path to source>/llvm

There are several additional options that the cache file also accepts to modify
the build, particularly the PGO_INSTRUMENT_LTO option.
Setting this option to
Thin or Full will enable ThinLTO or full LTO respectively, further enhancing
the performance gains from a PGO build by enabling interprocedural
optimizations. For example, to run a CMake configuration for a PGO build
that also enables ThinLTO, use the following command:

.. code-block:: console

  $ cmake -G Ninja -C <path to source>/clang/cmake/caches/PGO.cmake \
      -DPGO_INSTRUMENT_LTO=Thin \
      <path to source>/llvm

After configuration, building the stage2-instrumented-generate-profdata target
will automatically build the stage1 compiler, build the instrumented compiler
with the stage1 compiler, and then run the instrumented compiler against the
perf training data:

.. code-block:: console

  $ ninja stage2-instrumented-generate-profdata

This can take a few hours, because it builds clang twice. When it finishes, it
places a profdata file in your build directory. You *must* have compiler-rt in
your build tree.

This process uses any source files under the perf-training directory as training
data as long as the source files are marked up with LIT-style RUN lines.

After it finishes you can use :code:`find . -name clang.profdata` to find the
file, but it should be at a path something like:

.. code-block:: console

  <build dir>/tools/clang/stage2-instrumented-bins/utils/perf-training/clang.profdata

You can feed that file into the LLVM_PROFDATA_FILE option when you build your
optimized compiler.

It may be necessary to build additional targets before running perf training, such as
builtins and runtime libraries. You can use the :code:`CLANG_PERF_TRAINING_DEPS` CMake
variable for that purpose:

.. code-block:: cmake

  set(CLANG_PERF_TRAINING_DEPS builtins runtimes CACHE STRING "")

The PGO cache has a slightly different stage naming scheme than other
multi-stage builds.
It generates three stages: stage1, stage2-instrumented, and
stage2. Both of the stage2 builds are built using the stage1 compiler.

The PGO cache generates the following additional targets:

**stage2-instrumented**
  Builds a stage1 compiler, runtime, and required tools (llvm-config,
  llvm-profdata) then uses that compiler to build an instrumented stage2 compiler.

**stage2-instrumented-generate-profdata**
  Depends on stage2-instrumented and will use the instrumented compiler to
  generate profdata based on the training files in clang/utils/perf-training

**stage2**
  Depends on stage2-instrumented-generate-profdata and will use the stage1
  compiler with the stage2 profdata to build a PGO-optimized compiler.

**stage2-check-llvm**
  Depends on stage2 and runs check-llvm using the stage2 compiler.

**stage2-check-clang**
  Depends on stage2 and runs check-clang using the stage2 compiler.

**stage2-check-all**
  Depends on stage2 and runs check-all using the stage2 compiler.

**stage2-test-suite**
  Depends on stage2 and runs the test-suite using the stage2 compiler (requires
  in-tree test-suite).

BOLT
====

`BOLT <https://github.com/llvm/llvm-project/blob/main/bolt/README.md>`_
(Binary Optimization and Layout Tool) is a tool that optimizes binaries
post-link by profiling them at runtime and then using that information to
optimize the layout of the final binary among other optimizations performed
at the binary level. There are also CMake caches available to build
LLVM/Clang with BOLT.

To configure a single-stage build that builds LLVM/Clang and then optimizes
it with BOLT, use the following CMake configuration:

.. code-block:: console

  $ cmake <path to source>/llvm -C <path to source>/clang/cmake/caches/BOLT.cmake

Then, build the BOLT-optimized binary by running the following ninja command:

.. code-block:: console

  $ ninja clang++-bolt

If you're seeing errors in the build process, try building with a recent
version of Clang/LLVM by setting the CMAKE_C_COMPILER and
CMAKE_CXX_COMPILER flags to the appropriate values.

It is also possible to use BOLT on top of PGO and (Thin)LTO for an even more
significant runtime speedup. To configure a three stage PGO build with ThinLTO
that optimizes the resulting binary with BOLT, use the following CMake
configuration command:

.. code-block:: console

  $ cmake -G Ninja <path to source>/llvm \
      -C <path to source>/clang/cmake/caches/BOLT-PGO.cmake \
      -DBOOTSTRAP_LLVM_ENABLE_LLD=ON \
      -DBOOTSTRAP_BOOTSTRAP_LLVM_ENABLE_LLD=ON \
      -DPGO_INSTRUMENT_LTO=Thin

Then, to build the final optimized binary, build the stage2-clang++-bolt
target:

.. code-block:: console

  $ ninja stage2-clang++-bolt

3-Stage Non-Determinism
=======================

In the ancient lore of compilers non-determinism is like the multi-headed hydra.
Whenever its head pops up, terror and chaos ensue.

Historically one of the tests to verify that a compiler was deterministic would
be a three stage build. The idea of a three stage build is you take your sources
and build a compiler (stage1), then use that compiler to rebuild the sources
(stage2), then you use that compiler to rebuild the sources a third time
(stage3) with an identical configuration to the stage2 build. At the end of
this, you have a stage2 and stage3 compiler that should be bit-for-bit
identical.

You can perform one of these 3-stage builds with LLVM & clang using the
following commands:

.. code-block:: console

  $ cmake -G Ninja -C <path to source>/clang/cmake/caches/3-stage.cmake <path to source>
  $ cmake --build . --target stage3 --parallel

After the build you can compare the stage2 & stage3 compilers.
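
One way to do that comparison is a byte-for-byte check of the two ``clang``
binaries. The following is a minimal sketch, assuming a POSIX shell with
``cmp`` available. The helper name ``same_binary`` is hypothetical and is not
part of the LLVM build system, and the two arguments are whatever paths your
stage2 and stage3 build trees contain.

.. code-block:: bash

  # Hypothetical helper (not part of LLVM): report whether two build
  # outputs are byte-for-byte identical.
  #   $1 = path to the stage2 binary, $2 = path to the stage3 binary
  same_binary() {
    # cmp -s prints nothing and exits 0 only when the files match exactly.
    if cmp -s "$1" "$2"; then
      echo "identical"
    else
      echo "different"
      return 1
    fi
  }

Calling ``same_binary`` on the stage2 and stage3 ``clang`` binaries prints
``identical`` only for an exact match. The locations of the two binaries
depend on your build directory layout, so they are left as arguments here.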