==========================
Source-based Code Coverage
==========================

.. contents::
   :local:

Introduction
============

This document explains how to use clang's source-based code coverage feature.
It's called "source-based" because it operates on AST and preprocessor
information directly. This allows it to generate very precise coverage data.

Clang ships two other code coverage implementations:

* :doc:`SanitizerCoverage` - A low-overhead tool meant for use alongside the
  various sanitizers. It can provide up to edge-level coverage.

* gcov - A GCC-compatible coverage implementation which operates on DebugInfo.
  This is enabled by ``-ftest-coverage`` or ``--coverage``.

From this point onwards "code coverage" will refer to the source-based kind.

The code coverage workflow
==========================

The code coverage workflow consists of three main steps:

* Compiling with coverage enabled.

* Running the instrumented program.

* Creating coverage reports.

The next few sections work through a complete, copy-'n-paste friendly example
based on this program:

.. code-block:: cpp

    % cat <<EOF > foo.cc
    #define BAR(x) ((x) || (x))
    template <typename T> void foo(T x) {
      for (unsigned I = 0; I < 10; ++I) { BAR(I); }
    }
    int main() {
      foo<int>(0);
      foo<float>(0);
      return 0;
    }
    EOF

Compiling with coverage enabled
===============================

To compile code with coverage enabled, pass ``-fprofile-instr-generate
-fcoverage-mapping`` to the compiler:

.. code-block:: console

    # Step 1: Compile with coverage enabled.
    % clang++ -fprofile-instr-generate -fcoverage-mapping foo.cc -o foo

Note that linking together code with and without coverage instrumentation is
supported. Uninstrumented code simply won't be accounted for in reports.


Running the instrumented program
================================

The next step is to run the instrumented program. When the program exits it
will write a **raw profile** to the path specified by the ``LLVM_PROFILE_FILE``
environment variable. If that variable is not set, the profile is written
to ``default.profraw`` in the current directory of the program. If
``LLVM_PROFILE_FILE`` contains a path to a non-existent directory, the missing
directory structure will be created. Additionally, the following special
**pattern strings** are rewritten:

* "%p" expands out to the process ID.

* "%h" expands out to the hostname of the machine running the program.

* "%Nm" expands out to the instrumented binary's signature. When this pattern
  is specified, the runtime creates a pool of N raw profiles which are used for
  on-line profile merging. The runtime takes care of selecting a raw profile
  from the pool, locking it, and updating it before the program exits. If N is
  not specified (i.e. the pattern is "%m"), it's assumed that ``N = 1``. N must
  be between 1 and 9. The merge pool specifier can only occur once per filename
  pattern.

.. code-block:: console

    # Step 2: Run the program.
    % LLVM_PROFILE_FILE="foo.profraw" ./foo

Creating coverage reports
=========================

Raw profiles have to be **indexed** before they can be used to generate
coverage reports. This is done using the "merge" tool in ``llvm-profdata``
(which can combine multiple raw profiles and index them at the same time):

.. code-block:: console

    # Step 3(a): Index the raw profile.
    % llvm-profdata merge -sparse foo.profraw -o foo.profdata

There are multiple different ways to render coverage reports. The simplest
option is to generate a line-oriented report:

.. code-block:: console

    # Step 3(b): Create a line-oriented coverage report.
    % llvm-cov show ./foo -instr-profile=foo.profdata

This report includes a summary view as well as dedicated sub-views for
templated functions and their instantiations. For our example program, we get
distinct views for ``foo<int>(...)`` and ``foo<float>(...)``. If
``-show-line-counts-or-regions`` is enabled, ``llvm-cov`` displays sub-line
region counts (even in macro expansions):

.. code-block:: none

        1|   20|#define BAR(x) ((x) || (x))
                               ^20     ^2
        2|    2|template <typename T> void foo(T x) {
        3|   22|  for (unsigned I = 0; I < 10; ++I) { BAR(I); }
                                       ^22     ^20  ^20^20
        4|    2|}
    ------------------
    | void foo<int>(int):
    |     2|    1|template <typename T> void foo(T x) {
    |     3|   11|  for (unsigned I = 0; I < 10; ++I) { BAR(I); }
    |                                    ^11     ^10  ^10^10
    |     4|    1|}
    ------------------
    | void foo<float>(float):
    |     2|    1|template <typename T> void foo(T x) {
    |     3|   11|  for (unsigned I = 0; I < 10; ++I) { BAR(I); }
    |                                    ^11     ^10  ^10^10
    |     4|    1|}
    ------------------

To generate a file-level summary of coverage statistics instead of a
line-oriented report, try:

.. code-block:: console

    # Step 3(c): Create a coverage summary.
    % llvm-cov report ./foo -instr-profile=foo.profdata
    Filename               Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover
    --------------------------------------------------------------------------------------------------------------------------------------
    /tmp/foo.cc                 13                 0   100.00%           3                 0   100.00%          13                 0   100.00%
    --------------------------------------------------------------------------------------------------------------------------------------
    TOTAL                       13                 0   100.00%           3                 0   100.00%          13                 0   100.00%

The ``llvm-cov`` tool supports specifying a custom demangler, writing out
reports in a directory structure, and generating html reports.
For the full
list of options, please refer to the `command guide
<https://llvm.org/docs/CommandGuide/llvm-cov.html>`_.

A few final notes:

* The ``-sparse`` flag is optional but can result in dramatically smaller
  indexed profiles. This option should not be used if the indexed profile will
  be reused for PGO.

* Raw profiles can be discarded after they are indexed. Advanced use of the
  profile runtime library allows an instrumented program to merge profiling
  information directly into an existing raw profile on disk. The details are
  out of scope.

* The ``llvm-profdata`` tool can be used to merge together multiple raw or
  indexed profiles. To combine profiling data from multiple runs of a program,
  try, for example:

  .. code-block:: console

      % llvm-profdata merge -sparse foo1.profraw foo2.profdata -o foo3.profdata

Exporting coverage data
=======================

Coverage data can be exported into JSON using the ``llvm-cov export``
sub-command. There is a comprehensive reference which defines the structure of
the exported data at a high level in the llvm-cov source code.

Interpreting reports
====================

There are four statistics tracked in a coverage summary:

* Function coverage is the percentage of functions which have been executed at
  least once. A function is considered to be executed if any of its
  instantiations are executed.

* Instantiation coverage is the percentage of function instantiations which
  have been executed at least once. Template functions and static inline
  functions from headers are two kinds of functions which may have multiple
  instantiations.

* Line coverage is the percentage of code lines which have been executed at
  least once. Only executable lines within function bodies are considered to be
  code lines.


* Region coverage is the percentage of code regions which have been executed at
  least once. A code region may span multiple lines (e.g. in a large function
  body with no control flow). However, it's also possible for a single line to
  contain multiple code regions (e.g. in "return x || y && z").

Of these four statistics, function coverage is usually the least granular while
region coverage is the most granular. The project-wide totals for each
statistic are listed in the summary.

Format compatibility guarantees
===============================

* There are no backwards or forwards compatibility guarantees for the raw
  profile format. Raw profiles may be dependent on the specific compiler
  revision used to generate them. It's inadvisable to store raw profiles for
  long periods of time.

* Tools must retain **backwards** compatibility with indexed profile formats.
  These formats are not forwards-compatible: i.e., a tool which uses format
  version X will not be able to understand format version (X+k).

* Tools must also retain **backwards** compatibility with the format of the
  coverage mappings emitted into instrumented binaries. These formats are not
  forwards-compatible.

* The JSON coverage export format has a (major, minor, patch) version triple.
  Only a major version increment indicates a backwards-incompatible change. A
  minor version increment is for added functionality, and patch version
  increments are for bugfixes.

Using the profiling runtime without static initializers
=======================================================

By default the compiler runtime uses a static initializer to determine the
profile output path and to register a writer function. To collect profiles
without using static initializers, do this manually:

* Export an ``int __llvm_profile_runtime`` symbol from each instrumented shared
  library and executable. When the linker finds a definition of this symbol, it
  knows to skip loading the object which contains the profiling runtime's
  static initializer.

* Forward-declare ``void __llvm_profile_initialize_file(void)`` and call it
  once from each instrumented executable. This function parses
  ``LLVM_PROFILE_FILE``, sets the output path, and truncates any existing files
  at that path. To get the same behavior without truncating existing files,
  pass a filename pattern string to ``void __llvm_profile_set_filename(char
  *)``. These calls can be placed anywhere so long as they precede all calls
  to ``__llvm_profile_write_file``.

* Forward-declare ``int __llvm_profile_write_file(void)`` and call it to write
  out a profile. This function returns 0 when it succeeds, and a non-zero value
  otherwise. Calling this function multiple times appends profile data to an
  existing on-disk raw profile.

In C++ files, declare these as ``extern "C"``.

Collecting coverage reports for the llvm project
================================================

To prepare a coverage report for llvm (and any of its sub-projects), add
``-DLLVM_BUILD_INSTRUMENTED_COVERAGE=On`` to the cmake configuration. Raw
profiles will be written to ``$BUILD_DIR/profiles/``. To prepare an html
report, run ``llvm/utils/prepare-code-coverage-artifact.py``.

To specify an alternate directory for raw profiles, use
``-DLLVM_PROFILE_DATA_DIR``. To change the size of the profile merge pool, use
``-DLLVM_PROFILE_MERGE_POOL_SIZE``.

Drawbacks and limitations
=========================

* Prior to version 2.26, the GNU binutils BFD linker is not able to link
  programs compiled with ``-fcoverage-mapping`` in its ``--gc-sections`` mode.
  Possible workarounds include disabling ``--gc-sections``, upgrading to a
  newer version of BFD, or using the Gold linker.


* Code coverage does not handle unpredictable changes in control flow or stack
  unwinding in the presence of exceptions precisely. Consider the following
  function:

  .. code-block:: cpp

    int f() {
      may_throw();
      return 0;
    }

  If the call to ``may_throw()`` propagates an exception into ``f``, the code
  coverage tool may mark the ``return`` statement as executed even though it is
  not. A call to ``longjmp()`` can have similar effects.

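
A minimal, self-contained sketch of the exception case above follows. It
assumes a hypothetical ``may_throw()`` that always throws, and ``f_threw()`` is
an illustrative helper that is not part of any LLVM API. Building it with the
flags from the earlier steps and viewing the result with ``llvm-cov show`` is
one way to check how a given compiler version counts the ``return`` line; the
exact counts may differ between versions.

.. code-block:: cpp

    #include <cstdio>
    #include <stdexcept>

    // Hypothetical stand-in for may_throw(): it always throws.
    void may_throw() { throw std::runtime_error("boom"); }

    int f() {
      may_throw();
      return 0;  // Never reached at run time in this sketch.
    }

    // Illustrative helper: reports whether f() exited via an exception.
    bool f_threw() {
      try {
        (void)f();
      } catch (const std::runtime_error &) {
        return true;
      }
      return false;
    }

    int main() {
      // Prints "f threw: yes" because may_throw() always throws.
      std::printf("f threw: %s\n", f_threw() ? "yes" : "no");
      return 0;
    }

Because ``f`` always exits through the exception here, any non-zero count
reported for its ``return`` line would be an instance of the imprecision
described above.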