===========================
LLVM Branch Weight Metadata
===========================

.. contents::
   :local:

Introduction
============

Branch Weight Metadata represents branch weights as the likelihood of a branch
being taken (see :doc:`BlockFrequencyTerminology`). Metadata is assigned to an
``Instruction`` that is a terminator as an ``MDNode`` of the ``MD_prof`` kind.
The first operand is always an ``MDString`` node with the string
"branch_weights". The number of operands depends on the terminator type.

Branch weights might be fetched from the profiling file, or generated based on
the `__builtin_expect`_ and `__builtin_expect_with_probability`_ instructions.

All weights are represented as unsigned 32-bit values, where a higher value
indicates a greater chance of being taken.

Supported Instructions
======================

``BranchInst``
^^^^^^^^^^^^^^

Metadata is only assigned to conditional branches. There are two extra
operands, for the true and the false branch.
An optional ``!"expected"`` field records whether the metadata was added by
``__builtin_expect`` or ``__builtin_expect_with_probability``.

.. code-block:: none

  !0 = !{
    !"branch_weights",
    [ !"expected", ]
    i32 <TRUE_BRANCH_WEIGHT>,
    i32 <FALSE_BRANCH_WEIGHT>
  }

``SwitchInst``
^^^^^^^^^^^^^^

Branch weights are assigned to every case (including the ``default`` case,
which is always case #0).

.. code-block:: none

  !0 = !{
    !"branch_weights",
    [ !"expected", ]
    i32 <DEFAULT_BRANCH_WEIGHT>
    [ , i32 <CASE_BRANCH_WEIGHT> ... ]
  }

``IndirectBrInst``
^^^^^^^^^^^^^^^^^^

Branch weights are assigned to every destination.

.. code-block:: none

  !0 = !{
    !"branch_weights",
    [ !"expected", ]
    i32 <LABEL_BRANCH_WEIGHT>
    [ , i32 <LABEL_BRANCH_WEIGHT> ... ]
  }

``CallInst``
^^^^^^^^^^^^

Calls may have branch weight metadata containing the execution count of
the call. It is currently used in SamplePGO mode only, to augment the
block and entry counts, which may not be accurate with sampling.

.. code-block:: none

  !0 = !{
    !"branch_weights",
    [ !"expected", ]
    i32 <CALL_BRANCH_WEIGHT>
  }

``InvokeInst``
^^^^^^^^^^^^^^

An invoke instruction may have branch weight metadata with one or two weights.
The second weight is optional and corresponds to the unwind branch.
If only one weight is set, it contains the execution count of the call
and is used in SamplePGO mode only, as described for the call instruction. If
both weights are specified, the second weight contains the number of times the
unwind branch was taken, and the first weight contains the execution count of
the call minus that unwind count. When both weights are specified, they are
used to calculate ``BranchProbability`` as for ``BranchInst``, and SamplePGO
uses the sum of both weights.

.. code-block:: none

  !0 = !{
    !"branch_weights",
    [ !"expected", ]
    i32 <INVOKE_NORMAL_WEIGHT>
    [ , i32 <INVOKE_UNWIND_WEIGHT> ]
  }

Other
^^^^^

Other terminator instructions are not allowed to contain Branch Weight Metadata.

.. _\__builtin_expect:

Built-in ``expect`` Instructions
================================

The ``__builtin_expect(long exp, long c)`` instruction provides branch
prediction information. The return value is the value of ``exp``.

It is especially useful in conditional statements. Currently Clang supports two
conditional statements:

``if`` statement
^^^^^^^^^^^^^^^^

The ``exp`` parameter is the condition. The ``c`` parameter is the expected
comparison value. If it is equal to 1 (true), the condition is likely to be
true; otherwise, the condition is likely to be false. For example:

.. code-block:: c++

  if (__builtin_expect(x > 0, 1)) {
    // This block is likely to be taken.
  }

``switch`` statement
^^^^^^^^^^^^^^^^^^^^

The ``exp`` parameter is the value. The ``c`` parameter is the expected
value. If the expected value does not appear in the list of cases, the
``default`` case is assumed to be the likely one.

.. code-block:: c++

  switch (__builtin_expect(x, 5)) {
  default: break;
  case 0:  // ...
  case 3:  // ...
  case 5:  // This case is likely to be taken.
  }

.. _\__builtin_expect_with_probability:

Built-in ``expect.with.probability`` Instruction
================================================

``__builtin_expect_with_probability(long exp, long c, double probability)`` has
the same semantics as ``__builtin_expect``, but the caller provides the
probability that ``exp == c``. The last argument, ``probability``, must be a
constant floating-point expression in the range [0.0, 1.0] inclusive.
Usage is similar to that of ``__builtin_expect``, for example:

``if`` statement
^^^^^^^^^^^^^^^^

If the expected comparison value ``c`` is equal to 1 (true) and the
probability value ``probability`` is set to 0.8, the probability of the
condition being true is 80% and that of it being false is 20%.

.. code-block:: c++

  if (__builtin_expect_with_probability(x > 0, 1, 0.8)) {
    // This block is likely to be taken with probability 80%.
  }

``switch`` statement
^^^^^^^^^^^^^^^^^^^^

This is basically the same as the ``switch`` statement case for
``__builtin_expect``. The probability that ``exp`` is equal to the expected
value is given by the third argument, ``probability``, while the remaining
probability (``1.0 - probability``) is divided evenly among the other values.
For example:

.. code-block:: c++

  switch (__builtin_expect_with_probability(x, 5, 0.7)) {
  default: break;  // Take this case with probability 10%
  case 0:  break;  // Take this case with probability 10%
  case 3:  break;  // Take this case with probability 10%
  case 5:  break;  // This case is likely to be taken with probability 70%
  }

CFG Modifications
=================

Branch Weight Metadata is not proof against CFG changes. If a terminator's
operands are changed, some action should be taken; otherwise, misoptimizations
may occur due to incorrect branch prediction information.

Function Entry Counts
=====================

To allow comparing different functions during inter-procedural analysis and
optimization, ``MD_prof`` nodes can also be assigned to a function definition.
The first operand is a string indicating the name of the associated counter.

Currently, one counter is supported: "function_entry_count". The second operand
is a 64-bit counter that indicates the number of times that this function was
invoked (in the case of instrumentation-based profiles). In the case of
sampling-based profiles, this operand is an approximation of how many times
the function was invoked.

For example, in the code below, the instrumentation for function ``foo()``
indicates that it was called 2,590 times at runtime.

.. code-block:: llvm

  define i32 @foo() !prof !1 {
    ret i32 0
  }
  !1 = !{!"function_entry_count", i64 2590}

If "function_entry_count" has more than 2 operands, the later operands are
the GUIDs of the functions that need to be imported by ThinLTO. This is only
set by sampling-based profiles. It is needed because the sampling-based profile
was collected on a binary that had already imported and inlined these functions,
and we need to ensure the IR matches in the ThinLTO backends for profile
annotation. The reason why we cannot annotate this on the callsite is that it
can only go down 1 level in the call chain. For the cases where
``foo_in_a_cc()->bar_in_b_cc()->baz_in_c_cc()``, we will need to go down 2
levels in the call chain to import both ``bar_in_b_cc`` and ``baz_in_c_cc``.
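
As a rough illustration of the multi-operand form described above, the sketch
below attaches two import GUIDs after the entry count. The function name
``@foo_in_a_cc`` and both GUID values are hypothetical placeholders chosen for
readability; real GUIDs are 64-bit hashes derived from the names of the
functions to import, and the entry count here is simply reused from the
earlier example.

.. code-block:: llvm

  ; Operand 0: counter name. Operand 1: entry count.
  ; Operands 2 and later: GUIDs of functions ThinLTO should import
  ; (placeholder values standing in for bar_in_b_cc and baz_in_c_cc).
  define i32 @foo_in_a_cc() !prof !1 {
    ret i32 0
  }
  !1 = !{!"function_entry_count", i64 2590, i64 1111111111, i64 2222222222}

Listing both GUIDs on the outermost function is what lets the importer reach
``baz_in_c_cc`` even though it sits two levels down the call chain.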