# Instruction referencing for debug info

This document explains how LLVM uses value tracking, or instruction
referencing, to determine variable locations for debug info in the code
generation stage of compilation. This content is aimed at those working on code
generation targets and optimisation passes. It may also be of interest to anyone
curious about low-level debug info handling.

# Problem statement

At the end of compilation, LLVM must produce a DWARF location list (or similar)
describing what register or stack location a variable can be found in, for each
instruction in that variable's lexical scope. We could track the virtual
register that the variable resides in through compilation; however, this is
vulnerable to register optimisations during regalloc and to instruction
movements.

# Solution: instruction referencing

Rather than identifying the virtual register that a variable value resides in,
LLVM in instruction referencing mode refers to the machine instruction and
operand position where the value is defined. Consider the LLVM IR way of
referring to instruction values:

```llvm
%2 = add i32 %0, %1
  #dbg_value(metadata i32 %2,
```

In LLVM IR, the IR Value is synonymous with the instruction that computes the
value, to the extent that in memory a Value is a pointer to the computing
instruction. Instruction referencing implements this relationship in the
codegen backend of LLVM, after instruction selection. Consider the X86 assembly
below and instruction referencing debug info, corresponding to the earlier
LLVM IR:

```text
%2:gr32 = ADD32rr %0, %1, implicit-def $eflags, debug-instr-number 1
DBG_INSTR_REF 1, 0, !123, !456, debug-location !789
```

While the function remains in SSA form, virtual register `%2` is sufficient to
identify the value computed by the instruction -- however, the function
eventually leaves SSA form, and register optimisations will obscure which
register the desired value is in. Instead, a more consistent way of identifying
the instruction's value is to refer to the `MachineOperand` where the value is
defined, independently of which register is defined by that `MachineOperand`. In
the code above, the `DBG_INSTR_REF` instruction refers to instruction number
one, operand zero, while the `ADD32rr` has a `debug-instr-number` attribute
attached indicating that it is instruction number one.

De-coupling variable locations from registers avoids difficulties involving
register allocation and optimisation, but requires additional instrumentation
when the instructions are optimised instead. Optimisations that replace
instructions with optimised versions that compute the same value must either
preserve the instruction number, or record a substitution from the old
instruction / operand number pair to the new instruction / operand pair -- see
`MachineFunction::substituteDebugValuesForInst`. If debug info maintenance is
not performed, or an instruction is eliminated as dead code, the variable
location is safely dropped and marked "optimised out". The exception is
instructions that are mutated rather than replaced, which always need debug info
maintenance.

# Register allocator considerations

When the register allocator runs, debugging instructions do not directly refer
to any virtual registers, and thus there is no need for expensive location
maintenance during regalloc (i.e. `LiveDebugVariables`). Debug instructions are
unlinked from the function, then linked back in after register allocation
completes.

The exception is `PHI` instructions: these become implicit definitions at
control flow merges once regalloc finishes, and any debug numbers attached to
`PHI` instructions are lost. To circumvent this, debug numbers of `PHI`s are
recorded at the start of register allocation (`phi-node-elimination`), then
`DBG_PHI` instructions are inserted after regalloc finishes. This requires some
maintenance of which register a variable is located in during regalloc, but at
single positions (block entry points) rather than ranges of instructions.

An example, before regalloc:

```text
bb.2:
  %2 = PHI %1, %bb.0, %2, %bb.1, debug-instr-number 1
```

After:

```text
bb.2:
  DBG_PHI $rax, 1
```

# `LiveDebugValues`

After optimisations and code layout complete, information about variable
values must be translated into variable locations, i.e. registers and stack
slots. This is performed in the [`LiveDebugValues` pass][LiveDebugValues], where
the debug instructions and machine code are separated out into two independent
functions:
 * One that assigns values to variable names,
 * One that assigns values to machine registers and stack slots.

LLVM's existing SSA tools are used to place `PHI`s for each function, between
variable values and the values contained in machine locations, with value
propagation eliminating any unnecessary `PHI`s. The two can then be joined up
to map variables to values, then values to locations, for each instruction in
the function.

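
As a rough illustration of that final join, the sketch below models the two
mappings as plain lookup tables and composes them. It is a toy model and not
how `LiveDebugValues` is implemented: `ValueID`, `VariableValues`,
`ValueLocations` and `locateVariable` are hypothetical names invented for this
example, and the tables describe a single program point rather than every
instruction in a function.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical toy type: a value is named by the instruction number and
// operand index that define it, mirroring the operands of DBG_INSTR_REF.
struct ValueID {
  unsigned Inst;
  unsigned Operand;
  bool operator<(const ValueID &Other) const {
    return Inst != Other.Inst ? Inst < Other.Inst : Operand < Other.Operand;
  }
};

// The two independent mappings at one program point (illustrative only).
typedef std::map<std::string, ValueID> VariableValues; // variable -> value
typedef std::map<ValueID, std::string> ValueLocations; // value -> location

// Join the two maps. Returns the location holding Var's value, or nullptr if
// the variable has no value here or the value is not resident anywhere,
// which a debugger would report as "optimised out".
const std::string *locateVariable(const VariableValues &Vars,
                                  const ValueLocations &Locs,
                                  const std::string &Var) {
  assert(!Var.empty() && "expected a variable name");
  VariableValues::const_iterator V = Vars.find(Var);
  if (V == Vars.end())
    return nullptr;
  ValueLocations::const_iterator L = Locs.find(V->second);
  return L == Locs.end() ? nullptr : &L->second;
}
```

Calling `locateVariable` for a variable bound to instruction 1, operand 0,
while that value sits in `$rax`, yields `"$rax"`; if the value has been
clobbered and so has no entry in the location table, the result is a null
pointer.
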
Key to this process is being able to identify the movement of values between
registers and stack locations, so that the location of values can be preserved
for the full time that they are resident in the machine.

# Required target support and transition guide

Instruction referencing will work on any target, but likely with poor coverage.
Supporting instruction referencing well requires:
 * Target hooks to be implemented to allow `LiveDebugValues` to follow values
   through the machine,
 * Target-specific optimisations to be instrumented, to preserve instruction
   numbers.

## Target hooks

`TargetInstrInfo::isCopyInstrImpl` must be implemented to recognise any
instructions that are copy-like -- `LiveDebugValues` uses this to identify when
values move between registers.

`TargetInstrInfo::isLoadFromStackSlotPostFE` and
`TargetInstrInfo::isStoreToStackSlotPostFE` are needed to identify restore and
spill instructions. Each should return the destination or source register
respectively. `LiveDebugValues` will track the movement of a value from / to
the stack slot. In addition, any instruction that writes to a stack spill
should have a `MachineMemOperand` attached, so that `LiveDebugValues` can
recognise that a slot has been clobbered.

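
To make the purpose of these hooks concrete, here is a minimal sketch of the
bookkeeping they enable: a value stays findable as long as every copy, spill
and restore that moves it is recognised, and is lost once its last location is
overwritten. This is an illustrative model only. `LocationTracker` and its
methods are hypothetical names, locations are plain strings, and none of it
reflects the actual data structures inside `LiveDebugValues`.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model, not LLVM's data structures: each machine location (named by a
// string such as "$rax" or "slot0") holds at most one numbered value.
class LocationTracker {
  std::map<std::string, unsigned> Contents; // location -> value number

public:
  // An instruction carrying a debug-instr-number defines Value in Loc.
  void define(const std::string &Loc, unsigned Value) {
    assert(Value != 0 && "this sketch reserves zero for 'no value'");
    Contents[Loc] = Value;
  }

  // Copies, spills and restores: Dst now holds whatever Src held.
  void move(const std::string &Dst, const std::string &Src) {
    std::map<std::string, unsigned>::const_iterator It = Contents.find(Src);
    if (It == Contents.end()) {
      Contents.erase(Dst); // Src is untracked, so Dst becomes untracked too.
      return;
    }
    unsigned Value = It->second;
    Contents[Dst] = Value;
  }

  // Any other write to Loc destroys the value that was there.
  void clobber(const std::string &Loc) { Contents.erase(Loc); }

  // Returns a location currently holding Value, or nullptr if none does.
  const std::string *find(unsigned Value) const {
    for (std::map<std::string, unsigned>::const_iterator It = Contents.begin();
         It != Contents.end(); ++It)
      if (It->second == Value)
        return &It->first;
    return nullptr;
  }
};
```

In this model, a copy or spill that the target fails to report is never seen
as a `move`. Once the original register is overwritten, the value looks lost
even though it is still live elsewhere, which is the kind of coverage loss the
hooks are meant to prevent.
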
## Target-specific optimisation instrumentation

Optimisations come in two flavours: those that mutate a `MachineInstr` to make
it do something different, and those that create a new instruction to replace
the operation of the old.

The former _must_ be instrumented -- the relevant question is whether any
register def in any operand will produce a different value, as a result of the
mutation. If the answer is yes, then there is a risk that a `DBG_INSTR_REF`
instruction referring to that operand will end up assigning the different
value to a variable, presenting the debugging developer with an unexpected
variable value. In such scenarios, call `MachineInstr::dropDebugNumber()` on the
mutated instruction to erase its instruction number. Any `DBG_INSTR_REF`
referring to it will produce an empty variable location instead, that appears
as "optimised out" in the debugger.

For the latter flavour of optimisation, to increase coverage you should record
an instruction number substitution: a mapping from the old instruction number /
operand pair to new instruction number / operand pair. Consider if we replace
a three-address add instruction with a two-address add:

```text
%2:gr32 = ADD32rr %0, %1, debug-instr-number 1
```

becomes

```text
%2:gr32 = ADD32rr %0(tied-def 0), %1, debug-instr-number 2
```

With a substitution from "instruction number 1 operand 0" to "instruction number
2 operand 0" recorded in the `MachineFunction`. In `LiveDebugValues`,
`DBG_INSTR_REF`s will be mapped through the substitution table to find the most
recent instruction number / operand number of the value they refer to.

Use `MachineFunction::substituteDebugValuesForInst` to automatically produce
substitutions between an old and new instruction. It assumes that any operand
that is a def in the old instruction is a def in the new instruction at the
same operand position. This works most of the time, as in the example above.

If operand numbers do not line up between the old and new instruction, use
`MachineInstr::getDebugInstrNum` to acquire the instruction number for the new
instruction, and `MachineFunction::makeDebugValueSubstitution` to record the
mapping between register definitions in the old and new instructions. If some
values computed by the old instruction are no longer computed by the new
instruction, record no substitution -- `LiveDebugValues` will safely drop the
now unavailable variable value.

Should your target clone instructions, much the same as the `TailDuplicator`
optimisation pass, do not attempt to preserve the instruction numbers or
record any substitutions. `MachineFunction::CloneMachineInstr` should drop the
instruction number of any cloned instruction, to avoid duplicate numbers
appearing to `LiveDebugValues`. Dealing with duplicated instructions is a
natural extension to instruction referencing that's currently unimplemented.

[LiveDebugValues]: project:SourceLevelDebugging.rst#LiveDebugValues expansion of variable locations
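
Returning to the substitution table described under target-specific
optimisation instrumentation: the lookup it implies can be sketched as a chain
walk, where a reference is followed through successive substitutions until no
newer instruction / operand pair is recorded. This is a toy model under stated
assumptions, not LLVM's implementation. `InstOperand`, `SubstitutionTable`,
`recordSubstitution` and `resolve` are hypothetical names, and the table is
assumed to be acyclic.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical toy model of a substitution table; not LLVM's implementation.
// A pair is (instruction number, operand index).
typedef std::pair<unsigned, unsigned> InstOperand;
typedef std::map<InstOperand, InstOperand> SubstitutionTable;

// Record that the value once defined by Old is now defined by New.
void recordSubstitution(SubstitutionTable &Table, InstOperand Old,
                        InstOperand New) {
  assert(Old != New && "a self-substitution would make lookup loop forever");
  Table[Old] = New;
}

// Follow substitutions from Ref until reaching a pair with no newer entry.
// Assumes the table contains no cycles.
InstOperand resolve(const SubstitutionTable &Table, InstOperand Ref) {
  SubstitutionTable::const_iterator It = Table.find(Ref);
  while (It != Table.end()) {
    Ref = It->second;
    It = Table.find(Ref);
  }
  return Ref;
}
```

With the add example from earlier, recording a substitution from (1, 0) to
(2, 0) means that resolving a reference to instruction 1, operand 0 returns
(2, 0). A reference with no recorded substitution resolves to itself.
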