# Debug info migration: From intrinsics to records

We're planning on removing debug info intrinsics from LLVM, as they're slow, unwieldy and can confuse optimisation passes if they're not expecting them. Instead of having a sequence of instructions that looks like this:

```text
    %add = add i32 %foo, %bar
    call void @llvm.dbg.value(metadata %add, ...
    %sub = sub i32 %add, %tosub
    call void @llvm.dbg.value(metadata %sub, ...
    call void @a_normal_function()
```

with `dbg.value` intrinsics representing debug info records, it would instead be printed as:

```text
    %add = add i32 %foo, %bar
      #dbg_value(%add, ...
    %sub = sub i32 %add, %tosub
      #dbg_value(%sub, ...
    call void @a_normal_function()
```

The debug records are not instructions, do not appear in the instruction list, and won't appear in your optimisation passes unless you go digging for them deliberately.

# Great, what do I need to do!

We've largely completed the migration. The remaining rough edge is that going forwards, instructions must be inserted into basic blocks using iterators rather than instruction pointers. In almost all circumstances you can just call `getIterator` on an instruction pointer -- however, if you call a function that returns the start of a basic block, such as:

1. BasicBlock::begin
2. BasicBlock::getFirstNonPHIIt
3. BasicBlock::getFirstInsertionPt

Then you must pass that iterator into the insertion function without modification (the iterator carries a debug-info bit). That's all! Read on for a more detailed explanation.

## API Changes

There are two significant changes to be aware of. Firstly, we're adding a single bit of debug-relevant data to the `BasicBlock::iterator` class (it's so that we can determine whether ranges intend to include debug info at the beginning of a block or not). That means when writing passes that insert LLVM IR instructions, you need to identify positions with `BasicBlock::iterator` rather than just a bare `Instruction *`. Most of the time this means that after identifying where you intend to insert something, you must also call `getIterator` on the instruction position -- however when inserting at the start of a block you _must_ use `getFirstInsertionPt`, `getFirstNonPHIIt` or `begin` and use that iterator to insert, rather than just fetching a pointer to the first instruction.

The second matter is that if you transfer sequences of instructions from one place to another manually, i.e. repeatedly using `moveBefore` where you might have used `splice`, then you should instead use the method `moveBeforePreserving`. `moveBeforePreserving` will transfer debug info records with the instruction they're attached to. This is something that happens automatically today -- if you use `moveBefore` on every element of an instruction sequence, then debug intrinsics will be moved in the normal course of your code, but we lose this behaviour with non-instruction debug info.
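
To make the role of the iterator bit concrete, here is a toy model in plain C++. It is only a sketch and does not use LLVM's classes: `ToyInst`, `ToyPos`, `blockBegin`, `positionOf`, `insertAt` and `flatten` are all hypothetical names invented for this illustration. It assumes the behaviour this document describes for the C API, namely that an ordinary position inserts before the indicated instruction but after its attached debug records, while a start-of-block position inserts before those records too.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy model only: these types are hypothetical and are not LLVM classes.
struct ToyInst {
  std::string Name;
  std::vector<std::string> RecordsBefore; // debug records printed ahead of Name
};

struct ToyPos {
  std::size_t Index; // instruction to insert in front of
  bool HeadBit;      // true: also go in front of that instruction's records
};

using ToyBlock = std::vector<ToyInst>;

// Stand-in for a "start of block" query: the position carries the head bit.
ToyPos blockBegin() { return {0, true}; }

// Stand-in for taking the position of an existing instruction: no head bit.
ToyPos positionOf(std::size_t Index) { return {Index, false}; }

void insertAt(ToyBlock &Block, ToyPos Pos, std::string Name) {
  ToyInst New{std::move(Name), {}};
  if (!Pos.HeadBit && Pos.Index < Block.size()) {
    // Without the head bit the new instruction lands after the records, so
    // in this model the records end up attached ahead of the new instruction.
    New.RecordsBefore.swap(Block[Pos.Index].RecordsBefore);
  }
  Block.insert(Block.begin() + static_cast<std::ptrdiff_t>(Pos.Index),
               std::move(New));
}

// Flatten the block into printed order; records are prefixed with '#'.
std::vector<std::string> flatten(const ToyBlock &Block) {
  std::vector<std::string> Out;
  for (const ToyInst &I : Block) {
    for (const std::string &R : I.RecordsBefore)
      Out.push_back("#" + R);
    Out.push_back(I.Name);
  }
  return Out;
}
```

In this model, inserting at `blockBegin()` into a block whose first instruction has one record yields the new instruction first, then the record, then the original instruction. Inserting at `positionOf(0)` yields the record first. A bare instruction pointer cannot distinguish these two intents, which is why the start-of-block iterator has to be passed through unmodified.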

For a more in-depth overview of how to update existing code to support debug records, see [the guide below](#how-to-update-existing-code).

## Textual IR Changes

As we change from using debug intrinsics to debug records, any tools that depend on parsing IR produced by LLVM will need to handle the new format. For the most part, the difference between the printed form of a debug intrinsic call and a debug record is trivial:

1. An extra 2 spaces of indentation are added.
2. The text `(tail|notail|musttail)? call void @llvm.dbg.<type>` is replaced with `#dbg_<type>`.
3. The leading `metadata ` is removed from each argument to the intrinsic.
4. The DILocation changes from being an instruction attachment with the format `!dbg !<Num>`, to being an ordinary argument, i.e. `!<Num>`, that is passed as the final argument to the debug record.

Following these rules, we have this example of a debug intrinsic and the equivalent debug record:

```
; Debug Intrinsic:
  call void @llvm.dbg.value(metadata i32 %add, metadata !10, metadata !DIExpression()), !dbg !20
; Debug Record:
    #dbg_value(i32 %add, !10, !DIExpression(), !20)
```

### Test updates

Any tests downstream of the main LLVM repo that test the IR output of LLVM may break as a result of the change to using records. Updating an individual test to expect records instead of intrinsics should be trivial, given the update rules above.
Updating many tests may be burdensome however; to update the lit tests in the main repository, the following steps were used:

1. Collect the list of failing lit tests into a single file, `failing-tests.txt`, separated by (and ending with) newlines.
2. Use the following line to split the failing tests into tests that use update_test_checks and tests that don't:
    ```
    $ while IFS= read -r f; do grep -q "Assertions have been autogenerated by" "$f" && echo "$f" >> update-checks-tests.txt || echo "$f" >> manual-tests.txt; done < failing-tests.txt
    ```
3. For the tests that use update_test_checks, run the appropriate update_test_checks script - for the main LLVM repo, this was achieved with:
    ```
    $ xargs ./llvm/utils/update_test_checks.py --opt-binary ./build/bin/opt < update-checks-tests.txt
    $ xargs ./llvm/utils/update_cc_test_checks.py --llvm-bin ./build/bin/ < update-checks-tests.txt
    ```
4. The remaining tests can be manually updated, although if there is a large number of tests then the following scripts may be useful; firstly, a script used to extract the check-line prefixes from a file:
    ```
    $ cat ./get-checks.sh
    #!/bin/bash

    # Always add CHECK, since it's more effort than it's worth to filter files where
    # every RUN line uses other check prefixes.
    # Then detect every instance of "check-prefix(es)=..." and add the
    # comma-separated arguments as extra checks.
    for filename in "$@"
    do
        echo "$filename,CHECK"
        allchecks=$(grep -Eo 'check-prefix(es)?[ =][A-Z0-9_,-]+' "$filename" | sed -E 's/.+[= ]([A-Z0-9_,-]+).*/\1/g; s/,/\n/g')
        for check in $allchecks; do
            echo "$filename,$check"
        done
    done
    ```
    Then a second script to perform the work of actually updating the check-lines in each of the failing tests, with a series of simple substitution patterns:
    ```
    $ cat ./substitute-checks.sh
    #!/bin/bash

    file="$1"
    check="$2"

    # Any test that explicitly tests debug intrinsic output is not suitable to
    # update by this script.
    if grep -q "write-experimental-debuginfo=false" "$file"; then
        exit 0
    fi

    sed -i -E -e "
    /(#|;|\/\/).*$check[A-Z0-9_\-]*:/!b
    /DIGlobalVariableExpression/b
    /!llvm.dbg./bpostcall
    s/((((((no|must)?tail )?call.*)?void )?@)?llvm.)?dbg\.([a-z]+)/#dbg_\7/
    :postcall
    /declare #dbg_/d
    s/metadata //g
    s/metadata\{/{/g
    s/DIExpression\(([^)]*)\)\)(,( !dbg)?)?/DIExpression(\1),/
    /#dbg_/!b
    s/((\))?(,) )?!dbg (![0-9]+)/\3\4\2/
    s/((\))?(, ))?!dbg/\3/
    " "$file"
    ```
    Both of these scripts combined can be used on the list in `manual-tests.txt` as follows:
    ```
    $ cat manual-tests.txt | xargs ./get-checks.sh | sort | uniq | awk -F ',' '{ system("./substitute-checks.sh " $1 " " $2) }'
    ```
    These scripts dealt successfully with the vast majority of checks in `clang/test` and `llvm/test`.
5. Verify the resulting tests pass, and detect any failing tests:
    ```
    $ xargs ./build/bin/llvm-lit -q < failing-tests.txt
    ********************
    Failed Tests (5):
      LLVM :: DebugInfo/Generic/dbg-value-lower-linenos.ll
      LLVM :: Transforms/HotColdSplit/transfer-debug-info.ll
      LLVM :: Transforms/ObjCARC/basic.ll
      LLVM :: Transforms/ObjCARC/ensure-that-exception-unwind-path-is-visited.ll
      LLVM :: Transforms/SafeStack/X86/debug-loc2.ll


    Total Discovered Tests: 295
      Failed: 5 (1.69%)
    ```
6. Some tests may have failed - the update scripts are simplistic and preserve no context across lines, and so there are cases that they will not handle; the remaining cases must be manually updated (or handled by further scripts).

# C-API changes

Some new functions that have been added are temporary and will be deprecated in the future. The intention is that they'll help downstream projects adapt during the transition period.

```
Deleted functions
-----------------
LLVMDIBuilderInsertDeclareBefore   # Insert a debug record (new debug info format) instead of a debug intrinsic (old debug info format).
LLVMDIBuilderInsertDeclareAtEnd    # Same as above.
LLVMDIBuilderInsertDbgValueBefore  # Same as above.
LLVMDIBuilderInsertDbgValueAtEnd   # Same as above.

New functions (to be deprecated)
--------------------------------
LLVMIsNewDbgInfoFormat     # Returns true if the module is in the new non-instruction mode.
LLVMSetIsNewDbgInfoFormat  # Convert to the requested debug info format.

New functions (no plans to deprecate)
-------------------------------------
LLVMGetFirstDbgRecord                        # Obtain the first debug record attached to an instruction.
LLVMGetLastDbgRecord                         # Obtain the last debug record attached to an instruction.
LLVMGetNextDbgRecord                         # Get next debug record or NULL.
LLVMGetPreviousDbgRecord                     # Get previous debug record or NULL.
LLVMDIBuilderInsertDeclareRecordBefore       # Insert a debug record (new debug info format).
LLVMDIBuilderInsertDeclareRecordAtEnd        # Same as above. See info below.
LLVMDIBuilderInsertDbgValueRecordBefore      # Same as above. See info below.
LLVMDIBuilderInsertDbgValueRecordAtEnd       # Same as above. See info below.

LLVMPositionBuilderBeforeDbgRecords          # See info below.
LLVMPositionBuilderBeforeInstrAndDbgRecords  # See info below.
```

`LLVMDIBuilderInsertDeclareRecordBefore`, `LLVMDIBuilderInsertDeclareRecordAtEnd`, `LLVMDIBuilderInsertDbgValueRecordBefore` and `LLVMDIBuilderInsertDbgValueRecordAtEnd` are replacing the deleted `LLVMDIBuilderInsertDeclareBefore`-style functions.

`LLVMPositionBuilderBeforeDbgRecords` and `LLVMPositionBuilderBeforeInstrAndDbgRecords` behave the same as `LLVMPositionBuilder` and `LLVMPositionBuilderBefore` except the insertion position is set before the debug records that precede the target instruction. Note that this doesn't mean that debug intrinsics before the chosen instruction are skipped, only debug records (which unlike debug intrinsics are not themselves instructions).

If you don't know which function to call then follow this rule:
If you are trying to insert at the start of a block, or purposefully skip debug intrinsics to determine the insertion point for any other reason, then call the new functions.

`LLVMPositionBuilder` and `LLVMPositionBuilderBefore` are unchanged. They insert before the indicated instruction but after any attached debug records.

`LLVMGetFirstDbgRecord`, `LLVMGetLastDbgRecord`, `LLVMGetNextDbgRecord` and `LLVMGetPreviousDbgRecord` can be used for iterating over debug records attached to instructions (provided as `LLVMValueRef`).

```c
LLVMDbgRecordRef DbgRec;
for (DbgRec = LLVMGetFirstDbgRecord(Inst); DbgRec;
     DbgRec = LLVMGetNextDbgRecord(DbgRec)) {
  // do something with DbgRec
}
```

```c
LLVMDbgRecordRef DbgRec;
for (DbgRec = LLVMGetLastDbgRecord(Inst); DbgRec;
     DbgRec = LLVMGetPreviousDbgRecord(DbgRec)) {
  // do something with DbgRec
}
```

# The new "Debug Record" model

Below is a brief overview of the new representation that replaces debug intrinsics; for an instructive guide on updating old code, see [here](#how-to-update-existing-code).

## What exactly have you replaced debug intrinsics with?

We're using a dedicated C++ class called `DbgRecord` to store debug info, with a one-to-one relationship between each instance of a debug intrinsic and each `DbgRecord` object in any LLVM IR program; these `DbgRecord`s are represented in the IR as non-instruction debug records, as described in the [Source Level Debugging](project:SourceLevelDebugging.rst#Debug Records) document. This class has a set of subclasses that store exactly the same information as is stored in debugging intrinsics.
Each one also has almost entirely the same set of methods, that behave in the same way:

  https://llvm.org/docs/doxygen/classllvm_1_1DbgRecord.html
  https://llvm.org/docs/doxygen/classllvm_1_1DbgVariableRecord.html
  https://llvm.org/docs/doxygen/classllvm_1_1DbgLabelRecord.html

This allows you to treat a `DbgVariableRecord` as if it's a `dbg.value`/`dbg.declare`/`dbg.assign` intrinsic most of the time, for example in generic (auto-param) lambdas, and the same for `DbgLabelRecord` and `dbg.label`s.

## How do these `DbgRecords` fit into the instruction stream?

Like so:

```text
                 +---------------+          +---------------+
---------------->|  Instruction  +--------->|  Instruction  |
                 +-------+-------+          +---------------+
                         |
                         |
                         |
                         |
                         v
                  +-------------+
          <-------+  DbgMarker  |<-------
         /        +-------------+        \
        /                                 \
       /                                   \
      v                                     ^
 +-------------+    +-------------+   +-------------+
 |  DbgRecord  +--->|  DbgRecord  +-->|  DbgRecord  |
 +-------------+    +-------------+   +-------------+
```

Each instruction has a pointer to a `DbgMarker` (which will become optional), that contains a list of `DbgRecord` objects. No debugging records appear in the instruction list at all.
`DbgRecord`s have a parent pointer to their owning `DbgMarker`, and each `DbgMarker` has a pointer back to its owning instruction.

Not shown are the links from `DbgRecord` to other parts of the `Value`/`Metadata` hierarchy: `DbgRecord` subclasses have tracking pointers to the DIMetadata that they use, and `DbgVariableRecord` has references to `Value`s that are stored in a `DebugValueUser` base class. This refers to a `ValueAsMetadata` object referring to `Value`s, via the `TrackingMetadata` facility.

The various kinds of debug intrinsic (value, declare, assign, label) are all stored in `DbgRecord` subclasses, with a "RecordKind" field distinguishing `DbgLabelRecord`s from `DbgVariableRecord`s, and a `LocationType` field in the `DbgVariableRecord` class further disambiguating the various debug variable intrinsics it can represent.

# How to update existing code

Any existing code that interacts with debug intrinsics in some way will need to be updated to interact with debug records in the same way. A few quick rules to keep in mind when updating code:

- Debug records will not be seen when iterating over instructions; to find the debug records that appear immediately before an instruction, you'll need to iterate over `Instruction::getDbgRecordRange()`.
- Debug records have interfaces that are identical to those of debug intrinsics, meaning that any code that operates on debug intrinsics can be trivially applied to debug records as well. The exceptions to this are `Instruction` or `CallInst` methods that don't logically apply to debug records, and `isa`/`cast`/`dyn_cast` methods, which are replaced by methods on the `DbgRecord` class itself.
- Debug records cannot appear in a module that also contains debug intrinsics; the two are mutually exclusive. As debug records are the future format, handling records correctly should be prioritized in new code.
- Until support for intrinsics is no longer present, a valid hotfix for code that only handles debug intrinsics and is non-trivial to update is to convert the module to the intrinsic format using `Module::setIsNewDbgInfoFormat`, and convert it back afterwards.
  - This can also be performed within a lexical scope for a module or an individual function using the class `ScopedDbgInfoFormatSetter`:
    ```
    void handleModule(Module &M) {
      {
        ScopedDbgInfoFormatSetter FormatSetter(M, false);
        handleModuleWithDebugIntrinsics(M);
      }
      // Module returns to previous debug info format after exiting the above block.
    }
    ```

Below is a rough guide on how existing code that currently supports debug intrinsics can be updated to support debug records.

## Creating debug records

Debug records will automatically be created by the `DIBuilder` class when the new format is enabled. As with instructions, it is also possible to call `DbgRecord::clone` to create an unattached copy of an existing record.

## Skipping debug records, ignoring debug-uses of `Values`, stably counting instructions, etc.

This will all happen transparently without needing to think about it!

```
for (Instruction &I : BB) {
  // Old: Skips debug intrinsics
  if (isa<DbgInfoIntrinsic>(&I))
    continue;
  // New: No extra code needed, debug records are skipped by default.
  ...
}
```

## Finding debug records

Utilities such as `findDbgUsers` and the like now have an optional argument that will return the set of `DbgVariableRecord` records that refer to a `Value`. You should be able to treat them the same as intrinsics.

```
// Old:
  SmallVector<DbgVariableIntrinsic *> DbgUsers;
  findDbgUsers(DbgUsers, V);
  for (auto *DVI : DbgUsers) {
    if (DVI->getParent() != BB)
      DVI->replaceVariableLocationOp(V, New);
  }
// New:
  SmallVector<DbgVariableIntrinsic *> DbgUsers;
  SmallVector<DbgVariableRecord *> DVRUsers;
  findDbgUsers(DbgUsers, V, &DVRUsers);
  for (auto *DVI : DbgUsers)
    if (DVI->getParent() != BB)
      DVI->replaceVariableLocationOp(V, New);
  for (auto *DVR : DVRUsers)
    if (DVR->getParent() != BB)
      DVR->replaceVariableLocationOp(V, New);
```

## Examining debug records at positions

Call `Instruction::getDbgRecordRange()` to get the
range of `DbgRecord` objects that are attached to an instruction.

```
for (Instruction &I : BB) {
  // Old: Uses a data member of a debug intrinsic, and then skips to the next
  // instruction.
  if (DbgInfoIntrinsic *DII = dyn_cast<DbgInfoIntrinsic>(&I)) {
    recordDebugLocation(DII->getDebugLoc());
    continue;
  }
  // New: Iterates over the debug records that appear before `I`, and treats
  // them identically to the intrinsic block above.
  // NB: This should always appear at the top of the for-loop, so that we
  // process the debug records preceding `I` before `I` itself.
  for (DbgRecord &DR : I.getDbgRecordRange()) {
    recordDebugLocation(DR.getDebugLoc());
  }
  processInstruction(I);
}
```

This can also be passed through the function `filterDbgVars` to specifically
iterate over DbgVariableRecords, which are more commonly used.

```
for (Instruction &I : BB) {
  // Old: If `I` is a DbgVariableIntrinsic we record the variable, and apply
  // extra logic if it is an `llvm.dbg.declare`.
  if (DbgVariableIntrinsic *DVI = dyn_cast<DbgVariableIntrinsic>(&I)) {
    recordVariable(DVI->getVariable());
    if (DbgDeclareInst *DDI = dyn_cast<DbgDeclareInst>(DVI))
      recordDeclareAddress(DDI->getAddress());
    continue;
  }
  // New: `filterDbgVars` is used to iterate over only DbgVariableRecords.
  for (DbgVariableRecord &DVR : filterDbgVars(I.getDbgRecordRange())) {
    recordVariable(DVR.getVariable());
    // Debug variable records are not cast to subclasses; simply call the
    // appropriate `isDbgX()` check, and use the methods as normal.
    if (DVR.isDbgDeclare())
      recordDeclareAddress(DVR.getAddress());
  }
  // ...
}
```

## Processing individual debug records

In most cases, any code that operates on debug intrinsics can be extracted to a template function or auto lambda (if it is not already in one) that can be applied to both debug intrinsics and debug records - though keep in mind the main exception that `isa`/`cast`/`dyn_cast` do not apply to `DbgVariableRecord` types.

```
// Old: Function that operates on debug variable intrinsics in a BasicBlock, and
// collects llvm.dbg.declares.
void processDbgInfoInBlock(BasicBlock &BB,
                           SmallVectorImpl<DbgDeclareInst*> &DeclareIntrinsics) {
  for (Instruction &I : BB) {
    if (DbgVariableIntrinsic *DVI = dyn_cast<DbgVariableIntrinsic>(&I)) {
      processVariableValue(DebugVariable(DVI), DVI->getValue());
      if (DbgDeclareInst *DDI = dyn_cast<DbgDeclareInst>(DVI))
        DeclareIntrinsics.push_back(DDI);
      else if (!isa<Constant>(DVI->getValue()))
        DVI->setKillLocation();
    }
  }
}

// New: Template function is used to deduplicate handling of intrinsics and
// records.
// An overloaded function is also used to handle isa/cast/dyn_cast operations
// for intrinsics and records, since those functions cannot be directly applied
// to DbgRecords.
DbgDeclareInst *DynCastToDeclare(DbgVariableIntrinsic *DVI) {
  return dyn_cast<DbgDeclareInst>(DVI);
}
DbgVariableRecord *DynCastToDeclare(DbgVariableRecord *DVR) {
  return DVR->isDbgDeclare() ?
    DVR : nullptr;
}

template<typename DbgVarTy, typename DbgDeclTy>
void processDbgVariable(DbgVarTy *DbgVar,
                        SmallVectorImpl<DbgDeclTy*> &Declares) {
  processVariableValue(DebugVariable(DbgVar), DbgVar->getValue());
  if (DbgDeclTy *DbgDeclare = DynCastToDeclare(DbgVar))
    Declares.push_back(DbgDeclare);
  else if (!isa<Constant>(DbgVar->getValue()))
    DbgVar->setKillLocation();
}

void processDbgInfoInBlock(BasicBlock &BB,
                           SmallVectorImpl<DbgDeclareInst*> &DeclareIntrinsics,
                           SmallVectorImpl<DbgVariableRecord*> &DeclareRecords) {
  for (Instruction &I : BB) {
    if (DbgVariableIntrinsic *DVI = dyn_cast<DbgVariableIntrinsic>(&I))
      processDbgVariable(DVI, DeclareIntrinsics);
    // filterDbgVars yields references, so take the address for the helper.
    for (DbgVariableRecord &DVR : filterDbgVars(I.getDbgRecordRange()))
      processDbgVariable(&DVR, DeclareRecords);
  }
}
```

## Moving and deleting debug records

You can use `DbgRecord::removeFromParent` to unlink a `DbgRecord` from its marker, and then `BasicBlock::insertDbgRecordBefore` or `BasicBlock::insertDbgRecordAfter` to re-insert the `DbgRecord` somewhere else. You cannot insert a `DbgRecord` at an arbitrary point in a list of `DbgRecord`s (if you're doing this with `llvm.dbg.value`s then it's unlikely to be correct).

Erase `DbgRecord`s by calling `eraseFromParent`.

```
// Old: Move a debug intrinsic to the start of the block, and delete all other intrinsics for the same variable in the block.
void moveDbgIntrinsicToStart(DbgVariableIntrinsic *DVI) {
  BasicBlock *ParentBB = DVI->getParent();
  DVI->removeFromParent();
  // Early-increment iteration, since instructions are erased inside the loop.
  for (Instruction &I : make_early_inc_range(*ParentBB)) {
    if (auto *BlockDVI = dyn_cast<DbgVariableIntrinsic>(&I))
      if (BlockDVI->getVariable() == DVI->getVariable())
        BlockDVI->eraseFromParent();
  }
  DVI->insertBefore(ParentBB->getFirstInsertionPt());
}

// New: Perform the same operation, but for a debug record.
void moveDbgRecordToStart(DbgVariableRecord *DVR) {
  BasicBlock *ParentBB = DVR->getParent();
  DVR->removeFromParent();
  for (Instruction &I : *ParentBB) {
    // Early-increment iteration, since records are erased inside the loop.
    for (auto &BlockDVR : make_early_inc_range(filterDbgVars(I.getDbgRecordRange())))
      if (BlockDVR.getVariable() == DVR->getVariable())
        BlockDVR.eraseFromParent();
  }
  // Re-insert via the BasicBlock method described above.
  ParentBB->insertDbgRecordBefore(DVR, ParentBB->getFirstInsertionPt());
}
```

## What about dangling debug records?

If you have a block like so:

```text
    foo:
      %bar = add i32 %baz...
      dbg.value(metadata i32 %bar,...
      br label %xyzzy
```

your optimisation pass may wish to erase the terminator and then do something to the block. This is easy to do when debug info is kept in instructions, but with `DbgRecord`s there is no trailing instruction to attach the variable information to in the block above, once the terminator is erased. For such degenerate blocks, `DbgRecord`s are stored temporarily in a map in `LLVMContext`, and are re-inserted when a terminator is reinserted into the block or another instruction is inserted at `end()`.

This can technically lead to trouble in the vanishingly rare scenario where an optimisation pass erases a terminator and then decides to erase the whole block. (We recommend not doing that).

## Anything else?

The above guide does not comprehensively cover every pattern that could apply to debug intrinsics; as mentioned at the [start of the guide](#how-to-update-existing-code), you can temporarily convert the target module from debug records to intrinsics as a stopgap measure. Most operations that can be performed on debug intrinsics have exact equivalents for debug records, but if you encounter any exceptions, reading the class docs (linked [here](#what-exactly-have-you-replaced-debug-intrinsics-with)) may give some insight, there may be examples in the existing codebase, and you can always ask for help on the [forums](https://discourse.llvm.org/tag/debuginfo).