# Debug info migration: From intrinsics to records

We're planning on removing debug info intrinsics from LLVM, as they're slow, unwieldy and can confuse optimisation passes if they're not expecting them. Instead of having a sequence of instructions that looks like this:

```text
    %add = add i32 %foo, %bar
    call void @llvm.dbg.value(metadata %add, ...
    %sub = sub i32 %add, %tosub
    call void @llvm.dbg.value(metadata %sub, ...
    call void @a_normal_function()
```

with `dbg.value` intrinsics representing debug info records, it would instead be printed as:

```text
    %add = add i32 %foo, %bar
      #dbg_value(%add, ...
    %sub = sub i32 %add, %tosub
      #dbg_value(%sub, ...
    call void @a_normal_function()
```

The debug records are not instructions, do not appear in the instruction list, and won't appear in your optimisation passes unless you go digging for them deliberately.

# Great, what do I need to do!

We've largely completed the migration. The remaining rough edge is that going forwards, instructions must be inserted into basic blocks using iterators rather than instruction pointers. In almost all circumstances you can just call `getIterator` on an instruction pointer -- however, if you call a function that returns the start of a basic block, such as:

1. BasicBlock::begin
2. BasicBlock::getFirstNonPHIIt
3. BasicBlock::getFirstInsertionPt

Then you must pass that iterator into the insertion function without modification (the iterator carries a debug-info bit). That's all! Read on for a more detailed explanation.

## API Changes

There are two significant changes to be aware of. Firstly, we're adding a single bit of debug relevant data to the `BasicBlock::iterator` class (it's so that we can determine whether ranges intend on including debug info at the beginning of a block or not). That means when writing passes that insert LLVM IR instructions, you need to identify positions with `BasicBlock::iterator` rather than just a bare `Instruction *`. Most of the time this means that after identifying where you intend on inserting something, you must also call `getIterator` on the instruction position -- however when inserting at the start of a block you _must_ use `getFirstInsertionPt`, `getFirstNonPHIIt` or `begin` and use that iterator to insert, rather than just fetching a pointer to the first instruction.
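
To make the role of that extra bit more concrete, here is a minimal toy model written against the C++ standard library only. It is not LLVM code: `ToyInst`, `ToyPos`, `ToyBlock`, `HeadBit` and `positionOf` are hypothetical names invented for this sketch, and the insertion behaviour is an assumption based on the description in this document (a start-of-block iterator places the new instruction ahead of the debug records, while a position derived from an instruction places it after them).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an instruction plus the debug records printed before it.
struct ToyInst {
  std::string Name;
  std::vector<std::string> RecordsBefore;
};

// Toy stand-in for an insertion position: an index plus one extra bit.
struct ToyPos {
  std::size_t Index;
  bool HeadBit; // true: the position also precedes the attached records
};

struct ToyBlock {
  std::vector<ToyInst> Insts;

  // Start-of-block position: carries the bit (assumed analogue of begin()).
  ToyPos begin() const { return {0, true}; }

  // Position derived from an instruction: no bit (assumed analogue of
  // calling getIterator on an instruction pointer).
  ToyPos positionOf(std::size_t Index) const { return {Index, false}; }

  void insert(ToyPos Pos, std::string Name) {
    assert(Pos.Index <= Insts.size() && "position out of range");
    ToyInst New{std::move(Name), {}};
    // Without the bit, the new instruction lands after the records, so the
    // records now precede the new instruction and are attached to it.
    if (!Pos.HeadBit && Pos.Index < Insts.size())
      New.RecordsBefore.swap(Insts[Pos.Index].RecordsBefore);
    Insts.insert(Insts.begin() + static_cast<std::ptrdiff_t>(Pos.Index),
                 std::move(New));
  }

  // Flatten the block into printed order: records first, then instruction.
  std::vector<std::string> render() const {
    std::vector<std::string> Lines;
    for (const ToyInst &I : Insts) {
      for (const std::string &R : I.RecordsBefore)
        Lines.push_back(R);
      Lines.push_back(I.Name);
    }
    return Lines;
  }
};
```

In this model, inserting `%new` at `begin()` into a block whose first instruction `%add` has one record renders as `%new`, the record, `%add`; inserting at `positionOf(0)` renders as the record, `%new`, `%add`. The difference between those two results is the information a bare instruction pointer cannot express.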

The second matter is that if you transfer sequences of instructions from one place to another manually, i.e. repeatedly using `moveBefore` where you might have used `splice`, then you should instead use the method `moveBeforePreserving`. `moveBeforePreserving` will transfer debug info records with the instruction they're attached to. This is something that happens automatically today -- if you use `moveBefore` on every element of an instruction sequence, then debug intrinsics will be moved in the normal course of your code, but we lose this behaviour with non-instruction debug info.

For a more in-depth overview of how to update existing code to support debug records, see [the guide below](#how-to-update-existing-code).

## Textual IR Changes

As we change from using debug intrinsics to debug records, any tools that depend on parsing IR produced by LLVM will need to handle the new format. For the most part, the difference between the printed form of a debug intrinsic call and a debug record is trivial:

1. An extra 2 spaces of indentation are added.
2. The text `(tail|notail|musttail)? call void @llvm.dbg.<type>` is replaced with `#dbg_<type>`.
3. The leading `metadata ` is removed from each argument to the intrinsic.
4. The DILocation changes from being an instruction attachment with the format `!dbg !<Num>`, to being an ordinary argument, i.e. `!<Num>`, that is passed as the final argument to the debug record.

Following these rules, we have this example of a debug intrinsic and the equivalent debug record:

```
; Debug Intrinsic:
  call void @llvm.dbg.value(metadata i32 %add, metadata !10, metadata !DIExpression()), !dbg !20
; Debug Record:
    #dbg_value(i32 %add, !10, !DIExpression(), !20)
```
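
For tools that post-process textual IR, the four rules can be sketched as a small standalone C++ function built on `std::regex`. This is an illustrative sketch, not part of LLVM: the function name `intrinsicToRecord` is hypothetical, and it assumes the simple single-line form shown in the example above (one intrinsic call per line, with a numbered `!dbg` attachment at the end). Lines that do not match are returned unchanged.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Rewrite one line of textual IR from debug-intrinsic form to debug-record
// form. Hypothetical helper for illustration only; it handles just the
// simple single-line shape used in the example above.
std::string intrinsicToRecord(const std::string &Line) {
  // Group 1: indentation, group 2: <type>, group 3: argument list,
  // group 4: the !<Num> location taken from the trailing "!dbg !<Num>".
  static const std::regex CallPattern(
      R"re(^(\s*)(?:(?:tail|notail|musttail) )?call void @llvm\.dbg\.([a-z]+)\((.*)\), !dbg (![0-9]+)\s*$)re");
  // Rule 3: drop "metadata " at the start of each argument.
  static const std::regex MetadataPrefix(R"re((^|, )metadata )re");

  std::smatch Match;
  if (!std::regex_match(Line, Match, CallPattern))
    return Line; // Not a debug intrinsic call in the expected shape.

  const std::string Args =
      std::regex_replace(Match[3].str(), MetadataPrefix, "$1");

  // Rule 1: two extra spaces of indentation.
  // Rule 2: "call void @llvm.dbg.<type>" becomes "#dbg_<type>".
  // Rule 4: the location becomes the final ordinary argument.
  return Match[1].str() + "  #dbg_" + Match[2].str() + "(" + Args + ", " +
         Match[4].str() + ")";
}
```

Applied to the intrinsic line in the example above, this produces the record line shown beneath it. Real test files contain shapes this sketch ignores, such as calls split across lines or FileCheck patterns standing in for the metadata numbers.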

### Test updates

Any tests downstream of the main LLVM repo that test the IR output of LLVM may break as a result of the change to using records. Updating an individual test to expect records instead of intrinsics should be trivial, given the update rules above. Updating many tests may be burdensome however; to update the lit tests in the main repository, the following steps were used:

1. Collect the list of failing lit tests into a single file, `failing-tests.txt`, separated by (and ending with) newlines.
2. Use the following line to split the failing tests into tests that use update_test_checks and tests that don't:
    ```
    $ while IFS= read -r f; do grep -q "Assertions have been autogenerated by" "$f" && echo "$f" >> update-checks-tests.txt || echo "$f" >> manual-tests.txt; done < failing-tests.txt
    ```
3. For the tests that use update_test_checks, run the appropriate update_test_checks script - for the main LLVM repo, this was achieved with:
    ```
    $ xargs ./llvm/utils/update_test_checks.py --opt-binary ./build/bin/opt < update-checks-tests.txt
    $ xargs ./llvm/utils/update_cc_test_checks.py --llvm-bin ./build/bin/ < update-checks-tests.txt
    ```
4. The remaining tests can be manually updated, although if there is a large number of tests then the following scripts may be useful; firstly, a script used to extract the check-line prefixes from a file:
    ```
    $ cat ./get-checks.sh
    #!/bin/bash

    # Always add CHECK, since it's more effort than it's worth to filter files where
    # every RUN line uses other check prefixes.
    # Then detect every instance of "check-prefix(es)=..." and add the
    # comma-separated arguments as extra checks.
    for filename in "$@"
    do
        echo "$filename,CHECK"
        allchecks=$(grep -Eo 'check-prefix(es)?[ =][A-Z0-9_,-]+' $filename | sed -E 's/.+[= ]([A-Z0-9_,-]+).*/\1/g; s/,/\n/g')
        for check in $allchecks; do
            echo "$filename,$check"
        done
    done
    ```
    Then a second script to perform the work of actually updating the check-lines in each of the failing tests, with a series of simple substitution patterns:
    ```
    $ cat ./substitute-checks.sh
    #!/bin/bash

    file="$1"
    check="$2"

    # Any test that explicitly tests debug intrinsic output is not suitable to
    # update by this script.
    if grep -q "write-experimental-debuginfo=false" "$file"; then
        exit 0
    fi

    sed -i -E -e "
    /(#|;|\/\/).*$check[A-Z0-9_\-]*:/!b
    /DIGlobalVariableExpression/b
    /!llvm.dbg./bpostcall
    s/((((((no|must)?tail )?call.*)?void )?@)?llvm.)?dbg\.([a-z]+)/#dbg_\7/
    :postcall
    /declare #dbg_/d
    s/metadata //g
    s/metadata\{/{/g
    s/DIExpression\(([^)]*)\)\)(,( !dbg)?)?/DIExpression(\1),/
    /#dbg_/!b
    s/((\))?(,) )?!dbg (![0-9]+)/\3\4\2/
    s/((\))?(, ))?!dbg/\3/
    " "$file"
    ```
    Both of these scripts combined can be used on the list in `manual-tests.txt` as follows:
    ```
    $ cat manual-tests.txt | xargs ./get-checks.sh | sort | uniq | awk -F ',' '{ system("./substitute-checks.sh " $1 " " $2) }'
    ```
    These scripts dealt successfully with the vast majority of checks in `clang/test` and `llvm/test`.
5. Verify the resulting tests pass, and detect any failing tests:
    ```
    $ xargs ./build/bin/llvm-lit -q < failing-tests.txt
    ********************
    Failed Tests (5):
    LLVM :: DebugInfo/Generic/dbg-value-lower-linenos.ll
    LLVM :: Transforms/HotColdSplit/transfer-debug-info.ll
    LLVM :: Transforms/ObjCARC/basic.ll
    LLVM :: Transforms/ObjCARC/ensure-that-exception-unwind-path-is-visited.ll
    LLVM :: Transforms/SafeStack/X86/debug-loc2.ll


    Total Discovered Tests: 295
    Failed: 5 (1.69%)
    ```
6. Some tests may have failed - the update scripts are simplistic and preserve no context across lines, and so there are cases that they will not handle; the remaining cases must be manually updated (or handled by further scripts).

# C-API changes

Some new functions that have been added are temporary and will be deprecated in the future. The intention is that they'll help downstream projects adapt during the transition period.

```
Deleted functions
-----------------
LLVMDIBuilderInsertDeclareBefore   # Insert a debug record (new debug info format) instead of a debug intrinsic (old debug info format).
LLVMDIBuilderInsertDeclareAtEnd    # Same as above.
LLVMDIBuilderInsertDbgValueBefore  # Same as above.
LLVMDIBuilderInsertDbgValueAtEnd   # Same as above.

New functions (to be deprecated)
--------------------------------
LLVMIsNewDbgInfoFormat     # Returns true if the module is in the new non-instruction mode.
LLVMSetIsNewDbgInfoFormat  # Convert to the requested debug info format.

New functions (no plans to deprecate)
-------------------------------------
LLVMGetFirstDbgRecord                    # Obtain the first debug record attached to an instruction.
LLVMGetLastDbgRecord                     # Obtain the last debug record attached to an instruction.
LLVMGetNextDbgRecord                     # Get next debug record or NULL.
LLVMGetPreviousDbgRecord                 # Get previous debug record or NULL.
LLVMDIBuilderInsertDeclareRecordBefore   # Insert a debug record (new debug info format).
LLVMDIBuilderInsertDeclareRecordAtEnd    # Same as above. See info below.
LLVMDIBuilderInsertDbgValueRecordBefore  # Same as above. See info below.
LLVMDIBuilderInsertDbgValueRecordAtEnd   # Same as above. See info below.

LLVMPositionBuilderBeforeDbgRecords          # See info below.
LLVMPositionBuilderBeforeInstrAndDbgRecords  # See info below.
```

`LLVMDIBuilderInsertDeclareRecordBefore`, `LLVMDIBuilderInsertDeclareRecordAtEnd`, `LLVMDIBuilderInsertDbgValueRecordBefore` and `LLVMDIBuilderInsertDbgValueRecordAtEnd` are replacing the deleted `LLVMDIBuilderInsertDeclareBefore`-style functions.

`LLVMPositionBuilderBeforeDbgRecords` and `LLVMPositionBuilderBeforeInstrAndDbgRecords` behave the same as `LLVMPositionBuilder` and `LLVMPositionBuilderBefore` except the insertion position is set before the debug records that precede the target instruction. Note that this doesn't mean that debug intrinsics before the chosen instruction are skipped, only debug records (which unlike debug intrinsics are not themselves instructions).

If you don't know which function to call then follow this rule:
If you are trying to insert at the start of a block, or purposefully skip debug intrinsics to determine the insertion point for any other reason, then call the new functions.

`LLVMPositionBuilder` and `LLVMPositionBuilderBefore` are unchanged. They insert before the indicated instruction but after any attached debug records.

`LLVMGetFirstDbgRecord`, `LLVMGetLastDbgRecord`, `LLVMGetNextDbgRecord` and `LLVMGetPreviousDbgRecord` can be used for iterating over debug records attached to instructions (provided as `LLVMValueRef`).

```c
LLVMDbgRecordRef DbgRec;
for (DbgRec = LLVMGetFirstDbgRecord(Inst); DbgRec;
     DbgRec = LLVMGetNextDbgRecord(DbgRec)) {
  // do something with DbgRec
}
```

```c
LLVMDbgRecordRef DbgRec;
for (DbgRec = LLVMGetLastDbgRecord(Inst); DbgRec;
     DbgRec = LLVMGetPreviousDbgRecord(DbgRec)) {
  // do something with DbgRec
}
```

# The new "Debug Record" model

Below is a brief overview of the new representation that replaces debug intrinsics; for an instructive guide on updating old code, see [here](#how-to-update-existing-code).

## What exactly have you replaced debug intrinsics with?

We're using a dedicated C++ class called `DbgRecord` to store debug info, with a one-to-one relationship between each instance of a debug intrinsic and each `DbgRecord` object in any LLVM IR program; these `DbgRecord`s are represented in the IR as non-instruction debug records, as described in the [Source Level Debugging](project:SourceLevelDebugging.rst#Debug Records) document. This class has a set of subclasses that store exactly the same information as is stored in debugging intrinsics. Each one also has almost entirely the same set of methods, that behave in the same way:

  https://llvm.org/docs/doxygen/classllvm_1_1DbgRecord.html
  https://llvm.org/docs/doxygen/classllvm_1_1DbgVariableRecord.html
  https://llvm.org/docs/doxygen/classllvm_1_1DbgLabelRecord.html

This allows you to treat a `DbgVariableRecord` as if it's a `dbg.value`/`dbg.declare`/`dbg.assign` intrinsic most of the time, for example in generic (auto-param) lambdas, and the same for `DbgLabelRecord` and `dbg.label`s.

## How do these `DbgRecords` fit into the instruction stream?

Like so:

```text
                 +---------------+          +---------------+
---------------->|  Instruction  +--------->|  Instruction  |
                 +-------+-------+          +---------------+
                         |
                         |
                         |
                         |
                         v
                  +-------------+
          <-------+  DbgMarker  |<-------
         /        +-------------+        \
        /                                 \
       /                                   \
      v                                     ^
 +-------------+    +-------------+   +-------------+
 |  DbgRecord  +--->|  DbgRecord  +-->|  DbgRecord  |
 +-------------+    +-------------+   +-------------+
```

Each instruction has a pointer to a `DbgMarker` (which will become optional), that contains a list of `DbgRecord` objects. No debugging records appear in the instruction list at all. `DbgRecord`s have a parent pointer to their owning `DbgMarker`, and each `DbgMarker` has a pointer back to its owning instruction.

Not shown are the links from `DbgRecord` to other parts of the `Value`/`Metadata` hierarchy: `DbgRecord` subclasses have tracking pointers to the DIMetadata that they use, and `DbgVariableRecord` has references to `Value`s that are stored in a `DebugValueUser` base class. This refers to a `ValueAsMetadata` object referring to `Value`s, via the `TrackingMetadata` facility.

The various kinds of debug intrinsic (value, declare, assign, label) are all stored in `DbgRecord` subclasses, with a "RecordKind" field distinguishing `DbgLabelRecord`s from `DbgVariableRecord`s, and a `LocationType` field in the `DbgVariableRecord` class further disambiguating the various debug variable intrinsics it can represent.
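
The ownership shape in the diagram can be mirrored with a toy model in standard C++. None of the types below are LLVM classes: `ToyInstruction`, `ToyMarker`, `ToyRecord`, `ToyRecordKind` and `countRecords` are hypothetical names chosen for this sketch. It only demonstrates the three links described above (instruction to marker, marker to its list of records, and the parent pointers back up) plus a kind field for telling records apart.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <utility>

struct ToyMarker;
struct ToyInstruction;

// Stands in for the "RecordKind" field that separates label records from
// variable records.
enum class ToyRecordKind { Variable, Label };

struct ToyRecord {
  ToyRecordKind Kind;
  std::string Text;
  ToyMarker *Marker; // Parent pointer to the owning marker.
};

struct ToyMarker {
  ToyInstruction *Owner = nullptr; // Pointer back to the owning instruction.
  std::list<ToyRecord> Records;    // std::list keeps references stable.

  ToyRecord &append(ToyRecordKind Kind, std::string Text) {
    Records.push_back(ToyRecord{Kind, std::move(Text), this});
    return Records.back();
  }
};

struct ToyInstruction {
  std::string Text;
  ToyMarker Marker; // Records live here, never in the instruction list.

  explicit ToyInstruction(std::string T) : Text(std::move(T)) {
    Marker.Owner = this;
  }
  // Copying would leave Owner pointing at the original, so forbid it.
  ToyInstruction(const ToyInstruction &) = delete;
  ToyInstruction &operator=(const ToyInstruction &) = delete;
};

// Count the records of one kind attached to an instruction.
std::size_t countRecords(const ToyInstruction &I, ToyRecordKind Kind) {
  std::size_t N = 0;
  for (const ToyRecord &R : I.Marker.Records)
    if (R.Kind == Kind)
      ++N;
  return N;
}
```

Walking from any record through `Marker` and then `Owner` reaches the instruction the record is printed in front of, which is the same navigation the parent pointers in the diagram allow.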

# How to update existing code

Any existing code that interacts with debug intrinsics in some way will need to be updated to interact with debug records in the same way. A few quick rules to keep in mind when updating code:

- Debug records will not be seen when iterating over instructions; to find the debug records that appear immediately before an instruction, you'll need to iterate over `Instruction::getDbgRecordRange()`.
- Debug records have interfaces that are identical to those of debug intrinsics, meaning that any code that operates on debug intrinsics can be trivially applied to debug records as well. The exceptions to this are `Instruction` or `CallInst` methods that don't logically apply to debug records, and the `isa`/`cast`/`dyn_cast` methods, which are replaced by methods on the `DbgRecord` class itself.
- Debug records cannot appear in a module that also contains debug intrinsics; the two are mutually exclusive. As debug records are the future format, handling records correctly should be prioritized in new code.
- Until support for intrinsics is no longer present, a valid hotfix for code that only handles debug intrinsics and is non-trivial to update is to convert the module to the intrinsic format using `Module::setIsNewDbgInfoFormat`, and convert it back afterwards.
  - This can also be performed within a lexical scope for a module or an individual function using the class `ScopedDbgInfoFormatSetter`:
  ```
  void handleModule(Module &M) {
    {
      ScopedDbgInfoFormatSetter FormatSetter(M, false);
      handleModuleWithDebugIntrinsics(M);
    }
    // Module returns to previous debug info format after exiting the above block.
  }
  ```

Below is a rough guide on how existing code that currently supports debug intrinsics can be updated to support debug records.

## Creating debug records

Debug records will automatically be created by the `DIBuilder` class when the new format is enabled. As with instructions, it is also possible to call `DbgRecord::clone` to create an unattached copy of an existing record.

## Skipping debug records, ignoring debug-uses of `Values`, stably counting instructions, etc.

This will all happen transparently without needing to think about it!

```
for (Instruction &I : BB) {
  // Old: Skips debug intrinsics
  if (isa<DbgInfoIntrinsic>(&I))
    continue;
  // New: No extra code needed, debug records are skipped by default.
  ...
}
```

## Finding debug records

Utilities such as `findDbgUsers` and the like now have an optional argument that will return the set of `DbgVariableRecord` records that refer to a `Value`. You should be able to treat them the same as intrinsics.

```
// Old:
  SmallVector<DbgVariableIntrinsic *> DbgUsers;
  findDbgUsers(DbgUsers, V);
  for (auto *DVI : DbgUsers) {
    if (DVI->getParent() != BB)
      DVI->replaceVariableLocationOp(V, New);
  }
// New:
  SmallVector<DbgVariableIntrinsic *> DbgUsers;
  SmallVector<DbgVariableRecord *> DVRUsers;
  findDbgUsers(DbgUsers, V, &DVRUsers);
  for (auto *DVI : DbgUsers)
    if (DVI->getParent() != BB)
      DVI->replaceVariableLocationOp(V, New);
  for (auto *DVR : DVRUsers)
    if (DVR->getParent() != BB)
      DVR->replaceVariableLocationOp(V, New);
```

## Examining debug records at positions

Call `Instruction::getDbgRecordRange()` to get the range of `DbgRecord` objects that are attached to an instruction.

```
for (Instruction &I : BB) {
  // Old: Uses a data member of a debug intrinsic, and then skips to the next
  // instruction.
  if (DbgInfoIntrinsic *DII = dyn_cast<DbgInfoIntrinsic>(&I)) {
    recordDebugLocation(DII->getDebugLoc());
    continue;
  }
  // New: Iterates over the debug records that appear before `I`, and treats
  // them identically to the intrinsic block above.
  // NB: This should always appear at the top of the for-loop, so that we
  // process the debug records preceding `I` before `I` itself.
  for (DbgRecord &DR : I.getDbgRecordRange()) {
    recordDebugLocation(DR.getDebugLoc());
  }
  processInstruction(I);
}
```

This can also be passed through the function `filterDbgVars` to specifically iterate over `DbgVariableRecord`s, which are more commonly used.

```
for (Instruction &I : BB) {
  // Old: If `I` is a DbgVariableIntrinsic we record the variable, and apply
  // extra logic if it is an `llvm.dbg.declare`.
  if (DbgVariableIntrinsic *DVI = dyn_cast<DbgVariableIntrinsic>(&I)) {
    recordVariable(DVI->getVariable());
    if (DbgDeclareInst *DDI = dyn_cast<DbgDeclareInst>(DVI))
      recordDeclareAddress(DDI->getAddress());
    continue;
  }
  // New: `filterDbgVars` is used to iterate over only DbgVariableRecords.
  for (DbgVariableRecord &DVR : filterDbgVars(I.getDbgRecordRange())) {
    recordVariable(DVR.getVariable());
    // Debug variable records are not cast to subclasses; simply call the
    // appropriate `isDbgX()` check, and use the methods as normal.
    if (DVR.isDbgDeclare())
      recordDeclareAddress(DVR.getAddress());
  }
  // ...
}
```

## Processing individual debug records

In most cases, any code that operates on debug intrinsics can be extracted to a template function or auto lambda (if it is not already in one) that can be applied to both debug intrinsics and debug records - though keep in mind the main exception that `isa`/`cast`/`dyn_cast` do not apply to `DbgVariableRecord` types.

```
// Old: Function that operates on debug variable intrinsics in a BasicBlock, and
// collects llvm.dbg.declares.
void processDbgInfoInBlock(BasicBlock &BB,
                           SmallVectorImpl<DbgDeclareInst*> &DeclareIntrinsics) {
  for (Instruction &I : BB) {
    if (DbgVariableIntrinsic *DVI = dyn_cast<DbgVariableIntrinsic>(&I)) {
      processVariableValue(DebugVariable(DVI), DVI->getValue());
      if (DbgDeclareInst *DDI = dyn_cast<DbgDeclareInst>(DVI))
        DeclareIntrinsics.push_back(DDI);
      else if (!isa<Constant>(DVI->getValue()))
        DVI->setKillLocation();
    }
  }
}

// New: Template function is used to deduplicate handling of intrinsics and
// records.
// An overloaded function is also used to handle isa/cast/dyn_cast operations
// for intrinsics and records, since those functions cannot be directly applied
// to DbgRecords.
DbgDeclareInst *DynCastToDeclare(DbgVariableIntrinsic *DVI) {
  return dyn_cast<DbgDeclareInst>(DVI);
}
DbgVariableRecord *DynCastToDeclare(DbgVariableRecord *DVR) {
  return DVR->isDbgDeclare() ? DVR : nullptr;
}

template<typename DbgVarTy, typename DbgDeclTy>
void processDbgVariable(DbgVarTy *DbgVar,
                        SmallVectorImpl<DbgDeclTy*> &Declares) {
  processVariableValue(DebugVariable(DbgVar), DbgVar->getValue());
  if (DbgDeclTy *DbgDeclare = DynCastToDeclare(DbgVar))
    Declares.push_back(DbgDeclare);
  else if (!isa<Constant>(DbgVar->getValue()))
    DbgVar->setKillLocation();
}

void processDbgInfoInBlock(BasicBlock &BB,
                           SmallVectorImpl<DbgDeclareInst*> &DeclareIntrinsics,
                           SmallVectorImpl<DbgVariableRecord*> &DeclareRecords) {
  for (Instruction &I : BB) {
    if (DbgVariableIntrinsic *DVI = dyn_cast<DbgVariableIntrinsic>(&I))
      processDbgVariable(DVI, DeclareIntrinsics);
    // filterDbgVars yields references, so take the address for the helper.
    for (DbgVariableRecord &DVR : filterDbgVars(I.getDbgRecordRange()))
      processDbgVariable(&DVR, DeclareRecords);
  }
}
```

## Moving and deleting debug records

You can use `DbgRecord::removeFromParent` to unlink a `DbgRecord` from its marker, and then `BasicBlock::insertDbgRecordBefore` or `BasicBlock::insertDbgRecordAfter` to re-insert the `DbgRecord` somewhere else. You cannot insert a `DbgRecord` at an arbitrary point in a list of `DbgRecord`s (if you're doing this with `llvm.dbg.value`s then it's unlikely to be correct).

Erase `DbgRecord`s by calling `eraseFromParent`.

```
// Old: Move a debug intrinsic to the start of the block, and delete all other
// intrinsics for the same variable in the block.
void moveDbgIntrinsicToStart(DbgVariableIntrinsic *DVI) {
  BasicBlock *ParentBB = DVI->getParent();
  DVI->removeFromParent();
  // Use an early-increment range because intrinsics are erased during iteration.
  for (Instruction &I : make_early_inc_range(*ParentBB)) {
    if (auto *BlockDVI = dyn_cast<DbgVariableIntrinsic>(&I))
      if (BlockDVI->getVariable() == DVI->getVariable())
        BlockDVI->eraseFromParent();
  }
  DVI->insertBefore(ParentBB->getFirstInsertionPt());
}

// New: Perform the same operation, but for a debug record.
void moveDbgRecordToStart(DbgVariableRecord *DVR) {
  BasicBlock *ParentBB = DVR->getParent();
  DVR->removeFromParent();
  for (Instruction &I : *ParentBB) {
    // filterDbgVars yields references; early-increment keeps erasure safe.
    for (DbgVariableRecord &BlockDVR :
         make_early_inc_range(filterDbgVars(I.getDbgRecordRange())))
      if (BlockDVR.getVariable() == DVR->getVariable())
        BlockDVR.eraseFromParent();
  }
  ParentBB->insertDbgRecordBefore(DVR, ParentBB->getFirstInsertionPt());
}
```
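
As a rough illustration of why a record can only be placed relative to an instruction, the following toy model stores records in a list attached to the instruction they precede. It is a sketch, not LLVM's implementation: `ToyRecord`, `ToyInst`, `ToyBlock`, `insertRecordBefore` and `insertRecordAfter` are hypothetical names, and the ordering choices (append for "before", prepend for "after") are assumptions made for the example.

```cpp
// Toy model only: none of these types or functions are LLVM's.
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct ToyRecord {
  std::string Var;   // variable the record describes
  std::string Value; // value assigned to it
};

struct ToyInst {
  std::string Name;
  std::vector<ToyRecord> Records; // records that sit just before this instruction
};

using ToyBlock = std::vector<ToyInst>;

// Put R directly before BB[Idx], i.e. after any records already attached there
// (assumed ordering).
void insertRecordBefore(ToyBlock &BB, std::size_t Idx, ToyRecord R) {
  assert(Idx < BB.size() && "no instruction at this position");
  BB[Idx].Records.push_back(std::move(R));
}

// Put R directly after BB[Idx], i.e. at the head of the next instruction's
// records (assumed ordering).
void insertRecordAfter(ToyBlock &BB, std::size_t Idx, ToyRecord R) {
  assert(Idx + 1 < BB.size() && "needs a following instruction to attach to");
  std::vector<ToyRecord> &Next = BB[Idx + 1].Records;
  Next.insert(Next.begin(), std::move(R));
}
```

In this model, `insertRecordBefore(BB, 1, ...)` and `insertRecordAfter(BB, 0, ...)` both place the record between instructions 0 and 1, at opposite ends of the records already there. There is deliberately no call for placing a record in the middle of that list.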

## What about dangling debug records?

If you have a block like so:

```text
    foo:
      %bar = add i32 %baz...
      dbg.value(metadata i32 %bar,...
      br label %xyzzy
```

your optimisation pass may wish to erase the terminator and then do something to the block. This is easy to do when debug info is kept in instructions, but with `DbgRecord`s there is no trailing instruction to attach the variable information to in the block above once the terminator is erased. For such degenerate blocks, `DbgRecord`s are stored temporarily in a map in `LLVMContext`, and are re-inserted when a terminator is added back to the block or another instruction is inserted at `end()`.

This can technically lead to trouble in the vanishingly rare scenario where an optimisation pass erases a terminator and then decides to erase the whole block. (We recommend not doing that.)
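
The parking and re-attaching behaviour described above can be sketched with a small toy model. This is an illustration under stated assumptions, not LLVM's code: `ToyInst`, `ToyBlock`, `ToyContext`, `eraseTerminator` and `appendInst` are hypothetical names, records are plain strings, and the side table stands in for the map held by `LLVMContext`.

```cpp
// Toy model only: these types are NOT LLVM's classes.
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct ToyInst {
  std::string Name;
  bool IsTerminator;
  std::vector<std::string> Records; // debug records attached before this instruction
};

struct ToyBlock {
  std::vector<ToyInst> Insts;
};

// Stand-in for the per-context side table of trailing records.
struct ToyContext {
  std::map<ToyBlock *, std::vector<std::string>> TrailingRecords;
};

// Erase the terminator. Its records have no instruction left to attach to,
// so they are parked in the context's side table.
void eraseTerminator(ToyContext &Ctx, ToyBlock &BB) {
  assert(!BB.Insts.empty() && BB.Insts.back().IsTerminator);
  std::vector<std::string> Orphans = std::move(BB.Insts.back().Records);
  BB.Insts.pop_back();
  if (!Orphans.empty()) {
    std::vector<std::string> &Parked = Ctx.TrailingRecords[&BB];
    Parked.insert(Parked.end(), Orphans.begin(), Orphans.end());
  }
}

// Insert at the end of the block. Any parked records are re-attached in front
// of the new instruction and removed from the side table.
void appendInst(ToyContext &Ctx, ToyBlock &BB, ToyInst I) {
  auto It = Ctx.TrailingRecords.find(&BB);
  if (It != Ctx.TrailingRecords.end()) {
    I.Records.insert(I.Records.begin(), It->second.begin(), It->second.end());
    Ctx.TrailingRecords.erase(It);
  }
  BB.Insts.push_back(std::move(I));
}
```

In this toy model, destroying a block while it still has parked records would leave a stale key in the side table, which is the kind of trouble the scenario above hints at. How LLVM itself handles that case is not modelled here.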

## Anything else?

The above guide does not comprehensively cover every pattern that could apply to debug intrinsics; as mentioned at the [start of the guide](#how-to-update-existing-code), you can temporarily convert the target module from debug records to intrinsics as a stopgap measure. Most operations that can be performed on debug intrinsics have exact equivalents for debug records. If you encounter any exceptions, reading the class docs (linked [here](#what-exactly-have-you-replaced-debug-intrinsics-with)) may give some insight, there may be examples in the existing codebase, and you can always ask for help on the [forums](https://discourse.llvm.org/tag/debuginfo).