# `llvm-debuginfo-analyzer`

These are the notes collected during development, review and testing.
They describe limitations, known issues and future work.

### Remove the use of macros in ``LVReader.h`` that describe the ``bumpallocators``.
**[D137933](https://reviews.llvm.org/D137933#inline-1389904)**

Use a standard (or LLVM) ``map`` keyed by ``typeinfo`` (which would need a
specialization to expose equality and a hasher) for the allocators; the
creation functions could then be a function template.

### Use a **lit test** instead of a **unit test** for the **logical readers**.
**[D125783](https://reviews.llvm.org/D125783#inline-1324376)**

As the ``DebugInfoLogicalView`` library is sufficiently exposed via the
``llvm-debuginfo-analyzer`` tool, follow the general LLVM approach and
use ``lit`` tests to validate the **logical readers**.

Convert the ``unittests``:
```
llvm-project/llvm/unittests/DebugInfo/LogicalView/CodeViewReaderTest.cpp
llvm-project/llvm/unittests/DebugInfo/LogicalView/DWARFReaderTest.cpp
```
into ``lit`` tests:
```
llvm-project/llvm/test/DebugInfo/LogicalView/CodeViewReader.test
llvm-project/llvm/test/DebugInfo/LogicalView/DWARFReader.test
```

### Eliminate calls to ``getInputFileDirectory()`` in the ``unittests``.
**[D125783](https://reviews.llvm.org/D125783#inline-1324359)**

Rewrite the unittests ``ReaderTest`` and ``CodeViewReaderTest`` to eliminate
the call:
```
  getInputFileDirectory()
```
as use of that call is discouraged.

### Fix mismatch between ``%d/%x`` format strings and ``uint64_t`` type.
**[D137400](https://reviews.llvm.org/D137400) / [58758](https://github.com/llvm/llvm-project/issues/58758)**

``uint64_t`` values are printed incorrectly on ``32-bit`` platforms.
Add the ``PRIx64`` specifier to the printing code (``format()``).

### Remove ``LVScope::Children`` container.
**[D137933](https://reviews.llvm.org/D137933#inline-1373902)**

Use a **chaining iterator** over the other containers rather than keep a
separate container ``Children`` that mirrors their contents.

### Use ``TableGen`` for command line options.
**[D125777](https://reviews.llvm.org/D125777#inline-1291801)**

The current trend is to use ``TableGen`` for command-line options in tools.
Change the command-line options to use ``TableGen``, as many other LLVM tools do.

### ``LVDoubleMap`` to return ``optional<ValueType>`` instead of ``null pointer``.
**[D125783](https://reviews.llvm.org/D125783#inline-1294164)**

The more idiomatic LLVM way to handle this would be to have ``find``
return ``Optional<ValueType>``.

### Pass references instead of pointers (**Comparison functions**).
**[D125782](https://reviews.llvm.org/D125782#inline-1293920)**

In the **comparison functions**, pass references instead of pointers (when
pointers cannot be null).

### Use ``StringMap`` where possible.
**[D125783](https://reviews.llvm.org/D125783#inline-1294211)**

LLVM has a ``StringMap`` class that is advertised as more efficient than
``std::map<std::string, ValueType>``. Mainly it does fewer allocations
because the key is not a ``std::string``.
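
As a rough, standard-library-only illustration of the allocation argument
(this is not ``StringMap`` itself, and the names ``SymbolTable``, ``addSymbol``
and ``lookupSymbol`` are hypothetical, not taken from the reader code): a map
with a transparent comparator can be queried with a ``std::string_view``, so a
lookup does not construct a temporary ``std::string``. ``StringMap`` goes
further, as it takes ``StringRef`` keys and stores the key bytes next to the
value.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical name -> address table; a stand-in, not the real LVSymbolNames.
// std::less<> makes the comparator transparent (heterogeneous lookup).
using SymbolTable = std::map<std::string, std::uint64_t, std::less<>>;

// Insert or overwrite an entry. The key is copied into the map once here.
inline void addSymbol(SymbolTable &Table, std::string_view Name,
                      std::uint64_t Address) {
  assert(!Name.empty() && "symbol names are assumed to be non-empty");
  Table.insert_or_assign(std::string(Name), Address);
}

// Look up by string_view: the view is compared against the stored keys
// directly, so no std::string is created for the query.
inline std::optional<std::uint64_t> lookupSymbol(const SymbolTable &Table,
                                                 std::string_view Name) {
  auto It = Table.find(Name);
  if (It == Table.end())
    return std::nullopt;
  return It->second;
}
```

Returning ``std::optional`` keeps the "not found" case explicit at the call
site instead of relying on a sentinel value.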

Replace the use of ``std::map<std::string, ValueType>`` with ``StringMap``.
One specific case is the ``LVSymbolNames`` definitions.

### Calculate unique offset for CodeView elements.
To provide the same logical functionality as the DWARF reader, such
as:

* finding each scope's contribution to the debug info
* sorting by physical location

the logical elements must have a unique offset (similar to the DWARF
``DIE`` offset).

### Move ``initializeFileAndStringTables`` to the CodeView Library.
There is some code in the CodeView reader that was extracted/adapted
from ``tools/llvm-readobj/COFFDumper.cpp`` that can be moved to the CodeView
library.

We had a similar case with code shared with ``llvm-pdbutil`` that was moved
to the PDB library: **[D122226](https://reviews.llvm.org/D122226)**

### Move ``getSymbolKindName`` and ``formatRegisterId`` to the CodeView Library.
There is some code in the CodeView reader that was extracted/adapted
from ``lib/DebugInfo/CodeView/SymbolDumper.cpp`` that can be used.

### Use of ``std::unordered_set`` instead of ``std::set``.
**[D125784](https://reviews.llvm.org/D125784#inline-1221421)**

Replace the ``std::set`` usage for ``DeducedScopes``, ``UnresolvedScopes`` and
``IdentifiedNamespaces`` with ``std::unordered_set`` to get average O(1)
insertion and lookup, as the order is not important.

### Optimize ``LVNamespaceDeduction::find`` function.
**[D125784](https://reviews.llvm.org/D125784#inline-1296195)**

Optimize the ``find`` method to use the proposed code:

```
  // Find the first component that is not a known namespace.
  // The lambda captures by reference so it can use IdentifiedNamespaces.
  LVStringRefs::iterator Iter = std::find_if(Components.begin(), Components.end(),
    [&](StringRef Name) {
        return IdentifiedNamespaces.find(Name) == IdentifiedNamespaces.end();
    });
  // Index of that component within Components.
  LVStringRefs::size_type FirstNonNamespace = std::distance(Components.begin(), Iter);
```

### Move all the printing support to a common module.
Factor out printing functionality from the logical elements into a
common module.

### Refactor ``LVBinaryReader::processLines``.
**[D125783](https://reviews.llvm.org/D125783#inline-1246155) /
[D137156](https://reviews.llvm.org/D137156)**

During the traversal of the debug information sections, we create the
logical lines representing the **disassembled instructions** from the **text
section** and the logical lines representing the **line records** from the
**debug line** section. Using the ranges associated with the logical scopes,
we then allocate those logical lines to their logical scopes.

Consider the case when any of those lines become orphans, causing an
incorrect scope parent for disassembly or line records.

### Add support for ``-ffunction-sections``.
**[D125783](https://reviews.llvm.org/D125783#inline-1295012)**

Only linked executables are handled. Relocatable files compiled with
``-ffunction-sections`` are not supported.

### Add support for DWARF v5 `.debug_names` section / CodeView public symbols stream.
**[D125783](https://reviews.llvm.org/D125783#inline-1294142)**

The DWARF and CodeView readers use the public names information to create
the instructions (``LVLineAssembler``).
Instead of relying on DWARF section
names (``.debug_pubnames``, ``.debug_names``) and the CodeView public symbol stream
(``S_PUB32``), the readers should collect the needed information while processing
the debug information.

If the object file supports the above section names and stream, use them
to create the public names.

### Add support for some extra DWARF locations.
The following DWARF debug location operations are not supported:

* `DW_OP_const_type`
* `DW_OP_entry_value`
* `DW_OP_implicit_value`

### Add support for additional binary formats.
* Extended COFF (`XCOFF`)

### Add support for ``JSON`` or ``YAML``
The logical view uses its own non-standard free-form text when
displaying information on logical elements.