SourceLevelDebugging.rst - OpenGrok cross reference for /llvm-project/llvm/docs/SourceLevelDebugging.rst

Lines Matching +full:- +full:- +full:require +full:- +full:hashes
14 front-ends or dealing directly with the information.  Further, this document
18 --------------------------------------------
21 pieces of the source-language's Abstract Syntax Tree map onto LLVM code.
29 * LLVM optimizations should interact in :ref:`well-defined and easily described
33   LLVM-to-LLVM tools should not need to know anything about the semantics of
34   the source-level-language.
36 * Source-level languages are often **widely** different from one another.
37   LLVM should not put any restrictions on the flavor of the source-language,
42   formats.  This allows compatibility with traditional machine-code level
47 between LLVM program objects and the source-level objects.  The description of
48 the source-level program is maintained in LLVM metadata in an
49 :ref:`implementation-defined format <ccxx_frontend>` (the C/C++ front-end
54 the stored debug information into source-language specific information.  As
55 such, a debugger must be aware of the source-language, and is thus tied to a
59 ---------------------------
68 other DWARF-based debuggers. :ref:`CodeViewDebug <codeview>` produces CodeView,
81 -----------------------------------
88   the source-level state of the program**, regardless of which LLVM
99   optimizers could optimize debug code just as well as non-debug code.
111 "``-O0 -g``" and get full debug information, allowing you to arbitrarily modify
113 "``-O3 -g``" gives you full debug information that is always available and
119 The :doc:`LLVM test-suite <TestSuiteMakefileGuide>` provides a framework to
123 .. code-block:: bash
125   % cd llvm/projects/test-suite/MultiSource/Benchmarks  # or some other level
146 variables, functions, source files, etc) is inserted by the language front-end
152 namespaces, etc: this allows for arbitrary source-language semantics and
153 type-systems to be used, as long as there is a module written for the target
157 assumptions about the source-level language being debugged, though it keeps
164 common to any source-language.  :ref:`ccxx_frontend` describes the data layout
165 conventions used by the C and C++ front-ends.
168 <LangRef.html#specialized-metadata>`_, first-class subclasses of ``Metadata``.
174 non-default but currently supported for backwards compatibility - though these
183 ----------------------------
193 .. code-block:: llvm
201 comma-separated arguments in parentheses, as with a `call`.
206 .. code-block:: llvm
219 .. code-block:: llvm
248 .. code-block:: llvm
271 .. code-block:: llvm
294 ----------------------------
298 In intrinsic mode, LLVM uses several intrinsic functions (with names prefixed with "``llvm.dbg``") to
310 .. code-block:: llvm
316 .. code-block:: llvm
325 .. code-block:: llvm
331 .. code-block:: llvm
340 .. code-block:: llvm
346 .. code-block:: llvm
363 it is non-trivial to model in LLVM, because it has no notion of scoping in this
370 .. code-block:: c
384 .. code-block:: text
405   attributes #0 = { nounwind ssp uwtable "less-precise-fpmad"="false" "frame-pointer"="all" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
413   !1 = !DIFile(filename: "/dev/stdin", directory: "/Users/dexonsmith/data/llvm/debug-info")
444 .. code-block:: llvm
453 .. code-block:: text
468 .. code-block:: llvm
477 .. code-block:: text
508 passes alter or move instructions and blocks -- the developer could observe such
527 .. code-block:: llvm
554 Containing two source-level variables in ``!1`` and ``!3``. The function could,
557 .. code-block:: llvm
574 .. code-block:: llvm
589 the same time as ``!1`` has the constant value zero -- a pair of assignments
594 .. code-block:: llvm
629 observe re-ordering of assignments.
634 LLVM preserves debug information throughout mid-level and backend passes,
635 ultimately producing a mapping between source-level information and
640 represents a source-level assignment of a value to a source variable, the
659 frame setup and destruction may take several instructions, require a
664 ---------------------------------------------------
673 multiply-and-accumulate) then intermediate Values are lost. To track variable
681 otherwise transformed into a non-register, the variable location becomes
688 After MIR locations are assigned to each variable, machine pseudo-instructions
694 .. code-block:: text
712 .. code-block:: text
737 .. code-block:: llvm
763 If one compiles this IR with ``llc -o - -start-after=codegen-prepare -stop-after=expand-isel-pseudos -mtriple=x86_64--``, the following MIR is produced:
765 .. code-block:: text
772     %3:gr32 = MOV32r0 implicit-def dead $eflags
773     DBG_VALUE 0, $noreg, !3, !DIExpression(), debug-location !5
779     DBG_VALUE %0, $noreg, !3, !DIExpression(), debug-location !5
780     DBG_VALUE %2, $noreg, !3, !DIExpression(DW_OP_plus_uconst, 4, DW_OP_stack_value), debug-location !5
781     %4:gr32 = MOV32rm %2, 1, $noreg, 4, $noreg, debug-location !5 :: (load 4 from %ir.addr1)
782     %5:gr64_nosp = MOVSX64rr32 %0, debug-location !5
783     DBG_VALUE $noreg, $noreg, !3, !DIExpression(), debug-location !5
784     %1:gr32 = INC32r %0, implicit-def dead $eflags, debug-location !5
785     DBG_VALUE %1, $noreg, !3, !DIExpression(), debug-location !5
786     %6:gr32 = ADD32rm %4, %2, 4, killed %5, 0, $noreg, implicit-def dead $eflags :: (load 4 from %ir.addr2)
787     %7:gr32 = SUB32rr %6, %0, implicit-def $eflags, debug-location !5
788     JB_1 %bb.1, implicit $eflags, debug-location !5
789     JMP_1 %bb.2, debug-location !5
792     %8:gr32 = MOV32r0 implicit-def dead $eflags
793     $eax = COPY %8, debug-location !5
794     RET 0, $eax, debug-location !5
804   (as a 4-byte offset), but the variable location is salvaged by folding
812 ----------------------
815 and the pre-and-post RA machine schedulers. Instruction scheduling can
816 significantly change the nature of the program -- in the (very unlikely) worst
823 of the delay. To illustrate, consider this pseudo-MIR:
825 .. code-block:: text
827   %1:gr32 = MOV32rm %0, 1, $noreg, 4, $noreg, debug-location !5 :: (load 4 from %ir.addr1)
829   %4:gr32 = ADD32rr %3, %2, implicit-def dead $eflags
831   %7:gr32 = SUB32rr %6, %5, implicit-def dead $eflags
836 .. code-block:: text
838   %7:gr32 = SUB32rr %6, %5, implicit-def dead $eflags
839   %1:gr32 = MOV32rm %0, 1, $noreg, 4, $noreg, debug-location !5 :: (load 4 from %ir.addr1)
841   %4:gr32 = ADD32rr %3, %2, implicit-def dead $eflags
846 the DBG_VALUE of virtual register %7 upwards with the SUB32rr, we would re-order
854 .. code-block:: text
857   %4:gr32 = ADD32rr %3, %2, implicit-def dead $eflags
859   %7:gr32 = SUB32rr %6, %5, implicit-def dead $eflags
861   %1:gr32 = MOV32rm %0, 1, $noreg, 4, $noreg, debug-location !5 :: (load 4 from %ir.addr1)
874 ---------------------------------------------
880 VirtRegRewriter pass re-inserts DBG_VALUE instructions in their original
889 -----------------------------------------------
898 corresponding to a source-level assignment where the variable may change value,
905 .. code-block:: text
940 that it uses use-def chains to identify control flow merges and insert phi
972 C/C++ front-end specific debug information
975 The C and C++ front-ends represent information about the program in a
979 information, and contains enough information for non-DWARF targets to
987 source-language front-ends, the information used should be documented here.
996 -----------------------------
1002 .. code-block:: c++
1004   if (DILocation *Loc = I->getDebugLoc()) { // Here I is an LLVM instruction
1005     unsigned Line = Loc->getLine();
1006     StringRef File = Loc->getFilename();
1007     StringRef Dir = Loc->getDirectory();
1008     bool ImplicitCode = Loc->isImplicitCode();
1012 added by the front-end but doesn't correspond to source code written by the user. For example
1014 .. code-block:: c++
1026 ---------------------------------
1030 .. code-block:: c
1034 a C/C++ front-end would generate the following descriptors:
1036 .. code-block:: text
1065                directory: "/Users/dexonsmith/data/llvm/debug-info")
1095 --------------------------
1099 .. code-block:: c
1105 a C/C++ front-end would generate the following descriptors:
1107 .. code-block:: text
1128 ----------------------------------------
1134 .. code-block:: c
1143 .. code-block:: text
1149 .. code-block:: text
1161 ----------------------------
1163 There are a few DWARF attributes defined to support client debugging of Fortran programs.  LLVM can generate (or omit) the appropriate DWARF attributes for the prefix-specs of ELEMENTAL, PURE, IMPURE, RECURSIVE, and NON_RECURSIVE.  This is done by using the spFlags values: DISPFlagElemental, DISPFlagPure, and DISPFlagRecursive.
1165 .. code-block:: fortran
1169 a Fortran front-end would generate the following descriptors:
1171 .. code-block:: text
1180 .. code-block:: text
1190 .. code-block:: fortran
1194 a Fortran front-end would generate the following descriptors:
1196 .. code-block:: text
1201 A Fortran deferred-length character can also carry the raw storage location of its characters in addition to the length of the string. This information is encoded in the stringLocationExpression field. Based on this information, a DW_AT_data_location attribute is emitted in the DW_TAG_string_type debug info.
1207 .. code-block:: text
1219 A Fortran front-end may need to generate a *trampoline* function to call a
1220 function defined in a different compilation unit. In this case, the front-end
1223 .. code-block:: text
1230 .. code-block:: text
1242 ----------------------------------------------------------
1285 .. code-block:: objc
1302 .. code-block:: none
1332 auto-synthesized property is the name of the property from which it derives
1338 the @interface and @implementation - e.g. to provide a read-only property in
1339 the interface, and a read-write interface in the implementation.  In that case,
1346 .. code-block:: objc
1350 .. code-block:: none
1360 .. code-block:: objc
1364   -(void)myOwnP3Setter:(int)a;
1369   -(void)myOwnP3Setter:(int)a{ }
1374 .. code-block:: none
1396 +-----------------------+--------+
1400 +-----------------------+--------+
1405 +--------------------------------+--------+-----------+
1409 +--------------------------------+--------+-----------+
1411 +--------------------------------+--------+-----------+
1413 +--------------------------------+--------+-----------+
1415 +--------------------------------+--------+-----------+
1420 +--------------------------------------+-------+
1424 +--------------------------------------+-------+
1426 +--------------------------------------+-------+
1428 +--------------------------------------+-------+
1430 +--------------------------------------+-------+
1432 +--------------------------------------+-------+
1434 +--------------------------------------+-------+
1436 +--------------------------------------+-------+
1438 +--------------------------------------+-------+
1440 +--------------------------------------+-------+
1442 +--------------------------------------+-------+
1444 +--------------------------------------+-------+
1446 +--------------------------------------+-------+
1448 +--------------------------------------+-------+
1450 +--------------------------------------+-------+
1452 +--------------------------------------+-------+
1455 -----------------------
1483 its inconsistent and useless public-only name content making it a waste of
1506 from disk, and used as is, with little or no up-front parsing.  We would also
1519 duplicated.  We also want to make sure the table is ready to be used as-is by
1540 .. code-block:: none
1542   .------------.
1544   |------------|
1546   |------------|
1548   `------------'
1552 .. code-block:: none
1554   .------------.
1561   '------------'
1568 .. code-block:: none
1570               .------------.
1575               |------------|
1580               |------------|
1585               `------------'
1589 if we were to look up "``printf``" in the table above, we would make a 32-bit
1605 .. code-block:: none
1607   .-------------.
1609   |-------------|
1611   |-------------|
1612   |  HASHES     |
1613   |-------------|
1615   |-------------|
1617   `-------------'
1619 The ``BUCKETS`` in the name tables are an index into the ``HASHES`` array.  By
1625 values, we can clarify the contents of the ``BUCKETS``, ``HASHES`` and
1628 .. code-block:: none
1630   .-------------------------.
1638   |-------------------------|
1640   |-------------------------|
1641   |  HASHES                 | uint32_t[n_hashes] // 32 bit hash values
1642   |-------------------------|
1644   |-------------------------|
1646   `-------------------------'
1651 .. code-block:: none
1653               .------------.
1655               |------------|
1662               |------------|
1663               | 0x........ | HASHES[0]
1664               | 0x........ | HASHES[1]
1665               | 0x........ | HASHES[2]
1666               | 0x........ | HASHES[3]
1667               | 0x........ | HASHES[4]
1668               | 0x........ | HASHES[5]
1669               | 0x12345678 | HASHES[6]    hash for BUCKETS[3]
1670               | 0x29273623 | HASHES[7]    hash for BUCKETS[3]
1671               | 0x82638293 | HASHES[8]    hash for BUCKETS[3]
1672               | 0x........ | HASHES[9]
1673               | 0x........ | HASHES[10]
1674               | 0x........ | HASHES[11]
1675               | 0x........ | HASHES[12]
1676               | 0x........ | HASHES[13]
1677               | 0x........ | HASHES[n_hashes - 1]
1678               |------------|
1694               |------------|
1700               |------------|
1702               | 0x00000004 | A 32 bit array count - number of HashData with name "erase"
1708               |------------|
1710               | 0x00000002 | A 32 bit array count - number of HashData with name "collision"
1714               | 0x00000003 | A 32 bit array count - number of HashData with name "dump"
1719               |------------|
1721               | 0x00000009 | A 32 bit array count - number of HashData with name "main"
1732               `------------'
1738 is the index into the ``HASHES`` table.  We would then compare any consecutive
1739 32 bit hash values in the ``HASHES`` array as long as the hashes would be in
1742 memory for ``BUCKETS[3]``, and then compare a few consecutive 32 bit hashes
1768 .. code-block:: c
1778                                 // Specifically the length of the following HeaderData field - this does not
1792 .. code-block:: c
1801 hash values that are in the ``HASHES`` array, and is the same number of offsets
1809 The header is followed by the buckets, hashes, offsets, and hash value data.
1811 .. code-block:: c
1815     uint32_t buckets[Header.bucket_count];  // An array of hash indexes into the "hashes[]" array below
1816     uint32_t hashes [Header.hashes_count];  // Every unique 32 bit hash for the entire table is in this table
1817     uint32_t offsets[Header.hashes_count];  // An offset that corresponds to each item in the "hashes[]" array above
1820 ``buckets`` is an array of 32 bit indexes into the ``hashes`` array.  The
1821 ``hashes`` array contains all of the 32 bit hash values for all names in the
1822 hash table.  Each hash in the ``hashes`` table has an offset in the ``offsets``
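
The relationship between ``buckets``, ``hashes`` and ``offsets`` can be
illustrated with a minimal lookup sketch. It assumes that a bucket is selected
with ``hash % bucket_count``, that the hashes belonging to one bucket are
stored consecutively, and that an unused bucket holds ``UINT32_MAX``. The
function name ``find_hash`` and the sample table are hypothetical and exist
only for illustration.

.. code-block:: c

  #include <stdint.h>
  #include <stdio.h>

  #define EMPTY_BUCKET UINT32_MAX /* assumed sentinel for an unused bucket */

  /* Returns the index shared by hashes[] and offsets[] for `hash`,
     or -1 if the hash is not in the table. */
  static long find_hash(const uint32_t *buckets, uint32_t bucket_count,
                        const uint32_t *hashes, uint32_t hashes_count,
                        uint32_t hash) {
    if (bucket_count == 0)
      return -1;
    uint32_t bucket = hash % bucket_count;
    uint32_t i = buckets[bucket];
    if (i == EMPTY_BUCKET)
      return -1;
    /* Walk consecutive hashes until one falls outside this bucket. */
    for (; i < hashes_count && hashes[i] % bucket_count == bucket; ++i) {
      if (hashes[i] == hash)
        return (long)i;
    }
    return -1;
  }

  int main(void) {
    /* Hypothetical table with 4 buckets; hashes are grouped by (hash % 4). */
    const uint32_t buckets[4] = {0, EMPTY_BUCKET, 1, 3};
    const uint32_t hashes[5] = {8, 6, 10, 3, 7};
    const uint32_t offsets[5] = {0x10, 0x20, 0x30, 0x40, 0x50};

    long idx = find_hash(buckets, 4, hashes, 5, 10);
    if (idx >= 0)
      printf("hash 10 -> offsets[%ld] = 0x%x\n", idx, (unsigned)offsets[idx]);
    else
      printf("hash 10 not found\n");
    return 0;
  }

Because ``hashes`` and ``offsets`` share an index, a successful match needs no
further search: the same index selects the offset of the data for that hash.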
1842 .. code-block:: c
1856 .. code-block:: none
1858   eAtomTypeNULL       - a termination atom that specifies the end of the atom list
1859   eAtomTypeDIEOffset  - an offset into the .debug_info section for the DWARF DIE for this name
1860   eAtomTypeCUOffset   - an offset into the .debug_info section for the CU that contains the DIE
1861   eAtomTypeDIETag     - The DW_TAG_XXX enumeration value so you don't have to parse the DWARF to see what it is
1862   eAtomTypeNameFlags  - Flags for functions and global variables (isFunction, isInlined, isExternal...)
1863   eAtomTypeTypeFlags  - Flags for types (isCXXClass, isObjCClass, ...)
1868 .. code-block:: c
1880 .. code-block:: c
1892 what is contained in each ``HashData`` object -- ``Atom.form`` tells us how large
1901 .. code-block:: c
1926 .. code-block:: c
1935 .. code-block:: none
1937   .------------.
1945   `------------'
1949 .. code-block:: none
1951   .------------.
1963   `------------'
1984 .. code-block:: c
2031 not be a forward declaration (``DW_AT_declaration`` attribute with a non-zero
2034 .. code-block:: c
2044 .. code-block:: none
2067 Objective-C Extensions
2071 Objective-C class.  The name used in the hash table is the name of the
2072 Objective-C class itself.  If the Objective-C class has a category, then an
2075 method "``-[NSString(my_additions) stringWithSpecialString:]``", we would add
2078 track down all Objective-C methods for an Objective-C class when doing
2079 expressions.  It is needed because of the dynamic nature of Objective-C where
2080 anyone can add methods to a class.  The DWARF for Objective-C methods is also
2085 given the Objective-C class name, or quickly find all methods and class
2087 selector names, it just maps Objective-C class names (or class names +
2091 In the "``.apple_names``" section for Objective-C functions, the full name is
2092 the entire function name with the brackets ("``-[NSString
2096 Mach-O Changes
2099 The section names for the apple hash tables are for non-Mach-O files.  For
2100 Mach-O files, the sections should be contained in the ``__DWARF`` segment with
2103 * "``.apple_names``" -> "``__apple_names``"
2104 * "``.apple_types``" -> "``__apple_types``"
2105 * "``.apple_namespaces``" -> "``__apple_namespac``" (16 character limit)
2106 * "``.apple_objc``" -> "``__apple_objc``"
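
The renaming listed above can be sketched as a small helper: replace the
leading "``.``" with "``__``" and cut the result at the 16 character limit. The
helper name ``macho_section_name`` is hypothetical; this is only an
illustration of the mapping, not code taken from LLVM.

.. code-block:: c

  #include <stdio.h>

  #define MACHO_SECT_NAME_MAX 16 /* the 16 character limit noted above */

  /* Hypothetical helper: ".apple_foo" becomes "__apple_foo", truncated to
     at most 16 characters. */
  static void macho_section_name(const char *name,
                                 char out[MACHO_SECT_NAME_MAX + 1]) {
    const char *stem = (name[0] == '.') ? name + 1 : name;
    /* snprintf writes at most 16 characters plus the terminating NUL. */
    snprintf(out, MACHO_SECT_NAME_MAX + 1, "__%s", stem);
  }

  int main(void) {
    const char *names[] = {".apple_names", ".apple_types",
                           ".apple_namespaces", ".apple_objc"};
    char out[MACHO_SECT_NAME_MAX + 1];
    for (int i = 0; i < 4; ++i) {
      macho_section_name(names[i], out);
      printf("%s -> %s\n", names[i], out);
    }
    return 0;
  }

Only "``.apple_namespaces``" is long enough to be truncated; the other three
names fit within the limit unchanged.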
2117 -----------------
2125 16-bit record size and a 16-bit record kind.
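
As a rough illustration of the record prefix described here, the sketch below
decodes a 16-bit size followed by a 16-bit kind from a byte buffer. It assumes
the fields are stored little-endian. The struct and function names are
hypothetical, and the sample bytes are made-up data rather than output from a
real object file.

.. code-block:: c

  #include <stddef.h>
  #include <stdint.h>
  #include <stdio.h>

  /* Hypothetical in-memory view of a record prefix. */
  struct cv_record_header {
    uint16_t size; /* 16-bit record size */
    uint16_t kind; /* 16-bit record kind */
  };

  /* Read a 16-bit value, assuming little-endian byte order. */
  static uint16_t read_u16le(const unsigned char *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
  }

  /* Returns 1 on success, 0 if the buffer is too short for a prefix. */
  static int parse_record_header(const unsigned char *buf, size_t len,
                                 struct cv_record_header *out) {
    if (len < 4)
      return 0;
    out->size = read_u16le(buf);
    out->kind = read_u16le(buf + 2);
    return 1;
  }

  int main(void) {
    /* Made-up sample bytes: size = 0x000a, kind = 0x1002. */
    const unsigned char bytes[] = {0x0a, 0x00, 0x02, 0x10};
    struct cv_record_header hdr;
    if (parse_record_header(bytes, sizeof bytes, &hdr))
      printf("size=%u kind=0x%04x\n", (unsigned)hdr.size, (unsigned)hdr.kind);
    return 0;
  }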
2140 CodeView consumers and do not require type records.
2144 the source-level type graph may contain cycles through pointer types (consider a
2146 referring to the forward declaration record of user-defined record types. Only
2148 non-forward-declaration type records.
2151 ---------------------
2155 embedded in ``llvm-readobj``.
2159     $ cl -c -Z7 foo.cpp # Use /Z7 to keep types in the object file
2160     $ llvm-readobj --codeview foo.obj
2164     $ clang -g -gcodeview --target=x86_64-windows-msvc foo.cpp -S -emit-llvm
2170     $ llc foo.ll -filetype=obj -o foo.obj
2171     $ llvm-readobj --codeview foo.obj > foo.txt
2173   Use this pattern in lit test cases and FileCheck the output of llvm-readobj