=====================================
The PDB TPI and IPI Streams
=====================================

.. contents::
   :local:

.. _tpi_intro:

Introduction
============

The PDB TPI Stream (Index 2) and IPI Stream (Index 4) contain information about
all types used in the program. Each is organized as a :ref:`header <tpi_header>`
followed by a list of :doc:`CodeView Type Records <CodeViewTypes>`. Types are
referenced from various streams and records throughout the PDB by their
:ref:`type index <type_indices>`. In general, the sequence of type records
following the :ref:`header <tpi_header>` forms a topologically sorted DAG
(directed acyclic graph), which means that a type record B can only refer to
the type A if ``A.TypeIndex < B.TypeIndex``. While there are rare cases where
this property will not hold (particularly when dealing with object files
compiled with MASM), an implementation should try very hard to make this
property hold, as it means the entire type graph can be constructed in a single
pass.

.. important::
   Type records form a topologically sorted DAG (directed acyclic graph).

.. _tpi_ipi:

TPI vs IPI Stream
=================

Recent versions of the PDB format (aka all versions covered by this document)
have 2 streams with identical layout, henceforth referred to as the TPI stream
and IPI stream. Subsequent contents of this document describing the on-disk
format apply equally whether it is for the TPI Stream or the IPI Stream.
The only difference between the two is in *which* CodeView records are allowed
to appear in each one, summarized by the following table:

+----------------------+---------------------+
| TPI Stream           | IPI Stream          |
+======================+=====================+
| LF_POINTER           | LF_FUNC_ID          |
+----------------------+---------------------+
| LF_MODIFIER          | LF_MFUNC_ID         |
+----------------------+---------------------+
| LF_PROCEDURE         | LF_BUILDINFO        |
+----------------------+---------------------+
| LF_MFUNCTION         | LF_SUBSTR_LIST      |
+----------------------+---------------------+
| LF_LABEL             | LF_STRING_ID        |
+----------------------+---------------------+
| LF_ARGLIST           | LF_UDT_SRC_LINE     |
+----------------------+---------------------+
| LF_FIELDLIST         | LF_UDT_MOD_SRC_LINE |
+----------------------+---------------------+
| LF_ARRAY             |                     |
+----------------------+---------------------+
| LF_CLASS             |                     |
+----------------------+---------------------+
| LF_STRUCTURE         |                     |
+----------------------+---------------------+
| LF_INTERFACE         |                     |
+----------------------+---------------------+
| LF_UNION             |                     |
+----------------------+---------------------+
| LF_ENUM              |                     |
+----------------------+---------------------+
| LF_TYPESERVER2       |                     |
+----------------------+---------------------+
| LF_VFTABLE           |                     |
+----------------------+---------------------+
| LF_VTSHAPE           |                     |
+----------------------+---------------------+
| LF_BITFIELD          |                     |
+----------------------+---------------------+
| LF_METHODLIST        |                     |
+----------------------+---------------------+
| LF_PRECOMP           |                     |
+----------------------+---------------------+
| LF_ENDPRECOMP        |                     |
+----------------------+---------------------+

The usage of these records is described in more detail in
:doc:`CodeView Type Records <CodeViewTypes>`.

.. _type_indices:

Type Indices
============

A type index is a 32-bit integer that uniquely identifies a type inside of an
object file's ``.debug$T`` section or a PDB file's TPI or IPI stream. The
value of the type index for the first type record from the TPI stream is given
by the ``TypeIndexBegin`` member of the :ref:`TPI Stream Header <tpi_header>`,
although in practice this value is always equal to 0x1000 (4096).

Any type index with a high bit set is considered to come from the IPI stream,
although this appears to be more of a hack, and LLVM does not generate type
indices of this nature. They can, however, be observed in Microsoft PDBs
occasionally, so one should be prepared to handle them. Note that having the
high bit set is not a necessary condition for a type index to come from the
IPI stream; it is only a sufficient one.

Once the high bit is cleared, any type index >= ``TypeIndexBegin`` is presumed
to come from the appropriate stream, and any type index less than this is a
bitmask which can be decomposed as follows:

.. code-block:: none

  .---------------------------.------.----------.
  |           Unused          | Mode |   Kind   |
  '---------------------------'------'----------'
  |+32                        |+12   |+8        |+0


- **Kind** - A value from the following enum:

.. code-block:: c++

  enum class SimpleTypeKind : uint32_t {
    None = 0x0000,          // uncharacterized type (no type)
    Void = 0x0003,          // void
    NotTranslated = 0x0007, // type not translated by cvpack
    HResult = 0x0008,       // OLE/COM HRESULT

    SignedCharacter = 0x0010,   // 8 bit signed
    UnsignedCharacter = 0x0020, // 8 bit unsigned
    NarrowCharacter = 0x0070,   // really a char
    WideCharacter = 0x0071,     // wide char
    Character16 = 0x007a,       // char16_t
    Character32 = 0x007b,       // char32_t
    Character8 = 0x007c,        // char8_t

    SByte = 0x0068,       // 8 bit signed int
    Byte = 0x0069,        // 8 bit unsigned int
    Int16Short = 0x0011,  // 16 bit signed
    UInt16Short = 0x0021, // 16 bit unsigned
    Int16 = 0x0072,       // 16 bit signed int
    UInt16 = 0x0073,      // 16 bit unsigned int
    Int32Long = 0x0012,   // 32 bit signed
    UInt32Long = 0x0022,  // 32 bit unsigned
    Int32 = 0x0074,       // 32 bit signed int
    UInt32 = 0x0075,      // 32 bit unsigned int
    Int64Quad = 0x0013,   // 64 bit signed
    UInt64Quad = 0x0023,  // 64 bit unsigned
    Int64 = 0x0076,       // 64 bit signed int
    UInt64 = 0x0077,      // 64 bit unsigned int
    Int128Oct = 0x0014,   // 128 bit signed int
    UInt128Oct = 0x0024,  // 128 bit unsigned int
    Int128 = 0x0078,      // 128 bit signed int
    UInt128 = 0x0079,     // 128 bit unsigned int

    Float16 = 0x0046,                 // 16 bit real
    Float32 = 0x0040,                 // 32 bit real
    Float32PartialPrecision = 0x0045, // 32 bit PP real
    Float48 = 0x0044,                 // 48 bit real
    Float64 = 0x0041,                 // 64 bit real
    Float80 = 0x0042,                 // 80 bit real
    Float128 = 0x0043,                // 128 bit real

    Complex16 = 0x0056,                 // 16 bit complex
    Complex32 = 0x0050,                 // 32 bit complex
    Complex32PartialPrecision = 0x0055, // 32 bit PP complex
    Complex48 = 0x0054,                 // 48 bit complex
    Complex64 = 0x0051,                 // 64 bit complex
    Complex80 = 0x0052,                 // 80 bit complex
    Complex128 = 0x0053,                // 128 bit complex

    Boolean8 = 0x0030,   // 8 bit boolean
    Boolean16 = 0x0031,  // 16 bit boolean
    Boolean32 = 0x0032,  // 32 bit boolean
    Boolean64 = 0x0033,  // 64 bit boolean
    Boolean128 = 0x0034, // 128 bit boolean
  };

- **Mode** - A value from the following enum:

.. code-block:: c++

  enum class SimpleTypeMode : uint32_t {
    Direct = 0,        // Not a pointer
    NearPointer = 1,   // Near pointer
    FarPointer = 2,    // Far pointer
    HugePointer = 3,   // Huge pointer
    NearPointer32 = 4, // 32 bit near pointer
    FarPointer32 = 5,  // 32 bit far pointer
    NearPointer64 = 6, // 64 bit near pointer
    NearPointer128 = 7 // 128 bit near pointer
  };

Note that for pointers, the bitness is represented in the mode. So a ``void*``
would have a type index with ``Mode=NearPointer32, Kind=Void`` if built for
32-bits but a type index with ``Mode=NearPointer64, Kind=Void`` if built for
64-bits.

By convention, the type index for ``std::nullptr_t`` is constructed the same
way as the type index for ``void*``, but using the bitless enumeration value
``NearPointer``.

.. _tpi_header:

Stream Header
=============
At offset 0 of the TPI Stream is a header with the following layout:

.. code-block:: c++

  struct TpiStreamHeader {
    uint32_t Version;
    uint32_t HeaderSize;
    uint32_t TypeIndexBegin;
    uint32_t TypeIndexEnd;
    uint32_t TypeRecordBytes;

    uint16_t HashStreamIndex;
    uint16_t HashAuxStreamIndex;
    uint32_t HashKeySize;
    uint32_t NumHashBuckets;

    int32_t HashValueBufferOffset;
    uint32_t HashValueBufferLength;

    int32_t IndexOffsetBufferOffset;
    uint32_t IndexOffsetBufferLength;

    int32_t HashAdjBufferOffset;
    uint32_t HashAdjBufferLength;
  };

- **Version** - A value from the following enum.

.. code-block:: c++

  enum class TpiStreamVersion : uint32_t {
    V40 = 19950410,
    V41 = 19951122,
    V50 = 19961031,
    V70 = 19990903,
    V80 = 20040203,
  };

Similar to the :doc:`PDB Stream <PdbStream>`, this value always appears to be
``V80``, and no other values have been observed. It is assumed that should
another value be observed, the layout described by this document may not be
accurate.

- **HeaderSize** - ``sizeof(TpiStreamHeader)``

- **TypeIndexBegin** - The numeric value of the type index representing the
  first type record in the TPI stream. This is usually the value 0x1000 as
  type indices lower than this are reserved (see :ref:`Type Indices
  <type_indices>` for a discussion of reserved type indices).

- **TypeIndexEnd** - One greater than the numeric value of the type index
  representing the last type record in the TPI stream. The total number of
  type records in the TPI stream can be computed as ``TypeIndexEnd -
  TypeIndexBegin``.

- **TypeRecordBytes** - The number of bytes of type record data following the
  header.

- **HashStreamIndex** - The index of a stream which contains a list of hashes
  for every type record. This value may be -1 (``0xFFFF`` in this 16-bit
  field), indicating that hash information is not present. In practice a valid
  stream index is always observed, so any producer implementation should be
  prepared to emit this stream to ensure compatibility with tools which may
  expect it to be present.

- **HashAuxStreamIndex** - Presumably the index of a stream which contains a
  separate hash table, although this has not been observed in practice and it's
  unclear what it might be used for.

- **HashKeySize** - The size of a hash value (usually 4 bytes).

- **NumHashBuckets** - The number of buckets used to generate the hash values
  in the aforementioned hash streams.

- **HashValueBufferOffset / HashValueBufferLength** - The offset and size within
  the TPI Hash Stream of the list of hash values. It should be assumed that
  there are either 0 hash values, or a number equal to the number of type
  records in the TPI stream (``TypeIndexEnd - TypeIndexBegin``). Thus, if
  ``HashValueBufferLength`` is nonzero and not equal to ``(TypeIndexEnd -
  TypeIndexBegin) * HashKeySize``, we can consider the PDB malformed.

- **IndexOffsetBufferOffset / IndexOffsetBufferLength** - The offset and size
  within the TPI Hash Stream of the Type Index Offsets Buffer. This is a list
  of pairs of ``uint32_t`` values where the first value is a :ref:`Type Index
  <type_indices>` and the second value is the offset in the type record data of
  the type with this index. This can be used to do a binary search followed by
  a linear search to get O(log n) lookup by type index.

- **HashAdjBufferOffset / HashAdjBufferLength** - The offset and size within
  the TPI hash stream of a serialized hash table whose keys are the hash values
  in the hash value buffer and whose values are type indices. This appears to
  be useful in incremental linking scenarios: if a type is modified, an entry
  can be created mapping the old hash value to the new type index. A PDB file
  consumer can then always find the most up-to-date version of the type,
  without forcing the incremental linker to garbage collect and update
  references that point to the old version to now point to the new version.
  The layout of this hash table is described in :doc:`HashTable`.

.. _tpi_records:

CodeView Type Record List
=========================
Following the header, there are ``TypeRecordBytes`` bytes of data that
represent a variable length array of :doc:`CodeView type records
<CodeViewTypes>`. The number of such records (i.e. the length of the array)
can be determined by computing the value ``Header.TypeIndexEnd -
Header.TypeIndexBegin``.

O(log(n)) access is provided by way of the Type Index Offsets array (if
present) described previously.
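
As an illustration of that lookup, the following is a minimal sketch rather
than a definitive implementation. ``TypeIndexOffset`` and ``findRecordOffset``
are hypothetical names. The sketch assumes that the Type Index Offsets array
has already been loaded and is sorted by type index, and that each type record
begins with a 2-byte little-endian length counting the bytes that follow the
length field (a detail of the CodeView record format that this document does
not spell out).

.. code-block:: c++

  #include <algorithm>
  #include <cassert>
  #include <cstddef>
  #include <cstdint>
  #include <optional>
  #include <vector>

  // One entry of the Type Index Offsets Buffer: a type index and the offset
  // of its record within the type record data.
  struct TypeIndexOffset {
    uint32_t TypeIndex;
    uint32_t Offset;
  };

  // Assumption: a record starts with a 2-byte little-endian length that
  // counts the bytes following the length field.
  static uint16_t readRecordLen(const std::vector<uint8_t> &Records,
                                size_t Offset) {
    assert(Offset + 2 <= Records.size() && "length field out of bounds");
    return static_cast<uint16_t>(Records[Offset] | (Records[Offset + 1] << 8));
  }

  // Hypothetical helper: returns the byte offset of the record for Target
  // within Records, or std::nullopt if it cannot be located.
  std::optional<size_t>
  findRecordOffset(const std::vector<TypeIndexOffset> &Offsets,
                   const std::vector<uint8_t> &Records, uint32_t Target) {
    // Binary search for the first entry whose TypeIndex is greater than Target.
    auto It = std::upper_bound(Offsets.begin(), Offsets.end(), Target,
                               [](uint32_t Value, const TypeIndexOffset &E) {
                                 return Value < E.TypeIndex;
                               });
    if (It == Offsets.begin())
      return std::nullopt; // Target precedes every indexed record.
    --It;                  // Last entry with TypeIndex <= Target.

    // Linear walk from the indexed record forward to the requested one.
    size_t Offset = It->Offset;
    for (uint32_t Index = It->TypeIndex; Index < Target; ++Index) {
      if (Offset + 2 > Records.size())
        return std::nullopt; // Ran off the end of the record data.
      Offset += 2 + readRecordLen(Records, Offset);
    }
    if (Offset + 2 > Records.size())
      return std::nullopt;
    return Offset;
  }

The returned value is relative to the start of the type record data, so a
consumer would add it to the position of the first record, which follows the
header in the stream.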