1# MLIR Bytecode Format 2 3This document describes the MLIR bytecode format and its encoding. 4This format is versioned and stable: we don't plan to ever break 5compatibility, that is a dialect should be able to deserialize any 6older bytecode. Similarly, we support back-deployment so that an 7older version of the format can be targetted. 8 9That said, it is important to realize that the promises of the 10bytecode format are made assuming immutable dialects: the format 11allows backward and forward compatibility, but only when nothing 12in a dialect changes (operations, types, attributes definitions). 13 14A dialect can opt-in to handle its own versioning through the 15`BytecodeDialectInterface`. Some hooks are exposed to the dialect 16to allow managing a version encoded into the bytecode file. The 17version is loaded lazily and allows to retrieve the version 18information while decoding the input IR, and gives an opportunity 19to each dialect for which a version is present to perform IR 20upgrades post-parsing through the `upgradeFromVersion` method. 21There is no restriction on what kind of information a dialect 22is allowed to encode to model its versioning. 23 24[TOC] 25 26## Magic Number 27 28MLIR uses the following four-byte magic number to 29indicate bytecode files: 30 31'\[‘M’<sub>8</sub>, ‘L’<sub>8</sub>, ‘ï’<sub>8</sub>, ‘R’<sub>8</sub>\]' 32 33In hex: 34 35'\[‘4D’<sub>8</sub>, ‘4C’<sub>8</sub>, ‘EF’<sub>8</sub>, ‘52’<sub>8</sub>\]' 36 37## Format Overview 38 39An MLIR Bytecode file is comprised of a byte stream, with a few simple 40structural concepts layered on top. 41 42### Primitives 43 44#### Fixed-Width Integers 45 46``` 47 byte ::= `0x00`...`0xFF` 48``` 49 50Fixed width integers are unsigned integers of a known byte size. The values are 51stored in little-endian byte order. 52 53TODO: Add larger fixed width integers as necessary. 54 55#### Variable-Width Integers 56 57Variable width integers, or `VarInt`s, provide a compact representation for 58integers. Each encoded VarInt consists of one to nine bytes, which together 59represent a single 64-bit value. The MLIR bytecode utilizes the "PrefixVarInt" 60encoding for VarInts. This encoding is a variant of the 61[LEB128 ("Little-Endian Base 128")](https://en.wikipedia.org/wiki/LEB128) 62encoding, where each byte of the encoding provides up to 7 bits for the value, 63with the remaining bit used to store a tag indicating the number of bytes used 64for the encoding. This means that small unsigned integers (less than 2^7) may be 65stored in one byte, unsigned integers up to 2^14 may be stored in two bytes, 66etc. 67 68The first byte of the encoding includes a length prefix in the low bits. This 69prefix is a bit sequence of '0's followed by a terminal '1', or the end of the 70byte. The number of '0' bits indicate the number of _additional_ bytes, not 71including the prefix byte, used to encode the value. All of the remaining bits 72in the first byte, along with all of the bits in the additional bytes, provide 73the value of the integer. Below are the various possible encodings of the prefix 74byte: 75 76``` 77xxxxxxx1: 7 value bits, the encoding uses 1 byte 78xxxxxx10: 14 value bits, the encoding uses 2 bytes 79xxxxx100: 21 value bits, the encoding uses 3 bytes 80xxxx1000: 28 value bits, the encoding uses 4 bytes 81xxx10000: 35 value bits, the encoding uses 5 bytes 82xx100000: 42 value bits, the encoding uses 6 bytes 83x1000000: 49 value bits, the encoding uses 7 bytes 8410000000: 56 value bits, the encoding uses 8 bytes 8500000000: 64 value bits, the encoding uses 9 bytes 86``` 87 88##### Signed Variable-Width Integers 89 90Signed variable width integer values are encoded in a similar fashion to 91[varints](#variable-width-integers), but employ 92[zigzag encoding](https://en.wikipedia.org/wiki/Variable-length_quantity#Zigzag_encoding). 93This encoding uses the low bit of the value to indicate the sign, which allows 94for more efficiently encoding negative numbers. If a negative value were encoded 95using a normal [varint](#variable-width-integers), it would be treated as an 96extremely large unsigned value. Using zigzag encoding allows for a smaller 97number of active bits in the value, leading to a smaller encoding. Below is the 98basic computation for generating a zigzag encoding: 99 100``` 101(value << 1) ^ (value >> 63) 102``` 103 104#### Strings 105 106Strings are blobs of characters with an associated length. 107 108### Sections 109 110``` 111section { 112 idAndIsAligned: byte // id | (hasAlign << 7) 113 length: varint, 114 115 alignment: varint?, 116 padding: byte[], // Padding bytes are always `0xCB`. 117 118 data: byte[] 119} 120``` 121 122Sections are a mechanism for grouping data within the bytecode. They enable 123delayed processing, which is useful for out-of-order processing of data, 124lazy-loading, and more. Each section contains a Section ID, whose high bit 125indicates if the section has alignment requirements, a length (which allows for 126skipping over the section), and an optional alignment. When an alignment is 127present, a variable number of padding bytes (0xCB) may appear before the section 128data. The alignment of a section must be a power of 2. 129 130## MLIR Encoding 131 132Given the generic structure of MLIR, the bytecode encoding is actually fairly 133simplistic. It effectively maps to the core components of MLIR. 134 135### Top Level Structure 136 137The top-level structure of the bytecode contains the 4-byte "magic number", a 138version number, a null-terminated producer string, and a list of sections. Each 139section is currently only expected to appear once within a bytecode file. 140 141``` 142bytecode { 143 magic: "MLïR", 144 version: varint, 145 producer: string, 146 sections: section[] 147} 148``` 149 150### String Section 151 152``` 153strings { 154 numStrings: varint, 155 reverseStringLengths: varint[], 156 stringData: byte[] 157} 158``` 159 160The string section contains a table of strings referenced within the bytecode, 161more easily enabling string sharing. This section is encoded first with the 162total number of strings, followed by the sizes of each of the individual strings 163in reverse order. The remaining encoding contains a single blob containing all 164of the strings concatenated together. 165 166### Dialect Section 167 168The dialect section of the bytecode contains all of the dialects referenced 169within the encoded IR, and some information about the components of those 170dialects that were also referenced. 171 172``` 173dialect_section { 174 numDialects: varint, 175 dialectNames: dialect_name_group[], 176 opNames: dialect_ops_group[] // ops grouped by dialect 177} 178 179dialect_name_group { 180 nameAndIsVersioned: varint // (dialectID << 1) | (hasVersion), 181 version: dialect_version_section // only if versioned 182} 183 184dialect_version_section { 185 size: varint, 186 version: byte[] 187} 188 189dialect_ops_group { 190 dialect: varint, 191 numOpNames: varint, 192 opNames: op_name_group[] 193} 194 195op_name_group { 196 nameAndIsRegistered: varint // (nameID << 1) | (isRegisteredOp) 197} 198``` 199 200Dialects are encoded as a `varint` containing the index to the name string 201within the string section, plus a flag indicating whether the dialect is 202versioned. Operation names are encoded in groups by dialect, with each group 203containing the dialect, the number of operation names, and the array of indexes 204to each name within the string section. The version is encoded as a nested 205section for each dialect. 206 207### Attribute/Type Sections 208 209Attributes and types are encoded using two [sections](#sections), one section 210(`attr_type_section`) containing the actual encoded representation, and another 211section (`attr_type_offset_section`) containing the offsets of each encoded 212attribute/type into the previous section. This structure allows for attributes 213and types to always be lazily loaded on demand. 214 215``` 216attr_type_section { 217 attrs: attribute[], 218 types: type[] 219} 220attr_type_offset_section { 221 numAttrs: varint, 222 numTypes: varint, 223 offsets: attr_type_offset_group[] 224} 225 226attr_type_offset_group { 227 dialect: varint, 228 numElements: varint, 229 offsets: varint[] // (offset << 1) | (hasCustomEncoding) 230} 231 232attribute { 233 encoding: ... 234} 235type { 236 encoding: ... 237} 238``` 239 240Each `offset` in the `attr_type_offset_section` above is the size of the 241encoding for the attribute or type and a flag indicating if the encoding uses 242the textual assembly format, or a custom bytecode encoding. We avoid using the 243direct offset into the `attr_type_section`, as a smaller relative offsets 244provides more effective compression. Attributes and types are grouped by 245dialect, with each `attr_type_offset_group` in the offset section containing the 246corresponding parent dialect, number of elements, and offsets for each element 247within the group. 248 249#### Attribute/Type Encodings 250 251In the abstract, an attribute/type is encoded in one of two possible ways: via 252its assembly format, or via a custom dialect defined encoding. 253 254##### Assembly Format Fallback 255 256In the case where a dialect does not define a method for encoding the attribute 257or type, the textual assembly format of that attribute or type is used as a 258fallback. For example, a type `!bytecode.type<42>` would be encoded as the null 259terminated string "!bytecode.type<42>". This ensures that every attribute and 260type can be encoded, even if the owning dialect has not yet opted in to a more 261efficient serialization. 262 263TODO: We shouldn't redundantly encode the dialect name here, we should use a 264reference to the parent dialect instead. 265 266##### Dialect Defined Encoding 267 268As an alternative to the assembly format fallback, dialects may also provide a 269custom encoding for their attributes and types. Custom encodings are very 270beneficial in that they are significantly smaller and faster to read and write. 271 272Dialects can opt-in to providing custom encodings by implementing the 273`BytecodeDialectInterface`. This interface provides hooks, namely 274`readAttribute`/`readType` and `writeAttribute`/`writeType`, that will be used 275by the bytecode reader and writer. These hooks are provided a reader and writer 276implementation that can be used to encode various constructs in the underlying 277bytecode format. A unique feature of this interface is that dialects may choose 278to only encode a subset of their attributes and types in a custom bytecode 279format, which can simplify adding new or experimental components that aren't 280fully baked. 281 282When implementing the bytecode interface, dialects are responsible for all 283aspects of the encoding. This includes the indicator for which kind of attribute 284or type is being encoded; the bytecode reader will only know that it has 285encountered an attribute or type of a given dialect, it doesn't encode any 286further information. As such, a common encoding idiom is to use a leading 287`varint` code to indicate how the attribute or type was encoded. 288 289### Resource Section 290 291Resources are encoded using two [sections](#sections), one section 292(`resource_section`) containing the actual encoded representation, and another 293section (`resource_offset_section`) containing the offsets of each encoded 294resource into the previous section. 295 296``` 297resource_section { 298 resources: resource[] 299} 300resource { 301 value: resource_bool | resource_string | resource_blob 302} 303resource_bool { 304 value: byte 305} 306resource_string { 307 value: varint 308} 309resource_blob { 310 alignment: varint, 311 size: varint, 312 padding: byte[], 313 blob: byte[] 314} 315 316resource_offset_section { 317 numExternalResourceGroups: varint, 318 resourceGroups: resource_group[] 319} 320resource_group { 321 key: varint, 322 numResources: varint, 323 resources: resource_info[] 324} 325resource_info { 326 key: varint, 327 size: varint 328 kind: byte, 329} 330``` 331 332Resources are grouped by the provider, either an external entity or a dialect, 333with each `resource_group` in the offset section containing the corresponding 334provider, number of elements, and info for each element within the group. For 335each element, we record the key, the value kind, and the encoded size. We avoid 336using the direct offset into the `resource_section`, as a smaller relative 337offsets provides more effective compression. 338 339### IR Section 340 341The IR section contains the encoded form of operations within the bytecode. 342 343``` 344ir_section { 345 block: block; // Single block without arguments. 346} 347``` 348 349#### Operation Encoding 350 351``` 352op { 353 name: varint, 354 encodingMask: byte, 355 location: varint, 356 357 attrDict: varint?, 358 359 numResults: varint?, 360 resultTypes: varint[], 361 362 numOperands: varint?, 363 operands: varint[], 364 365 numSuccessors: varint?, 366 successors: varint[], 367 368 numUseListOrders: varint?, 369 useListOrders: uselist[], 370 371 regionEncoding: varint?, // (numRegions << 1) | (isIsolatedFromAbove) 372 373 // regions are stored in a section if isIsolatedFromAbove 374 regions: (region | region_section)[] 375} 376 377uselist { 378 indexInRange: varint?, 379 useListEncoding: varint, // (numIndices << 1) | (isIndexPairEncoding) 380 indices: varint[] 381} 382``` 383 384The encoding of an operation is important because this is generally the most 385commonly appearing structure in the bytecode. A single encoding is used for 386every type of operation. Given this prevalence, many of the fields of an 387operation are optional. The `encodingMask` field is a bitmask which indicates 388which of the components of the operation are present. 389 390##### Location 391 392The location is encoded as the index of the location within the attribute table. 393 394##### Attributes 395 396If the operation has attribues, the index of the operation attribute dictionary 397within the attribute table is encoded. 398 399##### Results 400 401If the operation has results, the number of results and the indexes of the 402result types within the type table are encoded. 403 404##### Operands 405 406If the operation has operands, the number of operands and the value index of 407each operand is encoded. This value index is the relative ordering of the 408definition of that value from the start of the first ancestor isolated region. 409 410##### Successors 411 412If the operation has successors, the number of successors and the indexes of the 413successor blocks within the parent region are encoded. 414 415##### Use-list orders 416 417The reference use-list order is assumed to be the reverse of the global 418enumeration of all the op operands that one would obtain with a pre-order walk 419of the IR. This order is naturally obtained by building blocks of operations 420op-by-op. However, some transformations may shuffle the use-lists with respect 421to this reference ordering. If any of the results of the operation have a 422use-list order that is not sorted with respect to the reference use-list order, 423an encoding is emitted such that it is possible to reconstruct such order after 424parsing the bytecode. The encoding represents an index map from the reference 425operand order to the current use-list order. A bit flag is used to detect if 426this encoding is of type index-pair or not. When the bit flag is set to zero, 427the element at `i` represent the position of the use `i` of the reference list 428into the current use-list. When the bit flag is set to `1`, the encoding 429represent index pairs `(i, j)`, which indicate that the use at position `i` of 430the reference list is mapped to position `j` in the current use-list. When only 431less than half of the elements in the current use-list are shuffled with respect 432to the reference use-list, the index-pair encoding is used to reduce the 433bytecode memory requirements. 434 435##### Regions 436 437If the operation has regions, the number of regions and if the regions are 438isolated from above are encoded together in a single varint. Afterwards, each 439region is encoded inline. 440 441#### Region Encoding 442 443``` 444region { 445 numBlocks: varint, 446 447 numValues: varint?, 448 blocks: block[] 449} 450``` 451 452A region is encoded first with the number of blocks within. If the region is 453non-empty, the number of values defined directly within the region are encoded, 454followed by the blocks of the region. 455 456#### Block Encoding 457 458``` 459block { 460 encoding: varint, // (numOps << 1) | (hasBlockArgs) 461 arguments: block_arguments?, // Optional based on encoding 462 ops : op[] 463} 464 465block_arguments { 466 numArgs: varint?, 467 args: block_argument[] 468 numUseListOrders: varint?, 469 useListOrders: uselist[], 470} 471 472block_argument { 473 typeAndLocation: varint, // (type << 1) | (hasLocation) 474 location: varint? // Optional, else unknown location 475} 476``` 477 478A block is encoded with an array of operations and block arguments. The first 479field is an encoding that combines the number of operations in the block, with a 480flag indicating if the block has arguments. 481 482Use-list orders are attached to block arguments similarly to how they are 483attached to operation results. 484