xref: /llvm-project/mlir/docs/BytecodeFormat.md (revision 5795f9e27390fe5c322853476b3c9ba8376e8541)
1# MLIR Bytecode Format
2
3This document describes the MLIR bytecode format and its encoding.
4This format is versioned and stable: we don't plan to ever break
5compatibility, that is a dialect should be able to deserialize any
6older bytecode. Similarly, we support back-deployment so that an
7older version of the format can be targetted.
8
9That said, it is important to realize that the promises of the
10bytecode format are made assuming immutable dialects: the format
11allows backward and forward compatibility, but only when nothing
12in a dialect changes (operations, types, attributes definitions).
13
14A dialect can opt-in to handle its own versioning through the
15`BytecodeDialectInterface`. Some hooks are exposed to the dialect
16to allow managing a version encoded into the bytecode file. The
17version is loaded lazily and allows to retrieve the version
18information while decoding the input IR, and gives an opportunity
19to each dialect for which a version is present to perform IR
20upgrades post-parsing through the `upgradeFromVersion` method.
21There is no restriction on what kind of information a dialect
22is allowed to encode to model its versioning.
23
24[TOC]
25
26## Magic Number
27
28MLIR uses the following four-byte magic number to
29indicate bytecode files:
30
31'\[‘M’<sub>8</sub>, ‘L’<sub>8</sub>, ‘ï’<sub>8</sub>, ‘R’<sub>8</sub>\]'
32
33In hex:
34
35'\[‘4D’<sub>8</sub>, ‘4C’<sub>8</sub>, ‘EF’<sub>8</sub>, ‘52’<sub>8</sub>\]'
36
37## Format Overview
38
39An MLIR Bytecode file is comprised of a byte stream, with a few simple
40structural concepts layered on top.
41
42### Primitives
43
44#### Fixed-Width Integers
45
46```
47  byte ::= `0x00`...`0xFF`
48```
49
50Fixed width integers are unsigned integers of a known byte size. The values are
51stored in little-endian byte order.
52
53TODO: Add larger fixed width integers as necessary.
54
55#### Variable-Width Integers
56
57Variable width integers, or `VarInt`s, provide a compact representation for
58integers. Each encoded VarInt consists of one to nine bytes, which together
59represent a single 64-bit value. The MLIR bytecode utilizes the "PrefixVarInt"
60encoding for VarInts. This encoding is a variant of the
61[LEB128 ("Little-Endian Base 128")](https://en.wikipedia.org/wiki/LEB128)
62encoding, where each byte of the encoding provides up to 7 bits for the value,
63with the remaining bit used to store a tag indicating the number of bytes used
64for the encoding. This means that small unsigned integers (less than 2^7) may be
65stored in one byte, unsigned integers up to 2^14 may be stored in two bytes,
66etc.
67
68The first byte of the encoding includes a length prefix in the low bits. This
69prefix is a bit sequence of '0's followed by a terminal '1', or the end of the
70byte. The number of '0' bits indicate the number of _additional_ bytes, not
71including the prefix byte, used to encode the value. All of the remaining bits
72in the first byte, along with all of the bits in the additional bytes, provide
73the value of the integer. Below are the various possible encodings of the prefix
74byte:
75
76```
77xxxxxxx1:  7 value bits, the encoding uses 1 byte
78xxxxxx10: 14 value bits, the encoding uses 2 bytes
79xxxxx100: 21 value bits, the encoding uses 3 bytes
80xxxx1000: 28 value bits, the encoding uses 4 bytes
81xxx10000: 35 value bits, the encoding uses 5 bytes
82xx100000: 42 value bits, the encoding uses 6 bytes
83x1000000: 49 value bits, the encoding uses 7 bytes
8410000000: 56 value bits, the encoding uses 8 bytes
8500000000: 64 value bits, the encoding uses 9 bytes
86```
87
88##### Signed Variable-Width Integers
89
90Signed variable width integer values are encoded in a similar fashion to
91[varints](#variable-width-integers), but employ
92[zigzag encoding](https://en.wikipedia.org/wiki/Variable-length_quantity#Zigzag_encoding).
93This encoding uses the low bit of the value to indicate the sign, which allows
94for more efficiently encoding negative numbers. If a negative value were encoded
95using a normal [varint](#variable-width-integers), it would be treated as an
96extremely large unsigned value. Using zigzag encoding allows for a smaller
97number of active bits in the value, leading to a smaller encoding. Below is the
98basic computation for generating a zigzag encoding:
99
100```
101(value << 1) ^ (value >> 63)
102```
103
104#### Strings
105
106Strings are blobs of characters with an associated length.
107
108### Sections
109
110```
111section {
112  idAndIsAligned: byte // id | (hasAlign << 7)
113  length: varint,
114
115  alignment: varint?,
116  padding: byte[], // Padding bytes are always `0xCB`.
117
118  data: byte[]
119}
120```
121
122Sections are a mechanism for grouping data within the bytecode. They enable
123delayed processing, which is useful for out-of-order processing of data,
124lazy-loading, and more. Each section contains a Section ID, whose high bit
125indicates if the section has alignment requirements, a length (which allows for
126skipping over the section), and an optional alignment. When an alignment is
127present, a variable number of padding bytes (0xCB) may appear before the section
128data. The alignment of a section must be a power of 2.
129
130## MLIR Encoding
131
132Given the generic structure of MLIR, the bytecode encoding is actually fairly
133simplistic. It effectively maps to the core components of MLIR.
134
135### Top Level Structure
136
137The top-level structure of the bytecode contains the 4-byte "magic number", a
138version number, a null-terminated producer string, and a list of sections. Each
139section is currently only expected to appear once within a bytecode file.
140
141```
142bytecode {
143  magic: "MLïR",
144  version: varint,
145  producer: string,
146  sections: section[]
147}
148```
149
150### String Section
151
152```
153strings {
154  numStrings: varint,
155  reverseStringLengths: varint[],
156  stringData: byte[]
157}
158```
159
160The string section contains a table of strings referenced within the bytecode,
161more easily enabling string sharing. This section is encoded first with the
162total number of strings, followed by the sizes of each of the individual strings
163in reverse order. The remaining encoding contains a single blob containing all
164of the strings concatenated together.
165
166### Dialect Section
167
168The dialect section of the bytecode contains all of the dialects referenced
169within the encoded IR, and some information about the components of those
170dialects that were also referenced.
171
172```
173dialect_section {
174  numDialects: varint,
175  dialectNames: dialect_name_group[],
176  opNames: dialect_ops_group[]  // ops grouped by dialect
177}
178
179dialect_name_group {
180  nameAndIsVersioned: varint  // (dialectID << 1) | (hasVersion),
181  version: dialect_version_section  // only if versioned
182}
183
184dialect_version_section {
185  size: varint,
186  version: byte[]
187}
188
189dialect_ops_group {
190  dialect: varint,
191  numOpNames: varint,
192  opNames: op_name_group[]
193}
194
195op_name_group {
196  nameAndIsRegistered: varint  // (nameID << 1) | (isRegisteredOp)
197}
198```
199
200Dialects are encoded as a `varint` containing the index to the name string
201within the string section, plus a flag indicating whether the dialect is
202versioned. Operation names are encoded in groups by dialect, with each group
203containing the dialect, the number of operation names, and the array of indexes
204to each name within the string section. The version is encoded as a nested
205section for each dialect.
206
207### Attribute/Type Sections
208
209Attributes and types are encoded using two [sections](#sections), one section
210(`attr_type_section`) containing the actual encoded representation, and another
211section (`attr_type_offset_section`) containing the offsets of each encoded
212attribute/type into the previous section. This structure allows for attributes
213and types to always be lazily loaded on demand.
214
215```
216attr_type_section {
217  attrs: attribute[],
218  types: type[]
219}
220attr_type_offset_section {
221  numAttrs: varint,
222  numTypes: varint,
223  offsets: attr_type_offset_group[]
224}
225
226attr_type_offset_group {
227  dialect: varint,
228  numElements: varint,
229  offsets: varint[] // (offset << 1) | (hasCustomEncoding)
230}
231
232attribute {
233  encoding: ...
234}
235type {
236  encoding: ...
237}
238```
239
240Each `offset` in the `attr_type_offset_section` above is the size of the
241encoding for the attribute or type and a flag indicating if the encoding uses
242the textual assembly format, or a custom bytecode encoding. We avoid using the
243direct offset into the `attr_type_section`, as a smaller relative offsets
244provides more effective compression. Attributes and types are grouped by
245dialect, with each `attr_type_offset_group` in the offset section containing the
246corresponding parent dialect, number of elements, and offsets for each element
247within the group.
248
249#### Attribute/Type Encodings
250
251In the abstract, an attribute/type is encoded in one of two possible ways: via
252its assembly format, or via a custom dialect defined encoding.
253
254##### Assembly Format Fallback
255
256In the case where a dialect does not define a method for encoding the attribute
257or type, the textual assembly format of that attribute or type is used as a
258fallback. For example, a type `!bytecode.type<42>` would be encoded as the null
259terminated string "!bytecode.type<42>". This ensures that every attribute and
260type can be encoded, even if the owning dialect has not yet opted in to a more
261efficient serialization.
262
263TODO: We shouldn't redundantly encode the dialect name here, we should use a
264reference to the parent dialect instead.
265
266##### Dialect Defined Encoding
267
268As an alternative to the assembly format fallback, dialects may also provide a
269custom encoding for their attributes and types. Custom encodings are very
270beneficial in that they are significantly smaller and faster to read and write.
271
272Dialects can opt-in to providing custom encodings by implementing the
273`BytecodeDialectInterface`. This interface provides hooks, namely
274`readAttribute`/`readType` and `writeAttribute`/`writeType`, that will be used
275by the bytecode reader and writer. These hooks are provided a reader and writer
276implementation that can be used to encode various constructs in the underlying
277bytecode format. A unique feature of this interface is that dialects may choose
278to only encode a subset of their attributes and types in a custom bytecode
279format, which can simplify adding new or experimental components that aren't
280fully baked.
281
282When implementing the bytecode interface, dialects are responsible for all
283aspects of the encoding. This includes the indicator for which kind of attribute
284or type is being encoded; the bytecode reader will only know that it has
285encountered an attribute or type of a given dialect, it doesn't encode any
286further information. As such, a common encoding idiom is to use a leading
287`varint` code to indicate how the attribute or type was encoded.
288
289### Resource Section
290
291Resources are encoded using two [sections](#sections), one section
292(`resource_section`) containing the actual encoded representation, and another
293section (`resource_offset_section`) containing the offsets of each encoded
294resource into the previous section.
295
296```
297resource_section {
298  resources: resource[]
299}
300resource {
301  value: resource_bool | resource_string | resource_blob
302}
303resource_bool {
304  value: byte
305}
306resource_string {
307  value: varint
308}
309resource_blob {
310  alignment: varint,
311  size: varint,
312  padding: byte[],
313  blob: byte[]
314}
315
316resource_offset_section {
317  numExternalResourceGroups: varint,
318  resourceGroups: resource_group[]
319}
320resource_group {
321  key: varint,
322  numResources: varint,
323  resources: resource_info[]
324}
325resource_info {
326  key: varint,
327  size: varint
328  kind: byte,
329}
330```
331
332Resources are grouped by the provider, either an external entity or a dialect,
333with each `resource_group` in the offset section containing the corresponding
334provider, number of elements, and info for each element within the group. For
335each element, we record the key, the value kind, and the encoded size. We avoid
336using the direct offset into the `resource_section`, as a smaller relative
337offsets provides more effective compression.
338
339### IR Section
340
341The IR section contains the encoded form of operations within the bytecode.
342
343```
344ir_section {
345  block: block; // Single block without arguments.
346}
347```
348
349#### Operation Encoding
350
351```
352op {
353  name: varint,
354  encodingMask: byte,
355  location: varint,
356
357  attrDict: varint?,
358
359  numResults: varint?,
360  resultTypes: varint[],
361
362  numOperands: varint?,
363  operands: varint[],
364
365  numSuccessors: varint?,
366  successors: varint[],
367
368  numUseListOrders: varint?,
369  useListOrders: uselist[],
370
371  regionEncoding: varint?, // (numRegions << 1) | (isIsolatedFromAbove)
372
373  // regions are stored in a section if isIsolatedFromAbove
374  regions: (region | region_section)[]
375}
376
377uselist {
378  indexInRange: varint?,
379  useListEncoding: varint, // (numIndices << 1) | (isIndexPairEncoding)
380  indices: varint[]
381}
382```
383
384The encoding of an operation is important because this is generally the most
385commonly appearing structure in the bytecode. A single encoding is used for
386every type of operation. Given this prevalence, many of the fields of an
387operation are optional. The `encodingMask` field is a bitmask which indicates
388which of the components of the operation are present.
389
390##### Location
391
392The location is encoded as the index of the location within the attribute table.
393
394##### Attributes
395
396If the operation has attribues, the index of the operation attribute dictionary
397within the attribute table is encoded.
398
399##### Results
400
401If the operation has results, the number of results and the indexes of the
402result types within the type table are encoded.
403
404##### Operands
405
406If the operation has operands, the number of operands and the value index of
407each operand is encoded. This value index is the relative ordering of the
408definition of that value from the start of the first ancestor isolated region.
409
410##### Successors
411
412If the operation has successors, the number of successors and the indexes of the
413successor blocks within the parent region are encoded.
414
415##### Use-list orders
416
417The reference use-list order is assumed to be the reverse of the global
418enumeration of all the op operands that one would obtain with a pre-order walk
419of the IR. This order is naturally obtained by building blocks of operations
420op-by-op. However, some transformations may shuffle the use-lists with respect
421to this reference ordering. If any of the results of the operation have a
422use-list order that is not sorted with respect to the reference use-list order,
423an encoding is emitted such that it is possible to reconstruct such order after
424parsing the bytecode. The encoding represents an index map from the reference
425operand order to the current use-list order. A bit flag is used to detect if
426this encoding is of type index-pair or not. When the bit flag is set to zero,
427the element at `i` represent the position of the use `i` of the reference list
428into the current use-list. When the bit flag is set to `1`, the encoding
429represent index pairs `(i, j)`, which indicate that the use at position `i` of
430the reference list is mapped to position `j` in the current use-list. When only
431less than half of the elements in the current use-list are shuffled with respect
432to the reference use-list, the index-pair encoding is used to reduce the
433bytecode memory requirements.
434
435##### Regions
436
437If the operation has regions, the number of regions and if the regions are
438isolated from above are encoded together in a single varint. Afterwards, each
439region is encoded inline.
440
441#### Region Encoding
442
443```
444region {
445  numBlocks: varint,
446
447  numValues: varint?,
448  blocks: block[]
449}
450```
451
452A region is encoded first with the number of blocks within. If the region is
453non-empty, the number of values defined directly within the region are encoded,
454followed by the blocks of the region.
455
456#### Block Encoding
457
458```
459block {
460  encoding: varint, // (numOps << 1) | (hasBlockArgs)
461  arguments: block_arguments?, // Optional based on encoding
462  ops : op[]
463}
464
465block_arguments {
466  numArgs: varint?,
467  args: block_argument[]
468  numUseListOrders: varint?,
469  useListOrders: uselist[],
470}
471
472block_argument {
473  typeAndLocation: varint, // (type << 1) | (hasLocation)
474  location: varint? // Optional, else unknown location
475}
476```
477
478A block is encoded with an array of operations and block arguments. The first
479field is an encoding that combines the number of operations in the block, with a
480flag indicating if the block has arguments.
481
482Use-list orders are attached to block arguments similarly to how they are
483attached to operation results.
484