# Data Layout Modeling

Data layout information allows the compiler to answer questions about how a
value of a particular type is stored in memory, for example, the size of a value
or its address alignment requirements. Among other things, it enables the
generation of various linear memory addressing schemes for containers of
abstract types and deeper reasoning about vectors.

The data layout subsystem is designed to scale to MLIR's open type and operation
system. At the top level, it consists of:

* attribute interfaces that can be implemented by concrete data layout
  specifications;
* type interfaces that should be implemented by types subject to data layout;
* operation interfaces that must be implemented by operations that can serve
  as data layout scopes (e.g., modules);
* and dialect interfaces for data layout properties unrelated to specific
  types.

Built-in types are handled specially to decrease the overall query cost.
Similarly, built-in `ModuleOp` supports data layouts without going through the
interface.

[TOC]

## Usage

### Scoping

Following MLIR's nested structure, data layout properties are _scoped_ to
regions belonging to either operations that implement the
`DataLayoutOpInterface` or `ModuleOp` operations. Such scoping operations
partially control the data layout properties and may have attributes that affect
them, typically organized in a data layout specification.

Types may have a different data layout in different scopes, including scopes
that are nested in other scopes such as modules contained in other modules. At
the same time, within a given scope, excluding any nested scope, a given type
has fixed data layout properties. Types are also expected to have a default,
"natural" data layout in case they are used outside of any operation that
provides a data layout scope for them. This ensures that data layout queries
always have a valid result.
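
The scoping rules above can be illustrated with a small standalone model. This
is only a sketch and does not use the MLIR API: `ToyScope`,
`naturalSizeInBits`, and `resolveSizeInBits` are hypothetical names, types are
identified by plain strings, and the rule that the innermost entry wins is an
assumption made for the example (MLIR leaves the combination rule to the
scoping operations and types).

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical toy model, not the MLIR API: a scope may override the size
// (in bits) of some types, identified here by name.
struct ToyScope {
  const ToyScope *parent;                      // Enclosing scope, or nullptr.
  std::map<std::string, uint64_t> sizeInBits;  // Per-type overrides.
};

// "Natural" default used when no enclosing scope mentions the type.
// The values are placeholders chosen for this example.
inline uint64_t naturalSizeInBits(const std::string &type) {
  return type == "index" ? 64 : 32;
}

// Walks from the innermost scope outwards and returns the first override.
// Assumption: the innermost entry wins. Falls back to the natural default so
// that a query always has a valid result, even without any scope.
inline uint64_t resolveSizeInBits(const ToyScope *scope,
                                  const std::string &type) {
  for (const ToyScope *s = scope; s != nullptr; s = s->parent) {
    auto it = s->sizeInBits.find(type);
    if (it != s->sizeInBits.end())
      return it->second;
  }
  return naturalSizeInBits(type);
}
```

With an outer scope that sets `index` to 32 bits and an inner scope without
entries, a query made from the inner scope sees 32 bits, whereas a query made
outside of any scope falls back to the natural 64 bits.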

### Compatibility and Transformations

The information necessary to compute layout properties can be combined from
nested scopes. For example, an outer scope can define layout properties for a
subset of types while inner scopes define them for a disjoint subset, or scopes
can progressively relax alignment requirements on a type. This mechanism is
supported by the notion of data layout _compatibility_: the layout defined in a
nested scope is expected to be compatible with that of the outer scope. MLIR
does not prescribe what compatibility means for particular ops and types but
provides hooks for them to provide target- and type-specific checks. For
example, one may want to only allow relaxation of alignment constraints (i.e.,
smaller alignment) in nested modules or, alternatively, one may require nested
modules to fully redefine all constraints of the outer scope.

Data layout compatibility is also relevant during IR transformation. Any
transformation that affects a data layout scoping operation is expected to
maintain data layout compatibility. It is the responsibility of the
transformation to ensure that this is indeed the case.

### Queries

Data layout property queries are performed on a special object, `DataLayout`,
which can be created for a given scoping operation. These objects allow one to
interface with the data layout infrastructure and query properties of given
types in the scope of the object. The signature of the `DataLayout` class is as
follows.

```c++
class DataLayout {
public:
  explicit DataLayout(DataLayoutOpInterface scope);

  llvm::TypeSize getTypeSize(Type type) const;
  llvm::TypeSize getTypeSizeInBits(Type type) const;
  uint64_t getTypeABIAlignment(Type type) const;
  uint64_t getTypePreferredAlignment(Type type) const;
  std::optional<uint64_t> getTypeIndexBitwidth(Type type) const;
};
```

The user can construct the `DataLayout` object for the scope of interest. Since
the data layout properties are fixed in the scope, they will be computed only
once upon first request and cached for further use. Therefore,
`DataLayout(op.getParentOfType<DataLayoutOpInterface>()).getTypeSize(type)` is
considered an anti-pattern since it discards the cache after use. Because of
caching, a `DataLayout` object returns valid results as long as the data layout
properties of enclosing scopes remain the same, that is, as long as none of the
ancestor operations are modified in a way that affects data layout. After such a
modification, the user is expected to create a fresh `DataLayout` object. To aid
with this, `DataLayout` asserts that the scope remains identical if MLIR is
compiled with assertions enabled.

## Custom Implementations

Extensibility of the data layout modeling is provided through a set of MLIR
[Interfaces](Interfaces.md).

### Data Layout Specifications

A data layout specification is an [attribute](LangRef.md/#attributes) that is
conceptually a collection of key-value pairs called data layout specification
_entries_. Data layout specification attributes implement the
`DataLayoutSpecInterface`, described below. Each entry is itself an attribute
that implements the `DataLayoutEntryInterface`. Entries have a key, either a
`Type` or a `StringAttr`, and a value.
Keys are used to associate entries with
specific types or dialects: when handling a data layout properties request, a
type or a dialect can only see the specification entries relevant to them and
must go through the supplied `DataLayout` object for any recursive query. This
supports and enforces better composability because types cannot (and should not)
understand layout details of other types. Entry values are arbitrary attributes,
specific to the type.

For example, a data layout specification may be an actual list of pairs with
simple custom syntax resembling the following:

```mlir
#my_dialect.layout_spec<
  #my_dialect.layout_entry<!my_dialect.type, size=42>,
  #my_dialect.layout_entry<"my_dialect.endianness", "little">,
  #my_dialect.layout_entry<!my_dialect.vector, prefer_large_alignment>>
```

The exact details of the specification and entry attributes, as well as their
syntax, are up to implementations.

We use the notion of _type class_ throughout the data layout subsystem. It
corresponds to the C++ class of the given type, e.g., `IntegerType` for built-in
integers. MLIR does not have a mechanism to represent type classes in the IR.
Instead, data layout entries contain specific _instances_ of a type class, for
example, `IntegerType{signedness=signless, bitwidth=8}` (or `i8` in the IR) or
`IntegerType{signedness=unsigned, bitwidth=32}` (or `ui32` in the IR). When
handling a data layout property query, a type class will be supplied with _all_
entries with keys belonging to this type class. For example, `IntegerType` will
see the entries for `i8`, `si16` and `ui32`, but will _not_ see those for `f32`
or `memref<?xi32>` (neither will `MemRefType` see the entry for `i32`). This
allows for type-specific "interpolation" behavior where a type class can compute
data layout properties of _any_ specific type instance given properties of other
instances.
Using integers as an example again, the alignment of an integer type
could be computed by taking that of the closest integer type from above that has
a power-of-two bitwidth.

[include "Interfaces/DataLayoutAttrInterface.md"]

### Data Layout Scoping Operations

Operations that define a scope for data layout queries, and that can be used to
create a `DataLayout` object, are expected to implement the
`DataLayoutOpInterface`. Such ops must provide at least a way of obtaining the
data layout specification. The specification need not necessarily be attached to
the operation as an attribute and may be constructed on-the-fly; it is only
fetched once per `DataLayout` object and cached. Such ops may also provide
custom handlers for data layout queries that provide results without forwarding
the queries down to specific types or post-processing the results returned by
types in target- or scope-specific ways. These custom handlers make it possible
for scoping operations to (re)define data layout properties for types without
having to modify the types themselves, e.g., when types are defined in another
dialect.

[include "Interfaces/DataLayoutOpInterface.md"]

### Types with Data Layout

Type classes that intend to handle data layout queries themselves are expected
to implement the `DataLayoutTypeInterface`. This interface provides overridable
hooks for each data layout query. Each of these hooks is supplied with the type
instance, a `DataLayout` object suitable for recursive queries, and a list of
data layout entries relevant for the type class. It is expected to provide a
valid result even if the list of entries is empty. These hooks do not have
access to the operation in the scope of which the query is handled and should
use the supplied entries instead.
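
The hook contract above, combined with the "interpolation" behavior described
for integer types, can be sketched as a standalone function. This is a minimal
model rather than an implementation of `DataLayoutTypeInterface`:
`ToyIntegerEntries`, `nextPowerOfTwo`, and `toyIntegerABIAlignment` are
hypothetical names, entries are reduced to a map from bitwidth to alignment in
bytes, and the fallback rule is a simplification chosen for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for the entries handed to an integer type class:
// bitwidth -> ABI alignment in bytes. Not the MLIR entry representation.
using ToyIntegerEntries = std::map<uint64_t, uint64_t>;

// Smallest power of two that is greater than or equal to `value`.
// Intended for the small, non-zero values used in this sketch.
inline uint64_t nextPowerOfTwo(uint64_t value) {
  assert(value != 0 && "expected a non-zero value");
  uint64_t result = 1;
  while (result < value)
    result <<= 1;
  return result;
}

// Sketch of an "interpolating" alignment hook for integers.
inline uint64_t toyIntegerABIAlignment(uint64_t bitwidth,
                                       const ToyIntegerEntries &entries) {
  // 1. An entry for exactly this bitwidth takes precedence.
  auto exact = entries.find(bitwidth);
  if (exact != entries.end())
    return exact->second;

  // 2. Otherwise, reuse the entry of the closest power-of-two bitwidth from
  //    above, if the specification provides one.
  auto rounded = entries.find(nextPowerOfTwo(bitwidth));
  if (rounded != entries.end())
    return rounded->second;

  // 3. Simplified fallback (an assumption of this sketch): the size in bytes
  //    rounded up to a power of two. This keeps the result valid even when
  //    the list of entries is empty.
  return nextPowerOfTwo((bitwidth + 7) / 8);
}
```

Given entries for 8-, 32- and 64-bit integers, a query for a 24-bit integer
reuses the 32-bit entry, and a query with no entries at all still produces a
result through the fallback.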

[include "Interfaces/DataLayoutTypeInterface.md"]

### Dialects with Data Layout Identifiers

For data layout entries that are not related to a particular type class, the key
of the entry is an identifier that belongs to some dialect. In this case, the
dialect is expected to implement the `DataLayoutDialectInterface`. This
interface provides hooks for verifying the validity of the entry value
attributes and the compatibility of nested entries.

### Bits and Bytes

Two versions of hooks are provided for sizes: in bits and in bytes. The version
in bytes has a default implementation that derives the size in bytes by rounding
up the result of division of the size in bits by 8. Types exclusively targeting
architectures with different assumptions can override this. Operations can
redefine this for all types, providing scoped versions for cases of byte sizes
other than eight without having to modify types, including built-in types.

### Query Dispatch

The overall flow of a data layout property query is as follows.

1.  The user constructs a `DataLayout` at the given scope. The constructor
    fetches the data layout specification and combines it with those of
    enclosing scopes (layouts are expected to be compatible).
2.  The user calls `DataLayout::query(Type ty)`.
3.  If `DataLayout` has a cached response, this response is returned
    immediately.
4.  Otherwise, the query is handed down by `DataLayout` to the closest layout
    scoping operation. If it implements `DataLayoutOpInterface`, then the query
    is forwarded to `DataLayoutOpInterface::query(ty, *this, relevantEntries)`
    where the relevant entries are computed as described above. If it does not
    implement `DataLayoutOpInterface`, it must be a `ModuleOp`, and the query is
    forwarded to `DataLayoutTypeInterface::query(dataLayout, relevantEntries)`
    after casting `ty` to the type interface.
5.  Unless the `query` hook is reimplemented by the op interface, the query is
    handed further down to `DataLayoutTypeInterface::query(dataLayout,
    relevantEntries)` after casting `ty` to the type interface. If the type does
    not implement the interface, an unrecoverable fatal error is produced.
6.  The type is expected to always provide the response, which is returned up
    the call stack and cached by the `DataLayout`.

## Default Implementation

The default implementation of the data layout interfaces directly handles
queries for a subset of built-in types.

### Built-in Modules

Built-in `ModuleOp` allows at most one attribute that implements
`DataLayoutSpecInterface`. It does not implement the entire interface for
efficiency and layering reasons. Instead, `DataLayout` can be constructed for
`ModuleOp` and handles modules transparently alongside other operations that
implement the interface.

### Built-in Types

The following describes the default properties of built-in types.

The size of built-in integers and floats in bytes is computed as
`ceildiv(bitwidth, 8)`. The ABI alignment of integer types with bitwidth below
64 and of the float types is their size in bytes rounded up to the closest
power of two. The ABI alignment of integer types with bitwidth 64 and above is
4 bytes (32 bits).

The size of built-in vectors is computed by first rounding their number of
elements in the _innermost_ dimension to the closest power-of-two from above,
then getting the total number of elements, and finally multiplying it with the
element size. For example, `vector<3xi32>` and `vector<4xi32>` have the same
size. So do `vector<2x3xf32>` and `vector<2x4xf32>`, but `vector<3x4xf32>` and
`vector<4x4xf32>` have different sizes.
The ABI and preferred alignments of
vector types are computed by taking the innermost dimension of the vector,
rounding it up to the closest power-of-two, taking a product of that with
element size in bytes, and rounding the result up again to the closest
power-of-two.

Note: these values are selected for consistency with the
[default data layout in LLVM](https://llvm.org/docs/LangRef.html#data-layout),
which MLIR assumed until the introduction of proper data layout modeling, and
with the
[modeling of n-D vectors](https://mlir.llvm.org/docs/Dialects/Vector/#deeperdive).
They **may change** in the future.

#### `index` type

Index type is an integer type used for target-specific size information in,
e.g., `memref` operations. Its data layout is parameterized by a single integer
data layout entry that specifies its bitwidth. For example,

```mlir
module attributes { dlti.dl_spec = #dlti.dl_spec<
  #dlti.dl_entry<index, 32>
>} {}
```

specifies that `index` has 32 bits and index computations should be performed
using 32-bit precision as well. All other layout properties of `index` match
those of the integer type with the same bitwidth defined above.

In the absence of the corresponding entry, `index` is assumed to be a 64-bit
integer.

#### `complex` type

By default, the complex type is treated like a two-element structure of its
element type. That is, each of its elements is aligned to its preferred
alignment, the entire complex type is also aligned to this preference, and the
size of the complex type includes any padding between elements needed to
enforce alignment.

### Byte Size

The default data layout assumes 8-bit bytes.
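
Under the 8-bit byte assumption, the default size and alignment rules for
built-in integers and vectors described in this section can be written down as
plain arithmetic. The following is a sketch and not MLIR code: all function
names are hypothetical, vector shapes are passed as a list of dimensions, and
the vector element size is taken in bytes, which is an assumption where the
text only says "element size".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// The default data layout assumes 8-bit bytes.
constexpr uint64_t kBitsInByte = 8;

// Smallest power of two that is greater than or equal to `value`.
inline uint64_t roundUpToPowerOfTwo(uint64_t value) {
  assert(value != 0 && "expected a non-zero value");
  uint64_t result = 1;
  while (result < value)
    result <<= 1;
  return result;
}

// Size of an integer or float in bytes: ceildiv(bitwidth, 8).
inline uint64_t defaultSizeInBytes(uint64_t bitwidth) {
  return (bitwidth + kBitsInByte - 1) / kBitsInByte;
}

// ABI alignment of integers: 4 bytes for bitwidth 64 and above, otherwise the
// size in bytes rounded up to a power of two.
inline uint64_t defaultIntegerABIAlignment(uint64_t bitwidth) {
  if (bitwidth >= 64)
    return 4;
  return roundUpToPowerOfTwo(defaultSizeInBytes(bitwidth));
}

// Vector size: round the innermost dimension up to a power of two, multiply
// by the remaining dimensions, then by the element size in bytes.
inline uint64_t defaultVectorSizeInBytes(const std::vector<uint64_t> &shape,
                                         uint64_t elementBitwidth) {
  assert(!shape.empty() && "expected at least one dimension");
  uint64_t numElements = roundUpToPowerOfTwo(shape.back());
  for (std::size_t i = 0; i + 1 < shape.size(); ++i)
    numElements *= shape[i];
  return numElements * defaultSizeInBytes(elementBitwidth);
}

// Vector alignment: rounded innermost dimension times the element size in
// bytes, rounded up again to a power of two.
inline uint64_t defaultVectorABIAlignment(const std::vector<uint64_t> &shape,
                                          uint64_t elementBitwidth) {
  assert(!shape.empty() && "expected at least one dimension");
  return roundUpToPowerOfTwo(roundUpToPowerOfTwo(shape.back()) *
                             defaultSizeInBytes(elementBitwidth));
}
```

With these rules, the shapes `{3}` and `{4}` with 32-bit elements both occupy
16 bytes, while `{3, 4}` and `{4, 4}` occupy 48 and 64 bytes respectively,
matching the vector examples given earlier.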

### DLTI Dialect

The [DLTI](../Dialects/DLTIDialect/) dialect provides the attributes implementing
`DataLayoutSpecInterface` and `DataLayoutEntryInterface`, as well as a dialect
attribute that can be used to attach the specification to a given operation. The
verifier of this attribute triggers those of the specification and checks the
compatibility of nested specifications.