# LLVM IR Target

This document describes the mechanisms of producing LLVM IR from MLIR. The
overall flow is two-stage:

1.  **conversion** of the IR to a set of dialects translatable to LLVM IR, for
    example [LLVM Dialect](Dialects/LLVM.md) or one of the hardware-specific
    dialects derived from LLVM IR intrinsics such as [AMX](Dialects/AMX.md),
    [X86Vector](Dialects/X86Vector.md) or [ArmNeon](Dialects/ArmNeon.md);
2.  **translation** of MLIR dialects to LLVM IR.

This flow allows the non-trivial transformations to be performed within MLIR
using MLIR APIs and makes the translation between MLIR and LLVM IR *simple* and
potentially bidirectional. As a corollary, dialect ops translatable to LLVM IR
are expected to closely match the corresponding LLVM IR instructions and
intrinsics. This minimizes the dependency on LLVM IR libraries in MLIR as well
as reduces the churn in case of changes.

Note that conversions to the LLVM dialect are provided for many different
dialects as separate sets of patterns, exposed as distinct passes in
`mlir-opt`. Running these passes individually is primarily useful for testing
and prototyping; using the collection of patterns together in a single
conversion is highly recommended. One place where this is important and visible
is the ControlFlow dialect's branching operations, which fail to apply if their
operand types do not match the argument types of the blocks they jump to in the
parent op.

SPIR-V to LLVM dialect conversion has a
[dedicated document](SPIRVToLLVMDialectConversion.md).

[TOC]

## Conversion to the LLVM Dialect

Conversion to the LLVM dialect from other dialects is the first step to produce
LLVM IR. All non-trivial IR modifications are expected to happen at this stage
or before. The conversion is *progressive*: most passes convert one dialect to
the LLVM dialect and keep operations from other dialects intact. For example,
the `-finalize-memref-to-llvm` pass will only convert operations from the
`memref` dialect but will not convert operations from other dialects even if
they use or produce `memref`-typed values.

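For example, a module mixing `memref` and `func` operations can be lowered by
chaining such passes; the pipeline below is one plausible combination, with the
exact set of passes depending on the dialects present in the input:

```
  mlir-opt input.mlir \
    --finalize-memref-to-llvm \
    --convert-func-to-llvm \
    --reconcile-unrealized-casts
```
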
The process relies on the [Dialect Conversion](DialectConversion.md)
infrastructure and, in particular, on the
[materialization](DialectConversion.md/#type-conversion) hooks of `TypeConverter`
to support progressive lowering by injecting `unrealized_conversion_cast`
operations between converted and unconverted operations. After multiple partial
conversions to the LLVM dialect are performed, the cast operations that have
become no-ops can be removed by the `-reconcile-unrealized-casts` pass. The
latter pass is not specific to the LLVM dialect and can remove any no-op casts.

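As a schematic illustration (the `converted.*` operation names are
hypothetical, and `index` is assumed to convert to `i64`), a value flowing
between converted and unconverted operations goes through materialized casts
that later cancel out:

```mlir
// After one partial conversion, a converted producer yields i64 while an
// unconverted consumer still expects index; a cast is materialized in between.
%0 = "converted.producer"() : () -> i64
%1 = builtin.unrealized_conversion_cast %0 : i64 to index
// A later partial conversion rewrites the consumer to take i64 and
// materializes the inverse cast.
%2 = builtin.unrealized_conversion_cast %1 : index to i64
"converted.consumer"(%2) : (i64) -> ()
// -reconcile-unrealized-casts removes the now-noop %1/%2 pair so that
// "converted.consumer" uses %0 directly.
```
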
### Conversion of Built-in Types

Built-in types have a default conversion to LLVM dialect types provided by the
`LLVMTypeConverter` class. Users targeting the LLVM dialect can reuse and extend
this type converter to support other types. Extra care must be taken if the
conversion rules for built-in types are overridden: all conversions must use the
same type converter.

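For instance, a lowering might extend the default rules with a conversion for a
custom type; a minimal sketch, assuming a hypothetical `MyType` that lowers to
a pair of 64-bit integers:

```cpp
#include "mlir/Conversion/LLVMCommon/TypeConverter.h"
#include "mlir/Dialect/LLVMIR/LLVMTypes.h"

void addMyTypeConversion(mlir::LLVMTypeConverter &converter) {
  // Hypothetical rule: lower `MyType` to a literal LLVM struct of two i64s.
  // Conversions registered later take precedence, so this extends rather
  // than replaces the built-in rules.
  converter.addConversion([](MyType type) -> mlir::Type {
    auto i64Type = mlir::IntegerType::get(type.getContext(), 64);
    return mlir::LLVM::LLVMStructType::getLiteral(type.getContext(),
                                                  {i64Type, i64Type});
  });
}
```
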
#### LLVM Dialect-compatible Types

The types [compatible](Dialects/LLVM.md/#built-in-type-compatibility) with the
LLVM dialect are kept as is.

#### Complex Type

Complex type is converted into an LLVM dialect literal structure type with two
elements:

-   real part;
-   imaginary part.

The element type is converted recursively using these rules.

Example:

```mlir
  complex<f32>
  // ->
  !llvm.struct<(f32, f32)>
```

#### Index Type

Index type is converted into an LLVM dialect integer type with the bitwidth
specified by the [data layout](DataLayout.md) of the closest module. For
example, on x86-64 CPUs it converts to `i64`. This behavior can be overridden
by the type converter configuration, which is often exposed as a pass option by
conversion passes.

Example:

```mlir
  index
  // -> on x86_64
  i64
```

#### Ranked MemRef Types

Ranked memref types are converted into an LLVM dialect literal structure type
that contains the dynamic information associated with the memref object,
referred to as *descriptor*. Only memrefs in the
**[strided form](Dialects/Builtin.md/#strided-memref)** can be converted to the
LLVM dialect with the default descriptor format. Memrefs with other, less
trivial layouts should be converted into the strided form first, e.g., by
materializing the non-trivial address remapping due to layout as `affine.apply`
operations.

The default memref descriptor is a struct with the following fields:

1.  The pointer to the data buffer as allocated, referred to as "allocated
    pointer". This is only useful for deallocating the memref.
2.  The pointer to the properly aligned data that the memref indexes, referred
    to as "aligned pointer".
3.  An integer of the converted `index` type containing the distance in number
    of elements between the beginning of the (aligned) buffer and the first
    element to be accessed through the memref, referred to as "offset".
4.  An array containing as many converted `index`-type integers as the rank of
    the memref: the array represents the size, in number of elements, of the
    memref along the given dimension.
5.  A second array containing as many converted `index`-type integers as the
    rank of the memref: the second array represents the "stride" (in tensor
    abstraction sense), i.e. the number of consecutive elements of the
    underlying buffer one needs to jump over to get to the next logically
    indexed element.

For constant memref dimensions, the corresponding size entry is a constant whose
runtime value matches the static value. This normalization serves as an ABI for
the memref type to interoperate with externally linked functions. In the
particular case of rank `0` memrefs, the size and stride arrays are omitted,
resulting in a struct containing the two pointers and the offset.

Examples:

```mlir
// Assuming index is converted to i64.

memref<f32> -> !llvm.struct<(ptr, ptr, i64)>
memref<1 x f32> -> !llvm.struct<(ptr, ptr, i64,
                                 array<1 x i64>, array<1 x i64>)>
memref<? x f32> -> !llvm.struct<(ptr, ptr, i64,
                                 array<1 x i64>, array<1 x i64>)>
memref<10x42x42x43x123 x f32> -> !llvm.struct<(ptr, ptr, i64,
                                               array<5 x i64>, array<5 x i64>)>
memref<10x?x42x?x123 x f32> -> !llvm.struct<(ptr, ptr, i64,
                                             array<5 x i64>, array<5 x i64>)>

// Memref types can have vectors as element types.
memref<1x? x vector<4xf32>> -> !llvm.struct<(ptr, ptr, i64, array<2 x i64>,
                                             array<2 x i64>)>
```

#### Unranked MemRef Types

Unranked memref types are converted to an LLVM dialect literal structure type
that contains the dynamic information associated with the memref object,
referred to as *unranked descriptor*. It contains:

1.  a converted `index`-typed integer representing the dynamic rank of the
    memref;
2.  a type-erased pointer (`!llvm.ptr`) to a ranked memref descriptor with
    the contents listed above.

This descriptor is primarily intended for interfacing with rank-polymorphic
library functions. The pointer to the ranked memref descriptor points to some
*allocated* memory, which may reside on the stack of the current function or in
the heap. Conversion patterns for operations producing unranked memrefs are
expected to manage the allocation. Note that this may lead to stack allocations
(`llvm.alloca`) being performed in a loop and not reclaimed until the end of the
current function.

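Example (assuming `index` is converted to `i64`):

```mlir
memref<*xf32>
// ->
!llvm.struct<(i64, ptr)>
```
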
#### Function Types

Function types are converted to LLVM dialect function types as follows:

-   function argument and result types are converted recursively using these
    rules;
-   if a function type has multiple results, they are wrapped into an LLVM
    dialect literal structure type since LLVM function types must have exactly
    one result;
-   if a function type has no results, the corresponding LLVM dialect function
    type will have one `!llvm.void` result since LLVM function types must have a
    result;
-   function types used in arguments of another function type are wrapped in an
    LLVM dialect pointer type to comply with LLVM IR expectations;
-   the structs corresponding to `memref` types, both ranked and unranked,
    appearing as function arguments are unbundled into individual function
    arguments to allow for specifying metadata such as aliasing information on
    individual pointers;
-   the conversion of `memref`-typed arguments is subject to
    [calling conventions](#calling-conventions);
-   if a function type has the boolean attribute `func.varargs` set, the
    converted LLVM function will be variadic.

Examples:

```mlir
// Zero-ary function type with no results:
() -> ()
// is converted to a zero-ary function with `void` result.
!llvm.func<void ()>

// Unary function with one result:
(i32) -> (i64)
// has its argument and result type converted, before creating the LLVM dialect
// function type.
!llvm.func<i64 (i32)>

// Binary function with one result:
(i32, f32) -> (i64)
// has its arguments handled separately.
!llvm.func<i64 (i32, f32)>

// Binary function with two results:
(i32, f32) -> (i64, f64)
// has its results aggregated into a structure type.
!llvm.func<struct<(i64, f64)> (i32, f32)>

// Function-typed arguments or results in higher-order functions:
(() -> ()) -> (() -> ())
// are converted into opaque pointers.
!llvm.func<ptr (ptr)>

// A memref descriptor appearing as function argument:
(memref<f32>) -> ()
// gets converted into a list of individual scalar components of a descriptor.
!llvm.func<void (ptr, ptr, i64)>

// The list of arguments is linearized and one can freely mix memref and other
// types in this list:
(memref<f32>, f32) -> ()
// which gets converted into a flat list.
!llvm.func<void (ptr, ptr, i64, f32)>

// For nD ranked memref descriptors:
(memref<?x?xf32>) -> ()
// the converted signature will contain 2n+1 `index`-typed integer arguments,
// offset, n sizes and n strides, per memref argument type.
!llvm.func<void (ptr, ptr, i64, i64, i64, i64, i64)>

// Same rules apply to unranked descriptors:
(memref<*xf32>) -> ()
// which get converted into their components.
!llvm.func<void (i64, ptr)>

// However, returning a memref from a function is not affected:
() -> (memref<?xf32>)
// gets converted to a function returning a descriptor structure.
!llvm.func<struct<(ptr, ptr, i64, array<1xi64>, array<1xi64>)> ()>

// If multiple memref-typed results are returned:
() -> (memref<f32>, memref<f64>)
// their descriptor structures are additionally packed into another structure,
// potentially with other non-memref typed results.
!llvm.func<struct<(struct<(ptr, ptr, i64)>,
                   struct<(ptr, ptr, i64)>)> ()>

// If "func.varargs" attribute is set:
(i32) -> () attributes { "func.varargs" = true }
// the corresponding LLVM function will be variadic:
!llvm.func<void (i32, ...)>
```

Conversion patterns are available for the built-in function operations
(`func.func`) and the call operations (`func.call`) targeting those functions,
applying these conversion rules.

#### Multi-dimensional Vector Types

LLVM IR only supports *one-dimensional* vectors, unlike MLIR where vectors can
be multi-dimensional. Vector types cannot be nested in either IR. In the
one-dimensional case, MLIR vectors are converted to LLVM IR vectors of the same
size with the element type converted using these conversion rules. In the
n-dimensional case, MLIR vectors are converted to (n-1)-dimensional array types
of one-dimensional vectors.

Examples:

```mlir
vector<4x8 x f32>
// ->
!llvm.array<4 x vector<8 x f32>>

memref<2 x vector<4x8 x f32>>
// ->
!llvm.struct<(ptr, ptr, i64, array<1 x i64>, array<1 x i64>)>
```

#### Tensor Types

Tensor types cannot be converted to the LLVM dialect. Operations on tensors must
be [bufferized](Bufferization.md) before being converted.

### Conversion of LLVM Container Types with Non-Compatible Element Types

Progressive lowering may result in LLVM container types, such as LLVM dialect
structures, containing non-compatible types: `!llvm.struct<(index)>`. Such
types are converted recursively using the rules described above.

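Example (assuming `index` is converted to `i64`):

```mlir
!llvm.struct<(index)>
// ->
!llvm.struct<(i64)>
```
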
Identified structures are converted to _new_ structures that have their
identifiers prefixed with `_Converted.` since the bodies of identified types
cannot be updated once initialized. Such names are considered _reserved_ and
must not appear in the input code (in practice, C reserves names starting with
`_` and a capital letter, and `.` cannot appear in valid C type names anyway).
If such a name does appear in the input with a body different from the result
of the conversion, the type conversion stops.

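Example (the identified structure is renamed; `index` is assumed to convert to
`i64`):

```mlir
!llvm.struct<"foo", (index)>
// ->
!llvm.struct<"_Converted.foo", (i64)>
```
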
### Calling Conventions

Calling conventions provide a mechanism to customize the conversion of function
and function call operations without changing how individual types are handled
elsewhere. They are implemented simultaneously by the default type converter and
by the conversion patterns for the relevant operations.

#### Function Result Packing

In the case of multi-result functions, the returned values are inserted into a
structure-typed value before being returned and extracted from it at the call
site. This transformation is part of the conversion and is transparent to the
definitions and uses of the values being returned.

Example:

```mlir
func.func @foo(%arg0: i32, %arg1: i64) -> (i32, i64) {
  return %arg0, %arg1 : i32, i64
}
func.func @bar() {
  %0 = arith.constant 42 : i32
  %1 = arith.constant 17 : i64
  %2:2 = call @foo(%0, %1) : (i32, i64) -> (i32, i64)
  "use_i32"(%2#0) : (i32) -> ()
  "use_i64"(%2#1) : (i64) -> ()
}

// is transformed into

llvm.func @foo(%arg0: i32, %arg1: i64) -> !llvm.struct<(i32, i64)> {
  // insert the values into a structure
  %0 = llvm.mlir.undef : !llvm.struct<(i32, i64)>
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.struct<(i32, i64)>
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.struct<(i32, i64)>

  // return the structure value
  llvm.return %2 : !llvm.struct<(i32, i64)>
}
llvm.func @bar() {
  %0 = llvm.mlir.constant(42 : i32) : i32
  %1 = llvm.mlir.constant(17 : i64) : i64

  // call and extract the values from the structure
  %2 = llvm.call @foo(%0, %1)
     : (i32, i64) -> !llvm.struct<(i32, i64)>
  %3 = llvm.extractvalue %2[0] : !llvm.struct<(i32, i64)>
  %4 = llvm.extractvalue %2[1] : !llvm.struct<(i32, i64)>

  // use as before
  "use_i32"(%3) : (i32) -> ()
  "use_i64"(%4) : (i64) -> ()
}
```

#### Default Calling Convention for Ranked MemRef

The default calling convention converts `memref`-typed function arguments to
the LLVM dialect literal structs
[defined above](#ranked-memref-types) before unbundling them into
individual scalar arguments.

This convention is implemented in the conversion of `func.func` and `func.call`
to the LLVM dialect, with the former unpacking the descriptor into a set of
individual values and the latter packing those values back into a descriptor so
as to make it transparently usable by other operations. Conversions from other
dialects should take this convention into account.

This specific convention is motivated by the necessity to specify alignment and
aliasing attributes on the raw pointers underpinning the memref.

Examples:

```mlir
func.func @foo(%arg0: memref<?xf32>) -> () {
  "use"(%arg0) : (memref<?xf32>) -> ()
  return
}

// Gets converted to the following
// (using type alias for brevity):
!llvm.memref_1d = !llvm.struct<(ptr, ptr, i64, array<1xi64>, array<1xi64>)>

llvm.func @foo(%arg0: !llvm.ptr,       // Allocated pointer.
               %arg1: !llvm.ptr,       // Aligned pointer.
               %arg2: i64,             // Offset.
               %arg3: i64,             // Size in dim 0.
               %arg4: i64) {           // Stride in dim 0.
  // Populate memref descriptor structure.
  %0 = llvm.mlir.undef : !llvm.memref_1d
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.memref_1d
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.memref_1d
  %3 = llvm.insertvalue %arg2, %2[2] : !llvm.memref_1d
  %4 = llvm.insertvalue %arg3, %3[3, 0] : !llvm.memref_1d
  %5 = llvm.insertvalue %arg4, %4[4, 0] : !llvm.memref_1d

  // Descriptor is now usable as a single value.
  "use"(%5) : (!llvm.memref_1d) -> ()
  llvm.return
}
```

```mlir
func.func @bar() {
  %0 = "get"() : () -> (memref<?xf32>)
  call @foo(%0) : (memref<?xf32>) -> ()
  return
}

// Gets converted to the following
// (using type alias for brevity):
!llvm.memref_1d = !llvm.struct<(ptr, ptr, i64, array<1xi64>, array<1xi64>)>

llvm.func @bar() {
  %0 = "get"() : () -> !llvm.memref_1d

  // Unpack the memref descriptor.
  %1 = llvm.extractvalue %0[0] : !llvm.memref_1d
  %2 = llvm.extractvalue %0[1] : !llvm.memref_1d
  %3 = llvm.extractvalue %0[2] : !llvm.memref_1d
  %4 = llvm.extractvalue %0[3, 0] : !llvm.memref_1d
  %5 = llvm.extractvalue %0[4, 0] : !llvm.memref_1d

  // Pass individual values to the callee.
  llvm.call @foo(%1, %2, %3, %4, %5)
     : (!llvm.ptr, !llvm.ptr, i64, i64, i64) -> ()
  llvm.return
}
```

#### Default Calling Convention for Unranked MemRef

For unranked memrefs, the list of function arguments always contains two
elements, same as the unranked memref descriptor: an integer rank, and a
type-erased (`!llvm.ptr`) pointer to the ranked memref descriptor. Note that
while the *calling convention* does not require allocation, *casting* to
unranked memref does since one cannot take the address of an SSA value
containing the ranked memref, which must be stored in some memory instead. The
caller is in charge of ensuring the thread safety and management of the
allocated memory, in particular the deallocation.

Example:

```mlir
func.func @foo(%arg0: memref<*xf32>) -> () {
  "use"(%arg0) : (memref<*xf32>) -> ()
  return
}

// Gets converted to the following.

llvm.func @foo(%arg0: i64,         // Rank.
               %arg1: !llvm.ptr) { // Type-erased pointer to descriptor.
  // Pack the unranked memref descriptor.
  %0 = llvm.mlir.undef : !llvm.struct<(i64, ptr)>
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.struct<(i64, ptr)>
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.struct<(i64, ptr)>

  "use"(%2) : (!llvm.struct<(i64, ptr)>) -> ()
  llvm.return
}
```

```mlir
func.func @bar() {
  %0 = "get"() : () -> (memref<*xf32>)
  call @foo(%0) : (memref<*xf32>) -> ()
  return
}

// Gets converted to the following.

llvm.func @bar() {
  %0 = "get"() : () -> (!llvm.struct<(i64, ptr)>)

  // Unpack the memref descriptor.
  %1 = llvm.extractvalue %0[0] : !llvm.struct<(i64, ptr)>
  %2 = llvm.extractvalue %0[1] : !llvm.struct<(i64, ptr)>

  // Pass individual values to the callee.
  llvm.call @foo(%1, %2) : (i64, !llvm.ptr) -> ()
  llvm.return
}
```

**Lifetime.** The second element of the unranked memref descriptor points to
some memory in which the ranked memref descriptor is stored. By convention, this
memory is allocated on the stack and has the lifetime of the function. (*Note:*
due to the function-length lifetime, creation of multiple unranked memref
descriptors, e.g., in a loop, may lead to stack overflows.) If an unranked
descriptor has to be returned from a function, the ranked descriptor it points
to is copied into dynamically allocated memory, and the pointer in the unranked
descriptor is updated accordingly. The allocation happens immediately before
returning. It is the responsibility of the caller to free the dynamically
allocated memory. The default conversion of `func.call` and `func.call_indirect`
copies the ranked descriptor to newly allocated memory on the caller's stack.
Thus, the convention of the ranked memref descriptor pointed to by an unranked
memref descriptor being stored on the stack is respected.

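For illustration, the lowering of a cast from a ranked to an unranked memref
follows the sketch below (schematic rather than the exact generated IR; `"get"`
stands for an arbitrary producer of the ranked descriptor):

```mlir
// Ranked descriptor produced earlier in the function.
%ranked = "get"() : () -> !llvm.struct<(ptr, ptr, i64, array<1xi64>, array<1xi64>)>

// Spill the ranked descriptor onto the stack so that its address can be taken.
%c1 = llvm.mlir.constant(1 : index) : i64
%mem = llvm.alloca %c1 x !llvm.struct<(ptr, ptr, i64, array<1xi64>, array<1xi64>)>
     : (i64) -> !llvm.ptr
llvm.store %ranked, %mem
     : !llvm.struct<(ptr, ptr, i64, array<1xi64>, array<1xi64>)>, !llvm.ptr

// Build the unranked descriptor from the static rank and the type-erased
// pointer to the ranked descriptor.
%rank = llvm.mlir.constant(1 : index) : i64
%0 = llvm.mlir.undef : !llvm.struct<(i64, ptr)>
%1 = llvm.insertvalue %rank, %0[0] : !llvm.struct<(i64, ptr)>
%2 = llvm.insertvalue %mem, %1[1] : !llvm.struct<(i64, ptr)>
```
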
#### Bare Pointer Calling Convention for Ranked MemRef

The "bare pointer" calling convention converts `memref`-typed function arguments
to a *single* pointer to the aligned data. Note that this does *not* apply to
uses of `memref` outside of function signatures: the default descriptor
structures are still used. This convention further restricts the supported cases
to the following.

-   `memref` types with default layout.
-   `memref` types with all dimensions statically known.
-   `memref` values allocated in such a way that the allocated and aligned
    pointers match. Alternatively, the same function must handle allocation and
    deallocation since only one pointer is passed to any callee.

Examples:

```mlir
func.func @callee(memref<2x4xf32>)

func.func @caller(%0 : memref<2x4xf32>) {
  call @callee(%0) : (memref<2x4xf32>) -> ()
}

// ->

!descriptor = !llvm.struct<(ptr, ptr, i64,
                            array<2xi64>, array<2xi64>)>

llvm.func @callee(!llvm.ptr)

llvm.func @caller(%arg0: !llvm.ptr) {
  // A descriptor value is defined at the function entry point.
  %0 = llvm.mlir.undef : !descriptor

  // Both the allocated and aligned pointers are set up to the same value.
  %1 = llvm.insertvalue %arg0, %0[0] : !descriptor
  %2 = llvm.insertvalue %arg0, %1[1] : !descriptor

  // The offset is set up to zero.
  %3 = llvm.mlir.constant(0 : index) : i64
  %4 = llvm.insertvalue %3, %2[2] : !descriptor

  // The sizes and strides are derived from the statically known values.
  %5 = llvm.mlir.constant(2 : index) : i64
  %6 = llvm.mlir.constant(4 : index) : i64
  %7 = llvm.insertvalue %5, %4[3, 0] : !descriptor
  %8 = llvm.insertvalue %6, %7[3, 1] : !descriptor
  %9 = llvm.mlir.constant(1 : index) : i64
  %10 = llvm.insertvalue %6, %8[4, 0] : !descriptor
  %11 = llvm.insertvalue %9, %10[4, 1] : !descriptor

  // The function call corresponds to extracting the aligned data pointer.
  %12 = llvm.extractvalue %11[1] : !descriptor
  llvm.call @callee(%12) : (!llvm.ptr) -> ()
}
```

#### Bare Pointer Calling Convention for Unranked MemRef

The "bare pointer" calling convention does not support unranked memrefs, whose
rank (and therefore shape) cannot be known at compile time.

### Generic allocation and deallocation functions

When converting the `memref` dialect, allocations and deallocations are
converted into calls to `malloc` (`aligned_alloc` if aligned allocations are
requested) and `free`. However, it is possible to convert them to more generic
functions which can be implemented by a runtime library, thus allowing custom
allocation strategies and runtime profiling. When the conversion pass is
instructed to perform such a conversion, the names of the callees are
`_mlir_memref_to_llvm_alloc`, `_mlir_memref_to_llvm_aligned_alloc` and
`_mlir_memref_to_llvm_free`. Their signatures are the same as those of
`malloc`, `aligned_alloc` and `free`.

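A runtime library providing these entry points could thus declare them as
follows (a sketch; the names are as listed above and the signatures mirror
`malloc`, `aligned_alloc` and `free`):

```c
#include <stddef.h>

// Generic replacements for malloc, aligned_alloc and free, respectively.
void *_mlir_memref_to_llvm_alloc(size_t size);
void *_mlir_memref_to_llvm_aligned_alloc(size_t alignment, size_t size);
void _mlir_memref_to_llvm_free(void *ptr);
```
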
### C-compatible wrapper emission

In practical cases, it may be desirable to have externally facing functions
with a single argument corresponding to a memref argument. When interfacing
with LLVM IR produced from C, the code needs to respect the corresponding
calling convention. The conversion to the LLVM dialect provides an option to
generate wrapper functions that take memref descriptors as pointers-to-struct
compatible with the data types produced by Clang when compiling C sources. The
generation of such wrapper functions can additionally be controlled at a
function granularity by setting the `llvm.emit_c_interface` unit attribute.

More specifically, a memref argument is converted into a pointer-to-struct
argument of type `{T*, T*, i64, i64[N], i64[N]}*` in the wrapper function, where
`T` is the converted element type and `N` is the memref rank. This type is
compatible with that produced by Clang for the following C++ structure template
instantiations or their equivalents in C.

```cpp
template<typename T, size_t N>
struct MemRefDescriptor {
  T *allocated;
  T *aligned;
  intptr_t offset;
  intptr_t sizes[N];
  intptr_t strides[N];
};
```

Furthermore, function results are also rewritten to pointer parameters if the
converted function result has a struct type. The special result parameter is
added as the first parameter and is of pointer-to-struct type.

If enabled, the option does the following. For *external* functions declared in
the MLIR module:

1.  Declare a new function `_mlir_ciface_<original name>` where memref arguments
    are converted to pointer-to-struct and the remaining arguments are converted
    as usual. Results are converted to a special argument if they are of struct
    type.
2.  Add a body to the original function (making it non-external) that
    1.  allocates memref descriptors,
    2.  populates them,
    3.  potentially allocates space for the result struct, and
    4.  passes the pointers to these into the newly declared interface function,
        then
    5.  collects the result of the call (potentially from the result struct),
        and
    6.  returns it to the caller.

For (non-external) functions defined in the MLIR module:

1.  Define a new function `_mlir_ciface_<original name>` where memref arguments
    are converted to pointer-to-struct and the remaining arguments are converted
    as usual. Results are converted to a special argument if they are of struct
    type.
2.  Populate the body of the newly defined function with IR that
    1.  loads descriptors from pointers;
    2.  unpacks the descriptors into individual non-aggregate values;
    3.  passes these values into the original function;
    4.  collects the results of the call, and
    5.  either copies the results into the result struct or returns them to the
        caller.

Examples:

```mlir
func.func @qux(%arg0: memref<?x?xf32>) attributes {llvm.emit_c_interface}

// Gets converted into the following
// (using type alias for brevity):
!llvm.memref_2d = !llvm.struct<(ptr, ptr, i64, array<2xi64>, array<2xi64>)>

// Function with unpacked arguments.
llvm.func @qux(%arg0: !llvm.ptr, %arg1: !llvm.ptr,
               %arg2: i64, %arg3: i64, %arg4: i64,
               %arg5: i64, %arg6: i64) {
  // Populate memref descriptor (as per calling convention).
  %0 = llvm.mlir.undef : !llvm.memref_2d
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.memref_2d
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.memref_2d
  %3 = llvm.insertvalue %arg2, %2[2] : !llvm.memref_2d
  %4 = llvm.insertvalue %arg3, %3[3, 0] : !llvm.memref_2d
  %5 = llvm.insertvalue %arg5, %4[4, 0] : !llvm.memref_2d
  %6 = llvm.insertvalue %arg4, %5[3, 1] : !llvm.memref_2d
  %7 = llvm.insertvalue %arg6, %6[4, 1] : !llvm.memref_2d

  // Store the descriptor in a stack-allocated space.
  %8 = llvm.mlir.constant(1 : index) : i64
  %9 = llvm.alloca %8 x !llvm.memref_2d
     : (i64) -> !llvm.ptr
  llvm.store %7, %9 : !llvm.memref_2d, !llvm.ptr

  // Call the interface function.
  llvm.call @_mlir_ciface_qux(%9) : (!llvm.ptr) -> ()

  // The stored descriptor will be freed on return.
  llvm.return
}

// Interface function.
llvm.func @_mlir_ciface_qux(!llvm.ptr)
```

```cpp
// The C function implementation for the interface function.
extern "C" {
void _mlir_ciface_qux(MemRefDescriptor<float, 2> *input) {
  // detailed impl
}
}
```

```mlir
func.func @foo(%arg0: memref<?x?xf32>) attributes {llvm.emit_c_interface} {
  return
}

// Gets converted into the following
// (using type alias for brevity):
!llvm.memref_2d = !llvm.struct<(ptr, ptr, i64, array<2xi64>, array<2xi64>)>

// Function with unpacked arguments.
llvm.func @foo(%arg0: !llvm.ptr, %arg1: !llvm.ptr,
               %arg2: i64, %arg3: i64, %arg4: i64,
               %arg5: i64, %arg6: i64) {
  llvm.return
}

// Interface function callable from C.
llvm.func @_mlir_ciface_foo(%arg0: !llvm.ptr) {
  // Load the descriptor.
  %0 = llvm.load %arg0 : !llvm.ptr -> !llvm.memref_2d

  // Unpack the descriptor as per calling convention.
  %1 = llvm.extractvalue %0[0] : !llvm.memref_2d
  %2 = llvm.extractvalue %0[1] : !llvm.memref_2d
  %3 = llvm.extractvalue %0[2] : !llvm.memref_2d
  %4 = llvm.extractvalue %0[3, 0] : !llvm.memref_2d
  %5 = llvm.extractvalue %0[3, 1] : !llvm.memref_2d
  %6 = llvm.extractvalue %0[4, 0] : !llvm.memref_2d
  %7 = llvm.extractvalue %0[4, 1] : !llvm.memref_2d
  llvm.call @foo(%1, %2, %3, %4, %5, %6, %7)
    : (!llvm.ptr, !llvm.ptr, i64, i64, i64,
       i64, i64) -> ()
  llvm.return
}
```

```cpp
// The C function signature for the interface function.
extern "C" {
void _mlir_ciface_foo(MemRefDescriptor<float, 2> *input);
}
```

```mlir
func.func @foo(%arg0: memref<?x?xf32>) -> memref<?x?xf32> attributes {llvm.emit_c_interface} {
  return %arg0 : memref<?x?xf32>
}

// Gets converted into the following
// (using type alias for brevity):
!llvm.memref_2d = !llvm.struct<(ptr, ptr, i64, array<2xi64>, array<2xi64>)>

// Function with unpacked arguments.
llvm.func @foo(%arg0: !llvm.ptr, %arg1: !llvm.ptr, %arg2: i64,
               %arg3: i64, %arg4: i64, %arg5: i64, %arg6: i64)
    -> !llvm.memref_2d {
  %0 = llvm.mlir.undef : !llvm.memref_2d
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.memref_2d
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.memref_2d
  %3 = llvm.insertvalue %arg2, %2[2] : !llvm.memref_2d
  %4 = llvm.insertvalue %arg3, %3[3, 0] : !llvm.memref_2d
  %5 = llvm.insertvalue %arg5, %4[4, 0] : !llvm.memref_2d
  %6 = llvm.insertvalue %arg4, %5[3, 1] : !llvm.memref_2d
  %7 = llvm.insertvalue %arg6, %6[4, 1] : !llvm.memref_2d
  llvm.return %7 : !llvm.memref_2d
}

// Interface function callable from C.
// NOTE: the returned memref becomes the first argument.
llvm.func @_mlir_ciface_foo(%arg0: !llvm.ptr, %arg1: !llvm.ptr) {
  %0 = llvm.load %arg1 : !llvm.ptr -> !llvm.memref_2d
  %1 = llvm.extractvalue %0[0] : !llvm.memref_2d
  %2 = llvm.extractvalue %0[1] : !llvm.memref_2d
  %3 = llvm.extractvalue %0[2] : !llvm.memref_2d
  %4 = llvm.extractvalue %0[3, 0] : !llvm.memref_2d
  %5 = llvm.extractvalue %0[3, 1] : !llvm.memref_2d
  %6 = llvm.extractvalue %0[4, 0] : !llvm.memref_2d
  %7 = llvm.extractvalue %0[4, 1] : !llvm.memref_2d
  %8 = llvm.call @foo(%1, %2, %3, %4, %5, %6, %7)
    : (!llvm.ptr, !llvm.ptr, i64, i64, i64, i64, i64) -> !llvm.memref_2d
  llvm.store %8, %arg0 : !llvm.memref_2d, !llvm.ptr
  llvm.return
}
```

```cpp
// The C function signature for the interface function.
extern "C" {
void _mlir_ciface_foo(MemRefDescriptor<float, 2> *output,
                      MemRefDescriptor<float, 2> *input);
}
```

Rationale: Introducing auxiliary functions for C-compatible interfaces is
preferred to modifying the calling convention since it will minimize the effect
of C compatibility on intra-module calls or calls between MLIR-generated
functions. In particular, when calling external functions from an MLIR module
in a (parallel) loop, storing a memref descriptor on the stack can lead to
stack exhaustion and/or concurrent access to the same address. The auxiliary
interface function serves as an allocation scope in this case. Furthermore,
when targeting accelerators with separate memory spaces such as GPUs,
stack-allocated descriptors passed by pointer would have to be transferred to
the device memory, which introduces significant overhead. In such situations,
auxiliary interface functions are executed on the host and only pass the values
through the device function invocation mechanism.

Limitation: Right now we cannot generate a C interface for variadic functions,
whether external or not, because C functions cannot "forward" variadic
arguments:

```c
void bar(int, ...);

void foo(int x, ...) {
  // ERROR: no way to forward variadic arguments.
  bar(x, ...);
}
```

815
816Accesses to a memref element are transformed into an access to an element of the
817buffer pointed to by the descriptor. The position of the element in the buffer
818is calculated by linearizing memref indices in row-major order (lexically first
819index is the slowest varying, similar to C, but accounting for strides). The
820computation of the linear address is emitted as arithmetic operation in the LLVM
821IR dialect. Strides are extracted from the memref descriptor.
822
823Examples:
824
825An access to a memref with indices:
826
827```mlir
828%0 = memref.load %m[%1,%2,%3,%4] : memref<?x?x4x8xf32, offset: ?>
829```
830
831is transformed into the equivalent of the following code:
832
833```mlir
834// Compute the linearized index from strides.
835// When strides or, in absence of explicit strides, the corresponding sizes are
836// dynamic, extract the stride value from the descriptor.
837%stride1 = llvm.extractvalue[4, 0] : !llvm.struct<(ptr, ptr, i64,
838                                                   array<4xi64>, array<4xi64>)>
839%addr1 = arith.muli %stride1, %1 : i64
840
841// When the stride or, in absence of explicit strides, the trailing sizes are
842// known statically, this value is used as a constant. The natural value of
843// strides is the product of all sizes following the current dimension.
844%stride2 = llvm.mlir.constant(32 : index) : i64
845%addr2 = arith.muli %stride2, %2 : i64
846%addr3 = arith.addi %addr1, %addr2 : i64
847
848%stride3 = llvm.mlir.constant(8 : index) : i64
849%addr4 = arith.muli %stride3, %3 : i64
850%addr5 = arith.addi %addr3, %addr4 : i64
851
852// Multiplication with the known unit stride can be omitted.
853%addr6 = arith.addi %addr5, %4 : i64
854
855// If the linear offset is known to be zero, it can also be omitted. If it is
856// dynamic, it is extracted from the descriptor.
857%offset = llvm.extractvalue[2] : !llvm.struct<(ptr, ptr, i64,
858                                               array<4xi64>, array<4xi64>)>
859%addr7 = arith.addi %addr6, %offset : i64
860
861// All accesses are based on the aligned pointer.
862%aligned = llvm.extractvalue[1] : !llvm.struct<(ptr, ptr, i64,
863                                                array<4xi64>, array<4xi64>)>
864
865// Get the address of the data pointer.
866%ptr = llvm.getelementptr %aligned[%addr7]
867     : !llvm.struct<(ptr, ptr, i64, array<4xi64>, array<4xi64>)> -> !llvm.ptr
868
869// Perform the actual load.
870%0 = llvm.load %ptr : !llvm.ptr -> f32
871```
872
873For stores, the address computation code is identical and only the actual store
874operation is different.
875
876Note: the conversion does not perform any sort of common subexpression
877elimination when emitting memref accesses.

### Utility Classes

Utility classes common to many conversions to the LLVM dialect can be found
under `lib/Conversion/LLVMCommon`. They include the following.

-   `LLVMConversionTarget` specifies all LLVM dialect operations as legal.
-   `LLVMTypeConverter` implements the default type conversion as described
    above.
-   `ConvertOpToLLVMPattern` extends the conversion pattern class with LLVM
    dialect-specific functionality.
-   `VectorConvertOpToLLVMPattern` extends the previous class to automatically
    unroll operations on higher-dimensional vectors into lists of operations on
    one-dimensional vectors.
-   `StructBuilder` provides a convenient API for building IR that creates or
    accesses values of LLVM dialect structure types; it is subclassed by
    `MemRefDescriptor`, `UnrankedMemRefDescriptor` and `ComplexStructBuilder`
    for the built-in types convertible to LLVM dialect structure types.

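As a hedged sketch of how these classes are typically combined in a conversion
pattern (the `my::SourceOp` operation and its `getSource` accessor are
hypothetical; the helper classes are those listed above):

```cpp
#include "mlir/Conversion/LLVMCommon/MemRefBuilder.h"
#include "mlir/Conversion/LLVMCommon/Pattern.h"

// Hypothetical pattern lowering an operation that takes a ranked memref
// operand and produces its aligned data pointer.
struct SourceOpLowering : public mlir::ConvertOpToLLVMPattern<my::SourceOp> {
  using mlir::ConvertOpToLLVMPattern<my::SourceOp>::ConvertOpToLLVMPattern;

  mlir::LogicalResult
  matchAndRewrite(my::SourceOp op, OpAdaptor adaptor,
                  mlir::ConversionPatternRewriter &rewriter) const override {
    // The adaptor exposes the operand already converted to the descriptor
    // struct; MemRefDescriptor gives structured access to its fields.
    mlir::MemRefDescriptor descriptor(adaptor.getSource());
    mlir::Value aligned = descriptor.alignedPtr(rewriter, op.getLoc());
    // ... emit LLVM dialect operations using the aligned pointer ...
    rewriter.replaceOp(op, aligned);
    return mlir::success();
  }
};
```
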
## Translation to LLVM IR

MLIR modules containing `llvm.func`, `llvm.mlir.global` and `llvm.metadata`
operations can be translated to LLVM IR modules using the following scheme.

-   Module-level globals are translated to LLVM IR global values.
-   Module-level metadata are translated to LLVM IR metadata, which can be later
    augmented with additional metadata defined on specific ops.
-   All functions are declared in the module so that they can be referenced.
-   Each function is then translated separately and has access to the complete
    mappings between MLIR and LLVM IR globals, metadata, and functions.
-   Within a function, blocks are traversed in topological order and translated
    to LLVM IR basic blocks. In each basic block, PHI nodes are created for each
    of the block arguments, but not connected to their source blocks.
-   Within each block, operations are translated in order. Each operation has
    access to the same mappings as the function and additionally to the mapping
    of values between MLIR and LLVM IR, including PHI nodes. Operations with
    regions are responsible for translating the regions they contain.
-   After operations in a function are translated, the PHI nodes of blocks in
    this function are connected to their source values, which are now available.

The translation mechanism provides extension hooks for translating custom
operations to LLVM IR via a dialect interface `LLVMTranslationDialectInterface`:

-   `convertOperation` translates an operation that belongs to the current
    dialect to LLVM IR given an `IRBuilderBase` and various mappings;
-   `amendOperation` performs additional actions on an operation if it contains
    a dialect attribute that belongs to the current dialect, for example sets up
    instruction-level metadata.

Dialects containing operations or attributes that want to be translated to LLVM
IR must provide an implementation of this interface and register it with the
system. Note that registration may happen without creating the dialect, for
example, in a separate library to avoid the need for the "main" dialect library
to depend on LLVM IR libraries. The implementations of these methods may use the
[`ModuleTranslation`](https://mlir.llvm.org/doxygen/classmlir_1_1LLVM_1_1ModuleTranslation.html)
object provided to them which holds the state of the translation and contains
numerous utilities.

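A minimal implementation might look as follows (a sketch; `MyDialect` and the
operation matching are placeholders):

```cpp
#include "mlir/Target/LLVMIR/LLVMTranslationInterface.h"
#include "mlir/Target/LLVMIR/ModuleTranslation.h"

// Hypothetical translation interface for the operations of `MyDialect`.
class MyDialectLLVMIRTranslationInterface
    : public mlir::LLVMTranslationDialectInterface {
public:
  using LLVMTranslationDialectInterface::LLVMTranslationDialectInterface;

  // Translates an operation of MyDialect, e.g. by emitting an intrinsic call
  // through the provided LLVM IR builder.
  mlir::LogicalResult
  convertOperation(mlir::Operation *op, llvm::IRBuilderBase &builder,
                   mlir::LLVM::ModuleTranslation &moduleTranslation)
      const override {
    // ... match `op` and construct the corresponding LLVM IR ...
    return mlir::failure(); // Reject operations this dialect does not know.
  }
};
```
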
Note that this extension mechanism is *intentionally restrictive*. LLVM IR has a
small, relatively stable set of instructions and types that MLIR intends to
model fully. Therefore, the extension mechanism is provided only for LLVM IR
constructs that are more often extended -- intrinsics and metadata. The primary
goal of the extension mechanism is to support sets of intrinsics, for example
those representing a particular instruction set. The extension mechanism does
not allow for customizing type or block translation, nor does it support custom
module-level operations. Such transformations should be performed within MLIR
and target the corresponding MLIR constructs.

## Translation from LLVM IR

An experimental flow allows one to import a substantially limited subset of LLVM
IR into MLIR, producing LLVM dialect operations.

```
  mlir-translate -import-llvm filename.ll
```
955