<img width="90" align="left" alt="MLIR Codegen Flow" src="https://user-images.githubusercontent.com/10148468/73613629-c5586580-45c5-11ea-94b7-074aeea94c7b.png">
Linalg is designed to solve the High-level Hierarchical Optimization (HHO box)
one-off op knowledge.
1. Tiled Producer-Consumer Fusion with Parametric Tile-And-Fuse.
1. Partially Lower to Iterations Over a Finer-Grained Linalg Op.
## High-Level Description of Linalg Ops<a name="linalg_ops"></a>
[listed prior art](../../Rationale/RationaleLinalgDialect.md/#prior-art). The
occurs, it may end up being materialized only as a register-level SSA value.
2. a "shape-only" tensor output value whose underlying elements are not used in
[Linalg and Shapes](https://llvm.discourse.group/t/linalg-and-shapes/2421)
### Payload-Carrying Ops<a name="payload_ops"></a>
[structured op](https://docs.google.com/presentation/d/1P-j1GrH6Q5gLBjao0afQ-GfvcAeF-QU4GXXeSy0eJ9I/edit#slide=id.p)
*has* all the information needed to synthesize the control-flow required to
[URUK](http://icps.u-strasbg.fr/~bastoul/research/papers/GVBCPST06-IJPP.pdf).
layout, and the second one is a `memref` of 4-element vectors with a 2-strided,
1-offset layout.
affine_map<(m) -> (m)>,
affine_map<(m) -> (m)>
%c = "some_compute"(%a, %b): (f32, vector<4xf32>) -> (vector<4xf32>)
// Run: mlir-opt example1.mlir -allow-unregistered-dialect -convert-linalg-to-loops
%3 = "some_compute"(%1, %2) : (f32, vector<4xf32>) -> vector<4xf32>
instance, it guarantees no out-of-bounds access can occur by construction
2. This does not model arbitrary code with side-effects.
promising path towards extensibility to non-dense tensors as experience with
[sparse](https://www.lift-project.org/publications/2016/harries16sparse.pdf) and
[position-dependent arrays](https://www.lift-project.org/publications/2019/pizzuti19positiondependentarrays.pdf),
as well as [TACO](http://tensor-compiler.org/), has shown.
`memref` is a 2-strided one on both of its dimensions, and the second `memref`
affine_map<(i, j) -> (j, i)>,
affine_map<(i, j) -> (j)>
%c = "some_compute"(%a, %b): (f32, vector<4xf32>) -> (vector<4xf32>)
// Run: mlir-opt example2.mlir -allow-unregistered-dialect -convert-linalg-to-loops
%3 = "some_compute"(%1, %2) : (f32, vector<4xf32>) -> vector<4xf32>
- Given a subset of the iteration space, what subset of data does it read and
- Given a subset of data read or written, what subset of the iteration space
implement transformations such as tiling, tiled producer-consumer fusion, and
[AffineMaps](https://mlir.llvm.org/docs/LangRef/#affinemap-attribute) (see the
short-term solution, but in the longer term note that this property could
even be evaluated dynamically, similarly to inspector-executor algorithms.
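As a rough illustration of how indexing maps answer these footprint questions, consider the hypothetical pair of maps below (a sketch, not one of the examples in this document): the map attributes alone, independently of the payload, tell a transformation which data slice a given iteration-space tile touches.

```mlir
// Hypothetical sketch: the indexing maps alone determine the data footprint.
// For an iteration-space tile (i, j) in [0, 2) x [0, 4), the first map
// (i, j) -> (j, i) implies the read slice rows [0, 4) x cols [0, 2) of the
// input, and the second map (i, j) -> (i, j) implies the written slice
// [0, 2) x [0, 4) of the output; no inspection of the payload is needed.
#read_map  = affine_map<(i, j) -> (j, i)>
#write_map = affine_map<(i, j) -> (i, j)>
```

Tiling and fusion can thus be implemented by composing these maps with the tile's subranges, which is what makes the property a static, per-op guarantee rather than the result of a global analysis.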
[Optimizing Compilers for Modern Architectures](https://www.elsevier.com/books/optimizing-compilers-for-modern-architectures/allen/978-0-08-051324-9).
properties that may be hard (or even impossible) to derive from lower-level
example is the use of data-dependent reduction semantics for specifying
[Regions](https://github.com/llvm/llvm-project/blob/58265ad42a90ae8905be6a447cb42e53529a54a0/mlir/docs/LangRef.md/#regions).
#map = affine_map<(i, j) -> (i, j)>
This function performs an element-wise addition of two matrices (`%A` and `%B`) and
In the process of lowering to loops and lower-level constructs, similar
[inlined call op proposal](https://llvm.discourse.group/t/introduce-std-inlined-call-op-proposal/282/2).
We expect to be able to reuse the common lower-level infrastructure provided it
affine_map<(i, j) -> (i, j)>,
affine_map<(i, j) -> (i, j)>,
affine_map<(i, j) -> (i, j)>
// Run: mlir-opt example4.mlir -convert-linalg-to-std
memref<?x?xf32, strided<[?, ?], offset: ?>>, memref<?x?xf32, strided<[?, ?], offset: ?>>) -> ()
// Run: mlir-opt example4.mlir -convert-linalg-to-std | mlir-opt -convert-func-to-llvm
llvm.call @pointwise_add(...) : (!llvm<"float*">, ...) -> ()
*">) -> ()
offloading operations to fast library implementations: pass a non-owning pointer
Generally, `linalg` passes non-owning pointers to View data structures to
pre-compiled library calls linked externally.
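For context on what such a non-owning pointer looks like once lowered, the sketch below shows the general shape of the descriptor that reaches an external library call. This is illustrative only: the exact struct layout and pointer types are an implementation detail of the LLVM lowering and vary across MLIR versions.

```mlir
// Illustrative only: a rank-2 memref<?x?xf32> view lowers to a descriptor of
// roughly this shape (allocated ptr, aligned ptr, offset, sizes, strides).
// The external library call receives this descriptor but never owns the
// underlying buffer; allocation and deallocation stay with the caller.
!view_descriptor = !llvm.struct<(ptr, ptr, i64, array<2 x i64>, array<2 x i64>)>
```

Keeping ownership out of the calling convention is what allows the same `linalg` op to be lowered either to loops or to a library call without changing buffer lifetimes.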
[ongoing discussion](https://llvm.discourse.group/t/lowering-optional-attributes-in-linalg-structuredops-to-standard-dialect/333/3)
challenging, or even infeasible. Linalg ops adopt perfect-nestedness as a
first-class property: the structure cannot be broken and is transported in the
forcing innermost control-flow nesting is a lot like writing data-parallel code
used before in polyhedral compilers to convert non-affine control into affine
[core guiding principles](../../Rationale/RationaleLinalgDialect.md/#core-guiding-principlesa-nameguiding_principlesa).
because of empirical evidence gained while building and working on multiple high-level
support ragged, mixed-sparse and other types. We expect to draw on the
[sparse](https://www.lift-project.org/publications/2016/harries16sparse.pdf) and
[position-dependent arrays](https://www.lift-project.org/publications/2019/pizzuti19positiondependentarrays.pdf).
Future ops are added on a per-need basis but should include:
work in the field of large-scale distributed stencil computations.
In a longer-term future, the abstractions from the
[Legion data-centric programming model](https://legion.stanford.edu/overview/)
### Named Payload-Carrying Ops<a name="named_ops"></a>
them to be auto-generated from Tablegen soon.
(`mlir-linalg-ods-gen`) to automatically produce named ops from a notation that
The syntax and semantics used in `mlir-linalg-ods-gen` are very much in flight
type(symbolic-affine-expression-list)` (e.g. `A : f32(M, N + M)`) and each
operand is not used in any expressions, it will be considered a shape-only
op-specific attributes, dynamic ranks, some form of templating, shape
A `"""`-wrapped doc string can be attached to the named op. It should contain a
def batchmatmul(A: f32(Batch, M, K), B: f32(K, N)) -> (C: f32(Batch, M, N))
"""Batch matrix-multiply operation.
This operation performs batch matrix-multiply over ...
When `mlir-linalg-ods-gen -gen-ods-decl=1` is called, the following ODS is produced:
When `mlir-linalg-ods-gen -gen-impl=1` is called, the following C++ is produced:
### YAML Based Named Structured Ops<a name="yaml-gen"></a>
Linalg provides a declarative generation tool (`mlir-linalg-ods-yaml-gen`) to
automatically produce named ops from a YAML-based op description format intended
to capture the structure of the named ops. The YAML-based op descriptions are
`mlir-linalg-ods-yaml-gen.cpp` as the source of truth for the schema.