ChH.md - OpenGrok cross reference for /llvm-project/mlir/docs/Tutorials/transform/ChH.md

Lines Matching refs:loops
131 variables in Halide DSL correspond to implicit loops iterating over the
133 their definitions also share the corresponding loops. In other words, the loop
157 --one-shot-bufferize --convert-linalg-to-loops`)
217 *   `split` decomposes a loop dimension into two immediately nested loops with
226 *   `reorder` rearranges the loops arbitrarily. In Linalg representation, loops
228     microkernels. The order of implicit loops in a `linalg.generic` operation
232     dimensions and not the explicit loops materialized by tiling operations that
234     leverage this behavior by materializing loops directly in the desired order
249     on the relation between loops surrounding functions, this corresponds to
253     that operates on `forall` loops materialized by tiling beforehand.
272 loops. The implicit loop order for the operation is `n, y, x, c`, so the `co`
274 The remaining dimensions can be materialized as loops in one transformation.
284 This will result in the following loops being created in the IR with the nested
286 implicit loops.
302 on the `conv` function that need to happen before loops are destroyed by
326 update into the `co, n, y, xo` loops materialized by tiling earlier. Structured
331     loops materialized;
347 To complete the structure, we need to put the `rz, ry, rx` loops outside the
348 “tile” loops `xi, ci`. This can be achieved by materializing the corresponding
349 loops from the convolution operation. However, these are reduction loops and it
350 wouldn’t be valid to materialize them as intrinsically parallel “forall” loops.
352 sequential `scf.for` loops. (`scf.forall` loops can also express parallel
363 This transformation materializes the desired loops around the convolution
364 operation. It is also more capable than merely producing (reduction) loops: the
369 the loops. In our case, `tile_size = 1` along all dimensions, so the reduction
370 is entirely performed by the generated loops. The combiner structured operation
377 Finally, we need to produce innermost loops `xi` and `ci` that are still not
401 unrolling is requested for the innermost loops that form the 4x5 tile of
404 additional loops, `unroll(y)` and `unroll(r.x, 2)`, is requested in the
408 the loops corresponding to `xi` and `ci` dimensions that actually
411 As tiling in the Transform dialect produces handles to the loops materialized by
412 tiling, unrolling those loops is just a matter of chaining the corresponding
454 Tiling-by-one as a way of materializing loops produced structured (`linalg`)
552 // as temporary caches deep inside loops, which could lead to catastrophic
554 // all the surrounding loops.
645 simplify them. In fact, unrolling loops early in the transformation sequence can
652 bufferization invalidates all loop handles, including those to loops that we are
655 kind of loops produced in the schedule from `scf.for` to `scf.forall` to have
659 them into single-iterator `scf.for` loops _after bufferization_.
679 did so by rematching generated loops after bufferization, which partially defies
692 these loops. Therefore, the last stage of tiling, re-matching and unrolling can