Lines Matching full:operations
16 scheduling part, can target the operations in the Transform dialect that are
17 later applied by the compiler. Sets of transform operations, or even new
27 operations, that correspond to the scheduling part of the DSL.
34 operations are similar to or different from the existing transformations.
85 // Convolution proper. While Linalg has named operations for 2D convolutions,
100 // Note the fastmath attributes that allow operations to be recombined into
129 initialization and the update are represented as two distinct Linalg operations
230 not apply to named operations that need to be “generalized” first by calling
232 dimensions and not the explicit loops materialized by tiling operations that
240 transformation to all suitable operations at, e.g., a function scope via
243 tensors, which are later decomposed into operations on smaller
372 changes the order of floating point operations (so would reduction tiling with
374 operations, but is explicitly allowed by `fastmath` flags. Halide also emits
379 take into account the way it operates on MLIR structured operations: rather than
382 operations accordingly. Therefore, our tensor type should not become trivial,
402 16-element vector operations to ensure a contiguous sequence of `vfma`
431 and fusion may have produced a lot of operations computing tensor subsets and
455 operations processing 4D types where only one dimension isn’t unit-sized, e.g.,
472 This produces the desired code performing arithmetic operations on
476 operations. Another round of simplification can be applied post vectorization.
504 with operations. Note that we apply the pipeline to functions rather than entire
515 is not always the case in general as some operations, in particular
516 `tensor.empty`, may not be bufferizable. Such operations need to be removed
520 `tensor.empty` operations only serving to define the size of a computation or
533 // Lower complex, multidimensional vector operations into simpler
535 // to vector dialect operations present in the payload IR at this stage.
551 // kind of operations. These patterns may produce local allocations to act
562 // buffer aliasing operations that may have been introduced during
587 floating point operations, it reaches ~14 GFlops. With 1 FMA unit available,
639 schedule application. The repeated tensor subsetting operations that are later
640 transformed into vector transfer operations, and vector memory loads, are
643 long and complex chains of access and update operations that become so long that
654 operations after bufferization to produce new handles. We will first change the
656 fewer operations to match by using `transform.structured.tile_using_forall`
658 Then we can match all `scf.forall` operations in the payload IR and transform
684 Multidimensional structured operations on vectors are lowered to target-specific
686 operation on `vector<5x64xf32>` is replaced with 5 operations on
688 required type at the MLIR level. Each of these operations is then split into 4
689 operations on `vector<16xf32>` at the LLVM level where the information about
701 simpler in the presence of large-vector operations, which allowed for more