Revision tags: llvmorg-21-init

# 6aaa8f25 | 21-Jan-2025 | Matthias Springer <me@m-sp.org>
[mlir][IR][NFC] Move free-standing functions to `MemRefType` (#123465)
Turn free-standing `MemRefType`-related helper functions in
`BuiltinTypes.h` into member functions.
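As a hedged sketch of the change at a call site (assuming `getStridesAndOffset` was among the moved helpers; names otherwise illustrative):
```cpp
#include "mlir/IR/BuiltinTypes.h"
#include "llvm/ADT/SmallVector.h"

using namespace mlir;

// Illustrative only: getStridesAndOffset is assumed to be one of the
// helpers that moved from a free function onto MemRefType.
LogicalResult getLayout(MemRefType memrefType,
                        SmallVectorImpl<int64_t> &strides, int64_t &offset) {
  // Before: return getStridesAndOffset(memrefType, strides, offset);
  return memrefType.getStridesAndOffset(strides, offset);
}
```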

Revision tags: llvmorg-19.1.7

# 09dfc571 | 20-Dec-2024 | Jacques Pienaar <jpienaar@google.com>
[mlir] Enable decoupling two kinds of greedy behavior. (#104649)
The greedy rewriter is used in many different flows and it has a lot of
convenience (work list management, debugging actions, tracing, etc). But
it combines two kinds of greedy behavior: 1) how ops are matched, and 2)
folding wherever it can.
These are independent forms of greediness, and combining them leads to
inefficiency. E.g., in cases where one needs to create different phases
in a lowering and must apply patterns in a specific order split across
different passes, using the driver means needlessly retrying
folding/having multiple rounds of folding attempts, where one final run
would have sufficed.
Of course folks can locally avoid this behavior by just building their
own driver, but this is also a commonly requested feature that folks
keep working around locally in suboptimal ways.
For downstream users, there should be no behavioral change. Updating
from the deprecated API should just be a find-and-replace (e.g., of the
`find ./ -type f -exec sed -i
's|applyPatternsAndFoldGreedily|applyPatternsGreedily|g' {} \;` variety),
as the API arguments haven't changed between the two.
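A minimal sketch of that migration at a C++ call site (the surrounding function is illustrative; only the driver entry point's name changes):
```cpp
#include <utility>

#include "mlir/Transforms/GreedyPatternRewriteDriver.h"

using namespace mlir;

// Illustrative call site: the arguments are identical, only the name differs.
LogicalResult runCanonicalization(Operation *op, RewritePatternSet &&patterns) {
  // Before: return applyPatternsAndFoldGreedily(op, std::move(patterns));
  return applyPatternsGreedily(op, std::move(patterns));
}
```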

Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3

# 27a713f5 | 12-Aug-2024 | Benjamin Maxwell <benjamin.maxwell@arm.com>
[mlir][vector] Add scalable lowering for `transfer_write(transpose)` (#101353)
This specifically handles the case of a transpose from a vector type
like `vector<8x[4]xf32>` to `vector<[4]x8xf32>`. Such transposes occur
fairly frequently when scalably vectorizing `linalg.generic`s. There is
no direct lowering for these (as types like `vector<[4]x8xf32>` cannot
be represented in LLVM-IR). However, if the only use of the transpose is
a write, then it is possible to lower the `transfer_write(transpose)` as
a VLA loop.
Example:
```mlir
%transpose = vector.transpose %vec, [1, 0]
: vector<4x[4]xf32> to vector<[4]x4xf32>
vector.transfer_write %transpose, %dest[%i, %j] {in_bounds = [true, true]}
: vector<[4]x4xf32>, memref<?x?xf32>
```
Becomes:
```mlir
%c1 = arith.constant 1 : index
%c4 = arith.constant 4 : index
%c0 = arith.constant 0 : index
%0 = vector.extract %arg0[0] : vector<[4]xf32> from vector<4x[4]xf32>
%1 = vector.extract %arg0[1] : vector<[4]xf32> from vector<4x[4]xf32>
%2 = vector.extract %arg0[2] : vector<[4]xf32> from vector<4x[4]xf32>
%3 = vector.extract %arg0[3] : vector<[4]xf32> from vector<4x[4]xf32>
%vscale = vector.vscale
%c4_vscale = arith.muli %vscale, %c4 : index
scf.for %idx = %c0 to %c4_vscale step %c1 {
%4 = vector.extract %0[%idx] : f32 from vector<[4]xf32>
%5 = vector.extract %1[%idx] : f32 from vector<[4]xf32>
%6 = vector.extract %2[%idx] : f32 from vector<[4]xf32>
%7 = vector.extract %3[%idx] : f32 from vector<[4]xf32>
%slice_i = affine.apply #map(%idx)[%i]
%slice = vector.from_elements %4, %5, %6, %7 : vector<4xf32>
vector.transfer_write %slice, %arg1[%slice_i, %j] {in_bounds = [true]}
: vector<4xf32>, memref<?x?xf32>
}
```

Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6

# dec8055a | 09-May-2024 | Kazu Hirata <kazu@google.com>
[mlir] Use StringRef::operator== instead of StringRef::equals (NFC) (#91560)
I'm planning to remove StringRef::equals in favor of StringRef::operator==.
- StringRef::operator==/!= outnumber StringRef::equals by a factor of 10 under mlir/ in terms of their usage.
- The elimination of StringRef::equals brings StringRef closer to std::string_view, which has operator== but not equals.
- S == "foo" is more readable than S.equals("foo"), especially for !Long.Expression.equals("str") vs Long.Expression != "str".
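A minimal sketch of the mechanical change at a call site:
```cpp
#include "llvm/ADT/StringRef.h"

// Illustrative only: equality via operator== instead of equals().
bool isFoo(llvm::StringRef s) {
  // Before: return s.equals("foo");
  return s == "foo";
}
```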

Revision tags: llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init

# 5fcf907b | 17-Jan-2024 | Matthias Springer <me@m-sp.org>
[mlir][IR] Rename "update root" to "modify op" in rewriter API (#78260)
This commit renames 4 pattern rewriter API functions:
* `updateRootInPlace` -> `modifyOpInPlace`
* `startRootUpdate` -> `startOpModification`
* `finalizeRootUpdate` -> `finalizeOpModification`
* `cancelRootUpdate` -> `cancelOpModification`
The term "root" is a misnomer. The root is the op that a rewrite pattern
matches against
(https://mlir.llvm.org/docs/PatternRewriter/#root-operation-name-optional).
A rewriter must be notified of all in-place op modifications, not just
in-place modifications of the root
(https://mlir.llvm.org/docs/PatternRewriter/#pattern-rewriter). The old
function names were confusing and have contributed to various broken
rewrite patterns.
Note: The new function names use the term "modify" instead of "update"
for consistency with the `RewriterBase::Listener` terminology
(`notifyOperationModified`).
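A minimal sketch of the renamed API at a typical call site (the attribute set here is illustrative):
```cpp
#include "mlir/IR/PatternMatch.h"

using namespace mlir;

// Illustrative only: all in-place mutations must happen inside the callback
// so the rewriter is notified of the modification.
void markOp(RewriterBase &rewriter, Operation *op) {
  // Before: rewriter.updateRootInPlace(op, [&] { ... });
  rewriter.modifyOpInPlace(op, [&] {
    op->setAttr("marked", rewriter.getUnitAttr());
  });
}
```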

# aa2dc792 | 12-Jan-2024 | Matthias Springer <me@m-sp.org>
[mlir][vector] Fix rewrite pattern API violation in `VectorToSCF` (#77909)
A rewrite pattern is not allowed to change the IR if it returns
"failure". This commit fixes
`test/Conversion/VectorToSCF/vector-to-scf.mlir` when running with
`MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS`.
```
Processing operation : 'vector.transfer_read'(0x55823a409a60) {
%5 = "vector.transfer_read"(%arg0, %0, %0, %2, %4) <{in_bounds = [true, true], operandSegmentSizes = array<i32: 1, 2, 1, 1>, permutation_map = affine_map<(d0, d1) -> (d0, d1)>}> : (memref<?x4xf32>, index, index, f32, vector<[4]x4xi1>) -> vector<[4]x4xf32>
* Pattern (anonymous namespace)::lowering_n_d_unrolled::UnrollTransferReadConversion : 'vector.transfer_read -> ()' {
Trying to match "(anonymous namespace)::lowering_n_d_unrolled::UnrollTransferReadConversion"
** Insert : 'vector.splat'(0x55823a445640)
"(anonymous namespace)::lowering_n_d_unrolled::UnrollTransferReadConversion" result 0
} -> failure : pattern failed to match
LLVM ERROR: pattern returned failure but IR did change
```
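The underlying contract can be sketched as follows (the pattern and its logic are illustrative, not the code from this fix): every `failure()` path must come before any IR mutation.
```cpp
#include "mlir/IR/PatternMatch.h"

using namespace mlir;

// Illustrative skeleton: ops may only be created or modified once success
// is guaranteed, so that failing patterns leave the IR untouched.
struct ExamplePattern : RewritePattern {
  using RewritePattern::RewritePattern;

  LogicalResult matchAndRewrite(Operation *op,
                                PatternRewriter &rewriter) const override {
    if (op->getNumOperands() != 1 || op->getNumResults() != 1)
      return failure(); // OK: nothing has been created or modified yet.

    // Safe to mutate IR from here on; this pattern must now succeed.
    rewriter.replaceOp(op, op->getOperand(0));
    return success();
  }
};
```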

# 6b21948f | 03-Jan-2024 | Rik Huijzer <github@huijzer.xyz>
[mlir][vector] Fix invalid `LoadOp` indices being created (#76292)
Fixes https://github.com/llvm/llvm-project/issues/71326.
This is the second PR. The first PR at
https://github.com/llvm/llvm-project/pull/75519 was reverted because an
integration test failed. The failed integration test was simplified and
added to the core MLIR tests. Compared to the first PR, the current PR
uses a more reliable approach. In summary, the current PR determines the
mask indices by looking up the _mask_ buffer load indices from the
previous iteration, whereas `main` looks up the indices for the _data_
buffer. The mask and data indices can differ when using a
`permutation_map`.
The cause of the issue was that a new `LoadOp` was created which looked
something like:
```mlir
func.func @main(%arg1 : index, %arg2 : index) {
%alloca_0 = memref.alloca() : memref<vector<1x32xi1>>
%1 = vector.type_cast %alloca_0 : memref<vector<1x32xi1>> to memref<1xvector<32xi1>>
%2 = memref.load %1[%arg1, %arg2] : memref<1xvector<32xi1>>
return
}
```
which crashed inside `LoadOp::verify`. Note that `%alloca_0` is the
mask, as can be seen from the `i1` element type, and that it is
0-dimensional. Next, `%1` has one dimension, but `memref.load` tries to
index it with two indices.
This issue occurred in the following code (a simplified version of the
bug report):
```mlir
#map1 = affine_map<(d0, d1, d2, d3) -> (d0, 0, 0, d3)>
func.func @main(%subview: memref<1x1x1x1xi32>, %mask: vector<1x1xi1>) -> vector<1x1x1x1xi32> {
%c0 = arith.constant 0 : index
%c0_i32 = arith.constant 0 : i32
%3 = vector.transfer_read %subview[%c0, %c0, %c0, %c0], %c0_i32, %mask {permutation_map = #map1}
: memref<1x1x1x1xi32>, vector<1x1x1x1xi32>
return %3 : vector<1x1x1x1xi32>
}
```
After this patch, it is lowered to the following by
`-convert-vector-to-scf`:
```mlir
func.func @main(%arg0: memref<1x1x1x1xi32>, %arg1: vector<1x1xi1>) -> vector<1x1x1x1xi32> {
%c0_i32 = arith.constant 0 : i32
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%alloca = memref.alloca() : memref<vector<1x1x1x1xi32>>
%alloca_0 = memref.alloca() : memref<vector<1x1xi1>>
memref.store %arg1, %alloca_0[] : memref<vector<1x1xi1>>
%0 = vector.type_cast %alloca : memref<vector<1x1x1x1xi32>> to memref<1xvector<1x1x1xi32>>
%1 = vector.type_cast %alloca_0 : memref<vector<1x1xi1>> to memref<1xvector<1xi1>>
scf.for %arg2 = %c0 to %c1 step %c1 {
%3 = vector.type_cast %0 : memref<1xvector<1x1x1xi32>> to memref<1x1xvector<1x1xi32>>
scf.for %arg3 = %c0 to %c1 step %c1 {
%4 = vector.type_cast %3 : memref<1x1xvector<1x1xi32>> to memref<1x1x1xvector<1xi32>>
scf.for %arg4 = %c0 to %c1 step %c1 {
%5 = memref.load %1[%arg2] : memref<1xvector<1xi1>>
%6 = vector.transfer_read %arg0[%arg2, %c0, %c0, %c0], %c0_i32, %5 {in_bounds = [true]} : memref<1x1x1x1xi32>, vector<1xi32>
memref.store %6, %4[%arg2, %arg3, %arg4] : memref<1x1x1xvector<1xi32>>
}
}
}
%2 = memref.load %alloca[] : memref<vector<1x1x1x1xi32>>
return %2 : vector<1x1x1x1xi32>
}
```
What was causing the problems is that one dimension of the data buffer
`%alloca` (eltype `i32`) is unpacked (`vector.type_cast`) inside the
outermost loop (loop with index variable `%arg2`) and the nested loop
(loop with index variable `%arg3`), whereas the mask buffer `%alloca_0`
(eltype `i1`) is not unpacked in these loops.
Before this patch, the load indices would be determined by looking up
the load indices for the *data* buffer load op. However, as shown in the
specific example, when a permutation map is specified then the load
indices from the data buffer load op start to differ from the indices
for the mask op. To fix this, this patch ensures that the load indices
for the *mask* buffer are used instead.
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>

# 9f5afc3d | 17-Dec-2023 | Rik Huijzer <github@huijzer.xyz>
Revert "[mlir][vector] Fix invalid `LoadOp` indices being created (#75519)"
This reverts commit 3a1ae2f46db473cfde4baa6e1b090f5dae67e8db.

# 3a1ae2f4 | 17-Dec-2023 | Rik Huijzer <github@huijzer.xyz>
[mlir][vector] Fix invalid `LoadOp` indices being created (#75519)
Fixes https://github.com/llvm/llvm-project/issues/71326.
The cause of the issue was that a new `LoadOp` was created which looked
something like:
```mlir
func.func @main(%arg1 : index, %arg2 : index) {
%alloca_0 = memref.alloca() : memref<vector<1x32xi1>>
%1 = vector.type_cast %alloca_0 : memref<vector<1x32xi1>> to memref<1xvector<32xi1>>
%2 = memref.load %1[%arg1, %arg2] : memref<1xvector<32xi1>>
return
}
```
which crashed inside the `LoadOp::verify`. Note here that `%alloca_0` is
0 dimensional, `%1` has one dimension, but `memref.load` tries to index
`%1` with two indices.
This is now fixed by using the fact that `unpackOneDim` always unpacks
one dim
(https://github.com/llvm/llvm-project/blob/1bce61e6b01b38e04260be4f422bbae59c34c766/mlir/lib/Conversion/VectorToSCF/VectorToSCF.cpp#L897-L903),
and so the `loadOp` should index only one dimension.
---------
Co-authored-by: Benjamin Maxwell <macdue@dueutil.tech>

# 8f9aac44 | 06-Dec-2023 | Matthias Springer <me@m-sp.org>
[mlir][vector] Fix invalid IR in `vector.print` lowering (#74410)
`DecomposePrintOpConversion` used to generate invalid ops such as:
```
error: 'arith.extsi' op operand type 'vector<10xi32>' and result type 'vector<10xi32>' are cast incompatible
vector.print %v9 : vector<10xi32>
```
This commit fixes tests such as
`mlir/test/Integration/Dialect/Vector/CPU/test-reductions-i32.mlir` when
verifying the IR after each pattern application (#74270).

# c84061fd | 30-Nov-2023 | Rik Huijzer <github@huijzer.xyz>
[mlir][vector] Fix a `target-rank=0` unrolling (#73365)
Fixes https://github.com/llvm/llvm-project/issues/64269.
With this patch, calling `mlir-opt "-convert-vector-to-scf=full-unroll
target-rank=0"` on
```mlir
func.func @main(%vec : vector<2xi32>) {
%alloc = memref.alloc() : memref<4xi32>
%c0 = arith.constant 0 : index
vector.transfer_write %vec, %alloc[%c0] : vector<2xi32>, memref<4xi32>
return
}
```
will result in
```mlir
module {
func.func @main(%arg0: vector<2xi32>) {
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%alloc = memref.alloc() : memref<4xi32>
%0 = vector.extract %arg0[0] : i32 from vector<2xi32>
%1 = vector.broadcast %0 : i32 to vector<i32>
vector.transfer_write %1, %alloc[%c0] : vector<i32>, memref<4xi32>
%2 = vector.extract %arg0[1] : i32 from vector<2xi32>
%3 = vector.broadcast %2 : i32 to vector<i32>
vector.transfer_write %3, %alloc[%c1] : vector<i32>, memref<4xi32>
return
}
}
```
I've also tried to proactively find other `target-rank=0` bugs, but
couldn't find any. `options.targetRank` is only used 8 times throughout
the `mlir` folder, all inside `VectorToSCF.cpp`. None of the other uses
look like they could cause a crash. I've also tried
```mlir
func.func @main(%vec : vector<2xi32>) -> vector<2xi32> {
%alloc = memref.alloc() : memref<4xindex>
%c0 = arith.constant 0 : index
%out = vector.transfer_read %alloc[%c0], %c0 : memref<4xindex>, vector<2xi32>
return %out : vector<2xi32>
}
```
with `"--convert-vector-to-scf=full-unroll target-rank=0"` and that also
didn't crash. (Maybe obvious. I have to admit that I'm not very familiar
with these ops.)

Revision tags: llvmorg-17.0.6, llvmorg-17.0.5

# 1609f1c2 | 14-Nov-2023 | long.chen <lipracer@gmail.com>
[mlir][affine][nfc] cleanup deprecated T.cast style functions (#71269)
For details, see the documentation: https://mlir.llvm.org/deprecation/
Not all changes were made manually; most of them were made through a
clang tool I wrote: https://github.com/lipracer/cpp-refactor.
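The mechanical change being applied has this shape (a hedged sketch; the type used is illustrative):
```cpp
#include "mlir/IR/BuiltinTypes.h"

using namespace mlir;

// Illustrative only: prefer the free functions over the deprecated
// member-function casts.
bool isStaticMemRef(Type type) {
  // Deprecated: type.isa<MemRefType>() / type.dyn_cast<MemRefType>()
  if (auto memrefType = dyn_cast<MemRefType>(type))
    return memrefType.hasStaticShape();
  return false;
}
```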

# ced97ffd | 13-Nov-2023 | Benjamin Maxwell <benjamin.maxwell@arm.com>
[mlir][Vector] Don't fully unroll transfer_writes of n-D scalable vectors (#71924)
It is not possible to unroll a scalable vector at compile time. This
currently prevents transfer_writes from being lowered to
arm_sme.tile_writes (downstream).

Revision tags: llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2

# 9816edc9 | 28-Sep-2023 | Cullen Rhodes <cullen.rhodes@arm.com>
[mlir][vector] add result type to vector.extract assembly format (#66499)
The vector.extract assembly format currently only contains the source
type, for example:
```mlir
%1 = vector.extract %0[1] : vector<3x7x8xf32>
```
It's not immediately obvious whether this is the source or result type.
This patch improves the assembly format to make this clearer, so the
above becomes:
```mlir
%1 = vector.extract %0[1] : vector<7x8xf32> from vector<3x7x8xf32>
```

Revision tags: llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init

# 98f6289a | 11-Jul-2023 | Diego Caballero <diegocaballero@google.com>
[mlir][Vector] Add support for Value indices to vector.extract/insert
`vector.extract/insert` ops currently only support constant indices. This PR extends them so that arbitrary values can be used instead.
This work is part of the RFC: https://discourse.llvm.org/t/rfc-psa-remove-vector-extractelement-and-vector-insertelement-ops-in-favor-of-vector-extract-and-vector-insert-ops
Differential Revision: https://reviews.llvm.org/D155034
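As a hedged sketch of what this enables (the builder overload shown is an assumption about the generated builders, not confirmed by the commit):
```cpp
#include "mlir/Dialect/Vector/IR/VectorOps.h"

using namespace mlir;

// Hypothetical builder call: with this change, a dynamic index (an arbitrary
// Value) can be supplied where previously only constants were accepted.
Value extractAt(OpBuilder &b, Location loc, Value vec, Value dynIdx) {
  return b.create<vector::ExtractOp>(loc, vec, ArrayRef<OpFoldResult>{dynIdx});
}
```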

# 2a82dfd7 | 08-Sep-2023 | Benjamin Maxwell <benjamin.maxwell@arm.com>
[mlir][VectorOps] Don't drop scalable dims when lowering transfer_reads/writes (in VectorToSCF)
This allows the lowering of > rank 1 transfer_reads/writes to equivalent lower-rank ones when the trailing dimension is scalable. The resulting ops still cannot be completely lowered as they depend on arrays of scalable vectors being enabled, and a few related fixes (see D158517).
This patch also explicitly disables lowering transfer_reads/writes with a leading scalable dimension, as more changes would be needed to handle that correctly and it is unclear if it is required.
Examples of ops that can now be further lowered:
```mlir
%vec = vector.transfer_read %arg0[%c0, %c0], %cst, %mask {in_bounds = [true, true]} : memref<3x?xf32>, vector<3x[4]xf32>
vector.transfer_write %vec, %arg0[%c0, %c0], %mask {in_bounds = [true, true]} : vector<3x[4]xf32>, memref<3x?xf32>
```
Reviewed By: c-rhodes, awarzynski, dcaballe
Differential Revision: https://reviews.llvm.org/D158753

# f36e909d | 11-Aug-2023 | Benjamin Maxwell <benjamin.maxwell@arm.com>
[mlir][VectorOps] Use SCF for vector.print and allow scalable vectors
Reland of the original patch after updating the Python binding tests, a few CUDA/GPU MLIR tests, and ensuring the assembly format is round-trippable.
This patch splits the lowering of vector.print into first converting an n-D print into a loop of scalar prints of the elements, then a second pass that converts those scalar prints into the runtime calls. The former is done in VectorToSCF and the latter in VectorToLLVM.
The main reason for this is to allow printing scalable vector types, which are not possible to fully unroll at compile time, though this also avoids fully unrolling very large vectors.
To allow VectorToSCF to add the necessary punctuation between vectors and elements, a "punctuation" attribute has been added to vector.print. This abstracts calling the runtime functions such as printNewline(), without leaking the LLVM details into the higher abstraction levels. For example:
```mlir
vector.print punctuation <comma>
```
lowers to
```mlir
llvm.call @printComma() : () -> ()
```
The output format and runtime functions remain the same, which avoids the need to alter a large number of tests (aside from the pipelines).
Reviewed By: awarzynski, c-rhodes, aartbik
Differential Revision: https://reviews.llvm.org/D156519

# 1b272d21 | 10-Aug-2023 | Mehdi Amini <joker.eph@gmail.com>
Revert "[mlir][VectorOps] Use SCF for vector.print and allow scalable vectors"
This reverts commit 490dae26cb3bee2e8401e4c2a7ad3e0996be67d0.
The bot is broken; it seems there is an ambiguity problem in the parser.

# 490dae26 | 09-Aug-2023 | Benjamin Maxwell <benjamin.maxwell@arm.com>
[mlir][VectorOps] Use SCF for vector.print and allow scalable vectors
Reland of the original patch after updating the Python binding tests and a few CUDA/GPU MLIR tests.
This patch splits the lowering of vector.print into first converting an n-D print into a loop of scalar prints of the elements, then a second pass that converts those scalar prints into the runtime calls. The former is done in VectorToSCF and the latter in VectorToLLVM.
The main reason for this is to allow printing scalable vector types, which are not possible to fully unroll at compile time, though this also avoids fully unrolling very large vectors.
To allow VectorToSCF to add the necessary punctuation between vectors and elements, a "punctuation" attribute has been added to vector.print. This abstracts calling the runtime functions such as printNewline(), without leaking the LLVM details into the higher abstraction levels. For example:
```mlir
vector.print <comma>
```
lowers to
```mlir
llvm.call @printComma() : () -> ()
```
The output format and runtime functions remain the same, which avoids the need to alter a large number of tests (aside from the pipelines).
Reviewed By: awarzynski, c-rhodes, aartbik
Differential Revision: https://reviews.llvm.org/D156519

# b160442d | 09-Aug-2023 | Benjamin Maxwell <benjamin.maxwell@arm.com>
Revert "[mlir][VectorOps] Use SCF for vector.print and allow scalable vectors"
This reverts commit 3875804a0725c6490b4c0e76e1c0e1e0dbccedf4.
This caused some test failures for the MLIR python bindings. Reverting until those are addressed.

# 3875804a | 09-Aug-2023 | Benjamin Maxwell <benjamin.maxwell@arm.com>
[mlir][VectorOps] Use SCF for vector.print and allow scalable vectors
This patch splits the lowering of vector.print into first converting an n-D print into a loop of scalar prints of the elements, then a second pass that converts those scalar prints into the runtime calls. The former is done in VectorToSCF and the latter in VectorToLLVM.
The main reason for this is to allow printing scalable vector types, which are not possible to fully unroll at compile time, though this also avoids fully unrolling very large vectors.
To allow VectorToSCF to add the necessary punctuation between vectors and elements, a "punctuation" attribute has been added to vector.print. This abstracts calling the runtime functions such as printNewline(), without leaking the LLVM details into the higher abstraction levels. For example:
```mlir
vector.print <comma>
```
lowers to
```mlir
llvm.call @printComma() : () -> ()
```
The output format and runtime functions remain the same, which avoids the need to alter a large number of tests (aside from the pipelines).
Reviewed By: awarzynski, c-rhodes, aartbik
Differential Revision: https://reviews.llvm.org/D156519

# 16b75cd2 | 31-Jul-2023 | Matthias Springer <me@m-sp.org>
[mlir][vector] Use DenseI64ArrayAttr for ExtractOp/InsertOp positions
`DenseI64ArrayAttr` provides a better API than `I64ArrayAttr`. E.g., accessors returning `ArrayRef<int64_t>` (instead of `ArrayAttr`) are generated.
Differential Revision: https://reviews.llvm.org/D156684
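A hedged sketch of the accessor difference (assuming the generated accessor at the time was named `getPosition()` and returned `ArrayRef<int64_t>`):
```cpp
#include "mlir/Dialect/Vector/IR/VectorOps.h"

using namespace mlir;

// Hypothetical accessor use: DenseI64ArrayAttr yields ArrayRef<int64_t>
// directly, with no per-element IntegerAttr unwrapping as with I64ArrayAttr.
int64_t leadingPosition(vector::ExtractOp op) {
  ArrayRef<int64_t> position = op.getPosition();
  return position.front();
}
```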

# b1d26875 | 17-Jul-2023 | Matthias Springer <me@m-sp.org>
[mlir][IR] Remove duplicate `isLastMemrefDimUnitStride` functions
This function is duplicated in various dialects.
Differential Revision: https://reviews.llvm.org/D155462

# 5cebffc2 | 30-Jun-2023 | Andrzej Warzynski <andrzej.warzynski@arm.com>
[mlir][Vector] Update the lowering of `vector.transfer_write` to SCF
This change updates the lowering of `vector.transfer_write` to SCF when scalable vectors are used. Specifically, when lowering `vector.transfer_write` to a loop of `vector.extractelement` ops, make sure that the upper bound of the generated loop is scaled by `vector.vscale`:
```mlir
%10 = vector.vscale
%11 = arith.muli %10, %c16 : index
scf.for %arg2 = %c0 to %11 step %c1
```
For reference, this is the current version (i.e. before this change):
```mlir
scf.for %arg2 = %c0 to %c16 step %c1
```
Note that the unscaled bound is only valid for fixed-width vectors.
Differential Revision: https://reviews.llvm.org/D154226
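A hedged C++ sketch of how such a scaled upper bound could be built (the helper and its name are illustrative, not the code from this commit):
```cpp
#include "mlir/Dialect/Arith/IR/Arith.h"
#include "mlir/Dialect/Vector/IR/VectorOps.h"

using namespace mlir;

// Illustrative helper: upper bound = vector.vscale * fixed per-vector size.
Value scalableUpperBound(OpBuilder &b, Location loc, int64_t vectorSize) {
  Value size = b.create<arith::ConstantIndexOp>(loc, vectorSize);
  Value vscale = b.create<vector::VectorScaleOp>(loc, b.getIndexType());
  return b.create<arith::MulIOp>(loc, vscale, size);
}
```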

Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4

# 5550c821 | 08-May-2023 | Tres Popp <tpopp@google.com>
[mlir] Move casting calls from methods to function calls
The MLIR classes Type/Attribute/Operation/Op/Value support cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast functionality in addition to defining methods with the same name. This change begins the migration of uses of the method to the corresponding function call as has been decided as more consistent.
Note that there still exist classes that only define methods directly, such as AffineExpr, and this does not include work currently to support a functional cast/isa call.
Caveats include:
- This clang-tidy script probably has more problems.
- This only touches C++ code, so nothing that is being generated.

Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443

Implementation: This first patch was created with the following steps. The intention is to only do automated changes at first, so I waste less time if it's reverted, and so the first mass change is more clear as an example to other teams that will need to follow similar steps.

Steps (described per line, as comments are removed by git):
0. Retrieve the change from the following to build clang-tidy with an additional check: https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy.
2. Run clang-tidy over your entire codebase while disabling all checks and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds them to a pure state.
4. Some changes have been deleted for the following reasons:
   - Some files had a variable also named cast.
   - Some files had not included a header file that defines the cast functions.
   - Some files are definitions of the classes that have the casting methods, so the code still refers to the method instead of the function without adding a prefix or removing the method declaration at the same time.

```
ninja -C $BUILD_DIR clang-tidy

run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions' -header-filter=mlir/ mlir/* -fix

rm -rf $BUILD_DIR/tools/mlir/**/*.inc

git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp mlir/lib/**/IR/ mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp mlir/test/lib/Dialect/Test/TestTypes.cpp mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp mlir/test/lib/Dialect/Test/TestAttributes.cpp mlir/unittests/TableGen/EnumsGenTest.cpp mlir/test/python/lib/PythonTestCAPI.cpp mlir/include/mlir/IR/
```
Differential Revision: https://reviews.llvm.org/D150123