VectorDistribute.cpp - OpenGrok history log for /llvm-project/mlir/lib/Dialect/Vector/Transforms/VectorDistribute.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6
# bc29fc93	13-Dec-2024	Petr Kurapov <petr.a.kurapov@intel.com>	[MLIR] Create GPU utils library & move distribution utils (#119264) Continue the move of `warp_execute_on_lane_0` op to the gpu dialect (#116994). This patch creates a utils library in GPU and moves generic helper functions there.
Revision tags: llvmorg-19.1.5
# ecaf2c33	22-Nov-2024	Petr Kurapov <petr.a.kurapov@intel.com>	[MLIR] Move warp_execute_on_lane_0 from vector to gpu (#116994) Please see the related RFC here: https://discourse.llvm.org/t/rfc-move-execute-on-lane-0-from-vector-to-gpu-dialect/82989. This patch does exactly one thing - moves the op to gpu.
Revision tags: llvmorg-19.1.4
# 2f925d75	18-Nov-2024	Kunwar Grover <groverkss@gmail.com>	[mlir][Vector] Move insert/extractelement distribution patterns to insert/extract (#116425) This is an NFC-ish change that moves vector.extractelement/vector.insertelement vector distribution patterns to vector.insert/vector.extract. Before: 0-d/1-d vector.extract -> vector.extractelement -> distributed vector.extractelement; 2-d+ vector.extract -> distributed vector.extract. After: scalar input vector.extract -> distributed vector.extract; vector.extractelement -> distributed vector.extract; 2-d+ vector.extract -> distributed vector.extract. The same changes are done for insertelement/insert. The change allows us to remove reliance on vector.extractelement/vector.insertelement, which are soon to be deprecated: https://discourse.llvm.org/t/rfc-psa-remove-vector-extractelement-and-vector-insertelement-ops-in-favor-of-vector-extract-and-vector-insert-ops/71116/8 No extra tests are included because this patch doesn't introduce or remove any functionality. It only changes the chain of lowerings. This change can be completely NFC if we make the distributed operation vector.extractelement/vector.insertelement, but that is slightly weird, because you are going from extractelement -> extract -> extractelement.
# b613a540	08-Nov-2024	Matthias Springer <me@m-sp.org>	[mlir][IR][NFC] Cleanup insertion point API usage (#115415) Use `setInsertionPointToStart` / `setInsertionPointToEnd` when possible.
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3
# b5e47d2e	14-Aug-2024	Bangtian Liu <liubangtian@gmail.com>	[mlir][vector] Add extra check on distribute types to avoid crashes (#102952) This PR addresses the issue detailed in https://github.com/iree-org/iree/issues/17948. The problem occurs when distributed types are set to NULL, leading to compilation crashes. Signed-off-by: Bangtian Liu <liubangtian@gmail.com>
Revision tags: llvmorg-19.1.0-rc2
# 5262865a	04-Aug-2024	Kazu Hirata <kazu@google.com>	[mlir] Construct SmallVector with ArrayRef (NFC) (#101896)
Revision tags: llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3
# 971b8525	01-Apr-2024	Jakub Kuderski <jakub@nod-labs.com>	[mlir][NFC] Simplify type checks with isa predicates (#87183) For more context on isa predicates, see: https://github.com/llvm/llvm-project/pull/83753.
Revision tags: llvmorg-18.1.2, llvmorg-18.1.1
# c2b95292	28-Feb-2024	Quinn Dawkins <quinn.dawkins@gmail.com>	[mlir][vector] Fix n-d transfer write distribution (#83215) Currently n-d transfer write distribution can be inconsistent with distribution of reductions if a value has multiple users, one of which is a transfer_write with a non-standard distribution map, and the other of which is a vector.reduction. We may want to consider removing the distribution map functionality in the future for this reason.
Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 5fcf907b	17-Jan-2024	Matthias Springer <me@m-sp.org>	[mlir][IR] Rename "update root" to "modify op" in rewriter API (#78260) This commit renames 4 pattern rewriter API functions: * `updateRootInPlace` -> `modifyOpInPlace` * `startRootUpdate` -> `startOpModification` * `finalizeRootUpdate` -> `finalizeOpModification` * `cancelRootUpdate` -> `cancelOpModification` The term "root" is a misnomer. The root is the op that a rewrite pattern matches against (https://mlir.llvm.org/docs/PatternRewriter/#root-operation-name-optional). A rewriter must be notified of all in-place op modifications, not just in-place modifications of the root (https://mlir.llvm.org/docs/PatternRewriter/#pattern-rewriter). The old function names were confusing and have contributed to various broken rewrite patterns. Note: The new function names use the term "modify" instead of "update" for consistency with the `RewriterBase::Listener` terminology (`notifyOperationModified`).
# ad100b36	12-Jan-2024	Matthias Springer <me@m-sp.org>	[mlir][vector] Fix dominance error in warp vector distribution (#77771) This commit fixes a test in `vector-warp-distribute.mlir` when `MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS` is enabled.
```
within split at /usr/local/google/home/springerm/mlir_public/llvm-project/mlir/test/Dialect/Vector/vector-warp-distribute.mlir:1 offset :18:10: error: operand #0 does not dominate this use
  %1 = vector.extract %0[9] : f32 from vector<64xf32>
  ^
within split at /usr/local/google/home/springerm/mlir_public/llvm-project/mlir/test/Dialect/Vector/vector-warp-distribute.mlir:1 offset :18:10: note: see current operation: %1 = "affine.apply"(%8) <{map = affine_map<()[s0] -> (s0 ceildiv 2)>}> : (index) -> index
within split at /usr/local/google/home/springerm/mlir_public/llvm-project/mlir/test/Dialect/Vector/vector-warp-distribute.mlir:1 offset :18:10: note: operand defined here (op in a child region)
"func.func"() <{function_type = (index) -> f32, sym_name = "vector_extract_1d"}> ({
^bb0(%arg0: index):
  %0:2 = "vector.warp_execute_on_lane_0"(%arg0) <{warp_size = 32 : i64}> ({
    %7 = "some_def"() : () -> vector<64xf32>
    %8 = "arith.constant"() <{value = 9 : index}> : () -> index
    %9 = "vector.extractelement"(%7, %8) : (vector<64xf32>, index) -> f32
    "vector.yield"(%9, %7) : (f32, vector<64xf32>) -> ()
  }) : (index) -> (f32, vector<2xf32>)
  %1 = "affine.apply"(%8) <{map = affine_map<()[s0] -> (s0 ceildiv 2)>}> : (index) -> index
  %2 = "affine.apply"(%8) <{map = affine_map<()[s0] -> (s0 mod 2)>}> : (index) -> index
  %3 = "vector.extractelement"(%0#1, %2) : (vector<2xf32>, index) -> f32
  %4 = "arith.index_cast"(%1) : (index) -> i32
  %5 = "arith.constant"() <{value = 32 : i32}> : () -> i32
  %6:2 = "gpu.shuffle"(%3, %4, %5) <{mode = #gpu<shuffle_mode idx>}> : (f32, i32, i32) -> (f32, i1)
  "func.return"(%6#0) : (f32) -> ()
}) : () -> ()
LLVM ERROR: IR failed to verify after pattern application
```
The position at which `vector.extractelement` extracts must also be distributed. The fix in `WarpOpExtractElement` is similar to `WarpOpInsertElement`.
# 35c19fdd	12-Jan-2024	Matthias Springer <me@m-sp.org>	[mlir][vector] Support warp distribution of `transfer_read` with dependencies (#77779) Support distribution of `vector.transfer_read` ops when operands are defined inside of the region of `warp_execute_on_lane_0` (except for the buffer from which the op is reading). Such IR was previously not supported. This commit changes the implementation such that indices and the padding value are also distributed. This commit simplifies the implementation considerably: the original implementation created a new `transfer_read` op and then checked if this new op is valid. If not, the rewrite pattern failed. This was a bit hacky. It was also a violation of the rewrite pattern API (detected by `MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS`) because the IR was modified, but the pattern returned "failure".
# 80636227	12-Dec-2023	Jakub Kuderski <jakub@nod-labs.com>	[mlir][vector] Allow vector distribution with multiple written elements (#75122) Add a configuration option to allow vector distribution with multiple elements written by a single lane. This is so that we can perform vector multi-reduction with multiple results per workgroup.
Revision tags: llvmorg-17.0.6
# f385f6c9	27-Nov-2023	Quinn Dawkins <quinn.dawkins@gmail.com>	[mlir][vector] Distribute all non-permutation or broadcasted masked transfer reads (#73539) The primary difficulty with distribution of masked transfers is when the permutation map permutes the vector, in which case the distribution logic needs to make sure the correct mask elements end up with the distributed transfer. This is only tricky when the permutation map has a permutation in it, so we can relax the condition for distribution.
Revision tags: llvmorg-17.0.5
# 1609f1c2	14-Nov-2023	long.chen <lipracer@gmail.com>	[mlir][affine][nfc] cleanup deprecated T.cast style functions (#71269) For details, see the document: https://mlir.llvm.org/deprecation/ Not all changes were made manually; most of them were made with a clang tool I wrote: https://github.com/lipracer/cpp-refactor.
# bc81f8c8	10-Nov-2023	Quinn Dawkins <quinn.dawkins@gmail.com>	[mlir][vector] Drop incorrect startRootUpdate calls in vector distribution (#71988) Fixes asan failures in https://lab.llvm.org/buildbot/#/builders/5/builds/38191 introduced by #71964.
# aa2376a0	10-Nov-2023	Quinn Dawkins <quinn.dawkins@gmail.com>	[mlir][vector] Notify the rewriter when sinking out of warp ops (#71964) A number of the warp distribution patterns work by rewriting a warp op in place by moving a contained op outside. This notifies the rewriter that the warp op is changing in this case.
# d4d28914	10-Nov-2023	Quinn Dawkins <quinn.dawkins@gmail.com>	[mlir][vector] Add distribution pattern for vector.create_mask (#71619) This is the last step needed for basic support for distributing masked vector code. The lane id gets delinearized based on the distributed mask shape and then compared against the original mask sizes to compute the bounds for the distributed mask. Note that the distribution of masks is implicit on the shape specified by the warp op. As a result, it is the responsibility of the consumer of the mask to ensure the distributed mask will match its own distribution semantics.
# df49a97a	10-Nov-2023	Quinn Dawkins <quinn.dawkins@gmail.com>	[mlir][vector] Root the transfer write distribution pattern on the warp op (#71868) Currently when there is a mix of transfer read ops and transfer write ops that need to be distributed, because the pattern for write distribution is rooted on the transfer write, it is hard to guarantee that the write gets distributed after the read when the two aren't directly connected by SSA. This is likely still relatively unsafe when there are undistributable ops, but structurally these patterns are a bit difficult to work with. For now pattern benefits give fairly good guarantees for happy paths.
# 7360d5d3	09-Nov-2023	Quinn Dawkins <quinn.dawkins@gmail.com>	[mlir][vector] Fix cases with multiple yielded transfer_read ops (#71625) This fixes two bugs: 1) When deciding whether a transfer read could be propagated out of a warp op, it looked for the first yield operand that was produced by a transfer read. If this transfer read wasn't ready to be distributed, the pattern would not re-check for any other transfer reads that could have been propagated. 2) When dropping dead warp results, we do so by updating the warp op signature and splicing in the old region. This does not add the ops in the body of the warp op back to the pattern applicator's worklist, and thus those operations won't be DCE'd. This is a problem for patterns like the one for transfer reads that will still see the dead operation as a user.
# 771f5759	09-Nov-2023	Quinn Dawkins <quinn.dawkins@gmail.com>	[mlir][vector] Add pattern to distribute masked reads (#71610) Because the distribution is based on types, supporting general masked reads requires first materializing the permutation map in IR to align the elements of the mask with the elements read by the transfer op. For now just support cases with the trivial permutation map.
# 25ec1fa9	07-Nov-2023	Quinn Dawkins <quinn.dawkins@gmail.com>	[mlir][vector] Add support for distributing masked writes (#71482) General distribution of masked writes requires materializing the permutation on the vector of the write in IR to ensure the vector lines up with the mask. For now just support cases with trivial permutation maps.
# 98dcd98a	06-Nov-2023	Quinn Dawkins <quinn.dawkins@gmail.com>	[mlir][vector] Hoist uniform scalar loop code after scf.for distribution (#71422) After propagation of `vector.warp_execute_on_lane_0` through `scf.for`, uniform operations like those on the loop iterators can now be hoisted out of the inner warp op.
Revision tags: llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2
# 9816edc9	28-Sep-2023	Cullen Rhodes <cullen.rhodes@arm.com>	[mlir][vector] add result type to vector.extract assembly format (#66499) The vector.extract assembly format currently only contains the source type, for example: `%1 = vector.extract %0[1] : vector<3x7x8xf32>`. It's not immediately obvious if this is the source or result type. This patch improves the assembly format to make this clearer, so the above becomes: `%1 = vector.extract %0[1] : vector<7x8xf32> from vector<3x7x8xf32>`
Revision tags: llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# 98f6289a	11-Jul-2023	Diego Caballero <diegocaballero@google.com>	[mlir][Vector] Add support for Value indices to vector.extract/insert `vector.extract/insert` ops only support constant indices. This PR is extending them so that arbitrary values can be used instead. This work is part of the RFC: https://discourse.llvm.org/t/rfc-psa-remove-vector-extractelement-and-vector-insertelement-ops-in-favor-of-vector-extract-and-vector-insert-ops Differential Revision: https://reviews.llvm.org/D155034
# 5cf714bb	18-Sep-2023	Matthias Springer <me@m-sp.org>	[mlir][SCF] scf.for: Consistent API around `initArgs` (#66512) * Always use the auto-generated `getInitArgs` function. Remove the hand-written `getInitOperands` duplicate. * Remove `hasIterOperands` and `getNumIterOperands`. The names were inconsistent because the "arg" is called `initArgs` in TableGen. Use `getInitArgs().size()` instead. * Fix verification around ops with no results.