#
5921295d |
| 29-Jan-2025 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
Revert "[SLP] getSpillCost - fully populate IntrinsicCostAttributes to improve cost analysis." (#124962)
Reverts llvm/llvm-project#124129 as it's currently causing a regression at #124499. This avoids the regression until a proper fix can be added to getSpillCost.
|
#
4a1a6974 |
| 29-Jan-2025 |
Alexey Bataev <a.bataev@outlook.com> |
[SLP][NFC]Unify ScalarToTreeEntries and MultiNodeScalars, NFC
Currently, SLP has two distinct storages that map vectorized instructions to their corresponding vectorized TreeEntry nodes. This leads to inefficient lookup of the matching TreeEntries and makes it harder to correctly track instructions associated with multiple nodes. There is a plan to extend this support to instructions that require scheduling, to allow support for copyable elements. Merging ScalarToTreeEntry and MultiNodeScalars will reduce the maintenance cost of the feature.
Reviewers: RKSimon
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/124914
|
Revision tags: llvmorg-21-init |
|
#
947d8ebb |
| 28-Jan-2025 |
Alexey Bataev <a.bataev@outlook.com> |
[SLP]Unify getNumberOfParts use
Adds getNumberOfParts and uses it instead of similar code across the code base; fixes analysis of non-vectorizable types in computeMinimumValueSizes.
Reviewers: RKSimon
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/124774
|
#
a1ab5b4c |
| 28-Jan-2025 |
Alexey Bataev <a.bataev@outlook.com> |
[SLP]Check the MainOp matches the requirements for the instructions
MainOp must be included in the analysis of the instructions in getSameOpcode, so that it is checked against the same requirements and crashes during further analysis are prevented.
|
#
1d5fbe83 |
| 28-Jan-2025 |
Alexey Bataev <a.bataev@outlook.com> |
[SLP]Adjust NumberOfParts value for adjusted number of buildvector scalars
The NumParts value must be adjusted when the GatheredScalars are adjusted after extractelements analysis, to fix a compiler crash.
|
#
08d14e10 |
| 28-Jan-2025 |
Han-Kuan Chen <hankuan.chen@sifive.com> |
[SLP] Fix CommonMask will be transformed into an incorrect mask if createShuffle is called multiple times. (#124244)
We have two types of mask in SLP: a scalar mask and a vector mask.
When vectorizing four i32 additions into <4 x i32>, SLP creates a mask
of length 4.
When vectorizing four <2 x i32> additions into <8 x i32>, SLP also
creates a mask of length 4.
We refer to the first case as a scalar mask (because the mask element
represents a scalar, i32), and the second case as a vector mask (because
the mask element represents a vector, <2 x i32>).
At some point, we must convert the scalar mask into a vector mask
(otherwise, calling TTI cost functions or IRBuilderBase functions may
yield incorrect results).
Since both ShuffleCostEstimator and ShuffleInstructionBuilder can modify
the CommonMask, we have decided to perform the mask transformation only
within createShuffle. However, we do not store the transformed result,
as createShuffle may be called multiple times.
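
The scalar-to-vector mask conversion described above can be sketched in isolation. This is a minimal illustration, not the SLPVectorizer code: the helper name `expandMask` and its signature are hypothetical, and `-1` is assumed to mark a poison/undefined mask element. Each mask element that selects a whole sub-vector of `ElemWidth` lanes is expanded into `ElemWidth` consecutive lane indices.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (an assumption for illustration only): expand a mask
// whose elements each select one whole sub-vector of ElemWidth lanes into a
// mask that selects individual lanes. -1 marks a poison/undefined element.
std::vector<int> expandMask(const std::vector<int> &Mask, unsigned ElemWidth) {
  assert(ElemWidth > 0 && "sub-vector width must be non-zero");
  std::vector<int> Expanded;
  Expanded.reserve(Mask.size() * ElemWidth);
  for (int Idx : Mask)
    for (unsigned Lane = 0; Lane < ElemWidth; ++Lane)
      Expanded.push_back(Idx < 0 ? -1
                                 : Idx * static_cast<int>(ElemWidth) +
                                       static_cast<int>(Lane));
  return Expanded;
}
```

The helper returns a new mask and leaves its input untouched. If the expanded result were written back into the stored mask, a second call would expand it again and produce a mask of the wrong length, which is the kind of repeated-transformation problem the commit message describes.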
|
#
f1d5e70a |
| 27-Jan-2025 |
Alexey Bataev <a.bataev@outlook.com> |
[SLP][NFC]Do not check poison values for corresponding vectorized entries
There is no need to check whether poison values have been vectorized or to mark them as vectorized; this should apply only to instructions.
|
#
a12d7e4b |
| 24-Jan-2025 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[SLP] getVectorCallCosts - don't provide scalar argument data for vector IntrinsicCostAttributes (#124254)
getVectorCallCosts determines the cost of a vector intrinsic, based off
an existing scalar intrinsic call - but we were including the scalar
argument data to the IntrinsicCostAttributes, which meant that not only
was the cost calculation not type-only based, it was making incorrect
assumptions about constant values etc.
This also exposed an issue that x86 relied on fallback calculations for
funnel shift costs - this is great when we have the argument data as
that improves the accuracy of uniform shift amounts etc., but meant that
type-only costs would default to Cost=2 for all custom lowered funnel
shifts, which was far too cheap.
This is the reverse of #124129 where we weren't including argument data
when we could.
Fixes #63980
|
#
8e702735 |
| 24-Jan-2025 |
Jeremy Morse <jeremy.morse@sony.com> |
[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety, we'd like to
have all calls to moveBefore use iterators.
This patch adds a (guaranteed dereferenceable) iterator-taking
moveBefore, and changes a bunch of call-sites where it's obviously safe
to change to use it by just calling getIterator() on an instruction
pointer. A follow-up patch will contain less-obviously-safe changes.
We'll eventually deprecate and remove the instruction-pointer
insertBefore, but not before adding concise documentation of what
considerations are needed (very few).
|
#
c7e6ca76 |
| 23-Jan-2025 |
Alexey Bataev <a.bataev@outlook.com> |
[SLP][NFC]Add dump() method for ScheduleData struct type for better debugging
|
#
d8cd8d56 |
| 23-Jan-2025 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[SLP] getSpillCost - fully populate IntrinsicCostAttributes to improve cost analysis. (#124129)
We were only constructing the IntrinsicCostAttributes with the arg type info, and not the args themselves, preventing more detailed cost analysis (constant / uniform args etc.)
Just pass the whole IntrinsicInst to the constructor and let it resolve everything it can.
Noticed while having yet another attempt at #63980
|
#
fa299294 |
| 23-Jan-2025 |
Alexey Bataev <a.bataev@outlook.com> |
[SLP][NFC]Modernize code base in several places
|
#
d3aea77f |
| 23-Jan-2025 |
Han-Kuan Chen <hankuan.chen@sifive.com> |
[SLP] Move transformMaskAfterShuffle into BaseShuffleAnalysis and use it as much as possible. (#123896)
|
#
5deb4ef9 |
| 21-Jan-2025 |
Alexey Bataev <a.bataev@outlook.com> |
[SLP]Initial non-power-of-2 (but still whole register) for remaining nodes
Added non-power-of-2 (but still whole registers) vectorization support for nodes other than stores and reductions.
Reviewers: preames, RKSimon, hiraditya
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/113356
|
#
7d01a8f2 |
| 20-Jan-2025 |
Alexey Bataev <a.bataev@outlook.com> |
[SLP]Fix vector factor for repeated node for bv
When adding a node vector that is already used in the shuffle for a buildvector, the vector factor must be calculated from all vectors, not only this single vector, to avoid an incorrect result. The detection of reused entries must also be made more stable to avoid a mismatch between cost estimation and codegen.
Fixes #123639
|
#
2b1e037a |
| 18-Jan-2025 |
Alexey Bataev <a.bataev@outlook.com> |
[SLP]Fix createInsertVector mask emission
|
#
07d49653 |
| 18-Jan-2025 |
Han-Kuan Chen <hankuan.chen@sifive.com> |
[SLP] Replace MainOp and AltOp in TreeEntry with InstructionsState. (#122443)
Add TreeEntry::hasState.
Add assert for getTreeEntry.
Remove the OpValue parameter from the canReuseExtract function.
Remove the Opcode parameter from the ComputeMaxBitWidth lambda function.
|
#
b1bf95c0 |
| 17-Jan-2025 |
George Chaltas <george.chaltas@intel.com> |
ReduxWidth check for 0 (#123257)
Added assert to check for underflow of ReduxWidth
modified: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
Source code analysis flagged the operation (ReduxWidth - 1) as a
potential underflow, since ReduxWidth is unsigned.
This should never happen if everything is working correctly,
but an assert was added to check for it just in case.
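
The concern can be illustrated with a small standalone sketch. The function names below are hypothetical and are not taken from SLPVectorizer.cpp; the sketch only shows why subtracting 1 from an unsigned value of 0 is a problem, and how an assert documents the precondition.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for the flagged expression; the name and signature
// are illustrative assumptions, not the actual SLPVectorizer code.
unsigned lastLaneIndex(unsigned ReduxWidth) {
  // Without this check, ReduxWidth == 0 would silently wrap around.
  assert(ReduxWidth > 0 && "ReduxWidth must be non-zero");
  return ReduxWidth - 1;
}

// Demonstrates the wraparound that the assert guards against: unsigned
// arithmetic is modular, so 0 - 1 yields the largest unsigned value.
unsigned wrappedDecrement(unsigned Width) { return Width - 1; }
```

Because unsigned wraparound is well defined in C++, the subtraction itself does not trap; the assert turns a silent, very large value into an immediate failure in assertion-enabled builds.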
|
#
fec503d1 |
| 17-Jan-2025 |
Alexey Bataev <a.bataev@outlook.com> |
[SLP][NFC]Add safe createExtractVector and use instead Builder.CreateExtractVector
|
#
0fe8469e |
| 14-Jan-2025 |
Ramkumar Ramachandra <ramkumar.ramachandra@codasip.com> |
SLPVectorizer: strip bad FIXME (NFC) (#122888)
Follow up on 4a0d53a (PatternMatch: migrate to CmpPredicate) to get rid
of the FIXME it introduced in SLPVectorizer: the FIXME is bad, and we'd
get no testable impact by using CmpPredicate::getMatching here.
|
Revision tags: llvmorg-19.1.7 |
|
#
066b8887 |
| 13-Jan-2025 |
Alexey Bataev <a.bataev@outlook.com> |
[SLP]Correctly set vector operand for extracts with poisons
When extracts are vectorized and some of them are poison values instead of instructions, the vectorized operand must be set not to poison but to the main vector operand of the main extract instruction.
Fixes #122583
|
#
092d6283 |
| 13-Jan-2025 |
Alexey Bataev <a.bataev@outlook.com> |
[SLP]Check for div/rem instructions before extending with poisons
Need to check if the instructions can be safely extended with poison before actually doing this to avoid incorrect transformations.
Fixes #122691
|
#
af524de1 |
| 13-Jan-2025 |
Alexey Bataev <a.bataev@outlook.com> |
[SLP]Do not include subvectors for fully matched buildvectors
If the buildvector node fully matches another node, subvectors must be excluded when building the final shuffle; only a shuffle of the original node must be emitted.
Fixes #122584
|
#
56a37a3c |
| 13-Jan-2025 |
Mel Chen <mel.chen@sifive.com> |
[SLPVectorizer] Refactor HorizontalReduction::createOp (NFC) (#121549)
This patch simplifies select-based integer min/max reductions by
utilizing `llvm::getMinMaxReductionPredicate`, and generates
intrinsic-based min/max reductions by utilizing
`llvm::getMinMaxReductionIntrinsicOp`.
|
#
35e76b6a |
| 10-Jan-2025 |
Han-Kuan Chen <hankuan.chen@sifive.com> |
Revert "[SLP] NFC. Replace MainOp and AltOp in TreeEntry with InstructionsState. (#120198)"
This reverts commit f3d6cdc5aebafac3961d4fccbd2ca0e302c6082c.
|