Revision tags: llvmorg-21-init

# 33f9d839 | 18-Jan-2025 | Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] X86FixupVectorConstants - split ConvertToBroadcastAVX512 helper to handle single bitwidth at a time.
Attempt 32-bit broadcasts first, then fall back to 64-bit broadcasts on failure.
We lose an explicit assertion for matching operand numbers, but X86InstrFoldTables already does something similar.
Pulled out of WIP patch #73509

Revision tags: llvmorg-19.1.7

# be6c752e | 12-Jan-2025 | Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] X86FixupVectorConstantsPass - use VPMOVSX/ZX extensions for PS/PD domain moves (#122601)
For targets with free domain moves, or AVX512 support, allow the use of VPMOVSX/ZX extension loads to reduce the load sizes.
I've limited this to extension to i32/i64 types as we're mostly interested in shuffle mask loading here, but we could include i16 types just as easily.
Inspired by a regression on #122485

Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1

# c59ac1a2 | 18-Sep-2024 | Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] Cleanup AVX512 VBROADCAST subvector instruction names. (#108888)
This patch makes the `VBROADCAST***X**` subvector broadcast instructions consistent - the `***X**` section represents the original subvector type/size, but we were not correctly using the AVX512 Z/Z256/Z128 suffix to consistently represent the destination width (or we missed it entirely).

Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5

# 1baa3850 | 18-Apr-2024 | Nikita Popov <npopov@redhat.com>

[IR][PatternMatch] Only accept poison in getSplatValue() (#89159)
In #88217 a large set of matchers was changed to only accept poison values in splats, but not undef values. This is because we now use poison for non-demanded vector elements, and allowing undef can cause correctness issues.
This patch covers the remaining matchers by changing the AllowUndef parameter of getSplatValue() to AllowPoison instead. We also carry out corresponding renames in matchers.
As a followup, we may want to change the default for things like m_APInt to m_APIntAllowPoison (as this is much less risky when only allowing poison), but this change doesn't do that.
There is one caveat here: we have a single place (X86FixupVectorConstants) which does require handling of vector splats with undefs. This is because it works on backend constant pool entries, which currently still use undef instead of poison for non-demanded elements (because SDAG as a whole does not have an explicit poison representation). As it's just the single use, I've open-coded a getSplatValueAllowUndef() helper there, to discourage use in any other places.

Revision tags: llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3

# bef25ae2 | 08-Feb-2024 | Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] X86FixupVectorConstants - use explicit register bitwidth for the loaded vector instead of using constant pool bitwidth
Fixes #81136 - we might be loading from a constant pool entry wider than the destination register bitwidth, affecting the vextload scale calculation.
ConvertToBroadcastAVX512 doesn't yet set an explicit bitwidth (it will default to the constant pool bitwidth) due to difficulties in looking up the original register width through the fold tables, but as we only use rebuildSplatCst this shouldn't cause any miscompilations, although it might prevent folding to broadcast if only the lower bits match a splatable pattern.

# f407be32 | 08-Feb-2024 | Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] X86FixupVectorConstants - rename FixupEntry::BitWidth to FixupEntry::MemBitWidth NFC.
Make it clearer that this refers to the width of the constant element stored in memory - which won't match the register element width after a sext/zextload.

# b8466138 | 08-Feb-2024 | Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] X86FixupVectorConstants - add destination register width to rebuildSplatCst/rebuildZeroUpperCst/rebuildExtCst callbacks
As found on #81136 - we aren't correctly handling cases where the constant pool entry is wider than the destination register width, causing incorrect scaling of the truncated constant for load-extension cases.
This first patch just pulls out the destination register width argument; it's still currently driven by the constant pool entry, but that will be addressed in a followup.

# 50d38cf9 | 07-Feb-2024 | Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] X86FixupVectorConstants.cpp - update comment to describe all the constant load ops performed by the pass

Revision tags: llvmorg-18.1.0-rc2

# 69ffa7be | 05-Feb-2024 | Simon Pilgrim <RKSimon@users.noreply.github.com>

[X86] X86FixupVectorConstants - load+zero vector constants that can be stored in a truncated form (#80428)
Further develops the vsextload support added in #79815 / b5d35feacb7246573c6a4ab2bddc4919a4228ed5 - reduces the size of the vector constant by storing it in the constant pool in a truncated form, and zero-extending it as part of the load.

# b5d35fea | 02-Feb-2024 | Simon Pilgrim <RKSimon@users.noreply.github.com>

[X86] X86FixupVectorConstants - load+sign-extend vector constants that can be stored in a truncated form (#79815)
Reduce the size of the vector constant by storing it in the constant pool in a truncated form, and sign-extend it as part of the load.
I've extended the existing FixupConstant functionality to support these sext constant rebuilds - we still select the smallest stored constant entry and prefer vzload/broadcast/vextload for the same bitwidth to avoid domain flips.
I intend to add the matching load+zero-extend handling in a future PR, but that requires some alterations to the existing MC shuffle comments handling first.

# 6ac4fe8d | 01-Feb-2024 | Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] X86FixupVectorConstants.cpp - refactor constant search loop to take array of sorted candidates
Pulled out of #79815 - refactors the internal FixupConstant logic to just accept an array of vzload/broadcast candidates that are pre-sorted in ascending constant pool size.

Revision tags: llvmorg-18.1.0-rc1

# cfb70267 | 29-Jan-2024 | Shengchen Kan <shengchen.kan@intel.com>

[X86][NFC] Rename lookupBroadcastFoldTable to lookupBroadcastFoldTableBySize
Address RKSimon's comments in #79761

# e4375bf4 | 25-Jan-2024 | Mikael Holmen <mikael.holmen@ericsson.com>

[X86] Fix warning about unused variable [NFC]
Without this, gcc complains:
../lib/Target/X86/X86FixupVectorConstants.cpp:70:13: warning: unused variable 'CUndef' [-Wunused-variable]
   70 |   if (auto *CUndef = dyn_cast<UndefValue>(C))
      |             ^~~~~~
Remove the unused variable and change dyn_cast to isa.

# 8b43c1be | 24-Jan-2024 | Simon Pilgrim <RKSimon@users.noreply.github.com>

[X86] X86FixupVectorConstants - shrink vector load to movsd/movss/movd/movq 'zero upper' instructions (#79000)
If we're loading a vector constant that is known to be zero in the upper elements, then attempt to shrink the constant and just scalar load the lower 32/64 bits.
Always choose the vzload/broadcast with the smallest constant load, and prefer vzload over broadcasts for the same bitwidth to avoid domain flips (mainly an AVX1 issue).
Fixes #73783

Revision tags: llvmorg-19-init

# 4e64ed97 | 22-Jan-2024 | Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] Update X86::getConstantFromPool to take base OperandNo instead of Displacement MachineOperand
This allows us to check the entire constant address calculation, and ensure we're not performing any runtime address math into the constant pool (noticed in an upcoming patch).

# c1729c8d | 19-Jan-2024 | Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] X86FixupVectorConstants.cpp - pull out rebuildConstant helper for future patches. NFC.
Add helper to convert raw APInt bit stream into ConstantDataVector elements.
This was used internally by rebuildSplatableConstant but will be reused in future patches for #73783 and #71078.

# d12dffac | 18-Jan-2024 | Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] Add X86::getConstantFromPool helper function to replace duplicate implementations.
We had the same helper function in shuffle decode / vector constant code - move this to X86InstrInfo to avoid duplication.

# 1d56138d | 12-Dec-2023 | Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] X86FixupVectorConstants - create f32/f64 broadcast constants if the source constant data was f32/f64
This partially reverts 33819f3bfb9c - the asm comments become a lot messier in #73509 - we're better off ensuring the constant data is the correct type in DAG.

# 33819f3b | 11-Dec-2023 | Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] X86FixupVectorConstants - create f32/f64 broadcast constants if the source constant data was ANY floating point type
We don't need an exact match; this is mainly cleanup for cases where v2f32-style types have been cast to f64 etc.

# d1deeae0 | 11-Dec-2023 | Simon Pilgrim <RKSimon@users.noreply.github.com>

[X86] Rename VBROADCASTF128/VBROADCASTI128 to VBROADCASTF128rm/VBROADCASTI128rm (#75040)
Add missing rm postfix to show these are load instructions.

# 539e60c3 | 30-Nov-2023 | Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] X86FixupVectorConstantsPass - consistently use non-DQI 128/256-bit subvector broadcasts
Without the predicate there's no benefit to using the DQI variants instead of the default AVX512F instructions.

# bafa51c8 | 28-Nov-2023 | Shengchen Kan <shengchen.kan@intel.com>

[X86] Rename X86MemoryFoldTableEntry to X86FoldTableEntry, NFCI
because it's used for entries that fold a load, store or broadcast.

Revision tags: llvmorg-17.0.6

# 1552b911 | 21-Nov-2023 | Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] X86FixupVectorConstantsPass - attempt to match VEX logic ops back to EVEX if we can create a broadcast fold
On non-DQI AVX512 targets, X86InstrInfo::setExecutionDomainCustom will convert EVEX int-domain instructions to VEX fp-domain instructions. But if we have the chance to use a broadcast fold, we're better off using an EVEX instruction, so handle the reverse fold.

Revision tags: llvmorg-17.0.5

# 6155fa69 | 02-Nov-2023 | Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] X86FixupVectorConstantsPass - pull out the hasAVX2() test and use single ConvertToBroadcast call. NFC.
Matches AVX512 ConvertToBroadcast calls and makes it easier to add extension support in the future.

Revision tags: llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init

# f6ff2cc7 | 14-Jun-2023 | Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] X86FixupVectorConstantsPass - attempt to replace full width integer vector constant loads with broadcasts on AVX2+ targets (REAPPLIED)
lowerBuildVectorAsBroadcast will not broadcast splat constants in all cases, resulting in a lot of situations where a full width vector load that has failed to fold but is loading splat constant values could use a broadcast load instruction just as cheaply, and save constant pool space.
This is an updated commit of ab4b924832ce26c21b88d7f82fcf4992ea8906bb after being reverted at 78de45fd4a902066617fcc9bb88efee11f743bc6.
