#
834cc88c |
| 13-Jun-2023 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[X86] X86FixupVectorConstantsPass - attempt to replace full width fp vector constant loads with broadcasts on AVX+ targets (REAPPLIED)
lowerBuildVectorAsBroadcast will not broadcast splat constants
[X86] X86FixupVectorConstantsPass - attempt to replace full width fp vector constant loads with broadcasts on AVX+ targets (REAPPLIED)
lowerBuildVectorAsBroadcast will not broadcast splat constants in all cases, resulting in a lot of situations where a full width vector load that has failed to fold but is loading splat constant values could use a broadcast load instruction just as cheaply, and save constant pool space.
NOTE: SSE3 targets can use MOVDDUP but not all SSE era CPUs can perform this as cheaply as a vector load, we will need to add scheduler model checks if we want to pursue this.
This is an updated commit of 98061013e01207444cfd3980cde17b5e75764fbe after being reverted at a279a09ab9524d1d74ef29b34618102d4b202e2f
show more ...
|
Revision tags: llvmorg-16.0.6 |
|
#
a279a09a |
| 06-Jun-2023 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
Revert rG98061013e01207444cfd3980 - [X86] X86FixupVectorConstantsPass - attempt to replace full width fp vector constant loads with broadcasts on AVX+ targets
Reverting while we address an existing
Revert rG98061013e01207444cfd3980 - [X86] X86FixupVectorConstantsPass - attempt to replace full width fp vector constant loads with broadcasts on AVX+ targets
Reverting while we address an existing issue exposed by this (Issue #63108)
show more ...
|
#
78de45fd |
| 06-Jun-2023 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
Revert rGab4b924832ce26c21b88d7f82fcf4992ea8906bb - [X86] X86FixupVectorConstantsPass - attempt to replace full width integer vector constant loads with broadcasts on AVX2+ targets
Reverting while w
Revert rGab4b924832ce26c21b88d7f82fcf4992ea8906bb - [X86] X86FixupVectorConstantsPass - attempt to replace full width integer vector constant loads with broadcasts on AVX2+ targets
Reverting while we address an existing issue exposed by this (Issue #63108)
show more ...
|
Revision tags: llvmorg-16.0.5 |
|
#
d6a36619 |
| 31-May-2023 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[X86] X86FixupVectorConstantsPass - use VBROADCASTSS/VBROADCASTSD for integer vector loads on AVX1-only targets
Matches behaviour in lowerBuildVectorAsBroadcast
|
#
ab4b9248 |
| 29-May-2023 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[X86] X86FixupVectorConstantsPass - attempt to replace full width integer vector constant loads with broadcasts on AVX2+ targets
lowerBuildVectorAsBroadcast will not broadcast splat constants in all
[X86] X86FixupVectorConstantsPass - attempt to replace full width integer vector constant loads with broadcasts on AVX2+ targets
lowerBuildVectorAsBroadcast will not broadcast splat constants in all cases, resulting in a lot of situations where a full width vector load that has failed to fold but is loading splat constant values could use a broadcast load instruction just as cheaply, and save constant pool space.
show more ...
|
#
98061013 |
| 27-May-2023 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[X86] X86FixupVectorConstantsPass - attempt to replace full width fp vector constant loads with broadcasts on AVX+ targets
lowerBuildVectorAsBroadcast will not broadcast splat constants in all cases
[X86] X86FixupVectorConstantsPass - attempt to replace full width fp vector constant loads with broadcasts on AVX+ targets
lowerBuildVectorAsBroadcast will not broadcast splat constants in all cases, resulting in a lot of situations where a full width vector load that has failed to fold but is loading splat constant values could use a broadcast load instruction just as cheaply, and save constant pool space.
NOTE: SSE3 targets can use MOVDDUP but not all SSE era CPUs can perform this as cheaply as a vector load, we will need to add scheduler model checks if we want to pursue this.
show more ...
|
#
0b91de5e |
| 19-May-2023 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[X86] Add X86FixupVectorConstantsPass to re-fold AVX512 vector load folds as broadcast folds
This patch analyzes AVX512 instructions for full vector width folded loads from the constant pool and att
[X86] Add X86FixupVectorConstantsPass to re-fold AVX512 vector load folds as broadcast folds
This patch analyzes AVX512 instructions for full vector width folded loads from the constant pool and attempts to determine if it can be replaced with a smaller broadcast folded variant. Typically the broadcast opportunities were missed by type-width mismatches or mulituse limitations which have been removed in later passes.
As well as introducing broadcast fold tables (which can hopefully be extended/automated in the future), this also handles mismatches in the AND/ANDN/OR/XOR/TERNLOG type-widths, catching additional missed opportunities.
This is patch is pulled from the ongoing work based on D150143, but without removing the existing DAG constant broadcast lowering code - this patch is currently a late stage cleanup only.
The intention is to add additional broadcast/extension handling of constants in future patches, but it turned out that AVX512 broadcast handling was the easiest to start with.
Differential Revision: https://reviews.llvm.org/D150526
show more ...
|