Revision tags: llvmorg-21-init |
|
#
6b1db798 |
| 22-Jan-2025 |
Sander de Smalen <sander.desmalen@arm.com> |
Revert "Reland "RegisterCoalescer: Add implicit-def of super register when coalescing SUBREG_TO_REG" (#123632)"
There's a regression with one of the bootstrap builds for x86. I'll revert this while I investigate.
This reverts commit 4df6d3df24ae9cff07c70c96a1663cbba6e1dca5.
|
#
4df6d3df |
| 22-Jan-2025 |
Sander de Smalen <sander.desmalen@arm.com> |
Reland "RegisterCoalescer: Add implicit-def of super register when coalescing SUBREG_TO_REG" (#123632)
This PR aims to reland work done by @arsenm which was previously
reverted due to some tangentially related scheduler issues as discussed
on #76416.
This PR cherry-picks the original commit (0e46b49de433), and adds
another patch on top with the following changes:
* The code in `updateRegDefsUses` now updates subranges when
subreg-liveness-tracking is enabled.
* When adding an implicit-def operand for the super-register,
the code in `reMaterializeTrivialDef` which tries to remove
undefined subranges should now take into account that the lanes
from the super-reg are no longer undefined.
Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
|
Revision tags: llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2 |
|
#
53c37f30 |
| 29-Jul-2024 |
Stefan Pintilie <stefanp@ca.ibm.com> |
[PowerPC] Add phony subregisters to cover the high half of the VSX registers. (#94628)
On PowerPC there are 128 bit VSX registers. These registers are half
overlapped with 64 bit floating point registers (FPR). The 64 bit half
of the VSX register that does not overlap with the FPR does not overlap
with any other register class. The FPR are the only subregisters of the
VSX registers but they do not fully cover the 128 bit super register.
This leads to incorrect lane masks being created.
This patch adds phony registers for the other half of the VSX registers
in order to fully cover them and to make sure that the lane masks are
not the same for the VSX and the floating point register.
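The lane-mask problem described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `lane_mask` and `covering_mask` are hypothetical names, one bit stands for one lane, and a register's mask is taken to be the union of its sub-register lanes. It is not LLVM's actual TableGen or `LaneBitmask` logic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one bit per lane. Not LLVM's LaneBitmask type. */
typedef uint32_t lane_mask;

/* Assumption: a register's lane mask is the union of the lanes
 * contributed by each of its sub-registers. */
lane_mask covering_mask(const lane_mask *subreg_lanes, size_t count) {
    assert(subreg_lanes != NULL || count == 0);
    lane_mask mask = 0;
    for (size_t i = 0; i < count; ++i)
        mask |= subreg_lanes[i];
    return mask;
}
```

In this model, a VSX register whose only sub-register is the FPR lane (`0x1`) ends up with the same mask as the FPR itself, while adding a second, phony lane (`0x2`) for the high half gives the super register a distinct mask (`0x3`).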
|
Revision tags: llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0 |
|
#
7b021f2e |
| 13-Sep-2023 |
Maryam Moghadas <maryammo@ca.ibm.com> |
[PowerPC] Optimize VPERM and fix code order for swapping vector operands on LE
This patch reverts commit 7614ba0a5db8 to optimize VPERM when one of its vector operands is XXSWAPD, similar to XXPERM. It also reorders the operand-swap code on LE so that the vector operand is swapped after the mask operand has been adjusted. This ensures that the vector operand is swapped at the correct point in the code, resulting in a valid constant pool for the mask operand.
Reviewed By: stefanp
Differential Revision: https://reviews.llvm.org/D149083
|
#
b922a362 |
| 08-Sep-2023 |
Qiu Chaofan <qiucofan@cn.ibm.com> |
[PowerPC] Define SchedModel for Power8
PowerPC subtargets prior to Power9 use the 'legacy' itinerary way to provide scheduling information. This patch re-writes the tablegen file to define the scheduling information in the new SchedModel way, which can bring improvements to some benchmarks.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D154488
|
Revision tags: llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3 |
|
#
52a774fd |
| 17-Feb-2023 |
Ting Wang <Ting.Wang.SH@ibm.com> |
[PowerPC] remove XXSWAPD after load from CP which is a splat value
If the value loaded from the constant pool is a splat value of vector type, no swap is needed after the load.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D139491
|
Revision tags: llvmorg-16.0.0-rc2 |
|
#
f68fc8d9 |
| 30-Jan-2023 |
Nemanja Ivanovic <nemanja.i.ibm@gmail.com> |
[PowerPC] Fix incorrect shift amount for build_vector
The pattern for a build_vector node was incorrect for big endian subtargets.
|
Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6 |
|
#
7614ba0a |
| 25-Nov-2022 |
Maryam Moghadas <maryammo@ca.ibm.com> |
[PowerPC] Fix vperm codegen
Commit rG934d5fa2b8672695c335deed0e19d0e777c98403 changed the vperm codegen for cases where vperm is not replaced by xxperm; this patch reverts that change.
Reviewed By: stefanp
Differential Revision: https://reviews.llvm.org/D138736
|
Revision tags: llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1 |
|
#
934d5fa2 |
| 12-Sep-2022 |
Maryam Moghadas <maryammo@ca.ibm.com> |
[PowerPC] Exploit xxperm, check for dead vectors and substitute vperm with xxperm
The vperm instruction requires the data to be in the Altivec registers. If one of the vector operands is not used after this vperm instruction, then it can be substituted by xxperm, which doubles the number of available registers.
Reviewed By: stefanp
Differential Revision: https://reviews.llvm.org/D133700
|
#
427fb351 |
| 07-Oct-2022 |
Kai Nacke <kai.peter.nacke@ibm.com> |
[PPC] Opaque pointer migration, part 1.
The LIT test cases were migrated with the script provided by Nikita Popov. Due to the size of the change it is split into several parts.
Reviewed By: nemanja, amyk, nikic, PowerPC
Differential Revision: https://reviews.llvm.org/D135470
|
Revision tags: llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5 |
|
#
335e8bf1 |
| 08-Jun-2022 |
Quinn Pham <Quinn.Pham@ibm.com> |
[PowerPC] emit VSX instructions instead of VMX instructions for vector loads and stores
This patch changes the PowerPC backend to generate VSX load/store instructions for all vector loads/stores on Power8 and earlier (LE) instead of VMX load/store instructions. This change is needed because VMX instructions require the vector to be 16-byte aligned, so a vector load/store will fail with VMX instructions if the vector is misaligned. Also, `gcc` generates VSX instructions in this situation, which allow for unaligned access but require a swap instruction after loading/before storing. This is not an issue for BE because we already emit VSX instructions, since no swap is required. It is also not an issue on Power9 and up, since we have access to `lxv[x]`/`stxv[x]`, which allow for unaligned access and do not require swaps.
This patch also delays the VSX load/store for LE combines until after LegalizeOps to prioritize other load/store combines.
Reviewed By: #powerpc, stefanp
Differential Revision: https://reviews.llvm.org/D127309
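Why a swap is needed after a VSX load on little-endian Power8 can be sketched with a small scalar model. The names `v2u64`, `load_doublewords_be_order`, `swap_doublewords` and `load_le` are hypothetical, and the model only tracks which doubleword lands in which little-endian element; it is an illustration of the load-then-swap idea, not the backend's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit register holding two doublewords,
 * indexed by little-endian element number. */
typedef struct { uint64_t elem[2]; } v2u64;

/* Models a doubleword swap (what an xxswapd-style permute does). */
v2u64 swap_doublewords(v2u64 v) {
    v2u64 r = { { v.elem[1], v.elem[0] } };
    return r;
}

/* Models an lxvd2x-style load on an LE target: the two doublewords
 * arrive in the opposite of little-endian element order. */
v2u64 load_doublewords_be_order(const uint64_t *mem) {
    assert(mem != NULL);
    v2u64 r = { { mem[1], mem[0] } };
    return r;
}

/* Load followed by swap restores little-endian element order. */
v2u64 load_le(const uint64_t *mem) {
    return swap_doublewords(load_doublewords_be_order(mem));
}
```

The same reasoning applies in reverse for stores, where the swap comes before the store.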
|
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2 |
|
#
666ee849 |
| 06-Aug-2021 |
Kai Luo <lkail@cn.ibm.com> |
[PowerPC] Fix shift amount of xxsldwi when performing vector int_to_double
POC
```
// main.c
#include <stdio.h>
#include <altivec.h>
extern vector double foo(vector int s);
int main() {
  vector int s = {0, 1, 0, 4};
  vector double vd;
  vd = foo(s);
  printf("%lf %lf\n", vd[0], vd[1]);
  return 0;
}

// poc.c
vector double foo(vector int s) {
  int x1 = s[1];
  int x3 = s[3];
  double d1 = x1;
  double d3 = x3;
  vector double x = { d1, d3 };
  return x;
}
```
Compiled with `poc.c main.c -mcpu=pwr8 -O3` on BE machine. Current clang gives
```
4.000000 1.000000
```
while xlc gives
```
1.000000 4.000000
```
Xlc's output should be correct.
Reviewed By: shchenz, #powerpc
Differential Revision: https://reviews.llvm.org/D107428
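For reference, the semantics that the PoC's `foo` is expected to have can be restated in portable scalar C, without `altivec.h`. The function name `foo_scalar` and its array-based signature are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar restatement of foo() from the PoC: elements 1 and 3 of the
 * integer vector are converted to double, in that order. */
void foo_scalar(const int s[4], double out[2]) {
    assert(s != NULL && out != NULL);
    out[0] = (double)s[1];
    out[1] = (double)s[3];
}
```

With the PoC's input `{0, 1, 0, 4}`, this yields 1.0 followed by 4.0, which matches the xlc output that the commit message calls correct.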
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init |
|
#
ba627a32 |
| 14-Jul-2021 |
Amy Kwan <amy.kwan1@ibm.com> |
[PowerPC] Update Refactored Load/Store Implementation, XForm VSX Patterns, and Tests
This patch includes the following updates to the load/store refactoring effort introduced in D93370:
- Update various VSX patterns that used to "force" an XForm to instead just use XForm. This allows the patterns to compute the most optimal addressing mode (and to produce a DForm instruction when possible)
- Update pattern and test case for the LXVD2X/STXVD2X intrinsics
- Update LIT test cases that used to use the XForm instruction to use the DForm instruction
Differential Revision: https://reviews.llvm.org/D95115
show more ...
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
#
092619cf |
| 22-Apr-2021 |
Nemanja Ivanovic <nemanja.i.ibm@gmail.com> |
[PowerPC] Improve codegen for vector fp to int widening conversions
We currently do not utilize instructions that convert single precision vectors to doubleword integer vectors. These conversions come up in code occasionally and this improvement allows us to open code some functions that need to be added to altivec.h.
|
#
03e7feff |
| 20-Apr-2021 |
Nemanja Ivanovic <nemanja.i.ibm@gmail.com> |
[PowerPC] Canonicalize shuffles on big endian targets as well
Extend shuffle canonicalization and conversion of shuffles fed by vectorized scalars to big endian subtargets. For big endian subtargets, loads and direct moves of scalars into vector registers put the data in the correct element for SCALAR_TO_VECTOR if the data type is 8 bytes wide. However, if the data type is narrower, the value still ends up in the wrong place, although a different wrong place than on little endian targets.
This patch extends the combine that keeps values where they are if they feed a shuffle to big endian targets.
Differential revision: https://reviews.llvm.org/D100478
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
#
1fed1316 |
| 19-Jun-2020 |
Nemanja Ivanovic <nemanja.i.ibm@gmail.com> |
[PowerPC] Canonicalize shuffles to match more single-instruction masks on LE
We currently miss a number of opportunities to emit single-instruction VMRG[LH][BHW] instructions for shuffles on little endian subtargets. Although this in itself is not a huge performance opportunity since loading the permute vector for a VPERM can always be pulled out of loops, producing such merge instructions is useful to downstream optimizations. Since VPERM is essentially opaque to all subsequent optimizations, we want to avoid it as much as possible. Other permute instructions have semantics that can be reasoned about much more easily in later optimizations.
This patch does the following:
- Canonicalize shuffles so that the first element comes from the first vector (since that's what most of the mask matching functions want)
- Switch the elements that come from splat vectors so that they match the corresponding elements from the other vector (to allow for merges)
- Add debugging messages for when a shuffle is matched to a VPERM so that anyone interested in improving this further can get the info for their code
Differential revision: https://reviews.llvm.org/D77448
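The first canonicalization step above can be sketched as follows. This is a simplified illustration, not the code from the patch: the helper names `commute_mask` and `canonicalize_shuffle` are hypothetical, mask indices `0..n-1` are assumed to select from the first vector and `n..2n-1` from the second, and negative indices are assumed to mean undefined lanes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Rewrites the mask as if the two input vectors had been exchanged.
 * Negative entries (undefined lanes) are left untouched. */
void commute_mask(int *mask, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        if (mask[i] < 0)
            continue;
        mask[i] = (mask[i] < (int)n) ? mask[i] + (int)n : mask[i] - (int)n;
    }
}

/* Ensures the first mask element refers to the first vector.
 * Returns true if the caller must also swap the two operands. */
bool canonicalize_shuffle(int *mask, size_t n) {
    assert(mask != NULL && n > 0);
    if (mask[0] < (int)n)   /* already from the first vector, or undefined */
        return false;
    commute_mask(mask, n);
    return true;
}
```

For example, with four-element vectors the mask `{4, 0, 5, 1}` becomes `{0, 4, 1, 5}` once the operands are swapped, which has the shape of an interleaving merge.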
|
Revision tags: llvmorg-10.0.1-rc1 |
|
#
513976df |
| 16-Apr-2020 |
Kang Zhang <shkzhang@cn.ibm.com> |
[PowerPC] Ignore implicit register operands for MCInst
Summary: When doing the conversion MachineInstr -> MCInst, we should ignore the implicit operands; this exposes more opportunities for InstAlias.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D77118
|
#
ecd84354 |
| 07-Apr-2020 |
Nemanja Ivanovic <nemanja.i.ibm@gmail.com> |
[NFC][PowerPC] Fix register class for patterns using XXPERMDIs
There are a few patterns where we use a superclass for inputs to this instruction rather than the correct class. This can sometimes lead to unnecessary copies.
|
#
bfa9ce1c |
| 23-Mar-2020 |
Nemanja Ivanovic <nemanja.i.ibm@gmail.com> |
[PowerPC] Improve handling of some BUILD_VECTOR nodes
An analysis of real world code turned up a number of patterns with BUILD_VECTOR of nodes resulting from operations on extracted vector elements for which we produce poor code. This addresses those cases. No attempt is made for completeness as that would entail a large amount of work for something that there is no evidence of in real code.
Differential revision: https://reviews.llvm.org/D72660
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1 |
|
#
228dd96c |
| 12-Nov-2019 |
Jinsong Ji <jji@us.ibm.com> |
[PowerPC] Remove allow-deprecated-dag-overlap and fix broken tests
Summary: This is found during review of https://reviews.llvm.org/D67088.
CHECK-DAG is non-overlapping after https://reviews.llvm.org/D47106. -allow-deprecated-dag-overlap was introduced to temporarily accept the old behavior.
But it actually hides some broken tests, e.g. `test/CodeGen/PowerPC/swaps-le-1.ll`. The codegen has changed, but the CHECK-DAG lines still pass because `overlap` is allowed.
This patch removes the deprecated option and fixes the broken tests.
Reviewers: #powerpc, hfinkel, nemanjai, steven.zhang, shchenz
Reviewed By: shchenz
Subscribers: shchenz, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69733
|
#
cf57be9d |
| 22-Oct-2019 |
Jinsong Ji <jji@us.ibm.com> |
[PowerPC][NFC] Remove deprecated Function Attrs comments
|
Revision tags: llvmorg-9.0.0 |
|
#
1461fb6e |
| 17-Sep-2019 |
Nemanja Ivanovic <nemanja.i.ibm@gmail.com> |
[PowerPC] Exploit single instruction load-and-splat for word and doubleword
We currently produce a load, followed by (possibly a move for integers and) a splat as separate instructions. VSX has always had a splatting load for doublewords, but as of Power9, we have it for words as well. This patch just exploits these instructions.
Differential revision: https://reviews.llvm.org/D63624
llvm-svn: 372139
|
Revision tags: llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3 |
|
#
37cd0dd2 |
| 14-Aug-2019 |
Jinsong Ji <jji@us.ibm.com> |
[PowerPC][NFC] Remove duplicate tests in build-vector-test.ll
AllOnes has been split into build-vector-allones.ll.
llvm-svn: 368900
|
Revision tags: llvmorg-9.0.0-rc2 |
|
#
66c32090 |
| 01-Aug-2019 |
Zi Xuan Wu <wuzish@cn.ibm.com> |
recommit:[PowerPC] Eliminate loads/swap feeding swap/store for vector type by using big-endian load/store
PowerPC has an instruction that loads a vector in big endian element order on a little endian target. So we can combine vector load + reverse into a big endian load to eliminate the swap instruction. Likewise, we can combine vector reverse + store into a big endian store.
Differential Revision: https://reviews.llvm.org/D65063
llvm-svn: 367516
|
#
54d446f7 |
| 31-Jul-2019 |
Zi Xuan Wu <wuzish@cn.ibm.com> |
revert r367382 because buildbot failure
llvm-svn: 367388
|