Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2 |
|
#
909d6c86 |
| 19-Jan-2021 |
Victor Huang <wei.huang@ibm.com> |
[PowerPC] Fix the check for the instruction using FRSP/XSRSP output register
When performing peephole optimization to simplify the code, after removing passed FPSP/XSRSP instruction we will set any
[PowerPC] Fix the check for the instruction using FRSP/XSRSP output register
When performing peephole optimization to simplify the code, after removing passed FPSP/XSRSP instruction we will set any uses of that FRSP/XSRSP to the source of the FRSP/XSRSP.
We are finding the machine instruction using virtual register holding FRSP/XSRSP results by searching all following instructions and encountering an issue that the first use of the virtual register is a debug MI causing: 1. virtual register in the debug MI removed unexpectedly. 2. virtual register used in non-debug MI not replaced with the source of FRSP/XSRSP. which stays in a undef status.
This patch fix the issue by only searching non-debug machine instruction using virtual register holding FRSP/XSRSP results when the vr only has one non debug usage.
Differential Revisien: https://reviews.llvm.org/D94711 Reviewed by: nemanjai
show more ...
|
Revision tags: llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2 |
|
#
2ea7210e |
| 16-Dec-2020 |
Esme-Yi <esme.yi@ibm.com> |
Revert "[PowerPC] Extend folding RLWINM + RLWINM to post-RA."
This reverts commit 1c0941e1524f499e3fbde48fc3bdd0e70fc8f2e4.
|
#
913515e4 |
| 14-Dec-2020 |
Kazu Hirata <kazu@google.com> |
[Target] Use llvm::is_contained (NFC)
|
#
45ec3a37 |
| 03-Dec-2020 |
Baptiste Saleil <baptiste.saleil@ibm.com> |
[PowerPC] Fix for excessive ACC copies due to PHI nodes
When using accumulators in loops, they are passed around in PHI nodes of unprimed accumulators, causing the generation of additional prime/unp
[PowerPC] Fix for excessive ACC copies due to PHI nodes
When using accumulators in loops, they are passed around in PHI nodes of unprimed accumulators, causing the generation of additional prime/unprime instructions. This patch detects these cases and changes these PHI nodes to primed accumulator PHI nodes. We also add IR and MIR test cases for several PHI node cases.
Differential Revision: https://reviews.llvm.org/D91391
show more ...
|
Revision tags: llvmorg-11.0.1-rc1 |
|
#
1c0941e1 |
| 22-Nov-2020 |
Esme-Yi <esme.yi@ibm.com> |
[PowerPC] Extend folding RLWINM + RLWINM to post-RA.
Summary: We have the patterns to fold 2 RLWINMs before RA, while some RLWINM will be generated after RA, for example rGc4690b007743. If the RLWIN
[PowerPC] Extend folding RLWINM + RLWINM to post-RA.
Summary: We have the patterns to fold 2 RLWINMs before RA, while some RLWINM will be generated after RA, for example rGc4690b007743. If the RLWINM generated after RA followed by another RLWINM, we expect to perform the optimization too.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D89855
show more ...
|
#
5053eab8 |
| 03-Nov-2020 |
Esme-Yi <esme.yi@ibm.com> |
Revert "[PowerPC] Extend folding RLWINM + RLWINM to post-RA."
This reverts commit 119ab2181e6ed823849c93d55af8e989c28c9f3c.
|
#
119ab218 |
| 03-Nov-2020 |
Esme-Yi <esme.yi@ibm.com> |
[PowerPC] Extend folding RLWINM + RLWINM to post-RA.
Summary: This patch depends on D89846. We have the patterns to fold 2 RLWINMs in ppc-mi-peephole, while some RLWINM will be generated after RA, f
[PowerPC] Extend folding RLWINM + RLWINM to post-RA.
Summary: This patch depends on D89846. We have the patterns to fold 2 RLWINMs in ppc-mi-peephole, while some RLWINM will be generated after RA, for example rGc4690b007743. If the RLWINM generated after RA followed by another RLWINM, we expect to perform the optimization after RA, too.
Reviewed By: shchenz, steven.zhang
Differential Revision: https://reviews.llvm.org/D89855
show more ...
|
#
b969dfe2 |
| 03-Nov-2020 |
Esme-Yi <esme.yi@ibm.com> |
[NFC][PowerPC] Move the folding RLWINMs from ppc-mi-peephole to PPCInstrInfo.
Summary: We have the patterns to fold 2 RLWINMs in ppc-mi-peephole, while some RLWINM will be generated after RA, for ex
[NFC][PowerPC] Move the folding RLWINMs from ppc-mi-peephole to PPCInstrInfo.
Summary: We have the patterns to fold 2 RLWINMs in ppc-mi-peephole, while some RLWINM will be generated after RA, for example D88274. If the RLWINM generated after RA followed by another RLWINM, we expect to perform the optimization after RA, too. This is a NFC patch to move the folding patterns to PPCInstrInfo, and the follow-up works will be calling it in pre-emit-peephole and expand the patterns to handle more cases.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D89846
show more ...
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3 |
|
#
69289cc1 |
| 02-Sep-2020 |
Nemanja Ivanovic <nemanja.i.ibm@gmail.com> |
[PowerPC] Fix broken kill flag after MI peephole
The test case in https://bugs.llvm.org/show_bug.cgi?id=47373 exposed two bugs in the PPC back end. The first one was fixed in commit 27714075848e7f05
[PowerPC] Fix broken kill flag after MI peephole
The test case in https://bugs.llvm.org/show_bug.cgi?id=47373 exposed two bugs in the PPC back end. The first one was fixed in commit 27714075848e7f05a297317ad28ad2570d8e5a43 but the test case had to be added without -verify-machineinstrs due to the second bug. This commit fixes the use-after-kill that is left behind by the PPC MI peephole optimization.
show more ...
|
Revision tags: llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
#
8aa52b19 |
| 09-Jun-2020 |
Chen Zheng <czhengsz@cn.ibm.com> |
[APInt] set all bits for getBitsSetWithWrap if loBit == hiBit
differentiate getBitsSetWithWrap & getBitsSet when loBit == hiBit getBitsSetWithWrap sets all bits; getBitsSet does nothing.
Reviewed B
[APInt] set all bits for getBitsSetWithWrap if loBit == hiBit
differentiate getBitsSetWithWrap & getBitsSet when loBit == hiBit getBitsSetWithWrap sets all bits; getBitsSet does nothing.
Reviewed By: lkail, RKSimon, lebedev.ri
Differential Revision: https://reviews.llvm.org/D81325
show more ...
|
#
c9790d54 |
| 09-Jun-2020 |
Anil Mahmud <Anil.Mahmud@ibm.com> |
[PowerPC] Remove extra instruction left by emitRLDICWhenLoweringJumpTables
The function emitRLDICWhenLoweringJumpTables in PPCMIPeephole.cpp was supposed to convert a pair of RLDICL and RLDICR to a
[PowerPC] Remove extra instruction left by emitRLDICWhenLoweringJumpTables
The function emitRLDICWhenLoweringJumpTables in PPCMIPeephole.cpp was supposed to convert a pair of RLDICL and RLDICR to a single RLDIC, but it was leaving out the RLDICL instruction. This PR fixes the bug.
Differential Revision: https://reviews.llvm.org/D78063
show more ...
|
Revision tags: llvmorg-10.0.1-rc1 |
|
#
cd83333f |
| 12-May-2020 |
Kamau Bridgeman <kamau.bridgeman@ibm.com> |
[PowerPC] Fold redundant load immediates of zero and delete if possible
This patch folds redundant load immediates into a zero for instructions which recognise this as the value zero and not the reg
[PowerPC] Fold redundant load immediates of zero and delete if possible
This patch folds redundant load immediates into a zero for instructions which recognise this as the value zero and not the register. If the load immediate is no longer in use it is then deleted.
This is already done in earlier passes but the ppc-mi-peephole allows for a more general implementation.
Differential Revision: https://reviews.llvm.org/D69168
show more ...
|
#
64d44ae7 |
| 27-Apr-2020 |
Victor Huang <wei.huang@ibm.com> |
[PowerPC][Future] Remove "unskipableSimplifyCode()" in PPCMIPeephole.cpp
"unskipableSimplifyCode()" was added to handle unsafe BL8_NOTOC instruction when TOC was not completely removed. The function
[PowerPC][Future] Remove "unskipableSimplifyCode()" in PPCMIPeephole.cpp
"unskipableSimplifyCode()" was added to handle unsafe BL8_NOTOC instruction when TOC was not completely removed. The function is not needed after confirming TOC pointer is not used in a function that uses PC-Relative addressing.
Differential Revision: https://reviews.llvm.org/D78517
show more ...
|
#
6c4b40de |
| 08-Apr-2020 |
Stefan Pintilie <stefanp@ca.ibm.com> |
[PowerPC][Future] Add Support For Functions That Do Not Use A TOC.
On PowerPC most functions require a valid TOC pointer.
This is the case because either the function itself needs to use this point
[PowerPC][Future] Add Support For Functions That Do Not Use A TOC.
On PowerPC most functions require a valid TOC pointer.
This is the case because either the function itself needs to use this pointer to access the TOC or because other functions that are called from that function expect a valid TOC pointer in the register R2. The main exception to this is leaf functions that do not access the TOC since they are guaranteed not to need a valid TOC pointer.
This patch introduces a feature that will allow more functions to not require a valid TOC pointer in R2.
Differential Revision: https://reviews.llvm.org/D73664
show more ...
|
#
71f1ab53 |
| 03-Apr-2020 |
Qiu Chaofan <qiucofan@cn.ibm.com> |
[PowerPC] Remove unnecessary XSRSP instruction
MI peephole will remove unnecessary FRSP instructions. This patch removes such unnecessary XSRSP.
Reviewed By: steven.zhang
Differential Revision: ht
[PowerPC] Remove unnecessary XSRSP instruction
MI peephole will remove unnecessary FRSP instructions. This patch removes such unnecessary XSRSP.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D77208
show more ...
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init |
|
#
26ba160d |
| 09-Jan-2020 |
Zheng Chen <czhengsz@cn.ibm.com> |
[PowerPC] when folding rlwinm+rlwinm. to andi., we should use first rlwinm input reg.
%2:gprc = RLWINM %1:gprc, 27, 5, 10 %3:gprc = RLWINM_rec %2:gprc, 8, 5, 10, implicit-def $cr0
==>
%3:gprc = AN
[PowerPC] when folding rlwinm+rlwinm. to andi., we should use first rlwinm input reg.
%2:gprc = RLWINM %1:gprc, 27, 5, 10 %3:gprc = RLWINM_rec %2:gprc, 8, 5, 10, implicit-def $cr0
==>
%3:gprc = ANDI_rec %1, 0, implicit-def $cr0
we should use %1 instead of %2 as ANDI_rec input.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D71885
show more ...
|
#
24ee4ede |
| 06-Jan-2020 |
Jinsong Ji <jji@us.ibm.com> |
[PowerPC][NFC] Rename record instructions to use _rec suffix instead of o
We use o suffix to indicate record form instuctions, (as it is similar to dot '.' in mne?)
This was fine before, as we did
[PowerPC][NFC] Rename record instructions to use _rec suffix instead of o
We use o suffix to indicate record form instuctions, (as it is similar to dot '.' in mne?)
This was fine before, as we did not support XO-form. However, with https://reviews.llvm.org/D66902, we now have XO-form support.
It becomes confusing now to still use 'o' for record form, and it is weird to have something like 'Oo' .
This patch rename all 'o' instructions to use '_rec' instead. Also rename `isDot` to `isRecordForm`.
Reviewed By: #powerpc, hfinkel, nemanjai, steven.zhang, lkail
Differential Revision: https://reviews.llvm.org/D70758
show more ...
|
#
1b57749a |
| 26-Dec-2019 |
czhengsz <czhengsz@cn.ibm.com> |
[PowerPC] stop folding if result rlwinm mask is wrap while original rlwinm is not.
%1:g8rc = RLWINM8 %0:g8rc, 0, 16, 9 %2:g8rc = RLWINM8 killed %1:g8rc, 0, 0, 31 -> %2:g8rc = RLWINM8 %0:g8rc, 0, 16,
[PowerPC] stop folding if result rlwinm mask is wrap while original rlwinm is not.
%1:g8rc = RLWINM8 %0:g8rc, 0, 16, 9 %2:g8rc = RLWINM8 killed %1:g8rc, 0, 0, 31 -> %2:g8rc = RLWINM8 %0:g8rc, 0, 16, 9
The above folding is wrong. Before transformation, %2:g8rc is 32 bit value. After transformation, %2:g8rc becomes a 64 bit value. This patch fixes above issue.
Reviewed by: steven.zhang
Differential Revision: https://reviews.llvm.org/D71833
show more ...
|
Revision tags: llvmorg-9.0.1, llvmorg-9.0.1-rc3 |
|
#
a0b025b8 |
| 09-Dec-2019 |
Jinsong Ji <jji@us.ibm.com> |
[PowerPC] [NFC] Cleanup xxpermdi peephole optimization
Summary: Following on from rG884351547da2, this patch cleans up the logic for `xxpermdi` peephole optimizations by converting two layers of nes
[PowerPC] [NFC] Cleanup xxpermdi peephole optimization
Summary: Following on from rG884351547da2, this patch cleans up the logic for `xxpermdi` peephole optimizations by converting two layers of nested `if`s to early breaks and simplifying the logic.
Reviewers: hfinkel, nemanjai, jsji, lkail, #powerpc, steven.zhang
Reviewed By: #powerpc, steven.zhang
Subscribers: wuzish, steven.zhang, hiraditya, kbarton, shchenz, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71170
Patch by vddvss (Colin Samples).
show more ...
|
Revision tags: llvmorg-9.0.1-rc2 |
|
#
3d41a58e |
| 05-Dec-2019 |
Jinsong Ji <jji@us.ibm.com> |
[PowerPC][NFC] Rename ANDI(S)o8 to ANDI(S)8o
Summary: This is found during https://reviews.llvm.org/D70758 All the other record forms are having suffix o at the end. ANDIo8 and ANDISo8 are the only
[PowerPC][NFC] Rename ANDI(S)o8 to ANDI(S)8o
Summary: This is found during https://reviews.llvm.org/D70758 All the other record forms are having suffix o at the end. ANDIo8 and ANDISo8 are the only two that put o before 8.
This patch rename them to be consistent with others.
Reviewers: #powerpc, hfinkel, nemanjai, lei, steven.zhang, echristo, jhibbits, joerg
Reviewed By: jhibbits
Subscribers: wuzish, hiraditya, kbarton, shchenz, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70928
show more ...
|
#
88435154 |
| 07-Dec-2019 |
Kai Luo <lkail@cn.ibm.com> |
[PowerPC] Fix MI peephole optimization for splats
Summary: This patch fixes an issue where the PPC MI peephole optimization pass incorrectly remove a vector swap.
Specifically, the pass can combine
[PowerPC] Fix MI peephole optimization for splats
Summary: This patch fixes an issue where the PPC MI peephole optimization pass incorrectly remove a vector swap.
Specifically, the pass can combine a splat/swap to a splat/copy. It uses `TargetRegisterInfo::lookThruCopyLike` to determine that the operands to the splat are the same. However, the current logic only compares the operands based on register numbers. In the case where the splat operands are ultimately feed from the same physical register, the pass can incorrectly remove a swap if the feed register for one of the operands has been clobbered.
This patch adds a check to ensure that the registers feeding are both virtual registers or the operands to the splat or swap are both the same register.
Here is an example in pseudo-MIR of what happens in the test cased added in this patch:
Before PPC MI peephole optimization: ``` %arg = XVADDDP %0, %1
$f1 = COPY %arg.sub_64 call double rint(double) %res.first = COPY $f1 %vec.res.first = SUBREG_TO_REG 1, %res.first, %subreg.sub_64
%arg.swapped = XXPERMDI %arg, %arg, 2 $f1 = COPY %arg.swapped.sub_64 call double rint(double) %res.second = COPY $f1
%vec.res.second = SUBREG_TO_REG 1, %res.second, %subreg.sub_64 %vec.res.splat = XXPERMDI %vec.res.first, %vec.res.second, 0 %vec.res = XXPERMDI %vec.res.splat, %vec.res.splat, 2 ; %vec.res == [ %vec.res.second[0], %vec.res.first[0] ] ```
After optimization: ``` ; ... %vec.res.splat = XXPERMDI %vec.res.first, %vec.res.second, 0 ; lookThruCopyLike(%vec.res.first) == lookThruCopyLike(%vec.res.second) == $f1 ; so the pass replaces the swap with a copy: %vec.res = COPY %vec.res.splat ; %vec.res == [ %vec.res.first[0], %vec.res.second[0] ] ```
As best as I can tell, this has occurred since r288152, which added support for lowering certain vector operations to direct moves in the form of a splat.
Committed for vddvss (Colin Samples). Thanks Colin for the patch! Differential Revision: https://reviews.llvm.org/D69497
show more ...
|
#
f0ba1aec |
| 04-Dec-2019 |
czhengsz <czhengsz@cn.ibm.com> |
[PowerPC] folding rlwinm + rlwinm to rlwinm
For example: x3 = rlwinm x3, 27, 5, 31 x3 = rlwinm x3, 19, 0, 12 can be combined to x3 = rlwinm x3, 14, 0, 12
Reviewed by: steven.zhang,
[PowerPC] folding rlwinm + rlwinm to rlwinm
For example: x3 = rlwinm x3, 27, 5, 31 x3 = rlwinm x3, 19, 0, 12 can be combined to x3 = rlwinm x3, 14, 0, 12
Reviewed by: steven.zhang, lkail
Differential Revision: https://reviews.llvm.org/D70374
show more ...
|
#
d1c16598 |
| 25-Nov-2019 |
czhengsz <czhengsz@cn.ibm.com> |
Revert "[PowerPC] combine rlwinm+rlwinm to rlwinm"
This reverts commit 29f6f9b2b2bfecccf903738e2f5a0cd0a70fce31.
|
Revision tags: llvmorg-9.0.1-rc1 |
|
#
29f6f9b2 |
| 22-Nov-2019 |
czhengsz <czhengsz@cn.ibm.com> |
[PowerPC] combine rlwinm+rlwinm to rlwinm combine x3 = rlwinm x3, 27, 5, 31 x3 = rlwinm x3, 19, 0, 12
to x3 = rlwinm x3, 14, 0, 12
Reviewed by: steven.zhang
Differential Revision: https://reviews.
[PowerPC] combine rlwinm+rlwinm to rlwinm combine x3 = rlwinm x3, 27, 5, 31 x3 = rlwinm x3, 19, 0, 12
to x3 = rlwinm x3, 14, 0, 12
Reviewed by: steven.zhang
Differential Revision: https://reviews.llvm.org/D70374
show more ...
|
#
05da2fe5 |
| 13-Nov-2019 |
Reid Kleckner <rnk@google.com> |
Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it caused lots of reco
Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it caused lots of recompilation.
I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout: recompiles touches affected_files header 342380 95 3604 llvm/include/llvm/ADT/STLExtras.h 314730 234 1345 llvm/include/llvm/InitializePasses.h 307036 118 2602 llvm/include/llvm/ADT/APInt.h 213049 59 3611 llvm/include/llvm/Support/MathExtras.h 170422 47 3626 llvm/include/llvm/Support/Compiler.h 162225 45 3605 llvm/include/llvm/ADT/Optional.h 158319 63 2513 llvm/include/llvm/ADT/Triple.h 140322 39 3598 llvm/include/llvm/ADT/StringRef.h 137647 59 2333 llvm/include/llvm/Support/Error.h 131619 73 1803 llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
show more ...
|