Revision tags: llvmorg-21-init |
|
#
0d71b3e4 |
| 14-Jan-2025 |
Jay Foad <jay.foad@amd.com> |
[CodeGen] Remove unused argument from getCoveringSubRegIndexes. NFC. (#122884)
|
Revision tags: llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
735ab61a |
| 13-Nov-2024 |
Kazu Hirata <kazu@google.com> |
[CodeGen] Remove unused includes (NFC) (#115996)
Identified with misc-include-cleaner.
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1 |
|
#
e03f4271 |
| 19-Sep-2024 |
Jay Foad <jay.foad@amd.com> |
[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133)
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all oc
[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133)
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
show more ...
|
Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8 |
|
#
7c6d0d26 |
| 15-Jun-2024 |
Kazu Hirata <kazu@google.com> |
[llvm] Use llvm::unique (NFC) (#95628)
|
Revision tags: llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6 |
|
#
303a7835 |
| 18-Nov-2023 |
David Green <david.green@arm.com> |
[GreedyRA] Improve RA for nested loop induction variables (#72093)
Imagine a loop of the form:
```
preheader:
%r = def
header:
bcc latch, inner
inner1:
..
inner2:
b
[GreedyRA] Improve RA for nested loop induction variables (#72093)
Imagine a loop of the form:
```
preheader:
%r = def
header:
bcc latch, inner
inner1:
..
inner2:
b latch
latch:
%r = subs %r
bcc header
```
It can be possible for code to spend a decent amount of time in the
header<->latch loop, not going into the inner part of the loop as much.
The greedy register allocator can prefer to spill _around_ %r though,
adding spills around the subs in the loop, which can be very detrimental
for performance. (The case I am looking at is actually a very deeply
nested set of loops that repeat the header<->latch pattern at multiple
different levels).
The greedy RA will apply a preference to spill to the IV, as it is live
through the header block. This patch attempts to add a heuristic to
prevent that in this case for variables that look like IVs, in a similar
regard to the extra spill weight that gets added to variables that look
like IVs, that are expensive to spill. That will mean spills are more
likely to be pushed into the inner blocks, where they are less likely to
be executed and not as expensive as spills around the IV.
This gives a 8% speedup in the exchange benchmark from spec2017 when
compiled with flang-new, whilst importantly stabilising the scores to be
less chaotic to other changes. Running ctmark showed no difference in
the compile time. I've tried to run a range of benchmarking for
performance, most of which were relatively flat not showing many large
differences. One matrix multiply case improved 21.3% due to removing a
cascading chains of spills, and some other knock-on effects happen which
usually cause small differences in the scores.
show more ...
|
#
ce7fd498 |
| 16-Nov-2023 |
Christudasan Devadasan <christudasan.devadasan@amd.com> |
[AMDGPU] RA inserted scalar instructions can be at the BB top (#72140)
We adjust the insertion point at the BB top for spills/copies during RA
to ensure they are placed after the exec restore instr
[AMDGPU] RA inserted scalar instructions can be at the BB top (#72140)
We adjust the insertion point at the BB top for spills/copies during RA
to ensure they are placed after the exec restore instructions required
for the divergent control flow execution. This is, however, required
only for the vector operations. The insertions for scalar registers can
still go to the BB top.
show more ...
|
Revision tags: llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3 |
|
#
bdd17b85 |
| 09-Aug-2023 |
Jon Roelofs <jonathan_roelofs@apple.com> |
Remove a reference to rdar://problem/10664933
The original commit, and the comments in the code already provide sufficient context. But for posterity, there's a tiny bit more that might be useful if
Remove a reference to rdar://problem/10664933
The original commit, and the comments in the code already provide sufficient context. But for posterity, there's a tiny bit more that might be useful if someone is digging here in the future:
> This is related to <rdar://problem/10318439> Lower invokes into terminating > machine instructions. > > The return value from a function call is live in to that function call's > landing pad. The landing pad is shared with a later call, and the variable is > undef on the first exceptional edge. > > Our computation of the last legal split point gets confused because the > return value is live-out from the calling block, and live-in to the landing > pad, but it is not live on the edge itself. > > Fixed in r147911 and r147912.
show more ...
|
Revision tags: llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1 |
|
#
4d42e8b5 |
| 28-Jul-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Reapply "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"
This reverts commit a496c8be6e638ae58bb45f13113dbe3a4b7b23fd.
The workaround in c26dfc81e254c78dc2
Reapply "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"
This reverts commit a496c8be6e638ae58bb45f13113dbe3a4b7b23fd.
The workaround in c26dfc81e254c78dc23579cf3d1336f77249e1f6 should work around the underlying problem with SUBREG_TO_REG.
show more ...
|
#
a496c8be |
| 26-Jul-2023 |
Vitaly Buka <vitalybuka@google.com> |
Revert "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"
And dependent commits.
Details in D150388.
This reverts commit 825b7f0ca5f2211ec3c93139f98d1e24048
Revert "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"
And dependent commits.
Details in D150388.
This reverts commit 825b7f0ca5f2211ec3c93139f98d1e24048c225c. This reverts commit 7a98f084c4d121244ef7286bc6503b6a181d446e. This reverts commit b4a62b1fa546312d882fa12dfdcd015177d66826. This reverts commit b7836d856206ec39509d42529f958c920368166b.
No conflicts in the code, few tests had conflicts in autogenerated CHECKs: llvm/test/CodeGen/Thumb2/mve-float32regloops.ll llvm/test/CodeGen/AMDGPU/fix-frame-reg-in-custom-csr-spills.ll
Reviewed By: alexfh
Differential Revision: https://reviews.llvm.org/D156381
show more ...
|
Revision tags: llvmorg-18-init |
|
#
b7836d85 |
| 07-Jul-2023 |
Yashwant Singh <Yashwant.Singh@amd.com> |
[CodeGen]Allow targets to use target specific COPY instructions for live range splitting
Replacing D143754. Right now the LiveRangeSplitting during register allocation uses TargetOpcode::COPY instru
[CodeGen]Allow targets to use target specific COPY instructions for live range splitting
Replacing D143754. Right now the LiveRangeSplitting during register allocation uses TargetOpcode::COPY instruction for splitting. For AMDGPU target that creates a problem as we have both vector and scalar copies. Vector copies perform a copy over a vector register but only on the lanes(threads) that are active. This is mostly sufficient however we do run into cases when we have to copy the entire vector register and not just active lane data. One major place where we need that is live range splitting.
Allowing targets to use their own copy instructions(if defined) will provide a lot of flexibility and ease to lower these pseudo instructions to correct MIR.
- Introduce getTargetCopyOpcode() virtual function and use if to generate copy in Live range splitting. - Replace necessary MI.isCopy() checks with TII.isCopyInstr() in register allocator pipeline.
Reviewed By: arsenm, cdevadas, kparzysz
Differential Revision: https://reviews.llvm.org/D150388
show more ...
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2 |
|
#
d170a254 |
| 03-Feb-2023 |
Jay Foad <jay.foad@amd.com> |
[CodeGen] Define and use MachineOperand::getOperandNo
This is a helper function to very slightly simplify many calls to MachineInstruction::getOperandNo.
Differential Revision: https://reviews.llvm
[CodeGen] Define and use MachineOperand::getOperandNo
This is a helper function to very slightly simplify many calls to MachineInstruction::getOperandNo.
Differential Revision: https://reviews.llvm.org/D143250
show more ...
|
Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
f7dffc28 |
| 10-Dec-2022 |
Kazu Hirata <kazu@google.com> |
Don't include None.h (NFC)
I've converted all known uses of None to std::nullopt, so we no longer need to include None.h.
This is part of an effort to migrate from llvm::Optional to std::optional:
Don't include None.h (NFC)
I've converted all known uses of None to std::nullopt, so we no longer need to include None.h.
This is part of an effort to migrate from llvm::Optional to std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
show more ...
|
#
cb38be9e |
| 07-Dec-2022 |
Gregory Alfonso <gfunni234@gmail.com> |
[NFC] Use Register instead of unsigned for variables that receive a Register object
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D139451
|
#
998960ee |
| 03-Dec-2022 |
Kazu Hirata <kazu@google.com> |
[CodeGen] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of
[CodeGen] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional.
This is part of an effort to migrate from llvm::Optional to std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
show more ...
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
#
8d0383eb |
| 24-Jun-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
CodeGen: Remove AliasAnalysis from regalloc
This was stored in LiveIntervals, but not actually used for anything related to LiveIntervals. It was only used in one check for if a load instruction is
CodeGen: Remove AliasAnalysis from regalloc
This was stored in LiveIntervals, but not actually used for anything related to LiveIntervals. It was only used in one check for if a load instruction is rematerializable. I also don't think this was entirely correct, since it was implicitly assuming constant loads are also dereferenceable.
Remove this and rely only on the invariant+dereferenceable flags in the memory operand. Set the flag based on the AA query upfront. This should have the same net benefit, but has the possible disadvantage of making this AA query nonlazy.
Preserve the behavior of assuming pointsToConstantMemory implying dereferenceable for now, but maybe this should be changed.
show more ...
|
Revision tags: llvmorg-14.0.6 |
|
#
e9f7263b |
| 16-Jun-2022 |
Kito Cheng <kito.cheng@sifive.com> |
Reland "[SplitKit] Handle early clobber + tied to def correctly"
This reverts commit 7207373e1eb0dd419b4e13a5e2d0ca146ef9544e.
We found another RISC-V bug when landing D126048, and it has been fixe
Reland "[SplitKit] Handle early clobber + tied to def correctly"
This reverts commit 7207373e1eb0dd419b4e13a5e2d0ca146ef9544e.
We found another RISC-V bug when landing D126048, and it has been fixed by D127642 now.
Differential Revision: https://reviews.llvm.org/D126048
show more ...
|
Revision tags: llvmorg-14.0.5 |
|
#
7207373e |
| 08-Jun-2022 |
Kito Cheng <kito.cheng@sifive.com> |
Revert "[SplitKit] Handle early clobber + tied to def correctly"
Revert due to failed on LLVM_ENABLE_EXPENSIVE_CHECKS.
This reverts commit e14d04909df4e52e531f6c2e045c3cf9638dd817.
|
#
e14d0490 |
| 26-May-2022 |
Kito Cheng <kito.cheng@sifive.com> |
[SplitKit] Handle early clobber + tied to def correctly
Spliter will try to extend a live range into `r` slot for a use operand, that's works on most situaion, however that not work correctly when t
[SplitKit] Handle early clobber + tied to def correctly
Spliter will try to extend a live range into `r` slot for a use operand, that's works on most situaion, however that not work correctly when the operand has tied to def, and the def operand is early clobber.
Give an example to demo what's wrong: 0 %0 = ... 16 early-clobber %0 = Op %0 (tied-def 0), ... 32 ... = Op %0
Before extend: %0 = [0r, 0d) [16e, 32d)
The point we want to extend is 0d to 16e not 16r in this case, but if we use 16r here we will extend nothing because that already contained in [16e, 32d).
This patch add check for detect such case and adjust the extend point.
Detailed explanation for testcase: https://reviews.llvm.org/D126047
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D126048
show more ...
|
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1 |
|
#
a40dc4ea |
| 05-Feb-2022 |
Benjamin Kramer <benny.kra@googlemail.com> |
Simplify mask creation with llvm::seq. NFCI.
|
#
592f52de |
| 03-Feb-2022 |
Mircea Trofin <mtrofin@google.com> |
[nfc][regalloc] const LiveIntervals within the allocator
Once built, LiveIntervals are immutable. This patch captures that.
Differential Revision: https://reviews.llvm.org/D118918
|
Revision tags: llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3 |
|
#
87c00878 |
| 28-Aug-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
SplitKit: Remove decade old live interval hack
This was trying to fixup broken live intervals coming out of the coalescer. The verifier is more complete now and no tests seem to fail without this.
|
Revision tags: llvmorg-13.0.0-rc2 |
|
#
e1beebba |
| 10-Aug-2021 |
Ruiling Song <ruiling.song@amd.com> |
SplitKit: Don't further split subrange mask in buildCopy
We may use several COPY instructions to copy the needed sub-registers during split. But the way we split the lanes during the COPYs may be di
SplitKit: Don't further split subrange mask in buildCopy
We may use several COPY instructions to copy the needed sub-registers during split. But the way we split the lanes during the COPYs may be different from the subranges of the old register. This would fail when we extend the subranges of the new register because the LaneMasks do not match exactly between subranges of new register and old register. Since we are bundling the COPYs, I think there is no need to further refine the subranges of the new register based on the set of LaneMasks of the inserted COPYs.
I am not sure if there will be further breaking cases. But as the subranges of new register are created based on the LaneMasks of the subranges of old register, it will be highly possible we will always find an exact LaneMask match. We can think about how to make the extendPHIKillRanges() work for subrange mask mismatch case if we meet more such cases in the future.
The test case was from D105065 by @arsenm.
Differential Revision: https://reviews.llvm.org/D107829
show more ...
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
#
9f631d14 |
| 04-May-2021 |
Serguei Katkov <serguei.katkov@azul.com> |
[GreedyRA] Add support for invoke statepoint with tied-defs.
statepoint instruction uses tied-def registers to represent live gc value which is use and def at the same time on a call. At the same ti
[GreedyRA] Add support for invoke statepoint with tied-defs.
statepoint instruction uses tied-def registers to represent live gc value which is use and def at the same time on a call. At the same time invoke statepoint instruction is a last split point which can throw and jump to landing pad. As a result we have instructon which is last split point with tied-defs registers and we need to teach Greedy RA to work with it.
The option -use-registers-for-gc-values-in-landing-pad controls whether statepoint lowering will generate tied-defs for invoke statepoint and is off by default now.
To resolve all issues the following changes has been done. 1) Last Split point for invoke statepoint should be statepoint itself
If statepoint has a def it is a relocated gc pointer and it should be available in landing pad. So we cannot split interval after statepoint at end of basic block.
2) Do not split interval on tied-def
If end of interval for overlap utility is a use which has tied-def we should not split interval on this instruction due to in this case use and def may have different registers and it breaks tied-def property.
3) Take into account Last Split Point for enterIntvAtEnd
If the use after Last Split Point is a def so it should be tied-def and we can take the def of the tied-use as ParentVNI and thus tied-use and tied-def will be live in resulting interval.
4) Handle the case when def is after LIP in InlineSpiller
If def of LI is after last insertion point of basic block we cannot hoist in this BB.
The example of such instruction is invoke statepoint where def represents the relocated live gc pointer. Invoke is a last insertion point and its def is located after it. In this case there is no place to insert spill and we bail out.
5) Fix removeBackCopies to account empty copies
RegAssignMap cannot hold empty interval, so do not set stop to kill value if it produces empty interval.
This can happen if we remove back-copy and right before that we have another back-copy.
For example, for parent %0 we can get %1 = COPY %0 %2 = COPY %0 while we removing %2 we cannot set kill for %1 due to its empty.
6) Do not hoist copy to BB if its def is after LSP
If the parent def is a LastSplitPoint or later we cannot hoist copy to this basic block because inserted copy (or re-materialization) will be located before the def.
All parts have been reviewed separately as follows: https://reviews.llvm.org/D100747 https://reviews.llvm.org/D100748 https://reviews.llvm.org/D100750 https://reviews.llvm.org/D100927 https://reviews.llvm.org/D100945 https://reviews.llvm.org/D101028
Reviewers: reames, rnk, void, MatzeB, wmi, qcolombet Reviewed By: reames, qcolombet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D101150
show more ...
|
#
b98807df |
| 13-Apr-2021 |
Hongtao Yu <hoy@fb.com> |
[CSSPGO] Exclude pseudo probes from slot index
Pseudo probe are currently given a slot index like other regular instructions. This affects register pressure and lifetime weight computation because o
[CSSPGO] Exclude pseudo probes from slot index
Pseudo probe are currently given a slot index like other regular instructions. This affects register pressure and lifetime weight computation because of enlarged lifetime length with pseudo probe instructions. As a consequence, program could get different code generated w/ and w/o pseudo probes. I'm closing the gap by excluding pseudo probes from stack index and downstream register allocation related passes.
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D100334
show more ...
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2 |
|
#
ffba9e59 |
| 22-Feb-2021 |
Kazu Hirata <kazu@google.com> |
[CodeGen] Use range-based for loops (NFC)
|