Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5 |
|
#
318c69de |
| 27-Nov-2024 |
Sander de Smalen <sander.desmalen@arm.com> |
Reland "[AArch64] Define high bits of FPR and GPR registers (take 2) (#114827)"
The issue with slow compile-time was caused by an assert in AArch64RegisterInfo.cpp. The assert invokes 'checkAllSuperRegsMarked' after adding all the reserved registers. This call gets very expensive after adding the _HI registers due to the way the function searches in the 'Exception' list, which is expected to be a small list but isn't (the patch added 190 _HI regs).
It was possible to rewrite the code in such a way that the _HI registers are marked as reserved after the check. This makes the problem go away entirely and restores compile-time to what it was before (tested for `check-runtimes`, which previously showed a ~5x slowdown).
This reverts commits: 1434d2ab215e3ea9c5f34689d056edd3d4423a78 2704647fb7986673b89cef1def729e3b022e2607
|
#
1434d2ab |
| 22-Nov-2024 |
Vitaly Buka <vitalybuka@google.com> |
Revert "[AArch64] Define high bits of FPR and GPR registers (take 2) (#114827)" (#117307)
Details in #114827
This reverts commit c1c68baf7e0fcaef1f4ee86b527210f1391b55f6.
|
Revision tags: llvmorg-19.1.4 |
|
#
c1c68baf |
| 14-Nov-2024 |
Sander de Smalen <sander.desmalen@arm.com> |
[AArch64] Define high bits of FPR and GPR registers (take 2) (#114827)
This is a step towards enabling subreg liveness tracking for AArch64,
which requires that registers are fully covered by their subregisters,
as covered here #109797.
There are several changes in this patch:
* AArch64RegisterInfo.td and tests: Define the high bits like B0_HI,
H0_HI, S0_HI, D0_HI, Q0_HI. Because the bits must be defined by some
register class, this added a register class which meant that we had to
update 'magic numbers' in several tests.
The use of ComposedSubRegIndex helped 'compress' the number of bits
required for the lanemask. The correctness of the masks is tested by an
explicit unit test.
* LoadStoreOptimizer: previously 'HasDisjunctSubRegs' was only true for
register tuples, but with this change to describe the high bits, a
register like 'D0' will also have 'HasDisjunctSubRegs' set to true
(because it's fully covered by S0 and S0_HI). The fix here is to
explicitly test if the register class is one of the known D/Q/Z tuples.
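The coverage requirement described above can be illustrated with a toy lane-mask model. This is a minimal sketch, not LLVM code: `ToyReg`, `isFullyCovered`, `hasDisjunctSubRegs`, and the bit assignments are all hypothetical, and the nested H0/B0 subregisters are left out for brevity. The real lane masks are generated by TableGen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model: each bit of a lane mask stands for one slice of a register.
// The bit assignments used below are hypothetical, chosen for illustration.
using ToyLaneMask = std::uint32_t;

struct ToyReg {
  ToyLaneMask Lanes;                // lanes occupied by the register itself
  std::vector<ToyLaneMask> SubRegs; // lanes of each modelled subregister
};

// A register is fully covered when the union of its subregister lanes
// equals its own lanes.
inline bool isFullyCovered(const ToyReg &R) {
  ToyLaneMask Union = 0;
  for (ToyLaneMask M : R.SubRegs)
    Union |= M;
  return Union == R.Lanes;
}

// Toy analogue of a 'HasDisjunctSubRegs' property: true when at least two
// subregisters occupy non-overlapping lanes.
inline bool hasDisjunctSubRegs(const ToyReg &R) {
  for (std::size_t I = 0; I < R.SubRegs.size(); ++I)
    for (std::size_t J = I + 1; J < R.SubRegs.size(); ++J)
      if ((R.SubRegs[I] & R.SubRegs[J]) == 0)
        return true;
  return false;
}

// D0 modelled with only S0 (the low half) described.
inline ToyReg makeD0WithoutHi() { return ToyReg{0x3, {0x1}}; }

// D0 modelled with S0 (low half) plus S0_HI (high half).
inline ToyReg makeD0WithHi() { return ToyReg{0x3, {0x1, 0x2}}; }
```

In this model, adding the `_HI` half makes D0 fully covered, and it also makes the disjunct-subregister property true for a plain D register. That second effect is why a check meant to identify register tuples can no longer rely on that property alone.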
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8 |
|
#
7c6d0d26 |
| 15-Jun-2024 |
Kazu Hirata <kazu@google.com> |
[llvm] Use llvm::unique (NFC) (#95628)
|
Revision tags: llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2 |
|
#
14bc3748 |
| 17-Apr-2023 |
Jay Foad <jay.foad@amd.com> |
[MC] Use subregs/superregs instead of MCSubRegIterator/MCSuperRegIterator. NFC.
Differential Revision: https://reviews.llvm.org/D148613
|
Revision tags: llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2 |
|
#
1eb75362 |
| 26-Aug-2021 |
Andrea Di Biagio <andrea.dibiagio@sony.com> |
[MCA][RegisterFile] Consistently update the PRF in the presence of multiple writes to the same register.
My last change to the RegisterFile (PR51495) has introduced a bug in the logic that allocates physical registers in the PRF.
In some cases, this bug could have triggered a nasty unsigned wrap in the number of allocated registers, thus resulting in mca being stuck forever in a loop of PRF availability checks.
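The failure mode can be sketched with a toy counter. This is a simplified illustration under assumptions, not the actual RegisterFile code: `ToyPRF` and its methods are hypothetical, and the double release shown is just one way an unsigned counter could wrap when the same register is written more than once.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a physical register file (PRF)
// that only tracks how many registers are in use.
struct ToyPRF {
  unsigned Total;
  unsigned Used;

  explicit ToyPRF(unsigned T) : Total(T), Used(0) {}

  // Availability check: N more registers must fit in what is left.
  bool isAvailable(unsigned N) const {
    return Used <= Total && N <= Total - Used;
  }

  void allocate(unsigned N) { Used += N; }

  // Unguarded release: releasing more than was allocated wraps around.
  void releaseUnchecked(unsigned N) { Used -= N; }

  // Guarded release: clamps at zero instead of wrapping.
  void releaseChecked(unsigned N) { Used = (N > Used) ? 0 : Used - N; }
};

// One allocation followed by two unguarded releases: the counter wraps to a
// huge value, so every later availability check fails.
inline bool stuckAfterUncheckedRelease() {
  ToyPRF PRF(8);
  PRF.allocate(1);
  PRF.releaseUnchecked(1);
  PRF.releaseUnchecked(1); // second release of the same register
  return !PRF.isAvailable(1);
}

// Same sequence with the guarded release: the counter stays at zero.
inline bool stuckAfterCheckedRelease() {
  ToyPRF PRF(8);
  PRF.allocate(1);
  PRF.releaseChecked(1);
  PRF.releaseChecked(1);
  return !PRF.isAvailable(1);
}
```

A simulator that retries the availability check every cycle would never make progress once the counter has wrapped, which matches the "stuck forever" symptom described in the commit message.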
|
#
4a5b1917 |
| 25-Aug-2021 |
Andrea Di Biagio <andrea.dibiagio@sony.com> |
[X86][MCA] Address the latest issues with MULX reported in PR51495.
It turns out that SchedWrite WriteIMulH was always assigned to the low half of the result of a MULX (rather than to the high half).
To avoid confusion, this patch swaps the two MULX writes in the tablegen definition of MULX32/64. That way, write names better describe what they actually refer to; this also avoids further complications if in future we decide to reuse the same MulH writes to also model other scalar integer multiply instructions. I also had to swap the latency values for the two MULX writes to make sure that the change is effectively an NFC. In fact, none of the existing x86 tests were affected by this small refactoring.
This patch also fixes a bug in MCA: a wrong latency value was propagated for instructions that perform multiple writes to the same register. This last issue was found by Roman while testing MULX on targets that define a different latency for the Low/High part of the result.
Differential Revision: https://reviews.llvm.org/D108727
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3 |
|
#
dc11d4e6 |
| 16-Jun-2021 |
Patrick Holland <pholland2@apple.com> |
[MCA] [RegisterFile] Allow for skipping Defs with RegID of 0 (rather than assert(RegID) like we do before this patch).
This patch will allow developers to remove unwanted instruction Defs (most likely from within a target specific InstrPostProcess) by setting that Def's RegisterID to 0.
Differential Revision: https://reviews.llvm.org/D104433
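The behaviour described here can be sketched as follows. This is a minimal illustration with hypothetical names (`ToyDef`, `dropDef`, `collectWrites`); it does not reproduce the real MCA classes or the InstrPostProcess interface.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an instruction's register definition.
struct ToyDef {
  std::uint16_t RegID; // 0 means "no register": the def has been removed
};

// Hypothetical post-processing step: mark a def as unwanted.
inline void dropDef(ToyDef &D) { D.RegID = 0; }

// Collect the register writes to track, skipping removed defs
// instead of asserting that every RegID is non-zero.
inline std::vector<std::uint16_t>
collectWrites(const std::vector<ToyDef> &Defs) {
  std::vector<std::uint16_t> Written;
  for (const ToyDef &D : Defs) {
    if (D.RegID == 0)
      continue; // removed def: nothing to track
    Written.push_back(D.RegID);
  }
  return Written;
}
```

With this shape, a target-specific hook only has to zero the register ID; the consumer needs no separate "deleted" flag.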
|
Revision tags: llvmorg-12.0.1-rc2 |
|
#
50770d8d |
| 27-May-2021 |
Andrea Di Biagio <andrea.dibiagio@sony.com> |
[MCA] Refactor the InOrderIssueStage stage. NFCI
Moved the logic that checks for RAW hazards from the InOrderIssueStage to the RegisterFile.
Changed how the InOrderIssueStage keeps track of backend stalls. Stall events are now generated from method notifyStallEvent().
No functional change intended.
|
Revision tags: llvmorg-12.0.1-rc1 |
|
#
9ceea666 |
| 08-May-2021 |
Andrea Di Biagio <andrea.dibiagio@sony.com> |
[MCA][RegisterFile] Refactor the move elimination logic to address PR50258.
This patch lifts the restriction on the number of read/write registers for a move elimination candidate. With this patch, move elimination candidates with exactly two reads and two writes are treated like register swap operations for the purpose of move elimination.
This patch currently doesn't affect any upstream model. However, it should help unblock the progress on PR50258.
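To make the "register swap" idea concrete, here is a toy rename table. It is a sketch under assumptions: `ToyRenameTable` and its methods are hypothetical, and treating a swap as an exchange of register mappings is an illustrative simplification rather than a description of how the RegisterFile implements it.

```cpp
#include <array>
#include <cassert>
#include <utility>

// Hypothetical rename table: maps 4 architectural registers to physical ones.
struct ToyRenameTable {
  std::array<unsigned, 4> PhysOf{{0, 1, 2, 3}};

  // Ordinary eliminated move (one read, one write): the destination simply
  // aliases the physical register of the source.
  void eliminateMove(unsigned Dst, unsigned Src) { PhysOf[Dst] = PhysOf[Src]; }

  // Swap-like candidate (two reads, two writes): the two mappings are
  // exchanged, so no data has to move.
  void eliminateSwap(unsigned A, unsigned B) {
    std::swap(PhysOf[A], PhysOf[B]);
  }
};

// Swap architectural registers 0 and 1 and return the resulting table.
inline ToyRenameTable swapExample() {
  ToyRenameTable T;
  T.eliminateSwap(0, 1);
  return T;
}
```

After the swap, registers 0 and 1 point at each other's former physical registers while the other entries are untouched.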
|
#
3822ac90 |
| 07-May-2021 |
Andrea Di Biagio <andrea.dibiagio@sony.com> |
[MCA][RegisterFile] Fix register class check for move elimination (PR50265)
The register file should always check if the destination register is from a register class that allows move elimination.
Before this change, the check on the register class was only performed in a few very specific cases. However, it should have always been performed. This patch fixes the issue.
Note that none of the upstream scheduling models is currently affected by this bug, so there is no test for it. The issue was found by Roman while working on the znver3 model. I was able to reproduce the issue locally by tweaking the btver2 model. I then verified that this patch fixes the issue.
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
#
97a00b7b |
| 24-Mar-2021 |
Andrea Di Biagio <andrea.dibiagio@sony.com> |
[MCA] Fix for uninitialised member in constructor. NFC
|
#
f5bdc88e |
| 23-Mar-2021 |
Andrea Di Biagio <andrea.dibiagio@sony.com> |
[MCA] Improved handling of negative read-advance cycles.
Before this patch, register writes were always invalidated by the RegisterFile at instruction commit stage. So, the RegisterFile was often losing the knowledge about the `execute cycle` of writes already committed. While this was not problematic for non-delayed reads, this was sometimes leading to inaccurate read latency computations in the presence of negative read-advance cycles.
This patch fixes the issue by changing how the RegisterFile component internally keeps track of the `execute cycle` information of each write. On every instruction executed, the RegisterFile gets notified by the RetireStage, so that it can internally record the execute cycle of each executed write. The `execute cycle` information is stored within WriteRef itself, and it is not invalidated when the write is committed.
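The reason the execute cycle must outlive the commit can be shown with a small calculation. This is a simplified model with hypothetical names (`ToyWriteRef`, `cyclesToWait`). It assumes a positive read-advance lets a read see the value that many cycles early and a negative one delays it by that many cycles.

```cpp
#include <cassert>

// Hypothetical stand-in for a write reference that remembers the cycle in
// which the write was executed, even after the write has been committed.
struct ToyWriteRef {
  unsigned ExecuteCycle;
};

// Number of cycles a read issued at CurrentCycle still has to wait before
// it can observe the written value.
inline unsigned cyclesToWait(const ToyWriteRef &W, int ReadAdvance,
                             unsigned CurrentCycle) {
  // The value becomes visible to this read at ExecuteCycle - ReadAdvance.
  long long ReadyAt = static_cast<long long>(W.ExecuteCycle) - ReadAdvance;
  long long Left = ReadyAt - static_cast<long long>(CurrentCycle);
  return Left > 0 ? static_cast<unsigned>(Left) : 0;
}
```

For a write executed at cycle 10, a read at cycle 11 with a read-advance of 0 waits 0 cycles, but the same read with a read-advance of -3 must wait 2 more cycles. That result can only be computed if cycle 10 is still known after the write has been committed.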
|
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3 |
|
#
099c089d |
| 02-Sep-2020 |
Jay Foad <jay.foad@amd.com> |
[APInt] New member function setBitVal
Differential Revision: https://reviews.llvm.org/D87033
|
Revision tags: llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3 |
|
#
589cb004 |
| 22-Aug-2019 |
Andrea Di Biagio <Andrea_DiBiagio@sn.scee.net> |
[MCA] consistently use MCPhysReg instead of unsigned as register type. NFCI
llvm-svn: 369648
|
Revision tags: llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3 |
|
#
c102e2a2 |
| 18-Feb-2019 |
Andrea Di Biagio <Andrea_DiBiagio@sn.scee.net> |
[MCA] Correctly update register definitions in the PRF after move elimination.
This patch fixes a bug where register writes performed by optimizable register moves were sometimes wrongly treated like partial register updates. Before this patch, llvm-mca wrongly predicted a 1.50 IPC for test reg-move-elimination-6.s (added by this patch). With this patch, llvm-mca correctly updates the register definitions in the PRF, and the IPC for that test is now correctly reported as 2.
llvm-svn: 354271
|
#
7a950ed5 |
| 18-Feb-2019 |
Andrea Di Biagio <Andrea_DiBiagio@sn.scee.net> |
[MCA] Slightly refactor method writeStartEvent in WriteState and ReadState. NFCI
This is another change in preparation for PR37494. No functional change intended.
llvm-svn: 354261
|
Revision tags: llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2 |
|
#
4bce783e |
| 05-Feb-2019 |
Andrea Di Biagio <Andrea_DiBiagio@sn.scee.net> |
[MCA] Moved the logic that updates register dependencies from DispatchStage to RegisterFile. NFC
DispatchStage should always delegate to an object of class RegisterFile the task of updating data dependencies. ReadState and WriteState objects should not be modified directly by DispatchStage. This patch also renames stage IS_AVAILABLE to IS_DISPATCHED.
llvm-svn: 353170
|
Revision tags: llvmorg-8.0.0-rc1 |
|
#
2946cd70 |
| 19-Jan-2019 |
Chandler Carruth <chandlerc@gmail.com> |
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
llvm-svn: 351636
|
#
cc5e6a72 |
| 17-Dec-2018 |
Clement Courbet <courbet@google.com> |
[llvm-mca] Move llvm-mca library to llvm/lib/MCA.
Summary: See PR38731.
Reviewers: andreadb
Subscribers: mgorny, javed.absar, tschuett, gbedwell, andreadb, RKSimon, llvm-commits
Differential Revision: https://reviews.llvm.org/D55557
llvm-svn: 349332
|