Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
735ab61a |
| 13-Nov-2024 |
Kazu Hirata <kazu@google.com> |
[CodeGen] Remove unused includes (NFC) (#115996)
Identified with misc-include-cleaner.
|
#
9470945b |
| 07-Nov-2024 |
Valery Pykhtin <valery.pykhtin@gmail.com> |
[CalcSpillWeights] Simplify copy hint register collection. NFC. (#114236)
CopyHints set has been collecting duplicates of a register with
increasing weight and then deduplicated with HintedRegs set
[CalcSpillWeights] Simplify copy hint register collection. NFC. (#114236)
CopyHints set has been collecting duplicates of a register with
increasing weight and then deduplicated with HintedRegs set. Let's stop
collecting duplicates at the first place.
show more ...
|
Revision tags: llvmorg-19.1.3 |
|
#
e6ada716 |
| 21-Oct-2024 |
Ellis Hoag <ellis.sparky.hoag@gmail.com> |
[regalloc][basic] Change spill weight for optsize funcs (#112960)
Change the spill weight calculations for `optsize` functions to remove
the block frequency multiplier. For those functions, we do n
[regalloc][basic] Change spill weight for optsize funcs (#112960)
Change the spill weight calculations for `optsize` functions to remove
the block frequency multiplier. For those functions, we do not want to
consider the runtime cost of spilling, only the codesize cost.
I built a large app with the basic and greedy (default) register
allocator enabled.
| Regalloc Type | Uncompressed Size Delta | Compressed Size Delta |
| - | - | - |
| Basic | -303.8 KiB (-0.23%) | -232.0 KiB (-0.39%) |
| Greedy | 159.1 KiB (0.12%) | 130.1 KiB (0.22%) |
Since I only saw a size win with the basic register allocator, I decided
to only change the behavior for that type.
show more ...
|
Revision tags: llvmorg-19.1.2, llvmorg-19.1.1 |
|
#
056a3f46 |
| 26-Sep-2024 |
Jeremy Morse <jeremy.morse@sony.com> |
[NFC] Reapply 3f37c517f, SmallDenseMap speedups
This time with 100% more building unit tests. Original commit message follows.
[NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109
[NFC] Reapply 3f37c517f, SmallDenseMap speedups
This time with 100% more building unit tests. Original commit message follows.
[NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109417)
If we use SmallDenseMaps instead of DenseMaps at these locations, we get a substantial speedup because there's less spurious malloc traffic. Discovered by instrumenting DenseMap with some accounting code, then selecting sites where we'll get the most bang for our buck.
show more ...
|
#
817e742b |
| 25-Sep-2024 |
Jeremy Morse <jeremy.morse@sony.com> |
Revert "[NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109417)"
This reverts commit 3f37c517fbc40531571f8b9f951a8610b4789cd6.
Lo and behold, I missed a unit test
|
#
3f37c517 |
| 25-Sep-2024 |
Jeremy Morse <jeremy.morse@sony.com> |
[NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109417)
If we use SmallDenseMaps instead of DenseMaps at these locations,
we get a substantial speedup because there's less spurio
[NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109417)
If we use SmallDenseMaps instead of DenseMaps at these locations,
we get a substantial speedup because there's less spurious malloc
traffic. Discovered by instrumenting DenseMap with some accounting
code, then selecting sites where we'll get the most bang for our buck.
show more ...
|
Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1 |
|
#
c80c09f3 |
| 23-Jul-2024 |
Dimitry Andric <dimitry@andric.com> |
[CalcSpillWeights] Avoid x87 excess precision influencing weight result
Fixes #99396
The result of `VirtRegAuxInfo::weightCalcHelper` can be influenced by x87 excess precision, which can result in
[CalcSpillWeights] Avoid x87 excess precision influencing weight result
Fixes #99396
The result of `VirtRegAuxInfo::weightCalcHelper` can be influenced by x87 excess precision, which can result in slightly different register choices when the compiler is hosted on x86_64 or i386. This leads to different object file output when cross-compiling to i386, or native.
Similar to 7af3432e22b0, we need to add a `volatile` qualifier to the local `Weight` variable to force it onto the stack, and avoid the excess precision. Define `stack_float_t` in `MathExtras.h` for this purpose, and use it.
show more ...
|
Revision tags: llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5 |
|
#
f6d431f2 |
| 24-Apr-2024 |
Xu Zhang <simonzgx@gmail.com> |
[CodeGen] Make the parameter TRI required in some functions. (#85968)
Fixes #82659
There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many
[CodeGen] Make the parameter TRI required in some functions. (#85968)
Fixes #82659
There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue #82411.
Following @RKSimon 's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact.
After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
show more ...
|
Revision tags: llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6 |
|
#
5353d3f5 |
| 17-Nov-2023 |
Matthias Braun <matze@braunis.de> |
Remove unused LoopInfo from InlineSpiller and SpillPlacement (NFC) (#71874)
|
Revision tags: llvmorg-17.0.5 |
|
#
284c6990 |
| 03-Nov-2023 |
Nick Desaulniers <nickdesaulniers@users.noreply.github.com> |
[CalcSpillWeights] don't mark live intervals with spillable inlineasm ops as having infinite spill weight (#70747)
This is necessary for RegAllocGreedy support for memory folding inline
asm that us
[CalcSpillWeights] don't mark live intervals with spillable inlineasm ops as having infinite spill weight (#70747)
This is necessary for RegAllocGreedy support for memory folding inline
asm that uses "rm" constraints.
Thanks to @qcolombet for the suggestion.
Link: https://github.com/llvm/llvm-project/issues/20571
show more ...
|
Revision tags: llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3 |
|
#
4d0f1e32 |
| 11-Aug-2023 |
Elliot Goodrich <elliotgoodrich@gmail.com> |
[llvm] Remove SmallSet from MachineInstr.h
`MachineInstr.h` is a commonly included file and this includes `llvm/ADT/SmallSet.h` for one function `getUsedDebugRegs()`, which is used only in one place
[llvm] Remove SmallSet from MachineInstr.h
`MachineInstr.h` is a commonly included file and this includes `llvm/ADT/SmallSet.h` for one function `getUsedDebugRegs()`, which is used only in one place.
According to `ClangBuildAnalyzer` (run solely on building LLVM, no other projects) the second most expensive template to instantiate is the `SmallSet::insert` method used in the `inline` implementation in `getUsedDebugRegs()`:
``` **** Templates that took longest to instantiate: 554239 ms: std::unordered_map<int, int> (2826 times, avg 196 ms) 521187 ms: llvm::SmallSet<llvm::Register, 4>::insert (930 times, avg 560 ms) ... ```
By removing this method and putting its implementation in the one call site we greatly reduce the template instantiation time and reduce the number of includes.
When copying the implementation, I removed a check on `MO.getReg()` as this is checked within `MO.isVirtual()`.
Differential Revision: https://reviews.llvm.org/D157720
show more ...
|
Revision tags: llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1 |
|
#
4d42e8b5 |
| 28-Jul-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Reapply "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"
This reverts commit a496c8be6e638ae58bb45f13113dbe3a4b7b23fd.
The workaround in c26dfc81e254c78dc2
Reapply "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"
This reverts commit a496c8be6e638ae58bb45f13113dbe3a4b7b23fd.
The workaround in c26dfc81e254c78dc23579cf3d1336f77249e1f6 should work around the underlying problem with SUBREG_TO_REG.
show more ...
|
#
a496c8be |
| 26-Jul-2023 |
Vitaly Buka <vitalybuka@google.com> |
Revert "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"
And dependent commits.
Details in D150388.
This reverts commit 825b7f0ca5f2211ec3c93139f98d1e24048
Revert "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"
And dependent commits.
Details in D150388.
This reverts commit 825b7f0ca5f2211ec3c93139f98d1e24048c225c. This reverts commit 7a98f084c4d121244ef7286bc6503b6a181d446e. This reverts commit b4a62b1fa546312d882fa12dfdcd015177d66826. This reverts commit b7836d856206ec39509d42529f958c920368166b.
No conflicts in the code, few tests had conflicts in autogenerated CHECKs: llvm/test/CodeGen/Thumb2/mve-float32regloops.ll llvm/test/CodeGen/AMDGPU/fix-frame-reg-in-custom-csr-spills.ll
Reviewed By: alexfh
Differential Revision: https://reviews.llvm.org/D156381
show more ...
|
Revision tags: llvmorg-18-init |
|
#
b7836d85 |
| 07-Jul-2023 |
Yashwant Singh <Yashwant.Singh@amd.com> |
[CodeGen]Allow targets to use target specific COPY instructions for live range splitting
Replacing D143754. Right now the LiveRangeSplitting during register allocation uses TargetOpcode::COPY instru
[CodeGen]Allow targets to use target specific COPY instructions for live range splitting
Replacing D143754. Right now the LiveRangeSplitting during register allocation uses TargetOpcode::COPY instruction for splitting. For AMDGPU target that creates a problem as we have both vector and scalar copies. Vector copies perform a copy over a vector register but only on the lanes(threads) that are active. This is mostly sufficient however we do run into cases when we have to copy the entire vector register and not just active lane data. One major place where we need that is live range splitting.
Allowing targets to use their own copy instructions(if defined) will provide a lot of flexibility and ease to lower these pseudo instructions to correct MIR.
- Introduce getTargetCopyOpcode() virtual function and use if to generate copy in Live range splitting. - Replace necessary MI.isCopy() checks with TII.isCopyInstr() in register allocator pipeline.
Reviewed By: arsenm, cdevadas, kparzysz
Differential Revision: https://reviews.llvm.org/D150388
show more ...
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1 |
|
#
eac8e25e |
| 22-Mar-2023 |
Jay Foad <jay.foad@amd.com> |
[CodeGen] Fix type of MachineRegisterInfo::RegAllocHints. NFC.
The first member of the pair should be unsigned instead of Register because it is the hint type, 0 for simple (target independent) hint
[CodeGen] Fix type of MachineRegisterInfo::RegAllocHints. NFC.
The first member of the pair should be unsigned instead of Register because it is the hint type, 0 for simple (target independent) hints and other values for target dependent hints.
Differential Revision: https://reviews.llvm.org/D146646
show more ...
|
Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2 |
|
#
d170a254 |
| 03-Feb-2023 |
Jay Foad <jay.foad@amd.com> |
[CodeGen] Define and use MachineOperand::getOperandNo
This is a helper function to very slightly simplify many calls to MachineInstruction::getOperandNo.
Differential Revision: https://reviews.llvm
[CodeGen] Define and use MachineOperand::getOperandNo
This is a helper function to very slightly simplify many calls to MachineInstruction::getOperandNo.
Differential Revision: https://reviews.llvm.org/D143250
show more ...
|
Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init |
|
#
e72ca520 |
| 13-Jan-2023 |
Craig Topper <craig.topper@sifive.com> |
[CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFC
Use isPhysical/isVirtual methods.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D141715
|
Revision tags: llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
#
6c44a717 |
| 23-Jul-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
RegAlloc: Use SmallSet instead of std::set
There shouldn't be more than a small handful of hints at most.
|
#
8d0383eb |
| 24-Jun-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
CodeGen: Remove AliasAnalysis from regalloc
This was stored in LiveIntervals, but not actually used for anything related to LiveIntervals. It was only used in one check for if a load instruction is
CodeGen: Remove AliasAnalysis from regalloc
This was stored in LiveIntervals, but not actually used for anything related to LiveIntervals. It was only used in one check for if a load instruction is rematerializable. I also don't think this was entirely correct, since it was implicitly assuming constant loads are also dereferenceable.
Remove this and rely only on the invariant+dereferenceable flags in the memory operand. Set the flag based on the AA query upfront. This should have the same net benefit, but has the possible disadvantage of making this AA query nonlazy.
Preserve the behavior of assuming pointsToConstantMemory implying dereferenceable for now, but maybe this should be changed.
show more ...
|
#
9e6d1f4b |
| 17-Jul-2022 |
Kazu Hirata <kazu@google.com> |
[CodeGen] Qualify auto variables in for loops (NFC)
|
Revision tags: llvmorg-14.0.6 |
|
#
145cc9db |
| 14-Jun-2022 |
Kazu Hirata <kazu@google.com> |
[CodeGen] Remove futureWeight (NFC)
The last use was removed on Jun 5, 2022 in commit 5c06f7168fd1bd589b831cacd5f1cb8a928446fb, which itself was a patch to remove unused code.
|
Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
#
64e56f83 |
| 21-Dec-2021 |
Mircea Trofin <mtrofin@google.com> |
[NFC] Expose isRematerializable and copyHint from CalcSpillWeights
We need to reuse them for the ML regalloc eviction advisor, as we 'explode' the weight calculation into sub-features.
Differential
[NFC] Expose isRematerializable and copyHint from CalcSpillWeights
We need to reuse them for the ML regalloc eviction advisor, as we 'explode' the weight calculation into sub-features.
Differential Revision: https://reviews.llvm.org/D116074
show more ...
|
#
07622368 |
| 21-Dec-2021 |
Mircea Trofin <mtrofin@google.com> |
[NFC] Fix clang-tidy issues in CalcSpillWeights.cpp
|
Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
#
311d81ce |
| 24-Mar-2021 |
Serguei Katkov <serguei.katkov@azul.com> |
[RegAlloc] Fix "ran out of regs" with uses in statepoint
Statepoint instruction is known to have a variable and big number of operands. It is possible that Register Allocator will split live interva
[RegAlloc] Fix "ran out of regs" with uses in statepoint
Statepoint instruction is known to have a variable and big number of operands. It is possible that Register Allocator will split live intervals in the way that all physical registers are occupied by "zero-length" live intervals which are marked as not-spillable. While intervals are marked as not-spillable in the moment of creation when they are really zero-length it is possible that in future as part of re-materialization there will need for physical register between def and use of such tiny interval (the use is not related to this interval at all). As all physical registers are assigned to not-spillable intervals there is not avaialbe registers and RA reports an error.
The idea of the fix is avoid marking tiny live intervals where there is a use in statepoint instruction in var args section. Such interval may be perfectly spilled and folded to operand of statepoint.
Reviewers: reames, dantrushin, qcolombet, dsanders, dmgreen Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D98766
show more ...
|
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2 |
|
#
0447f350 |
| 10-Dec-2020 |
David Green <david.green@arm.com> |
[ARM][RegAlloc] Add t2LoopEndDec
We currently have problems with the way that low overhead loops are specified, with LR being spilled between the t2LoopDec and the t2LoopEnd forcing the entire loop
[ARM][RegAlloc] Add t2LoopEndDec
We currently have problems with the way that low overhead loops are specified, with LR being spilled between the t2LoopDec and the t2LoopEnd forcing the entire loop to be reverted late in the backend. As they will eventually become a single instruction, this patch introduces a t2LoopEndDec which is the combination of the two, combined before registry allocation to make sure this does not fail.
Unfortunately this instruction is a terminator that produces a value (and also branches - it only produces the value around the branching edge). So this needs some adjustment to phi elimination and the register allocator to make sure that we do not spill this LR def around the loop (needing to put a spill after the terminator). We treat the loop very carefully, making sure that there is nothing else like calls that would break it's ability to use LR. For that, this adds a isUnspillableTerminator to opt in the new behaviour.
There is a chance that this could cause problems, and so I have added an escape option incase. But I have not seen any problems in the testing that I've tried, and not reverting Low overhead loops is important for our performance. If this does work then we can hopefully do the same for t2WhileLoopStart and t2DoLoopStart instructions.
This patch also contains the code needed to convert or revert the t2LoopEndDec in the backend (which just needs a subs; bne) and the code pre-ra to create them.
Differential Revision: https://reviews.llvm.org/D91358
show more ...
|