CalcSpillWeights.cpp - OpenGrok history log for /llvm-project/llvm/lib/CodeGen/CalcSpillWeights.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 735ab61a	13-Nov-2024	Kazu Hirata <kazu@google.com>	[CodeGen] Remove unused includes (NFC) (#115996) Identified with misc-include-cleaner.
# 9470945b	07-Nov-2024	Valery Pykhtin <valery.pykhtin@gmail.com>	[CalcSpillWeights] Simplify copy hint register collection. NFC. (#114236) The CopyHints set has been collecting duplicates of a register with increasing weight, which were then deduplicated with the HintedRegs set. Let's stop collecting duplicates in the first place.
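The idea behind 9470945b, recording each hinted register once instead of inserting duplicates and filtering them afterwards, can be sketched roughly as below. This is an illustration only and not the code from the commit: `Reg`, `CopyHint`, and `collectHints` are hypothetical names, and standard-library containers stand in for LLVM's own.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>
#include <utility>
#include <vector>

using Reg = unsigned; // hypothetical stand-in for llvm::Register

struct CopyHint {
  Reg reg;
  float weight;
};

// Sum the weight of every (register, weight) observation per register, so each
// register appears exactly once in the result. Heaviest hints come first; ties
// are broken by register number to keep the order deterministic.
std::vector<CopyHint>
collectHints(const std::vector<std::pair<Reg, float>> &observations) {
  std::unordered_map<Reg, float> total;
  for (const auto &obs : observations) {
    assert(obs.second >= 0.0f && "sketch assumes non-negative weights");
    total[obs.first] += obs.second;
  }

  std::vector<CopyHint> hints;
  hints.reserve(total.size());
  for (const auto &entry : total)
    hints.push_back({entry.first, entry.second});

  std::sort(hints.begin(), hints.end(),
            [](const CopyHint &a, const CopyHint &b) {
              if (a.weight != b.weight)
                return a.weight > b.weight;
              return a.reg < b.reg;
            });
  return hints;
}
```

Because the weights are accumulated in a map keyed by register, no second set is needed to filter out repeated registers.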
Revision tags: llvmorg-19.1.3
# e6ada716	21-Oct-2024	Ellis Hoag <ellis.sparky.hoag@gmail.com>	[regalloc][basic] Change spill weight for optsize funcs (#112960) Change the spill weight calculations for `optsize` functions to remove the block frequency multiplier. For those functions, we do not want to consider the runtime cost of spilling, only the codesize cost. I built a large app with the basic and greedy (default) register allocator enabled.
| Regalloc Type | Uncompressed Size Delta | Compressed Size Delta |
| - | - | - |
| Basic | -303.8 KiB (-0.23%) | -232.0 KiB (-0.39%) |
| Greedy | 159.1 KiB (0.12%) | 130.1 KiB (0.22%) |
Since I only saw a size win with the basic register allocator, I decided to only change the behavior for that type.
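A minimal sketch of the rule described in e6ada716, assuming a per-instruction weight of "number of defs plus uses, scaled by relative block frequency". The function name and parameters are hypothetical and do not reproduce LLVM's actual interface; the sketch also omits the detail that, per the commit message, only the basic register allocator uses the new behavior.

```cpp
#include <cassert>

// Hypothetical helper: the weight one instruction contributes to a live
// interval. relativeBlockFreq is the block's frequency relative to the
// function entry block.
double spillWeightSketch(bool isDef, bool isUse, double relativeBlockFreq,
                         bool optimizeForSize) {
  assert(relativeBlockFreq >= 0.0 && "frequencies are non-negative");
  const double accesses = (isDef ? 1.0 : 0.0) + (isUse ? 1.0 : 0.0);
  // For optsize functions only the code-size cost matters, so the runtime
  // (block frequency) multiplier is dropped.
  if (optimizeForSize)
    return accesses;
  return accesses * relativeBlockFreq;
}
```

With this rule, an instruction in a hot loop contributes no more weight than one in cold code when the function is optimized for size, so spill decisions depend on how many spill instructions are emitted rather than on how often they run.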
Revision tags: llvmorg-19.1.2, llvmorg-19.1.1
# 056a3f46	26-Sep-2024	Jeremy Morse <jeremy.morse@sony.com>	[NFC] Reapply 3f37c517f, SmallDenseMap speedups This time with 100% more building unit tests. Original commit message follows. [NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109417) If we use SmallDenseMaps instead of DenseMaps at these locations, we get a substantial speedup because there's less spurious malloc traffic. Discovered by instrumenting DenseMap with some accounting code, then selecting sites where we'll get the most bang for our buck.
# 817e742b	25-Sep-2024	Jeremy Morse <jeremy.morse@sony.com>	Revert "[NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109417)" This reverts commit 3f37c517fbc40531571f8b9f951a8610b4789cd6. Lo and behold, I missed a unit test
# 3f37c517	25-Sep-2024	Jeremy Morse <jeremy.morse@sony.com>	[NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109417) If we use SmallDenseMaps instead of DenseMaps at these locations, we get a substantial speedup because there's less spurious malloc traffic. Discovered by instrumenting DenseMap with some accounting code, then selecting sites where we'll get the most bang for our buck.
Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1
# c80c09f3	23-Jul-2024	Dimitry Andric <dimitry@andric.com>	[CalcSpillWeights] Avoid x87 excess precision influencing weight result Fixes #99396 The result of `VirtRegAuxInfo::weightCalcHelper` can be influenced by x87 excess precision, which can result in slightly different register choices when the compiler is hosted on x86_64 or i386. This leads to different object file output when cross-compiling to i386, or native. Similar to 7af3432e22b0, we need to add a `volatile` qualifier to the local `Weight` variable to force it onto the stack, and avoid the excess precision. Define `stack_float_t` in `MathExtras.h` for this purpose, and use it.
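The technique named in c80c09f3 can be illustrated with the sketch below. The alias `stack_float_sketch_t` and the function `accumulateWeight` are hypothetical, and making the alias `volatile` only on 32-bit x86 is an assumption of this sketch rather than something the commit message states. The point is that a `volatile float` local must be stored to memory after each assignment, which rounds the value to single precision instead of letting it stay in a wider x87 register.

```cpp
#include <cassert>

// Hypothetical stand-in for the `stack_float_t` alias mentioned in the commit.
// Assumption: the qualifier is only needed where x87 arithmetic may be used.
#if defined(__i386__) || defined(_M_IX86)
using stack_float_sketch_t = volatile float;
#else
using stack_float_sketch_t = float;
#endif

// Sum n partial weights. Each intermediate result is written back to `total`,
// so on x87 hosts it is rounded to float before the next addition.
float accumulateWeight(const float *parts, int n) {
  assert(n >= 0 && "element count must be non-negative");
  stack_float_sketch_t total = 0.0f;
  for (int i = 0; i < n; ++i)
    total = total + parts[i]; // plain assignment; compound ops on volatile are deprecated in C++20
  return total;
}
```

On hosts that do not use x87 arithmetic the alias is an ordinary `float`, so the sketch adds no overhead there.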
Revision tags: llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5
# f6d431f2	24-Apr-2024	Xu Zhang <simonzgx@gmail.com>	[CodeGen] Make the parameter TRI required in some functions. (#85968) Fixes #82659 There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue #82411. Following @RKSimon's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact. After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
Revision tags: llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6
# 5353d3f5	17-Nov-2023	Matthias Braun <matze@braunis.de>	Remove unused LoopInfo from InlineSpiller and SpillPlacement (NFC) (#71874)
Revision tags: llvmorg-17.0.5
# 284c6990	03-Nov-2023	Nick Desaulniers <nickdesaulniers@users.noreply.github.com>	[CalcSpillWeights] don't mark live intervals with spillable inlineasm ops as having infinite spill weight (#70747) This is necessary for RegAllocGreedy support for memory folding inline asm that uses "rm" constraints. Thanks to @qcolombet for the suggestion. Link: https://github.com/llvm/llvm-project/issues/20571
Revision tags: llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3
# 4d0f1e32	11-Aug-2023	Elliot Goodrich <elliotgoodrich@gmail.com>	[llvm] Remove SmallSet from MachineInstr.h `MachineInstr.h` is a commonly included file and this includes `llvm/ADT/SmallSet.h` for one function `getUsedDebugRegs()`, which is used only in one place. According to `ClangBuildAnalyzer` (run solely on building LLVM, no other projects) the second most expensive template to instantiate is the `SmallSet::insert` method used in the `inline` implementation in `getUsedDebugRegs()`:
```
**** Templates that took longest to instantiate:
554239 ms: std::unordered_map<int, int> (2826 times, avg 196 ms)
521187 ms: llvm::SmallSet<llvm::Register, 4>::insert (930 times, avg 560 ms)
...
```
By removing this method and putting its implementation in the one call site we greatly reduce the template instantiation time and reduce the number of includes. When copying the implementation, I removed a check on `MO.getReg()` as this is checked within `MO.isVirtual()`. Differential Revision: https://reviews.llvm.org/D157720
Revision tags: llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1
# 4d42e8b5	28-Jul-2023	Matt Arsenault <Matthew.Arsenault@amd.com>	Reapply "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting" This reverts commit a496c8be6e638ae58bb45f13113dbe3a4b7b23fd. The workaround in c26dfc81e254c78dc23579cf3d1336f77249e1f6 should work around the underlying problem with SUBREG_TO_REG.
# a496c8be	26-Jul-2023	Vitaly Buka <vitalybuka@google.com>	Revert "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting" And dependent commits. Details in D150388. This reverts commit 825b7f0ca5f2211ec3c93139f98d1e24048c225c. This reverts commit 7a98f084c4d121244ef7286bc6503b6a181d446e. This reverts commit b4a62b1fa546312d882fa12dfdcd015177d66826. This reverts commit b7836d856206ec39509d42529f958c920368166b. No conflicts in the code, few tests had conflicts in autogenerated CHECKs: llvm/test/CodeGen/Thumb2/mve-float32regloops.ll llvm/test/CodeGen/AMDGPU/fix-frame-reg-in-custom-csr-spills.ll Reviewed By: alexfh Differential Revision: https://reviews.llvm.org/D156381
Revision tags: llvmorg-18-init
# b7836d85	07-Jul-2023	Yashwant Singh <Yashwant.Singh@amd.com>	[CodeGen]Allow targets to use target specific COPY instructions for live range splitting Replacing D143754. Right now the LiveRangeSplitting during register allocation uses the TargetOpcode::COPY instruction for splitting. For the AMDGPU target that creates a problem as we have both vector and scalar copies. Vector copies perform a copy over a vector register but only on the lanes (threads) that are active. This is mostly sufficient, however we do run into cases when we have to copy the entire vector register and not just active lane data. One major place where we need that is live range splitting. Allowing targets to use their own copy instructions (if defined) will provide a lot of flexibility and ease to lower these pseudo instructions to correct MIR. - Introduce getTargetCopyOpcode() virtual function and use it to generate copy in Live range splitting. - Replace necessary MI.isCopy() checks with TII.isCopyInstr() in register allocator pipeline. Reviewed By: arsenm, cdevadas, kparzysz Differential Revision: https://reviews.llvm.org/D150388
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1
# eac8e25e	22-Mar-2023	Jay Foad <jay.foad@amd.com>	[CodeGen] Fix type of MachineRegisterInfo::RegAllocHints. NFC. The first member of the pair should be unsigned instead of Register because it is the hint type, 0 for simple (target independent) hints and other values for target dependent hints. Differential Revision: https://reviews.llvm.org/D146646
Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2
# d170a254	03-Feb-2023	Jay Foad <jay.foad@amd.com>	[CodeGen] Define and use MachineOperand::getOperandNo This is a helper function to very slightly simplify many calls to MachineInstruction::getOperandNo. Differential Revision: https://reviews.llvm.org/D143250
Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init
# e72ca520	13-Jan-2023	Craig Topper <craig.topper@sifive.com>	[CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFC Use isPhysical/isVirtual methods. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D141715
Revision tags: llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 6c44a717	23-Jul-2022	Matt Arsenault <Matthew.Arsenault@amd.com>	RegAlloc: Use SmallSet instead of std::set There shouldn't be more than a small handful of hints at most.
# 8d0383eb	24-Jun-2022	Matt Arsenault <Matthew.Arsenault@amd.com>	CodeGen: Remove AliasAnalysis from regalloc This was stored in LiveIntervals, but not actually used for anything related to LiveIntervals. It was only used in one check for whether a load instruction is rematerializable. I also don't think this was entirely correct, since it was implicitly assuming constant loads are also dereferenceable. Remove this and rely only on the invariant+dereferenceable flags in the memory operand. Set the flag based on the AA query upfront. This should have the same net benefit, but has the possible disadvantage of making this AA query nonlazy. Preserve the behavior of assuming pointsToConstantMemory implying dereferenceable for now, but maybe this should be changed.
# 9e6d1f4b	17-Jul-2022	Kazu Hirata <kazu@google.com>	[CodeGen] Qualify auto variables in for loops (NFC)
Revision tags: llvmorg-14.0.6
# 145cc9db	14-Jun-2022	Kazu Hirata <kazu@google.com>	[CodeGen] Remove futureWeight (NFC) The last use was removed on Jun 5, 2022 in commit 5c06f7168fd1bd589b831cacd5f1cb8a928446fb, which itself was a patch to remove unused code.
Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 64e56f83	21-Dec-2021	Mircea Trofin <mtrofin@google.com>	[NFC] Expose isRematerializable and copyHint from CalcSpillWeights We need to reuse them for the ML regalloc eviction advisor, as we 'explode' the weight calculation into sub-features. Differential Revision: https://reviews.llvm.org/D116074
# 07622368	21-Dec-2021	Mircea Trofin <mtrofin@google.com>	[NFC] Fix clang-tidy issues in CalcSpillWeights.cpp
Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4
# 311d81ce	24-Mar-2021	Serguei Katkov <serguei.katkov@azul.com>	[RegAlloc] Fix "ran out of regs" with uses in statepoint The statepoint instruction is known to have a variable and large number of operands. It is possible that the register allocator will split live intervals in such a way that all physical registers are occupied by "zero-length" live intervals which are marked as not-spillable. While intervals are marked as not-spillable at the moment of creation, when they are really zero-length, it is possible that in the future, as part of re-materialization, there will be a need for a physical register between the def and use of such a tiny interval (the use is not related to this interval at all). As all physical registers are assigned to not-spillable intervals, there are no available registers and RA reports an error. The idea of the fix is to avoid marking tiny live intervals as not-spillable where there is a use in a statepoint instruction in the var args section. Such an interval may be perfectly spilled and folded to an operand of the statepoint. Reviewers: reames, dantrushin, qcolombet, dsanders, dmgreen Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D98766
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2
# 0447f350	10-Dec-2020	David Green <david.green@arm.com>	[ARM][RegAlloc] Add t2LoopEndDec We currently have problems with the way that low overhead loops are specified, with LR being spilled between the t2LoopDec and the t2LoopEnd forcing the entire loop to be reverted late in the backend. As they will eventually become a single instruction, this patch introduces a t2LoopEndDec which is the combination of the two, combined before register allocation to make sure this does not fail. Unfortunately this instruction is a terminator that produces a value (and also branches - it only produces the value around the branching edge). So this needs some adjustment to phi elimination and the register allocator to make sure that we do not spill this LR def around the loop (needing to put a spill after the terminator). We treat the loop very carefully, making sure that there is nothing else like calls that would break its ability to use LR. For that, this adds an isUnspillableTerminator to opt in to the new behaviour. There is a chance that this could cause problems, and so I have added an escape option in case. But I have not seen any problems in the testing that I've tried, and not reverting low overhead loops is important for our performance. If this does work then we can hopefully do the same for t2WhileLoopStart and t2DoLoopStart instructions. This patch also contains the code needed to convert or revert the t2LoopEndDec in the backend (which just needs a subs; bne) and the code pre-ra to create them. Differential Revision: https://reviews.llvm.org/D91358