History log of /llvm-project/llvm/lib/CodeGen/CalcSpillWeights.cpp (Results 1 – 25 of 104)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 735ab61a 13-Nov-2024 Kazu Hirata <kazu@google.com>

[CodeGen] Remove unused includes (NFC) (#115996)

Identified with misc-include-cleaner.


# 9470945b 07-Nov-2024 Valery Pykhtin <valery.pykhtin@gmail.com>

[CalcSpillWeights] Simplify copy hint register collection. NFC. (#114236)

CopyHints set has been collecting duplicates of a register with
increasing weight and then deduplicated with HintedRegs set

[CalcSpillWeights] Simplify copy hint register collection. NFC. (#114236)

CopyHints set has been collecting duplicates of a register with
increasing weight and then deduplicated with HintedRegs set. Let's stop
collecting duplicates at the first place.

show more ...


Revision tags: llvmorg-19.1.3
# e6ada716 21-Oct-2024 Ellis Hoag <ellis.sparky.hoag@gmail.com>

[regalloc][basic] Change spill weight for optsize funcs (#112960)

Change the spill weight calculations for `optsize` functions to remove
the block frequency multiplier. For those functions, we do n

[regalloc][basic] Change spill weight for optsize funcs (#112960)

Change the spill weight calculations for `optsize` functions to remove
the block frequency multiplier. For those functions, we do not want to
consider the runtime cost of spilling, only the codesize cost.

I built a large app with the basic and greedy (default) register
allocator enabled.

| Regalloc Type | Uncompressed Size Delta | Compressed Size Delta |
| - | - | - |
| Basic | -303.8 KiB (-0.23%) | -232.0 KiB (-0.39%) |
| Greedy | 159.1 KiB (0.12%) | 130.1 KiB (0.22%) |

Since I only saw a size win with the basic register allocator, I decided
to only change the behavior for that type.

show more ...


Revision tags: llvmorg-19.1.2, llvmorg-19.1.1
# 056a3f46 26-Sep-2024 Jeremy Morse <jeremy.morse@sony.com>

[NFC] Reapply 3f37c517f, SmallDenseMap speedups

This time with 100% more building unit tests. Original commit message follows.

[NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109

[NFC] Reapply 3f37c517f, SmallDenseMap speedups

This time with 100% more building unit tests. Original commit message follows.

[NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109417)

If we use SmallDenseMaps instead of DenseMaps at these locations,
we get a substantial speedup because there's less spurious malloc
traffic. Discovered by instrumenting DenseMap with some accounting
code, then selecting sites where we'll get the most bang for our buck.

show more ...


# 817e742b 25-Sep-2024 Jeremy Morse <jeremy.morse@sony.com>

Revert "[NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109417)"

This reverts commit 3f37c517fbc40531571f8b9f951a8610b4789cd6.

Lo and behold, I missed a unit test


# 3f37c517 25-Sep-2024 Jeremy Morse <jeremy.morse@sony.com>

[NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109417)

If we use SmallDenseMaps instead of DenseMaps at these locations,
we get a substantial speedup because there's less spurio

[NFC] Switch a number of DenseMaps to SmallDenseMaps for speedup (#109417)

If we use SmallDenseMaps instead of DenseMaps at these locations,
we get a substantial speedup because there's less spurious malloc
traffic. Discovered by instrumenting DenseMap with some accounting
code, then selecting sites where we'll get the most bang for our buck.

show more ...


Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1
# c80c09f3 23-Jul-2024 Dimitry Andric <dimitry@andric.com>

[CalcSpillWeights] Avoid x87 excess precision influencing weight result

Fixes #99396

The result of `VirtRegAuxInfo::weightCalcHelper` can be influenced by
x87 excess precision, which can result in

[CalcSpillWeights] Avoid x87 excess precision influencing weight result

Fixes #99396

The result of `VirtRegAuxInfo::weightCalcHelper` can be influenced by
x87 excess precision, which can result in slightly different register
choices when the compiler is hosted on x86_64 or i386. This leads to
different object file output when cross-compiling to i386, or native.

Similar to 7af3432e22b0, we need to add a `volatile` qualifier to the
local `Weight` variable to force it onto the stack, and avoid the excess
precision. Define `stack_float_t` in `MathExtras.h` for this purpose,
and use it.

show more ...


Revision tags: llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5
# f6d431f2 24-Apr-2024 Xu Zhang <simonzgx@gmail.com>

[CodeGen] Make the parameter TRI required in some functions. (#85968)

Fixes #82659

There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many

[CodeGen] Make the parameter TRI required in some functions. (#85968)

Fixes #82659

There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue #82411.

Following @RKSimon 's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact.

After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.

show more ...


Revision tags: llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6
# 5353d3f5 17-Nov-2023 Matthias Braun <matze@braunis.de>

Remove unused LoopInfo from InlineSpiller and SpillPlacement (NFC) (#71874)


Revision tags: llvmorg-17.0.5
# 284c6990 03-Nov-2023 Nick Desaulniers <nickdesaulniers@users.noreply.github.com>

[CalcSpillWeights] don't mark live intervals with spillable inlineasm ops as having infinite spill weight (#70747)

This is necessary for RegAllocGreedy support for memory folding inline
asm that us

[CalcSpillWeights] don't mark live intervals with spillable inlineasm ops as having infinite spill weight (#70747)

This is necessary for RegAllocGreedy support for memory folding inline
asm that uses "rm" constraints.

Thanks to @qcolombet for the suggestion.

Link: https://github.com/llvm/llvm-project/issues/20571

show more ...


Revision tags: llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3
# 4d0f1e32 11-Aug-2023 Elliot Goodrich <elliotgoodrich@gmail.com>

[llvm] Remove SmallSet from MachineInstr.h

`MachineInstr.h` is a commonly included file and this includes
`llvm/ADT/SmallSet.h` for one function `getUsedDebugRegs()`, which is
used only in one place

[llvm] Remove SmallSet from MachineInstr.h

`MachineInstr.h` is a commonly included file and this includes
`llvm/ADT/SmallSet.h` for one function `getUsedDebugRegs()`, which is
used only in one place.

According to `ClangBuildAnalyzer` (run solely on building LLVM, no other
projects) the second most expensive template to instantiate is the
`SmallSet::insert` method used in the `inline` implementation in
`getUsedDebugRegs()`:

```
**** Templates that took longest to instantiate:
554239 ms: std::unordered_map<int, int> (2826 times, avg 196 ms)
521187 ms: llvm::SmallSet<llvm::Register, 4>::insert (930 times, avg 560
ms)
...
```

By removing this method and putting its implementation in the one call
site we greatly reduce the template instantiation time and reduce the
number of includes.

When copying the implementation, I removed a check on `MO.getReg()` as
this is checked within `MO.isVirtual()`.

Differential Revision: https://reviews.llvm.org/D157720

show more ...


Revision tags: llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1
# 4d42e8b5 28-Jul-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

Reapply "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"

This reverts commit a496c8be6e638ae58bb45f13113dbe3a4b7b23fd.

The workaround in c26dfc81e254c78dc2

Reapply "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"

This reverts commit a496c8be6e638ae58bb45f13113dbe3a4b7b23fd.

The workaround in c26dfc81e254c78dc23579cf3d1336f77249e1f6 should work
around the underlying problem with SUBREG_TO_REG.

show more ...


# a496c8be 26-Jul-2023 Vitaly Buka <vitalybuka@google.com>

Revert "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"

And dependent commits.

Details in D150388.

This reverts commit 825b7f0ca5f2211ec3c93139f98d1e24048

Revert "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"

And dependent commits.

Details in D150388.

This reverts commit 825b7f0ca5f2211ec3c93139f98d1e24048c225c.
This reverts commit 7a98f084c4d121244ef7286bc6503b6a181d446e.
This reverts commit b4a62b1fa546312d882fa12dfdcd015177d66826.
This reverts commit b7836d856206ec39509d42529f958c920368166b.

No conflicts in the code, few tests had conflicts in autogenerated CHECKs:
llvm/test/CodeGen/Thumb2/mve-float32regloops.ll
llvm/test/CodeGen/AMDGPU/fix-frame-reg-in-custom-csr-spills.ll

Reviewed By: alexfh

Differential Revision: https://reviews.llvm.org/D156381

show more ...


Revision tags: llvmorg-18-init
# b7836d85 07-Jul-2023 Yashwant Singh <Yashwant.Singh@amd.com>

[CodeGen]Allow targets to use target specific COPY instructions for live range splitting

Replacing D143754. Right now the LiveRangeSplitting during register allocation uses
TargetOpcode::COPY instru

[CodeGen]Allow targets to use target specific COPY instructions for live range splitting

Replacing D143754. Right now the LiveRangeSplitting during register allocation uses
TargetOpcode::COPY instruction for splitting. For AMDGPU target that creates a
problem as we have both vector and scalar copies. Vector copies perform a copy over
a vector register but only on the lanes(threads) that are active. This is mostly sufficient
however we do run into cases when we have to copy the entire vector register and
not just active lane data. One major place where we need that is live range splitting.

Allowing targets to use their own copy instructions(if defined) will provide a lot of
flexibility and ease to lower these pseudo instructions to correct MIR.

- Introduce getTargetCopyOpcode() virtual function and use if to generate copy in Live range
splitting.
- Replace necessary MI.isCopy() checks with TII.isCopyInstr() in register allocator pipeline.

Reviewed By: arsenm, cdevadas, kparzysz

Differential Revision: https://reviews.llvm.org/D150388

show more ...


Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1
# eac8e25e 22-Mar-2023 Jay Foad <jay.foad@amd.com>

[CodeGen] Fix type of MachineRegisterInfo::RegAllocHints. NFC.

The first member of the pair should be unsigned instead of Register
because it is the hint type, 0 for simple (target independent) hint

[CodeGen] Fix type of MachineRegisterInfo::RegAllocHints. NFC.

The first member of the pair should be unsigned instead of Register
because it is the hint type, 0 for simple (target independent) hints and
other values for target dependent hints.

Differential Revision: https://reviews.llvm.org/D146646

show more ...


Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2
# d170a254 03-Feb-2023 Jay Foad <jay.foad@amd.com>

[CodeGen] Define and use MachineOperand::getOperandNo

This is a helper function to very slightly simplify many calls to
MachineInstruction::getOperandNo.

Differential Revision: https://reviews.llvm

[CodeGen] Define and use MachineOperand::getOperandNo

This is a helper function to very slightly simplify many calls to
MachineInstruction::getOperandNo.

Differential Revision: https://reviews.llvm.org/D143250

show more ...


Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init
# e72ca520 13-Jan-2023 Craig Topper <craig.topper@sifive.com>

[CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFC

Use isPhysical/isVirtual methods.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D141715


Revision tags: llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 6c44a717 23-Jul-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

RegAlloc: Use SmallSet instead of std::set

There shouldn't be more than a small handful of hints at most.


# 8d0383eb 24-Jun-2022 Matt Arsenault <Matthew.Arsenault@amd.com>

CodeGen: Remove AliasAnalysis from regalloc

This was stored in LiveIntervals, but not actually used for anything
related to LiveIntervals. It was only used in one check for if a load
instruction is

CodeGen: Remove AliasAnalysis from regalloc

This was stored in LiveIntervals, but not actually used for anything
related to LiveIntervals. It was only used in one check for if a load
instruction is rematerializable. I also don't think this was entirely
correct, since it was implicitly assuming constant loads are also
dereferenceable.

Remove this and rely only on the invariant+dereferenceable flags in
the memory operand. Set the flag based on the AA query upfront. This
should have the same net benefit, but has the possible disadvantage of
making this AA query nonlazy.

Preserve the behavior of assuming pointsToConstantMemory implying
dereferenceable for now, but maybe this should be changed.

show more ...


# 9e6d1f4b 17-Jul-2022 Kazu Hirata <kazu@google.com>

[CodeGen] Qualify auto variables in for loops (NFC)


Revision tags: llvmorg-14.0.6
# 145cc9db 14-Jun-2022 Kazu Hirata <kazu@google.com>

[CodeGen] Remove futureWeight (NFC)

The last use was removed on Jun 5, 2022 in
commit 5c06f7168fd1bd589b831cacd5f1cb8a928446fb, which itself was
a patch to remove unused code.


Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 64e56f83 21-Dec-2021 Mircea Trofin <mtrofin@google.com>

[NFC] Expose isRematerializable and copyHint from CalcSpillWeights

We need to reuse them for the ML regalloc eviction advisor, as we
'explode' the weight calculation into sub-features.

Differential

[NFC] Expose isRematerializable and copyHint from CalcSpillWeights

We need to reuse them for the ML regalloc eviction advisor, as we
'explode' the weight calculation into sub-features.

Differential Revision: https://reviews.llvm.org/D116074

show more ...


# 07622368 21-Dec-2021 Mircea Trofin <mtrofin@google.com>

[NFC] Fix clang-tidy issues in CalcSpillWeights.cpp


Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4
# 311d81ce 24-Mar-2021 Serguei Katkov <serguei.katkov@azul.com>

[RegAlloc] Fix "ran out of regs" with uses in statepoint

Statepoint instruction is known to have a variable and big number of operands.
It is possible that Register Allocator will split live interva

[RegAlloc] Fix "ran out of regs" with uses in statepoint

Statepoint instruction is known to have a variable and big number of operands.
It is possible that Register Allocator will split live intervals in the way that all
physical registers are occupied by "zero-length" live intervals which are marked
as not-spillable.
While intervals are marked as not-spillable in the moment of creation when they are
really zero-length it is possible that in future as part of re-materialization there will
need for physical register between def and use of such tiny interval (the use is not
related to this interval at all).
As all physical registers are assigned to not-spillable intervals there is not avaialbe
registers and RA reports an error.

The idea of the fix is avoid marking tiny live intervals where there is a use in statepoint
instruction in var args section. Such interval may be perfectly spilled and folded to
operand of statepoint.

Reviewers: reames, dantrushin, qcolombet, dsanders, dmgreen
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D98766

show more ...


Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2
# 0447f350 10-Dec-2020 David Green <david.green@arm.com>

[ARM][RegAlloc] Add t2LoopEndDec

We currently have problems with the way that low overhead loops are
specified, with LR being spilled between the t2LoopDec and the t2LoopEnd
forcing the entire loop

[ARM][RegAlloc] Add t2LoopEndDec

We currently have problems with the way that low overhead loops are
specified, with LR being spilled between the t2LoopDec and the t2LoopEnd
forcing the entire loop to be reverted late in the backend. As they will
eventually become a single instruction, this patch introduces a
t2LoopEndDec which is the combination of the two, combined before
registry allocation to make sure this does not fail.

Unfortunately this instruction is a terminator that produces a value
(and also branches - it only produces the value around the branching
edge). So this needs some adjustment to phi elimination and the register
allocator to make sure that we do not spill this LR def around the loop
(needing to put a spill after the terminator). We treat the loop very
carefully, making sure that there is nothing else like calls that would
break it's ability to use LR. For that, this adds a
isUnspillableTerminator to opt in the new behaviour.

There is a chance that this could cause problems, and so I have added an
escape option incase. But I have not seen any problems in the testing
that I've tried, and not reverting Low overhead loops is important for
our performance. If this does work then we can hopefully do the same for
t2WhileLoopStart and t2DoLoopStart instructions.

This patch also contains the code needed to convert or revert the
t2LoopEndDec in the backend (which just needs a subs; bne) and the code
pre-ra to create them.

Differential Revision: https://reviews.llvm.org/D91358

show more ...


12345