Revision tags: llvmorg-21-init |
|
#
0ef39a88 |
| 24-Jan-2025 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
MachineCSE: Remove check for subreg on a def operand (#124095)
There are no subregister defs in SSA.
|
#
09bf5b0d |
| 16-Jan-2025 |
Kazu Hirata <kazu@google.com> |
[CodeGen] Avoid repeated hash lookups (NFC) (#123160)
|
Revision tags: llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
fe636692 |
| 09-Nov-2024 |
paperchalice <liujunchang97@outlook.com> |
[Instrumentation] Support `MachineFunction` in `OptNoneInstrumentation` (#115471)
Support `MachineFunction` in `OptNoneInstrumentation`, also add
`isRequired` to all necessary passes.
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0 |
|
#
6c143a86 |
| 04-Sep-2024 |
Christudasan Devadasan <christudasan.devadasan@amd.com> |
[CodeGen][NewPM] Port MachineCSE pass to new pass manager. (#106605)
|
#
4552153c |
| 04-Sep-2024 |
Christudasan Devadasan <christudasan.devadasan@amd.com> |
[CodeGen][MachineCSE] Remove unused AA results(NFC) (#106604)
Alias Analysis result is never used in this pass and hence removing it.
|
Revision tags: llvmorg-19.1.0-rc4 |
|
#
3d08ade7 |
| 29-Aug-2024 |
Stephen Tozer <stephen.tozer@sony.com> |
[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149)
This patch is part of a set of patches that add an `-fextend-lifetimes`
flag to clang, which extends the lifetimes of
[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149)
This patch is part of a set of patches that add an `-fextend-lifetimes`
flag to clang, which extends the lifetimes of local variables and
parameters for improved debuggability. In addition to that flag, the
patch series adds a pragma to selectively disable `-fextend-lifetimes`,
and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes`
for this pointers only. All changes and tests in these patches were
written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer)
has handled review and merging. The extend lifetimes flag is intended to
eventually be set on by `-Og`, as discussed in the RFC
here:
https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850
This patch implements a new intrinsic instruction in LLVM,
`llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand
and has no effect other than "using" its operand, to ensure that its
operand remains live until after the fake use. This patch does not emit
fake uses anywhere; the next patch in this sequence causes them to be
emitted from the clang frontend, such that for each variable (or this) a
fake.use operand is inserted at the end of that variable's scope, using
that variable's value. This patch covers everything post-frontend, which
is largely just the basic plumbing for a new intrinsic/instruction,
along with a few steps to preserve the fake uses through optimizations
(such as moving them ahead of a tail call or translating them through
SROA).
Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
show more ...
|
Revision tags: llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
09989996 |
| 12-Jul-2024 |
paperchalice <liujunchang97@outlook.com> |
[CodeGen][NewPM] Port `machine-block-freq` to new pass manager (#98317)
- Add `MachineBlockFrequencyAnalysis`.
- Add `MachineBlockFrequencyPrinterPass`.
- Use `MachineBlockFrequencyInfoWrapperPass
[CodeGen][NewPM] Port `machine-block-freq` to new pass manager (#98317)
- Add `MachineBlockFrequencyAnalysis`.
- Add `MachineBlockFrequencyPrinterPass`.
- Use `MachineBlockFrequencyInfoWrapperPass` in legacy pass manager.
- `LazyMachineBlockFrequencyInfo::print` is empty, drop it due to new
pass manager migration.
show more ...
|
#
0d9d5f7e |
| 12-Jul-2024 |
Vikram Hegde <115221833+vikramRH@users.noreply.github.com> |
[CodeGen] Guard copy propagation in machine CSE against undefs (#97413)
|
Revision tags: llvmorg-18.1.8 |
|
#
837dc542 |
| 11-Jun-2024 |
paperchalice <liujunchang97@outlook.com> |
[CodeGen][NewPM] Split `MachineDominatorTree` into a concrete analysis result (#94571)
Prepare for new pass manager version of `MachineDominatorTreeAnalysis`.
We may need a machine dominator tree v
[CodeGen][NewPM] Split `MachineDominatorTree` into a concrete analysis result (#94571)
Prepare for new pass manager version of `MachineDominatorTreeAnalysis`.
We may need a machine dominator tree version of `DomTreeUpdater` to
handle `SplitCriticalEdge` in some CodeGen passes.
show more ...
|
Revision tags: llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5 |
|
#
f6d431f2 |
| 24-Apr-2024 |
Xu Zhang <simonzgx@gmail.com> |
[CodeGen] Make the parameter TRI required in some functions. (#85968)
Fixes #82659
There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many
[CodeGen] Make the parameter TRI required in some functions. (#85968)
Fixes #82659
There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue #82411.
Following @RKSimon 's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact.
After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
show more ...
|
Revision tags: llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init |
|
#
92c2529c |
| 04-Dec-2023 |
Kazu Hirata <kazu@google.com> |
[llvm] Stop including vector (NFC)
Identified with clangd.
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4 |
|
#
0c5c7b52 |
| 31-Aug-2023 |
Daniel Paoliello <danpao@microsoft.com> |
Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tables
The CodeView `S_ARMSWITCHTABLE` debug symbol is used to describe the layout of a jump table, it contains the following information:
Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tables
The CodeView `S_ARMSWITCHTABLE` debug symbol is used to describe the layout of a jump table, it contains the following information:
* The address of the branch instruction that uses the jump table. * The address of the jump table. * The "base" address that the values in the jump table are relative to. * The type of each entry (absolute pointer, a relative integer, a relative integer that is shifted).
Together this information can be used by debuggers and binary analysis tools to understand what an jump table indirect branch is doing and where it might jump to.
Documentation for the symbol can be found in the Microsoft PDB library dumper: https://github.com/microsoft/microsoft-pdb/blob/0fe89a942f9a0f8e061213313e438884f4c9b876/cvdump/dumpsym7.cpp#L5518
This change adds support to LLVM to emit the `S_ARMSWITCHTABLE` debug symbol as well as to dump it out (for testing purposes).
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D149367
show more ...
|
#
0a4fc4ac |
| 26-Aug-2023 |
Arthur Eubanks <aeubanks@google.com> |
Revert "Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tables"
This reverts commit 8d0c3db388143f4e058b5f513a70fd5d089d51c3.
Causes crashes, see comments in https://reviews.llvm.org/D14
Revert "Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tables"
This reverts commit 8d0c3db388143f4e058b5f513a70fd5d089d51c3.
Causes crashes, see comments in https://reviews.llvm.org/D149367.
Some follow-up fixes are also reverted:
This reverts commit 636269f4fca44693bfd787b0a37bb0328ffcc085. This reverts commit 5966079cf4d4de0285004eef051784d0d9f7a3a6. This reverts commit e7294dbc85d24a08c716d9babbe7f68390cf219b.
show more ...
|
#
8d0c3db3 |
| 25-Aug-2023 |
Daniel Paoliello <danpao@microsoft.com> |
Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tables
The CodeView `S_ARMSWITCHTABLE` debug symbol is used to describe the layout of a jump table, it contains the following information:
Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tables
The CodeView `S_ARMSWITCHTABLE` debug symbol is used to describe the layout of a jump table, it contains the following information:
* The address of the branch instruction that uses the jump table. * The address of the jump table. * The "base" address that the values in the jump table are relative to. * The type of each entry (absolute pointer, a relative integer, a relative integer that is shifted).
Together this information can be used by debuggers and binary analysis tools to understand what an jump table indirect branch is doing and where it might jump to.
Documentation for the symbol can be found in the Microsoft PDB library dumper: https://github.com/microsoft/microsoft-pdb/blob/0fe89a942f9a0f8e061213313e438884f4c9b876/cvdump/dumpsym7.cpp#L5518
This change adds support to LLVM to emit the `S_ARMSWITCHTABLE` debug symbol as well as to dump it out (for testing purposes).
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D149367
show more ...
|
Revision tags: llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2 |
|
#
f580901d |
| 03-Aug-2023 |
Jingu Kang <jingu.kang@arm.com> |
[MachineCSE] Add an option to override the profitability heuristics
Differential Revision: https://reviews.llvm.org/D157002
|
Revision tags: llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5 |
|
#
5022fc2a |
| 24-May-2023 |
Jay Foad <jay.foad@amd.com> |
[CodeGen] Make use of MachineInstr::all_defs and all_uses. NFCI.
Differential Revision: https://reviews.llvm.org/D151424
|
Revision tags: llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2 |
|
#
8bf7f86d |
| 17-Apr-2023 |
Akshay Khadse <akshayskhadse@gmail.com> |
Fix uninitialized pointer members in CodeGen
This change initializes the members TSI, LI, DT, PSI, and ORE pointer feilds of the SelectOptimize class to nullptr.
Reviewed By: LuoYuanke
Differentia
Fix uninitialized pointer members in CodeGen
This change initializes the members TSI, LI, DT, PSI, and ORE pointer feilds of the SelectOptimize class to nullptr.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D148303
show more ...
|
Revision tags: llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init |
|
#
e72ca520 |
| 13-Jan-2023 |
Craig Topper <craig.topper@sifive.com> |
[CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFC
Use isPhysical/isVirtual methods.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D141715
|
Revision tags: llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5 |
|
#
11e86868 |
| 14-Nov-2022 |
Guozhi Wei <carrot@google.com> |
[MachineCSE] Allow CSE for instructions with ignorable operands
Ignorable operands don't impact instruction's behavior, we can safely do CSE on the instruction.
It is split from D130919. It has big
[MachineCSE] Allow CSE for instructions with ignorable operands
Ignorable operands don't impact instruction's behavior, we can safely do CSE on the instruction.
It is split from D130919. It has big impact to some AMDGPU test cases. For example in atomic_optimizations_raw_buffer.ll, when trying to check if the following instruction can be CSEed
%37:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
Function isCallerPreservedOrConstPhysReg is called on operand "implicit $exec", this function is implemented as
- return TRI.isCallerPreservedPhysReg(Reg, MF) || + return TRI.isCallerPreservedPhysReg(Reg, MF) || TII.isIgnorableUse(MO) || (MRI.reservedRegsFrozen() && MRI.isConstantPhysReg(Reg));
Both TRI.isCallerPreservedPhysReg and MRI.isConstantPhysReg return false on this operand, so isCallerPreservedOrConstPhysReg is also false, it causes LLVM failed to CSE this instruction.
With this patch TII.isIgnorableUse returns true for the operand $exec, so isCallerPreservedOrConstPhysReg also returns true, it causes this instruction to be CSEed with previous instruction
%14:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
So I got different result from here. AMDGPU's implementation of isIgnorableUse is
bool SIInstrInfo::isIgnorableUse(const MachineOperand &MO) const { // Any implicit use of exec by VALU is not a real register read. return MO.getReg() == AMDGPU::EXEC && MO.isImplicit() && isVALU(*MO.getParent()) && !resultDependsOnExec(*MO.getParent()); }
Since the operand $exec is not a real register read, my understanding is it's reasonable to do CSE on such instructions.
Because more instructions are CSEed, so I get less instructions generated for these tests.
Differential Revision: https://reviews.llvm.org/D137222
show more ...
|
Revision tags: llvmorg-15.0.4 |
|
#
88ac25b3 |
| 27-Oct-2022 |
John Brawn <john.brawn@arm.com> |
[MachineCSE] Allow PRE of instructions that read physical registers
Currently MachineCSE forbids PRE when the instruction reads a physical register. Relax this so that it's allowed when the value be
[MachineCSE] Allow PRE of instructions that read physical registers
Currently MachineCSE forbids PRE when the instruction reads a physical register. Relax this so that it's allowed when the value being read is the same as what would be read in the place the instruction would be hoisted to.
This is being done in preparation for adding FPCR handling to the AArch64 backend, in order to prevent it to from worsening the generated code, but for targets that already have a similar register it should improve things.
This patch affects code generation in several tests. The new code looks better except for in Thumb2/LowOverheadLoops/memcall.ll where we perform PRE but the LowOverheadLoops transformation then undoes it. Also in AMDGPU/selectcc-opt.ll the CHECK makes things look worse, but actually the function as a whole is better (as a MOV is PRE'd).
Differential Revision: https://reviews.llvm.org/D136675
show more ...
|
#
7a7b36e9 |
| 28-Oct-2022 |
John Brawn <john.brawn@arm.com> |
Revert "[MachineCSE] Allow PRE of instructions that read physical registers"
This reverts commit 628467e53f4ceecd2b5f0797f07591c66d9d9d2a.
This is causing a miscompile in ffmpeg when compiled for a
Revert "[MachineCSE] Allow PRE of instructions that read physical registers"
This reverts commit 628467e53f4ceecd2b5f0797f07591c66d9d9d2a.
This is causing a miscompile in ffmpeg when compiled for armv7.
show more ...
|
#
628467e5 |
| 27-Oct-2022 |
John Brawn <john.brawn@arm.com> |
[MachineCSE] Allow PRE of instructions that read physical registers
Currently MachineCSE forbids PRE when the instruction reads a physical register. Relax this so that it's allowed when the value be
[MachineCSE] Allow PRE of instructions that read physical registers
Currently MachineCSE forbids PRE when the instruction reads a physical register. Relax this so that it's allowed when the value being read is the same as what would be read in the place the instruction would be hoisted to.
This is being done in preparation for adding FPCR handling to the AArch64 backend, in order to prevent it to from worsening the generated code, but for targets that already have a similar register it should improve things.
This patch affects code generation in several tests. The new code looks better except for in Thumb2/LowOverheadLoops/memcall.ll where we perform PRE but the LowOverheadLoops transformation then undoes it. Also in AMDGPU/selectcc-opt.ll the CHECK makes things look worse, but actually the function as a whole is better (as a MOV is PRE'd).
Differential Revision: https://reviews.llvm.org/D136675
show more ...
|
Revision tags: llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1 |
|
#
59365f33 |
| 16-Sep-2022 |
Pengxuan Zheng <pzheng@quicinc.com> |
[MachineCSE] Add a threshold to avoid spending too much time in isProfitableToCSE
Currently, it can become extremely costly to compute MayIncreasePressure if the size of CSUses turns out to be very
[MachineCSE] Add a threshold to avoid spending too much time in isProfitableToCSE
Currently, it can become extremely costly to compute MayIncreasePressure if the size of CSUses turns out to be very large. In that case, it's no longer cost effective to keep computing MayIncreasePressure. Therefore, to limit the amount of time spent in isProfitableToCSE, we simply conservatively assume MayIncreasePressure if the size of CSUses is too large. This can reduce overall compile time by 30% for some benchmarks.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D134003
show more ...
|
Revision tags: llvmorg-15.0.0 |
|
#
77dbc520 |
| 31-Aug-2022 |
Craig Topper <craig.topper@sifive.com> |
[MachineCSE] Use TargetInstrInfo::isAsCheapAsAMove in isPRECandidate.
Some targets like RISC-V require operands to be inspected to determine if an instruction is similar to a move.
Spotted while in
[MachineCSE] Use TargetInstrInfo::isAsCheapAsAMove in isPRECandidate.
Some targets like RISC-V require operands to be inspected to determine if an instruction is similar to a move.
Spotted while investigating code differences between using an ADDI vs an ADDIW. RISC-V has the isAsCheapAsAMove flag for ADDI, but the TII hook checks the immediate is 0 or the register is X0. ADDIW is never generated with X0 or with an immediate of 0 so it doesn't have the isAsCheapAsAMove flag.
I don't know enough about the PRE code to write a test for this yet.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D132981
show more ...
|
Revision tags: llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
#
8d0383eb |
| 24-Jun-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
CodeGen: Remove AliasAnalysis from regalloc
This was stored in LiveIntervals, but not actually used for anything related to LiveIntervals. It was only used in one check for if a load instruction is
CodeGen: Remove AliasAnalysis from regalloc
This was stored in LiveIntervals, but not actually used for anything related to LiveIntervals. It was only used in one check for if a load instruction is rematerializable. I also don't think this was entirely correct, since it was implicitly assuming constant loads are also dereferenceable.
Remove this and rely only on the invariant+dereferenceable flags in the memory operand. Set the flag based on the AA query upfront. This should have the same net benefit, but has the possible disadvantage of making this AA query nonlazy.
Preserve the behavior of assuming pointsToConstantMemory implying dereferenceable for now, but maybe this should be changed.
show more ...
|