History log of /llvm-project/llvm/lib/CodeGen/MachineCombiner.cpp (Results 1 – 25 of 110)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6
# a35db288 16-Dec-2024 David Green <david.green@arm.com>

[NFC] Remove some unnecessary semicolons

All inside LLVM_DEBUG, some of which have been cleaned up by adding block
scopes to allow them to format more nicely.


Revision tags: llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3
# 6ab26eab 28-Oct-2024 Ellis Hoag <ellis.sparky.hoag@gmail.com>

Check hasOptSize() in shouldOptimizeForSize() (#112626)


# 732b804e 16-Oct-2024 Christudasan Devadasan <christudasan.devadasan@amd.com>

[CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (#108507)


Revision tags: llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 79d0de2a 09-Jul-2024 paperchalice <liujunchang97@outlook.com>

[CodeGen][NewPM] Port `machine-loops` to new pass manager (#97793)

- Add `MachineLoopAnalysis`.
- Add `MachineLoopPrinterPass`.
- Convert to `MachineLoopInfoWrapperPass` in legacy pass manager.


Revision tags: llvmorg-18.1.8
# 837dc542 11-Jun-2024 paperchalice <liujunchang97@outlook.com>

[CodeGen][NewPM] Split `MachineDominatorTree` into a concrete analysis result (#94571)

Prepare for new pass manager version of `MachineDominatorTreeAnalysis`.
We may need a machine dominator tree v

[CodeGen][NewPM] Split `MachineDominatorTree` into a concrete analysis result (#94571)

Prepare for new pass manager version of `MachineDominatorTreeAnalysis`.
We may need a machine dominator tree version of `DomTreeUpdater` to
handle `SplitCriticalEdge` in some CodeGen passes.

show more ...


Revision tags: llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5
# f6d431f2 24-Apr-2024 Xu Zhang <simonzgx@gmail.com>

[CodeGen] Make the parameter TRI required in some functions. (#85968)

Fixes #82659

There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many

[CodeGen] Make the parameter TRI required in some functions. (#85968)

Fixes #82659

There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue #82411.

Following @RKSimon 's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact.

After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.

show more ...


Revision tags: llvmorg-18.1.4
# b5640369 11-Apr-2024 Pengcheng Wang <wangpengcheng.pp@bytedance.com>

[MachineCombiner][NFC] Split target-dependent patterns

We split target-dependent MachineCombiner patterns into their target
folder.

This makes MachineCombiner much more target-independent.

Reviewe

[MachineCombiner][NFC] Split target-dependent patterns

We split target-dependent MachineCombiner patterns into their target
folder.

This makes MachineCombiner much more target-independent.

Reviewers:
davemgreen, asavonic, rotateright, RKSimon, lukel97, LuoYuanke, topperc, mshockwave, asi-sc

Reviewed By: topperc, mshockwave

Pull Request: https://github.com/llvm/llvm-project/pull/87991

show more ...


Revision tags: llvmorg-18.1.3, llvmorg-18.1.2
# 6588ac30 14-Mar-2024 Jonas Paulsson <paulson1@linux.ibm.com>

[MachineCombiner] Don't ignore PHI depths (#82025)

The depths of the Root and the NewRoot are to be compared in
MachineCombiner::improvesCriticalPathLen(), and while the call to
BlockTrace.getInst

[MachineCombiner] Don't ignore PHI depths (#82025)

The depths of the Root and the NewRoot are to be compared in
MachineCombiner::improvesCriticalPathLen(), and while the call to
BlockTrace.getInstrCycles(*Root) includes the Depth of a PHI, for some
reason PHI nodes have been ignored in getOperandDef().

This patch removes the special handling of PHIs in getOperandDef() so that
Root and NewRoot get a fair comparison. This does not affect loop headers
as MachineTraceMetrics handles that case by ignoring incoming PHI edges.

show more ...


Revision tags: llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5
# 5022fc2a 24-May-2023 Jay Foad <jay.foad@amd.com>

[CodeGen] Make use of MachineInstr::all_defs and all_uses. NFCI.

Differential Revision: https://reviews.llvm.org/D151424


Revision tags: llvmorg-16.0.4
# b0fb9822 05-May-2023 Luo, Yuanke <yuanke.luo@intel.com>

[Coverity] Big parameter passed by value.


Revision tags: llvmorg-16.0.3
# 40222ddc 29-Apr-2023 Luo, Yuanke <yuanke.luo@intel.com>

[X86] Fix the vnni machine combine issue.

The previous patch (D148980) didn't set the InstrIdxForVirtReg correctly
in genAlternativeDpCodeSequence(). It causes vnni lit test failure when
LLVM_ENABLE

[X86] Fix the vnni machine combine issue.

The previous patch (D148980) didn't set the InstrIdxForVirtReg correctly
in genAlternativeDpCodeSequence(). It causes vnni lit test failure when
LLVM_ENABLE_EXPENSIVE_CHECKS is on.

show more ...


# 8f7f9d86 21-Apr-2023 Luo, Yuanke <yuanke.luo@intel.com>

[X86] Machine combine vnni instruction.

"vpmaddwd + vpaddd" can be combined to vpdpwssd and the latency is
reduced after combination. However when vpdpwssd is in a critical path
the combination get

[X86] Machine combine vnni instruction.

"vpmaddwd + vpaddd" can be combined to vpdpwssd and the latency is
reduced after combination. However when vpdpwssd is in a critical path
the combination get less ILP. It happens when vpdpwssd is in a loop, the
vpmaddwd can be executed in parallel in multi-iterations while vpdpwssd
has data dependency for each iterations. If vpaddd is in a critical path
while vpmaddwd is not, it is profitable to split vpdpwssd into "vpmaddwd
+ vpaddd".
This patch is based on the machine combiner framework to acheive decision
on "vpmaddwd + vpaddd" combination. The typical example code is as
below.
```
__m256i foo(int cnt, __m256i c, __m256i b, __m256i *p) {

for (int i = 0; i < cnt; ++i) {
__m256i a = p[i];
__m256i m = _mm256_madd_epi16 (b, a);
c = _mm256_add_epi32(m, c);
}

return c;
}
```

Differential Revision: https://reviews.llvm.org/D148980

show more ...


# 43b38696 20-Apr-2023 Akshay Khadse <akshayskhadse@gmail.com>

Fix uninitialized class members

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D148692


Revision tags: llvmorg-16.0.2
# 8bf7f86d 17-Apr-2023 Akshay Khadse <akshayskhadse@gmail.com>

Fix uninitialized pointer members in CodeGen

This change initializes the members TSI, LI, DT, PSI, and ORE pointer feilds of the SelectOptimize class to nullptr.

Reviewed By: LuoYuanke

Differentia

Fix uninitialized pointer members in CodeGen

This change initializes the members TSI, LI, DT, PSI, and ORE pointer feilds of the SelectOptimize class to nullptr.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D148303

show more ...


Revision tags: llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3
# 2693efa8 14-Feb-2023 Anton Sidorenko <anton.sidorenko@syntacore.com>

[MachineCombiner] Support local strategy for traces

For in-order cores MachineCombiner makes better decisions when the critical path
is calculated only for the current basic block and does not take

[MachineCombiner] Support local strategy for traces

For in-order cores MachineCombiner makes better decisions when the critical path
is calculated only for the current basic block and does not take into account
other blocks from the trace.

This patch adds a virtual method to TargetInstrInfo to allow each target decide
which strategy to use.

Depends on D140541

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D140542

show more ...


# 5bdd0bee 14-Feb-2023 Anton Sidorenko <anton.sidorenko@syntacore.com>

[MachineCombiner][NFC] Rename `MinInstr` to `TraceEnsemble`

We are about to allow different trace strategies for MachineCombiner. Make
the name of the ensemble strategy-neutral.

Depends on D140540

[MachineCombiner][NFC] Rename `MinInstr` to `TraceEnsemble`

We are about to allow different trace strategies for MachineCombiner. Make
the name of the ensemble strategy-neutral.

Depends on D140540

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D140541

show more ...


# 77bd15ae 14-Feb-2023 Anton Sidorenko <anton.sidorenko@syntacore.com>

[MachineTraceMetrics][NFC] Move Strategy enum out of the class

Make forward declaration possible to reduce amount of dependencies and reduce
re-compilation burden caused by further patches.

Reviewe

[MachineTraceMetrics][NFC] Move Strategy enum out of the class

Make forward declaration possible to reduce amount of dependencies and reduce
re-compilation burden caused by further patches.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D140539

show more ...


Revision tags: llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init
# 86eff6be 20-Jan-2023 Philip Reames <preames@rivosinc.com>

[MachineCombiner] Use default latency model when no detailed model available

This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction lat

[MachineCombiner] Use default latency model when no detailed model available

This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction latencies. After this change, we use the default latency information available from TargetSchedule. The default latency information essentially ends up treating most instructions as latency 1, with a few "expensive" ones getting a higher cost.

Previously, we unconditionally applied the first legal pattern - without any consideration of profitability. As a result, this change both prevents some patterns being applied, and changes which patterns are exercised. (i.e. previously the first pattern was applied, afterwards, maybe the second one is because the first wasn't profitable.)

The motivation here is two fold.

First, this brings the default behavior in line with the behavior when -mcpu or -mtune is specified. This improves test coverage, and generally makes it less likely we will have bad surprises when providing more information to the compiler.

Second, this enables some reassociation for ILP by default. Despite being unconditionally enabled, the prior code tended to "reassociate" repeatedly through an entire chain and simply moving the first operand to the end. The result was still a serial chain, just a different one. With this change, one of the intermediate transforms is unprofitable and we end up with a partially flattened tree.

Note that the resulting code diffs show significant room for improvement in the basic algorithm. I am intentionally excluding those from this patch.

For the test diffs, I don't seen any concerning regressions. I took a fairly close look at the RISCV ones, but only skimmed the x86 (particularly vector x86) changes.

Differential Revision: https://reviews.llvm.org/D141017

show more ...


# e72ca520 13-Jan-2023 Craig Topper <craig.topper@sifive.com>

[CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFC

Use isPhysical/isVirtual methods.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D141715


Revision tags: llvmorg-15.0.7
# 38818b60 04-Jan-2023 serge-sans-paille <sguelton@mozilla.com>

Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part

Use deduction guides instead of helper functions.

The only non-automatic changes have been:

1. ArrayRef(some_uint8_pointer, 0

Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part

Use deduction guides instead of helper functions.

The only non-automatic changes have been:

1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t*), (uint8_t*))
2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There was a few similar situation across the codebase.
3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated.
4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that).

Per reviewers' comment, some useless makeArrayRef have been removed in the process.

This is a follow-up to https://reviews.llvm.org/D140896 that introduced
the deduction guides.

Differential Revision: https://reviews.llvm.org/D140955

show more ...


# 9560ac3a 04-Jan-2023 Philip Reames <preames@rivosinc.com>

[MachineCombine] Reorganize code for readability and tracing [nfc]


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3
# b6c79073 11-Oct-2022 Anton Sidorenko <anton.sidorenko@syntacore.com>

[MachineCombiner][RISCV] Add fmadd/fmsub/fnmsub instructions patterns

This patch adds tranformation of fmul+fadd/fsub chains to fused multiply
instructions:
* fmul+fadd->fmadd
* fmul+fsub->fmsub

[MachineCombiner][RISCV] Add fmadd/fmsub/fnmsub instructions patterns

This patch adds tranformation of fmul+fadd/fsub chains to fused multiply
instructions:
* fmul+fadd->fmadd
* fmul+fsub->fmsub/fnmsub

We also will try to combine these instructions if the fmul has more than one use
and cannot be deleted. However, removing the dependence between fmul and fadd can
still be profitable, and we rely on machine combiner approximations of scheduling.

Differential Revision: https://reviews.llvm.org/D136764

show more ...


# 4431e705 13-Oct-2022 Anton Sidorenko <anton.sidorenko@syntacore.com>

[NFC] Use forward decl of MachineCombinerPattern enum to reduce dependencies

Differential Revision: https://reviews.llvm.org/D135776


Revision tags: working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 9e6d1f4b 17-Jul-2022 Kazu Hirata <kazu@google.com>

[CodeGen] Qualify auto variables in for loops (NFC)


# 2f11b3a6 14-Jul-2022 Guozhi Wei <carrot@google.com>

[MachineCombiner] Don't compute the latency of transient instructions

If an MI will not generate a target instruction, we should not compute its
latency. Then we can compute more precise instruction

[MachineCombiner] Don't compute the latency of transient instructions

If an MI will not generate a target instruction, we should not compute its
latency. Then we can compute more precise instruction sequence cost, and get
better result.

Differential Revision: https://reviews.llvm.org/D129615

show more ...


12345