Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6 |
|
#
a35db288 |
| 16-Dec-2024 |
David Green <david.green@arm.com> |
[NFC] Remove some unnecessary semicolons
All inside LLVM_DEBUG, some of which have been cleaned up by adding block scopes to allow them to format more nicely.
|
Revision tags: llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3 |
|
#
6ab26eab |
| 28-Oct-2024 |
Ellis Hoag <ellis.sparky.hoag@gmail.com> |
Check hasOptSize() in shouldOptimizeForSize() (#112626)
|
#
732b804e |
| 16-Oct-2024 |
Christudasan Devadasan <christudasan.devadasan@amd.com> |
[CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (#108507)
|
Revision tags: llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
79d0de2a |
| 09-Jul-2024 |
paperchalice <liujunchang97@outlook.com> |
[CodeGen][NewPM] Port `machine-loops` to new pass manager (#97793)
- Add `MachineLoopAnalysis`.
- Add `MachineLoopPrinterPass`.
- Convert to `MachineLoopInfoWrapperPass` in legacy pass manager.
|
Revision tags: llvmorg-18.1.8 |
|
#
837dc542 |
| 11-Jun-2024 |
paperchalice <liujunchang97@outlook.com> |
[CodeGen][NewPM] Split `MachineDominatorTree` into a concrete analysis result (#94571)
Prepare for new pass manager version of `MachineDominatorTreeAnalysis`.
We may need a machine dominator tree v
[CodeGen][NewPM] Split `MachineDominatorTree` into a concrete analysis result (#94571)
Prepare for new pass manager version of `MachineDominatorTreeAnalysis`.
We may need a machine dominator tree version of `DomTreeUpdater` to
handle `SplitCriticalEdge` in some CodeGen passes.
show more ...
|
Revision tags: llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5 |
|
#
f6d431f2 |
| 24-Apr-2024 |
Xu Zhang <simonzgx@gmail.com> |
[CodeGen] Make the parameter TRI required in some functions. (#85968)
Fixes #82659
There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many
[CodeGen] Make the parameter TRI required in some functions. (#85968)
Fixes #82659
There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue #82411.
Following @RKSimon 's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact.
After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
show more ...
|
Revision tags: llvmorg-18.1.4 |
|
#
b5640369 |
| 11-Apr-2024 |
Pengcheng Wang <wangpengcheng.pp@bytedance.com> |
[MachineCombiner][NFC] Split target-dependent patterns
We split target-dependent MachineCombiner patterns into their target folder.
This makes MachineCombiner much more target-independent.
Reviewe
[MachineCombiner][NFC] Split target-dependent patterns
We split target-dependent MachineCombiner patterns into their target folder.
This makes MachineCombiner much more target-independent.
Reviewers: davemgreen, asavonic, rotateright, RKSimon, lukel97, LuoYuanke, topperc, mshockwave, asi-sc
Reviewed By: topperc, mshockwave
Pull Request: https://github.com/llvm/llvm-project/pull/87991
show more ...
|
Revision tags: llvmorg-18.1.3, llvmorg-18.1.2 |
|
#
6588ac30 |
| 14-Mar-2024 |
Jonas Paulsson <paulson1@linux.ibm.com> |
[MachineCombiner] Don't ignore PHI depths (#82025)
The depths of the Root and the NewRoot are to be compared in
MachineCombiner::improvesCriticalPathLen(), and while the call to
BlockTrace.getInst
[MachineCombiner] Don't ignore PHI depths (#82025)
The depths of the Root and the NewRoot are to be compared in
MachineCombiner::improvesCriticalPathLen(), and while the call to
BlockTrace.getInstrCycles(*Root) includes the Depth of a PHI, for some
reason PHI nodes have been ignored in getOperandDef().
This patch removes the special handling of PHIs in getOperandDef() so that
Root and NewRoot get a fair comparison. This does not affect loop headers
as MachineTraceMetrics handles that case by ignoring incoming PHI edges.
show more ...
|
Revision tags: llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5 |
|
#
5022fc2a |
| 24-May-2023 |
Jay Foad <jay.foad@amd.com> |
[CodeGen] Make use of MachineInstr::all_defs and all_uses. NFCI.
Differential Revision: https://reviews.llvm.org/D151424
|
Revision tags: llvmorg-16.0.4 |
|
#
b0fb9822 |
| 05-May-2023 |
Luo, Yuanke <yuanke.luo@intel.com> |
[Coverity] Big parameter passed by value.
|
Revision tags: llvmorg-16.0.3 |
|
#
40222ddc |
| 29-Apr-2023 |
Luo, Yuanke <yuanke.luo@intel.com> |
[X86] Fix the vnni machine combine issue.
The previous patch (D148980) didn't set the InstrIdxForVirtReg correctly in genAlternativeDpCodeSequence(). It causes vnni lit test failure when LLVM_ENABLE
[X86] Fix the vnni machine combine issue.
The previous patch (D148980) didn't set the InstrIdxForVirtReg correctly in genAlternativeDpCodeSequence(). It causes vnni lit test failure when LLVM_ENABLE_EXPENSIVE_CHECKS is on.
show more ...
|
#
8f7f9d86 |
| 21-Apr-2023 |
Luo, Yuanke <yuanke.luo@intel.com> |
[X86] Machine combine vnni instruction.
"vpmaddwd + vpaddd" can be combined to vpdpwssd and the latency is reduced after combination. However when vpdpwssd is in a critical path the combination get
[X86] Machine combine vnni instruction.
"vpmaddwd + vpaddd" can be combined to vpdpwssd and the latency is reduced after combination. However when vpdpwssd is in a critical path the combination get less ILP. It happens when vpdpwssd is in a loop, the vpmaddwd can be executed in parallel in multi-iterations while vpdpwssd has data dependency for each iterations. If vpaddd is in a critical path while vpmaddwd is not, it is profitable to split vpdpwssd into "vpmaddwd + vpaddd". This patch is based on the machine combiner framework to acheive decision on "vpmaddwd + vpaddd" combination. The typical example code is as below. ``` __m256i foo(int cnt, __m256i c, __m256i b, __m256i *p) {
for (int i = 0; i < cnt; ++i) { __m256i a = p[i]; __m256i m = _mm256_madd_epi16 (b, a); c = _mm256_add_epi32(m, c); }
return c; } ```
Differential Revision: https://reviews.llvm.org/D148980
show more ...
|
#
43b38696 |
| 20-Apr-2023 |
Akshay Khadse <akshayskhadse@gmail.com> |
Fix uninitialized class members
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D148692
|
Revision tags: llvmorg-16.0.2 |
|
#
8bf7f86d |
| 17-Apr-2023 |
Akshay Khadse <akshayskhadse@gmail.com> |
Fix uninitialized pointer members in CodeGen
This change initializes the members TSI, LI, DT, PSI, and ORE pointer feilds of the SelectOptimize class to nullptr.
Reviewed By: LuoYuanke
Differentia
Fix uninitialized pointer members in CodeGen
This change initializes the members TSI, LI, DT, PSI, and ORE pointer feilds of the SelectOptimize class to nullptr.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D148303
show more ...
|
Revision tags: llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3 |
|
#
2693efa8 |
| 14-Feb-2023 |
Anton Sidorenko <anton.sidorenko@syntacore.com> |
[MachineCombiner] Support local strategy for traces
For in-order cores MachineCombiner makes better decisions when the critical path is calculated only for the current basic block and does not take
[MachineCombiner] Support local strategy for traces
For in-order cores MachineCombiner makes better decisions when the critical path is calculated only for the current basic block and does not take into account other blocks from the trace.
This patch adds a virtual method to TargetInstrInfo to allow each target decide which strategy to use.
Depends on D140541
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D140542
show more ...
|
#
5bdd0bee |
| 14-Feb-2023 |
Anton Sidorenko <anton.sidorenko@syntacore.com> |
[MachineCombiner][NFC] Rename `MinInstr` to `TraceEnsemble`
We are about to allow different trace strategies for MachineCombiner. Make the name of the ensemble strategy-neutral.
Depends on D140540
[MachineCombiner][NFC] Rename `MinInstr` to `TraceEnsemble`
We are about to allow different trace strategies for MachineCombiner. Make the name of the ensemble strategy-neutral.
Depends on D140540
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D140541
show more ...
|
#
77bd15ae |
| 14-Feb-2023 |
Anton Sidorenko <anton.sidorenko@syntacore.com> |
[MachineTraceMetrics][NFC] Move Strategy enum out of the class
Make forward declaration possible to reduce amount of dependencies and reduce re-compilation burden caused by further patches.
Reviewe
[MachineTraceMetrics][NFC] Move Strategy enum out of the class
Make forward declaration possible to reduce amount of dependencies and reduce re-compilation burden caused by further patches.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D140539
show more ...
|
Revision tags: llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init |
|
#
86eff6be |
| 20-Jan-2023 |
Philip Reames <preames@rivosinc.com> |
[MachineCombiner] Use default latency model when no detailed model available
This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction lat
[MachineCombiner] Use default latency model when no detailed model available
This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction latencies. After this change, we use the default latency information available from TargetSchedule. The default latency information essentially ends up treating most instructions as latency 1, with a few "expensive" ones getting a higher cost.
Previously, we unconditionally applied the first legal pattern - without any consideration of profitability. As a result, this change both prevents some patterns being applied, and changes which patterns are exercised. (i.e. previously the first pattern was applied, afterwards, maybe the second one is because the first wasn't profitable.)
The motivation here is two fold.
First, this brings the default behavior in line with the behavior when -mcpu or -mtune is specified. This improves test coverage, and generally makes it less likely we will have bad surprises when providing more information to the compiler.
Second, this enables some reassociation for ILP by default. Despite being unconditionally enabled, the prior code tended to "reassociate" repeatedly through an entire chain and simply moving the first operand to the end. The result was still a serial chain, just a different one. With this change, one of the intermediate transforms is unprofitable and we end up with a partially flattened tree.
Note that the resulting code diffs show significant room for improvement in the basic algorithm. I am intentionally excluding those from this patch.
For the test diffs, I don't seen any concerning regressions. I took a fairly close look at the RISCV ones, but only skimmed the x86 (particularly vector x86) changes.
Differential Revision: https://reviews.llvm.org/D141017
show more ...
|
#
e72ca520 |
| 13-Jan-2023 |
Craig Topper <craig.topper@sifive.com> |
[CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFC
Use isPhysical/isVirtual methods.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D141715
|
Revision tags: llvmorg-15.0.7 |
|
#
38818b60 |
| 04-Jan-2023 |
serge-sans-paille <sguelton@mozilla.com> |
Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part
Use deduction guides instead of helper functions.
The only non-automatic changes have been:
1. ArrayRef(some_uint8_pointer, 0
Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part
Use deduction guides instead of helper functions.
The only non-automatic changes have been:
1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t*), (uint8_t*)) 2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There was a few similar situation across the codebase. 3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated. 4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that).
Per reviewers' comment, some useless makeArrayRef have been removed in the process.
This is a follow-up to https://reviews.llvm.org/D140896 that introduced the deduction guides.
Differential Revision: https://reviews.llvm.org/D140955
show more ...
|
#
9560ac3a |
| 04-Jan-2023 |
Philip Reames <preames@rivosinc.com> |
[MachineCombine] Reorganize code for readability and tracing [nfc]
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3 |
|
#
b6c79073 |
| 11-Oct-2022 |
Anton Sidorenko <anton.sidorenko@syntacore.com> |
[MachineCombiner][RISCV] Add fmadd/fmsub/fnmsub instructions patterns
This patch adds tranformation of fmul+fadd/fsub chains to fused multiply instructions: * fmul+fadd->fmadd * fmul+fsub->fmsub
[MachineCombiner][RISCV] Add fmadd/fmsub/fnmsub instructions patterns
This patch adds tranformation of fmul+fadd/fsub chains to fused multiply instructions: * fmul+fadd->fmadd * fmul+fsub->fmsub/fnmsub
We also will try to combine these instructions if the fmul has more than one use and cannot be deleted. However, removing the dependence between fmul and fadd can still be profitable, and we rely on machine combiner approximations of scheduling.
Differential Revision: https://reviews.llvm.org/D136764
show more ...
|
#
4431e705 |
| 13-Oct-2022 |
Anton Sidorenko <anton.sidorenko@syntacore.com> |
[NFC] Use forward decl of MachineCombinerPattern enum to reduce dependencies
Differential Revision: https://reviews.llvm.org/D135776
|
Revision tags: working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
#
9e6d1f4b |
| 17-Jul-2022 |
Kazu Hirata <kazu@google.com> |
[CodeGen] Qualify auto variables in for loops (NFC)
|
#
2f11b3a6 |
| 14-Jul-2022 |
Guozhi Wei <carrot@google.com> |
[MachineCombiner] Don't compute the latency of transient instructions
If an MI will not generate a target instruction, we should not compute its latency. Then we can compute more precise instruction
[MachineCombiner] Don't compute the latency of transient instructions
If an MI will not generate a target instruction, we should not compute its latency. Then we can compute more precise instruction sequence cost, and get better result.
Differential Revision: https://reviews.llvm.org/D129615
show more ...
|