History log of /llvm-project/llvm/lib/CodeGen/MachineCombiner.cpp (Results 51 – 75 of 110)
# a0cd09d4 16-Mar-2018 Andrew V. Tischenko <andrew.v.tischenko@gmail.com>

This patch fixes the invalid usage of OptSize in Machine Combiner.
Differential Revision: https://reviews.llvm.org/D43813

llvm-svn: 327721


Revision tags: llvmorg-6.0.0
# 08389192 26-Feb-2018 Andrew V. Tischenko <andrew.v.tischenko@gmail.com>

The final step to close D41278 [MachineCombiner] Improve debug output (NFC).
Differential Revision: https://reviews.llvm.org/D41278

llvm-svn: 326074


Revision tags: llvmorg-6.0.0-rc3
# b65b078d 15-Feb-2018 Andrew V. Tischenko <andrew.v.tischenko@gmail.com>

(NFC)[MachineCombiner] Improve debug output.

llvm-svn: 325217


Revision tags: llvmorg-6.0.0-rc2
# 6805004c 06-Feb-2018 Alexander Ivchenko <alexander.ivchenko@intel.com>

Fix unused variable warning in release mode. NFC.

llvm-svn: 324330


# c68428b5 31-Jan-2018 Florian Hahn <florian.hahn@arm.com>

[MachineCombiner] Add check for optimal pattern order.

In D41587, @mssimpso discovered that the order of some patterns for
AArch64 was sub-optimal. I thought a bit about how we could avoid that
case in the future. I do not think there is a need for evaluating all
patterns for now. But this patch adds an extra (expensive) check, that
evaluates the latencies of all patterns, and ensures that the latency
saved decreases for subsequent patterns.

This catches the sub-optimal order fixed in D41587, but I am not
entirely happy with the check, as it only applies to sub-optimal
patterns seen while building with EXPENSIVE_CHECKS on. It did not
discover any other sub-optimal pattern ordering.
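The ordering property described above can be sketched in isolation. This is only an illustration under assumptions, not the code from the patch: `PatternCost` and `isOrderedBySavings` are hypothetical names, and the latencies are plain integers rather than values taken from a scheduling model.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical record (not an LLVM type): latency of the original
// sequence and of the replacement sequence for one candidate pattern.
struct PatternCost {
  long RootLatency;
  long NewRootLatency;
};

// Returns true if the latency saved never increases from one pattern to
// the next, i.e. the most profitable patterns are listed first.
bool isOrderedBySavings(const std::vector<PatternCost> &Patterns) {
  long PrevSaved = std::numeric_limits<long>::max();
  for (const PatternCost &P : Patterns) {
    assert(P.RootLatency >= 0 && P.NewRootLatency >= 0 &&
           "latencies are expected to be non-negative");
    long Saved = P.RootLatency - P.NewRootLatency;
    if (Saved > PrevSaved)
      return false; // a later pattern saves more than an earlier one
    PrevSaved = Saved;
  }
  return true;
}
```

A list whose savings are 6, 3, 1 passes the check, while a list whose savings are 3, 6 fails it, which is the kind of sub-optimal order the message refers to.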

Reviewers: Gerolf, spatel, mssimpso

Reviewed By: Gerolf, mssimpso

Differential Revision: https://reviews.llvm.org/D41766

llvm-svn: 323873


Revision tags: llvmorg-6.0.0-rc1
# f1caa283 15-Dec-2017 Matthias Braun <matze@braunis.de>

MachineFunction: Return reference from getFunction(); NFC

The Function can never be nullptr so we can return a reference.

llvm-svn: 320884


# c468b648 13-Dec-2017 Michael Zolotukhin <mzolotukhin@apple.com>

Remove redundant includes from lib/CodeGen.

llvm-svn: 320619


Revision tags: llvmorg-5.0.1, llvmorg-5.0.1-rc3
# 001c3dd2 06-Dec-2017 Florian Hahn <florian.hahn@arm.com>

[MachineCombiner] Add up latencies of all instructions in new pattern.

Summary:
When calculating the RootLatency, we add up all the latencies of the
deleted instructions. But for NewRootLatency we only add the latency of
the new root instructions, ignoring the latencies of the other
instructions inserted. This leads the combiner to underestimate the cost
of patterns which add multiple instructions. This patch fixes that by
summing up the latencies of all new instructions. For NewRootNode, the
more complex getLatency function is used.

Note that we may be slightly more precise than just summing up
all latencies. For example, consider a pattern like

r1 = INS1 ..
r2 = INS2 ..
r3 = INS3 r1, r2

I think in some other places, the total latency of the pattern would be
estimated as lat(INS3) + max(lat(INS1), lat(INS2)). If you consider
that worth changing, I think it would be best to do in a follow-up
patch.
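The two estimates discussed above can be compared with a small sketch. It assumes a simplified model in which each instruction has a fixed latency and lists the earlier instructions it reads; `PatternInstr`, `sumLatency` and `criticalPathLatency` are hypothetical names and do not correspond to the MachineCombiner implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one instruction of a pattern (not an LLVM type).
struct PatternInstr {
  unsigned Latency;                  // latency in cycles
  std::vector<std::size_t> Operands; // indices of earlier instructions it reads
};

// Conservative estimate: add up the latencies of all instructions.
unsigned sumLatency(const std::vector<PatternInstr> &Pattern) {
  unsigned Total = 0;
  for (const PatternInstr &I : Pattern)
    Total += I.Latency;
  return Total;
}

// Tighter estimate: length of the longest dependency chain.
unsigned criticalPathLatency(const std::vector<PatternInstr> &Pattern) {
  std::vector<unsigned> Finish(Pattern.size(), 0);
  unsigned Longest = 0;
  for (std::size_t Idx = 0; Idx < Pattern.size(); ++Idx) {
    unsigned Start = 0;
    for (std::size_t Op : Pattern[Idx].Operands) {
      assert(Op < Idx && "operands must precede their user");
      Start = std::max(Start, Finish[Op]);
    }
    Finish[Idx] = Start + Pattern[Idx].Latency;
    Longest = std::max(Longest, Finish[Idx]);
  }
  return Longest;
}
```

With assumed latencies of 3, 4 and 2 for INS1, INS2 and INS3, the sum is 9 while the critical path is 2 + max(3, 4) = 6.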

Reviewers: Gerolf, sebpop, spop, fhahn

Reviewed By: fhahn

Subscribers: evandro, llvm-commits

Differential Revision: https://reviews.llvm.org/D40307

llvm-svn: 319951


Revision tags: llvmorg-5.0.1-rc2
# b3bde2ea 17-Nov-2017 David Blaikie <dblaikie@gmail.com>

Fix a bunch more layering of CodeGen headers that are in Target

All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

llvm-svn: 318490


# 3f833edc 08-Nov-2017 David Blaikie <dblaikie@gmail.com>

Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering

This header includes CodeGen headers, and is not, itself, included by
any Target headers, so move it into CodeGen to match the layering of its
implementation.

llvm-svn: 317647


Revision tags: llvmorg-5.0.1-rc1
# 194693e9 30-Oct-2017 Simon Pilgrim <llvm-dev@redking.me.uk>

[MC] Split out register def/use idx calls to make debugging simpler. NFCI.

llvm-svn: 316927


# e52abba2 11-Oct-2017 Florian Hahn <florian.hahn@arm.com>

[MachineCombiner] Fix initialisation of LastUpdate for incremental update.

Summary:
Fixes a bogus iterator resulting from the removal of a block's first instruction at the point that incremental update is enabled.

Patch by Paul Walker.

Reviewers: fhahn, Gerolf, efriedma, MatzeB

Reviewed By: fhahn

Subscribers: aemerson, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D38734

llvm-svn: 315502


# ceb44947 20-Sep-2017 Florian Hahn <florian.hahn@arm.com>

Recommit [MachineCombiner] Update instruction depths incrementally for large BBs.

This version of the patch fixes an off-by-one error causing PR34596. We
do not need to use std::next(BlockIter) when calling updateDepths, as
BlockIter already points to the next element.

Original commit message:
> For large basic blocks with lots of combinable instructions, the
> MachineTraceMetrics computations in MachineCombiner can dominate the compile
> time, as computing the trace information is quadratic in the number of
> instructions in a BB and its relevant successors/predecessors.
>
> In most cases, knowing the instruction depth should be enough to make
> combination decisions. As we already iterate over all instructions in a basic
> block, the instruction depth can be computed incrementally. This reduces the
> cost of machine-combine drastically in cases where lots of instructions
> are combined. The major drawback is that AFAIK, computing the critical path
> length cannot be done incrementally. Therefore we only compute
> instruction depths incrementally, for basic blocks with more
> instructions than inc_threshold. The -machine-combiner-inc-threshold
> option can be used to set the threshold and allows for easier
> experimenting and checking if using incremental updates for all basic
> blocks has any impact on the performance.
>
> Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn
>
> Reviewed By: fhahn
>
> Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D36619

llvm-svn: 313751


# 06e2a384 13-Sep-2017 Hans Wennborg <hans@hanshq.net>

Revert r312719 "[MachineCombiner] Update instruction depths incrementally for large BBs."

This caused PR34596.

> [MachineCombiner] Update instruction depths incrementally for large BBs.
>
> Summary:
> For large basic blocks with lots of combinable instructions, the
> MachineTraceMetrics computations in MachineCombiner can dominate the compile
> time, as computing the trace information is quadratic in the number of
> instructions in a BB and its relevant successors/predecessors.
>
> In most cases, knowing the instruction depth should be enough to make
> combination decisions. As we already iterate over all instructions in a basic
> block, the instruction depth can be computed incrementally. This reduces the
> cost of machine-combine drastically in cases where lots of instructions
> are combined. The major drawback is that AFAIK, computing the critical path
> length cannot be done incrementally. Therefore we only compute
> instruction depths incrementally, for basic blocks with more
> instructions than inc_threshold. The -machine-combiner-inc-threshold
> option can be used to set the threshold and allows for easier
> experimenting and checking if using incremental updates for all basic
> blocks has any impact on the performance.
>
> Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn
>
> Reviewed By: fhahn
>
> Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D36619

llvm-svn: 313213


# d39b8a35 07-Sep-2017 Florian Hahn <florian.hahn@arm.com>

[MachineCombiner] Update instruction depths incrementally for large BBs.

Summary:
For large basic blocks with lots of combinable instructions, the
MachineTraceMetrics computations in MachineCombiner can dominate the compile
time, as computing the trace information is quadratic in the number of
instructions in a BB and its relevant successors/predecessors.

In most cases, knowing the instruction depth should be enough to make
combination decisions. As we already iterate over all instructions in a basic
block, the instruction depth can be computed incrementally. This reduces the
cost of machine-combine drastically in cases where lots of instructions
are combined. The major drawback is that AFAIK, computing the critical path
length cannot be done incrementally. Therefore we only compute
instruction depths incrementally, for basic blocks with more
instructions than inc_threshold. The -machine-combiner-inc-threshold
option can be used to set the threshold and allows for easier
experimenting and checking if using incremental updates for all basic
blocks has any impact on the performance.
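The idea of computing depths incrementally can be sketched as follows. This is a simplified model and not the MachineCombiner code: `Instr` and `extendDepths` are hypothetical names, and the depth of an instruction is assumed to be the latest point at which any of its operands becomes available.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an instruction in a basic block (not an LLVM type).
struct Instr {
  unsigned Latency;              // latency in cycles
  std::vector<std::size_t> Defs; // indices of earlier instructions it depends on
};

// Extends Depths so that it covers Block[0..End). Entries that were already
// computed are reused, so each instruction is visited once overall instead
// of recomputing the whole block after every change.
void extendDepths(const std::vector<Instr> &Block,
                  std::vector<unsigned> &Depths, std::size_t End) {
  assert(End <= Block.size() && "End must lie within the block");
  for (std::size_t I = Depths.size(); I < End; ++I) {
    unsigned Depth = 0;
    for (std::size_t D : Block[I].Defs) {
      assert(D < I && "definitions must precede their uses");
      Depth = std::max(Depth, Depths[D] + Block[D].Latency);
    }
    Depths.push_back(Depth);
  }
}
```

Calling `extendDepths` first for a prefix of the block and later for a longer prefix only computes the newly covered entries, which is the property that keeps the cost linear in this sketch.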

Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn

Reviewed By: fhahn

Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D36619

llvm-svn: 312719


Revision tags: llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1
# 1d2dc681 13-Jul-2017 Jakub Kuderski <kubakuderski@gmail.com>

[NFC] Move DEBUG_TYPE macro below includes...

in MachineCombiner.cpp.

llvm-svn: 307940


Revision tags: llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2
# 1527baab 25-May-2017 Matthias Braun <matze@braunis.de>

CodeGen: Rename DEBUG_TYPE to match passnames

Rename the DEBUG_TYPE to match the names of corresponding passes where
it makes sense. Also establish the pattern of simply referencing
DEBUG_TYPE instead of repeating the passname where possible.

llvm-svn: 303921


Revision tags: llvmorg-4.0.1-rc1
# 17ce8a2f 15-Mar-2017 Eric Christopher <echristo@gmail.com>

Fix up grammar in a comment.

llvm-svn: 297898


Revision tags: llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3
# 8da96914 13-Feb-2017 Andrew V. Tischenko <andrew.v.tischenko@gmail.com>

Decrease compile time in the Machine Combiner.
Before this patch, compile time was about 21s (see below). After this patch,
it is less than 2s (see below).

Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz

DAGCombiner - trunk
time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math
real 0m1.685s

DAGCombiner + Speed patch
time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math
real 0m1.655s

MachineCombiner w/o Speed patch
time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math
real 0m21.614s

MachineCombiner + Speed patch
time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math
real 0m1.593s

The test spill_fdiv.ll is attached to D29627
D29627 should be closed.

llvm-svn: 294936


Revision tags: llvmorg-4.0.0-rc2
# a4976c61 29-Jan-2017 Matthias Braun <matze@braunis.de>

MachineInstr: Remove parameter from dump()

The primary use of the dump() functions in LLVM is for use in a
debugger. Unfortunately lldb does not seem to handle default arguments
so using `p SomeMI.dump()` fails and you have to type the longer `p
SomeMI.dump(nullptr)`. Remove the parameter to make the most common use
easy. (You can always construct something like `p
SomeMI.print(dbgs(),MyTII)` if you need more features).

Differential Revision: https://reviews.llvm.org/D29241

llvm-svn: 293440


Revision tags: llvmorg-4.0.0-rc1
# 77794843 21-Dec-2016 Sebastian Pop <sebpop@gmail.com>

machine combiner: fix pretty printer

We used to print UNKNOWN instructions when the instruction to be printed was not
yet inserted in any BB: in that case the pretty printer would not be able to
compute a TII as the instruction does not belong to any BB or function yet.
This patch explicitly passes the TII to the pretty-printer.

Differential Revision: https://reviews.llvm.org/D27645

llvm-svn: 290228


# e08d9c7c 11-Dec-2016 Sebastian Pop <sebpop@gmail.com>

instr-combiner: sum up all latencies of the transformed instructions

We have found that -- when the selected subarchitecture has a scheduling model
and we are not optimizing for size -- the machine-instruction combiner uses a
too-simple algorithm to compute the cost of one of the two alternatives [before
and after running a combining pass on a section of code], and therefore it throws
away the combination results too often.

This fix has the potential to help any ISA with the potential to combine
instructions and for which at least one subarchitecture has a scheduling model.
As of now, this is only known to definitely affect AArch64 subarchitectures with
a scheduling model.

Regression tested on AMD64/GNU-Linux, new test case tested to fail on an
unpatched compiler and pass on a patched compiler.

Patch by Abe Skolnik and Sebastian Pop.

llvm-svn: 289399


Revision tags: llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1
# 117296c0 01-Oct-2016 Mehdi Amini <mehdi.amini@apple.com>

Use StringRef in Pass/PassManager APIs (NFC)

llvm-svn: 283004


Revision tags: llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2, llvmorg-3.9.0-rc1, llvmorg-3.8.1, llvmorg-3.8.1-rc1
# 01b3a618 24-Apr-2016 Gerolf Hoflehner <ghoflehner@apple.com>

[MachineCombiner] Support for floating-point FMA on ARM64 (re-commit r267098)

The original patch caused crashes because it could dereference a null pointer
for SelectionDAGTargetInfo for targets that do not define it.

Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
machine combiner. It adds support for float and double similar to the existing
integer implementation. The key features are:

- DAGCombiner checks whether it should combine greedily or let the machine
combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is
to combine always in loops. The target decides whether the machine
combiner should optimize for throughput or latency.
- Support for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math

llvm-svn: 267328


# 591c3795 22-Apr-2016 Daniel Sanders <daniel.sanders@imgtec.com>

Revert r267098 - [MachineCombiner] Support for floating-point FMA on ARM64

It introduced buildbot failures on clang-cmake-mips, clang-ppc64le-linux, among others.

llvm-svn: 267127

