History log of /llvm-project/llvm/test/CodeGen/X86/umul-with-overflow.ll (Results 1 – 25 of 36)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3
# 7b3bbd83 09-Oct-2023 Jay Foad <jay.foad@amd.com>

Revert "[CodeGen] Really renumber slot indexes before register allocation (#67038)"

This reverts commit 2501ae58e3bb9a70d279a56d7b3a0ed70a8a852c.

Reverted due to various buildbot failures.


# 2501ae58 09-Oct-2023 Jay Foad <jay.foad@amd.com>

[CodeGen] Really renumber slot indexes before register allocation (#67038)

PR #66334 tried to renumber slot indexes before register allocation, but
the numbering was still affected by list entries

[CodeGen] Really renumber slot indexes before register allocation (#67038)

PR #66334 tried to renumber slot indexes before register allocation, but
the numbering was still affected by list entries for instructions which
had been erased. Fix this to make the register allocator's live range
length heuristics even less dependent on the history of how instructions
have been added to and removed from SlotIndexes's maps.

show more ...


Revision tags: llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0
# e0919b18 13-Sep-2023 Jay Foad <jay.foad@amd.com>

[CodeGen] Renumber slot indexes before register allocation (#66334)

RegAllocGreedy uses SlotIndexes::getApproxInstrDistance to approximate
the length of a live range for its heuristics. Renumbering

[CodeGen] Renumber slot indexes before register allocation (#66334)

RegAllocGreedy uses SlotIndexes::getApproxInstrDistance to approximate
the length of a live range for its heuristics. Renumbering all slot
indexes with the default instruction distance ensures that this estimate
will be as accurate as possible, and will not depend on the history of
how instructions have been added to and removed from SlotIndexes's maps.

This also means that enabling -early-live-intervals, which runs the
SlotIndexes analysis earlier, will not cause large amounts of churn due
to different register allocator decisions.

show more ...


Revision tags: llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init
# 005173cb 21-Jan-2023 Roman Lebedev <lebedev.ri@gmail.com>

[X86] `X86TargetLowering`: override `allowsMemoryAccess()`

The baseline `allowsMemoryAccess()` is wrong for X86.
It assumes that aligned memory operations are always allowed,
but that is not true.

[X86] `X86TargetLowering`: override `allowsMemoryAccess()`

The baseline `allowsMemoryAccess()` is wrong for X86.
It assumes that aligned memory operations are always allowed,
but that is not true.

For example, We can not perform a 32-byte aligned non-temporal load
of a 32-byte vector, without AVX2 that is, yet `allowsMemoryAccess()`
will say it is allowed, so we may end up merging non-temporal loads,
only to split them up to legalize them, and here we go again.

NOTE: the test changes here are superfluous. The main effect is that without this change,
in D141777, we'd get stuck endlessly merging and splitting non-temporal stores.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D141776

show more ...


# 86eff6be 20-Jan-2023 Philip Reames <preames@rivosinc.com>

[MachineCombiner] Use default latency model when no detailed model available

This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction lat

[MachineCombiner] Use default latency model when no detailed model available

This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction latencies. After this change, we use the default latency information available from TargetSchedule. The default latency information essentially ends up treating most instructions as latency 1, with a few "expensive" ones getting a higher cost.

Previously, we unconditionally applied the first legal pattern - without any consideration of profitability. As a result, this change both prevents some patterns being applied, and changes which patterns are exercised. (i.e. previously the first pattern was applied, afterwards, maybe the second one is because the first wasn't profitable.)

The motivation here is two fold.

First, this brings the default behavior in line with the behavior when -mcpu or -mtune is specified. This improves test coverage, and generally makes it less likely we will have bad surprises when providing more information to the compiler.

Second, this enables some reassociation for ILP by default. Despite being unconditionally enabled, the prior code tended to "reassociate" repeatedly through an entire chain and simply moving the first operand to the end. The result was still a serial chain, just a different one. With this change, one of the intermediate transforms is unprofitable and we end up with a partially flattened tree.

Note that the resulting code diffs show significant room for improvement in the basic algorithm. I am intentionally excluding those from this patch.

For the test diffs, I don't seen any concerning regressions. I took a fairly close look at the RISCV ones, but only skimmed the x86 (particularly vector x86) changes.

Differential Revision: https://reviews.llvm.org/D141017

show more ...


Revision tags: llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4
# d24c93cc 26-Oct-2022 Guozhi Wei <carrot@google.com>

[X86] Enable reassociation for ADD instructions

ADD is an associative and commutative operation, so we can do reassociation for it.

Differential Revision: https://reviews.llvm.org/D136396


Revision tags: llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3
# 189900eb 16-Aug-2022 Matthias Braun <matze@braunis.de>

X86: Stop assigning register costs for longer encodings.

This stops reporting CostPerUse 1 for `R8`-`R15` and `XMM8`-`XMM31`.
This was previously done because instruction encoding require a REX
pref

X86: Stop assigning register costs for longer encodings.

This stops reporting CostPerUse 1 for `R8`-`R15` and `XMM8`-`XMM31`.
This was previously done because instruction encoding require a REX
prefix when using them resulting in longer instruction encodings. I
found that this regresses the quality of the register allocation as the
costs impose an ordering on eviction candidates. I also feel that there
is a bit of an impedance mismatch as the actual costs occure when
encoding instructions using those registers, but the order of VReg
assignments is not primarily ordered by number of Defs+Uses.

I did extensive measurements with the llvm-test-suite wiht SPEC2006 +
SPEC2017 included, internal services showed similar patterns. Generally
there are a log of improvements but also a lot of regression. But on
average the allocation quality seems to improve at a small code size
regression.

Results for measuring static and dynamic instruction counts:

Dynamic Counts (scaled by execution frequency) / Optimization Remarks:
Spills+FoldedSpills -5.6%
Reloads+FoldedReloads -4.2%
Copies -0.1%

Static / LLVM Statistics:
regalloc.NumSpills mean -1.6%, geomean -2.8%
regalloc.NumReloads mean -1.7%, geomean -3.1%
size..text mean +0.4%, geomean +0.4%

Static / LLVM Statistics:
mean -2.2%, geomean -3.1%) regalloc.NumSpills
mean -2.6%, geomean -3.9%) regalloc.NumReloads
mean +0.6%, geomean +0.6%) size..text

Static / LLVM Statistics:
regalloc.NumSpills mean -3.0%
regalloc.NumReloads mean -3.3%
size..text mean +0.3%, geomean +0.3%

Differential Revision: https://reviews.llvm.org/D133902

show more ...


Revision tags: llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# b0319ab7 30-Nov-2021 Fraser Cormack <fraser@codeplay.com>

[PR52475] Ensure a correct chain in copies to/from hidden sret parameter

This patch fixes an issue during SelectionDAG construction. When the
target is unable to lower the function's return value, a

[PR52475] Ensure a correct chain in copies to/from hidden sret parameter

This patch fixes an issue during SelectionDAG construction. When the
target is unable to lower the function's return value, a hidden sret
parameter is created. It is initialized and copied to a stored variable
(DemoteRegister) with CopyToReg and is later fetched with
CopyFromReg. The bug is that the chains used for each copy are
inconsistent, and thus in rare cases the scheduler may issue them out of
order.

The fix is to ensure that the CopyFromReg uses the DAG root which is set
as the chain corresponding to the initial CopyToReg.

Fixes https://llvm.org/PR52475

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D114795

show more ...


# f1d8345a 30-Nov-2021 Guozhi Wei <carrot@google.com>

[TwoAddressInstructionPass] Create register mapping for registers with multiple uses in the current MBB

Currently we create register mappings for registers used only once in current
MBB. For registe

[TwoAddressInstructionPass] Create register mapping for registers with multiple uses in the current MBB

Currently we create register mappings for registers used only once in current
MBB. For registers with multiple uses, when all the uses are in the current MBB,
we can also create mappings for them similarly according to the last use.
For example

%reg101 = ...
= ... reg101
%reg103 = ADD %reg101, %reg102

We can create mapping between %reg101 and %reg103.

Differential Revision: https://reviews.llvm.org/D113193

show more ...


Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2
# 4a36e96c 21-Aug-2021 Matt Arsenault <Matthew.Arsenault@amd.com>

RegAllocGreedy: Account for reserved registers in num regs heuristic

This simple heuristic uses the estimated live range length combined
with the number of registers in the class to switch which heu

RegAllocGreedy: Account for reserved registers in num regs heuristic

This simple heuristic uses the estimated live range length combined
with the number of registers in the class to switch which heuristic to
use. This was taking the raw number of registers in the class, even
though not all of them may be available. AMDGPU heavily relies on
dynamically reserved numbers of registers based on user attributes to
satisfy occupancy constraints, so the raw number is highly misleading.

There are still a few problems here. In the original testcase that
made me notice this, the live range size is incorrect after the
scheduler rearranges instructions, since the instructions don't have
the original InstrDist offsets. Additionally, I think it would be more
appropriate to use the number of disjointly allocatable registers in
the class. For the AMDGPU register tuples, there are a large number of
registers in each tuple class, but only a small fraction can actually
be allocated at the same time since they all overlap with each
other. It seems we do not have a query that corresponds to the number
of independently allocatable registers. Relatedly, I'm still debugging
some allocation failures where overlapping tuples seem to not be
handled correctly.

The test changes are mostly noise. There are a handful of x86 tests
that look like regressions with an additional spill, and a handful
that now avoid a spill. The worst looking regression is likely
test/Thumb2/mve-vld4.ll which introduces a few additional
spills. test/CodeGen/AMDGPU/soft-clause-exceeds-register-budget.ll
shows a massive improvement by completely eliminating a large number
of spills inside a loop.

show more ...


Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2
# 0aef747b 11-Jun-2021 Roman Lebedev <lebedev.ri@gmail.com>

[NFC][X86][Codegen] Megacommit: mass-regenerate all check lines that were already autogenerated

The motivation is that the update script has at least two deviations
(`<...>@GOT`/`<...>@PLT`/ and not

[NFC][X86][Codegen] Megacommit: mass-regenerate all check lines that were already autogenerated

The motivation is that the update script has at least two deviations
(`<...>@GOT`/`<...>@PLT`/ and not hiding pointer arithmetics) from
what pretty much all the checklines were generated with,
and most of the tests are still not updated, so each time one of the
non-up-to-date tests is updated to see the effect of the code change,
there is a lot of noise. Instead of having to deal with that each
time, let's just deal with everything at once.

This has been done via:
```
cd llvm-project/llvm/test/CodeGen/X86
grep -rl "; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py" | xargs -L1 <...>/llvm-project/llvm/utils/update_llc_test_checks.py --llc-binary <...>/llvm-project/build/bin/llc
```

Not all tests were regenerated, however.

show more ...


Revision tags: llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3
# 04e45e93 19-Feb-2019 Nikita Popov <nikita.ppv@gmail.com>

[SDAG] Use shift amount type in MULO promotion; NFC

Directly use the correct shift amount type if it is possible, and
future-proof the code against vectors. The added test makes sure that
bitwidths

[SDAG] Use shift amount type in MULO promotion; NFC

Directly use the correct shift amount type if it is possible, and
future-proof the code against vectors. The added test makes sure that
bitwidths that do not fit into the shift amount type do not assert.

Split out from D57997.

llvm-svn: 354359

show more ...


Revision tags: llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1
# df5304d8 02-Jan-2019 Craig Topper <craig.topper@intel.com>

[X86] Add load folding support to the custom isel we do for X86ISD::UMUL/SMUL.

The peephole pass isn't always able to fold the load because it can't commute the implicit usage of AL/AX/EAX/RAX.

llv

[X86] Add load folding support to the custom isel we do for X86ISD::UMUL/SMUL.

The peephole pass isn't always able to fold the load because it can't commute the implicit usage of AL/AX/EAX/RAX.

llvm-svn: 350272

show more ...


Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1
# 2d0f20cc 19-Sep-2018 Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] Handle COPYs of physregs better (regalloc hints)

Enable enableMultipleCopyHints() on X86.

Original Patch by @jonpa:

While enabling the mischeduler for SystemZ, it was discovered that for som

[X86] Handle COPYs of physregs better (regalloc hints)

Enable enableMultipleCopyHints() on X86.

Original Patch by @jonpa:

While enabling the mischeduler for SystemZ, it was discovered that for some reason a test needed one extra seemingly needless COPY (test/CodeGen/SystemZ/call-03.ll). The handling for that is resulted in this patch, which improves the register coalescing by providing not just one copy hint, but a sorted list of copy hints. On SystemZ, this gives ~12500 less register moves on SPEC, as well as marginally less spilling.

Instead of improving just the SystemZ backend, the improvement has been implemented in common-code (calculateSpillWeightAndHint(). This gives a lot of test failures, but since this should be a general improvement I hope that the involved targets will help and review the test updates.

Differential Revision: https://reviews.llvm.org/D38128

llvm-svn: 342578

show more ...


Revision tags: llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2
# 43e94b15 31-Jan-2018 Puyan Lotfi <puyan@puyan.org>

Followup on Proposal to move MIR physical register namespace to '$' sigil.

Discussed here:

http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html

In preparation for adding support for n

Followup on Proposal to move MIR physical register namespace to '$' sigil.

Discussed here:

http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html

In preparation for adding support for named vregs we are changing the sigil for
physical registers in MIR to '$' from '%'. This will prevent name clashes of
named physical register with named vregs.

llvm-svn: 323922

show more ...


Revision tags: llvmorg-6.0.0-rc1
# a8a83d15 07-Dec-2017 Francis Visoiu Mistrih <francisvm@yahoo.com>

[CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register.

Work towards the unification of MIR and debug output by refactoring the
interfaces.

For MachineOperand::print, keep a simple v

[CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register.

Work towards the unification of MIR and debug output by refactoring the
interfaces.

For MachineOperand::print, keep a simple version that can be easily called
from `dump()`, and a more complex one which will be called from both the
MIRPrinter and MachineInstr::print.

Add extra checks inside MachineOperand for detached operands (operands
with getParent() == nullptr).

https://reviews.llvm.org/D40836

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/<def>//g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<def[ ]*,[ ]*dead>/dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ]*,[ ]*dead>/implicit-def dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g'

llvm-svn: 320022

show more ...


Revision tags: llvmorg-5.0.1, llvmorg-5.0.1-rc3
# 25528d6d 04-Dec-2017 Francis Visoiu Mistrih <francisvm@yahoo.com>

[CodeGen] Unify MBB reference format in both MIR and debug output

As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.

The MIR printer prints the IR n

[CodeGen] Unify MBB reference format in both MIR and debug output

As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.

The MIR printer prints the IR name of a MBB only for block definitions.

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix

Differential Revision: https://reviews.llvm.org/D40422

llvm-svn: 319665

show more ...


# 9e658c97 01-Dec-2017 Matt Morehouse <mascasa@google.com>

Revert "[X86] Improvement in CodeGen instruction selection for LEAs."

This reverts r319543, due to ASan bot breakage.

llvm-svn: 319591


# 328199ec 01-Dec-2017 Jatin Bhateja <jatin.bhateja@gmail.com>

[X86] Improvement in CodeGen instruction selection for LEAs.

Summary:
1/ Operand folding during complex pattern matching for LEAs has been extended, such that it promotes Scale to
accommodate

[X86] Improvement in CodeGen instruction selection for LEAs.

Summary:
1/ Operand folding during complex pattern matching for LEAs has been extended, such that it promotes Scale to
accommodate similar operand appearing in the DAG e.g.
T1 = A + B
T2 = T1 + 10
T3 = T2 + A
For above DAG rooted at T3, X86AddressMode will now look like
Base = B , Index = A , Scale = 2 , Disp = 10

2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs so that if there is an opportunity
then complex LEAs (having 3 operands) could be factored out e.g.
leal 1(%rax,%rcx,1), %rdx
leal 1(%rax,%rcx,2), %rcx
will be factored as following
leal 1(%rax,%rcx,1), %rdx
leal (%rdx,%rcx) , %edx

3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, thus avoiding creation of any complex LEAs within a loop.

4/ Simplify LEA converts (lea (BASE,1,INDEX,0) --> add (BASE, INDEX) which offers better through put.

PR32755 will be taken care of by this pathc.

Previous patch revisions : r313343 , r314886

Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy, jbhateja

Reviewed By: lsaba, RKSimon, jbhateja

Subscribers: jmolloy, spatel, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 319543

show more ...


Revision tags: llvmorg-5.0.1-rc2
# 9d7bb0cb 28-Nov-2017 Francis Visoiu Mistrih <francisvm@yahoo.com>

[CodeGen] Print register names in lowercase in both MIR and debug output

As part of the unification of the debug format and the MIR format,
always print registers as lowercase.

* Only debug printin

[CodeGen] Print register names in lowercase in both MIR and debug output

As part of the unification of the debug format and the MIR format,
always print registers as lowercase.

* Only debug printing is affected. It now follows MIR.

Differential Revision: https://reviews.llvm.org/D40417

llvm-svn: 319187

show more ...


Revision tags: llvmorg-5.0.1-rc1
# 2a6c9adb 04-Oct-2017 Hans Wennborg <hans@hanshq.net>

Revert r314886 "[X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)"

It broke the Chromium / SQLite build; see PR34830.

> Summary:
> 1/ Ope

Revert r314886 "[X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)"

It broke the Chromium / SQLite build; see PR34830.

> Summary:
> 1/ Operand folding during complex pattern matching for LEAs has been
> extended, such that it promotes Scale to accommodate similar operand
> appearing in the DAG.
> e.g.
> T1 = A + B
> T2 = T1 + 10
> T3 = T2 + A
> For above DAG rooted at T3, X86AddressMode will no look like
> Base = B , Index = A , Scale = 2 , Disp = 10
>
> 2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
> so that if there is an opportunity then complex LEAs (having 3 operands)
> could be factored out.
> e.g.
> leal 1(%rax,%rcx,1), %rdx
> leal 1(%rax,%rcx,2), %rcx
> will be factored as following
> leal 1(%rax,%rcx,1), %rdx
> leal (%rdx,%rcx) , %edx
>
> 3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
> thus avoiding creation of any complex LEAs within a loop.
>
> Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy
>
> Reviewed By: lsaba
>
> Subscribers: jmolloy, spatel, igorb, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 314919

show more ...


# 3c29bacd 04-Oct-2017 Jatin Bhateja <jatin.bhateja@gmail.com>

[X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)

Summary:
1/ Operand folding during complex pattern matching for LEAs has been
ext

[X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)

Summary:
1/ Operand folding during complex pattern matching for LEAs has been
extended, such that it promotes Scale to accommodate similar operand
appearing in the DAG.
e.g.
T1 = A + B
T2 = T1 + 10
T3 = T2 + A
For above DAG rooted at T3, X86AddressMode will no look like
Base = B , Index = A , Scale = 2 , Disp = 10

2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
so that if there is an opportunity then complex LEAs (having 3 operands)
could be factored out.
e.g.
leal 1(%rax,%rcx,1), %rdx
leal 1(%rax,%rcx,2), %rcx
will be factored as following
leal 1(%rax,%rcx,1), %rdx
leal (%rdx,%rcx) , %edx

3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
thus avoiding creation of any complex LEAs within a loop.

Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy

Reviewed By: lsaba

Subscribers: jmolloy, spatel, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 314886

show more ...


# 534bfbd3 15-Sep-2017 Hans Wennborg <hans@hanshq.net>

Revert r313343 "[X86] PR32755 : Improvement in CodeGen instruction selection for LEAs."

This caused PR34629: asserts firing when building Chromium. It also broke some
buildbots building test-suite a

Revert r313343 "[X86] PR32755 : Improvement in CodeGen instruction selection for LEAs."

This caused PR34629: asserts firing when building Chromium. It also broke some
buildbots building test-suite as reported on the commit thread.

> Summary:
> 1/ Operand folding during complex pattern matching for LEAs has been
> extended, such that it promotes Scale to accommodate similar operand
> appearing in the DAG.
> e.g.
> T1 = A + B
> T2 = T1 + 10
> T3 = T2 + A
> For above DAG rooted at T3, X86AddressMode will no look like
> Base = B , Index = A , Scale = 2 , Disp = 10
>
> 2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
> so that if there is an opportunity then complex LEAs (having 3 operands)
> could be factored out.
> e.g.
> leal 1(%rax,%rcx,1), %rdx
> leal 1(%rax,%rcx,2), %rcx
> will be factored as following
> leal 1(%rax,%rcx,1), %rdx
> leal (%rdx,%rcx) , %edx
>
> 3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
> thus avoiding creation of any complex LEAs within a loop.
>
> Reviewers: lsaba, RKSimon, craig.topper, qcolombet
>
> Reviewed By: lsaba
>
> Subscribers: spatel, igorb, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 313376

show more ...


# 908c8b37 15-Sep-2017 Jatin Bhateja <jatin.bhateja@gmail.com>

[X86] PR32755 : Improvement in CodeGen instruction selection for LEAs.

Summary:
1/ Operand folding during complex pattern matching for LEAs has been
extended, such that it promotes Scale

[X86] PR32755 : Improvement in CodeGen instruction selection for LEAs.

Summary:
1/ Operand folding during complex pattern matching for LEAs has been
extended, such that it promotes Scale to accommodate similar operand
appearing in the DAG.
e.g.
T1 = A + B
T2 = T1 + 10
T3 = T2 + A
For above DAG rooted at T3, X86AddressMode will no look like
Base = B , Index = A , Scale = 2 , Disp = 10

2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
so that if there is an opportunity then complex LEAs (having 3 operands)
could be factored out.
e.g.
leal 1(%rax,%rcx,1), %rdx
leal 1(%rax,%rcx,2), %rcx
will be factored as following
leal 1(%rax,%rcx,1), %rdx
leal (%rdx,%rcx) , %edx

3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
thus avoiding creation of any complex LEAs within a loop.

Reviewers: lsaba, RKSimon, craig.topper, qcolombet

Reviewed By: lsaba

Subscribers: spatel, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 313343

show more ...


Revision tags: llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1
# 976a5d26 26-Jul-2017 Simon Pilgrim <llvm-dev@redking.me.uk>

[X86] Regenerated umul overflow tests on 32/64 bit targets

llvm-svn: 309102


12