History log of /llvm-project/llvm/test/CodeGen/ARM/load-combine.ll (Results 1 – 22 of 22)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# bed1c7f0 19-Dec-2022 Nikita Popov <npopov@redhat.com>

[ARM] Convert some tests to opaque pointers (NFC)


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5
# f387918d 15-Nov-2022 Craig Topper <craig.topper@sifive.com>

[TargetLowering][RISCV][ARM][AArch64][Mips] Reduce the number of AND mask constants used by BSWAP expansion.

We can reuse constants if we use SRL followed by AND and AND followed by SHL.
Similar was

[TargetLowering][RISCV][ARM][AArch64][Mips] Reduce the number of AND mask constants used by BSWAP expansion.

We can reuse constants if we use SRL followed by AND and AND followed by SHL.
Similar was done to bitreverse previously.

Differential Revision: https://reviews.llvm.org/D138045

show more ...


Revision tags: llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# e18cc527 30-Mar-2022 Sanjay Patel <spatel@rotateright.com>

[SDAG] try to canonicalize logical shift after bswap

When shifting by a byte-multiple:
bswap (shl X, C) --> lshr (bswap X), C
bswap (lshr X, C) --> shl (bswap X), C

This is the backend version of D

[SDAG] try to canonicalize logical shift after bswap

When shifting by a byte-multiple:
bswap (shl X, C) --> lshr (bswap X), C
bswap (lshr X, C) --> shl (bswap X), C

This is the backend version of D122010 and an alternative
suggested in D120648.
There's an extra check to make sure the shift amount is
valid that was not in the rough draft.

I'm not sure if there is a larger motivating case for RISCV (bug report?),
but the ARM diffs show a benefit from having a late version of the
transform (because we do not combine the loads in IR).

Differential Revision: https://reviews.llvm.org/D122655

show more ...


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1
# cb15ba84 22-Nov-2019 Clement Courbet <courbet@google.com>

Reland "[DAGCombiner] Allow zextended load combines."

Check that the generated type is simple.


# 88e20552 22-Nov-2019 Clement Courbet <courbet@google.com>

Revert "[DAGCombiner] Allow zextended load combines."

Breaks some bots.


# 036790f9 19-Nov-2019 Clement Courbet <courbet@google.com>

[DAGCombiner] Allow zextended load combines.

Summary: or(zext(load8(base)), zext(load8(base+1)) -> zext(load16 base)

Reviewers: apilipenko, RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llv

[DAGCombiner] Allow zextended load combines.

Summary: or(zext(load8(base)), zext(load8(base+1)) -> zext(load16 base)

Reviewers: apilipenko, RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70487

show more ...


# 12a88f01 21-Nov-2019 Clement Courbet <courbet@google.com>

[DAGCombiner] Add tests for thumb load-combine.


# 23c76792 20-Nov-2019 Clement Courbet <courbet@google.com>

[CodeGen][NFC] Regenerate load-combine test with update_llc_test.

To prepare for D27861.


Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1
# 82099457 29-Apr-2019 Bjorn Pettersson <bjorn.a.pettersson@ericsson.com>

[DAG] Refactor DAGCombiner::ReassociateOps

Summary:
Extract the logic for doing reassociations
from DAGCombiner::reassociateOps into a helper
function DAGCombiner::reassociateOpsCommutative,
and use

[DAG] Refactor DAGCombiner::ReassociateOps

Summary:
Extract the logic for doing reassociations
from DAGCombiner::reassociateOps into a helper
function DAGCombiner::reassociateOpsCommutative,
and use that helper to trigger reassociation
on the original operand order, or the commuted
operand order.

Codegen is not identical since the operand order will
be different when doing the reassociations for the
commuted case. That causes some unfortunate churn in
some test cases. Apart from that this should be NFC.

Reviewers: spatel, craig.topper, tstellar

Reviewed By: spatel

Subscribers: dmgreen, dschuff, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, hiraditya, aheejin, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61199

llvm-svn: 359476

show more ...


Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2
# ca1d849c 27-Mar-2018 Artur Pilipenko <apilipenko@azulsystems.com>

Fix a reoccuring typo in load-combine tests

%tmp = bitcast i32* %arg to i8*
%tmp1 = getelementptr inbounds i8, i8* %tmp, i32 0
- %tmp2 = load i8, i8* %tmp, align 1
+ %tmp2 = load i8, i8* %tm

Fix a reoccuring typo in load-combine tests

%tmp = bitcast i32* %arg to i8*
%tmp1 = getelementptr inbounds i8, i8* %tmp, i32 0
- %tmp2 = load i8, i8* %tmp, align 1
+ %tmp2 = load i8, i8* %tmp1, align 1

This doesn't change the semantics of the tests but makes use of %tmp1 which was originally intended.

llvm-svn: 328642

show more ...


Revision tags: llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1, llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1, llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1, llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3
# e1b2d314 01-Mar-2017 Artur Pilipenko <apilipenko@azulsystems.com>

[DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combine

Resubmit r295336 after the bug with non-zero offset patterns on BE targets is fixed (r296336).

Support {a|s}ext, {a|z|s}ext loa

[DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combine

Resubmit r295336 after the bug with non-zero offset patterns on BE targets is fixed (r296336).

Support {a|s}ext, {a|z|s}ext load nodes as a part of load combine patters.

Reviewed By: filcab

Differential Revision: https://reviews.llvm.org/D29591

llvm-svn: 296651

show more ...


# 8e48f416 22-Feb-2017 Bill Seurer <seurer@linux.vnet.ibm.com>

[DAGCombiner] revert r295336

r295336 causes a bootstrapped clang to fail for many compilations on
powerpc BE. See
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/2315
for e

[DAGCombiner] revert r295336

r295336 causes a bootstrapped clang to fail for many compilations on
powerpc BE. See
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/2315
for example.

Reverting as per the developer's request.

llvm-svn: 295849

show more ...


# 85d75829 16-Feb-2017 Artur Pilipenko <apilipenko@azulsystems.com>

[DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combine

Resubmit -r295314 with PowerPC and AMDGPU tests updated.

Support {a|s}ext, {a|z|s}ext load nodes as a part of load combine patt

[DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combine

Resubmit -r295314 with PowerPC and AMDGPU tests updated.

Support {a|s}ext, {a|z|s}ext load nodes as a part of load combine patters.

Reviewed By: filcab

Differential Revision: https://reviews.llvm.org/D29591

llvm-svn: 295336

show more ...


# a1b384c4 16-Feb-2017 Artur Pilipenko <apilipenko@azulsystems.com>

Rever -r295314 "[DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combine"

This change causes some of AMDGPU and PowerPC tests to fail.

llvm-svn: 295316


# daaa0c0f 16-Feb-2017 Artur Pilipenko <apilipenko@azulsystems.com>

[DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combine

Support {a|s}ext, {a|z|s}ext load nodes as a part of load combine patters.

Reviewed By: filcab

Differential Revision: https://

[DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combine

Support {a|s}ext, {a|z|s}ext load nodes as a part of load combine patters.

Reviewed By: filcab

Differential Revision: https://reviews.llvm.org/D29591

llvm-svn: 295314

show more ...


# 0e4583b5 09-Feb-2017 Artur Pilipenko <apilipenko@azulsystems.com>

Add DAGCombiner load combine tests for partially available values

If some of the trailing or leading bytes of a load combine pattern are zeroes we can combine the pattern to a load + zext and shift.

Add DAGCombiner load combine tests for partially available values

If some of the trailing or leading bytes of a load combine pattern are zeroes we can combine the pattern to a load + zext and shift. Currently we don't support it, so the tests check the current codegen without load combine. This change will make the patch to support this kind of combine a bit more clear.

llvm-svn: 294591

show more ...


# 4a640319 09-Feb-2017 Artur Pilipenko <apilipenko@azulsystems.com>

[DAGCombiner] Support non-zero offset in load combine

Enable folding patterns which load the value from non-zero offset:

i8 *a = ...
i32 val = a[4] | (a[5] << 8) | (a[6] << 16) | (a[7] << 24)
=

[DAGCombiner] Support non-zero offset in load combine

Enable folding patterns which load the value from non-zero offset:

i8 *a = ...
i32 val = a[4] | (a[5] << 8) | (a[6] << 16) | (a[7] << 24)
=>
i32 val = *((i32*)(a+4))

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D29394

llvm-svn: 294582

show more ...


Revision tags: llvmorg-4.0.0-rc2
# 469596ef 07-Feb-2017 Artur Pilipenko <apilipenko@azulsystems.com>

Add DAGCombiner load combine tests for {a|s}ext, {a|z|s}ext load nodes

Currently we don't support these nodes, so the tests check the current codegen without load combine. This change makes the revi

Add DAGCombiner load combine tests for {a|s}ext, {a|z|s}ext load nodes

Currently we don't support these nodes, so the tests check the current codegen without load combine. This change makes the review of the change to support these nodes more clear.

Separated from https://reviews.llvm.org/D29591 review.

llvm-svn: 294305

show more ...


# d3464bf9 06-Feb-2017 Artur Pilipenko <apilipenko@azulsystems.com>

[DAGCombiner] Support bswap as a part of load combine patterns

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D29397

llvm-svn: 294201


# bdf3c5af 06-Feb-2017 Artur Pilipenko <apilipenko@azulsystems.com>

Add DAGCombiner load combine tests with non-zero offset

This is separated from https://reviews.llvm.org/D29394 review.

llvm-svn: 294185


# 41c0005a 25-Jan-2017 Artur Pilipenko <apilipenko@azulsystems.com>

[DAGCombiner] Match load by bytes idiom and fold it into a single load. Attempt #2.

The previous patch (https://reviews.llvm.org/rL289538) got reverted because of a bug. Chandler also requested some

[DAGCombiner] Match load by bytes idiom and fold it into a single load. Attempt #2.

The previous patch (https://reviews.llvm.org/rL289538) got reverted because of a bug. Chandler also requested some changes to the algorithm.
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20161212/413479.html

This is an updated patch. The key difference is that collectBitProviders (renamed to calculateByteProvider) now collects the origin of one byte, not the whole value. It simplifies the implementation and allows to stop the traversal earlier if we know that the result won't be used.

From the original commit:

Match a pattern where a wide type scalar value is loaded by several narrow loads and combined by shifts and ors. Fold it into a single load or a load and a bswap if the targets supports it.

Assuming little endian target:
i8 *a = ...
i32 val = a[0] | (a[1] << 8) | (a[2] << 16) | (a[3] << 24)
=>
i32 val = *((i32)a)

i8 *a = ...
i32 val = (a[0] << 24) | (a[1] << 16) | (a[2] << 8) | a[3]
=>
i32 val = BSWAP(*((i32)a))

This optimization was discussed on llvm-dev some time ago in "Load combine pass" thread. We came to the conclusion that we want to do this transformation late in the pipeline because in presence of atomic loads load widening is irreversible transformation and it might hinder other optimizations.

Eventually we'd like to support folding patterns like this where the offset has a variable and a constant part:
i32 val = a[i] | (a[i + 1] << 8) | (a[i + 2] << 16) | (a[i + 3] << 24)

Matching the pattern above is easier at SelectionDAG level since address reassociation has already happened and the fact that the loads are adjacent is clear. Understanding that these loads are adjacent at IR level would have involved looking through geps/zexts/adds while looking at the addresses.

The general scheme is to match OR expressions by recursively calculating the origin of individual bytes which constitute the resulting OR value. If all the OR bytes come from memory verify that they are adjacent and match with little or big endian encoding of a wider value. If so and the load of the wider type (and bswap if needed) is allowed by the target generate a load and a bswap if needed.

Reviewed By: RKSimon, filcab, chandlerc

Differential Revision: https://reviews.llvm.org/D27861

llvm-svn: 293036

show more ...


Revision tags: llvmorg-4.0.0-rc1
# c93cc595 13-Dec-2016 Artur Pilipenko <apilipenko@azulsystems.com>

[DAGCombiner] Match load by bytes idiom and fold it into a single load

Match a pattern where a wide type scalar value is loaded by several narrow loads and combined by shifts and ors. Fold it into a

[DAGCombiner] Match load by bytes idiom and fold it into a single load

Match a pattern where a wide type scalar value is loaded by several narrow loads and combined by shifts and ors. Fold it into a single load or a load and a bswap if the targets supports it.

Assuming little endian target:
i8 *a = ...
i32 val = a[0] | (a[1] << 8) | (a[2] << 16) | (a[3] << 24)
=>
i32 val = *((i32)a)

i8 *a = ...
i32 val = (a[0] << 24) | (a[1] << 16) | (a[2] << 8) | a[3]
=>
i32 val = BSWAP(*((i32)a))

This optimization was discussed on llvm-dev some time ago in "Load combine pass" thread. We came to the conclusion that we want to do this transformation late in the pipeline because in presence of atomic loads load widening is irreversible transformation and it might hinder other optimizations.

Eventually we'd like to support folding patterns like this where the offset has a variable and a constant part:
i32 val = a[i] | (a[i + 1] << 8) | (a[i + 2] << 16) | (a[i + 3] << 24)

Matching the pattern above is easier at SelectionDAG level since address reassociation has already happened and the fact that the loads are adjacent is clear. Understanding that these loads are adjacent at IR level would have involved looking through geps/zexts/adds while looking at the addresses.

The general scheme is to match OR expressions by recursively calculating the origin of individual bits which constitute the resulting OR value. If all the OR bits come from memory verify that they are adjacent and match with little or big endian encoding of a wider value. If so and the load of the wider type (and bswap if needed) is allowed by the target generate a load and a bswap if needed.

Reviewed By: hfinkel, RKSimon, filcab

Differential Revision: https://reviews.llvm.org/D26149

llvm-svn: 289538

show more ...