History log of /llvm-project/llvm/lib/CodeGen/TargetLoweringBase.cpp (Results 301 – 325 of 500)
Revision Date Author Comments
# bd79f73f 03-Jun-2017 Galina Kistanova <gkistanova@gmail.com>

Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.

llvm-svn: 304635


# 251ea8a4 01-Jun-2017 Amaury Sechet <deadalnix@gmail.com>

Do not legalize large setcc with setcce, introduce setcccarry and do it with usubo/setcccarry.

Summary:
This is a continuation of the work started in D29872. Passing the carry down as a value rather than as glue allows for further optimizations. Introducing setcccarry makes the use of addc/subc unnecessary, and we can start the removal process.

This patch introduces only the optimizations strictly required to get the same level of optimization as was available before, nothing more.

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33374

llvm-svn: 304404
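
To illustrate the value-carried shape this patch moves toward, here is a minimal C++ sketch (not the DAG code itself, and `ult128` is a hypothetical helper): an unsigned 128-bit less-than computed from 64-bit halves, with the low-half borrow passed as an ordinary value, roughly the roles usubo and setcccarry play.

```
// Sketch only: models the usubo/setcccarry split of a wide compare.
#include <cstdint>

bool ult128(uint64_t ALo, uint64_t AHi, uint64_t BLo, uint64_t BHi) {
  // usubo-like step: subtracting the low halves yields a borrow value.
  bool Borrow = ALo < BLo;
  // setcccarry-like step: compare the high halves while consuming the
  // incoming borrow as a plain value, with no glue involved.
  return AHi < BHi || (AHi == BHi && Borrow);
}
```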



# 3a7578c6 31-May-2017 Zaara Syeda <syzaara@ca.ibm.com>

[PPC] Inline expansion of memcmp

This patch does an inline expansion of memcmp.
It changes the memcmp library call into an inline expansion when the size is
known at compile time and is under a target-specified threshold.
This expansion is implemented in CodeGenPrepare and expands into straight-line
code. The target specifies a maximum load size, and the expansion works by using
this size to load the two sources, compare, and exit early if a difference is
found. It also has a special case for when the memcmp result is compared against
zero for equality.

Differential Revision: https://reviews.llvm.org/D28637

llvm-svn: 304313
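
As a rough illustration of the straight-line code the expansion aims for, here is a hedged C++ sketch of a 16-byte memcmp with an 8-byte maximum load size, assuming a little-endian target; the names and structure are illustrative, not the actual CodeGenPrepare output.

```
// Sketch only: two 8-byte load/compare blocks with early exit.
#include <cstdint>
#include <cstring>

static int cmpBlock(uint64_t A, uint64_t B) {
  // Byte-swap so an integer compare matches memcmp's byte order
  // (assumes little-endian; __builtin_bswap64 is a GCC/Clang builtin).
  return __builtin_bswap64(A) < __builtin_bswap64(B) ? -1 : 1;
}

int memcmp16(const void *P, const void *Q) {
  const char *CP = static_cast<const char *>(P);
  const char *CQ = static_cast<const char *>(Q);
  uint64_t A, B;
  std::memcpy(&A, CP, 8);
  std::memcpy(&B, CQ, 8);
  if (A != B)
    return cmpBlock(A, B); // early exit: first 8 bytes differ
  std::memcpy(&A, CP + 8, 8);
  std::memcpy(&B, CQ + 8, 8);
  if (A != B)
    return cmpBlock(A, B); // early exit: second 8 bytes differ
  return 0;
}
```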



# f6d4dc5b 30-May-2017 Craig Topper <craig.topper@gmail.com>

[SelectionDAG] Set ISD::FPOWI to Expand by default

Summary:
Currently FPOWI defaults to Legal and LegalizeDAG.cpp turns Legal into Expand for this opcode because Legal is a "lie".

This patch changes the default for this opcode to Expand and removes the hack from LegalizeDAG.cpp. It also removes all the code in the targets that set this opcode to Expand themselves since they can just rely on the default.

Reviewers: spatel, RKSimon, efriedma

Reviewed By: RKSimon

Subscribers: jfb, dschuff, sbc100, jgravelle-google, nemanjai, javed.absar, andrew.w.kaylor, llvm-commits

Differential Revision: https://reviews.llvm.org/D33530

llvm-svn: 304215
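
For context, ISD::FPOWI is a floating-point value raised to an integer power; with the Expand action the legalizer lowers it (typically to a __powisf2/__powidf2-style libcall) instead of pretending a native instruction exists. A hedged reference model of what that operation computes, with illustrative names:

```
// Sketch only: reference semantics of FPOWI via repeated squaring.
#include <cstdio>

double powi(double X, int N) {
  unsigned AbsN = N < 0 ? 0u - static_cast<unsigned>(N)
                        : static_cast<unsigned>(N);
  double R = 1.0;
  for (double B = X; AbsN != 0; AbsN >>= 1) {
    if (AbsN & 1) // multiply in the current power of X for set bits
      R *= B;
    B *= B;
  }
  return N < 0 ? 1.0 / R : R;
}

int main() { std::printf("%g\n", powi(2.0, 10)); } // prints 1024
```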



# b52e0366 17-May-2017 Francis Visoiu Mistrih <fvisoiumistrih@apple.com>

BitVector: add iterators for set bits

Differential revision: https://reviews.llvm.org/D32060

llvm-svn: 303227
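
A minimal usage sketch, assuming the `set_bits()` range accessor this review added:

```
// Sketch only: iterate the set bits of a BitVector with range-for.
#include "llvm/ADT/BitVector.h"
#include <cstdio>

int main() {
  llvm::BitVector BV(32);
  BV.set(3);
  BV.set(17);
  for (unsigned Bit : BV.set_bits()) // visits 3, then 17
    std::printf("bit %u is set\n", Bit);
}
```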


# 8ac81f39 30-Apr-2017 Amaury Sechet <deadalnix@gmail.com>

Do not legalize large add with addc/adde, introduce addcarry and do it with uaddo/addcarry

Summary: As per discussion on how to get better codegen on large int legalization, it became clear that using glue for the carry was preventing several desirable optimizations. Passing the carry down as a value allows for more flexibility.

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D29872

llvm-svn: 301775
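
The same idea in a minimal C++ sketch (not the DAG code; `add128` is a hypothetical helper): a 128-bit add split into 64-bit halves, where the uaddo-like step produces the carry as an ordinary value and the addcarry-like step consumes it.

```
// Sketch only: value-carried expansion of a wide add.
#include <cstdint>

void add128(uint64_t ALo, uint64_t AHi, uint64_t BLo, uint64_t BHi,
            uint64_t &Lo, uint64_t &Hi) {
  Lo = ALo + BLo;            // uaddo: low-half sum ...
  uint64_t Carry = Lo < ALo; // ... and its carry-out as a value
  Hi = AHi + BHi + Carry;    // addcarry: consumes the carry value
}
```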



# 744c215e 28-Apr-2017 Matthias Braun <matze@braunis.de>

TargetLowering: Add finalizeLowering() function; NFC

Adds a new method finalizeLowering to TargetLoweringBase. This is in
preparation for an upcoming commit.

This function is meant for target-specific adjustments to
MachineFrameInfo or register reservations.

Move the freezeRegisters() and hasCopyImplyingStackAdjustment()
handling into the new function to prove the concept. As an added bonus,
GlobalISel no longer misses the hasCopyImplyingStackAdjustment()
handling.

Differential Revision: https://reviews.llvm.org/D32621

llvm-svn: 301679
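
A hedged sketch of how a target might use the hook; the class and body below are illustrative, only the finalizeLowering(MachineFunction &) override point comes from this commit.

```
// Sketch only: a hypothetical target overriding the new hook.
class MyTargetLowering : public TargetLowering {
public:
  void finalizeLowering(MachineFunction &MF) const override {
    // Target-specific MachineFrameInfo adjustments or register
    // reservations go here, after call lowering has run. The call
    // below is illustrative of the kind of state being finalized.
    MF.getFrameInfo().setHasCopyImplyingStackAdjustment(true);
    TargetLoweringBase::finalizeLowering(MF);
  }
};
```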



# 919f9e8d 28-Apr-2017 Jun Bum Lim <junbuml@codeaurora.org>

[InlineCost] Improve the cost heuristic for Switch

Summary:
The motivating example is below; it has 13 cases but only 2 distinct targets:

```
lor.lhs.false2: ; preds = %if.then
switch i32 %Status, label %if.then27 [
i32 -7012, label %if.end35
i32 -10008, label %if.end35
i32 -10016, label %if.end35
i32 15000, label %if.end35
i32 14013, label %if.end35
i32 10114, label %if.end35
i32 10107, label %if.end35
i32 10105, label %if.end35
i32 10013, label %if.end35
i32 10011, label %if.end35
i32 7008, label %if.end35
i32 7007, label %if.end35
i32 5002, label %if.end35
]
```
which is compiled into a balanced binary tree like this on AArch64 (similar on X86)

```
.LBB853_9: // %lor.lhs.false2
mov w8, #10012
cmp w19, w8
b.gt .LBB853_14
// BB#10: // %lor.lhs.false2
mov w8, #5001
cmp w19, w8
b.gt .LBB853_18
// BB#11: // %lor.lhs.false2
mov w8, #-10016
cmp w19, w8
b.eq .LBB853_23
// BB#12: // %lor.lhs.false2
mov w8, #-10008
cmp w19, w8
b.eq .LBB853_23
// BB#13: // %lor.lhs.false2
mov w8, #-7012
cmp w19, w8
b.eq .LBB853_23
b .LBB853_3
.LBB853_14: // %lor.lhs.false2
mov w8, #14012
cmp w19, w8
b.gt .LBB853_21
// BB#15: // %lor.lhs.false2
mov w8, #-10105
add w8, w19, w8
cmp w8, #9 // =9
b.hi .LBB853_17
// BB#16: // %lor.lhs.false2
orr w9, wzr, #0x1
lsl w8, w9, w8
mov w9, #517
and w8, w8, w9
cbnz w8, .LBB853_23
.LBB853_17: // %lor.lhs.false2
mov w8, #10013
cmp w19, w8
b.eq .LBB853_23
b .LBB853_3
.LBB853_18: // %lor.lhs.false2
mov w8, #-7007
add w8, w19, w8
cmp w8, #2 // =2
b.lo .LBB853_23
// BB#19: // %lor.lhs.false2
mov w8, #5002
cmp w19, w8
b.eq .LBB853_23
// BB#20: // %lor.lhs.false2
mov w8, #10011
cmp w19, w8
b.eq .LBB853_23
b .LBB853_3
.LBB853_21: // %lor.lhs.false2
mov w8, #14013
cmp w19, w8
b.eq .LBB853_23
// BB#22: // %lor.lhs.false2
mov w8, #15000
cmp w19, w8
b.ne .LBB853_3
```
However, the inline cost model estimates the cost to be linear with the number
of distinct targets, so the cost of the above switch is just 2 InstrCosts.
The function containing this switch is then inlined about 900 times.

This change uses the general switch-lowering logic for the inline heuristic. It
estimates the number of case clusters, with the suitability check for a jump
table or bit test. Considering the binary search tree built for the clusters,
this change modifies the model to be linear with the size of the balanced
binary tree. The model is off by default for now:
-inline-generic-switch-cost=false

This change was originally proposed by Haicheng in D29870.

Reviewers: hans, bmakam, chandlerc, eraman, haicheng, mcrosier

Reviewed By: hans

Subscribers: joerg, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D31085

llvm-svn: 301649
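
A toy model of the cost shape described above (names and constants are illustrative, not the actual InlineCost code): when the clusters qualify for a jump table or bit test the cost stays near-constant; otherwise it tracks the size of the balanced binary tree built over the case clusters rather than the raw number of distinct targets.

```
// Sketch only: a toy model of the switch cost heuristic.
unsigned switchCost(unsigned NumClusters, bool SuitsJumpTableOrBitTest,
                    unsigned InstrCost) {
  if (NumClusters == 0)
    return 0;
  if (SuitsJumpTableOrBitTest)
    return InstrCost; // one indirect branch / bit test, roughly flat
  // A balanced binary search tree over N clusters has 2N-1 nodes;
  // model the cost as linear in that tree size.
  unsigned TreeNodes = 2 * NumClusters - 1;
  return TreeNodes * InstrCost;
}
```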



# c8e8e2a0 24-Apr-2017 Krzysztof Parzyszek <kparzysz@codeaurora.org>

Move value type list from TargetRegisterClass to TargetRegisterInfo

Differential Revision: https://reviews.llvm.org/D31937

llvm-svn: 301234


# 98ab4c64 24-Apr-2017 Krzysztof Parzyszek <kparzysz@codeaurora.org>

Revert r301231: Accidentally committed stale files

I forgot to include my local changes in the commit.

llvm-svn: 301232


# c0197066 24-Apr-2017 Krzysztof Parzyszek <kparzysz@codeaurora.org>

Move value type list from TargetRegisterClass to TargetRegisterInfo

Differential Revision: https://reviews.llvm.org/D31937

llvm-svn: 301231


# 44e25f37 24-Apr-2017 Krzysztof Parzyszek <kparzysz@codeaurora.org>

Move size and alignment information of regclass to TargetRegisterInfo

1. RegisterClass::getSize() is split into two functions:
- TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const;
- TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const;
2. RegisterClass::getAlignment() is replaced by:
- TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const;

This will allow making those values depend on subtarget features in the
future.

Differential Revision: https://reviews.llvm.org/D31783

llvm-svn: 301221
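
A hedged sketch of the call-site migration this implies, assuming a TargetRegisterInfo &TRI and a TargetRegisterClass &RC in scope:

```
// Sketch only: new TRI-based accessors vs. the old RC methods.
unsigned RegBits    = TRI.getRegSizeInBits(RC);  // was RC.getSize() * 8
unsigned SpillSize  = TRI.getSpillSize(RC);      // was RC.getSize()
unsigned SpillAlign = TRI.getSpillAlignment(RC); // was RC.getAlignment()
```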



# 59a2d7b9 11-Apr-2017 Serge Guelton <sguelton@quarkslab.com>

Module::getOrInsertFunction is using C-style vararg instead of variadic templates.

From a user perspective, it forces the use of an annoying nullptr to mark the end of the varargs, and there's no type checking on the arguments.
The variadic template is an obvious solution to both issues.

Differential Revision: https://reviews.llvm.org/D31070

llvm-svn: 299949
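
A hedged sketch of the call-site difference, assuming a Module &M and LLVMContext &Ctx in scope; the function name is illustrative:

```
// Sketch only: C-style vararg form vs. the variadic-template form.
llvm::Type *I32 = llvm::Type::getInt32Ty(Ctx);

// Before: the vararg overload needed a trailing nullptr sentinel and
// performed no type checking on the argument list.
M.getOrInsertFunction("foo", I32, I32, I32, nullptr);

// After: the variadic template takes the parameter types directly,
// with no sentinel and with compile-time type checking.
M.getOrInsertFunction("foo", I32, I32, I32);
```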



# b050c7fb 11-Apr-2017 Diana Picus <diana.picus@linaro.org>

Revert "Turn some C-style vararg into variadic templates"

This reverts commit r299925 because it broke the buildbots. See e.g.
http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/6008

llvm-svn: 299928



# 5fd75fb7 11-Apr-2017 Serge Guelton <sguelton@quarkslab.com>

Turn some C-style vararg into variadic templates

Module::getOrInsertFunction is using C-style vararg instead of
variadic templates.

From a user perspective, it forces the use of an annoying nullptr
to mark the end of the varargs, and there's no type checking on the
arguments. The variadic template is an obvious solution to both
issues.

llvm-svn: 299925



# f7e4388e 07-Apr-2017 Simon Dardis <simon.dardis@imgtec.com>

Revert "[SelectionDAG] Enable target specific vector scalarization of calls and returns"

This reverts commit r299766. This change appears to have broken the MIPS
buildbots. Reverting while I investigate.

Revert "[mips] Remove usage of debug only variable (NFC)"

This reverts commit r299769. Follow-up commit.

llvm-svn: 299788



# 6470ff0b 07-Apr-2017 Simon Dardis <simon.dardis@imgtec.com>

[SelectionDAG] Enable target specific vector scalarization of calls and returns

By turning getRegisterType, getNumRegisters, and getVectorBreakdown into
target hooks, backends can request that LLVM scalarize vector types for
calls and returns.

The MIPS vector ABI requires that vector arguments and returns are passed in
integer registers. With SelectionDAG's new hooks, the MIPS backend can now
handle LLVM-IR with vector types in calls and returns. E.g.
'call @foo(<4 x i32> %4)'.

Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for
calls and returns if vector types were not legal. If vector types were legal,
a single 128-bit vector argument would be assigned to a single 32-bit / 64-bit
integer register.

By teaching the MIPS backend to inspect the original types, it can now
implement the MIPS vector ABI which requires a particular method of
scalarizing vectors.

Previously, the MIPS backend relied on clang to scalarize calls such as "call
@foo(<4 x float> %a)" into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3,
i32 inreg %4)".

This patch enables the MIPS backend to take either form for vector types.

Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur

Differential Revision: https://reviews.llvm.org/D27845

llvm-svn: 299766
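
A hedged, purely conceptual C++ illustration of the ABI effect described above; the struct and function names are hypothetical stand-ins for the IR types involved.

```
// Sketch only: conceptual model of the MIPS vector ABI rule that a
// <4 x i32> argument travels in four integer registers.
struct V4I32 { int E0, E1, E2, E3; }; // stands in for <4 x i32>

int fooScalarized(int E0, int E1, int E2, int E3);

// A vector-typed call site behaves as if rewritten like this:
int foo(V4I32 V) {
  return fooScalarized(V.E0, V.E1, V.E2, V.E3);
}
```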



# db11fdfd 06-Apr-2017 Mehdi Amini <mehdi.amini@apple.com>

Revert "Turn some C-style vararg into variadic templates"

This reverts commit r299699; the examples need to be updated.

llvm-svn: 299702


# 579540a8 06-Apr-2017 Mehdi Amini <mehdi.amini@apple.com>

Turn some C-style vararg into variadic templates

Module::getOrInsertFunction is using C-style vararg instead of
variadic templates.

From a user perspective, it forces the use of an annoying nullptr
to mark the end of the varargs, and there's no type checking on the
arguments. The variadic template is an obvious solution to both
issues.

Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu>

Differential Revision: https://reviews.llvm.org/D31070

llvm-svn: 299699



# b518054b 21-Mar-2017 Reid Kleckner <rnk@google.com>

Rename AttributeSet to AttributeList

Summary:
This class is a list of AttributeSetNodes corresponding to the function
prototype of a call or function declaration. This class used to be
called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is
typically accessed by parameter and return value index, so
"AttributeList" seems like a more intuitive name.

Rename AttributeSetImpl to AttributeListImpl to follow suit.

It's useful to rename this class so that we can rename AttributeSetNode
to AttributeSet later. AttributeSet is the set of attributes that apply
to a single function, argument, or return value.

Reviewers: sanjoy, javed.absar, chandlerc, pete

Reviewed By: pete

Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits

Differential Revision: https://reviews.llvm.org/D31102

llvm-svn: 298393



# cf2da96c 14-Mar-2017 Simon Pilgrim <llvm-dev@redking.me.uk>

[SelectionDAG] Add a signed integer absolute ISD node

Reduced version of D26357 - based on the discussion on llvm-dev about canonicalization of UMIN/UMAX/SMIN/SMAX as well as ABS I've reduced that patch to just the ABS ISD node (with x86/sse support) to improve basic combines and lowering.

ARM/AArch64, Hexagon, PowerPC, and NVPTX all have similar instructions, allowing us to make this a generic opcode and move away from the hard-coded tablegen patterns, which make it tricky to match more complex patterns.

At the moment this patch doesn't attempt legalization, as we only create an ABS node if it's legal/custom.

Differential Revision: https://reviews.llvm.org/D29639

llvm-svn: 297780
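
For reference, a minimal C++ sketch of the per-element computation the generic ISD::ABS node represents, i.e. the classic branchless pattern the hard-coded tablegen patterns used to match; the function name is illustrative.

```
// Sketch only: branchless signed absolute value, the scalar form of
// the dag pattern (xor (add X, (sra X, 31)), (sra X, 31)).
#include <cstdint>

int32_t absValue(int32_t X) {
  int32_t Mask = X >> 31;   // arithmetic shift: -1 if negative, else 0
  return (X + Mask) ^ Mask; // negates X exactly when Mask is -1
}
```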



# 54e22f33 14-Mar-2017 Nirav Dave <niravd@google.com>

In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.

Recommitting with compile-time improvements

Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner.

* Simplify Consecutive Merge Store Candidate Search

Now that address aliasing is much less conservative, push through a
simplified store-merging search and chain alias analysis that only
checks for parallel stores through the chain subgraph. This is cleaner,
as it separates non-interfering loads/stores from the
store-merging logic.

When merging stores, search up the chain through a single load, and
find all possible stores by looking down through a load and a
TokenFactor to all stores visited.

This improves the quality of the output SelectionDAG and the output
codegen (save perhaps for some ARM cases where we correctly construct
wider loads but then promote them to float operations, which
require more expensive constant generation).

Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)

Additional Minor Changes:

1. Finishes removing unused AliasLoad code

2. Unifies the chain aggregation in the merged stores across code
paths

3. Re-add the Store node to the worklist after calling
SimplifyDemandedBits.

4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
arbitrary, but seems sufficient to not cause regressions in
tests.

5. Remove Chain dependencies of Memory operations on CopyFromReg
nodes, as these are captured by data dependence.

6. Forward load-store values through TokenFactors containing
{CopyToReg,CopyFromReg} values.

7. Peephole to convert buildvector of extract_vector_elt to
extract_subvector if possible (see
CodeGen/AArch64/store-merge.ll)

8. Store merging for the ARM target is restricted to 32-bit, as
in some contexts invalid 64-bit operations are being
generated. This can be removed once appropriate checks are
added.

This finishes the change Matt Arsenault started in r246307 and
jyknight's original patch.

Many tests required some changes as memory operations are now
reorderable, improving load-store forwarding. One test in
particular is worth noting:

CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
forwarding converts a load-store pair into a parallel store and
a memory-realized bitcast of the same value. However, because we
lose the sharing of the explicit and implicit store values we
must create another local store. A similar transformation
happens before SelectionDAG as well.

Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

llvm-svn: 297695
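
A hedged illustration of the kind of rewrite the improved chain analysis enables (plain C++, assuming a little-endian target; function names are illustrative): adjacent narrow stores whose chains are known not to interfere can be merged into one wide store.

```
// Sketch only: four adjacent byte stores vs. the merged 32-bit store.
#include <cstdint>
#include <cstring>

void storeBytes(uint8_t *P) {
  P[0] = 1; P[1] = 2; P[2] = 3; P[3] = 4; // merge candidates
}

void storeMerged(uint8_t *P) {
  uint32_t V = 0x04030201;        // same bytes, little-endian layout
  std::memcpy(P, &V, sizeof(V));  // one wide store
}
```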



# ce52b807 03-Mar-2017 Chandler Carruth <chandlerc@gmail.com>

[SDAG] Revert r296476 (and r296486, r296668, r296690).

This patch causes compile times for some patterns to explode. I have
a (large, unreduced) test case that slows down by more than 20x and
several test cases slow down by 2x. I'm sending some of the test cases
directly to Nirav and following up with more details in the review log,
but this should unblock anyone else hitting this.

llvm-svn: 296862



# f830dec3 28-Feb-2017 Nirav Dave <niravd@google.com>

In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.

Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner.

* Simplify Consecutive Merge Store Candidate Search

Now that address aliasing is much less conservative, push through a
simplified store-merging search and chain alias analysis that only
checks for parallel stores through the chain subgraph. This is cleaner,
as it separates non-interfering loads/stores from the
store-merging logic.

When merging stores, search up the chain through a single load, and
find all possible stores by looking down through a load and a
TokenFactor to all stores visited.

This improves the quality of the output SelectionDAG and the output
codegen (save perhaps for some ARM cases where we correctly construct
wider loads but then promote them to float operations, which
require more expensive constant generation).

Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)

Additional Minor Changes:

1. Finishes removing unused AliasLoad code

2. Unifies the chain aggregation in the merged stores across code
paths

3. Re-add the Store node to the worklist after calling
SimplifyDemandedBits.

4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
arbitrary, but seems sufficient to not cause regressions in
tests.

5. Remove Chain dependencies of Memory operations on CopyFromReg
nodes, as these are captured by data dependence.

6. Forward load-store values through TokenFactors containing
{CopyToReg,CopyFromReg} values.

7. Peephole to convert buildvector of extract_vector_elt to
extract_subvector if possible (see
CodeGen/AArch64/store-merge.ll)

8. Store merging for the ARM target is restricted to 32-bit, as
in some contexts invalid 64-bit operations are being
generated. This can be removed once appropriate checks are
added.

This finishes the change Matt Arsenault started in r246307 and
jyknight's original patch.

Many tests required some changes as memory operations are now
reorderable, improving load-store forwarding. One test in
particular is worth noting:

CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
forwarding converts a load-store pair into a parallel store and
a memory-realized bitcast of the same value. However, because we
lose the sharing of the explicit and implicit store values we
must create another local store. A similar transformation
happens before SelectionDAG as well.

Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

llvm-svn: 296476



# 73cd0194 26-Feb-2017 Nirav Dave <niravd@google.com>

Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."

This reverts commit r296252 until 256-bit operations are more efficiently generated in X86.

llvm-svn: 296279


