History log of /llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (Results 1001 – 1025 of 2094)
Revision Date Author Comments
# 604526fe 10-May-2017 Ahmed Bougacha <ahmed.bougacha@gmail.com>

[CodeGen] Don't require AA in SDAGISel at -O0.

Before r247167, the pass manager builder controlled which AA
implementations were used, exporting them all in the AliasAnalysis
analysis group.

Now, AAResultsWrapperPass always uses BasicAA, but still uses other AA
implementations if made available in the pass pipeline.

But regardless, SDAGISel is required at O0, and really doesn't need to
be doing fancy optimizations based on useful AA results.

Don't require AA at CodeGenOpt::None, and only use it otherwise.

This does have a functional impact (and one testcase is pessimized
because we can't reuse a load). But I think that's desirable no matter
what.

Note that this alone doesn't result in fewer DT computations: TwoAddress
was previously able to reuse the DT we computed for SDAG. That will be
fixed separately.

Differential Revision: https://reviews.llvm.org/D32766

llvm-svn: 302611


# 3a363fff 09-May-2017 Reid Kleckner <rnk@google.com>

Re-land "Use the frame index side table for byval and inalloca arguments"

This re-lands r302483. It was not the cause of PR32977.

llvm-svn: 302544


# 84075fdd 09-May-2017 Reid Kleckner <rnk@google.com>

Re-land "Don't add DBG_VALUE instructions for static allocas in dbg.declare"

This re-lands commit r302461. It was not the cause of PR32977.

llvm-svn: 302543


# d526b13e 09-May-2017 Serge Pavlov <sepavloff@gmail.com>

Add extra operand to CALLSEQ_START to keep frame part set up previously

Using arguments with the inalloca attribute creates problems for verification
of the machine representation. This attribute tells the backend that the
argument is prepared on the stack prior to the CALLSEQ_START..CALLSEQ_END
sequence (see http://llvm.org/docs/InAlloca.htm for details). The frame size
stored in CALLSEQ_START in this case does not count the size of this
argument. However, CALLSEQ_END still keeps the total frame size, as the caller
can be responsible for cleanup of the entire frame. So CALLSEQ_START and
CALLSEQ_END keep different frame sizes, and MachineVerifier treats the
difference as a stack error. Currently there is no way to distinguish
this case from actual errors.

This patch adds an additional argument to CALLSEQ_START and its
target-specific counterparts to keep the size of the stack that is set up
prior to the call frame sequence. This argument allows MachineVerifier to
calculate the actual frame size associated with a frame setup instruction and
to correctly process the case of inalloca arguments.

The changes made by the patch are:
- Frame setup instructions get a second mandatory argument. This
  affects all targets that use frame pseudo instructions and touches many
  files, although the changes are uniform.
- Access to frame properties is implemented using special instructions
  rather than calls to getOperand(N).getImm(). For X86 and ARM such a
  replacement was made previously.
- Changes that reflect the appearance of the additional argument of the
  frame setup instruction. These involve proper instruction initialization
  and methods that access instruction arguments.
- MachineVerifier retrieves the frame size using a method that reports the
  sum of the frame parts initialized inside the frame instruction pair and
  outside it.
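
The frame-size check described in the list above can be modelled in a few
lines. This is only a sketch under simplifying assumptions, not the code from
the patch: FrameSetup, FrameDestroy, totalFrameSize and frameSizesMatch are
hypothetical names, and the real verifier reads these values from machine
instruction operands.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a CALLSEQ_START-like pseudo instruction.
struct FrameSetup {
  int64_t SizeInSequence; // bytes set up inside the call frame sequence
  int64_t SizeBefore;     // bytes set up earlier, e.g. an inalloca argument
};

// Hypothetical stand-in for a CALLSEQ_END-like pseudo instruction.
struct FrameDestroy {
  int64_t TotalSize; // bytes recorded for the whole frame
};

// Frame size associated with a setup instruction: both parts summed.
int64_t totalFrameSize(const FrameSetup &S) {
  return S.SizeInSequence + S.SizeBefore;
}

// The consistency check: setup and destroy must agree on the total size.
bool frameSizesMatch(const FrameSetup &S, const FrameDestroy &D) {
  return totalFrameSize(S) == D.TotalSize;
}
```

With only one operand (SizeBefore fixed at 0), a setup of 8 bytes paired with
a destroy of 20 bytes looks like a stack error. Recording the 12 bytes
prepared beforehand makes the two sides agree.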

The patch implements approach proposed by Quentin Colombet in
https://bugs.llvm.org/show_bug.cgi?id=27481#c1.
It fixes 9 tests that failed with the machine verifier enabled and are
listed in PR27481.

Differential Revision: https://reviews.llvm.org/D32394

llvm-svn: 302527


# cf9daa33 09-May-2017 Amara Emerson <amara.emerson@arm.com>

Introduce experimental generic intrinsics for horizontal vector reductions.

- This change allows targets to opt-in to using them instead of the log2
shufflevector algorithm.
- The SLP and Loop vectorizers have the common code to do shuffle reductions
factored out into LoopUtils, and now have a unified interface for generating
reductions regardless of the preference of the target. LoopUtils now uses TTI
to determine what kind of reductions the target wants to handle.
- For CodeGen, basic legalization support is added.
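
For readers unfamiliar with the "log2 shufflevector algorithm" mentioned
above, the following scalar model shows the idea for an integer add
reduction. It is a sketch, not vectorizer code: shuffleReduceAdd is a
hypothetical name, lanes are modelled as elements of a std::vector, and the
lane count is assumed to be a nonzero power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar model of a log2 shuffle reduction for integer add.
// Assumes V.size() is a nonzero power of two.
int shuffleReduceAdd(std::vector<int> V) {
  assert(!V.empty() && (V.size() & (V.size() - 1)) == 0);
  // Each outer step models one shuffle that moves the upper half of the
  // live lanes down, followed by one vector add on the lower half.
  for (std::size_t Half = V.size() / 2; Half >= 1; Half /= 2) {
    for (std::size_t I = 0; I < Half; ++I)
      V[I] += V[I + Half];
  }
  return V[0]; // models extracting lane 0 as the scalar result
}
```

An 8-lane vector needs three shuffle-and-add steps. A target that opts in to
the reduction intrinsics can replace the whole sequence with its own
lowering.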

Differential Revision: https://reviews.llvm.org/D30086

llvm-svn: 302514


# 41bb9423 09-May-2017 Reid Kleckner <rnk@google.com>

Revert "Don't add DBG_VALUE instructions for static allocas in dbg.declare"

This reverts commit r302461.

It appears to be causing failures compiling gtest with debug info on the
Linux sanitizer bot. I was unable to reproduce the failure locally,
however.

llvm-svn: 302504


# 9f29914d 09-May-2017 Reid Kleckner <rnk@google.com>

Revert "Use the frame index side table for byval and inalloca arguments"

This reverts r302483 and its follow-up fix.

llvm-svn: 302493


# 45efcf0c 08-May-2017 Reid Kleckner <rnk@google.com>

Use the frame index side table for byval and inalloca arguments

Summary:
For inalloca functions, this is a very common code pattern:

  %argpack = type <{ i32, i32, i32 }>
  define void @f(%argpack* inalloca %args) {
  entry:
    %a = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 0
    %b = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 1
    %c = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 2
    tail call void @llvm.dbg.declare(metadata i32* %a, ... "a")
    tail call void @llvm.dbg.declare(metadata i32* %c, ... "b")
    tail call void @llvm.dbg.declare(metadata i32* %b, ... "c")

Even though these GEPs can be simplified to a constant offset from EBP
or RSP, we don't do that at -O0, and each GEP is computed into a
register. Registers used to compute argument addresses are typically
spilled and clobbered very quickly after the initial computation, so
live debug variable tracking loses information very quickly if we use
DBG_VALUE instructions.

This change moves processing of dbg.declare between argument lowering
and basic block isel, so that we can ask if an argument has a frame
index or not. If the argument lives in a register as is the case for
byval arguments on some targets, then we don't put it in the side table
and during ISel we emit DBG_VALUE instructions.

Reviewers: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32980

llvm-svn: 302483


# bf828eed 08-May-2017 Reid Kleckner <rnk@google.com>

Don't add DBG_VALUE instructions for static allocas in dbg.declare

Summary:
An llvm.dbg.declare of a static alloca is always added to the
MachineFunction dbg variable map, so these values are entirely
redundant. They survive all the way through codegen to be ignored by
DWARF emission.

Effectively revert r113967

Two bugpoint-reduced test cases from 2012 broke as a result of this
change. Despite my best efforts, I haven't been able to rewrite the test
case using dbg.value. I'm not too concerned about the lost coverage
because these were reduced from the test-suite, which we still run.

Reviewers: aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32920

llvm-svn: 302461


# 9bcaed86 08-May-2017 Dean Michael Berris <dberris@google.com>

[XRay] Custom event logging intrinsic

This patch introduces an LLVM intrinsic and a target opcode for custom event
logging in XRay. Initially, its use case will be to allow users of XRay to log
some type of string ("poor man's printf"). The target opcode compiles to a noop
sled large enough to enable calling through to a runtime-determined relative
function call. At runtime, when X-Ray is enabled, the sled is replaced by
compiler-rt with a trampoline to the logic for creating the custom log entries.

Future patches will implement the compiler-rt parts and clang-side support for
emitting the IR corresponding to this intrinsic.

Reviewers: timshen, dberris

Subscribers: igorb, pelikan, rSerge, timshen, echristo, dberris, llvm-commits

Differential Revision: https://reviews.llvm.org/D27503

llvm-svn: 302405


# ac1a97b3 05-May-2017 Reid Kleckner <rnk@google.com>

Simplify dbg.value handling in SDISel with early returns

No functional change other than improving dbgs logging accuracy on
constant dbg values. Previously we would add things like "i32 42" as
debug values, and then log that we were dropping the debug info, which
is silly.

Delete some dead code that was checking for static allocas. This
remained after r207165, but served no purpose. Currently, static alloca
dbg.values are always sent through the DanglingDebugInfoMap, and are
usually made valid the first time the alloca is used.

llvm-svn: 302267


# 89ad89cc 02-May-2017 Simon Pilgrim <llvm-dev@redking.me.uk>

[SelectionDAG] Improve support for promotion of <1 x fX> floating point argument types (PR31088)

PR31088 demonstrated that we were assuming that only integers require promotion from <1 x iX> types, when in fact float types may require it as well - in this case half floats.

This patch adds support for extension/truncation for both integer and float types.

Differential Revision: https://reviews.llvm.org/D32391

llvm-svn: 301910


# d28f0cd4 01-May-2017 Amara Emerson <amara.emerson@arm.com>

Generalize the specialized flag-carrying SDNodes by moving flags into SDNode.

This removes BinaryWithFlagsSDNode, and flags are now all passed by value.

Differential Revision: https://reviews.llvm.org/D32527

llvm-svn: 301803


# 6652a52e 28-Apr-2017 Reid Kleckner <rnk@google.com>

Use Argument::hasAttribute and AttributeList::ReturnIndex more

This eliminates many extra 'Idx' induction variables in loops over
arguments in CodeGen/ and Target/. It also reduces the number of places
where we assume that ReturnIndex is 0 and that we should add one to
argument numbers to get the corresponding attribute list index.

NFC

llvm-svn: 301666


# 919f9e8d 28-Apr-2017 Jun Bum Lim <junbuml@codeaurora.org>

[InlineCost] Improve the cost heuristic for Switch

Summary:
The motivation example is like below which has 13 cases but only 2 distinct targets

```
lor.lhs.false2:                                   ; preds = %if.then
  switch i32 %Status, label %if.then27 [
    i32 -7012, label %if.end35
    i32 -10008, label %if.end35
    i32 -10016, label %if.end35
    i32 15000, label %if.end35
    i32 14013, label %if.end35
    i32 10114, label %if.end35
    i32 10107, label %if.end35
    i32 10105, label %if.end35
    i32 10013, label %if.end35
    i32 10011, label %if.end35
    i32 7008, label %if.end35
    i32 7007, label %if.end35
    i32 5002, label %if.end35
  ]
```
which is compiled into a balanced binary tree like this on AArch64 (similar on X86)

```
.LBB853_9:                              // %lor.lhs.false2
        mov     w8, #10012
        cmp     w19, w8
        b.gt    .LBB853_14
// BB#10:                               // %lor.lhs.false2
        mov     w8, #5001
        cmp     w19, w8
        b.gt    .LBB853_18
// BB#11:                               // %lor.lhs.false2
        mov     w8, #-10016
        cmp     w19, w8
        b.eq    .LBB853_23
// BB#12:                               // %lor.lhs.false2
        mov     w8, #-10008
        cmp     w19, w8
        b.eq    .LBB853_23
// BB#13:                               // %lor.lhs.false2
        mov     w8, #-7012
        cmp     w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_14:                             // %lor.lhs.false2
        mov     w8, #14012
        cmp     w19, w8
        b.gt    .LBB853_21
// BB#15:                               // %lor.lhs.false2
        mov     w8, #-10105
        add     w8, w19, w8
        cmp     w8, #9                  // =9
        b.hi    .LBB853_17
// BB#16:                               // %lor.lhs.false2
        orr     w9, wzr, #0x1
        lsl     w8, w9, w8
        mov     w9, #517
        and     w8, w8, w9
        cbnz    w8, .LBB853_23
.LBB853_17:                             // %lor.lhs.false2
        mov     w8, #10013
        cmp     w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_18:                             // %lor.lhs.false2
        mov     w8, #-7007
        add     w8, w19, w8
        cmp     w8, #2                  // =2
        b.lo    .LBB853_23
// BB#19:                               // %lor.lhs.false2
        mov     w8, #5002
        cmp     w19, w8
        b.eq    .LBB853_23
// BB#20:                               // %lor.lhs.false2
        mov     w8, #10011
        cmp     w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_21:                             // %lor.lhs.false2
        mov     w8, #14013
        cmp     w19, w8
        b.eq    .LBB853_23
// BB#22:                               // %lor.lhs.false2
        mov     w8, #15000
        cmp     w19, w8
        b.ne    .LBB853_3
```
However, the inline cost model estimates the cost to be linear with the number
of distinct targets and the cost of the above switch is just 2 InstrCosts.
The function containing this switch is then inlined about 900 times.

This change uses the general way of switch lowering for the inline heuristic. It
estimates the number of case clusters with the suitability check for a jump table
or bit test. Considering the binary search tree built for the clusters, this
change modifies the model to be linear with the size of the balanced binary
tree. The model is off by default for now:
-inline-generic-switch-cost=false
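
To make the gap between the two estimates concrete, here is a rough sketch
that compares them on a list of cases. It is not the InlineCost code: Case,
distinctTargets and caseClusters are hypothetical names, and a "cluster" here
is only a maximal run of adjacent case values with the same target. The
sketch ignores jump tables and bit tests, so it overstates the cluster count
that real switch lowering would produce.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Case = std::pair<int64_t, std::string>; // (case value, target label)

// Old-style estimate: one unit per distinct successor, default included.
std::size_t distinctTargets(const std::vector<Case> &Cases,
                            const std::string &Default) {
  std::set<std::string> Targets{Default};
  for (const Case &C : Cases)
    Targets.insert(C.second);
  return Targets.size();
}

// Cluster count: maximal runs of adjacent values that share a target.
std::size_t caseClusters(std::vector<Case> Cases) {
  std::sort(Cases.begin(), Cases.end());
  std::size_t Clusters = 0;
  for (std::size_t I = 0; I < Cases.size(); ++I) {
    bool Extends = I > 0 && Cases[I].first == Cases[I - 1].first + 1 &&
                   Cases[I].second == Cases[I - 1].second;
    if (!Extends)
      ++Clusters; // this case starts a new cluster
  }
  return Clusters;
}
```

For the 13-case switch shown earlier, distinctTargets gives 2 while
caseClusters gives 12, because only 7007 and 7008 are adjacent. A cost that
grows with the size of the search tree over the clusters is much closer to
the compare-and-branch sequence in the assembly listing.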

This change was originally proposed by Haicheng in D29870.

Reviewers: hans, bmakam, chandlerc, eraman, haicheng, mcrosier

Reviewed By: hans

Subscribers: joerg, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D31085

llvm-svn: 301649


# d0af7e8a 28-Apr-2017 Craig Topper <craig.topper@gmail.com>

[SelectionDAG] Use KnownBits struct in DAG's computeKnownBits and simplifyDemandedBits

This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently.

This is largely a mechanical transformation from KnownZero to Known.Zero.
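
The shape of the change can be illustrated with a simplified stand-in. This
is a sketch only: KnownBits64 and knownBitsOfAnd are hypothetical names, and
plain 64-bit integers are used where the real struct holds APInt values.

```cpp
#include <cassert>
#include <cstdint>

// Simplified 64-bit stand-in for a known-bits pair.
struct KnownBits64 {
  uint64_t Zero = 0; // bits known to be 0
  uint64_t One = 0;  // bits known to be 1
};

// Known bits of (A & B): a result bit is known 1 only if it is known 1 in
// both inputs, and known 0 if it is known 0 in either input.
KnownBits64 knownBitsOfAnd(const KnownBits64 &A, const KnownBits64 &B) {
  KnownBits64 R;
  R.One = A.One & B.One;
  R.Zero = A.Zero | B.Zero;
  return R;
}
```

Passing one struct instead of two parallel integers keeps the Zero and One
masks together, which is what turns a KnownZero variable into Known.Zero at
each use.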

Differential Revision: https://reviews.llvm.org/D32569

llvm-svn: 301620


# 37ef04ad 25-Apr-2017 Simon Pilgrim <llvm-dev@redking.me.uk>

[SelectionDAG] Use getBuildVector helper where possible. NFCI

llvm-svn: 301314


# 986d73cc 25-Apr-2017 Simon Pilgrim <llvm-dev@redking.me.uk>

[SelectionDAG] Pull out repeated getValueType calls. NFCI.

Noticed in D32391.

llvm-svn: 301308


# c8e8e2a0 24-Apr-2017 Krzysztof Parzyszek <kparzysz@codeaurora.org>

Move value type list from TargetRegisterClass to TargetRegisterInfo

Differential Revision: https://reviews.llvm.org/D31937

llvm-svn: 301234


# 98ab4c64 24-Apr-2017 Krzysztof Parzyszek <kparzysz@codeaurora.org>

Revert r301231: Accidentally committed stale files

I forgot to commit local changes before commit.

llvm-svn: 301232


# c0197066 24-Apr-2017 Krzysztof Parzyszek <kparzysz@codeaurora.org>

Move value type list from TargetRegisterClass to TargetRegisterInfo

Differential Revision: https://reviews.llvm.org/D31937

llvm-svn: 301231


# fd23a0c0 24-Apr-2017 Yaxun Liu <Yaxun.Liu@amd.com>

CodeGen: Add a hook for getFenceOperandTy

Currently the operand type for ATOMIC_FENCE assumes the value type of a pointer in address space 0.
This is fine for most targets. However, for the amdgcn target, the size of a pointer in address space 0
depends on the triple environment. For the amdgiz environment it is 64 bit, but for other environments it is
32 bit. On the other hand, the amdgcn target expects 32 bit fence operands independent of the target
triple environment. Therefore a hook is needed in target lowering for getting the fence operand type.

This patch has no effect on targets other than amdgcn.

Differential Revision: https://reviews.llvm.org/D32186

llvm-svn: 301215


# 5d977f8e 20-Apr-2017 Yaxun Liu <Yaxun.Liu@amd.com>

CodeGen: Let frame index value type match alloca addr space

Recently the alloca address space has been added to the data layout. Due to
this change, the pointer returned by alloca may have a different size than a
pointer in address space 0.

However, currently the value type of a frame index is assumed to be of the
same size as a pointer in address space 0.

This patch fixes that.

Most targets assume that alloca returns a pointer in address space 0, which
is the default alloca address space. Therefore it is NFC for them.

The AMDGCN target with the amdgiz environment requires this change since it
assumes that alloca returns a pointer to addr space 5 whose size is 32,
which is different from the size of a pointer in addr space 0, which is 64.

Differential Revision: https://reviews.llvm.org/D32021

llvm-svn: 300864


# 6825fb64 18-Apr-2017 Adrian Prantl <aprantl@apple.com>

PR32382: Fix emitting complex DWARF expressions.

The DWARF specification knows 3 kinds of non-empty simple location
descriptions:
1. Register location descriptions
   - describe a variable in a register
   - consist of only a DW_OP_reg
2. Memory location descriptions
   - describe the address of a variable
3. Implicit location descriptions
   - describe the value of a variable
   - end with DW_OP_stack_value & friends

The existing DwarfExpression code is pretty much ignorant of these
restrictions. This used to not matter because we only emitted very
short expressions that we happened to get right by accident. This
patch makes DwarfExpression aware of the rules defined by the DWARF
standard and now chooses the right kind of location description for
each expression being emitted.

This would have been an NFC commit (for the existing testsuite) if not
for the way that clang describes captured block variables. Based on
how the previous code in LLVM emitted locations, DW_OP_deref
operations that should have come at the end of the expression are put
at its beginning. Fixing this means changing the semantics of
DIExpression, so this patch bumps the version number of DIExpression
and implements a bitcode upgrade.

There are two major changes in this patch:

I had to fix the semantics of dbg.declare for describing function
arguments. After this patch a dbg.declare always takes the *address*
of a variable as the first argument, even if the argument is not an
alloca.

When lowering a DBG_VALUE, the decision of whether to emit a register
location description or a memory location description depends on the
MachineLocation — register machine locations may get promoted to
memory locations based on their DIExpression. (Future) optimization
passes that want to salvage implicit debug location for variables may
do so by appending a DW_OP_stack_value. For example:
DBG_VALUE, [RBP-8] --> DW_OP_fbreg -8
DBG_VALUE, RAX --> DW_OP_reg0 +0
DBG_VALUE, RAX, DIExpression(DW_OP_deref) --> DW_OP_reg0 +0
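
The three kinds of location description can be told apart with a very small
decision procedure. The sketch below is one simplified reading of the rules
listed in this message, not the DwarfExpression implementation: LocKind, Op
and classify are hypothetical names, and only a few operations are modelled.

```cpp
#include <cassert>
#include <vector>

enum class LocKind { Register, Memory, Implicit };
enum class Op { Deref, PlusUConst, StackValue }; // small illustrative subset

// InRegister: the machine location is a register rather than a stack slot.
// Expr: the operations of the attached expression, in order.
LocKind classify(bool InRegister, const std::vector<Op> &Expr) {
  // An expression that ends in a stack-value operation describes a value.
  if (!Expr.empty() && Expr.back() == Op::StackValue)
    return LocKind::Implicit;
  // A bare register with no further operations stays a register location.
  if (InRegister && Expr.empty())
    return LocKind::Register;
  // Everything else describes an address, including a register location
  // that is promoted because of its expression.
  return LocKind::Memory;
}
```

Under this reading, a register with an empty expression is a register
location, the same register with a dereference is treated as a memory
location, and any expression that ends in the stack-value operation is an
implicit location.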

All testcases that were modified were regenerated from clang. I also
added source-based testcases for each of these to the debuginfo-tests
repository over the last week to make sure that no synchronized bugs
slip in. The debuginfo-tests compile from source and run the debugger.

https://bugs.llvm.org/show_bug.cgi?id=32382
<rdar://problem/31205000>

Differential Revision: https://reviews.llvm.org/D31439

llvm-svn: 300522


# fb502d2f 14-Apr-2017 Reid Kleckner <rnk@google.com>

[IR] Make paramHasAttr to use arg indices instead of attr indices

This avoids the confusing 'CS.paramHasAttr(ArgNo + 1, Foo)' pattern.

Previously we were testing return value attributes with index 0, so I
introduced hasReturnAttr() for that use case.
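
The indexing convention behind this change can be shown with a toy model.
This is a sketch, not the LLVM classes: ToyCall, AttrSlots and hasAttrAtIndex
are hypothetical, and attributes are modelled as plain strings. It assumes
the layout implied above, where slot 0 holds return attributes and slot
ArgNo + 1 holds the attributes of argument ArgNo.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy attribute list: slot 0 is the return value, slot ArgNo + 1 is
// argument ArgNo.
struct ToyCall {
  std::vector<std::set<std::string>> AttrSlots;

  // Raw query by attribute-list index.
  bool hasAttrAtIndex(unsigned AttrIdx, const std::string &A) const {
    return AttrIdx < AttrSlots.size() && AttrSlots[AttrIdx].count(A) != 0;
  }

  // Argument-indexed query: the + 1 lives here, not at every call site.
  bool paramHasAttr(unsigned ArgNo, const std::string &A) const {
    return hasAttrAtIndex(ArgNo + 1, A);
  }

  // Return-value query, replacing lookups that passed index 0.
  bool hasReturnAttr(const std::string &A) const {
    return hasAttrAtIndex(0, A);
  }
};
```

Callers that loop over arguments can then pass the argument number directly,
and code that used to pass index 0 for the return value calls hasReturnAttr
instead.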

llvm-svn: 300367

