#
f7e4388e |
| 07-Apr-2017 |
Simon Dardis <simon.dardis@imgtec.com> |
Revert "[SelectionDAG] Enable target specific vector scalarization of calls and returns"
This reverts commit r299766. This change appears to have broken the MIPS buildbots. Reverting while I investigate.
Revert "[mips] Remove usage of debug only variable (NFC)"
This reverts commit r299769. Follow up commit.
llvm-svn: 299788
|
#
6470ff0b |
| 07-Apr-2017 |
Simon Dardis <simon.dardis@imgtec.com> |
[SelectionDAG] Enable target specific vector scalarization of calls and returns
By making getRegisterType, getNumRegisters, and getVectorBreakdown target hooks, backends can request that LLVM scalarize vector types for calls and returns.
The MIPS vector ABI requires that vector arguments and returns are passed in integer registers. With SelectionDAG's new hooks, the MIPS backend can now handle LLVM-IR with vector types in calls and returns. E.g. 'call @foo(<4 x i32> %4)'.
Previously, these cases would be scalarized for the MIPS O32/N32/N64 ABIs for calls and returns if vector types were not legal. If vector types were legal, a single 128-bit vector argument would be assigned to a single 32-bit / 64-bit integer register.
By teaching the MIPS backend to inspect the original types, it can now implement the MIPS vector ABI which requires a particular method of scalarizing vectors.
Previously, the MIPS backend relied on clang to scalarize types such as "call @foo(<4 x float> %a)" into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3, i32 inreg %4)".
This patch enables the MIPS backend to take either form for vector types.
Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur
Differential Revision: https://reviews.llvm.org/D27845
llvm-svn: 299766
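The scalarized call form quoted in the commit above can be pictured with a small sketch. It models only the bit-pattern reinterpretation of a <4 x float> value as four 32-bit integers, one per lane. The function name scalarizeV4F32 is hypothetical, and the register assignment order and endianness rules of the MIPS vector ABI are not modeled here.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not LLVM code: reinterpret each float lane of a
// 4-element vector as the 32-bit integer that would travel in an integer
// register. Assumes float is a 32-bit type.
std::array<std::uint32_t, 4> scalarizeV4F32(const std::array<float, 4> &V) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "sketch assumes 32-bit float");
  std::array<std::uint32_t, 4> Regs{};
  for (std::size_t I = 0; I < V.size(); ++I)
    std::memcpy(&Regs[I], &V[I], sizeof(float)); // bit copy, no conversion
  return Regs;
}
```

The memcpy keeps the bits unchanged, which is the point of passing a float lane in an integer register: no numeric conversion takes place.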
|
#
92a5cf43 |
| 28-Mar-2017 |
Adam Nemet <anemet@apple.com> |
[SDAG] Remove -enable-fmf-dag
This is no longer needed as spotted by Sanjay in https://reviews.llvm.org/D31165.
llvm-svn: 298963
|
#
6820f391 |
| 28-Mar-2017 |
Adam Nemet <anemet@apple.com> |
[SDAG] Add AllowContract to SNodeFlags
Properly propagate the FMF from the LLVM IR to this flag.
This is toward moving fp-contraction=fast from an LLVM TargetOption to a FastMathFlag in order to fix PR25721.
Differential Revision: https://reviews.llvm.org/D31165
llvm-svn: 298961
|
#
f01a1dad |
| 28-Mar-2017 |
Sanjay Patel <spatel@rotateright.com> |
[x86] use VPMOVMSK to replace memcmp libcalls for 32-byte equality
Follow-up to: https://reviews.llvm.org/rL298775
llvm-svn: 298933
|
#
9ebb6884 |
| 25-Mar-2017 |
Sanjay Patel <spatel@rotateright.com> |
[x86] use PMOVMSK to replace memcmp libcalls for 16-byte equality
This is the payoff for D31156 - if a target has efficient comparison instructions for vector-sized equality, we can replace memcmp calls with inline code that is both smaller and faster.
Differential Revision: https://reviews.llvm.org/D31290
llvm-svn: 298775
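As a rough picture of the inline code the commit above refers to, here is a hand-written 16-byte equality check using SSE2 intrinsics. This is a sketch, not the code the backend emits: the function name equal16 is hypothetical, and the memcmp fallback exists only so the example builds on targets without SSE2.

```cpp
#include <cstring>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical helper: returns true when the 16 bytes at A and B are equal.
bool equal16(const void *A, const void *B) {
#if defined(__SSE2__)
  // Unaligned loads of both 16-byte blocks.
  __m128i VA = _mm_loadu_si128(static_cast<const __m128i *>(A));
  __m128i VB = _mm_loadu_si128(static_cast<const __m128i *>(B));
  // Byte-wise compare: each equal byte becomes 0xFF, each differing byte 0x00.
  __m128i Eq = _mm_cmpeq_epi8(VA, VB);
  // PMOVMSKB gathers the top bit of every byte; all 16 bits set means equal.
  return _mm_movemask_epi8(Eq) == 0xFFFF;
#else
  return std::memcmp(A, B, 16) == 0; // portable fallback
#endif
}
```

The whole check is two loads, one compare, and one mask extraction, with no call and no loop.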
|
#
b518054b |
| 21-Mar-2017 |
Reid Kleckner <rnk@google.com> |
Rename AttributeSet to AttributeList
Summary: This class is a list of AttributeSetNodes corresponding to the function prototype of a call or function declaration. This class used to be called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is typically accessed by parameter and return value index, so "AttributeList" seems like a more intuitive name.
Rename AttributeSetImpl to AttributeListImpl to follow suit.
It's useful to rename this class so that we can rename AttributeSetNode to AttributeSet later. AttributeSet is the set of attributes that apply to a single function, argument, or return value.
Reviewers: sanjoy, javed.absar, chandlerc, pete
Reviewed By: pete
Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits
Differential Revision: https://reviews.llvm.org/D31102
llvm-svn: 298393
|
#
ac6081cb |
| 18-Mar-2017 |
Nirav Dave <niravd@google.com> |
Make library calls sensitive to regparm module flag (Fixes PR3997).
Reviewers: mkuper, rnk
Subscribers: mehdi_amini, jyknight, aemerson, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D27050
llvm-svn: 298179
|
#
6de2c779 |
| 18-Mar-2017 |
Nirav Dave <niravd@google.com> |
Capitalize ArgListEntry fields. NFC.
llvm-svn: 298178
|
#
45707d4d |
| 16-Mar-2017 |
Reid Kleckner <rnk@google.com> |
Remove getArgumentList() in favor of arg_begin(), args(), etc
Users often call getArgumentList().size(), which is a linear way to get the number of function arguments. arg_size(), on the other hand, is constant time.
In general, the fact that arguments are stored in an iplist is an implementation detail, so I've removed it from the Function interface and moved all other users to the argument container APIs (arg_begin(), arg_end(), args(), arg_size()).
Reviewed By: chandlerc
Differential Revision: https://reviews.llvm.org/D31052
llvm-svn: 298010
|
#
5273afd4 |
| 06-Mar-2017 |
Sanjay Patel <spatel@rotateright.com> |
[DAG] fix typo in comment; NFC
llvm-svn: 297011
|
#
7884dcb7 |
| 02-Mar-2017 |
Sanjay Patel <spatel@rotateright.com> |
[DAG] early exit to improve readability and formatting of visitMemCmpCall(); NFCI
llvm-svn: 296824
|
#
209b0f9a |
| 02-Mar-2017 |
Sanjay Patel <spatel@rotateright.com> |
[DAG] improve documentation comments; NFC
llvm-svn: 296808
|
#
f7aba7ba |
| 02-Mar-2017 |
Sanjay Patel <spatel@rotateright.com> |
fix typo in comment; NFC
llvm-svn: 296760
|
#
f7c0980c |
| 01-Mar-2017 |
Reid Kleckner <rnk@google.com> |
Elide argument copies during instruction selection
Summary: Avoids tons of prologue boilerplate when arguments are passed in memory and left in memory. This can happen in a debug build or in a release build when an argument alloca is escaped. This will dramatically affect the code size of x86 debug builds, because X86 fast isel doesn't handle arguments passed in memory at all. It only handles the x86_64 case of up to 6 basic register parameters.
This is implemented by analyzing the entry block before ISel to identify copy elision candidates. A copy elision candidate is an argument that is used to fully initialize an alloca before any other possibly escaping uses of that alloca. If an argument is a copy elision candidate, we set a flag on the InputArg. If the target generates loads from a fixed stack object that matches the size and alignment requirements of the alloca, the SelectionDAG builder will delete the stack object created for the alloca and replace it with the fixed stack object. The load is left behind to satisfy any remaining uses of the argument value. The store is now dead and is therefore elided. The fixed stack object is also marked as mutable, as it may now be modified by the user, and it would be invalid to rematerialize the initial load from it.
Supersedes D28388
Fixes PR26328
Reviewers: chandlerc, MatzeB, qcolombet, inglorion, hans
Subscribers: igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D29668
llvm-svn: 296683
|
#
96ec7a23 |
| 15-Feb-2017 |
Craig Topper <craig.topper@gmail.com> |
[SelectionDAGBuilder] Simplify creation of shufflevector DAG nodes where inputs are larger than the mask
Summary: The current code loops over all elements to calculate a used range. Then a second short loop looks at the ranges, determines whether they can be used in an extract, and creates a properly aligned start index for the extract.
This range finding is unnecessary: we can just calculate a properly aligned start index for an extract for each input during the first loop. If we don't find the same start index for each index, we can't use an extract.
Reviewers: zvi, RKSimon
Reviewed By: zvi
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29926
llvm-svn: 295152
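The single-pass idea in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the SelectionDAGBuilder code: the name findExtractStarts and its signature are hypothetical, the mask indexes into two concatenated inputs of SrcNumElts elements each, and a negative mask entry stands for an undefined lane.

```cpp
#include <array>
#include <vector>

// Hypothetical helper: for each of the two shuffle inputs, compute the start
// index (aligned to the mask length) of the one chunk that all used elements
// come from. Returns false when an extract cannot be used. On success,
// StartIdx[i] is -1 if input i is not referenced by the mask at all.
bool findExtractStarts(const std::vector<int> &Mask, int SrcNumElts,
                       std::array<int, 2> &StartIdx) {
  const int MaskNumElts = static_cast<int>(Mask.size());
  StartIdx.fill(-1);
  for (int Idx : Mask) {
    if (Idx < 0)
      continue; // undefined lane places no constraint
    int Input = 0;
    if (Idx >= SrcNumElts) { // element comes from the second input
      Input = 1;
      Idx -= SrcNumElts;
    }
    // Round down to a multiple of the mask length.
    const int NewStart = (Idx / MaskNumElts) * MaskNumElts;
    // The extract must not run past the end of the source.
    if (NewStart + MaskNumElts > SrcNumElts)
      return false;
    // Every element of one input must agree on the same chunk.
    if (StartIdx[Input] >= 0 && StartIdx[Input] != NewStart)
      return false;
    StartIdx[Input] = NewStart;
  }
  return true;
}
```

For a 4-element mask over 8-element inputs, the mask {4, 5, 6, 7} yields start index 4 for the first input, while {2, 3, 4, 5} straddles two chunks and is rejected.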
|
#
8f3df731 |
| 13-Feb-2017 |
Arnold Schwaighofer <aschwaighofer@apple.com> |
swiftcc: Don't emit tail calls from callers with swifterror parameters
Backends don't support this yet. They would have to move to the swifterror register before the tail call to make sure it is live-in to the call.
rdar://30495920
llvm-svn: 294982
|
#
7e320c24 |
| 09-Feb-2017 |
Geoff Berry <gberry@codeaurora.org> |
[SelectionDAG] Fix bugs in inverted condition splitting code.
Summary: Fix two bugs in SelectionDAGBuilder::FindMergedConditions reported by Mikael Holmen. Handle non-canonicalized xor not operation correctly (was assuming operand 0 was always the non-constant operand) and check that the negated condition is also in the same block as the original and/or instruction (as is done for and/or operands already) before proceeding with optimization.
Reviewers: bogner, MatzeB, qcolombet
Subscribers: mcrosier, uabelho, llvm-commits
Differential Revision: https://reviews.llvm.org/D29680
llvm-svn: 294605
|
#
0887d44a |
| 07-Feb-2017 |
Reid Kleckner <rnk@google.com> |
[SDAGISel] Simplify some SDAGISel code, NFC
Hoist entry block code for arguments and swift error values out of the basic block instruction selection loop. Lowering arguments once up front seems much more readable than doing it conditionally inside the loop. It also makes it clear that argument lowering can update StaticAllocaMap because no instructions have been selected yet.
Also use range-based for loops where possible.
llvm-svn: 294329
|
#
9677cc6f |
| 03-Feb-2017 |
Ahmed Bougacha <ahmed.bougacha@gmail.com> |
[TLI] Robustize SDAG LibFunc proto checking by merging it into TLI.
This re-applies commit r292189, reverted in r292191.
SelectionDAGBuilder recognizes libfuncs using some homegrown parameter type-checking.
Use TLI instead, removing another heap of redundant code.
This isn't strictly NFC, as the SDAG code was too lax. Concretely, this means changes are required to a few tests:
- calling a non-variadic function via a variadic prototype isn't OK; it just happens to work on x86_64 (but not on, e.g., aarch64).
- mempcpy has a size_t parameter; the SDAG code accepts any integer type, which meant using i32 on x86_64 worked.
- a handful of SystemZ tests check the SDAG support for lax prototype checking: Ulrich agrees on removing them.
I don't think it's worth supporting any of these (IMO) invalid testcases. Instead, fix them to be more meaningful.
llvm-svn: 294028
|
#
a0a1164c |
| 26-Jan-2017 |
Andrew Kaylor <andrew.kaylor@intel.com> |
Add intrinsics for constrained floating point operations
This commit introduces a set of experimental intrinsics intended to prevent optimizations that make assumptions about the rounding mode and floating point exception behavior. These intrinsics will later be extended to specify flush-to-zero behavior. More work is also required to model instruction dependencies in machine code and to generate these instructions from clang (when required by pragmas and/or command line options that are not currently supported).
Differential Revision: https://reviews.llvm.org/D27028
llvm-svn: 293226
|
#
92a286ae |
| 24-Jan-2017 |
Geoff Berry <gberry@codeaurora.org> |
[SelectionDAG] Handle inverted conditions when splitting into multiple branches.
Summary: When conditional branches with complex conditions are split into multiple branches in SelectionDAGBuilder::FindMergedConditions, also handle inverted conditions. These may sometimes appear without having been optimized by InstCombine when CodeGenPrepare decides to sink and duplicate cmp instructions, causing them to have only one use. This problem can be increased by e.g. GVNHoist hiding more cmps from InstCombine by combining equivalent cmps from different blocks.
For example, codegen X & !(Y | Z) as:
    jmp_if_X TmpBB
    jmp FBB
  TmpBB:
    jmp_if_notY Tmp2BB
    jmp FBB
  Tmp2BB:
    jmp_if_notZ TBB
    jmp FBB
Reviewers: bogner, MatzeB, qcolombet
Subscribers: llvm-commits, hiraditya, mcrosier, sebpop
Differential Revision: https://reviews.llvm.org/D28380
llvm-svn: 292944
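The branch sequence in the commit above follows from De Morgan's law: X & !(Y | Z) is the same as X & !Y & !Z, so each leaf can be tested by its own simple branch. The sketch below checks that equivalence exhaustively. All names in it (Target, evalDirect, evalSplit, splitMatchesDirect) are hypothetical and serve only to model the control flow; none of this is LLVM code.

```cpp
// Hypothetical model of the two branch destinations.
enum class Target { TBB, FBB };

// The original condition, evaluated in one piece.
Target evalDirect(bool X, bool Y, bool Z) {
  return (X && !(Y || Z)) ? Target::TBB : Target::FBB;
}

// The split form: one simple conditional branch per leaf condition.
Target evalSplit(bool X, bool Y, bool Z) {
  if (!X)
    return Target::FBB; // jmp_if_X TmpBB; jmp FBB
  // TmpBB:
  if (Y)
    return Target::FBB; // jmp_if_notY Tmp2BB; jmp FBB
  // Tmp2BB:
  if (Z)
    return Target::FBB; // jmp_if_notZ TBB; jmp FBB
  return Target::TBB;
}

// Compare both forms on all eight input combinations.
bool splitMatchesDirect() {
  for (int Bits = 0; Bits < 8; ++Bits) {
    const bool X = (Bits & 1) != 0;
    const bool Y = (Bits & 2) != 0;
    const bool Z = (Bits & 4) != 0;
    if (evalDirect(X, Y, Z) != evalSplit(X, Y, Z))
      return false;
  }
  return true;
}
```

The split form reaches TBB only when X holds and neither Y nor Z does, which matches the original condition on every input.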
|
#
d21529fa |
| 23-Jan-2017 |
David L. Jones <dlj@google.com> |
[Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC)
Summary: The LibFunc::Func enum holds enumerators named for libc functions. Unfortunately, there are real situations, including libc implementations, where function names are actually macros (musl uses "#define fopen64 fopen", for example; any other transitively visible macro would have similar effects).
Strictly speaking, a conforming C++ Standard Library should provide any such macros as functions instead (via <cstdio>). However, there are some "library" functions which are not part of the standard, and thus not subject to this rule (fopen64, for example). So, in order to be both portable and consistent, the enum should not use the bare function names.
The old enum naming used a namespace LibFunc and an enum Func, with bare enumerators. This patch changes LibFunc to be an enum with enumerators prefixed with "LibFunc_". (Unfortunately, a scoped enum is not sufficient to override macros.)
There are additional changes required in clang.
Reviewers: rsmith
Subscribers: mehdi_amini, mzolotukhin, nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D28476
llvm-svn: 292848
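The macro problem described in the commit above can be reproduced in a few lines. This is a hypothetical miniature, not the TargetLibraryInfo definition: the #define stands in for a libc header such as musl's, and the two-entry enum is an assumption made for illustration.

```cpp
// Stand-in for a libc header that implements a function name as a macro,
// as in the musl example quoted above.
#define fopen64 fopen

// With bare enumerators the preprocessor rewrites the name before the
// compiler sees it, even inside a scoped enum. The line below would expand
// to { fopen, fopen } and fail with a duplicate-enumerator error:
//
//   enum class Func { fopen, fopen64 };

// With a prefix, each enumerator is a single token that no function-name
// macro matches, so the declaration is unaffected.
enum LibFunc {
  LibFunc_fopen,
  LibFunc_fopen64,
  NumLibFuncs // hypothetical count enumerator for this miniature
};

// Keep the stand-in macro from leaking into later code.
#undef fopen64
```

The prefix works because macro replacement operates on whole identifier tokens: LibFunc_fopen64 is one token, so the fopen64 macro never applies to it.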
|
#
2074e749 |
| 19-Jan-2017 |
Mikael Holmen <mikael.holmen@ericsson.com> |
[DAG] Don't increase SDNodeOrder for dbg.value/declare.
Summary: The SDNodeOrder is saved in the IROrder field in the SDNode, and this field may affect scheduling. Thus, letting dbg.value/declare increase the order numbers may in turn affect scheduling.
Because of this change we also need to update the code deciding when dbg values should be output, in ScheduleDAGSDNodes.cpp/ProcessSDDbgValues.
Dbg values now have the same order as the SDNode they are connected to, not the following orders.
Test cases provided by Florian Hahn.
Reviewers: bogner, aprantl, sunfish, atrick
Reviewed By: atrick
Subscribers: fhahn, probinson, andreadb, llvm-commits, MatzeB
Differential Revision: https://reviews.llvm.org/D25318
llvm-svn: 292485
|
#
9e5a085c |
| 17-Jan-2017 |
Ahmed Bougacha <ahmed.bougacha@gmail.com> |
Revert "[TLI] Robustize SDAG proto checking by merging it into TLI."
This reverts commit r292189, as it causes issues on SystemZ bots.
llvm-svn: 292191
|