Revision tags: llvmorg-3.5.0-rc3 |
|
#
b9fd9ed3 |
| 07-Aug-2014 |
Eric Christopher <echristo@gmail.com> |
Temporarily Revert "Nuke the old JIT." as it's not quite ready to be deleted. This will be reapplied as soon as possible and before the 3.6 branch date at any rate.
Approved by Jim Grosbach, Lang Hames, Rafael Espindola.
This reverts commits r215111, 215115, 215116, 215117, 215136.
llvm-svn: 215154
|
#
f8b27c41 |
| 07-Aug-2014 |
Rafael Espindola <rafael.espindola@gmail.com> |
Nuke the old JIT.
I am sure we will be finding bits and pieces of dead code for years to come, but this is a good start.
Thanks to Lang Hames for making MCJIT a good replacement!
llvm-svn: 215111
|
Revision tags: llvmorg-3.5.0-rc2 |
|
#
d913448b |
| 04-Aug-2014 |
Eric Christopher <echristo@gmail.com> |
Remove the TargetMachine forwards for TargetSubtargetInfo-based information and update all callers. No functional change.
llvm-svn: 214781
|
#
93046910 |
| 25-Jul-2014 |
Hal Finkel <hfinkel@anl.gov> |
Add @llvm.assume, lowering, and some basic properties
This is the first commit in a series that adds an @llvm.assume intrinsic, which can be used to provide the optimizer with a condition it may assume to be true (when control flow would reach the intrinsic call). Some basic properties are added here:
- llvm.assume(true) is dead.
- llvm.assume(false) is unreachable (this directly corresponds to the documented behavior of MSVC's __assume(0)), as is llvm.assume(undef).
The intrinsic is tagged as writing arbitrarily in order to maintain control dependencies. BasicAA has been updated, however, to return NoModRef for any particular location-based query, so that we don't unnecessarily block code motion.
llvm-svn: 213973
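A minimal IR sketch of the intrinsic in use; the function and value names are illustrative, not taken from the commit:

declare void @llvm.assume(i1)

define i32 @clamp_to_nonneg(i32 %x) {
entry:
  ; Tell the optimizer that %x is non-negative at this point.
  %nonneg = icmp sge i32 %x, 0
  call void @llvm.assume(i1 %nonneg)
  ; A pass consulting the assumption may now fold this later sign test.
  %isneg = icmp slt i32 %x, 0
  %r = select i1 %isneg, i32 0, i32 %x
  ret i32 %r
}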
|
Revision tags: llvmorg-3.5.0-rc1 |
|
#
e03a25da |
| 20-Jun-2014 |
Karthik Bhat <kv.bhat@samsung.com> |
Add Support to Recognize and Vectorize Non-SIMD Instructions in SLPVectorizer.
This patch adds support to recognize patterns such as fadd,fsub,fadd,fsub... / add,sub,add,sub... and vectorize them as vector shuffles if they are profitable. These shuffle patterns can later be converted to instructions such as addsubpd on X86. Thanks to Arnold and Hal for the reviews. http://reviews.llvm.org/D4015
llvm-svn: 211339
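A hand-written sketch of the kind of shuffle this aims to produce; the function name and two-lane double type are illustrative:

; Alternating sub/add across lanes: lane 0 takes a0-b0, lane 1 takes a1+b1.
; On X86 this shuffle of an fadd and an fsub can be matched to addsubpd.
define <2 x double> @addsub(<2 x double> %a, <2 x double> %b) {
entry:
  %add = fadd <2 x double> %a, %b
  %sub = fsub <2 x double> %a, %b
  %r = shufflevector <2 x double> %sub, <2 x double> %add, <2 x i32> <i32 0, i32 3>
  ret <2 x double> %r
}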
|
Revision tags: llvmorg-3.4.2, llvmorg-3.4.2-rc1 |
|
#
e8172d85 |
| 08-May-2014 |
Hal Finkel <hfinkel@anl.gov> |
Fix a spelling error
llvm-svn: 208314
|
#
6532c20f |
| 08-May-2014 |
Hal Finkel <hfinkel@anl.gov> |
Move late partial-unrolling thresholds into the processor definitions
The old method used by X86TTI to determine partial-unrolling thresholds was messy (because it worked by testing target features), and also would not correctly identify the target CPU if certain target features were disabled. After some discussions on IRC with Chandler et al., it was decided that the processor scheduling models were the right containers for this information (because it is often tied to special uop dispatch-buffer sizes).
This does represent a small functionality change:
- For generic x86-64 (which uses the SB model and, thus, will get some unrolling).
- For AMD cores (because they still currently use the SB scheduling model).
- For Haswell (based on benchmarking by Louis Gerbarg, it was decided to bump the default threshold to 50; we're working on a test case for this).
Otherwise, nothing has changed for any other targets. The logic, however, has been moved into BasicTTI, so other targets may now also opt in to this functionality simply by setting LoopMicroOpBufferSize in their processor model definitions.
llvm-svn: 208289
|
#
1625bfcc |
| 06-May-2014 |
Benjamin Kramer <benny.kra@googlemail.com> |
TTI: Estimate @llvm.fmuladd cost as fmul + fadd when FMAs aren't legal on the target.
llvm-svn: 208115
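An illustrative IR snippet of the call being costed; the function name is made up. On a target without a legal FMA instruction the estimate is now the cost of a separate fmul plus fadd:

declare float @llvm.fmuladd.f32(float, float, float)

define float @muladd(float %a, float %b, float %c) {
entry:
  ; Computes a*b+c; costed (and lowered) as fmul+fadd when FMA is not legal.
  %r = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
  ret float %r
}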
|
Revision tags: llvmorg-3.4.1, llvmorg-3.4.1-rc2 |
|
#
1b9dde08 |
| 22-Apr-2014 |
Chandler Carruth <chandlerc@gmail.com> |
[Modules] Remove potential ODR violations by sinking the DEBUG_TYPE define below all header includes in the lib/CodeGen/... tree. While the current modules implementation doesn't check for this kind of ODR violation yet, it is likely to grow support for it in the future. It also removes one layer of macro pollution across all the included headers.
Other sub-trees will follow.
llvm-svn: 206837
|
#
56bf297e |
| 14-Apr-2014 |
Hal Finkel <hfinkel@anl.gov> |
Don't assert in BasicTTI::getMemoryOpCost for non-simple types
BasicTTI::getMemoryOpCost must explicitly check for non-simple types; setting AllowUnknown=true with TLI->getSimpleValueType is not sufficient because, for example, non-power-of-two vector types return non-simple EVTs (not MVT::Other).
llvm-svn: 206150
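An illustrative case of the kind of type involved, written in current opaque-pointer IR syntax with a made-up function name:

; A non-power-of-two vector type such as <3 x float> is the kind of type the
; commit describes: TLI->getSimpleValueType cannot be assumed to succeed for
; it, so the cost code must tolerate a non-simple EVT instead of asserting.
define <3 x float> @odd_load(ptr %p) {
entry:
  %v = load <3 x float>, ptr %p, align 4
  ret <3 x float> %v
}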
|
#
c0196b1b |
| 14-Apr-2014 |
Craig Topper <craig.topper@gmail.com> |
[C++11] More 'nullptr' conversion. In some cases just using a boolean check instead of comparing to nullptr.
llvm-svn: 206142
|
Revision tags: llvmorg-3.4.1-rc1 |
|
#
6fd19ab3 |
| 03-Apr-2014 |
Hal Finkel <hfinkel@anl.gov> |
Account for scalarization costs in BasicTTI::getMemoryOpCost for extending vector loads
When a vector type legalizes to a larger vector type, and the target does not support the associated extending load (or truncating store), legalization will scalarize the load (or store), resulting in an associated scalarization cost. BasicTTI::getMemoryOpCost needs to account for this.
Between this and r205487, PowerPC on the P7 with VSX enabled shows:
- MultiSource/Benchmarks/PAQ8p/paq8p: 43% speedup
- SingleSource/Benchmarks/BenchmarkGame/puzzle: 51% speedup
- SingleSource/UnitTests/Vectorizer/gcc-loops: 28% speedup
(some of these are new; some, such as PAQ8p, just reverse regressions that VSX support would otherwise trigger)
llvm-svn: 205495
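An illustrative extending-load pattern of the kind being costed; the types and function name are chosen for illustration, in current opaque-pointer syntax:

; If the target has no extending load from <4 x i8> to <4 x i32>, legalization
; scalarizes the load, and getMemoryOpCost should now include that cost.
define <4 x i32> @ext_load(ptr %p) {
entry:
  %v = load <4 x i8>, ptr %p, align 4
  %e = zext <4 x i8> %v to <4 x i32>
  ret <4 x i32> %e
}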
|
#
55312deb |
| 02-Apr-2014 |
Hal Finkel <hfinkel@anl.gov> |
Fix multi-register costs in BasicTTI::getCastInstrCost
For a cast (extension, etc.), the current logic predicts a low cost if the associated operation (keyed on the destination type) is legal (or promoted). This is not true when the number of values required to legalize the type changes. For example, sign extending <8 x i16> to <8 x i32> is not generically cheap on PPC with VSX, even though sign extension to v4i32 is legal, because two v4i32 output values are required for the single v8i16 input value, and without custom logic in the target, this conversion will scalarize.
llvm-svn: 205487
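The example from the message, written out as IR (the function name is illustrative):

; On a target whose widest legal vector is 128 bits, the <8 x i32> result
; needs two v4i32 registers for the single v8i16 input, so this is not a
; single cheap cast even though sext to v4i32 is legal.
define <8 x i32> @wide_sext(<8 x i16> %v) {
entry:
  %e = sext <8 x i16> %v to <8 x i32>
  ret <8 x i32> %e
}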
|
#
ce376c0f |
| 10-Mar-2014 |
Raul E. Silvera <rsilvera@google.com> |
When analyzing vectors whose element type requires legalization, the legalization cost must be included to get an accurate estimate of the total cost of the scalarized vector. The inaccurate cost triggered unprofitable SLP vectorization on 32-bit X86.
Summary: Include legalization overhead when computing scalarization cost
Reviewers: hfinkel, nadav
CC: chandlerc, rnk, llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2992
llvm-svn: 203509
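A hypothetical case of the kind described, assuming 32-bit X86 where i64 is not a legal scalar type; the function name is made up:

; Each extracted i64 lane must itself be legalized (split into i32 halves on
; 32-bit X86), so the true scalarization cost exceeds a simple per-element count.
define <4 x i64> @scalarized_mul(<4 x i64> %a, <4 x i64> %b) {
entry:
  %r = mul <4 x i64> %a, %b
  ret <4 x i64> %r
}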
|
#
24e685fd |
| 10-Mar-2014 |
Craig Topper <craig.topper@gmail.com> |
[C++11] Remove 'virtual' keyword from methods marked with 'override' keyword.
llvm-svn: 203444
|
#
aee3ca6c |
| 10-Mar-2014 |
Chandler Carruth <chandlerc@gmail.com> |
[TTI] There is actually no realistic way to pop TTI implementations off the stack of the analysis group because they are all immutable passes. This is made clear by Craig's recent work to use override systematically -- we weren't overriding anything for 'finalizePass' because there is no such thing.
This is kind of a lame restriction on the API -- we can no longer push and pop things, we just set up the stack and run. However, I'm not invested in building some better solution on top of the existing (terrifying) immutable pass and legacy pass manager.
llvm-svn: 203437
|
#
73156025 |
| 02-Mar-2014 |
Craig Topper <craig.topper@gmail.com> |
Switch all uses of LLVM_OVERRIDE to just use 'override' directly.
llvm-svn: 202621
|
#
77dfe45f |
| 02-Mar-2014 |
Craig Topper <craig.topper@gmail.com> |
Switch all uses of LLVM_FINAL to just use 'final', and remove the macro.
llvm-svn: 202618
|
#
3e752e7a |
| 24-Jan-2014 |
Juergen Ributzka <juergen@apple.com> |
Add final and override keywords to TargetTransformInfo's subclasses.
llvm-svn: 200021
|
Revision tags: llvmorg-3.4.0, llvmorg-3.4.0-rc3, llvmorg-3.4.0-rc2, llvmorg-3.4.0-rc1 |
|
#
cae8735a |
| 17-Sep-2013 |
Arnold Schwaighofer <aschwaighofer@apple.com> |
Costmodel: Add support for horizontal vector reductions
Upcoming SLP vectorization improvements will want to be able to estimate costs of horizontal reductions. Add infrastructure to support this.
We model reductions as a series of (shufflevector,add) tuples ultimately followed by an extractelement. For example, for an add-reduction of <4 x float> we could generate the following sequence:
(v0, v1, v2, v3)
-> (v0+v2, v1+v3, undef, undef)
-> ((v0+v2) + (v1+v3), undef, undef)
%rdx.shuf = shufflevector <4 x float> %rdx, <4 x float> undef, <4 x i32> <i32 2, i32 3, i32 undef, i32 undef>
%bin.rdx = fadd <4 x float> %rdx, %rdx.shuf
%rdx.shuf7 = shufflevector <4 x float> %bin.rdx, <4 x float> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
%bin.rdx8 = fadd <4 x float> %bin.rdx, %rdx.shuf7
%r = extractelement <4 x float> %bin.rdx8, i32 0
This commit adds a cost model interface "getReductionCost(Opcode, Ty, Pairwise)" that will allow clients to ask for the cost of such a reduction (as backends might generate more efficient code than the summed cost of the individual instructions would suggest). This interface is exercised by the CostModel analysis pass, which looks for reduction patterns like the one above - starting at extractelements - and, if it sees a matching sequence, calls the cost model interface.
We will also support a second form of pairwise reduction that is well supported on common architectures (haddps, vpadd, faddp).
(v0, v1, v2, v3)
-> (v0+v1, v2+v3, undef, undef)
-> ((v0+v1)+(v2+v3), undef, undef, undef)
%rdx.shuf.0.0 = shufflevector <4 x float> %rdx, <4 x float> undef, <4 x i32> <i32 0, i32 2, i32 undef, i32 undef>
%rdx.shuf.0.1 = shufflevector <4 x float> %rdx, <4 x float> undef, <4 x i32> <i32 1, i32 3, i32 undef, i32 undef>
%bin.rdx.0 = fadd <4 x float> %rdx.shuf.0.0, %rdx.shuf.0.1
%rdx.shuf.1.0 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef, <4 x i32> <i32 0, i32 undef, i32 undef, i32 undef>
%rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
%bin.rdx.1 = fadd <4 x float> %rdx.shuf.1.0, %rdx.shuf.1.1
%r = extractelement <4 x float> %bin.rdx.1, i32 0
llvm-svn: 190876
|
#
8f2e7005 |
| 11-Sep-2013 |
Hal Finkel <hfinkel@anl.gov> |
Add getUnrollingPreferences to TTI
Allow targets to customize the default behavior of the generic loop unrolling transformation. This will be used by the PowerPC backend when targeting the A2 core (which is in-order with a deep pipeline), and using more aggressive defaults is important.
llvm-svn: 190542
|
#
8e83820a |
| 29-Aug-2013 |
Hal Finkel <hfinkel@anl.gov> |
Revert: r189565 - Add getUnrollingPreferences to TTI
Revert unintentional commit (of an unreviewed change).
Original commit message:
Add getUnrollingPreferences to TTI
Allow targets to customize the default behavior of the generic loop unrolling transformation. This will be used by the PowerPC backend when targeting the A2 core (which is in-order with a deep pipeline), and using more aggressive defaults is important.
llvm-svn: 189566
|
#
63e6c0e9 |
| 29-Aug-2013 |
Hal Finkel <hfinkel@anl.gov> |
Add getUnrollingPreferences to TTI
Allow targets to customize the default behavior of the generic loop unrolling transformation. This will be used by the PowerPC backend when targeting the A2 core (which is in-order with a deep pipeline), and using more aggressive defaults is important.
llvm-svn: 189565
|
#
37cd6cfb |
| 23-Aug-2013 |
Richard Sandiford <rsandifo@linux.vnet.ibm.com> |
Turn MipsOptimizeMathLibCalls into a target-independent scalar transform
...so that it can be used for z too. Most of the code is the same. The only real change is to use TargetTransformInfo to test when a sqrt instruction is available.
The pass is opt-in because at the moment it only handles sqrt.
llvm-svn: 189097
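A sketch of the shape of the transform, assuming the target reports a hardware sqrt; the names and the exact errno guard are illustrative, not taken from the pass:

declare double @sqrt(double)
declare double @llvm.sqrt.f64(double)

define double @fast_sqrt(double %x) {
entry:
  ; Fast path: hardware sqrt via the intrinsic.
  %fast = call double @llvm.sqrt.f64(double %x)
  ; If the result is NaN (negative input), fall back to the libcall so that
  ; errno is still set as the C library requires.
  %isnan = fcmp uno double %fast, %fast
  br i1 %isnan, label %slowpath, label %done

slowpath:
  %slow = call double @sqrt(double %x)
  br label %done

done:
  %r = phi double [ %fast, %entry ], [ %slow, %slowpath ]
  ret double %r
}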
|
#
0c5c01aa |
| 19-Aug-2013 |
Hal Finkel <hfinkel@anl.gov> |
Add a llvm.copysign intrinsic
This adds a llvm.copysign intrinsic; we already have Libfunc recognition for copysign (which is turned into the FCOPYSIGN SDAG node). In order to autovectorize calls to copysign in the loop vectorizer, we need a corresponding intrinsic as well.
In addition to the expected changes to the language reference, the loop vectorizer, BasicTTI, and the SDAG builder (the intrinsic is transformed into an FCOPYSIGN node, just like the function call), this also adds FCOPYSIGN to a few lists in LegalizeVector{Ops,Types} so that vector copysigns can be expanded.
In TargetLoweringBase::initActions, I've made the default action for FCOPYSIGN be Expand for vector types. This seems correct for all in-tree targets, and I think it is the right thing to do because, previously, there was no way to generate vector-valued FCOPYSIGN nodes (and most targets don't specify an action for vector-typed FCOPYSIGN).
llvm-svn: 188728
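An illustrative vector form of the intrinsic, of the sort the loop vectorizer can now emit; the type and function name are illustrative:

declare <4 x float> @llvm.copysign.v4f32(<4 x float>, <4 x float>)

define <4 x float> @vec_copysign(<4 x float> %mag, <4 x float> %sgn) {
entry:
  ; Lowered to an FCOPYSIGN node; expanded by default for vector types.
  %r = call <4 x float> @llvm.copysign.v4f32(<4 x float> %mag, <4 x float> %sgn)
  ret <4 x float> %r
}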
|