Revision tags: llvmorg-3.5.0-rc3 |
|
#
b9fd9ed3 |
| 07-Aug-2014 |
Eric Christopher <echristo@gmail.com> |
Temporarily Revert "Nuke the old JIT." as it's not quite ready to be deleted. This will be reapplied as soon as possible and before the 3.6 branch date at any rate.
Approved by Jim Grosbach, Lang Hames, Rafael Espindola.
This reverts commits r215111, 215115, 215116, 215117, 215136.
llvm-svn: 215154
|
#
f8b27c41 |
| 07-Aug-2014 |
Rafael Espindola <rafael.espindola@gmail.com> |
Nuke the old JIT.
I am sure we will be finding bits and pieces of dead code for years to come, but this is a good start.
Thanks to Lang Hames for making MCJIT a good replacement!
llvm-svn: 215111
|
Revision tags: llvmorg-3.5.0-rc2 |
|
#
d913448b |
| 04-Aug-2014 |
Eric Christopher <echristo@gmail.com> |
Remove the TargetMachine forwards for TargetSubtargetInfo-based information and update all callers. No functional change.
llvm-svn: 214781
|
#
93046910 |
| 25-Jul-2014 |
Hal Finkel <hfinkel@anl.gov> |
Add @llvm.assume, lowering, and some basic properties
This is the first commit in a series that adds an @llvm.assume intrinsic, which can be used to provide the optimizer with a condition it may assume to be true (when control flow would reach the intrinsic call). Some basic properties are added here:
- llvm.assume(true) is dead.
- llvm.assume(false) is unreachable (this directly corresponds to the documented behavior of MSVC's __assume(0)), as is llvm.assume(undef).
The intrinsic is tagged as writing arbitrarily in order to maintain control dependencies. BasicAA has been updated, however, to return NoModRef for any particular location-based query, so that we don't unnecessarily block code motion.
llvm-svn: 213973
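A minimal IR sketch of the intrinsic in use; the function and value names are illustrative, not taken from the commit:

declare void @llvm.assume(i1)

define i32 @clamp_to_nonneg(i32 %x) {
entry:
  ; Tell the optimizer that %x is non-negative at this point.
  %nonneg = icmp sge i32 %x, 0
  call void @llvm.assume(i1 %nonneg)
  ; A pass consulting the assumption may now fold this later sign test.
  %isneg = icmp slt i32 %x, 0
  %r = select i1 %isneg, i32 0, i32 %x
  ret i32 %r
}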
|
Revision tags: llvmorg-3.5.0-rc1 |
|
#
e03a25da |
| 20-Jun-2014 |
Karthik Bhat <kv.bhat@samsung.com> |
Add Support to Recognize and Vectorize Non-SIMD Instructions in SLPVectorizer.
This patch adds support to recognize patterns such as fadd,fsub,fadd,fsub... / add,sub,add,sub... and vectorize them as vector shuffles if they are profitable. These shuffle patterns can later be converted to instructions such as addsubpd on X86. Thanks to Arnold and Hal for the reviews. http://reviews.llvm.org/D4015
llvm-svn: 211339
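A hand-written sketch of the kind of shuffle this aims to produce; the function name and two-lane double type are illustrative:

; Alternating sub/add across lanes: lane 0 takes a0-b0, lane 1 takes a1+b1.
; On X86 this shuffle of an fadd and an fsub can be matched to addsubpd.
define <2 x double> @addsub(<2 x double> %a, <2 x double> %b) {
entry:
  %add = fadd <2 x double> %a, %b
  %sub = fsub <2 x double> %a, %b
  %r = shufflevector <2 x double> %sub, <2 x double> %add, <2 x i32> <i32 0, i32 3>
  ret <2 x double> %r
}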
|
Revision tags: llvmorg-3.4.2, llvmorg-3.4.2-rc1 |
|
#
e8172d85 |
| 08-May-2014 |
Hal Finkel <hfinkel@anl.gov> |
Fix a spelling error
llvm-svn: 208314
|
#
6532c20f |
| 08-May-2014 |
Hal Finkel <hfinkel@anl.gov> |
Move late partial-unrolling thresholds into the processor definitions
The old method used by X86TTI to determine partial-unrolling thresholds was messy (because it worked by testing target features), and also would not correctly identify the target CPU if certain target features were disabled. After some discussions on IRC with Chandler et al., it was decided that the processor scheduling models were the right containers for this information (because it is often tied to special uop dispatch-buffer sizes).
This does represent a small functionality change:
- For generic x86-64 (which uses the SB model and, thus, will get some unrolling).
- For AMD cores (because they still currently use the SB scheduling model).
- For Haswell (based on benchmarking by Louis Gerbarg, it was decided to bump the default threshold to 50; we're working on a test case for this).
Otherwise, nothing has changed for any other targets. The logic, however, has been moved into BasicTTI, so other targets may now also opt in to this functionality simply by setting LoopMicroOpBufferSize in their processor model definitions.
llvm-svn: 208289
|
#
1625bfcc |
| 06-May-2014 |
Benjamin Kramer <benny.kra@googlemail.com> |
TTI: Estimate @llvm.fmuladd cost as fmul + fadd when FMAs aren't legal on the target.
llvm-svn: 208115
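An illustrative IR snippet of the call being costed; the function name is made up. On a target without a legal FMA instruction the estimate is now the cost of a separate fmul plus fadd:

declare float @llvm.fmuladd.f32(float, float, float)

define float @muladd(float %a, float %b, float %c) {
entry:
  ; Computes a*b+c; costed (and lowered) as fmul+fadd when FMA is not legal.
  %r = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
  ret float %r
}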
|
Revision tags: llvmorg-3.4.1, llvmorg-3.4.1-rc2 |
|
#
1b9dde08 |
| 22-Apr-2014 |
Chandler Carruth <chandlerc@gmail.com> |
[Modules] Remove potential ODR violations by sinking the DEBUG_TYPE define below all header includes in the lib/CodeGen/... tree. While the current modules implementation doesn't check for this kind of ODR violation yet, it is likely to grow support for it in the future. It also removes one layer of macro pollution across all the included headers.
Other sub-trees will follow.
llvm-svn: 206837
|
#
56bf297e |
| 14-Apr-2014 |
Hal Finkel <hfinkel@anl.gov> |
Don't assert in BasicTTI::getMemoryOpCost for non-simple types
BasicTTI::getMemoryOpCost must explicitly check for non-simple types; setting AllowUnknown=true with TLI->getSimpleValueType is not sufficient because, for example, non-power-of-two vector types return non-simple EVTs (not MVT::Other).
llvm-svn: 206150
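An illustrative case of the kind of type involved, written in current opaque-pointer IR syntax with a made-up function name:

; A non-power-of-two vector type such as <3 x float> is the kind of type the
; commit describes: TLI->getSimpleValueType cannot be assumed to succeed for
; it, so the cost code must tolerate a non-simple EVT instead of asserting.
define <3 x float> @odd_load(ptr %p) {
entry:
  %v = load <3 x float>, ptr %p, align 4
  ret <3 x float> %v
}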
|
#
c0196b1b |
| 14-Apr-2014 |
Craig Topper <craig.topper@gmail.com> |
[C++11] More 'nullptr' conversion. In some cases just using a boolean check instead of comparing to nullptr.
llvm-svn: 206142
|
Revision tags: llvmorg-3.4.1-rc1 |
|
#
6fd19ab3 |
| 03-Apr-2014 |
Hal Finkel <hfinkel@anl.gov> |
Account for scalarization costs in BasicTTI::getMemoryOpCost for extending vector loads
When a vector type legalizes to a larger vector type, and the target does not support the associated extending load (or truncating store), legalization will scalarize the load (or store), resulting in an associated scalarization cost. BasicTTI::getMemoryOpCost needs to account for this.
Between this and r205487, PowerPC on the P7 with VSX enabled shows:
- MultiSource/Benchmarks/PAQ8p/paq8p: 43% speedup
- SingleSource/Benchmarks/BenchmarkGame/puzzle: 51% speedup
- SingleSource/UnitTests/Vectorizer/gcc-loops: 28% speedup
(some of these are new; some, such as PAQ8p, just reverse regressions that VSX support would otherwise trigger)
llvm-svn: 205495
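An illustrative extending-load pattern of the kind being costed; the types and function name are chosen for illustration, in current opaque-pointer syntax:

; If the target has no extending load from <4 x i8> to <4 x i32>, legalization
; scalarizes the load, and getMemoryOpCost should now include that cost.
define <4 x i32> @ext_load(ptr %p) {
entry:
  %v = load <4 x i8>, ptr %p, align 4
  %e = zext <4 x i8> %v to <4 x i32>
  ret <4 x i32> %e
}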
|
#
55312deb |
| 02-Apr-2014 |
Hal Finkel <hfinkel@anl.gov> |
Fix multi-register costs in BasicTTI::getCastInstrCost
For a cast (extension, etc.), the current logic predicts a low cost if the associated operation (keyed on the destination type) is legal (or promoted). This is not true when the number of values required to legalize the type changes. For example, sign extending <8 x i16> to <8 x i32> is not generically cheap on PPC with VSX, even though sign extension to v4i32 is legal, because two v4i32 output values are required for the single v8i16 input value, and without custom logic in the target, this conversion will scalarize.
llvm-svn: 205487
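The example from the message, written out as IR (the function name is illustrative):

; On a target whose widest legal vector is 128 bits, the <8 x i32> result
; needs two v4i32 registers for the single v8i16 input, so this is not a
; single cheap cast even though sext to v4i32 is legal.
define <8 x i32> @wide_sext(<8 x i16> %v) {
entry:
  %e = sext <8 x i16> %v to <8 x i32>
  ret <8 x i32> %e
}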
|
#
ce376c0f |
| 10-Mar-2014 |
Raul E. Silvera <rsilvera@google.com> |
When analyzing vectors whose element type requires legalization, the legalization cost must be included to get an accurate estimate of the total cost of the scalarized vector. The inaccurate cost triggered unprofitable SLP vectorization on 32-bit X86.
Summary: Include legalization overhead when computing scalarization cost
Reviewers: hfinkel, nadav
CC: chandlerc, rnk, llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2992
llvm-svn: 203509
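A hypothetical case of the kind described, assuming 32-bit X86 where i64 is not a legal scalar type; the function name is made up:

; Each extracted i64 lane must itself be legalized (split into i32 halves on
; 32-bit X86), so the true scalarization cost exceeds a simple per-element count.
define <4 x i64> @scalarized_mul(<4 x i64> %a, <4 x i64> %b) {
entry:
  %r = mul <4 x i64> %a, %b
  ret <4 x i64> %r
}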
|
#
24e685fd |
| 10-Mar-2014 |
Craig Topper <craig.topper@gmail.com> |
[C++11] Remove 'virtual' keyword from methods marked with 'override' keyword.
llvm-svn: 203444
|
#
aee3ca6c |
| 10-Mar-2014 |
Chandler Carruth <chandlerc@gmail.com> |
[TTI] There is actually no realistic way to pop TTI implementations off the stack of the analysis group because they are all immutable passes. This is made clear by Craig's recent work to use override systematically -- we weren't overriding anything for 'finalizePass' because there is no such thing.
This is kind of a lame restriction on the API -- we can no longer push and pop things, we just set up the stack and run. However, I'm not invested in building some better solution on top of the existing (terrifying) immutable pass and legacy pass manager.
llvm-svn: 203437
|
#
73156025 |
| 02-Mar-2014 |
Craig Topper <craig.topper@gmail.com> |
Switch all uses of LLVM_OVERRIDE to just use 'override' directly.
llvm-svn: 202621
|
#
77dfe45f |
| 02-Mar-2014 |
Craig Topper <craig.topper@gmail.com> |
Switch all uses of LLVM_FINAL to just use 'final', and remove the macro.
llvm-svn: 202618
|
#
3e752e7a |
| 24-Jan-2014 |
Juergen Ributzka <juergen@apple.com> |
Add final and override keywords to TargetTransformInfo's subclasses.
llvm-svn: 200021
|
Revision tags: llvmorg-3.4.0, llvmorg-3.4.0-rc3, llvmorg-3.4.0-rc2, llvmorg-3.4.0-rc1 |
|
#
cae8735a |
| 17-Sep-2013 |
Arnold Schwaighofer <aschwaighofer@apple.com> |
Costmodel: Add support for horizontal vector reductions
Upcoming SLP vectorization improvements will want to be able to estimate costs of horizontal reductions. Add infrastructure to support this.
We model reductions as a series of (shufflevector,add) tuples ultimately followed by an extractelement. For example, for an add-reduction of <4 x float> we could generate the following sequence:
(v0, v1, v2, v3)
-> (v0+v2, v1+v3, undef, undef)
-> ((v0+v2) + (v1+v3), undef, undef)
%rdx.shuf = shufflevector <4 x float> %rdx, <4 x float> undef, <4 x i32> <i32 2, i32 3, i32 undef, i32 undef>
%bin.rdx = fadd <4 x float> %rdx, %rdx.shuf
%rdx.shuf7 = shufflevector <4 x float> %bin.rdx, <4 x float> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
%bin.rdx8 = fadd <4 x float> %bin.rdx, %rdx.shuf7
%r = extractelement <4 x float> %bin.rdx8, i32 0
This commit adds a cost model interface "getReductionCost(Opcode, Ty, Pairwise)" that will allow clients to ask for the cost of such a reduction (as backends might generate more efficient code than the summed cost of the individual instructions would suggest). This interface is exercised by the CostModel analysis pass, which looks for reduction patterns like the one above - starting at extractelements - and, if it sees a matching sequence, calls the cost model interface.
We will also support a second form of pairwise reduction that is well supported on common architectures (haddps, vpadd, faddp).
(v0, v1, v2, v3)
-> (v0+v1, v2+v3, undef, undef)
-> ((v0+v1)+(v2+v3), undef, undef, undef)
%rdx.shuf.0.0 = shufflevector <4 x float> %rdx, <4 x float> undef, <4 x i32> <i32 0, i32 2, i32 undef, i32 undef>
%rdx.shuf.0.1 = shufflevector <4 x float> %rdx, <4 x float> undef, <4 x i32> <i32 1, i32 3, i32 undef, i32 undef>
%bin.rdx.0 = fadd <4 x float> %rdx.shuf.0.0, %rdx.shuf.0.1
%rdx.shuf.1.0 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef, <4 x i32> <i32 0, i32 undef, i32 undef, i32 undef>
%rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
%bin.rdx.1 = fadd <4 x float> %rdx.shuf.1.0, %rdx.shuf.1.1
%r = extractelement <4 x float> %bin.rdx.1, i32 0
llvm-svn: 190876
|
#
8f2e7005 |
| 11-Sep-2013 |
Hal Finkel <hfinkel@anl.gov> |
Add getUnrollingPreferences to TTI
Allow targets to customize the default behavior of the generic loop unrolling transformation. This will be used by the PowerPC backend when targeting the A2 core (which is in-order with a deep pipeline), and using more aggressive defaults is important.
llvm-svn: 190542
|
#
8e83820a |
| 29-Aug-2013 |
Hal Finkel <hfinkel@anl.gov> |
Revert: r189565 - Add getUnrollingPreferences to TTI
Revert unintentional commit (of an unreviewed change).
Original commit message:
Add getUnrollingPreferences to TTI
Allow targets to customize the default behavior of the generic loop unrolling transformation. This will be used by the PowerPC backend when targeting the A2 core (which is in-order with a deep pipeline), and using more aggressive defaults is important.
llvm-svn: 189566
|
#
63e6c0e9 |
| 29-Aug-2013 |
Hal Finkel <hfinkel@anl.gov> |
Add getUnrollingPreferences to TTI
Allow targets to customize the default behavior of the generic loop unrolling transformation. This will be used by the PowerPC backend when targeting the A2 core (which is in-order with a deep pipeline), and using more aggressive defaults is important.
llvm-svn: 189565
|
#
37cd6cfb |
| 23-Aug-2013 |
Richard Sandiford <rsandifo@linux.vnet.ibm.com> |
Turn MipsOptimizeMathLibCalls into a target-independent scalar transform
...so that it can be used for z too. Most of the code is the same. The only real change is to use TargetTransformInfo to test when a sqrt instruction is available.
The pass is opt-in because at the moment it only handles sqrt.
llvm-svn: 189097
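A sketch of the shape of the transform, assuming the target reports a hardware sqrt; the names and the exact errno guard are illustrative, not taken from the pass:

declare double @sqrt(double)
declare double @llvm.sqrt.f64(double)

define double @fast_sqrt(double %x) {
entry:
  ; Fast path: hardware sqrt via the intrinsic.
  %fast = call double @llvm.sqrt.f64(double %x)
  ; If the result is NaN (negative input), fall back to the libcall so that
  ; errno is still set as the C library requires.
  %isnan = fcmp uno double %fast, %fast
  br i1 %isnan, label %slowpath, label %done

slowpath:
  %slow = call double @sqrt(double %x)
  br label %done

done:
  %r = phi double [ %fast, %entry ], [ %slow, %slowpath ]
  ret double %r
}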
|
#
0c5c01aa |
| 19-Aug-2013 |
Hal Finkel <hfinkel@anl.gov> |
Add a llvm.copysign intrinsic
This adds a llvm.copysign intrinsic; we already have Libfunc recognition for copysign (which is turned into the FCOPYSIGN SDAG node). In order to autovectorize calls to copysign in the loop vectorizer, we need a corresponding intrinsic as well.
In addition to the expected changes to the language reference, the loop vectorizer, BasicTTI, and the SDAG builder (the intrinsic is transformed into an FCOPYSIGN node, just like the function call), this also adds FCOPYSIGN to a few lists in LegalizeVector{Ops,Types} so that vector copysigns can be expanded.
In TargetLoweringBase::initActions, I've made the default action for FCOPYSIGN be Expand for vector types. This seems correct for all in-tree targets, and I think it is the right thing to do because, previously, there was no way to generate vector-valued FCOPYSIGN nodes (and most targets don't specify an action for vector-typed FCOPYSIGN).
llvm-svn: 188728
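An illustrative vector form of the intrinsic, of the sort the loop vectorizer can now emit; the type and function name are illustrative:

declare <4 x float> @llvm.copysign.v4f32(<4 x float>, <4 x float>)

define <4 x float> @vec_copysign(<4 x float> %mag, <4 x float> %sgn) {
entry:
  ; Lowered to an FCOPYSIGN node; expanded by default for vector types.
  %r = call <4 x float> @llvm.copysign.v4f32(<4 x float> %mag, <4 x float> %sgn)
  ret <4 x float> %r
}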
|