History log of /llvm-project/llvm/lib/CodeGen/BasicTargetTransformInfo.cpp (Results 26 – 50 of 69)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-3.5.0-rc3
# b9fd9ed3 07-Aug-2014 Eric Christopher <echristo@gmail.com>

Temporarily Revert "Nuke the old JIT." as it's not quite ready to
be deleted. This will be reapplied as soon as possible and before
the 3.6 branch date at any rate.

Approved by Jim Grosbach, Lang Ha

Temporarily Revert "Nuke the old JIT." as it's not quite ready to
be deleted. This will be reapplied as soon as possible and before
the 3.6 branch date at any rate.

Approved by Jim Grosbach, Lang Hames, Rafael Espindola.

This reverts commits r215111, 215115, 215116, 215117, 215136.

llvm-svn: 215154

show more ...


# f8b27c41 07-Aug-2014 Rafael Espindola <rafael.espindola@gmail.com>

Nuke the old JIT.

I am sure we will be finding bits and pieces of dead code for years to
come, but this is a good start.

Thanks to Lang Hames for making MCJIT a good replacement!

llvm-svn: 215111


Revision tags: llvmorg-3.5.0-rc2
# d913448b 04-Aug-2014 Eric Christopher <echristo@gmail.com>

Remove the TargetMachine forwards for TargetSubtargetInfo based
information and update all callers. No functional change.

llvm-svn: 214781


# 93046910 25-Jul-2014 Hal Finkel <hfinkel@anl.gov>

Add @llvm.assume, lowering, and some basic properties

This is the first commit in a series that add an @llvm.assume intrinsic which
can be used to provide the optimizer with a condition it may assum

Add @llvm.assume, lowering, and some basic properties

This is the first commit in a series that add an @llvm.assume intrinsic which
can be used to provide the optimizer with a condition it may assume to be true
(when the control flow would hit the intrinsic call). Some basic properties are added here:

- llvm.invariant(true) is dead.
- llvm.invariant(false) is unreachable (this directly corresponds to the
documented behavior of MSVC's __assume(0)), so is llvm.invariant(undef).

The intrinsic is tagged as writing arbitrarily, in order to maintain control
dependencies. BasicAA has been updated, however, to return NoModRef for any
particular location-based query so that we don't unnecessarily block code
motion.

llvm-svn: 213973

show more ...


Revision tags: llvmorg-3.5.0-rc1
# e03a25da 20-Jun-2014 Karthik Bhat <kv.bhat@samsung.com>

Add Support to Recognize and Vectorize NON SIMD instructions in SLPVectorizer.

This patch adds support to recognize patterns such as fadd,fsub,fadd,fsub.../add,sub,add,sub... and
vectorizes them as

Add Support to Recognize and Vectorize NON SIMD instructions in SLPVectorizer.

This patch adds support to recognize patterns such as fadd,fsub,fadd,fsub.../add,sub,add,sub... and
vectorizes them as vector shuffles if they are profitable.
These patterns of vector shuffle can later be converted to instructions such as addsubpd etc on X86.
Thanks to Arnold and Hal for the reviews. http://reviews.llvm.org/D4015

llvm-svn: 211339

show more ...


Revision tags: llvmorg-3.4.2, llvmorg-3.4.2-rc1
# e8172d85 08-May-2014 Hal Finkel <hfinkel@anl.gov>

Fix a spelling error

llvm-svn: 208314


# 6532c20f 08-May-2014 Hal Finkel <hfinkel@anl.gov>

Move late partial-unrolling thresholds into the processor definitions

The old method used by X86TTI to determine partial-unrolling thresholds was
messy (because it worked by testing target features)

Move late partial-unrolling thresholds into the processor definitions

The old method used by X86TTI to determine partial-unrolling thresholds was
messy (because it worked by testing target features), and also would not
correctly identify the target CPU if certain target features were disabled.
After some discussions on IRC with Chandler et al., it was decided that the
processor scheduling models were the right containers for this information
(because it is often tied to special uop dispatch-buffer sizes).

This does represent a small functionality change:
- For generic x86-64 (which uses the SB model and, thus, will get some
unrolling).
- For AMD cores (because they still currently use the SB scheduling model)
- For Haswell (based on benchmarking by Louis Gerbarg, it was decided to bump
the default threshold to 50; we're working on a test case for this).
Otherwise, nothing has changed for any other targets. The logic, however, has
been moved into BasicTTI, so other targets may now also opt-in to this
functionality simply by setting LoopMicroOpBufferSize in their processor
model definitions.

llvm-svn: 208289

show more ...


# 1625bfcc 06-May-2014 Benjamin Kramer <benny.kra@googlemail.com>

TTI: Estimate @llvm.fmuladd cost as fmul + fadd when FMA's aren't legal on the target.

llvm-svn: 208115


Revision tags: llvmorg-3.4.1, llvmorg-3.4.1-rc2
# 1b9dde08 22-Apr-2014 Chandler Carruth <chandlerc@gmail.com>

[Modules] Remove potential ODR violations by sinking the DEBUG_TYPE
define below all header includes in the lib/CodeGen/... tree. While the
current modules implementation doesn't check for this kind

[Modules] Remove potential ODR violations by sinking the DEBUG_TYPE
define below all header includes in the lib/CodeGen/... tree. While the
current modules implementation doesn't check for this kind of ODR
violation yet, it is likely to grow support for it in the future. It
also removes one layer of macro pollution across all the included
headers.

Other sub-trees will follow.

llvm-svn: 206837

show more ...


# 56bf297e 14-Apr-2014 Hal Finkel <hfinkel@anl.gov>

Don't assert in BasicTTI::getMemoryOpCost for non-simple types

BasicTTI::getMemoryOpCost must explicitly check for non-simple types; setting
AllowUnknown=true with TLI->getSimpleValueType is not suf

Don't assert in BasicTTI::getMemoryOpCost for non-simple types

BasicTTI::getMemoryOpCost must explicitly check for non-simple types; setting
AllowUnknown=true with TLI->getSimpleValueType is not sufficient because, for
example, non-power-of-two vector types return non-simple EVTs (not MVT::Other).

llvm-svn: 206150

show more ...


# c0196b1b 14-Apr-2014 Craig Topper <craig.topper@gmail.com>

[C++11] More 'nullptr' conversion. In some cases just using a boolean check instead of comparing to nullptr.

llvm-svn: 206142


Revision tags: llvmorg-3.4.1-rc1
# 6fd19ab3 03-Apr-2014 Hal Finkel <hfinkel@anl.gov>

Account for scalarization costs in BasicTTI::getMemoryOpCost for extending vector loads

When a vector type legalizes to a larger vector type, and the target does not
support the associated extending

Account for scalarization costs in BasicTTI::getMemoryOpCost for extending vector loads

When a vector type legalizes to a larger vector type, and the target does not
support the associated extending load (or truncating store), then legalization
will scalarize the load (or store) resulting in an associated scalarization
cost. BasicTTI::getMemoryOpCost needs to account for this.

Between this, and r205487, PowerPC on the P7 with VSX enabled shows:

MultiSource/Benchmarks/PAQ8p/paq8p: 43% speedup
SingleSource/Benchmarks/BenchmarkGame/puzzle: 51% speedup
SingleSource/UnitTests/Vectorizer/gcc-loops 28% speedup

(some of these are new; some of these, such as PAQ8p, just reverse regressions
that VSX support would trigger)

llvm-svn: 205495

show more ...


# 55312deb 02-Apr-2014 Hal Finkel <hfinkel@anl.gov>

Fix multi-register costs in BasicTTI::getCastInstrCost

For an cast (extension, etc.), the currently logic predicts a low cost if the
associated operation (keyed on the destination type) is legal (or

Fix multi-register costs in BasicTTI::getCastInstrCost

For an cast (extension, etc.), the currently logic predicts a low cost if the
associated operation (keyed on the destination type) is legal (or promoted).
This is not true when the number of values required to legalize the type is
changing. For example, <8 x i16> being sign extended by <8 x i32> is not
generically cheap on PPC with VSX, even though sign extension to v4i32 is
legal, because two output v4i32 values are required compared to the single
v8i16 input value, and without custom logic in the target, this conversion will
scalarize.

llvm-svn: 205487

show more ...


# ce376c0f 10-Mar-2014 Raul E. Silvera <rsilvera@google.com>

When analyzing vectors of element type that require legalization,
the legalization cost must be included to get an accurate
estimation of the total cost of the scalarized vector.
The inaccurate cost

When analyzing vectors of element type that require legalization,
the legalization cost must be included to get an accurate
estimation of the total cost of the scalarized vector.
The inaccurate cost triggered unprofitable SLP vectorization on
32-bit X86.

Summary:
Include legalization overhead when computing scalarization cost

Reviewers: hfinkel, nadav

CC: chandlerc, rnk, llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2992

llvm-svn: 203509

show more ...


# 24e685fd 10-Mar-2014 Craig Topper <craig.topper@gmail.com>

[C++11] Remove 'virtual' keyword from methods marked with 'override' keyword.

llvm-svn: 203444


# aee3ca6c 10-Mar-2014 Chandler Carruth <chandlerc@gmail.com>

[TTI] There is actually no realistic way to pop TTI implementations off
the stack of the analysis group because they are all immutable passes.
This is made clear by Craig's recent work to use overrid

[TTI] There is actually no realistic way to pop TTI implementations off
the stack of the analysis group because they are all immutable passes.
This is made clear by Craig's recent work to use override
systematically -- we weren't overriding anything for 'finalizePass'
because there is no such thing.

This is kind of a lame restriction on the API -- we can no longer push
and pop things, we just set up the stack and run. However, I'm not
invested in building some better solution on top of the existing
(terrifying) immutable pass and legacy pass manager.

llvm-svn: 203437

show more ...


# 73156025 02-Mar-2014 Craig Topper <craig.topper@gmail.com>

Switch all uses of LLVM_OVERRIDE to just use 'override' directly.

llvm-svn: 202621


# 77dfe45f 02-Mar-2014 Craig Topper <craig.topper@gmail.com>

Switch all uses of LLVM_FINAL to just use 'final', and remove the macro.

llvm-svn: 202618


# 3e752e7a 24-Jan-2014 Juergen Ributzka <juergen@apple.com>

Add final and owerride keywords to TargetTransformInfo's subclasses.

llvm-svn: 200021


Revision tags: llvmorg-3.4.0, llvmorg-3.4.0-rc3, llvmorg-3.4.0-rc2, llvmorg-3.4.0-rc1
# cae8735a 17-Sep-2013 Arnold Schwaighofer <aschwaighofer@apple.com>

Costmodel: Add support for horizontal vector reductions

Upcoming SLP vectorization improvements will want to be able to estimate costs
of horizontal reductions. Add infrastructure to support this.

Costmodel: Add support for horizontal vector reductions

Upcoming SLP vectorization improvements will want to be able to estimate costs
of horizontal reductions. Add infrastructure to support this.

We model reductions as a series of (shufflevector,add) tuples ultimately
followed by an extractelement. For example, for an add-reduction of <4 x float>
we could generate the following sequence:

(v0, v1, v2, v3)
\ \ / /
\ \ /
+ +

(v0+v2, v1+v3, undef, undef)
\ /
((v0+v2) + (v1+v3), undef, undef)

%rdx.shuf = shufflevector <4 x float> %rdx, <4 x float> undef,
<4 x i32> <i32 2, i32 3, i32 undef, i32 undef>
%bin.rdx = fadd <4 x float> %rdx, %rdx.shuf
%rdx.shuf7 = shufflevector <4 x float> %bin.rdx, <4 x float> undef,
<4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
%bin.rdx8 = fadd <4 x float> %bin.rdx, %rdx.shuf7
%r = extractelement <4 x float> %bin.rdx8, i32 0

This commit adds a cost model interface "getReductionCost(Opcode, Ty, Pairwise)"
that will allow clients to ask for the cost of such a reduction (as backends
might generate more efficient code than the cost of the individual instructions
summed up). This interface is excercised by the CostModel analysis pass which
looks for reduction patterns like the one above - starting at extractelements -
and if it sees a matching sequence will call the cost model interface.

We will also support a second form of pairwise reduction that is well supported
on common architectures (haddps, vpadd, faddp).

(v0, v1, v2, v3)
\ / \ /
(v0+v1, v2+v3, undef, undef)
\ /
((v0+v1)+(v2+v3), undef, undef, undef)

%rdx.shuf.0.0 = shufflevector <4 x float> %rdx, <4 x float> undef,
<4 x i32> <i32 0, i32 2 , i32 undef, i32 undef>
%rdx.shuf.0.1 = shufflevector <4 x float> %rdx, <4 x float> undef,
<4 x i32> <i32 1, i32 3, i32 undef, i32 undef>
%bin.rdx.0 = fadd <4 x float> %rdx.shuf.0.0, %rdx.shuf.0.1
%rdx.shuf.1.0 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef,
<4 x i32> <i32 0, i32 undef, i32 undef, i32 undef>
%rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef,
<4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
%bin.rdx.1 = fadd <4 x float> %rdx.shuf.1.0, %rdx.shuf.1.1
%r = extractelement <4 x float> %bin.rdx.1, i32 0

llvm-svn: 190876

show more ...


# 8f2e7005 11-Sep-2013 Hal Finkel <hfinkel@anl.gov>

Add getUnrollingPreferences to TTI

Allow targets to customize the default behavior of the generic loop unrolling
transformation. This will be used by the PowerPC backend when targeting the A2
core (

Add getUnrollingPreferences to TTI

Allow targets to customize the default behavior of the generic loop unrolling
transformation. This will be used by the PowerPC backend when targeting the A2
core (which is in-order with a deep pipeline), and using more aggressive
defaults is important.

llvm-svn: 190542

show more ...


# 8e83820a 29-Aug-2013 Hal Finkel <hfinkel@anl.gov>

Revert: r189565 - Add getUnrollingPreferences to TTI

Revert unintentional commit (of an unreviewed change).

Original commit message:

Add getUnrollingPreferences to TTI

Allow targets to customize

Revert: r189565 - Add getUnrollingPreferences to TTI

Revert unintentional commit (of an unreviewed change).

Original commit message:

Add getUnrollingPreferences to TTI

Allow targets to customize the default behavior of the generic loop unrolling
transformation. This will be used by the PowerPC backend when targeting the A2
core (which is in-order with a deep pipeline), and using more aggressive
defaults is important.

llvm-svn: 189566

show more ...


# 63e6c0e9 29-Aug-2013 Hal Finkel <hfinkel@anl.gov>

Add getUnrollingPreferences to TTI

Allow targets to customize the default behavior of the generic loop unrolling
transformation. This will be used by the PowerPC backend when targeting the A2
core (

Add getUnrollingPreferences to TTI

Allow targets to customize the default behavior of the generic loop unrolling
transformation. This will be used by the PowerPC backend when targeting the A2
core (which is in-order with a deep pipeline), and using more aggressive
defaults is important.

llvm-svn: 189565

show more ...


# 37cd6cfb 23-Aug-2013 Richard Sandiford <rsandifo@linux.vnet.ibm.com>

Turn MipsOptimizeMathLibCalls into a target-independent scalar transform

...so that it can be used for z too. Most of the code is the same.
The only real change is to use TargetTransformInfo to tes

Turn MipsOptimizeMathLibCalls into a target-independent scalar transform

...so that it can be used for z too. Most of the code is the same.
The only real change is to use TargetTransformInfo to test when a sqrt
instruction is available.

The pass is opt-in because at the moment it only handles sqrt.

llvm-svn: 189097

show more ...


# 0c5c01aa 19-Aug-2013 Hal Finkel <hfinkel@anl.gov>

Add a llvm.copysign intrinsic

This adds a llvm.copysign intrinsic; We already have Libfunc recognition for
copysign (which is turned into the FCOPYSIGN SDAG node). In order to
autovectorize calls to

Add a llvm.copysign intrinsic

This adds a llvm.copysign intrinsic; We already have Libfunc recognition for
copysign (which is turned into the FCOPYSIGN SDAG node). In order to
autovectorize calls to copysign in the loop vectorizer, we need a corresponding
intrinsic as well.

In addition to the expected changes to the language reference, the loop
vectorizer, BasicTTI, and the SDAG builder (the intrinsic is transformed into
an FCOPYSIGN node, just like the function call), this also adds FCOPYSIGN to a
few lists in LegalizeVector{Ops,Types} so that vector copysigns can be
expanded.

In TargetLoweringBase::initActions, I've made the default action for FCOPYSIGN
be Expand for vector types. This seems correct for all in-tree targets, and I
think is the right thing to do because, previously, there was no way to generate
vector-values FCOPYSIGN nodes (and most targets don't specify an action for
vector-typed FCOPYSIGN).

llvm-svn: 188728

show more ...


123