History log of /llvm-project/llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp (Results 151 – 175 of 227)
Revision Date Author Comments
# 38bbf81a 14-Oct-2018 Dorit Nuzman <dorit.nuzman@intel.com>

recommit 344472 after fixing build failure on ARM and PPC.

llvm-svn: 344475


# 5118c68c 14-Oct-2018 Dorit Nuzman <dorit.nuzman@intel.com>

revert 344472 due to failures.

llvm-svn: 344473


# 81743689 14-Oct-2018 Dorit Nuzman <dorit.nuzman@intel.com>

[IAI,LV] Add support for vectorizing predicated strided accesses using masked
interleave-group

The vectorizer currently does not attempt to create interleave-groups that
contain predicated loads/stores; predicated strided accesses can currently be
vectorized only using masked gather/scatter or scalarization. This patch makes
predicated loads/stores candidates for forming interleave-groups during the
Loop-Vectorizer's analysis, and adds the proper support for masked-interleave-
groups to the Loop-Vectorizer's planning and transformation stages. The patch
also extends the TTI API to allow querying the cost of masked interleave groups
(which each target can control); targets that support masked vector loads/
stores may choose to enable this feature and allow vectorizing predicated
strided loads/stores using masked wide loads/stores and shuffles.

Reviewers: Ayal, hsaito, dcaballe, fhahn, javed.absar

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D53011

llvm-svn: 344472
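
As a rough illustration of the kind of loop this commit is about, the sketch below shows a predicated stride-2 access in scalar form, followed by a scalar emulation of the masked-interleave-group strategy (replicated mask, masked wide load, de-interleaving shuffle). The function names, the vectorization factor of 4, and the zero fill for inactive lanes are assumptions made for this example; none of them come from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar reference: both members of a stride-2 group are read under a predicate.
int sumPredicatedPairs(const std::vector<int> &a, const std::vector<bool> &cond) {
  int sum = 0;
  for (std::size_t i = 0; i < cond.size(); ++i)
    if (cond[i])
      sum += a[2 * i] + a[2 * i + 1];
  return sum;
}

// Scalar emulation of a masked interleave group (hypothetical VF of 4).
int sumPredicatedPairsMasked(const std::vector<int> &a, const std::vector<bool> &cond) {
  constexpr std::size_t VF = 4;
  int sum = 0;
  std::size_t i = 0;
  for (; i + VF <= cond.size(); i += VF) {
    int wide[2 * VF];
    // Masked wide load: each predicate bit is replicated across the two
    // members of its group; inactive lanes are never read and hold zero.
    for (std::size_t k = 0; k < 2 * VF; ++k)
      wide[k] = cond[i + k / 2] ? a[2 * i + k] : 0;
    // De-interleaving "shuffle": even lanes are member 0, odd lanes member 1.
    for (std::size_t lane = 0; lane < VF; ++lane)
      sum += wide[2 * lane] + wide[2 * lane + 1];
  }
  // Scalar remainder loop for the iterations that do not fill a vector.
  for (; i < cond.size(); ++i)
    if (cond[i])
      sum += a[2 * i] + a[2 * i + 1];
  return sum;
}
```

Both functions are meant to return the same value for any input where `a` holds at least `2 * cond.size()` elements.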



Revision tags: llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1
# f78650a8 30-Jul-2018 Fangrui Song <maskray@google.com>

Remove trailing space

sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

llvm-svn: 338293


Revision tags: llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1
# ef7c4976 09-Mar-2018 Stefan Pintilie <stefanp@ca.ibm.com>

Revert "[PowerPC] LSR tunings for PowerPC"

Revert the rest of the LSR tune commit.
It seems that the LSR tune commit breaks internal tests.
Reverting the commit.

llvm-svn: 327143


# f8438e8e 07-Mar-2018 Stefan Pintilie <stefanp@ca.ibm.com>

[PowerPC] LSR tunings for PowerPC

The purpose of this patch is to have LSR generate better code on Power.
This is done by overriding isLSRCostLess.

Differential Revision: https://reviews.llvm.org/D40855

llvm-svn: 326906
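
For readers unfamiliar with the hook, a cost comparison of this kind is typically a lexicographic ordering over the fields of an LSR cost record. The sketch below is only an illustration of that idea: `SketchLSRCost`, its reduced field set, and both orderings are assumptions for this example and do not reproduce what the patch implemented.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical, reduced stand-in for an LSR cost record.
struct SketchLSRCost {
  unsigned Insns;
  unsigned NumRegs;
  unsigned AddRecCost;
};

// One possible ordering: instruction count is compared first.
bool isCostLessInsnsFirst(const SketchLSRCost &C1, const SketchLSRCost &C2) {
  return std::tie(C1.Insns, C1.NumRegs, C1.AddRecCost) <
         std::tie(C2.Insns, C2.NumRegs, C2.AddRecCost);
}

// Alternative ordering: register pressure is compared first.
bool isCostLessRegsFirst(const SketchLSRCost &C1, const SketchLSRCost &C2) {
  return std::tie(C1.NumRegs, C1.Insns, C1.AddRecCost) <
         std::tie(C2.NumRegs, C2.Insns, C2.AddRecCost);
}
```

The two orderings can disagree on the same pair of candidates, which is why a target may want to choose its own.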



Revision tags: llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2
# 1f59ae31 30-Jan-2018 Zaara Syeda <syzaara@ca.ibm.com>

Re-commit : [PowerPC] Add handling for ColdCC calling convention and a pass to mark
candidates with coldcc attribute.

This recommits r322721 reverted due to sanitizer memory leak build bot failures.

Original commit message:
This patch adds support for the coldcc calling convention for Power.
This changes the set of non-volatile registers. It includes a pass to stress
test the implementation by marking all static directly called functions with
the coldcc attribute through the option -enable-coldcc-stress-test. It also
includes an option, -ppc-enable-coldcc, to add the coldcc attribute to
functions which are cold at all call sites based on BlockFrequencyInfo when
the containing function does not call any non-cold functions.

Differential Revision: https://reviews.llvm.org/D38413

llvm-svn: 323778



# c9dc7b45 17-Jan-2018 Zaara Syeda <syzaara@ca.ibm.com>

Revert [PowerPC] This reverts commit rL322721

Failing build bots. Revert the commit now.

llvm-svn: 322748


# 8e951fd2 17-Jan-2018 Zaara Syeda <syzaara@ca.ibm.com>

[PowerPC] Add handling for ColdCC calling convention and a pass to mark
candidates with coldcc attribute.

This patch adds support for the coldcc calling convention for Power.
This changes the set of non-volatile registers. It includes a pass to stress
test the implementation by marking all static directly called functions with
the coldcc attribute through the option -enable-coldcc-stress-test. It also
includes an option, -ppc-enable-coldcc, to add the coldcc attribute to
functions which are cold at all call sites based on BlockFrequencyInfo when
the containing function does not call any non-cold functions.

Differential Revision: https://reviews.llvm.org/D38413

llvm-svn: 322721



Revision tags: llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2
# b3bde2ea 17-Nov-2017 David Blaikie <dblaikie@gmail.com>

Fix a bunch more layering of CodeGen headers that are in Target

All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

llvm-svn: 318490



Revision tags: llvmorg-5.0.1-rc1
# b2c3eb8c 30-Oct-2017 Clement Courbet <courbet@google.com>

[CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2).

- Targets that want to support memcmp expansions now return the list of
supported load sizes.
- Expansion codegen does not assume that all power-of-two load sizes
smaller than the max load size are valid. For example, this is not the
case for x86(32bit)+sse2.

Fixes PR34887.

llvm-svn: 316905
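
To make the second bullet concrete, the sketch below splits a compare length into loads drawn only from an explicit list of supported sizes, rather than assuming every power of two below the maximum is available. `planLoads` and the size lists used with it are hypothetical and chosen for illustration; they are not the interface added by this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: greedily covers `size` bytes using only the load sizes
// in `supported`, which must be sorted from largest to smallest.
// Returns an empty plan if the sizes cannot cover the length exactly.
std::vector<std::size_t> planLoads(std::size_t size,
                                   const std::vector<std::size_t> &supported) {
  std::vector<std::size_t> plan;
  for (std::size_t loadSize : supported) {
    if (loadSize == 0)
      continue; // ignore nonsensical entries
    while (size >= loadSize) {
      plan.push_back(loadSize);
      size -= loadSize;
    }
  }
  if (size != 0)
    plan.clear();
  return plan;
}
```

With an illustrative list such as `{16, 4, 2, 1}` (no 8-byte load), a 12-byte compare is planned as three 4-byte loads instead of an 8-byte load followed by a 4-byte load.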



# 488782ef 19-Oct-2017 Graham Yiu <gyiu@ca.ibm.com>

The cost of splitting a large vector instruction is not being taken into account by the getUserCost function. This was leading to some loops being over unrolled. The cost of a vector instruction is now being multiplied by the cost of the type legalization. This will return a more accurate cost.

Committing on behalf of Brad Nemanich (brad.nemanich@ibm.com)

Differential Revision: https://reviews.llvm.org/D38961

llvm-svn: 316174
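
The arithmetic behind this change can be sketched as follows, under the simplifying assumption that the legalization cost of a vector type is the number of native registers it has to be split into. The helper names and the bit widths in the usage note are hypothetical; the actual patch relies on the target's own type-legalization query.

```cpp
#include <cassert>

// Hypothetical model: number of register-sized parts a vector is split into.
unsigned legalizationFactor(unsigned vectorBits, unsigned registerBits) {
  if (vectorBits <= registerBits)
    return 1; // already fits in one register
  return (vectorBits + registerBits - 1) / registerBits; // round up
}

// Cost of one vector instruction after accounting for the split.
unsigned scaledUserCost(unsigned baseCost, unsigned vectorBits,
                        unsigned registerBits) {
  return baseCost * legalizationFactor(vectorBits, registerBits);
}
```

For example, an operation of base cost 1 on a 512-bit vector with 128-bit registers is charged 4 under this model, which makes an unroller less eager to replicate it.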



# 2807c0a4 25-Sep-2017 Clement Courbet <courbet@google.com>

[CodeGenPrepare][NFC] Rename TargetTransformInfo::expandMemCmp -> TargetTransformInfo::enableMemCmpExpansion.

Summary:
Right now there are two functions with the same name, one does the work
and the other one returns true if expansion is needed. Rename
TargetTransformInfo::expandMemCmp to make it more consistent with other
members of TargetTransformInfo.

Remove the unused Instruction* parameter.

Differential Revision: https://reviews.llvm.org/D38165

llvm-svn: 314096



Revision tags: llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1
# 66d9bdbc 28-Jun-2017 Geoff Berry <gberry@codeaurora.org>

[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI.

Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper

Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, javed.absar, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D34531

llvm-svn: 306554



Revision tags: llvmorg-4.0.1, llvmorg-4.0.1-rc3
# c0112ae8 12-Jun-2017 Daniel Neilson <dneilson@azul.com>

Const correctness for TTI::getRegisterBitWidth

Summary: The method TargetTransformInfo::getRegisterBitWidth() is declared const, but the type erasing implementation classes (TargetTransformInfo::Concept & TargetTransformInfo::Model) that were introduced by Chandler in https://reviews.llvm.org/D7293 do not have the method declared const. This is an NFC to tidy up the const consistency between TTI and its implementation.

Reviewers: chandlerc, rnk, reames

Reviewed By: reames

Subscribers: reames, jfb, arsenm, dschuff, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, llvm-commits

Differential Revision: https://reviews.llvm.org/D33903

llvm-svn: 305189
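
The const mismatch described above can be pictured with a minimal type-erasure sketch. `SketchTTI`, `ToyTarget`, and the returned widths are hypothetical stand-ins written for this example, not the LLVM classes; the point is only that the public method, the `Concept` virtual, and the `Model` override all carry the same `const` qualifier.

```cpp
#include <cassert>
#include <memory>
#include <utility>

class SketchTTI {
  // Type-erased interface: the virtual is const, matching the public method.
  struct Concept {
    virtual ~Concept() = default;
    virtual unsigned getRegisterBitWidth(bool Vector) const = 0;
  };

  // Wrapper that forwards to a concrete target implementation.
  template <typename T> struct Model final : Concept {
    T Impl;
    explicit Model(T I) : Impl(std::move(I)) {}
    unsigned getRegisterBitWidth(bool Vector) const override {
      return Impl.getRegisterBitWidth(Vector);
    }
  };

  std::unique_ptr<const Concept> Ptr;

public:
  template <typename T>
  explicit SketchTTI(T Impl)
      : Ptr(std::make_unique<Model<T>>(std::move(Impl))) {}

  unsigned getRegisterBitWidth(bool Vector) const {
    return Ptr->getRegisterBitWidth(Vector);
  }
};

// Toy target with made-up register widths.
struct ToyTarget {
  unsigned getRegisterBitWidth(bool Vector) const { return Vector ? 128 : 64; }
};
```

Because every layer is `const`, the wrapper can hold its implementation through a pointer to `const Concept` and still answer the query.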



# 457ddd31 31-May-2017 Sean Fertile <sfertile@ca.ibm.com>

[PowerPC] Correctly specify the cache line size for Power 7, 8 and 9.

Fixes PPCTTIImpl::getCacheLineSize() returning the wrong cache line size for
newer ppc processors.

Committing on behalf of Stefan Pintilie.
Differential Revision: https://reviews.llvm.org/D33656

llvm-svn: 304317



# 3a7578c6 31-May-2017 Zaara Syeda <syzaara@ca.ibm.com>

[PPC] Inline expansion of memcmp

This patch does an inline expansion of memcmp.
It changes the memcmp library call into an inline expansion when the size is
known at compile time and is under a target specified threshold.
This expansion is implemented in CodeGenPrepare and expands into straight line
code. The target specifies a maximum load size and the expansion works by using
this size to load the two sources, compare, and exit early if a difference is
found. It also has a special case when the memcmp result is used in a compare
to zero equality.

Differential Revision: https://reviews.llvm.org/D28637

llvm-svn: 304313
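
The expansion strategy described above (load both sources in chunks of at most the maximum load size, compare, exit early on a difference) can be sketched in plain C++ as below. This is a simplified model, not the CodeGenPrepare output: `memcmpExpanded`, `loadBigEndian`, and the 8-byte maximum load size are assumptions for the example, the tail is handled with a single shorter load, and the loop stands in for what would be straight-line code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assembles n (<= 8) bytes in big-endian order, so that comparing the
// resulting integers matches memcmp's byte-wise ordering on any host.
static std::uint64_t loadBigEndian(const unsigned char *p, std::size_t n) {
  std::uint64_t v = 0;
  for (std::size_t i = 0; i < n; ++i)
    v = (v << 8) | p[i];
  return v;
}

// Size is a compile-time constant, mirroring "size known at compile time".
template <std::size_t Size>
int memcmpExpanded(const void *lhs, const void *rhs) {
  constexpr std::size_t MaxLoad = 8; // hypothetical target maximum load size
  const auto *a = static_cast<const unsigned char *>(lhs);
  const auto *b = static_cast<const unsigned char *>(rhs);
  for (std::size_t off = 0; off < Size;) {
    const std::size_t n = (Size - off < MaxLoad) ? Size - off : MaxLoad;
    const std::uint64_t x = loadBigEndian(a + off, n);
    const std::uint64_t y = loadBigEndian(b + off, n);
    if (x != y)
      return x < y ? -1 : 1; // early exit at the first differing chunk
    off += n;
  }
  return 0;
}
```

When the caller only tests the result against zero, the ordering step is unnecessary and an equality check per chunk is enough, which is the special case the commit message mentions.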



Revision tags: llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1
# fccc7d66 12-Apr-2017 Jonas Paulsson <paulsson@linux.vnet.ibm.com>

[SystemZ] TargetTransformInfo cost functions implemented.

getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(),
getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(),
getInterleavedMemoryOpCost() implemented.

Interleaved access vectorization enabled.

BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads,
in which case the cost of the z/sext instruction becomes 0.

Review: Ulrich Weigand, Renato Golin.
https://reviews.llvm.org/D29631

llvm-svn: 300052



Revision tags: llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3
# 7ec2c720 17-Feb-2017 Guozhi Wei <carrot@google.com>

[PPC] Give unaligned memory access lower cost on processor that supports it

Newer PPC processors support unaligned memory access, which significantly reduces its cost. This patch handles this case in PPCTTIImpl::getMemoryOpCost.

This patch fixes pr31492.

Differential Revision: https://reviews.llvm.org/D28630

This is a resubmit of r292680, which was reverted by r293092. The internal application failures were actually caused by a source code bug.

llvm-svn: 295506



Revision tags: llvmorg-4.0.0-rc2
# 65144c85 25-Jan-2017 Daniel Jasper <djasper@google.com>

Revert "[PPC] Give unaligned memory access lower cost on processor that supports it"

This reverts commit r292680. It is causing significantly worse
performance and test timeouts in our internal builds. I have already
routed reproduction instructions your way.

llvm-svn: 293092



# a5c6ed5a 20-Jan-2017 Guozhi Wei <carrot@google.com>

[PPC] Give unaligned memory access lower cost on processor that supports it

Newer PPC processors support unaligned memory access, which significantly reduces its cost. This patch handles this case in PPCTTIImpl::getMemoryOpCost.

This patch fixes pr31492.

Differential Revision: https://reviews.llvm.org/D28630

llvm-svn: 292680



Revision tags: llvmorg-4.0.0-rc1
# 2c96c433 11-Jan-2017 Mohammed Agabaria <mohammed.agabaria@intel.com>

[X86] updating TTI costs for arithmetic instructions on X86\SLM arch.

updated instructions:
pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd.

special optimization case: replaces pmulld with a pmullw\pmulhw\pshuf sequence
when the real operand bitwidth is <= 16.

Differential Revision: https://reviews.llvm.org/D28104

llvm-svn: 291657
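
The special case rests on the fact that a 32-bit product of two values that fit in 16 bits can be rebuilt from a low-half and a high-half 16-bit multiply. The scalar sketch below models a single lane of that idea; the function names are hypothetical, and the real sequence operates on whole vectors and interleaves the halves with a shuffle rather than with shifts.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Low 16 bits of the signed 16x16 product (what a pmullw lane yields).
std::uint16_t mulLow16(std::int16_t a, std::int16_t b) {
  const std::int32_t p = static_cast<std::int32_t>(a) * b;
  return static_cast<std::uint16_t>(static_cast<std::uint32_t>(p) & 0xFFFFu);
}

// High 16 bits of the signed 16x16 product (what a pmulhw lane yields).
std::uint16_t mulHigh16(std::int16_t a, std::int16_t b) {
  const std::int32_t p = static_cast<std::int32_t>(a) * b;
  return static_cast<std::uint16_t>(static_cast<std::uint32_t>(p) >> 16);
}

// Recombine the two halves into the full 32-bit product.
std::int32_t mul32From16(std::int16_t a, std::int16_t b) {
  const std::uint32_t bits =
      (static_cast<std::uint32_t>(mulHigh16(a, b)) << 16) | mulLow16(a, b);
  std::int32_t result;
  std::memcpy(&result, &bits, sizeof(result)); // reinterpret the bit pattern
  return result;
}
```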



Revision tags: llvmorg-3.9.1, llvmorg-3.9.1-rc3
# 835de1f3 03-Dec-2016 Guozhi Wei <carrot@google.com>

[ppc] Correctly compute the cost of loading 32/64 bit memory into VSR

VSX has the instructions lxsiwax/lxsdx, which can load a 32/64-bit value into a VSX register cheaply. This patch makes that known to the memory cost model, so that vectorization of the test case in pr30990 is beneficial.

Differential Revision: https://reviews.llvm.org/D26713

llvm-svn: 288560



Revision tags: llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1, llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2
# b03fd12c 17-Aug-2016 Justin Bogner <mail@justinbogner.com>

Replace "fallthrough" comments with LLVM_FALLTHROUGH

This is a mechanical change of comments in switches like fallthrough,
fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead.

llvm-svn: 278902
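
A minimal sketch of the pattern this change applies is shown below. `SKETCH_FALLTHROUGH` is a stand-in macro defined locally for the example (here assumed to expand to the C++17 `[[fallthrough]]` attribute), and `costFor` is a made-up function; neither is taken from the LLVM sources.

```cpp
#include <cassert>

// Hypothetical stand-in for a fallthrough annotation macro (C++17 attribute).
#define SKETCH_FALLTHROUGH [[fallthrough]]

int costFor(int kind) {
  int cost = 0;
  switch (kind) {
  case 2:
    cost += 10;
    SKETCH_FALLTHROUGH; // previously a "// fallthrough" comment
  case 1:
    cost += 1;
    break;
  default:
    break;
  }
  return cost;
}
```

Unlike a comment, the annotation is visible to the compiler, so an unintended fallthrough elsewhere can still be diagnosed.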



Revision tags: llvmorg-3.9.0-rc1, llvmorg-3.8.1, llvmorg-3.8.1-rc1
# 6e29baf7 09-May-2016 Nemanja Ivanovic <nemanja.i.ibm@gmail.com>

[Power9] Add support for -mcpu=pwr9 in the back end

This patch corresponds to review:
http://reviews.llvm.org/D19683

Simply adds the bits for being able to specify -mcpu=pwr9 to the back end.

llvm-svn: 268950


