#
38bbf81a |
| 14-Oct-2018 |
Dorit Nuzman <dorit.nuzman@intel.com> |
recommit 344472 after fixing build failure on ARM and PPC.
llvm-svn: 344475
|
#
5118c68c |
| 14-Oct-2018 |
Dorit Nuzman <dorit.nuzman@intel.com> |
revert 344472 due to failures.
llvm-svn: 344473
|
#
81743689 |
| 14-Oct-2018 |
Dorit Nuzman <dorit.nuzman@intel.com> |
[IAI,LV] Add support for vectorizing predicated strided accesses using masked interleave-group
The vectorizer currently does not attempt to create interleave-groups that contain predicated loads/sto
[IAI,LV] Add support for vectorizing predicated strided accesses using masked interleave-group
The vectorizer currently does not attempt to create interleave-groups that contain predicated loads/stores; predicated strided accesses can currently be vectorized only using masked gather/scatter or scalarization. This patch makes predicated loads/stores candidates for forming interleave-groups during the Loop-Vectorizer's analysis, and adds the proper support for masked-interleave- groups to the Loop-Vectorizer's planning and transformation stages. The patch also extends the TTI API to allow querying the cost of masked interleave groups (which each target can control); Targets that support masked vector loads/ stores may choose to enable this feature and allow vectorizing predicated strided loads/stores using masked wide loads/stores and shuffles.
Reviewers: Ayal, hsaito, dcaballe, fhahn, javed.absar
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D53011
llvm-svn: 344472
show more ...
|
Revision tags: llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1 |
|
#
f78650a8 |
| 30-Jul-2018 |
Fangrui Song <maskray@google.com> |
Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}
llvm-svn: 338293
|
Revision tags: llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1 |
|
#
ef7c4976 |
| 09-Mar-2018 |
Stefan Pintilie <stefanp@ca.ibm.com> |
Revert "[PowerPC] LSR tunings for PowerPC"
Revert the rest of the LST tune commit. It seems that the LSR tune commit breaks internal tests. Reverting the commit.
llvm-svn: 327143
|
#
f8438e8e |
| 07-Mar-2018 |
Stefan Pintilie <stefanp@ca.ibm.com> |
[PowerPC] LSR tunings for PowerPC
The purpose of this patch is to have LSR generate better code on Power. This is done by overriding isLSRCostLess.
Differential Revision: https://reviews.llvm.org/D
[PowerPC] LSR tunings for PowerPC
The purpose of this patch is to have LSR generate better code on Power. This is done by overriding isLSRCostLess.
Differential Revision: https://reviews.llvm.org/D40855
llvm-svn: 326906
show more ...
|
Revision tags: llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2 |
|
#
1f59ae31 |
| 30-Jan-2018 |
Zaara Syeda <syzaara@ca.ibm.com> |
Re-commit : [PowerPC] Add handling for ColdCC calling convention and a pass to mark candidates with coldcc attribute.
This recommits r322721 reverted due to sanitizer memory leak build bot failures.
Re-commit : [PowerPC] Add handling for ColdCC calling convention and a pass to mark candidates with coldcc attribute.
This recommits r322721 reverted due to sanitizer memory leak build bot failures.
Original commit message: This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions.
Differential Revision: https://reviews.llvm.org/D38413
llvm-svn: 323778
show more ...
|
#
c9dc7b45 |
| 17-Jan-2018 |
Zaara Syeda <syzaara@ca.ibm.com> |
Revert [PowerPC] This reverts commit rL322721
Failing build bots. Revert the commit now.
llvm-svn: 322748
|
#
8e951fd2 |
| 17-Jan-2018 |
Zaara Syeda <syzaara@ca.ibm.com> |
[PowerPC] Add handling for ColdCC calling convention and a pass to mark candidates with coldcc attribute.
This patch adds support for the coldcc calling convention for Power. This changes the set of
[PowerPC] Add handling for ColdCC calling convention and a pass to mark candidates with coldcc attribute.
This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions.
Differential Revision: https://reviews.llvm.org/D38413
llvm-svn: 322721
show more ...
|
Revision tags: llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2 |
|
#
b3bde2ea |
| 17-Nov-2017 |
David Blaikie <dblaikie@gmail.com> |
Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, n
Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around).
llvm-svn: 318490
show more ...
|
Revision tags: llvmorg-5.0.1-rc1 |
|
#
b2c3eb8c |
| 30-Oct-2017 |
Clement Courbet <courbet@google.com> |
[CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2).
- Targets that want to support memcmp expansions now return the list of supported load sizes. - Expansion codegen does not as
[CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2).
- Targets that want to support memcmp expansions now return the list of supported load sizes. - Expansion codegen does not assume that all power-of-two load sizes smaller than the max load size are valid. For examples, this is not the case for x86(32bit)+sse2.
Fixes PR34887.
llvm-svn: 316905
show more ...
|
#
488782ef |
| 19-Oct-2017 |
Graham Yiu <gyiu@ca.ibm.com> |
The cost of splitting a large vector instruction is not being taken into account by the getUserCost function. This was leading to some loops being over unrolled. The cost of a vector instruction is n
The cost of splitting a large vector instruction is not being taken into account by the getUserCost function. This was leading to some loops being over unrolled. The cost of a vector instruction is now being multiplied by the cost of the type legalization. This will return a more accurate cost.
Committing on behalf on Brad Nemanich (brad.nemanich@ibm.com)
Differential Revision: https://reviews.llvm.org/D38961
llvm-svn: 316174
show more ...
|
#
2807c0a4 |
| 25-Sep-2017 |
Clement Courbet <courbet@google.com> |
[CodeGenPrepare][NFC] Rename TargetTransformInfo::expandMemCmp -> TargetTransformInfo::enableMemCmpExpansion.
Summary: Right now there are two functions with the same name, one does the work and the
[CodeGenPrepare][NFC] Rename TargetTransformInfo::expandMemCmp -> TargetTransformInfo::enableMemCmpExpansion.
Summary: Right now there are two functions with the same name, one does the work and the other one returns true if expansion is needed. Rename TargetTransformInfo::expandMemCmp to make it more consistent with other members of TargetTransformInfo.
Remove the unused Instruction* parameter.
Differential Revision: https://reviews.llvm.org/D38165
llvm-svn: 314096
show more ...
|
Revision tags: llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1 |
|
#
66d9bdbc |
| 28-Jun-2017 |
Geoff Berry <gberry@codeaurora.org> |
[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI.
Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper
Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, j
[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI.
Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper
Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, javed.absar, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D34531
llvm-svn: 306554
show more ...
|
Revision tags: llvmorg-4.0.1, llvmorg-4.0.1-rc3 |
|
#
c0112ae8 |
| 12-Jun-2017 |
Daniel Neilson <dneilson@azul.com> |
Const correctness for TTI::getRegisterBitWidth
Summary: The method TargetTransformInfo::getRegisterBitWidth() is declared const, but the type erasing implementation classes (TargetTransformInfo::Con
Const correctness for TTI::getRegisterBitWidth
Summary: The method TargetTransformInfo::getRegisterBitWidth() is declared const, but the type erasing implementation classes (TargetTransformInfo::Concept & TargetTransformInfo::Model) that were introduced by Chandler in https://reviews.llvm.org/D7293 do not have the method declared const. This is an NFC to tidy up the const consistency between TTI and its implementation.
Reviewers: chandlerc, rnk, reames
Reviewed By: reames
Subscribers: reames, jfb, arsenm, dschuff, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, llvm-commits
Differential Revision: https://reviews.llvm.org/D33903
llvm-svn: 305189
show more ...
|
#
457ddd31 |
| 31-May-2017 |
Sean Fertile <sfertile@ca.ibm.com> |
[PowerPC] Correctly specify the cache line size for Power 7, 8 and 9.
Fixes PPCTTIImpl::getCacheLineSize() returning the wrong cache line size for newer ppc processors.
Commiting on behalf of Stefa
[PowerPC] Correctly specify the cache line size for Power 7, 8 and 9.
Fixes PPCTTIImpl::getCacheLineSize() returning the wrong cache line size for newer ppc processors.
Commiting on behalf of Stefan Pintilie. Differential Revision: https://reviews.llvm.org/D33656
llvm-svn: 304317
show more ...
|
#
3a7578c6 |
| 31-May-2017 |
Zaara Syeda <syzaara@ca.ibm.com> |
[PPC] Inline expansion of memcmp
This patch does an inline expansion of memcmp. It changes the memcmp library call into an inline expansion when the size is known at compile time and is under a targ
[PPC] Inline expansion of memcmp
This patch does an inline expansion of memcmp. It changes the memcmp library call into an inline expansion when the size is known at compile time and is under a target specified threshold. This expansion is implemented in CodeGenPrepare and expands into straight line code. The target specifies a maximum load size and the expansion works by using this size to load the two sources, compare, and exit early if a difference is found. It also has a special case when the memcmp result is used in a compare to zero equality.
Differential Revision: https://reviews.llvm.org/D28637
llvm-svn: 304313
show more ...
|
Revision tags: llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1 |
|
#
fccc7d66 |
| 12-Apr-2017 |
Jonas Paulsson <paulsson@linux.vnet.ibm.com> |
[SystemZ] TargetTransformInfo cost functions implemented.
getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(), getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(), getInterleav
[SystemZ] TargetTransformInfo cost functions implemented.
getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(), getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(), getInterleavedMemoryOpCost() implemented.
Interleaved access vectorization enabled.
BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads, in which case the cost of the z/sext instruction becomes 0.
Review: Ulrich Weigand, Renato Golin. https://reviews.llvm.org/D29631
llvm-svn: 300052
show more ...
|
Revision tags: llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3 |
|
#
7ec2c720 |
| 17-Feb-2017 |
Guozhi Wei <carrot@google.com> |
[PPC] Give unaligned memory access lower cost on processor that supports it
Newer ppc supports unaligned memory access, it reduces the cost of unaligned memory access significantly. This patch handl
[PPC] Give unaligned memory access lower cost on processor that supports it
Newer ppc supports unaligned memory access, it reduces the cost of unaligned memory access significantly. This patch handles this case in PPCTTIImpl::getMemoryOpCost.
This patch fixes pr31492.
Differential Revision: https://reviews.llvm.org/D28630
This is resubmit of r292680, which was reverted by r293092. The internal application failures were actually caused by a source code bug.
llvm-svn: 295506
show more ...
|
Revision tags: llvmorg-4.0.0-rc2 |
|
#
65144c85 |
| 25-Jan-2017 |
Daniel Jasper <djasper@google.com> |
Revert "[PPC] Give unaligned memory access lower cost on processor that supports it"
This reverts commit r292680. It is causing significantly worse performance and test timeouts in our internal buil
Revert "[PPC] Give unaligned memory access lower cost on processor that supports it"
This reverts commit r292680. It is causing significantly worse performance and test timeouts in our internal builds. I have already routed reproduction instructions your way.
llvm-svn: 293092
show more ...
|
#
a5c6ed5a |
| 20-Jan-2017 |
Guozhi Wei <carrot@google.com> |
[PPC] Give unaligned memory access lower cost on processor that supports it
Newer ppc supports unaligned memory access, it reduces the cost of unaligned memory access significantly. This patch handl
[PPC] Give unaligned memory access lower cost on processor that supports it
Newer ppc supports unaligned memory access, it reduces the cost of unaligned memory access significantly. This patch handles this case in PPCTTIImpl::getMemoryOpCost.
This patch fixes pr31492.
Differential Revision: https://reviews.llvm.org/D28630
llvm-svn: 292680
show more ...
|
Revision tags: llvmorg-4.0.0-rc1 |
|
#
2c96c433 |
| 11-Jan-2017 |
Mohammed Agabaria <mohammed.agabaria@intel.com> |
[X86] updating TTI costs for arithmetic instructions on X86\SLM arch.
updated instructions: pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd.
special optimiz
[X86] updating TTI costs for arithmetic instructions on X86\SLM arch.
updated instructions: pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd.
special optimization case which replaces pmulld with pmullw\pmulhw\pshuf seq. In case if the real operands bitwidth <= 16.
Differential Revision: https://reviews.llvm.org/D28104
llvm-svn: 291657
show more ...
|
Revision tags: llvmorg-3.9.1, llvmorg-3.9.1-rc3 |
|
#
835de1f3 |
| 03-Dec-2016 |
Guozhi Wei <carrot@google.com> |
[ppc] Correctly compute the cost of loading 32/64 bit memory into VSR
VSX has instructions lxsiwax/lxsdx that can load 32/64 bit value into VSX register cheaply. That patch makes it known to memory
[ppc] Correctly compute the cost of loading 32/64 bit memory into VSR
VSX has instructions lxsiwax/lxsdx that can load 32/64 bit value into VSX register cheaply. That patch makes it known to memory cost model, so the vectorization of the test case in pr30990 is beneficial.
Differential Revision: https://reviews.llvm.org/D26713
llvm-svn: 288560
show more ...
|
Revision tags: llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1, llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2 |
|
#
b03fd12c |
| 17-Aug-2016 |
Justin Bogner <mail@justinbogner.com> |
Replace "fallthrough" comments with LLVM_FALLTHROUGH
This is a mechanical change of comments in switches like fallthrough, fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead.
llvm
Replace "fallthrough" comments with LLVM_FALLTHROUGH
This is a mechanical change of comments in switches like fallthrough, fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead.
llvm-svn: 278902
show more ...
|
Revision tags: llvmorg-3.9.0-rc1, llvmorg-3.8.1, llvmorg-3.8.1-rc1 |
|
#
6e29baf7 |
| 09-May-2016 |
Nemanja Ivanovic <nemanja.i.ibm@gmail.com> |
[Power9] Add support for -mcpu=pwr9 in the back end
This patch corresponds to review: http://reviews.llvm.org/D19683
Simply adds the bits for being able to specify -mcpu=pwr9 to the back end.
llvm
[Power9] Add support for -mcpu=pwr9 in the back end
This patch corresponds to review: http://reviews.llvm.org/D19683
Simply adds the bits for being able to specify -mcpu=pwr9 to the back end.
llvm-svn: 268950
show more ...
|