Revision tags: llvmorg-9.0.1, llvmorg-9.0.1-rc3 |
|
#
be7a1070 |
| 08-Dec-2019 |
David Green <david.green@arm.com> |
[ARM] Teach the Arm cost model that a Shift can be folded into other instructions
This attempts to teach the cost model in Arm that code such as: %s = shl i32 %a, 3 %a = and i32 %s, %b Can under
[ARM] Teach the Arm cost model that a Shift can be folded into other instructions
This attempts to teach the cost model in Arm that code such as: %s = shl i32 %a, 3 %a = and i32 %s, %b Can under Arm or Thumb2 become: and r0, r1, r2, lsl #3
So the cost of the shift can essentially be free. To do this without trying to artificially adjust the cost of the "and" instruction, it needs to get the users of the shl and check if they are a type of instruction that the shift can be folded into. And so it needs to have access to the actual instruction in getArithmeticInstrCost, which if available is added as an extra parameter much like getCastInstrCost.
We otherwise limit it to shifts with a single user, which should hopefully handle most of the cases. The list of instruction that the shift can be folded into include ADC, ADD, AND, BIC, CMP, EOR, MVN, ORR, ORN, RSB, SBC and SUB. This translates to Add, Sub, And, Or, Xor and ICmp.
Differential Revision: https://reviews.llvm.org/D70966
show more ...
|
Revision tags: llvmorg-9.0.1-rc2 |
|
#
dcceab1a |
| 27-Nov-2019 |
Stefan Pintilie <stefanp@ca.ibm.com> |
[PowerPC] Add new Future CPU for PowerPC in LLVM
This is a continuation of D70262 The previous patch as listed above added the future CPU in clang. This patch adds the future CPU in the PowerPC back
[PowerPC] Add new Future CPU for PowerPC in LLVM
This is a continuation of D70262 The previous patch as listed above added the future CPU in clang. This patch adds the future CPU in the PowerPC backend. At this point the patch simply assumes that a future CPU will have the same characteristics as pwr9. Those characteristics may change with later patches.
Differential Revision: https://reviews.llvm.org/D70333
show more ...
|
Revision tags: llvmorg-9.0.1-rc1 |
|
#
85e4f5bc |
| 15-Nov-2019 |
Kit Barton <kbarton@ca.ibm.com> |
[PowerPC] Rename DarwinDirective to CPUDirective (NFC)
Summary: This patch renames the DarwinDirective (used to identify which CPU was defined) to CPUDirective. It also adds the getCPUDirective() me
[PowerPC] Rename DarwinDirective to CPUDirective (NFC)
Summary: This patch renames the DarwinDirective (used to identify which CPU was defined) to CPUDirective. It also adds the getCPUDirective() method and replaces all uses of getDarwinDirective() with getCPUDirective().
Once this patch lands and downstream users of the getDarwinDirective() method have switched to the getCPUDirective() method, the old getDarwinDirective() method will be removed.
Reviewers: nemanjai, hfinkel, power-llvm-team, jsji, echristo, #powerpc, jhibbits
Reviewed By: hfinkel, jsji, jhibbits
Subscribers: hiraditya, shchenz, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70352
show more ...
|
#
97e36260 |
| 28-Oct-2019 |
Nemanja Ivanovic <nemanjai@ca.ibm.com> |
[PowerPC] Do not emit HW loop if the body contains calls to lrint/lround
These two intrinsics are lowered to calls so should prevent the formation of CTR loops. In a subsequent patch, we will handle
[PowerPC] Do not emit HW loop if the body contains calls to lrint/lround
These two intrinsics are lowered to calls so should prevent the formation of CTR loops. In a subsequent patch, we will handle all currently known intrinsics and prevent the formation of HW loops if any unknown intrinsics are encountered.
Differential revision: https://reviews.llvm.org/D68841
show more ...
|
#
a4783ef5 |
| 22-Oct-2019 |
Guillaume Chatelet <gchatelet@google.com> |
[Alignment][NFC] getMemoryOpCost uses MaybeAlign
Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019
[Alignment][NFC] getMemoryOpCost uses MaybeAlign
Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69307
show more ...
|
#
9802268a |
| 12-Oct-2019 |
Zi Xuan Wu <wuzish@cn.ibm.com> |
recommit: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize
In loop-vectorize, interleave count and vector factor depend on target register number. Curre
recommit: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize
In loop-vectorize, interleave count and vector factor depend on target register number. Currently, it does not estimate different register pressure for different register class separately(especially for scalar type, float type should not be on the same position with int type), so it's not accurate. Specifically, it causes too many times interleaving/unrolling, result in too many register spills in loop body and hurting performance.
So we need classify the register classes in IR level, and importantly these are abstract register classes, and are not the target register class of backend provided in td file. It's used to establish the mapping between the types of IR values and the number of simultaneous live ranges to which we'd like to limit for some set of those types.
For example, POWER target, register num is special when VSX is enabled. When VSX is enabled, the number of int scalar register is 32(GPR), float is 64(VSR), but for int and float vector register both are 64(VSR). So there should be 2 kinds of register class when vsx is enabled, and 3 kinds of register class when VSX is NOT enabled.
It runs on POWER target, it makes big(+~30%) performance improvement in one specific bmk(503.bwaves_r) of spec2017 and no other obvious degressions.
Differential revision: https://reviews.llvm.org/D67148
llvm-svn: 374634
show more ...
|
#
2e6f6b4d |
| 09-Oct-2019 |
David Greene <greened@obbligato.org> |
[System Model] [TTI] Update cache and prefetch TTI interfaces
Re-apply 9fdfb045ae8b/r365676 with fixes for PPC and Hexagon. This involved moving defaults from TargetTransformInfoImplBase to MCSubta
[System Model] [TTI] Update cache and prefetch TTI interfaces
Re-apply 9fdfb045ae8b/r365676 with fixes for PPC and Hexagon. This involved moving defaults from TargetTransformInfoImplBase to MCSubtargetInfo.
Rework the TTI cache and software prefetching APIs to prepare for the introduction of a general system model. Changes include:
- Marking existing interfaces const and/or override as appropriate - Adding comments - Adding BasicTTIImpl interfaces that delegate to a subtarget implementation - Moving the default TargetTransformInfoImplBase implementation to a default MCSubtarget implementation
Only a handful of targets use these interfaces currently: AArch64, Hexagon, PPC and SystemZ. AArch64 already has a custom subtarget implementation, so its custom TTI implementation is migrated to use the new facilities in BasicTTIImpl to invoke its custom subtarget implementation. The custom TTI implementations continue to exist for the other targets with this change. They are not moved over to subtarget-based implementations.
The end goal is to have the default subtarget implementation defer to the system model defined by the target. With this change, the default MCSubtargetInfo implementation essentially returns the defaults TargetTransformInfoImplBase used to return. Existing users of TTI defaults will hit the defaults now in MCSubtargetInfo. Targets that define their own custom TTI implementations won't use the BasicTTIImpl implementations that route to the subtarget.
Once system models are in place for the targets that use these interfaces, their custom TTI implementations can be removed.
Differential Revision: https://reviews.llvm.org/D63614
llvm-svn: 374205
show more ...
|
#
9912232b |
| 08-Oct-2019 |
Jinsong Ji <jji@us.ibm.com> |
Revert "[LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize"
Also Revert "[LoopVectorize] Fix non-debug builds after rL374017"
This reverts commit 9f41dec
Revert "[LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize"
Also Revert "[LoopVectorize] Fix non-debug builds after rL374017"
This reverts commit 9f41deccc0e648a006c9f38e11919f181b6c7e0a. This reverts commit 18b6fe07bcf44294f200bd2b526cb737ed275c04.
The patch is breaking PowerPC internal build, checked with author, reverting on behalf of him for now due to timezone.
llvm-svn: 374091
show more ...
|
#
9f41decc |
| 08-Oct-2019 |
Zi Xuan Wu <wuzish@cn.ibm.com> |
[LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize
In loop-vectorize, interleave count and vector factor depend on target register number. Currently, it d
[LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize
In loop-vectorize, interleave count and vector factor depend on target register number. Currently, it does not estimate different register pressure for different register class separately(especially for scalar type, float type should not be on the same position with int type), so it's not accurate. Specifically, it causes too many times interleaving/unrolling, result in too many register spills in loop body and hurting performance.
So we need classify the register classes in IR level, and importantly these are abstract register classes, and are not the target register class of backend provided in td file. It's used to establish the mapping between the types of IR values and the number of simultaneous live ranges to which we'd like to limit for some set of those types.
For example, POWER target, register num is special when VSX is enabled. When VSX is enabled, the number of int scalar register is 32(GPR), float is 64(VSR), but for int and float vector register both are 64(VSR). So there should be 2 kinds of register class when vsx is enabled, and 3 kinds of register class when VSX is NOT enabled.
It runs on POWER target, it makes big(+~30%) performance improvement in one specific bmk(503.bwaves_r) of spec2017 and no other obvious degressions.
Differential revision: https://reviews.llvm.org/D67148
llvm-svn: 374017
show more ...
|
Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3 |
|
#
18db4e9a |
| 26-Aug-2019 |
Roland Froese <froese@ca.ibm.com> |
Recommit [PowerPC] Update P9 vector costs for insert/extract
Now that the v1i128 smin regression has been fixed, recommit the P9 cost updates from D60160.
llvm-svn: 369952
|
Revision tags: llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init |
|
#
d300a493 |
| 10-Jul-2019 |
David Greene <greened@obbligato.org> |
Revert "[System Model] [TTI] Update cache and prefetch TTI interfaces"
This broke some PPC prefetching tests.
This reverts commit 9fdfb045ae8bb643ab0d0455dcf9ecaea3b1eb3c.
llvm-svn: 365680
|
#
9fdfb045 |
| 10-Jul-2019 |
David Greene <greened@obbligato.org> |
[System Model] [TTI] Update cache and prefetch TTI interfaces
Rework the TTI cache and software prefetching APIs to prepare for the introduction of a general system model. Changes include:
- Marki
[System Model] [TTI] Update cache and prefetch TTI interfaces
Rework the TTI cache and software prefetching APIs to prepare for the introduction of a general system model. Changes include:
- Marking existing interfaces const and/or override as appropriate - Adding comments - Adding BasicTTIImpl interfaces that delegate to a subtarget implementation - Adding a default "no information" subtarget implementation
Only a handful of targets use these interfaces currently: AArch64, Hexagon, PPC and SystemZ. AArch64 already has a custom subtarget implementation, so its custom TTI implementation is migrated to use the new facilities in BasicTTIImpl to invoke its custom subtarget implementation. The custom TTI implementations continue to exist for the other targets with this change. They are not moved over to subtarget-based implementations.
The end goal is to have the default subtarget implementation defer to the system model defined by the target. With this change, the default subtarget implementation essentially returns "no information" for these interfaces. None of the existing users of TTI will hit that implementation because they define their own custom TTI implementations and won't use the BasicTTIImpl implementations.
Once system models are in place for the targets that use these interfaces, their custom TTI implementations can be removed.
Differential Revision: https://reviews.llvm.org/D63614
llvm-svn: 365676
show more ...
|
Revision tags: llvmorg-8.0.1, llvmorg-8.0.1-rc4 |
|
#
dfdccbb2 |
| 03-Jul-2019 |
Chen Zheng <czhengsz@cn.ibm.com> |
[PowerPC] exclude ICmpZero in LSR if icmp can be replaced in later hardware loop. Differential Revision: https://reviews.llvm.org/D63477
llvm-svn: 364993
|
#
351b7e7b |
| 01-Jul-2019 |
Jordan Rupprecht <rupprecht@google.com> |
Revert Recommit [PowerPC] Update P9 vector costs for insert/extract element
This reverts r364557 (git commit 9f7f5858fe46b8e706e87a83e2fd0a2678be619e)
This crashes as reported on the commit thread.
Revert Recommit [PowerPC] Update P9 vector costs for insert/extract element
This reverts r364557 (git commit 9f7f5858fe46b8e706e87a83e2fd0a2678be619e)
This crashes as reported on the commit thread. Repro instructions TBD.
llvm-svn: 364876
show more ...
|
#
9f7f5858 |
| 27-Jun-2019 |
Roland Froese <froese@ca.ibm.com> |
Recommit [PowerPC] Update P9 vector costs for insert/extract element
Recommit patch D60160 after regression fix patch D63463.
llvm-svn: 364557
|
Revision tags: llvmorg-8.0.1-rc3 |
|
#
3bc5ad55 |
| 25-Jun-2019 |
Clement Courbet <courbet@google.com> |
[ExpandMemCmp] Move all options to TargetTransformInfo.
Split off from D60318.
llvm-svn: 364281
|
#
c5b918de |
| 19-Jun-2019 |
Chen Zheng <czhengsz@cn.ibm.com> |
[NFC] move some hardware loop checking code to a common place for other using. Differential Revision: https://reviews.llvm.org/D63478
llvm-svn: 363758
|
Revision tags: llvmorg-8.0.1-rc2 |
|
#
c5ef502e |
| 07-Jun-2019 |
Sam Parker <sam.parker@arm.com> |
[CodeGen] Generic Hardware Loop Support Patch which introduces a target-independent framework for generating hardware loops at the IR level. Most of the code has been taken from PowerPC CTRLoops
[CodeGen] Generic Hardware Loop Support Patch which introduces a target-independent framework for generating hardware loops at the IR level. Most of the code has been taken from PowerPC CTRLoops and PowerPC has been ported over to use this generic pass. The target dependent parts have been moved into TargetTransformInfo, via isHardwareLoopProfitable, with HardwareLoopInfo introduced to transfer information from the backend. Three generic intrinsics have been introduced: - void @llvm.set_loop_iterations Takes as a single operand, the number of iterations to be executed. - i1 @llvm.loop_decrement(anyint) Takes the maximum number of elements processed in an iteration of the loop body and subtracts this from the total count. Returns false when the loop should exit. - anyint @llvm.loop_decrement_reg(anyint, anyint) Takes the number of elements remaining to be processed as well as the maximum numbe of elements processed in an iteration of the loop body. Returns the updated number of elements remaining.
llvm-svn: 362774
show more ...
|
Revision tags: llvmorg-8.0.1-rc1 |
|
#
fccb505f |
| 01-May-2019 |
David L. Jones <dlj@google.com> |
Revert "[llvm] r359313 - [PowerPC] Update P9 vector costs for insert/extract element"
This causes segfaults during optimized builds. More details, including a reproducer, are on the llvm-commits thr
Revert "[llvm] r359313 - [PowerPC] Update P9 vector costs for insert/extract element"
This causes segfaults during optimized builds. More details, including a reproducer, are on the llvm-commits thread for r359313.
llvm-svn: 359648
show more ...
|
#
2755b73b |
| 29-Apr-2019 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
Fix operator precedence warning. NFCI.
Reported in https://www.viva64.com/en/b/0629/
llvm-svn: 359469
|
#
4b17772b |
| 26-Apr-2019 |
Roland Froese <froese@ca.ibm.com> |
[PowerPC] Update P9 vector costs for insert/extract element
The PPC vector cost model values for insert/extract element reflect older processors that lacked vector insert/extract and move-to/move-fr
[PowerPC] Update P9 vector costs for insert/extract element
The PPC vector cost model values for insert/extract element reflect older processors that lacked vector insert/extract and move-to/move-from VSR instructions. Update getVectorInstrCost to give appropriate values for when the newer instructions are present.
Differential Revision: https://reviews.llvm.org/D60160
llvm-svn: 359313
show more ...
|
Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2 |
|
#
7f29195c |
| 01-Feb-2019 |
Roland Froese <froese@ca.ibm.com> |
test commit (add blank line) NFC
llvm-svn: 352897
|
#
7d007dde |
| 26-Jan-2019 |
Nemanja Ivanovic <nemanja.i.ibm@gmail.com> |
[PowerPC] Update Vector Costs for P9
For the power9 CPU, vector operations consume a pair of execution units rather than one execution unit like a scalar operation. Update the target transform cost
[PowerPC] Update Vector Costs for P9
For the power9 CPU, vector operations consume a pair of execution units rather than one execution unit like a scalar operation. Update the target transform cost functions to reflect the higher cost of vector operations when targeting Power9.
Patch by RolandF.
Differential revision: https://reviews.llvm.org/D55461
llvm-svn: 352261
show more ...
|
Revision tags: llvmorg-8.0.0-rc1 |
|
#
2946cd70 |
| 19-Jan-2019 |
Chandler Carruth <chandlerc@gmail.com> |
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the ne
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
llvm-svn: 351636
show more ...
|
Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1 |
|
#
34da6dd6 |
| 31-Oct-2018 |
Dorit Nuzman <dorit.nuzman@intel.com> |
[LV] Support vectorization of interleave-groups that require an epilog under optsize using masked wide loads
Under Opt for Size, the vectorizer does not vectorize interleave-groups that have gaps a
[LV] Support vectorization of interleave-groups that require an epilog under optsize using masked wide loads
Under Opt for Size, the vectorizer does not vectorize interleave-groups that have gaps at the end of the group (such as a loop that reads only the even elements: a[2*i]) because that implies that we'll require a scalar epilogue (which is not allowed under Opt for Size). This patch extends the support for masked-interleave-groups (introduced by D53011 for conditional accesses) to also cover the case of gaps in a group of loads; Targets that enable the masked-interleave-group feature don't have to invalidate interleave-groups of loads with gaps; they could now use masked wide-loads and shuffles (if that's what the cost model selects).
Reviewers: Ayal, hsaito, dcaballe, fhahn
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D53668
llvm-svn: 345705
show more ...
|