History log of /llvm-project/llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp (Results 126 – 150 of 227)
Revision tags: llvmorg-9.0.1, llvmorg-9.0.1-rc3
# be7a1070 08-Dec-2019 David Green <david.green@arm.com>

[ARM] Teach the Arm cost model that a Shift can be folded into other instructions

This attempts to teach the cost model in Arm that code such as:
%s = shl i32 %a, 3
%a = and i32 %s, %b
Can under Arm or Thumb2 become:
and r0, r1, r2, lsl #3

So the cost of the shift can essentially be free. To do this without
trying to artificially adjust the cost of the "and" instruction, it
needs to get the users of the shl and check if they are a type of
instruction that the shift can be folded into. And so it needs to have
access to the actual instruction in getArithmeticInstrCost, which if
available is added as an extra parameter much like getCastInstrCost.

We otherwise limit it to shifts with a single user, which should
hopefully handle most of the cases. The list of instructions that the
shift can be folded into includes ADC, ADD, AND, BIC, CMP, EOR, MVN, ORR,
ORN, RSB, SBC and SUB. This translates to Add, Sub, And, Or, Xor and
ICmp.

Differential Revision: https://reviews.llvm.org/D70966
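
The rule described above can be sketched in isolation. The following is a minimal illustration, not the code from the patch: `Opcode`, `Instr`, `canFoldShiftInto` and `arithmeticCost` are hypothetical stand-ins for the real IR and cost-model types, and the sketch ignores details such as which operand the shift feeds.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for IR opcodes; illustrative only, not LLVM's types.
enum class Opcode { Shl, LShr, AShr, Add, Sub, And, Or, Xor, ICmp, Mul };

struct Instr {
  Opcode Op;
  std::vector<const Instr *> Users; // instructions that consume this value
};

inline bool isShift(Opcode Op) {
  return Op == Opcode::Shl || Op == Opcode::LShr || Op == Opcode::AShr;
}

// IR-level operations that the commit message lists as able to absorb a shift.
inline bool canFoldShiftInto(Opcode Op) {
  switch (Op) {
  case Opcode::Add:
  case Opcode::Sub:
  case Opcode::And:
  case Opcode::Or:
  case Opcode::Xor:
  case Opcode::ICmp:
    return true;
  default:
    return false;
  }
}

// Hypothetical cost query: a shift with exactly one user that can fold it
// is treated as free; everything else keeps its base cost.
inline int arithmeticCost(const Instr &I, int BaseCost = 1) {
  if (isShift(I.Op) && I.Users.size() == 1 &&
      canFoldShiftInto(I.Users.front()->Op))
    return 0;
  return BaseCost;
}
```

With these toy types, a `Shl` whose only user is an `And` costs 0, while a `Shl` feeding a `Mul`, or one with two users, keeps its base cost.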



Revision tags: llvmorg-9.0.1-rc2
# dcceab1a 27-Nov-2019 Stefan Pintilie <stefanp@ca.ibm.com>

[PowerPC] Add new Future CPU for PowerPC in LLVM

This is a continuation of D70262.
The previous patch as listed above added the future CPU in clang. This patch
adds the future CPU in the PowerPC backend. At this point the patch simply
assumes that a future CPU will have the same characteristics as pwr9. Those
characteristics may change with later patches.

Differential Revision: https://reviews.llvm.org/D70333



Revision tags: llvmorg-9.0.1-rc1
# 85e4f5bc 15-Nov-2019 Kit Barton <kbarton@ca.ibm.com>

[PowerPC] Rename DarwinDirective to CPUDirective (NFC)

Summary:
This patch renames the DarwinDirective (used to identify which CPU was defined)
to CPUDirective. It also adds the getCPUDirective() method and replaces all uses
of getDarwinDirective() with getCPUDirective().

Once this patch lands and downstream users of the getDarwinDirective() method
have switched to the getCPUDirective() method, the old getDarwinDirective()
method will be removed.

Reviewers: nemanjai, hfinkel, power-llvm-team, jsji, echristo, #powerpc, jhibbits

Reviewed By: hfinkel, jsji, jhibbits

Subscribers: hiraditya, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70352



# 97e36260 28-Oct-2019 Nemanja Ivanovic <nemanjai@ca.ibm.com>

[PowerPC] Do not emit HW loop if the body contains calls to lrint/lround

These two intrinsics are lowered to calls so should prevent the formation of
CTR loops. In a subsequent patch, we will handle all currently known intrinsics
and prevent the formation of HW loops if any unknown intrinsics are encountered.

Differential revision: https://reviews.llvm.org/D68841



# a4783ef5 22-Oct-2019 Guillaume Chatelet <gchatelet@google.com>

[Alignment][NFC] getMemoryOpCost uses MaybeAlign

Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69307



# 9802268a 12-Oct-2019 Zi Xuan Wu <wuzish@cn.ibm.com>

recommit: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize

In loop-vectorize, the interleave count and vectorization factor depend on the number of target registers. Currently,
register pressure is not estimated separately for each register class (in particular, scalar float types
should not be counted together with scalar int types), so the estimate is inaccurate. Specifically,
it causes too much interleaving/unrolling, which results in too many register spills in the loop body and hurts performance.

So we need to classify the register classes at the IR level. Importantly, these are abstract register classes,
not the target register classes the backend provides in its td files. They are used to establish the mapping between
the types of IR values and the number of simultaneous live ranges to which we'd like to limit some set of those types.

For example, on the POWER target the register counts are special when VSX is enabled. With VSX, there are 32 int scalar registers (GPR)
and 64 float scalar registers (VSR), while int and float vector registers both number 64 (VSR). So there should be 2 kinds of register class
when VSX is enabled, and 3 kinds of register class when VSX is NOT enabled.

On the POWER target, this gives a big (+~30%) performance improvement in one specific benchmark (503.bwaves_r) of SPEC2017 and no other obvious regressions.

Differential revision: https://reviews.llvm.org/D67148

llvm-svn: 374634
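
The abstract register classes described above can be illustrated with a small mapping from value kinds to classes. This is a sketch under stated assumptions, not the interface added by the patch: the names `ValueKind`, `RegClass`, `registerClassFor` and `numRegisters` are hypothetical, and the counts used for the non-VSX case (32 floating-point and 32 vector registers) are an assumption based on the PowerPC architecture rather than something stated in the commit message.

```cpp
#include <cassert>

// Hypothetical value kinds and abstract register classes for illustration.
enum class ValueKind { IntScalar, FloatScalar, Vector };
enum class RegClass { GPR, FPR, VR, VSR };

// With VSX, float scalars and all vectors share the VSR file (2 classes).
// Without VSX, float scalars and vectors use separate files (3 classes).
inline RegClass registerClassFor(ValueKind K, bool HasVSX) {
  if (K == ValueKind::IntScalar)
    return RegClass::GPR;
  if (HasVSX)
    return RegClass::VSR;
  return K == ValueKind::FloatScalar ? RegClass::FPR : RegClass::VR;
}

// Register counts: 32 GPRs and 64 VSRs come from the commit message;
// 32 for FPR and VR is an assumption made for this sketch.
inline unsigned numRegisters(RegClass RC) {
  switch (RC) {
  case RegClass::VSR:
    return 64;
  case RegClass::GPR:
  case RegClass::FPR:
  case RegClass::VR:
    return 32;
  }
  return 0;
}
```

A vectorizer using such a mapping could track live ranges per class and compare each total against `numRegisters` for that class, rather than comparing one combined count against a single limit.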



# 2e6f6b4d 09-Oct-2019 David Greene <greened@obbligato.org>

[System Model] [TTI] Update cache and prefetch TTI interfaces

Re-apply 9fdfb045ae8b/r365676 with fixes for PPC and Hexagon. This involved
moving defaults from TargetTransformInfoImplBase to MCSubtargetInfo.

Rework the TTI cache and software prefetching APIs to prepare for the
introduction of a general system model. Changes include:

- Marking existing interfaces const and/or override as appropriate
- Adding comments
- Adding BasicTTIImpl interfaces that delegate to a subtarget
  implementation
- Moving the default TargetTransformInfoImplBase implementation to a default
  MCSubtarget implementation

Only a handful of targets use these interfaces currently: AArch64, Hexagon, PPC
and SystemZ. AArch64 already has a custom subtarget implementation, so its
custom TTI implementation is migrated to use the new facilities in BasicTTIImpl
to invoke its custom subtarget implementation. The custom TTI implementations
continue to exist for the other targets with this change. They are not moved
over to subtarget-based implementations.

The end goal is to have the default subtarget implementation defer to the system
model defined by the target. With this change, the default MCSubtargetInfo
implementation essentially returns the defaults TargetTransformInfoImplBase used
to return. Existing users of TTI defaults will hit the defaults now in
MCSubtargetInfo. Targets that define their own custom TTI implementations won't
use the BasicTTIImpl implementations that route to the subtarget.

Once system models are in place for the targets that use these interfaces, their
custom TTI implementations can be removed.

Differential Revision: https://reviews.llvm.org/D63614

llvm-svn: 374205



# 9912232b 08-Oct-2019 Jinsong Ji <jji@us.ibm.com>

Revert "[LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize"

Also Revert "[LoopVectorize] Fix non-debug builds after rL374017"

This reverts commit 9f41deccc0e648a006c9f38e11919f181b6c7e0a.
This reverts commit 18b6fe07bcf44294f200bd2b526cb737ed275c04.

The patch is breaking a PowerPC internal build. Checked with the author;
reverting on his behalf for now because of the timezone difference.

llvm-svn: 374091



# 9f41decc 08-Oct-2019 Zi Xuan Wu <wuzish@cn.ibm.com>

[LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize

In loop-vectorize, the interleave count and vectorization factor depend on the number of target registers. Currently,
register pressure is not estimated separately for each register class (in particular, scalar float types
should not be counted together with scalar int types), so the estimate is inaccurate. Specifically,
it causes too much interleaving/unrolling, which results in too many register spills in the loop body and hurts performance.

So we need to classify the register classes at the IR level. Importantly, these are abstract register classes,
not the target register classes the backend provides in its td files. They are used to establish the mapping between
the types of IR values and the number of simultaneous live ranges to which we'd like to limit some set of those types.

For example, on the POWER target the register counts are special when VSX is enabled. With VSX, there are 32 int scalar registers (GPR)
and 64 float scalar registers (VSR), while int and float vector registers both number 64 (VSR). So there should be 2 kinds of register class
when VSX is enabled, and 3 kinds of register class when VSX is NOT enabled.

On the POWER target, this gives a big (+~30%) performance improvement in one specific benchmark (503.bwaves_r) of SPEC2017 and no other obvious regressions.

Differential revision: https://reviews.llvm.org/D67148

llvm-svn: 374017



Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3
# 18db4e9a 26-Aug-2019 Roland Froese <froese@ca.ibm.com>

Recommit [PowerPC] Update P9 vector costs for insert/extract

Now that the v1i128 smin regression has been fixed, recommit the P9 cost
updates from D60160.

llvm-svn: 369952


Revision tags: llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init
# d300a493 10-Jul-2019 David Greene <greened@obbligato.org>

Revert "[System Model] [TTI] Update cache and prefetch TTI interfaces"

This broke some PPC prefetching tests.

This reverts commit 9fdfb045ae8bb643ab0d0455dcf9ecaea3b1eb3c.

llvm-svn: 365680


# 9fdfb045 10-Jul-2019 David Greene <greened@obbligato.org>

[System Model] [TTI] Update cache and prefetch TTI interfaces

Rework the TTI cache and software prefetching APIs to prepare for the
introduction of a general system model. Changes include:

- Marking existing interfaces const and/or override as appropriate
- Adding comments
- Adding BasicTTIImpl interfaces that delegate to a subtarget
  implementation
- Adding a default "no information" subtarget implementation

Only a handful of targets use these interfaces currently: AArch64,
Hexagon, PPC and SystemZ. AArch64 already has a custom subtarget
implementation, so its custom TTI implementation is migrated to use
the new facilities in BasicTTIImpl to invoke its custom subtarget
implementation. The custom TTI implementations continue to exist for
the other targets with this change. They are not moved over to
subtarget-based implementations.

The end goal is to have the default subtarget implementation defer to
the system model defined by the target. With this change, the default
subtarget implementation essentially returns "no information" for
these interfaces. None of the existing users of TTI will hit that
implementation because they define their own custom TTI
implementations and won't use the BasicTTIImpl implementations.

Once system models are in place for the targets that use these
interfaces, their custom TTI implementations can be removed.

Differential Revision: https://reviews.llvm.org/D63614

llvm-svn: 365676



Revision tags: llvmorg-8.0.1, llvmorg-8.0.1-rc4
# dfdccbb2 03-Jul-2019 Chen Zheng <czhengsz@cn.ibm.com>

[PowerPC] exclude ICmpZero in LSR if icmp can be replaced in later hardware loop.

Differential Revision: https://reviews.llvm.org/D63477

llvm-svn: 364993


# 351b7e7b 01-Jul-2019 Jordan Rupprecht <rupprecht@google.com>

Revert Recommit [PowerPC] Update P9 vector costs for insert/extract element

This reverts r364557 (git commit 9f7f5858fe46b8e706e87a83e2fd0a2678be619e)

This crashes as reported on the commit thread. Repro instructions TBD.

llvm-svn: 364876



# 9f7f5858 27-Jun-2019 Roland Froese <froese@ca.ibm.com>

Recommit [PowerPC] Update P9 vector costs for insert/extract element

Recommit patch D60160 after regression fix patch D63463.

llvm-svn: 364557


Revision tags: llvmorg-8.0.1-rc3
# 3bc5ad55 25-Jun-2019 Clement Courbet <courbet@google.com>

[ExpandMemCmp] Move all options to TargetTransformInfo.

Split off from D60318.

llvm-svn: 364281


# c5b918de 19-Jun-2019 Chen Zheng <czhengsz@cn.ibm.com>

[NFC] move some hardware loop checking code to a common place for others to use.
Differential Revision: https://reviews.llvm.org/D63478

llvm-svn: 363758


Revision tags: llvmorg-8.0.1-rc2
# c5ef502e 07-Jun-2019 Sam Parker <sam.parker@arm.com>

[CodeGen] Generic Hardware Loop Support

Patch which introduces a target-independent framework for generating
hardware loops at the IR level. Most of the code has been taken from
PowerPC CTRLoops and PowerPC has been ported over to use this generic
pass. The target dependent parts have been moved into
TargetTransformInfo, via isHardwareLoopProfitable, with
HardwareLoopInfo introduced to transfer information from the backend.

Three generic intrinsics have been introduced:
- void @llvm.set_loop_iterations
  Takes, as a single operand, the number of iterations to be executed.
- i1 @llvm.loop_decrement(anyint)
  Takes the maximum number of elements processed in an iteration of
  the loop body and subtracts this from the total count. Returns
  false when the loop should exit.
- anyint @llvm.loop_decrement_reg(anyint, anyint)
  Takes the number of elements remaining to be processed as well as
  the maximum number of elements processed in an iteration of the loop
  body. Returns the updated number of elements remaining.

llvm-svn: 362774
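
The behaviour described for the three intrinsics can be modelled in plain C++ to make the control flow concrete. This is only a sketch of the semantics as worded in the commit message, not LLVM code: `HardwareLoopCounter`, `loopDecrementReg` and `countBodyExecutions` are hypothetical names, and clamping the counter at zero is a modelling choice made here.

```cpp
#include <cassert>
#include <cstdint>

// Models the hidden loop counter used by set_loop_iterations/loop_decrement.
struct HardwareLoopCounter {
  std::uint64_t Remaining = 0;

  // Models set_loop_iterations: record how many iterations should run.
  void setLoopIterations(std::uint64_t N) { Remaining = N; }

  // Models loop_decrement: subtract Step from the count and return false
  // once the loop should exit. Clamping at zero is an assumption.
  bool loopDecrement(std::uint64_t Step) {
    Remaining = Step >= Remaining ? 0 : Remaining - Step;
    return Remaining != 0;
  }
};

// Models loop_decrement_reg: the count is an explicit value, and the
// updated count is returned instead of a continue/exit flag.
inline std::uint64_t loopDecrementReg(std::uint64_t Remaining,
                                      std::uint64_t Step) {
  return Step >= Remaining ? 0 : Remaining - Step;
}

// Runs an empty loop body in the shape a hardware loop takes: the body
// executes, then the decrement decides whether to branch back.
// Assumes TripCount is at least 1.
inline std::uint64_t countBodyExecutions(std::uint64_t TripCount) {
  HardwareLoopCounter Counter;
  Counter.setLoopIterations(TripCount);
  std::uint64_t Executed = 0;
  do {
    ++Executed; // loop body would go here
  } while (Counter.loopDecrement(1));
  return Executed;
}
```

The do-while shape reflects that the decrement-and-test sits at the loop latch, so the body runs once before the first test; a zero trip count would need a guard before entering the loop.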



Revision tags: llvmorg-8.0.1-rc1
# fccb505f 01-May-2019 David L. Jones <dlj@google.com>

Revert "[llvm] r359313 - [PowerPC] Update P9 vector costs for insert/extract element"

This causes segfaults during optimized builds. More details, including a reproducer, are on the llvm-commits thread for r359313.

llvm-svn: 359648



# 2755b73b 29-Apr-2019 Simon Pilgrim <llvm-dev@redking.me.uk>

Fix operator precedence warning. NFCI.

Reported in https://www.viva64.com/en/b/0629/

llvm-svn: 359469


# 4b17772b 26-Apr-2019 Roland Froese <froese@ca.ibm.com>

[PowerPC] Update P9 vector costs for insert/extract element

The PPC vector cost model values for insert/extract element reflect older
processors that lacked vector insert/extract and move-to/move-from VSR
instructions. Update getVectorInstrCost to give appropriate values for when
the newer instructions are present.

Differential Revision: https://reviews.llvm.org/D60160

llvm-svn: 359313



Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2
# 7f29195c 01-Feb-2019 Roland Froese <froese@ca.ibm.com>

test commit (add blank line) NFC

llvm-svn: 352897


# 7d007dde 26-Jan-2019 Nemanja Ivanovic <nemanja.i.ibm@gmail.com>

[PowerPC] Update Vector Costs for P9

For the power9 CPU, vector operations consume a pair of execution units rather
than one execution unit like a scalar operation. Update the target transform
cost functions to reflect the higher cost of vector operations when targeting
Power9.

Patch by RolandF.

Differential revision: https://reviews.llvm.org/D55461

llvm-svn: 352261
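
One way to read the change described above is as a scaling factor applied to vector operations when targeting Power9. The sketch below assumes a factor of 2, inferred from "a pair of execution units"; the actual values used by the patch are not given in the message, and `OpInfo` and `adjustedCost` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical description of an operation being costed.
struct OpInfo {
  bool IsVector; // true if the operation works on a vector type
  int BaseCost;  // cost the model would report for older CPUs
};

// On Power9 a vector operation occupies two execution units, so this
// sketch doubles its cost; scalar operations are left unchanged.
inline int adjustedCost(const OpInfo &Op, bool IsPower9) {
  constexpr int P9VectorUnitFactor = 2; // assumed factor, see lead-in
  if (IsPower9 && Op.IsVector)
    return Op.BaseCost * P9VectorUnitFactor;
  return Op.BaseCost;
}
```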



Revision tags: llvmorg-8.0.0-rc1
# 2946cd70 19-Jan-2019 Chandler Carruth <chandlerc@gmail.com>

Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636



Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1
# 34da6dd6 31-Oct-2018 Dorit Nuzman <dorit.nuzman@intel.com>

[LV] Support vectorization of interleave-groups that require an epilog under
optsize using masked wide loads

Under Opt for Size, the vectorizer does not vectorize interleave-groups that
have gaps at the end of the group (such as a loop that reads only the even
elements: a[2*i]) because that implies that we'll require a scalar epilogue
(which is not allowed under Opt for Size). This patch extends the support for
masked-interleave-groups (introduced by D53011 for conditional accesses) to
also cover the case of gaps in a group of loads; targets that enable the
masked-interleave-group feature don't have to invalidate interleave-groups of
loads with gaps; they could now use masked wide-loads and shuffles (if that's
what the cost model selects).

Reviewers: Ayal, hsaito, dcaballe, fhahn

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D53668

llvm-svn: 345705
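
The idea of covering a gap with a masked wide load can be shown with a scalar emulation. The sketch below handles a loop that reads only a[2*i]: it loads a window of 8 elements with the odd lanes masked off, then shuffles the even lanes together. It is an illustration of the technique only; `VF`, `Stride`, `maskedWideLoad` and `loadEvenElements` are hypothetical names, and the vectorization factor of 4 is chosen arbitrarily.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4;            // assumed vectorization factor
constexpr std::size_t Stride = 2;        // the loop reads a[2*i]
constexpr std::size_t Wide = VF * Stride; // lanes in the wide load

// Emulates a masked wide load: lanes whose mask bit is false are never
// read, so a gap lane past the end of the array is not touched.
inline std::array<int, Wide> maskedWideLoad(const std::vector<int> &A,
                                            std::size_t Base,
                                            const std::array<bool, Wide> &Mask) {
  std::array<int, Wide> Out{};
  for (std::size_t Lane = 0; Lane < Wide; ++Lane)
    if (Mask[Lane])
      Out[Lane] = A.at(Base + Lane); // at() throws if an enabled lane is out of bounds
  return Out;
}

// Loads a[2*I], a[2*(I+1)], ..., a[2*(I+VF-1)] using one masked wide load
// followed by a shuffle that packs the even lanes together.
inline std::array<int, VF> loadEvenElements(const std::vector<int> &A,
                                            std::size_t I) {
  std::array<bool, Wide> Mask{};
  for (std::size_t Lane = 0; Lane < Wide; Lane += Stride)
    Mask[Lane] = true; // enable group members, leave the gaps disabled
  const std::array<int, Wide> WideVec = maskedWideLoad(A, Stride * I, Mask);
  std::array<int, VF> Result{};
  for (std::size_t K = 0; K < VF; ++K)
    Result[K] = WideVec[K * Stride];
  return Result;
}
```

For a 7-element array, an unmasked 8-lane load starting at index 0 would read one element past the end; with the mask, only indices 0, 2, 4 and 6 are accessed, which is why no scalar epilogue is needed to stay in bounds.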


