Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
f71cb9db |
| 14-Nov-2024 |
Kazu Hirata <kazu@google.com> |
[PowerPC] Remove unused includes (NFC) (#116163)
Identified with misc-include-cleaner.
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2 |
|
#
06c8210a |
| 03-Oct-2024 |
RolandF77 <55763885+RolandF77@users.noreply.github.com> |
update P7 32-bit partial vector load cost (#108261)
Update cost model to reflect codegen change to use lfiwzx
for 32-bit partial vector loads on pwr7 with
https://github.com/llvm/llvm-project/pull/104507.
|
Revision tags: llvmorg-19.1.1 |
|
#
d2885743 |
| 25-Sep-2024 |
Philip Reames <preames@rivosinc.com> |
[TTI][RISCV] Model cost of loading constants arms of selects and compares (#109824)
This follows in the spirit of 7d82c99403f615f6236334e698720bf979959704,
and extends the costing API for compares and selects to provide
information about the operands passed in an analogous manner. This
allows us to model the cost of materializing the vector constant, as
some select-of-constants are significantly more expensive than others
when you account for the cost of materializing the constants involved.
This is a stepping stone towards fixing
https://github.com/llvm/llvm-project/issues/109466. A separate SLP patch
will be required to utilize the new API.
|
Revision tags: llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1 |
|
#
1df4d866 |
| 23-Jul-2024 |
azhan92 <alisonxzhang@gmail.com> |
[PowerPC] Add support for -mcpu=pwr11 / -mtune=pwr11 (#99511)
This PR adds support for -mcpu=pwr11/power11 and -mtune=pwr11/power11 in
clang and llvm.
|
Revision tags: llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4 |
|
#
4ac2721e |
| 09-Apr-2024 |
David Green <david.green@arm.com> |
[AArch64] Add costs for ST3 and ST4 instructions, modelled as store(shuffle). (#87934)
This tries to add some costs for the shuffle in a ST3/ST4 instruction,
which are represented in LLVM IR as store(interleaving shuffle). In
order to detect the store, it needs to add a CxtI context instruction to
check the users of the shuffle. LD3 and LD4 are added, LD2 should be a
zip1 shuffle, which will be added in another patch.
It should help fix some of the regressions from #87510.
|
Revision tags: llvmorg-18.1.3 |
|
#
308ed023 |
| 26-Mar-2024 |
Il-Capitano <52455591+Il-Capitano@users.noreply.github.com> |
[Intrinsics] Make `patchpoint.i64` generic on its return type (#85911)
Currently patchpoints can only have two result types, `void` and `i64`.
This limits the result to general purpose registers.
This patch makes `patchpoint.i64` an overloadable intrinsic, allowing
result values that can fit in a single register (e.g. integers,
pointers, floats).
|
Revision tags: llvmorg-18.1.2, llvmorg-18.1.1 |
|
#
8d1046ae |
| 04-Mar-2024 |
Chen Zheng <czhengsz@cn.ibm.com> |
[PowerPC] adjust cost for extract i64 from vector on P9 and above (#82963)
https://godbolt.org/z/Ma347Tx1W
|
Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3 |
|
#
80f3bb4c |
| 19-Feb-2024 |
Chen Zheng <czhengsz@cn.ibm.com> |
[PowerPC] adjust cost for vector insert/extract with non const index (#79092)
P9 has vxform `Vector Extract Element Instructions` like `vextuwrx` and
P10 has vxform `Vector Insert Element instructions` like `vinsd`. Update
the instruction cost reflecting these instructions.
Fixes https://github.com/llvm/llvm-project/issues/50249
|
Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init |
|
#
4beea6b1 |
| 23-Jan-2024 |
RolandF77 <55763885+RolandF77@users.noreply.github.com> |
[PowerPC] lower partial vector store cost (#78358)
There are matching store opcodes (stfd, stxsiwx) for the load opcodes
that make 32-bit and 64-bit vector operations cheap with VSX, so stores
should also be cheap.
|
#
286ef12b |
| 08-Dec-2023 |
Kazu Hirata <kazu@google.com> |
[Target] Remove unnecessary includes (NFC)
|
Revision tags: llvmorg-17.0.6 |
|
#
81b7f115 |
| 22-Nov-2023 |
Sander de Smalen <sander.desmalen@arm.com> |
[llvm][TypeSize] Fix addition/subtraction in TypeSize. (#72979)
It seems TypeSize is currently broken in the sense that:
TypeSize::Fixed(4) + TypeSize::Scalable(4) => TypeSize::Fixed(8)
without failing its assert that explicitly tests for this case:
assert(LHS.Scalable == RHS.Scalable && ...);
The reason this fails is that `Scalable` is a static method of class
TypeSize, and LHS and RHS are both objects of class TypeSize. So this is
evaluating if the pointer to the function Scalable == the pointer to the
function Scalable, which is always true because LHS and RHS have the
same class.
This patch fixes the issue by renaming `TypeSize::Scalable` ->
`TypeSize::getScalable`, as well as `TypeSize::Fixed` to
`TypeSize::getFixed`, so that it no longer clashes with the variable in
FixedOrScalableQuantity.
The new methods now also better match the coding standard, which
specifies that:
* Variable names should be nouns (as they represent state)
* Function names should be verb phrases (as they represent actions)
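The name clash described in this commit message can be reproduced in a few lines. The following is only a minimal sketch, not the LLVM source: `QuantityBase`, `OldTypeSize`, `sameKindBuggy`, `sameKindChecked`, and `demo` are hypothetical names chosen for illustration, standing in for FixedOrScalableQuantity and the pre-fix TypeSize. It shows how a static factory that shares its name with an inherited data member turns the comparison into a function-address self-comparison.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for FixedOrScalableQuantity: it owns the data
// member named Scalable.
class QuantityBase {
protected:
  uint64_t MinValue;
  bool Scalable;
  QuantityBase(uint64_t MinValue, bool Scalable)
      : MinValue(MinValue), Scalable(Scalable) {}

public:
  bool isScalable() const { return Scalable; }
  uint64_t getKnownMinValue() const { return MinValue; }
};

// Hypothetical stand-in for the pre-fix TypeSize.
class OldTypeSize : public QuantityBase {
  OldTypeSize(uint64_t MinValue, bool IsScalable)
      : QuantityBase(MinValue, IsScalable) {}

public:
  static OldTypeSize Fixed(uint64_t N) { return OldTypeSize(N, false); }
  // This static factory hides the inherited data member of the same name.
  static OldTypeSize Scalable(uint64_t N) { return OldTypeSize(N, true); }
};

// Same shape as the broken assert condition: both operands name the static
// member function, so this compares one function address with itself.
inline bool sameKindBuggy(const OldTypeSize &LHS, const OldTypeSize &RHS) {
  return LHS.Scalable == RHS.Scalable;
}

// Reads the actual flag through an accessor instead.
inline bool sameKindChecked(const OldTypeSize &LHS, const OldTypeSize &RHS) {
  return LHS.isScalable() == RHS.isScalable();
}

// Exercises both predicates on a fixed/scalable pair.
inline void demo() {
  OldTypeSize F = OldTypeSize::Fixed(4);
  OldTypeSize S = OldTypeSize::Scalable(4);
  assert(sameKindBuggy(F, S));    // wrongly reports "same kind"
  assert(!sameKindChecked(F, S)); // correctly reports a mismatch
}
```

Renaming the factories, as the patch does, removes the hiding, so an expression such as `LHS.Scalable` resolves to the data member again.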
|
Revision tags: llvmorg-17.0.5 |
|
#
e69e066b |
| 01-Nov-2023 |
Youngsuk Kim <youngsuk.kim@hpe.com> |
[llvm][PowerPC] Remove no-op ptr-to-ptr bitcasts (NFC)
Opaque ptr cleanup effort.
|
Revision tags: llvmorg-17.0.4 |
|
#
8e247b8f |
| 27-Oct-2023 |
Fangrui Song <i@maskray.me> |
Replace TypeSize::{getFixed,getScalable} with canonical TypeSize::{Fixed,Scalable}. NFC
|
Revision tags: llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3 |
|
#
4d425f86 |
| 14-Aug-2023 |
Roland Froese <froese@ca.ibm.com> |
[PowerPC] vector cost model add cost to extract i1
Try to avoid some unprofitable predication on PPC. Recognize in the cost model that computing on i1 values will require an extra mask or compare operation.
Differential Revision: https://reviews.llvm.org/D155876
|
Revision tags: llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4 |
|
#
65f68812 |
| 01-Mar-2023 |
Ting Wang <Ting.Wang.SH@ibm.com> |
[PowerPC] update PPCTTIImpl::supportsTailCallFor() check conditions
This patch reuses `PPCTargetLowering::isEligibleForTCO()` to check `PPCTTIImpl::supportsTailCallFor()`.
Fixes #59315
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D140369
|
Revision tags: llvmorg-16.0.0-rc3 |
|
#
b02b1e0e |
| 21-Feb-2023 |
Luke Lau <luke@igalia.com> |
[LV][NFC] Use ElementCount for getMaxInterleaveFactor
In order to allow targets to disable interleaving for scalable vectors, pass the entire VF's ElementCount to getMaxInterleaveFactor. This is based off of the approach used here: https://repo.hca.bsc.es/gitlab/rferrer/llvm-epi/-/commit/8d36708507b3c378078b9fe364bc548354aaec86
The plan would then be to disable interleaving on scalable VFs on RISC-V in a follow up patch. See https://reviews.llvm.org/D143723#4132349
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D144474
|
Revision tags: llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init |
|
#
5fb3a57e |
| 21-Jan-2023 |
ShihPo Hung <shihpo.hung@sifive.com> |
[Cost] Add CostKind to getVectorInstrCost and its related users
LoopUnroll estimates the loop size via getInstructionCost(), but getInstructionCost() cannot pass CostKind to getVectorInstrCost(). Likewise, getShuffleCost() cannot pass CostKind to getBroadcastShuffleOverhead(), getPermuteShuffleOverhead(), getExtractSubvectorOverhead(), and getInsertSubvectorOverhead().
To address this, this patch adds an argument CostKind to these functions.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D142116
|
Revision tags: llvmorg-15.0.7 |
|
#
9b5f6268 |
| 21-Dec-2022 |
Alexey Bataev <a.bataev@outlook.com> |
[SLP]Fix cost of the broadcast buildvector/gather.
The cost of the initial insertelement needs to be included in the cost of the broadcasts. The cost of the gather/buildvector also needs to be adjusted if the element is inserted into a poison/undef vector.
Differential Revision: https://reviews.llvm.org/D140498
|
#
85edf1fc |
| 06-Jan-2023 |
Chen Zheng <czhengsz@cn.ibm.com> |
[PowerPC] remove the ctr clobbers check related to TLS access
The dynamic TLS access model is lowered in ISEL to an MI (ADDItlsgdLADDR) that clobbers CTR in the loop, and the post-isel CTR loop pass then reverts the loop to a normal compare + branch form.
So the hardware loop insertion pass no longer needs this clobber check.
Reviewed By: nemanjai
Differential revision: https://reviews.llvm.org/D140367
|
#
f74324a1 |
| 19-Dec-2022 |
Chen Zheng <czhengsz@cn.ibm.com> |
[PowerPC] don't generate hardware loop.
If the candidate loop already has hardware loop related intrinsics, don't generate hardware loop on PPC. PPC does not support nested hardware loops.
|
#
b5e1fc19 |
| 02-Dec-2022 |
Chen Zheng <czhengsz@cn.ibm.com> |
[PowerPC] don't check CTR clobber in hardware loop insertion pass
We added a new post-isel CTRLoop pass in D122125. That pass will expand the hardware loop related intrinsic to CTR loop or normal loop based on the loop context. So we don't need to conservatively check the CTR clobber now on the IR level.
Reviewed By: lkail
Differential Revision: https://reviews.llvm.org/D135847
|
#
86fe4dfd |
| 02-Dec-2022 |
Krzysztof Parzyszek <kparzysz@quicinc.com> |
TargetTransformInfo: convert Optional to std::optional
Recommit: added missing "#include <cstdint>".
|
#
4e12d183 |
| 02-Dec-2022 |
Krzysztof Parzyszek <kparzysz@quicinc.com> |
Revert "TargetTransformInfo: convert Optional to std::optional"
This reverts commit b83711248cb12639e7ef7303cfbb4452b4067e85.
Some buildbots are failing.
|
#
b8371124 |
| 02-Dec-2022 |
Krzysztof Parzyszek <kparzysz@quicinc.com> |
TargetTransformInfo: convert Optional to std::optional
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2 |
|
#
4ea121c9 |
| 04-Oct-2022 |
Nemanja Ivanovic <nemanja.i.ibm@gmail.com> |
[PowerPC] Fix a number of inefficiencies and issues with atomic code gen
There are a few issues with the code we generate for atomic operations and the way we generate it:
- Hard coded CR0 for compares
- Order of operands for compares not conducive to emitting compare-immediate or for CSE of compares
- Missing MachineMemOperand for st[bhwd]cx intrinsics
- Missing intrinsic properties for the same
- Unnecessary blocks with store conditional instructions to clear reservation (which ends up hindering performance)
- Move from CR instructions just to compare the result of a store conditional with zero (even though it is a record-form)
This patch aims to resolve all of those issues.
Differential revision: https://reviews.llvm.org/D134783
|