#
69192e01 |
| 08-Jul-2024 |
Manish Kausik H <46352931+Nirhar@users.noreply.github.com> |
[LegalizeDAG] Optimize CodeGen for `ISD::CTLZ_ZERO_UNDEF` (#83039)
Previously we had the same instructions being generated for `ISD::CTLZ` and `ISD::CTLZ_ZERO_UNDEF` which did not take advantage of
[LegalizeDAG] Optimize CodeGen for `ISD::CTLZ_ZERO_UNDEF` (#83039)
Previously we had the same instructions being generated for `ISD::CTLZ` and `ISD::CTLZ_ZERO_UNDEF` which did not take advantage of the fact that zero is an invalid input for `ISD::CTLZ_ZERO_UNDEF`. This commit separates codegen for the two cases to allow for the optimization for the latter case.
The details of the optimization are outlined in #82075
Fixes #82075
Co-authored-by: Manish Kausik H <hmamishkausik@gmail.com>
show more ...
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4 |
|
#
2347020e |
| 15-Apr-2024 |
Dávid Ferenc Szabó <30732159+dfszabo@users.noreply.github.com> |
[GlobalISel] Fix fewerElementsVectorPhi to insert after G_PHIs (#87927)
Currently the inserted mergelike instructions will be inserted at the
location of the G_PHI. Seems like the behaviour was cor
[GlobalISel] Fix fewerElementsVectorPhi to insert after G_PHIs (#87927)
Currently the inserted mergelike instructions will be inserted at the
location of the G_PHI. Seems like the behaviour was correct before, but
the rework done in https://reviews.llvm.org/D114198 forgot to include
the part which makes sure the instructions will be inserted after all
the G_PHIs.
show more ...
|
Revision tags: llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1 |
|
#
96049fcf |
| 07-Mar-2024 |
Michael Maitland <michaeltmaitland@gmail.com> |
[GISEL] Add IRTranslation for shufflevector on scalable vector types (#80378)
Recommits llvm/llvm-project#80378 which was reverted in llvm/llvm-project#84330. The problem was that the change in llvm
[GISEL] Add IRTranslation for shufflevector on scalable vector types (#80378)
Recommits llvm/llvm-project#80378 which was reverted in llvm/llvm-project#84330. The problem was that the change in llvm/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir used 217 as an opcode instead of a regex.
show more ...
|
#
552da248 |
| 07-Mar-2024 |
Michael Maitland <michaeltmaitland@gmail.com> |
Revert "[GISEL] Add IRTranslation for shufflevector on scalable vector types" (#84330)
Reverts llvm/llvm-project#80378
causing Buildbot failures that did not show up with check-llvm or CI.
|
#
2b8aaef0 |
| 07-Mar-2024 |
Michael Maitland <michaeltmaitland@gmail.com> |
[GISEL] Add IRTranslation for shufflevector on scalable vector types (#80378)
This patch is stacked on
https://github.com/llvm/llvm-project/pull/80372,
https://github.com/llvm/llvm-project/pull/80
[GISEL] Add IRTranslation for shufflevector on scalable vector types (#80378)
This patch is stacked on
https://github.com/llvm/llvm-project/pull/80372,
https://github.com/llvm/llvm-project/pull/80307, and
https://github.com/llvm/llvm-project/pull/80306.
ShuffleVector on scalable vector types gets IRTranslate'd to
G_SPLAT_VECTOR since a ShuffleVector that has operates on scalable
vectors is a splat vector where the value of the splat vector is the 0th
element of the first operand, because the index mask operand is the
zeroinitializer (undef and poison are treated as zeroinitializer here).
This is analogous to what happens in SelectionDAG for ShuffleVector.
`buildSplatVector` is renamed to`buildBuildVectorSplatVector`. I did not
make this a separate patch because it would cause problems to revert
that change without reverting this change too.
show more ...
|
Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1 |
|
#
f2d0bba8 |
| 26-Jan-2024 |
Kai Nacke <kai.peter.nacke@ibm.com> |
[GISel] Lower scalar G_SELECT in LegalizerHelper (#79342)
The LegalizerHelper only has support to lower G_SELECT with
vector operands. The approach is the same for scalar arguments,
which this PR
[GISel] Lower scalar G_SELECT in LegalizerHelper (#79342)
The LegalizerHelper only has support to lower G_SELECT with
vector operands. The approach is the same for scalar arguments,
which this PR adds.
show more ...
|
Revision tags: llvmorg-19-init |
|
#
d659bd16 |
| 03-Jan-2024 |
David Green <david.green@arm.com> |
[GlobalISel][AArch64] Tail call libcalls. (#74929)
This tries to allow libcalls to be tail called, using a similar method
to DAG where the type is checked to make sure they match, and if so the
ba
[GlobalISel][AArch64] Tail call libcalls. (#74929)
This tries to allow libcalls to be tail called, using a similar method
to DAG where the type is checked to make sure they match, and if so the
backend, through lowerCall checks that the tailcall is valid for all
arguments.
show more ...
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1 |
|
#
7fc87159 |
| 25-Jan-2023 |
Paul Robinson <paul.robinson@sony.com> |
[unittests] Use GTEST_SKIP() instead of return when appropriate
Basically NFC: A TEST/TEST_F/etc that bails out early (usually because setup failed or some other runtime condition wasn't met) genera
[unittests] Use GTEST_SKIP() instead of return when appropriate
Basically NFC: A TEST/TEST_F/etc that bails out early (usually because setup failed or some other runtime condition wasn't met) generally should use GTEST_SKIP() to report its status correctly, unless it takes steps to report another status (e.g., FAIL()).
I did see a handful of tests show up as SKIPPED after this change, which is not unexpected. The status seemed appropriate in all the new cases.
show more ...
|
Revision tags: llvmorg-17-init, llvmorg-15.0.7 |
|
#
f95a5fbe |
| 09-Jan-2023 |
Diana Picus <Diana-Magda.Picus@amd.com> |
MachineIRBuilder: Rename buildMerge. NFC
`buildMerge` may build a G_MERGE_VALUES, G_BUILD_VECTOR or G_CONCAT_VECTORS. Rename it to `buildMergeLikeInstr`.
This is a follow-up suggested in https://re
MachineIRBuilder: Rename buildMerge. NFC
`buildMerge` may build a G_MERGE_VALUES, G_BUILD_VECTOR or G_CONCAT_VECTORS. Rename it to `buildMergeLikeInstr`.
This is a follow-up suggested in https://reviews.llvm.org/D140964
Differential Revision: https://reviews.llvm.org/D141372
show more ...
|
#
cb38be9e |
| 07-Dec-2022 |
Gregory Alfonso <gfunni234@gmail.com> |
[NFC] Use Register instead of unsigned for variables that receive a Register object
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D139451
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2 |
|
#
b3837537 |
| 02-Aug-2022 |
Kai Nacke <kai.nacke@de.ibm.com> |
[GIsel] Add missing libcall for G_MUL to LegalizerHelper
The LegalizerHelper misses the code to lower G_MUL to a library call, which this change adds.
Reviewed By: arsenm
Differential Revision: ht
[GIsel] Add missing libcall for G_MUL to LegalizerHelper
The LegalizerHelper misses the code to lower G_MUL to a library call, which this change adds.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D130987
show more ...
|
Revision tags: llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2 |
|
#
3754f601 |
| 12-Apr-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
GlobalISel: Implement MoreElements for select of vector conditions
|
#
95c2bcbf |
| 12-Apr-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
GlobalISel: Handle widening umulo/smulo condition outputs
|
Revision tags: llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
#
29f88b93 |
| 23-Dec-2021 |
Petar Avramovic <Petar.Avramovic@amd.com> |
[GlobalISel] Rework more/fewer elements for vectors
Artifact combiner is not able to access individual elements after using LCMTy style merge/unmerge, extract and insert to change vector number of e
[GlobalISel] Rework more/fewer elements for vectors
Artifact combiner is not able to access individual elements after using LCMTy style merge/unmerge, extract and insert to change vector number of elements (pad with undef or split to sub-vector instructions). Use unmerge to individual elements instead and then merge elements into requested types. Change argument lowering for vectors and moreElementsVector to use buildPadVectorWithUndefElements and buildDeleteTrailingVectorElements. FewerElementsVector had a few helpers that had different behavior, introduce new helper for most of the opcodes. FewerElementsVector helper is more flexible since it can create leftover instruction smaller then requested type (useful in case target wants to avoid pad with undef and use fewer registers). If target does not want leftover of different type it should call more elements first. Some helpers were performing more elements first to have split without leftover. Opcodes that used this helper use clampMaxNumElementsStrict (does more elements first) in LegalizerInfo to avoid test changes. Fixes failures caused by failing to combine artifacts created during more/fewer elements vector.
Differential Revision: https://reviews.llvm.org/D114198
show more ...
|
Revision tags: llvmorg-13.0.1-rc1 |
|
#
8bde5e58 |
| 27-Sep-2021 |
Amara Emerson <amara@apple.com> |
Delay outgoing register assignments to last.
The delayed stack protector feature which is currently used for SDAG (and thus allows for more commonly generating tail calls) depends on being able to e
Delay outgoing register assignments to last.
The delayed stack protector feature which is currently used for SDAG (and thus allows for more commonly generating tail calls) depends on being able to extract the tail call into a separate return block. To do this it also has to extract the vreg->physreg copies that set up the call's arguments, since if it doesn't then the call inst ends up using undefined physregs in it's new spliced block.
SelectionDAG implementations can do this because they delay emitting register copies until *after* the stack arguments are set up. GISel however just processes and emits the arguments in IR order, so stack arguments always end up last, and thus this breaks the code that looks for any register arg copies that precede the call instruction.
This patch adds a thunk argument to the assignValueToReg() and custom assignment hooks. For outgoing arguments, register assignments use this return param to return a thunk that does the actual generating of the copies. We collect these until all the outgoing stack assignments have been done and then execute them, so that the copies (and perhaps some artifacts like G_SEXTs) are placed after any stores.
Differential Revision: https://reviews.llvm.org/D110610
show more ...
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1 |
|
#
64bef13f |
| 08-Oct-2020 |
Konstantin Schwarz <konstantin.schwarz@hightec-rt.com> |
[GlobalISel] Look through truncs and extends in narrowScalarShift
If a G_SHL is fed by a G_CONSTANT, the lower and upper bits of the source can be shifted individually by the constant shift amount.
[GlobalISel] Look through truncs and extends in narrowScalarShift
If a G_SHL is fed by a G_CONSTANT, the lower and upper bits of the source can be shifted individually by the constant shift amount.
However in case the shift amount came from a G_TRUNC(G_CONSTANT), the generic shift legalization code was used, producing intermediate shifts that are potentially illegal on some targets.
This change teaches narrowScalarShift to look through G_TRUNCs and G_*EXTs.
Reviewed By: paquette
Differential Revision: https://reviews.llvm.org/D89100
show more ...
|
#
57b9107e |
| 06-Aug-2021 |
Jay Foad <jay.foad@amd.com> |
[GlobalISel] Improve widening of cttz/cttz_zero_undef
Differential Revision: https://reviews.llvm.org/D107631
|
#
cd2594e1 |
| 04-Aug-2021 |
Jay Foad <jay.foad@amd.com> |
[GlobalISel] Improve legalization of narrow CTTZ
Differential Revision: https://reviews.llvm.org/D107457
|
#
97c42639 |
| 10-Jul-2021 |
Amara Emerson <amara@apple.com> |
[AArch64][GlobalISel] Implement moreElements legalization for G_SHUFFLE_VECTOR.
Differential Revision: https://reviews.llvm.org/D103301
|
#
9b057f64 |
| 08-Jul-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
GlobalISel: Track original argument index in ArgInfo
SelectionDAG's equivalents in ISD::InputArg/OutputArg track the original argument index. Mips relies on this, and its currently reinventing its o
GlobalISel: Track original argument index in ArgInfo
SelectionDAG's equivalents in ISD::InputArg/OutputArg track the original argument index. Mips relies on this, and its currently reinventing its own parallel CallLowering infrastructure which tracks these indexes on the side. Add this to help move towards deleting the custom mips handling.
show more ...
|
#
fae05692 |
| 20-May-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
CodeGen: Print/parse LLTs in MachineMemOperands
This will currently accept the old number of bytes syntax, and convert it to a scalar. This should be removed in the near future (I think I converted
CodeGen: Print/parse LLTs in MachineMemOperands
This will currently accept the old number of bytes syntax, and convert it to a scalar. This should be removed in the near future (I think I converted all of the tests already, but likely missed a few).
Not sure what the exact syntax and policy should be. We can continue printing the number of bytes for non-generic instructions to avoid test churn and only allow non-scalar types for generic instructions.
This will currently print the LLT in parentheses, but accept parsing the existing integers and implicitly converting to scalar. The parentheses are a bit ugly, but the parser logic seems unable to deal without either parentheses or some keyword to indicate the start of a type.
show more ...
|
#
d5e14ba8 |
| 24-Jun-2021 |
Sander de Smalen <sander.desmalen@arm.com> |
[GlobalISel] NFC: Change LLT::vector to take ElementCount.
This also adds new interfaces for the fixed- and scalable case: * LLT::fixed_vector * LLT::scalable_vector
The strategy for migrating to t
[GlobalISel] NFC: Change LLT::vector to take ElementCount.
This also adds new interfaces for the fixed- and scalable case: * LLT::fixed_vector * LLT::scalable_vector
The strategy for migrating to the new interfaces was as follows: * If the new LLT is a (modified) clone of another LLT, taking the same number of elements, then use LLT::vector(OtherTy.getElementCount()) or if the number of elements is halfed/doubled, it uses .divideCoefficientBy(2) or operator*. That is because there is no reason to specifically restrict the types to 'fixed_vector'. * If the algorithm works on the number of elements (as unsigned), then just use fixed_vector. This will need to be fixed up in the future when modifying the algorithm to also work for scalable vectors, and will need then need additional tests to confirm the behaviour works the same for scalable vectors. * If the test used the '/*Scalable=*/true` flag of LLT::vector, then this is replaced by LLT::scalable_vector.
Reviewed By: aemerson
Differential Revision: https://reviews.llvm.org/D104451
show more ...
|
#
31a9659d |
| 07-Jun-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
GlobalISel: Avoid use of G_INSERT in insertParts
G_INSERT legalization is incomplete and doesn't work very well. Instead try to use sequences of G_MERGE_VALUES/G_UNMERGE_VALUES padding with undef va
GlobalISel: Avoid use of G_INSERT in insertParts
G_INSERT legalization is incomplete and doesn't work very well. Instead try to use sequences of G_MERGE_VALUES/G_UNMERGE_VALUES padding with undef values (although this can get pretty large).
For the case of load/store narrowing, this is still performing the load/stores in irregularly sized pieces. It might be cleaner to split this down into equal sized pieces, and rely on load/store merging to optimize it.
show more ...
|
#
aaac2682 |
| 29-May-2021 |
Daniel Sanders <daniel_l_sanders@apple.com> |
[globalisel][legalizer] Separate the deprecated LegalizerInfo from the current one
It's still in use in a few places so we can't delete it yet but there's not many at this point.
Differential Revis
[globalisel][legalizer] Separate the deprecated LegalizerInfo from the current one
It's still in use in a few places so we can't delete it yet but there's not many at this point.
Differential Revision: https://reviews.llvm.org/D103352
show more ...
|
#
08d31ff4 |
| 27-May-2021 |
Jessica Paquette <jpaquette@apple.com> |
Fix unit test after 324af79dbc6066
Needed to add in an extra parameter to calls to `libcall`.
|