History log of /llvm-project/llvm/test/Analysis/CostModel/AArch64/cmp.ll (Results 1 – 11 of 11)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0
# 960c975a 16-Sep-2024 David Green <david.green@arm.com>

[AArch64] Expand scmp/ucmp vector operations with sub (#108830)

Unlike scalar, where AArch64 prefers expanding scmp/ucmp with select,
under Neon we can use the arithmetic expansion to generate fewe

[AArch64] Expand scmp/ucmp vector operations with sub (#108830)

Unlike scalar, where AArch64 prefers expanding scmp/ucmp with select,
under Neon we can use the arithmetic expansion to generate fewer
instructions. Notably it also prevents the scalarization of vselect
during vector-legalization.

show more ...


Revision tags: llvmorg-19.1.0-rc4
# 140e80a2 31-Aug-2024 Yingwei Zheng <dtcxzyw2333@gmail.com>

[TTI] Add cost model support for [u|s]cmp (#106824)

This patch adds cost model support for [u|s]cmp.


Revision tags: llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1
# 2a859b20 28-Jul-2023 David Green <david.green@arm.com>

[AArch64] Change the cost of vector insert/extract to 2

The cost of vector instructions has always been high under AArch64, in order to
add a high cost for inserts/extracts, shuffles and scalarizati

[AArch64] Change the cost of vector insert/extract to 2

The cost of vector instructions has always been high under AArch64, in order to
add a high cost for inserts/extracts, shuffles and scalarization. This is a
conservative approach to limit the scope of unusual SLP vectorization where the
codegen ends up being quite poor, but has always been higher than the correct
costs would be for any specific core.

This relaxes that, reducing the vector insert/extract cost from 3 to 2. It is a
generalization of D142359 to all AArch64 cpus. The ScalarizationOverhead is
also overridden for integer vector at the same time, to remove the effect of
lane 0 being considered free for integer vectors (something that should only be
true for float when scalarizing).

The lower insert/extract cost will reduce the cost of insert, extracts,
shuffling and scalarization. The adjustments of ScalaizationOverhead will
increase the cost on integer, especially for small vectors. The end result will
be lower cost for float and long-integer types, some higher cost for some
smaller vectors. This, along with the raw insert/extract cost being lower, will
generally mean more vectorization from the Loop and SLP vectorizer.

We may end up regretting this, as that vectorization is not always profitable.
In all the benchmarking I have done this is generally an improvement in the
overall performance, and I've attempted to address the places where it wasn't
with other costmodel adjustments.

Differential Revision: https://reviews.llvm.org/D155459

show more ...


Revision tags: llvmorg-18-init
# 5106b221 01-Jul-2023 David Green <david.green@arm.com>

[AArch64] Treat the icmp in icmp(and(..), 0) as free

As in https://godbolt.org/z/4dafd9Geq, the icmp from an And may use an Ands to
set flags, meaning the icmp is free.

This could also be done for

[AArch64] Treat the icmp in icmp(and(..), 0) as free

As in https://godbolt.org/z/4dafd9Geq, the icmp from an And may use an Ands to
set flags, meaning the icmp is free.

This could also be done for add/sub, but those patterns often happen in the
induction variable of a loop, making them quite performance sensitive.

Differential Revision: https://reviews.llvm.org/D153611

show more ...


# 46ef3337 29-Jun-2023 David Green <david.green@arm.com>

[AArch64] Add and cmp cost model tests. NFC

See D153611. Tests for the cost of icmp(and, 0) are added, in addition to
expanding the extractelements-to-shuffle.ll test, which has always been a bit
si

[AArch64] Add and cmp cost model tests. NFC

See D153611. Tests for the cost of icmp(and, 0) are added, in addition to
expanding the extractelements-to-shuffle.ll test, which has always been a bit
simple, to include a more complete example with both a vector and scalar
version. The icmp(and, 0) costs are targetting at improving the second when the
cost of vector inserts and extracts is lowered.

show more ...


Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0
# 75f1b328 29-Aug-2022 Mingming Liu <mingmingl@google.com>

[AArch64][CostModel][NFC] Specify target datalayout explicitly for cost analysis test.

- Use linux little endian data layout string.

Differential Revision: https://reviews.llvm.org/D132889


Revision tags: llvmorg-15.0.0-rc3
# 4178e334 10-Aug-2022 Simon Pilgrim <llvm-dev@redking.me.uk>

[CostModel] Update RUN -passes=* to double quotes to appease update scripts on windows

DOS really doesn't like `` quotes to be used in command lines

Some prep work as I'm intending to resurrect D79

[CostModel] Update RUN -passes=* to double quotes to appease update scripts on windows

DOS really doesn't like `` quotes to be used in command lines

Some prep work as I'm intending to resurrect D79483 soon

show more ...


Revision tags: llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2
# 15ba588d 09-Feb-2022 Arthur Eubanks <aeubanks@google.com>

[test] Migrate '-analyze -cost-model' to '-passes=print<cost-model>'


Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3
# 1ccc4992 30-Jun-2020 Florian Hahn <flo@fhahn.com>

[AArch64] Add getCFInstrCost, treat branches as free for throughput.

D79164/2596da31740f changed getCFInstrCost to return 1 per default.
AArch64 did not have its own implementation, hence the throug

[AArch64] Add getCFInstrCost, treat branches as free for throughput.

D79164/2596da31740f changed getCFInstrCost to return 1 per default.
AArch64 did not have its own implementation, hence the throughput cost
of CFI instructions is overestimated. On most cores, most branches should
be predicated and essentially free throughput wise.

This restores a 9% performance regression on a SPEC2006 benchmark on
AArch64 with -O3 LTO & PGO.

This patch effectively restores pre 2596da31740f behavior for AArch64
and undoes the AArch64 test changes of the patch.

Reviewers: samparker, dmgreen, anemet

Reviewed By: samparker

Differential Revision: https://reviews.llvm.org/D82755

show more ...


Revision tags: llvmorg-10.0.1-rc2
# 2596da31 15-Jun-2020 Sam Parker <sam.parker@arm.com>

[CostModel] getCFInstrCost in getUserCost.

Have BasicTTI call the base implementation so that both agree on the
default behaviour, which the default being a cost of '1'. This has
required an X86 spe

[CostModel] getCFInstrCost in getUserCost.

Have BasicTTI call the base implementation so that both agree on the
default behaviour, which the default being a cost of '1'. This has
required an X86 specific implementation as it seems to be very
reliant on those instructions being free. Changes are also made to
AMDGPU so that their implementations distinguish between cost kinds,
so that the unrolling isn't affected. PowerPC also has its own
implementation to prevent changes to the reg-usage vectorizer test.

The cost model test changes now reflect that ret instructions are not
generally free.

Differential Revision: https://reviews.llvm.org/D79164

show more ...


# 792575ff 26-May-2020 Sam Parker <sam.parker@arm.com>

[NFC][ARM][AArch64] More code size tests

Add analysis runs for icmp, fcmp and select instructions.