Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
e3e7c756 |
| 14-Nov-2024 |
Changpeng Fang <changpeng.fang@amd.com> |
AMDGPU: Update pattern matching from "x&(-1>>(32-y))" to "bfe x, 0, y" (#116115)
It is not correct to lower "x&(-1>>(32-y))" to "bfe x, 0, y". When y
equals 32, "-1" is not shifted, so x&(-1>>(32-3
AMDGPU: Update pattern matching from "x&(-1>>(32-y))" to "bfe x, 0, y" (#116115)
It is not correct to lower "x&(-1>>(32-y))" to "bfe x, 0, y". When y
equals 32, "-1" is not shifted, so x&(-1>>(32-32) is still x, but "bfe
x, 0, 32" is 0. However, if we know y is at most of 5 bits (< 32), we
can still do the pattern matching.
show more ...
|
#
9778fc76 |
| 13-Nov-2024 |
Changpeng Fang <changpeng.fang@amd.com> |
Revert "AMDGPU: Don't avoid clamp of bit shift in BFE pattern (#115372)" (#116091)
Based on the suggestion from
https://github.com/llvm/llvm-project/pull/115543, we should not do the
pattern match
Revert "AMDGPU: Don't avoid clamp of bit shift in BFE pattern (#115372)" (#116091)
Based on the suggestion from
https://github.com/llvm/llvm-project/pull/115543, we should not do the
pattern matching from x << (32-y) >> (32-y) to "bfe x, 0, y" at all.
This reverts commits a2bacf8ab58af4c1a0247026ea131443d6066602 and
https://github.com/llvm/llvm-project/commit/bdf8e308b7ea430f619ca3aa1199a76eb6b4e2d4.
show more ...
|
#
db6f476e |
| 08-Nov-2024 |
Changpeng Fang <changpeng.fang@amd.com> |
AMDGPU: Use "countMaxActiveBits() <= 5" to define uint5Bits (#115543)
countMaxTrailingOnes() is not correct. This patch follows the suggestion
from https://github.com/llvm/llvm-project/pull/115372.
|
#
bdf8e308 |
| 07-Nov-2024 |
Changpeng Fang <changpeng.fang@amd.com> |
AMDGPU: Don't avoid clamp of bit shift in BFE pattern (#115372)
Enable pattern matching from "x<<32-y>>32-y" to "bfe x, 0, y" when we
know y is in [0,31].
This is the follow-up for the PR:
https:
AMDGPU: Don't avoid clamp of bit shift in BFE pattern (#115372)
Enable pattern matching from "x<<32-y>>32-y" to "bfe x, 0, y" when we
know y is in [0,31].
This is the follow-up for the PR:
https://github.com/llvm/llvm-project/pull/114279 to fix the issue:
https://github.com/llvm/llvm-project/issues/114282
show more ...
|
#
ca1154d1 |
| 30-Oct-2024 |
Changpeng Fang <changpeng.fang@amd.com> |
AMDGPU: Disable pattern matching "x<<32-y>>32-y" to "bfe x, 0, y" (#114279)
It is not correct to lower "x<<32-y>>32-y" to "bfe x, 0, y". When y
equals to 32, the left-hand side is still x (unchange
AMDGPU: Disable pattern matching "x<<32-y>>32-y" to "bfe x, 0, y" (#114279)
It is not correct to lower "x<<32-y>>32-y" to "bfe x, 0, y". When y
equals to 32, the left-hand side is still x (unchanged), however, the
right-hand side will be evaluated to 0. So it is not always correct to
do such transformation.
We may be able to keep the pattern for immediate y while y is within [0,
31]. However, the immediate operands of the sub (32 - y) are easily
folded, and "(x << imm) >> imm" will be lowered to "and x,
(2^(32-imm))-1" anyway. So no bfe matching is needed.
show more ...
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0 |
|
#
ceb68eea |
| 15-Sep-2023 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Remove repeated -mtriple options from RUN lines (#66486)
|
#
806761a7 |
| 11-Sep-2023 |
Fangrui Song <i@maskray.me> |
[test] Change llc -march= to -mtriple=
The issue is uncovered by #47698: for IR files without a target triple, -mtriple= specifies the full target triple while -march= merely sets the architecture p
[test] Change llc -march= to -mtriple=
The issue is uncovered by #47698: for IR files without a target triple, -mtriple= specifies the full target triple while -march= merely sets the architecture part of the default target triple, leaving a target triple which may not make sense, e.g. riscv64-apple-darwin.
Therefore, -march= is error-prone and not recommended for tests without a target triple. The issue has been benign as we recognize $unknown-apple-darwin as ELF instead of rejecting it outrightly.
show more ...
|
Revision tags: llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
#
8a52bd82 |
| 19-Nov-2021 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Only select VOP3 forms of VOP2 instructions
Change VOP_PAT_GEN to default to not generating an instruction selection pattern for the VOP2 (e32) form of an instruction, only for the VOP3 (e6
[AMDGPU] Only select VOP3 forms of VOP2 instructions
Change VOP_PAT_GEN to default to not generating an instruction selection pattern for the VOP2 (e32) form of an instruction, only for the VOP3 (e64) form. This allows SIFoldOperands maximum freedom to fold copies into the operands of an instruction, before SIShrinkInstructions tries to shrink it back to the smaller encoding.
This affects the following VOP2 instructions: v_min_i32 v_max_i32 v_min_u32 v_max_u32 v_and_b32 v_or_b32 v_xor_b32 v_lshr_b32 v_ashr_i32 v_lshl_b32
A further cleanup could simplify or remove VOP_PAT_GEN, since its optional second argument is never used.
Differential Revision: https://reviews.llvm.org/D114252
show more ...
|
#
078da26b |
| 08-Nov-2021 |
Abinav Puthan Purayil <abinav.puthanpurayil@amd.com> |
[AMDGPU] Check for unneeded shift mask in shift PatFrags.
The existing constrained shift PatFrags only dealt with masked shift from OpenCL front-ends. This change copies the X86DAGToDAGISel::isUnnee
[AMDGPU] Check for unneeded shift mask in shift PatFrags.
The existing constrained shift PatFrags only dealt with masked shift from OpenCL front-ends. This change copies the X86DAGToDAGISel::isUnneededShiftMask() function to AMDGPU and uses it in the shift PatFrag predicates.
Differential Revision: https://reviews.llvm.org/D113448
show more ...
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2 |
|
#
5df1ac78 |
| 31-Jan-2020 |
alex-t <alexander.timofeev@amd.com> |
[AMDGPU] fixed divergence driven shift operations selection
Differential Revision: https://reviews.llvm.org/D73483
Reviewers: rampitec
|
Revision tags: llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1 |
|
#
36617f01 |
| 21-Sep-2018 |
Alexander Timofeev <Alexander.Timofeev@amd.com> |
[AMDGPU] Divergence driven instruction selection. Part 1.
Summary: This change is the first part of the AMDGPU target description change. The aim of it is the effective splitting the vector
[AMDGPU] Divergence driven instruction selection. Part 1.
Summary: This change is the first part of the AMDGPU target description change. The aim of it is the effective splitting the vector and scalar flows at the selection stage. Selection uses predicate functions based on the framework implemented earlier - https://reviews.llvm.org/D35267
Differential revision: https://reviews.llvm.org/D52019
Reviewers: rampitec
llvm-svn: 342719
show more ...
|
Revision tags: llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1 |
|
#
dec562c8 |
| 15-Jun-2018 |
Roman Lebedev <lebedev.ri@gmail.com> |
[AMDGPU] Recognize x & ~(-1 << y) pattern.
Summary: The same pattern as D48010, but this one is IR-canonical as of D47428.
Reviewers: nhaehnle, bogner, tstellar, arsenm
Reviewed By: arsenm
Subscr
[AMDGPU] Recognize x & ~(-1 << y) pattern.
Summary: The same pattern as D48010, but this one is IR-canonical as of D47428.
Reviewers: nhaehnle, bogner, tstellar, arsenm
Reviewed By: arsenm
Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #amdgpu
Differential Revision: https://reviews.llvm.org/D48012
llvm-svn: 334817
show more ...
|
#
9c17dad8 |
| 15-Jun-2018 |
Roman Lebedev <lebedev.ri@gmail.com> |
[AMDGPU] Recognize x & ((1 << y) - 1) pattern.
Summary: As a followup for D48007.
Since we already handle `x << (bitwidth - y) >> (bitwidth - y)` pattern, which does not have ub for both the edge c
[AMDGPU] Recognize x & ((1 << y) - 1) pattern.
Summary: As a followup for D48007.
Since we already handle `x << (bitwidth - y) >> (bitwidth - y)` pattern, which does not have ub for both the edge cases (`y == 0`, `y == bitwidth`), i think also handling a pattern that is ub for `y == bitwidth` should be fine.
Reviewers: nhaehnle, bogner, tstellar, arsenm
Reviewed By: arsenm
Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #amdgpu
Differential Revision: https://reviews.llvm.org/D48010
llvm-svn: 334816
show more ...
|
#
aa8587d1 |
| 15-Jun-2018 |
Roman Lebedev <lebedev.ri@gmail.com> |
[AMDGPU] Recognize x & (-1 >> (32 - y)) pattern.
Summary: D47980 will canonicalize the `x << (32 - y) >> (32 - y)`, which is the pattern the AMDGPU expects to `x & (-1 >> (32 - y))`, which is no
[AMDGPU] Recognize x & (-1 >> (32 - y)) pattern.
Summary: D47980 will canonicalize the `x << (32 - y) >> (32 - y)`, which is the pattern the AMDGPU expects to `x & (-1 >> (32 - y))`, which is not recognized by AMDGPU.
Thus, it needs to be recognized, too.
Reviewers: nhaehnle, bogner, tstellar, arsenm
Reviewed By: arsenm
Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #amdgpu
Differential Revision: https://reviews.llvm.org/D48007
llvm-svn: 334815
show more ...
|
Revision tags: llvmorg-6.0.1, llvmorg-6.0.1-rc3 |
|
#
b896c4e8 |
| 11-Jun-2018 |
Roman Lebedev <lebedev.ri@gmail.com> |
[NFC][AMDGPU] Add tests for all the various IR patterns equivalent to extracting low bits.
Summary: The idiom recognition seems rather poor. Only the `@bzhi32_d0` produces `v_bfe_u32`. But they all
[NFC][AMDGPU] Add tests for all the various IR patterns equivalent to extracting low bits.
Summary: The idiom recognition seems rather poor. Only the `@bzhi32_d0` produces `v_bfe_u32`. But they all should.
This needs to be fixed before D47980 can be re-landed.
Reviewers: mareko, bogner, rampitec, arsenm, tstellar, nhaehnle
Reviewed By: nhaehnle
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #amdgpu
Differential Revision: https://reviews.llvm.org/D48005
llvm-svn: 334398
show more ...
|