History log of /llvm-project/llvm/test/CodeGen/AMDGPU/add3.ll (Results 1 – 12 of 12)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1
# b434051d 28-Mar-2023 skc7 <Krishna.Sankisa@amd.com>

[AMDGPU] Introduce SIInstrWorklist to process instructions in moveToVALU

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D147168


Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6
# 2e29b013 21-Jun-2022 Alexander Timofeev <alexander.timofeev@amd.com>

[AMDGPU] Lowering VGPR to SGPR copies to v_readfirstlane_b32 if profitable.

Since the divergence-driven instruction selection has been enabled for AMDGPU,
all the uniform instructions are expected

[AMDGPU] Lowering VGPR to SGPR copies to v_readfirstlane_b32 if profitable.

Since the divergence-driven instruction selection has been enabled for AMDGPU,
all the uniform instructions are expected to be selected to SALU form, except those not having one.
VGPR to SGPR copies appear in MIR to connect values producers and consumers. This change implements an algorithm
that evolves a reasonable tradeoff between the profit achieved from keeping the uniform instructions in SALU form
and overhead introduced by the data transfer between the VGPRs and SGPRs.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D128252

show more ...


# 5cae8816 06-Jul-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Add GFX11 test coverage

Add GFX11 test coverage to a bunch of tests where it was easy to do so,
mostly because the checks are autogenerated and/or GFX11 can share the
same checks as GFX10.

[AMDGPU] Add GFX11 test coverage

Add GFX11 test coverage to a bunch of tests where it was easy to do so,
mostly because the checks are autogenerated and/or GFX11 can share the
same checks as GFX10.

Differential Revision: https://reviews.llvm.org/D129295

show more ...


Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3
# 3ce1b963 08-Sep-2021 Joe Nash <Joseph.Nash@amd.com>

[AMDGPU] Switch PostRA sched to MachineSched

Use GCNHazardRecognizer in postra sched.
Updated tests for the new schedules.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D1095

[AMDGPU] Switch PostRA sched to MachineSched

Use GCNHazardRecognizer in postra sched.
Updated tests for the new schedules.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D109536

Change-Id: Ia86ba2ae168f12fb34b4d8efdab491f84d936cde

show more ...


Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2
# d28624a2 01-Dec-2020 Jay Foad <jay.foad@amd.com>

[AMDGPU] Stop adding an implicit def of vcc_hi for wave32

This doesn't seem to be needed for anything.

Differential Revision: https://reviews.llvm.org/D92400


Revision tags: llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1
# 43830790 07-Oct-2019 Jay Foad <jay.foad@amd.com>

[AMDGPU] Remove dubious logic in bidirectional list scheduler

Summary:
pickNodeBidirectional tried to compare the best top candidate and the
best bottom candidate by examining TopCand.Reason and Bot

[AMDGPU] Remove dubious logic in bidirectional list scheduler

Summary:
pickNodeBidirectional tried to compare the best top candidate and the
best bottom candidate by examining TopCand.Reason and BotCand.Reason.
This is unsound because, after calling pickNodeFromQueue, Cand.Reason
does not reflect the most important reason why Cand was chosen. Rather
it reflects the most recent reason why it beat some other potential
candidate, which could have been for some low priority tie breaker
reason.

I have seen this cause problems where TopCand is a good candidate, but
because TopCand.Reason is ORDER (which is very low priority) it is
repeatedly ignored in favour of a mediocre BotCand. This is not how
bidirectional scheduling is supposed to work.

To fix this I changed the code to always compare TopCand and BotCand
directly, like the generic implementation of pickNodeBidirectional does.
This removes some uncommented AMDGPU-specific logic; if this logic turns
out to be important then perhaps it could be moved into an override of
tryCandidate instead.

Graphics shader benchmarking on gfx10 shows a lot more positive than
negative effects from this change.

Reviewers: arsenm, tstellar, rampitec, kzhuravl, vpykhtin, dstuttard, tpr, atrick, MatzeB

Subscribers: jvesely, wdng, nhaehnle, yaxunl, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68338

show more ...


Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3
# 0846c125 20-Jun-2019 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] gfx1010 core wave32 changes

Differential Revision: https://reviews.llvm.org/D63204

llvm-svn: 363934


Revision tags: llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1
# 64196850 10-May-2019 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Pattern for v_xor3_b32

This also allows three op patterns to use increased constant bus
limit of GFX10.

Differential Revision: https://reviews.llvm.org/D61763

llvm-svn: 360395


# 971cb8b6 06-May-2019 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] gfx1010: prefer V_MUL_LO_U32 over V_MUL_LO_I32

GFX10 deprecates v_mul_lo_i32 instruction, so choose u32 form for
all targets.

Differential Revision: https://reviews.llvm.org/D61525

llvm-s

[AMDGPU] gfx1010: prefer V_MUL_LO_U32 over V_MUL_LO_I32

GFX10 deprecates v_mul_lo_i32 instruction, so choose u32 form for
all targets.

Differential Revision: https://reviews.llvm.org/D61525

llvm-svn: 360094

show more ...


Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3
# 871821f7 14-Feb-2019 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Ressociate 'add (add x, y), z' to use SALU

Reassociate adds to collect scalar operands in a single
instruction when possible. That will result in a scalar
add followed by vector instead of

[AMDGPU] Ressociate 'add (add x, y), z' to use SALU

Reassociate adds to collect scalar operands in a single
instruction when possible. That will result in a scalar
add followed by vector instead of two vector adds, thus
better utilizing SALU.

Differential Revision: https://reviews.llvm.org/D58220

llvm-svn: 354066

show more ...


Revision tags: llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1
# 20fe3d2f 15-Jan-2019 Changpeng Fang <changpeng.fang@gmail.com>

AMDGPU: Raise the priority of MAD24 in instruction selection.

Summary:
We have seen performance regression when v_add3 is generated. The major reason is that the v_mad pattern
is broken when v_add

AMDGPU: Raise the priority of MAD24 in instruction selection.

Summary:
We have seen performance regression when v_add3 is generated. The major reason is that the v_mad pattern
is broken when v_add3 is generated. We also see the register pressure increased. While we could not properly
estimate register pressure during instruction selection, we can give mad a higher priority.

In this work, we raise the priority for mad24 in selection and resolve the performance regression.

Reviewers:
rampitec

Differential Revision:
https://reviews.llvm.org/D56745

llvm-svn: 351273

show more ...


Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3
# ca4a3294 06-Dec-2018 Nicolai Haehnle <nhaehnle@gmail.com>

AMDGPU: Generate VALU ThreeOp Integer instructions

Summary:
Original patch by: Fabian Wahlster <razor@singul4rity.com>

Change-Id: I148f692a88432541fad468963f58da9ddf79fac5

Reviewers: arsenm, rampi

AMDGPU: Generate VALU ThreeOp Integer instructions

Summary:
Original patch by: Fabian Wahlster <razor@singul4rity.com>

Change-Id: I148f692a88432541fad468963f58da9ddf79fac5

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, b-sumner, llvm-commits

Differential Revision: https://reviews.llvm.org/D51995

llvm-svn: 348488

show more ...