History log of /llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp (Results 126 – 150 of 364)
Revision Date Author Comments
# aad2d272 26-Nov-2022 Kazu Hirata <kazu@google.com>

[Utils] Use std::optional in AMDGPUBaseInfo.cpp (NFC)

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
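For readers unfamiliar with the migration, the accessor renames named in the linked thread map onto standard-library calls roughly as sketched below. This is an illustration only: `findBank` and `bankOrDefault` are hypothetical helpers invented for the example and do not appear in AMDGPUBaseInfo.cpp.

```cpp
#include <optional>
#include <string>

// Hypothetical lookup, invented for this example only.
std::optional<unsigned> findBank(const std::string &Name) {
  if (Name == "v0")
    return 0u;
  if (Name == "v1")
    return 1u;
  return std::nullopt; // no value, same role as llvm::None
}

// Accessor mapping from llvm::Optional to std::optional:
//   X.hasValue()    -> X.has_value()
//   X.getValue()    -> *X or X.value()
//   X.getValueOr(D) -> X.value_or(D)
unsigned bankOrDefault(const std::string &Name, unsigned Default) {
  std::optional<unsigned> Bank = findBank(Name);
  return Bank.value_or(Default);
}
```

Because such a change only swaps the wrapper type and renames accessors, behavior is expected to stay the same, which is what the NFC tag in the title signals.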


# 96155bf4 18-Nov-2022 Dmitry Preobrazhensky <dmitri.preobrazhenski@gmail.com>

[AMDGPU][GFX11][NFC] Refactor VOPD operands handling (part 2)

Rename interface functions and operands to make code clearer.

Differential Revision: https://reviews.llvm.org/D138133


# e468b1b7 16-Nov-2022 Dmitry Preobrazhensky <dmitri.preobrazhenski@gmail.com>

[AMDGPU][GFX11] Refactor VOPD operands handling

Differential Revision: https://reviews.llvm.org/D137952


Revision tags: llvmorg-15.0.5
# 7425077e 07-Nov-2022 Pierre van Houtryve <pierre.vanhoutryve@amd.com>

[AMDGPU] Add & use `hasNamedOperand`, NFC

In a lot of places, we were calling `getNamedOperandIdx` only to check whether the result was equal or not equal to -1.
This is fine in itself, but it's verbose and doesn't make the intention clear, IMHO. I added `hasNamedOperand` and replaced all cases I could find, both with regexes and manually.

Reviewed By: arsenm, foad

Differential Revision: https://reviews.llvm.org/D137540
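The pattern described in this commit can be sketched as a thin predicate over the index lookup. The version below is a self-contained illustration: the `getNamedOperandIdx` body, its lookup table, and the parameter types are hypothetical stand-ins, not the generated LLVM code, and the opcode and operand numbers are made up.

```cpp
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical stand-in for the real lookup: returns the operand index for
// (Opcode, NamedIdx), or -1 when the opcode has no such named operand.
int getNamedOperandIdx(uint16_t Opcode, uint16_t NamedIdx) {
  static const std::map<std::pair<uint16_t, uint16_t>, int> Table = {
      {{1, 0}, 0}, // made-up opcode 1, made-up named operand 0, at index 0
      {{1, 2}, 3}, // made-up opcode 1, made-up named operand 2, at index 3
  };
  auto It = Table.find({Opcode, NamedIdx});
  return It == Table.end() ? -1 : It->second;
}

// The helper hides the -1 sentinel so call sites state their intent directly.
bool hasNamedOperand(uint16_t Opcode, uint16_t NamedIdx) {
  return getNamedOperandIdx(Opcode, NamedIdx) != -1;
}
```

A call site that previously read `getNamedOperandIdx(Opc, Name) != -1` would then read `hasNamedOperand(Opc, Name)`, with no change in behavior.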


Revision tags: llvmorg-15.0.4
# 01b8140d 20-Oct-2022 Joe Nash <Joseph.Nash@amd.com>

[AMDGPU] Fix delay alu for VOPD with src2acc

V_FMAC_F32 and V_DOT2C_F32_F16 have a dummy src2 operand tied to vdst to
inform passes that the instructions read the dst operand. The VOPD
versions of these instructions lacked the dummy operand, which was a
problem for inserting s_delay_alu.
Introduce the dummy src2 operand on the VOPD versions, and fix the VOPD operand
tracking logic to account for it.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D136629


Revision tags: llvmorg-15.0.3
# fd7b0eea 07-Oct-2022 Dmitry Preobrazhensky <dmitri.preobrazhenski@gmail.com>

[AMDGPU][MC][GFX11] Add VOPD VGPR bank access validation

Differential Revision: https://reviews.llvm.org/D134960


Revision tags: working, llvmorg-15.0.2
# ddfa0f62 23-Sep-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Add GFX11 feature for subtargets with more VGPRs

The full complement of physical VGPRs for GFX11 is 50% more than GFX10.
Some subtargets have this, others stay the same as GFX10. This affects
occupancy calculations.

Differential Revision: https://reviews.llvm.org/D134522


Revision tags: llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# b982ba2a 13-Jul-2022 Joe Nash <Joseph.Nash@amd.com>

[AMDGPU][GFX11] Use VGPR_32_Lo128 for VOP1,2,C

Due to the encoding changes in GFX11, we had a hack in place that
disables the use of VGPRs above 128. This patch removes the need for
that hack.

We introduce a new register class VGPR_32_Lo128 which is used for 16-bit
operands of VOP1, VOP2, and VOPC instructions. This register class only has the
low 128 VGPRs, but is otherwise identical to VGPR_32. Therefore, 16-bit VOP1,
VOP2, and VOPC instructions are correctly limited to use the first 128
VGPRs, while the other instructions can freely use all 256.

We introduce new pseudo-instructions used on GFX11 which have the suffix
t16 (True 16) to use the VGPR_32_Lo128 register class.

Reviewed By: foad, rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D133723


# c89e60bf 15-Sep-2022 Dmitry Preobrazhensky <dmitri.preobrazhenski@gmail.com>

[AMDGPU][MC][GFX11] Add VOPD literals validation

Differential Revision: https://reviews.llvm.org/D133864


# a80116ef 13-Sep-2022 Dmitry Preobrazhensky <dmitri.preobrazhenski@gmail.com>

[AMDGPU][MC][GFX11] Add a helper function for identification of VOPD instructions

Differential Revision: https://reviews.llvm.org/D133608


# 7094ab4e 18-Jul-2022 Kazu Hirata <kazu@google.com>

[llvm] Modernize bool literals (NFC)

Identified with modernize-use-bool-literals.


# d1af09ad 23-Jun-2022 Joe Nash <Joseph.Nash@amd.com>

[AMDGPU] gfx11 Generate VOPD Instructions

We form VOPD instructions in the GCNCreateVOPD pass by combining
back-to-back component instructions. There are strict register
constraints for creating a legal VOPD, namely that the matching operands
(e.g. src0x and src0y, src1x and src1y) must be in different register
banks. We add a PostRA scheduler mutation to put possible VOPD components
back-to-back.

Depends on D128442, D128270

Reviewed By: #amdgpu, rampitec

Differential Revision: https://reviews.llvm.org/D128656
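The register-bank constraint described above can be sketched as a pairwise check on matching source operands. The sketch rests on assumptions the commit message does not state: that a VGPR's bank is its index modulo a fixed bank count, and that the count is 4. `NumBanks`, `bankOf`, and `banksAreCompatible` are hypothetical names, not LLVM functions.

```cpp
// Assumption for illustration: bank = VGPR index modulo NumBanks.
constexpr unsigned NumBanks = 4; // hypothetical value

constexpr unsigned bankOf(unsigned VGPRIndex) { return VGPRIndex % NumBanks; }

// Returns true when each pair of matching source operands (src0x/src0y and
// src1x/src1y) falls in different banks, i.e. the pair could form a VOPD
// under the simplified rule assumed here.
constexpr bool banksAreCompatible(unsigned Src0X, unsigned Src1X,
                                  unsigned Src0Y, unsigned Src1Y) {
  return bankOf(Src0X) != bankOf(Src0Y) && bankOf(Src1X) != bankOf(Src1Y);
}
```

Under these assumptions, `banksAreCompatible(0, 1, 1, 2)` holds, while `banksAreCompatible(0, 1, 4, 2)` does not, because v0 and v4 would share a bank.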


# 4874838a 28-Jun-2022 Piotr Sobczak <piotr.sobczak@amd.com>

[AMDGPU] gfx11 WMMA instruction support

gfx11 introduces new WMMA (Wave Matrix Multiply-accumulate)
instructions.

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D128756


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5
# 07b7fada 25-May-2022 Joe Nash <Joseph.Nash@amd.com>

[AMDGPU] gfx11 VOPD instructions MC support

VOPD is a new encoding for dual-issue instructions for use in wave32.
This patch includes MC layer support only.

A VOPD instruction is constituted of an X component (for which there are
13 possible opcodes) and a Y component (for which there are the 13 X
opcodes plus 3 more). Most of the complexity in defining and parsing
a VOPD operation arises from the possible different total numbers of
operands and deferred parsing of certain operands depending on the
constituent X and Y opcodes.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D128218


# 485e8b4f 20-Jun-2022 Dmitry Preobrazhensky <d-pre@mail.ru>

[AMDGPU][MC][GFX11] Correct disassembly of DPP variants of VOPC64 opcodes

Fix bugs https://github.com/llvm/llvm-project/issues/56091, https://github.com/llvm/llvm-project/issues/56065.

Differential Revision: https://reviews.llvm.org/D128075


Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2
# 7050d5b9 17-Feb-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Limit GFX11 to using 128 VGPRs

This is a temporary measure to avoid generating incorrect code until the
compiler understands the new way that GFX11 encodes 16-bit operands in
VOP instructions.

Differential Revision: https://reviews.llvm.org/D128054


# cb9ae937 10-Jun-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Define SGPR_NULL64 register. NFCI.

On gfx10+ null register can be used as both 32 and 64 bit operand.
Define a 64 bit version of the register to use during codegen.

Differential Revision: https://reviews.llvm.org/D127527


# 95a13425 05-Jun-2022 Fangrui Song <i@maskray.me>

Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options


# 835e09c4 10-May-2022 Joe Nash <Joseph.Nash@amd.com>

[AMDGPU] gfx11 FLAT Instructions

MachineCode Support for FLAT type instructions

Contributors:
Sebastian Neubauer <sebastian.neubauer@amd.com>

Patch 12/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125989

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D125992


# 1a51ab76 25-Apr-2022 Joe Nash <Joseph.Nash@amd.com>

[AMDGPU] gfx11 export instructions

Contributors:
Jay Foad <jay.foad@amd.com>
Dmitry Preobrazhensky <d-pre@mail.ru>

Patch 10/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125822

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D125824


# d21b9b49 21-Apr-2022 Joe Nash <Joseph.Nash@amd.com>

[AMDGPU] gfx11 scalar alu instructions

MC layer support for SOP (scalar ALU operations), including encoding
support for s_delay_alu and s_sendmsg_rtn.

Contributors:
Jay Foad <jay.foad@amd.com>

Patch 7/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D125319

Reviewed By: #amdgpu, arsenm

Differential Revision: https://reviews.llvm.org/D125498


# c7025940 19-Apr-2022 Joe Nash <Joseph.Nash@amd.com>

[AMDGPU] gfx11 BUF Instructions

Includes MachineCode layer support and tests, and MIR tests not requiring
CodeGen pass changes.
Includes a small change in SMInstructions.td to correct encoded bits.

Contributors:
Petar Avramovic <Petar.Avramovic@amd.com>
Dmitry Preobrazhensky <dmitry.preobrazhensky@amd.com>

Depends on D125316

Patch 6/N for upstreaming of AMDGPU gfx11 architecture.

Reviewed By: dp, Petar.Avramovic

Differential Revision: https://reviews.llvm.org/D125319


# e01dbabd 19-Apr-2022 Dmitry Preobrazhensky <d-pre@mail.ru>

[AMDGPU][MC] Corrected error message "image data size does not match dmask and tfe"

Differential Revision: https://reviews.llvm.org/D123929


# 8edaf259 12-Apr-2022 Changpeng Fang <Changpeng.Fang@amd.com>

AMDGPU: Emit metadata for the hidden_multigrid_sync_arg conditionally

Summary:
Introduce a new function attribute, amdgpu-no-multigrid-sync-arg, which is set by default.
We use implicitarg_ptr + offset to check whether the multigrid synchronization
pointer is used. If yes, we remove this attribute and also remove
amdgpu-no-implicitarg-ptr. We generate metadata for the hidden_multigrid_sync_arg
only when the amdgpu-no-multigrid-sync-arg attribute is removed from the function.

Reviewers: arsenm, sameerds, b-sumner and foad

Differential Revision: https://reviews.llvm.org/D123548


# 1f6aa903 07-Apr-2022 Dmitry Preobrazhensky <dmitry.preobrazhensky@amd.com>

[AMDGPU][MC][GFX10] Added syntactic sugar for s_waitcnt_depctr operand

Added the following helpers:

depctr_hold_cnt(...)
depctr_sa_sdst(...)
depctr_va_vdst(...)
depctr_va_sdst(...)
depctr_va_ssrc(...)
depctr_va_vcc(...)
depctr_vm_vsrc(...)

Differential Revision: https://reviews.llvm.org/D123022

