History log of /llvm-project/llvm/test/CodeGen/AMDGPU/materialize-frame-index-sgpr.ll (Results 1 – 8 of 8)
Revision Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5
# 39337ff2 02-Dec-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Handle cvt_scale F32/F16->F4/F8 gfx950 hazard (#117844)

gfx950 SP changes doc says:
No 4 clk forwarding on opcodes that convert from
F32/F16->F8 or F32/F16->F4. Must insert a NOP or
instruction writing some other destination VREG
after a conversion to F4/F8 since it writes either
low/high half or bytes.

Co-authored-by: Pravin Jagtap <Pravin.Jagtap@amd.com>
Co-authored-by: Jeffrey Byrnes <Jeffrey.Byrnes@amd.com>



Revision tags: llvmorg-19.1.4
# 1bf385f1 09-Nov-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Default to selecting frame indexes to SGPRs (#115060)

Only select to a VGPR if it's trivially used in VGPR only contexts.
This fixes mishandling frame indexes used in SGPR only contexts,
like inline assembly constraints.

This is suboptimal in the common case where the frame index
is transitively used by only VALU ops. We make up for this by later
folding the copy to VALU plus scalar op in SIFoldOperands.



Revision tags: llvmorg-19.1.3
# ef91cd3f 19-Oct-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Handle folding frame indexes into add with immediate (#110738)


Revision tags: llvmorg-19.1.2, llvmorg-19.1.1
# ac0f64f0 30-Sep-2024 Christudasan Devadasan <christudasan.devadasan@amd.com>

[AMDGPU] Split vgpr regalloc pipeline (#93526)

Allocating wwm-registers and per-thread VGPR operands
together imposes many challenges in the way the
registers are reused during allocation. At times,
regalloc reuses the registers of regular VGPR
operations for wwm-operations in a small range,
unintentionally clobbering their inactive lanes and
causing correctness issues that are hard to trace.

This patch splits the VGPR allocation pipeline further
to allocate wwm-registers first and the regular VGPR
operands in a separate pipeline. The splitting would
ensure that the physical registers used for wwm
allocations won't take part in the next allocation
pipeline to avoid any such clobbering.
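
The two-phase idea described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the LLVM register allocator: the function name `allocate_split` and the string register names are hypothetical, and liveness-based reuse within each phase is ignored.

```python
def allocate_split(wwm_vregs, regular_vregs, phys_regs):
    """Toy two-phase allocation: wwm virtual registers first, then regular ones.

    Physical registers handed out in phase 1 are withheld from phase 2,
    so a regular operand can never land on a wwm register.
    """
    free = list(phys_regs)
    assignment = {}

    # Phase 1: wwm operands get first pick of the physical registers.
    for vreg in wwm_vregs:
        if not free:
            raise RuntimeError("out of physical registers in wwm phase")
        assignment[vreg] = free.pop(0)
    reserved = set(assignment.values())

    # Phase 2: regular operands only see registers not reserved by phase 1.
    remaining = [reg for reg in phys_regs if reg not in reserved]
    for vreg in regular_vregs:
        if not remaining:
            raise RuntimeError("out of physical registers in regular phase")
        assignment[vreg] = remaining.pop(0)

    return assignment, reserved
```

In this model the regular phase is given a strictly smaller register pool, which is the property the patch relies on to rule out clobbering.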



Revision tags: llvmorg-19.1.0
# 86627149 04-Sep-2024 Carl Ritson <carl.ritson@amd.com>

[AMDGPU] Mitigate GFX12 VALU read SGPR hazard (#100067)

Any SGPR read by a VALU can potentially obscure SALU writes to the same
register.
Insert s_wait_alu instructions to mitigate the hazard on affected paths.

Compute a global cache of SGPRs with any VALU reads and use this to
avoid inserting mitigation for SGPRs never accessed by VALUs.

To avoid excessive searching when compile time is a priority, implement a
secondary mode in which all SALU writes are mitigated.
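
The two modes described above can be sketched with a toy model. This is a simplified illustration, not the actual hazard recognizer: the `Inst` tuple, the function names, and the `mitigate_all` flag are hypothetical, and the real pass additionally restricts mitigation to affected paths.

```python
from typing import NamedTuple


class Inst(NamedTuple):
    unit: str          # "SALU" or "VALU"
    reads: frozenset   # SGPR names read by the instruction
    writes: frozenset  # SGPR names written by the instruction


def valu_read_sgprs(program):
    """Collect every SGPR read by any VALU instruction (the global cache)."""
    cache = set()
    for inst in program:
        if inst.unit == "VALU":
            cache |= inst.reads
    return cache


def salu_writes_to_mitigate(program, mitigate_all=False):
    """Return indices of SALU writes that this toy model would mitigate.

    Default mode: only writes to SGPRs that some VALU reads.
    Secondary mode (mitigate_all=True): every SALU write, with no search.
    """
    cache = set() if mitigate_all else valu_read_sgprs(program)
    result = []
    for index, inst in enumerate(program):
        if inst.unit != "SALU" or not inst.writes:
            continue
        if mitigate_all or inst.writes & cache:
            result.append(index)
    return result
```

The default mode pays for one pass over the program to build the cache and in return skips SGPRs that no VALU ever reads; the secondary mode skips that pass at the cost of more mitigation points.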

Co-authored-by: Shilei Tian <shilei.tian@amd.com>



Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2
# 42bae9c5 05-Aug-2024 Pankaj Dwivedi <pankajkumar.divedi@amd.com>

[AMDGPU] Optimize the register uses if offset inlinable (#101676)

Fold the frame index offset into v_mad if inlinable.


# adac04ff 02-Aug-2024 Pankaj Dwivedi <divedi.pk.117@gmail.com>

[AMDGPU] Fix using wrong register in frame index shift (#101649)

In the v_mad case, the offset is materialized in a VGPR and the mad is
performed in wave space; the VGPR later has to be shifted back into lane
space. [#99556](https://github.com/llvm/llvm-project/pull/99556)
introduced a bug.

Co-authored-by: Pankajdwivedi-25 <pankajkumar.divedi@amd.com>



# ef67664d 31-Jul-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Add testcase for materializing sgpr frame indexes (#101306)

These add some IR tests for 57d10b4fc9142d12fbdec578a0cc6f78deb67ef4.
These do rely on some lucky MIR placement to test the scc input, but I
haven't found a better way to do it. Also, scc handling in inline asm
is extremely buggy.
