History log of /llvm-project/llvm/test/CodeGen/AMDGPU/global_atomics_iterative_scan.ll (Results 1 – 4 of 4)
Revision Date Author Comments
# 5feb32ba 25-Jun-2024 Vikram Hegde <115221833+vikramRH@users.noreply.github.com>

[AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (#89217)

This patch is intended to be the first of a series whose end goal is to
adapt the atomic optimizer pass to support i64 and f64 operations (along
with removing all unnecessary bitcasts). It legalizes 64-bit readlane,
writelane and readfirstlane ops pre-ISel.

---------

Co-authored-by: vikramRH <vikhegde@amd.com>
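
One common way to support a 64-bit lane operation when the underlying lane read is 32 bits wide is to split the value into two 32-bit halves, read each half, and recombine them. The Python sketch below only illustrates that idea on plain integers. Whether the patch legalizes the operations in exactly this way is an assumption, and `readlane32`, `readlane64` and `MASK32` are hypothetical names, not LLVM APIs.

```python
MASK32 = 0xFFFFFFFF  # hypothetical helper constant: low 32 bits


def readlane32(values, lane):
    """Stand-in for a 32-bit lane read: return the low 32 bits of one lane."""
    return values[lane] & MASK32


def readlane64(values, lane):
    """Read a 64-bit lane value as two 32-bit reads (assumed strategy)."""
    # Split every lane's value into low and high 32-bit halves.
    lo_parts = [v & MASK32 for v in values]
    hi_parts = [(v >> 32) & MASK32 for v in values]
    # Read each half separately, then recombine into one 64-bit value.
    lo = readlane32(lo_parts, lane)
    hi = readlane32(hi_parts, lane)
    return (hi << 32) | lo


# Two simulated lanes holding 64-bit values.
wave = [0x1122334455667788, 0xDEADBEEFCAFEF00D]
lane1 = readlane64(wave, 1)
```

A 32-bit read alone would lose the upper half of each value, which is why the two reads are combined.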


Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4
# f09360d2 30-Aug-2023 Pravin Jagtap <Pravin.Jagtap@amd.com>

[AMDGPU] Support FAdd/FSub global atomics in AMDGPUAtomicOptimizer.

Reduction and Scan are implemented using the `Iterative`
and `DPP` strategies for the `float` type.

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D156301


Revision tags: llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# 699addef 20-Jun-2023 Pravin Jagtap <Pravin.Jagtap@amd.com>

[AMDGPU] Use verify<domtree> instead of intra-pass asserts.

Verifying the dominator tree with intra-pass asserts is expensive.
The asserts added in D147408 significantly increase the build time
of libc. This change performs the verification after the atomic
optimizer pass instead and should fix the regression reported in D153232.

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D153261


Revision tags: llvmorg-16.0.6
# f6c8a8e9 09-Jun-2023 Pravin Jagtap <Pravin.Jagtap@amd.com>

[AMDGPU] Iterative scan implementation for atomic optimizer.

This patch provides an alternative to the DPP implementation of scan computations.

The alternative implementation iterates over all active lanes of the wavefront
using llvm.cttz and performs the following steps:
1. Read the value that needs to be atomically incremented using
the llvm.amdgcn.readlane intrinsic.
2. Accumulate the result.
3. Update the scan result using the llvm.amdgcn.writelane intrinsic
if intermediate scan results are needed later in the kernel.

Reviewed By: arsenm, cdevadas

Differential Revision: https://reviews.llvm.org/D147408
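
The three steps listed in this commit message can be sketched as a small Python simulation of one wavefront. This is a minimal model, not the pass itself: the list `values` stands in for per-lane registers, `exec_mask` for the set of active lanes, and the function names are hypothetical. It also assumes an exclusive scan (each lane receives the accumulated value of the active lanes before it) and integer addition as the operation, neither of which is stated in the message.

```python
def cttz(mask):
    """Count trailing zeros of a non-zero integer (index of the lowest set bit)."""
    return (mask & -mask).bit_length() - 1


def iterative_scan(values, exec_mask, identity=0, op=lambda a, b: a + b):
    """Simulate an iterative scan over the active lanes of one wavefront.

    Returns (scan, total). scan[i] holds the accumulated value of all active
    lanes below lane i (assumed exclusive scan); inactive lanes keep
    `identity`. total is the reduction over all active lanes.
    """
    scan = [identity] * len(values)
    accumulator = identity
    remaining = exec_mask
    while remaining:
        lane = cttz(remaining)                     # next active lane
        lane_value = values[lane]                  # step 1: models readlane
        previous = accumulator
        accumulator = op(accumulator, lane_value)  # step 2: accumulate
        scan[lane] = previous                      # step 3: models writelane
        remaining &= remaining - 1                 # clear the lane just handled
    return scan, accumulator


# Eight simulated lanes; lanes 0, 2, 4, 5 and 7 are active.
scan, total = iterative_scan([1, 2, 3, 4, 5, 6, 7, 8], 0b10110101)
```

The loop runs once per active lane, so its cost grows with the number of set bits in the mask rather than with the wavefront width.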
