global_atomics_iterative_scan.ll - OpenGrok history log for /llvm-project/llvm/test/CodeGen/AMDGPU/global_atomics_iterative

Revision	Date	Author	Comments
# 5feb32ba	25-Jun-2024	Vikram Hegde <115221833+vikramRH@users.noreply.github.com>	[AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (#89217) This patch is intended to be the first of a series with end goal to adapt atomic optimizer pass to support i64 and f64 operations (along with removing all unnecessary bitcasts). This legalizes 64 bit readlane, writelane and readfirstlane ops pre-ISel. Co-authored-by: vikramRH <vikhegde@amd.com>
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4
# f09360d2	30-Aug-2023	Pravin Jagtap <Pravin.Jagtap@amd.com>	[AMDGPU] Support FAdd/FSub global atomics in AMDGPUAtomicOptimizer. Reduction and Scan are implemented using `Iterative` and `DPP` strategy for `float` type. Reviewed By: arsenm, #amdgpu Differential Revision: https://reviews.llvm.org/D156301
Revision tags: llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# 699addef	20-Jun-2023	Pravin Jagtap <Pravin.Jagtap@amd.com>	[AMDGPU] Use verify<domtree> instead of intra-pass asserts. Verifying dominator tree is expensive using intra-pass asserts. Asserts added during D147408 are increasing the build time of libc significantly. This change does the verification after the atomic optimizer pass and should fix the regression reported in D153232. Reviewed By: arsenm, #amdgpu Differential Revision: https://reviews.llvm.org/D153261
Revision tags: llvmorg-16.0.6
# f6c8a8e9	09-Jun-2023	Pravin Jagtap <Pravin.Jagtap@amd.com>	[AMDGPU] Iterative scan implementation for atomic optimizer. This patch provides an alternative implementation to DPP for Scan Computations. An alternative implementation iterates over all active lanes of Wavefront using llvm.cttz and performs the following steps: 1. Read the value that needs to be atomically incremented using llvm.amdgcn.readlane intrinsic 2. Accumulate the result. 3. Update the scan result using llvm.amdgcn.writelane intrinsic if intermediate scan results are needed later in the kernel. Reviewed By: arsenm, cdevadas Differential Revision: https://reviews.llvm.org/D147408
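
The three steps in the f6c8a8e9 commit message can be sketched as a host-side simulation. This is a sketch only, not the pass's actual code. The names `cttz` and `iterative_scan`, the list-based model of lanes, and the integer execution mask are hypothetical stand-ins for llvm.cttz, llvm.amdgcn.readlane and llvm.amdgcn.writelane. It also assumes an exclusive scan (each lane receives the running total from before its own contribution), which the commit message does not state.

```python
def cttz(mask: int) -> int:
    """Index of the lowest set bit of a nonzero mask (stand-in for llvm.cttz)."""
    return (mask & -mask).bit_length() - 1


def iterative_scan(values, exec_mask, identity=0):
    """Walk the active lanes from lowest to highest, one lane per iteration.

    values    -- per-lane operands, indexed by lane id
    exec_mask -- integer bitmask, bit i set means lane i is active
    Returns (reduction over active lanes, per-lane exclusive scan).
    """
    acc = identity
    scan = [identity] * len(values)   # inactive lanes keep the identity
    remaining = exec_mask
    while remaining:
        lane = cttz(remaining)        # next active lane
        lane_value = values[lane]     # step 1: "readlane"
        scan[lane] = acc              # step 3: "writelane" (assumed exclusive)
        acc += lane_value             # step 2: accumulate
        remaining &= remaining - 1    # clear the lane just processed
    return acc, scan


if __name__ == "__main__":
    total, per_lane = iterative_scan([1, 2, 3, 4], 0b1011)
    print(total, per_lane)
```

In this model the loop runs once per active lane, so its cost scales with the number of set bits in the mask rather than with the wavefront size.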