Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4 |
|
#
ee795fd1 |
| 26-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Handle rounding intrinsic exponents in isKnownIntegral
https://reviews.llvm.org/D158999
|
#
def22855 |
| 26-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Use pown instead of pow if known integral
https://reviews.llvm.org/D158998
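A minimal OpenCL C sketch of the case the two entries above target (the function and variable names are illustrative assumptions, not taken from the patches): an exponent produced by a rounding call is integral-valued even though its type is float, so isKnownIntegral can accept it and the pow call becomes eligible for the cheaper pown lowering.

    // Illustrative only: floor() always yields an integral value, so even though
    // 'e' has type float, pow(x, e) behaves as x raised to an integer exponent.
    float shade(float x, float y) {
        float e = floor(y);    // integral-valued result
        return pow(x, e);      // candidate for pown(x, (int)e)
    }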
|
#
deefda70 |
| 26-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Use exp2 and log2 intrinsics directly for f16/f32
These codegen correctly but f64 doesn't. This prevents losing fast math flags on the way to the underlying intrinsic.
https://reviews.llvm.org/D158997
|
#
dac8f974 |
| 26-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Handle sitofp and uitofp exponents in fast pow expansion
https://reviews.llvm.org/D158996
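A similar illustrative sketch for this entry (names are assumptions, not from the patch): an exponent formed by converting an integer to float appears as sitofp/uitofp in the IR and is trivially integral, so the fast pow expansion can take the integer-exponent path.

    // Illustrative only: (float)n is a sitofp at the IR level, so the exponent
    // is known integral without any further value analysis.
    float gamma_adjust(float x, int n) {
        return pow(x, (float)n);   // fast pow expansion can treat this like pown
    }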
|
#
699685b7 |
| 26-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Enable assumptions in AMDGPULibCalls
https://reviews.llvm.org/D159006
|
#
a45b787c |
| 25-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Turn pow libcalls into powr
powr is just pow with the assumption that x >= 0 (otherwise the result is NaN). This fires at least 6 times in LuxMark.
https://reviews.llvm.org/D158908
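A hedged OpenCL C sketch of the fold (the function name is an illustrative assumption): when the base is provably non-negative, pow can be rewritten as powr, which the OpenCL spec defines only for x >= 0 (NaN otherwise) and which is simpler to expand because the negative-base handling drops out.

    // Illustrative only: fabs() guarantees a non-negative base, so a pass with
    // that knowledge may rewrite the pow below as powr without changing results.
    float bright(float v, float e) {
        float x = fabs(v);     // x >= 0 is known here
        return pow(x, e);      // may become powr(x, e)
    }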
|
#
f5d8a9b1 |
| 25-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Simplify handling of constant vectors in libcalls
Also fixes not handling the partially undef case.
https://reviews.llvm.org/D158905
|
#
afb24cbb |
| 25-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Don't require all flags to expand fast powr
This was requiring all fast math flags, which is practically useless. This wouldn't fire using all the standard OpenCL fast math flags. This only needs afn nnan and ninf.
https://reviews.llvm.org/D158904
|
#
bfe6bc05 |
| 26-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Cleanup check for integral exponents in pow folds
Also improves undef handling.
https://reviews.llvm.org/D159006
|
#
80e5b46e |
| 26-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fix assertion on half typed pow with constant exponents
https://reviews.llvm.org/D158993
|
#
35c2a754 |
| 25-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fix asserting on fast f16 pown
https://reviews.llvm.org/D158903
|
#
b24dab0e |
| 25-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Trim dead includes
|
Revision tags: llvmorg-17.0.0-rc3 |
|
#
66ee7940 |
| 16-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fix verifier error on splatted opencl fmin/fmax and ldexp calls
Apparently the spec has overloads for fmin/fmax and ldexp with one of the operands as scalar. We need to broadcast the scalars to the vector type.
https://reviews.llvm.org/D158077
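An illustrative OpenCL C sketch of the overloads in question (the function is an assumption, not from the patch): fmin/fmax accept a scalar second operand and ldexp accepts a scalar exponent on vector inputs, so when these calls are rewritten to the vector intrinsics the scalar operand must first be splatted to the vector type, or the resulting IR trips the verifier.

    // Illustrative only: both calls mix a vector operand with a scalar one, which
    // the spec allows; lowering to intrinsics needs the scalar broadcast per lane.
    float4 clamp_and_scale(float4 v, float lo, int k) {
        float4 m = fmax(v, lo);   // scalar 'lo' is splatted across the lanes
        return ldexp(m, k);       // scalar exponent 'k' applies to every lane
    }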
|
#
d2517616 |
| 14-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Replace log libcalls with log intrinsics
|
#
d45022b0 |
| 12-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Remove special case constant folding of divide
We should probably just swap this out for the fdiv, but that's what the implementation is anyway.
|
#
483cc218 |
| 12-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Remove special case folding of sqrt
|
#
416f6af9 |
| 12-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Remove special case folding of fma/mad
These just get replaced with an intrinsic now. This was also introducing host dependence on the result since it relied on the compiler choice to contract or not.
|
#
0eabe65b |
| 12-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Replace ldexp libcalls with intrinsic
|
#
f337a77c |
| 12-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Replace rounding libcalls with intrinsics
|
#
c7876c55 |
| 12-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Replace fabs and copysign libcalls with intrinsics
Preserves flags and metadata like the other cases.
|
#
a70006c4 |
| 12-Aug-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Replace some libcalls with intrinsics
OpenCL loses fast math information by going through libcall wrappers around intrinsics.
Do this to preserve call site flags which are lost when inlining. It's not safe in general to propagate flags during inlining, so avoid dealing with this by just special casing some of the useful calls.
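A hedged sketch of the problem shape (lib_exp2_wrapper is a hypothetical stand-in, not the real device-library entry point): the fast math flags live on the user's call to the wrapper, and once the wrapper is inlined the inner operation no longer carries them, which is the loss avoided by emitting the intrinsic directly at the call site.

    // Illustrative only: the wrapper below models a device-library libcall.
    static float lib_exp2_wrapper(float x) {
        return exp2(x);            // inner call carries no fast math flags of its own
    }

    float fast_caller(float x) {
        // Under -cl-fast-relaxed-math this call is flagged, but the flags are
        // dropped once lib_exp2_wrapper is inlined.
        return lib_exp2_wrapper(x);
    }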
|
Revision tags: llvmorg-17.0.0-rc2 |
|
#
f44beecb |
| 31-Jul-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Try to use private version of sincos if available
The comment was out of date; the device libs build does provide all the pointer overloads. An extremely pedantic interpretation of the spec would suggest only the flat version exists, but the overloads do exist in the implementation.
https://reviews.llvm.org/D156720
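A small illustrative OpenCL C sketch (the function is an assumption): the cosine result is stored through the address of an automatic variable, i.e. private memory, so the private-pointer overload of sincos can be used instead of the flat one.

    // Illustrative only: '&c' points into private (stack) memory.
    float2 phase(float x) {
        float c;
        float s = sincos(x, &c);   // eligible for the private-pointer overload
        return (float2)(s, c);
    }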
|
#
42c6e420 |
| 30-Jul-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Handle multiple uses when matching sincos
Match how the generic implementation handles this. We will now leave behind the dead other user for later passes to deal with.
https://reviews.llvm.org/D156707
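An illustrative sketch of the pattern being matched (names are assumptions): sin and cos of the same operand are combined into one sincos call even when one of the original calls has additional users; the now-dead call is simply left behind for later passes.

    // Illustrative only: both calls share the operand 'x', so they can be merged
    // into a single sincos; any leftover dead sin/cos call is cleaned up later.
    float2 rotate_parts(float x) {
        float s = sin(x);
        float c = cos(x);
        return (float2)(s, c);
    }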
|
#
6dbd4581 |
| 30-Jul-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Remove pointless libcall optimization of fma/mad
After the library is linked and trivially inlined, the generic fma and fmuladd intrinsics already handle these cases, and with precise flag handling. This was requiring all fast math flags when we really just need nsz for the fma(a, b, 0) case.
https://reviews.llvm.org/D156677
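A short worked illustration of why only nsz is needed (the function is an assumption, not from the patch): folding fma(a, b, 0.0f) to a * b can flip the sign of a zero result. If a * b is -0.0f, then fma(a, b, +0.0f) evaluates to -0.0f + 0.0f == +0.0f, while the folded form yields -0.0f, so the fold is valid under nsz alone rather than the full set of fast math flags.

    // Illustrative only: with nsz this is foldable to a * b; without it, the
    // signed-zero case described above makes the fold unsound.
    float fold_candidate(float a, float b) {
        return fma(a, b, 0.0f);
    }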
|
#
6448d5ba |
| 30-Jul-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Remove pointless libcall recognition of native_{divide|recip}
This was trying to constant fold these calls, and also turn some of them into a regular fmul/fdiv. There's no point to doing that; the underlying library implementation should be using those in the first place. Even when the library does use the rcp intrinsics, the backend handles constant folding of those. This was also only performing the folds under overly strict fast-everything-is-required conditions.
The one possible plus this gained over linking in the library is that if you were using all fast math flags, it would propagate them to the new instructions. We could address this in the library by adding more fast math flags to the native implementations.
The constant fold case also had no test coverage.
https://reviews.llvm.org/D156676
|