Revision tags: llvmorg-21-init |
|
#
8e702735 |
| 24-Jan-2025 |
Jeremy Morse <jeremy.morse@sony.com> |
[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and sim
[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety however, we'd like to
have all calls to moveBefore use iterators.
This patch adds a (guaranteed dereferenceable) iterator-taking
moveBefore, and changes a bunch of call-sites where it's obviously safe
to change to use it by just calling getIterator() on an instruction
pointer. A follow-up patch will contain less-obviously-safe changes.
We'll eventually deprecate and remove the instruction-pointer
insertBefore, but not before adding concise documentation of what
considerations are needed (very few).
show more ...
|
Revision tags: llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5 |
|
#
83cbb170 |
| 03-Dec-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Refine AMDGPUAtomicOptimizerImpl class. NFC. (#118302)
Use references instead of pointers for most state and common up some of
the initialization between the legacy and new pass manager pa
[AMDGPU] Refine AMDGPUAtomicOptimizerImpl class. NFC. (#118302)
Use references instead of pointers for most state and common up some of
the initialization between the legacy and new pass manager paths.
show more ...
|
Revision tags: llvmorg-19.1.4, llvmorg-19.1.3 |
|
#
922992a2 |
| 18-Oct-2024 |
Jay Foad <jay.foad@amd.com> |
Fix typo "instrinsic" (#112899)
|
#
85c17e40 |
| 17-Oct-2024 |
Jay Foad <jay.foad@amd.com> |
[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706)
Convert many instances of:
Fn = Intrinsic::getOrInsertDeclaration(...);
CreateCall(Fn, ...)
to the equivalent CreateIntrinsi
[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706)
Convert many instances of:
Fn = Intrinsic::getOrInsertDeclaration(...);
CreateCall(Fn, ...)
to the equivalent CreateIntrinsic call.
show more ...
|
Revision tags: llvmorg-19.1.2 |
|
#
fa789dff |
| 11-Oct-2024 |
Rahul Joshi <rjoshi@nvidia.com> |
[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is a
[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
show more ...
|
Revision tags: llvmorg-19.1.1, llvmorg-19.1.0 |
|
#
126d6f27 |
| 04-Sep-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Improve codegen for GFX10+ DPP reductions and scans (#107108)
Use poison for an unused input to the permlanex16 intrinsic, to improve
register allocation and avoid an unnecessary v_mov ins
[AMDGPU] Improve codegen for GFX10+ DPP reductions and scans (#107108)
Use poison for an unused input to the permlanex16 intrinsic, to improve
register allocation and avoid an unnecessary v_mov instruction.
show more ...
|
Revision tags: llvmorg-19.1.0-rc4 |
|
#
c8ba3170 |
| 23-Aug-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Remove comment outdated by #96933
|
Revision tags: llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
cf230e77 |
| 15-Jul-2024 |
Vikram Hegde <115221833+vikramRH@users.noreply.github.com> |
[AMDGPU] Enable atomic optimizer for divergent i64 and double values (#96934)
|
#
ae63db78 |
| 13-Jul-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Re-enable atomic optimization of uniform fadd/fsub with result (#97604)
Fix various problems to do with the first active lane of the result of
optimized fp atomics, as explained in the com
[AMDGPU] Re-enable atomic optimization of uniform fadd/fsub with result (#97604)
Fix various problems to do with the first active lane of the result of
optimized fp atomics, as explained in the comment.
Fixes #97554
show more ...
|
#
5ab9e003 |
| 08-Jul-2024 |
Jie Fu <jiefu@tencent.com> |
[AMDGPU] Fix -Wunused-variable in AMDGPUAtomicOptimizer.cpp (NFC)
/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:688:18: error: unused variable 'TyBitWidth' [-Werror,-Wunused-variabl
[AMDGPU] Fix -Wunused-variable in AMDGPUAtomicOptimizer.cpp (NFC)
/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:688:18: error: unused variable 'TyBitWidth' [-Werror,-Wunused-variable] const unsigned TyBitWidth = DL->getTypeSizeInBits(Ty); ^ 1 error generated.
show more ...
|
#
2a960716 |
| 08-Jul-2024 |
Vikram Hegde <115221833+vikramRH@users.noreply.github.com> |
[AMDGPU] Cleanup bitcast spam in atomic optimizer (#96933)
|
#
b76dd4ed |
| 03-Jul-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Disable atomic optimization of fadd/fsub with result (#96479)
An atomic fadd instruction like this should return %x:
; value at %ptr is %x
%r = atomicrmw fadd ptr %ptr, float %y
[AMDGPU] Disable atomic optimization of fadd/fsub with result (#96479)
An atomic fadd instruction like this should return %x:
; value at %ptr is %x
%r = atomicrmw fadd ptr %ptr, float %y
After atomic optimization, if %y is uniform, the result is calculated
as %r = %x + * %y * +0.0. This has a couple of problems:
1. If %y is Inf or NaN, this will return NaN instead of %x.
2. If %x is -0.0 and %y is positive, this will return +0.0 instead of
-0.0.
Avoid these problems by disabling the "%y is uniform" path if there are
any uses of the result.
show more ...
|
#
43b98882 |
| 02-Jul-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Use nan as the identity for atomicrmw fmax/fmin (#97411)
atomicrmw fmax/fmin perform the same operation as llvm.maxnum/minnum
which return the other operand if one operand is nan. This mea
[AMDGPU] Use nan as the identity for atomicrmw fmax/fmin (#97411)
atomicrmw fmax/fmin perform the same operation as llvm.maxnum/minnum
which return the other operand if one operand is nan. This means that,
in the presence of nan arguments, +/- inf is not an identity for these
operations but nan is (at least if you don't care about nan payloads).
show more ...
|
#
9df71d76 |
| 28-Jun-2024 |
Nikita Popov <npopov@redhat.com> |
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, re
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
show more ...
|
#
35f7b60a |
| 26-Jun-2024 |
Vikram Hegde <115221833+vikramRH@users.noreply.github.com> |
[AMDGPU] Extend permlane16, permlanex16 and permlane64 intrinsic lowering for generic types (#92725)
These are incremental changes over #89217 , with core logic being the
same. This patch along wit
[AMDGPU] Extend permlane16, permlanex16 and permlane64 intrinsic lowering for generic types (#92725)
These are incremental changes over #89217 , with core logic being the
same. This patch along with #89217 and #91190 should get us ready to enable 64
bit optimizations in atomic optimizer.
show more ...
|
#
5feb32ba |
| 25-Jun-2024 |
Vikram Hegde <115221833+vikramRH@users.noreply.github.com> |
[AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (#89217)
This patch is intended to be the first of a series with end goal to
adapt atomic optimizer pass t
[AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (#89217)
This patch is intended to be the first of a series with end goal to
adapt atomic optimizer pass to support i64 and f64 operations (along
with removing all unnecessary bitcasts). This legalizes 64 bit readlane,
writelane and readfirstlane ops pre-ISel
---------
Co-authored-by: vikramRH <vikhegde@amd.com>
show more ...
|
#
d75f9dd1 |
| 24-Jun-2024 |
Stephen Tozer <stephen.tozer@sony.com> |
Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)"
Reverts the above commit, as it updates a common header function and did not update all callsites:
https://lab.llvm.org/buildbot
Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)"
Reverts the above commit, as it updates a common header function and did not update all callsites:
https://lab.llvm.org/buildbot/#/builders/29/builds/382
This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.
show more ...
|
#
6481dc57 |
| 24-Jun-2024 |
Stephen Tozer <stephen.tozer@sony.com> |
[IR][NFC] Update IRBuilder to use InsertPosition (#96497)
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
[IR][NFC] Update IRBuilder to use InsertPosition (#96497)
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.
This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.
show more ...
|
Revision tags: llvmorg-18.1.8 |
|
#
18ec885a |
| 10-Jun-2024 |
Jay Foad <jay.foad@amd.com> |
[RFC][AMDGPU] Remove old llvm.amdgcn.buffer.* and tbuffer intrinsics (#93801)
They have been superseded by llvm.amdgcn.raw.buffer.* and
llvm.amdgcn.struct.buffer.*.
|
Revision tags: llvmorg-18.1.7, llvmorg-18.1.6 |
|
#
e2d17a05 |
| 09-May-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Build lane intrinsics in a mangling-agnostic way. NFC. (#91583)
Use the form of CreateIntrinsic that takes an explicit return type and
works out the mangling based on that and the types of
[AMDGPU] Build lane intrinsics in a mangling-agnostic way. NFC. (#91583)
Use the form of CreateIntrinsic that takes an explicit return type and
works out the mangling based on that and the types of the arguments. The
advantage is that this still works if intrinsics are changed to have
type mangling, e.g. if readlane/readfirstlane/writelane are changed to
work on any type.
show more ...
|
Revision tags: llvmorg-18.1.5 |
|
#
fcdb2203 |
| 18-Apr-2024 |
Pierre van Houtryve <pierre.vanhoutryve@amd.com> |
[AMDGPU][AtomicOptimizer] Fix DT update for divergent values with Iterative strategy (#87605)
We take the terminator from EntryBB and put it in ComputeEnd. Make sure
we also move the DT edges, we p
[AMDGPU][AtomicOptimizer] Fix DT update for divergent values with Iterative strategy (#87605)
We take the terminator from EntryBB and put it in ComputeEnd. Make sure
we also move the DT edges, we previously only did it assuming a
non-conditional branch.
Fixes SWDEV-453943
show more ...
|
Revision tags: llvmorg-18.1.4, llvmorg-18.1.3 |
|
#
e1a8120a |
| 22-Mar-2024 |
Pravin Jagtap <prjagtap@amd.com> |
[AMDGPU] Support double type in atomic optimizer. (#84307)
Presently the atomic optimizer supports only 32-bit operations. Plan is
to extend the atomic optimizer for 64-bit operations for compute a
[AMDGPU] Support double type in atomic optimizer. (#84307)
Presently the atomic optimizer supports only 32-bit operations. Plan is
to extend the atomic optimizer for 64-bit operations for compute and
graphics. This patch extends support for double type for `uniform
values` only. Going forward, will extend the support for divergent
values. Adding support for divergent values requires
extending/legalizing readfirstlane, readlane, writelane, etc ops for
64-bit operations to avoid `bitcast` noise that we have currently.
---------
Authored-by: Pravin Jagtap <Pravin.Jagtap@amd.com>
show more ...
|
Revision tags: llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0 |
|
#
3755ea93 |
| 13-Sep-2023 |
Pravin Jagtap <prjagtap@amd.com> |
[AMDGPU] Fix scan of atomicFSub in AtomicOptimizer. (#66082)
[D156301](https://reviews.llvm.org/D156301) introduced atomic
optimizations for FAdd/FSub. For FSub, reduction/scan needs to be
perform
[AMDGPU] Fix scan of atomicFSub in AtomicOptimizer. (#66082)
[D156301](https://reviews.llvm.org/D156301) introduced atomic
optimizations for FAdd/FSub. For FSub, reduction/scan needs to be
performed using add operation (`not sub`) and memory location will be
updated by reduced value using atomic sub later by only one lane.
---------
Authored-by: Pravin Jagtap <Pravin.Jagtap@amd.com>
show more ...
|
#
e54277fa |
| 11-Sep-2023 |
Jeremy Morse <jeremy.morse@sony.com> |
[NFC][RemoveDIs] Use iterators over inst-pointers when using IRBuilder
This patch adds a two-argument SetInsertPoint method to IRBuilder that takes a block/iterator instead of an instruction, and up
[NFC][RemoveDIs] Use iterators over inst-pointers when using IRBuilder
This patch adds a two-argument SetInsertPoint method to IRBuilder that takes a block/iterator instead of an instruction, and updates many call sites to use it. The motivating reason for doing this is given here [0], we'd like to pass around more information about the position of debug-info in the iterator object. That necessitates passing iterators around most of the time.
[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
Differential Revision: https://reviews.llvm.org/D152468
show more ...
|
Revision tags: llvmorg-17.0.0-rc4 |
|
#
edb9fab3 |
| 30-Aug-2023 |
Pravin Jagtap <Pravin.Jagtap@amd.com> |
[AMDGPU] Support FMin/FMax in AMDGPUAtomicOptimizer.
Reviewed By: arsenm, #amdgpu
Differential Revision: https://reviews.llvm.org/D157388
|