Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
30dd1297 |
| 04-Nov-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Custom expand flat cmpxchg which may access private (#109410)
64-bit flat cmpxchg instructions do not work correctly for scratch addresses, and need to be expanded as non-atomic.
Allow cust
AMDGPU: Custom expand flat cmpxchg which may access private (#109410)
64-bit flat cmpxchg instructions do not work correctly for scratch addresses, and need to be expanded as non-atomic.
Allow custom expansion of cmpxchg in AtomicExpand, as is already the case for atomicrmw.
show more ...
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0 |
|
#
4af249fe |
| 06-Sep-2024 |
anjenner <161845516+anjenner@users.noreply.github.com> |
Add usub_cond and usub_sat operations to atomicrmw (#105568)
These both perform conditional subtraction, returning the minuend and
zero respectively, if the difference is negative.
|
Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3 |
|
#
5ececb47 |
| 14-Aug-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
LowerAtomic: Use explicit alignment in lowerAtomicCmpXchgInst (#103767)
|
Revision tags: llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init |
|
#
3701ebe7 |
| 09-Jul-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AtomicExpand: Fix expanding atomics into unconstrained FP in strictfp functions
Ideally the normal fadd/fmin/fmax this was creating would fail the verifier. It's probably also necessary to force off
AtomicExpand: Fix expanding atomics into unconstrained FP in strictfp functions
Ideally the normal fadd/fmin/fmax this was creating would fail the verifier. It's probably also necessary to force off FP exception handlers in the cmpxchg loop but we don't have a generic way to do that now.
Note strictfp builder is broken in the minnum/maxnum case
https://reviews.llvm.org/D154993
show more ...
|
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2 |
|
#
a20f7efb |
| 15-Apr-2023 |
Bjorn Pettersson <bjorn.a.pettersson@ericsson.com> |
Remove several no longer needed includes. NFCI
Mostly removing includes of InitializePasses.h and Pass.h in passes that no longer has support for the legacy PM.
|
Revision tags: llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5 |
|
#
778cf543 |
| 03-Nov-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
IR: Add atomicrmw uinc_wrap and udec_wrap
These are essentially add/sub 1 with a clamping value.
AMDGPU has instructions for these. CUDA/HIP expose these as atomicInc/atomicDec. Currently we use ta
IR: Add atomicrmw uinc_wrap and udec_wrap
These are essentially add/sub 1 with a clamping value.
AMDGPU has instructions for these. CUDA/HIP expose these as atomicInc/atomicDec. Currently we use target intrinsics for these, but those do no carry the ordering and syncscope. Add these to atomicrmw so we can carry these and benefit from the regular legalization processes.
show more ...
|
Revision tags: llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
#
9df0b254 |
| 23-Jul-2022 |
Nuno Lopes <nuno.lopes@tecnico.ulisboa.pt> |
[NFC] Switch a few uses of undef to poison as placeholders for unreachable code
|
#
1023ddaf |
| 06-Jul-2022 |
Shilei Tian <i@tianshilei.me> |
[LLVM] Add the support for fmax and fmin in atomicrmw instruction
This patch adds the support for `fmax` and `fmin` operations in `atomicrmw` instruction. For now (at least in this patch), the instr
[LLVM] Add the support for fmax and fmin in atomicrmw instruction
This patch adds the support for `fmax` and `fmin` operations in `atomicrmw` instruction. For now (at least in this patch), the instruction will be expanded to CAS loop. There are already a couple of targets supporting the feature. I'll create another patch(es) to enable them accordingly.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D127041
show more ...
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
#
9fdd2584 |
| 06-Apr-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Transforms: Fix code duplication between LowerAtomic and AtomicExpand
|
#
ff485d72 |
| 07-Apr-2022 |
Benjamin Kramer <benny.kra@googlemail.com> |
Transforms: Remove unused include
Utils can't depend on Scalar transforms.
|
#
39f15686 |
| 05-Apr-2022 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Transforms: Split LowerAtomics into separate Utils and pass
This will allow code sharing from AtomicExpandPass. Not entirely sure why these exist as separate passes though.
|