History log of /llvm-project/llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp (Results 1 – 25 of 86)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2
# 924a64a3 07-Oct-2024 Pierre van Houtryve <pierre.vanhoutryve@amd.com>

[AMDGPU] Only emit SCOPE_SYS global_wb (#110636)

global_wb with scopes lower than SCOPE_SYS is unnecessary for
correctness.

I was initially optimistic they would be very cheap no-ops but they ca

[AMDGPU] Only emit SCOPE_SYS global_wb (#110636)

global_wb with scopes lower than SCOPE_SYS is unnecessary for
correctness.

I was initially optimistic they would be very cheap no-ops but they can
actually be quite expensive so let's avoid them.

show more ...


Revision tags: llvmorg-19.1.1, llvmorg-19.1.0
# eaac4a26 09-Sep-2024 Pierre van Houtryve <pierre.vanhoutryve@amd.com>

[AMDGPU] Document & Finalize GFX12 Memory Model (#98599)

Documents the memory model implemented as of #98591, with some
fixes/optimizations to the implementation.


Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 7b28cc0c 22-Jul-2024 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Query MachineModuleInfo from PM instead of MachineFunction (#99679)


# 74b87b02 16-Jul-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] Fix and add namespace closing comments. NFC.


# b3a44665 16-Jul-2024 Pierre van Houtryve <pierre.vanhoutryve@amd.com>

[AMDGPU] Implement GFX12 Memory Model (#98591)

- Emit GLOBAL_WB instructions
- Reflect synscope on instructions's `scope:` operand

Fixes SWDEV-468508
Fixes SWDEV-470735
Fixes SWDEV-468392
Fix

[AMDGPU] Implement GFX12 Memory Model (#98591)

- Emit GLOBAL_WB instructions
- Reflect synscope on instructions's `scope:` operand

Fixes SWDEV-468508
Fixes SWDEV-470735
Fixes SWDEV-468392
Fixes SWDEV-469622

show more ...


Revision tags: llvmorg-18.1.8, llvmorg-18.1.7
# c1ac6d2d 27-May-2024 Pierre van Houtryve <pierre.vanhoutryve@amd.com>

[AMDGPU] Add amdgpu-as MMRA for fences (#78572)

Using MMRAs, allow `builtin_amdgcn_fence` to emit fences that only
target one or more address spaces, instead of fencing all address spaces
at once.

[AMDGPU] Add amdgpu-as MMRA for fences (#78572)

Using MMRAs, allow `builtin_amdgcn_fence` to emit fences that only
target one or more address spaces, instead of fencing all address spaces
at once.

This is done through a `amdgpu-as` MMRA. Currently focused on OpenCL
fences, but can very easily support more AS names and codegen on more
than just fences.

show more ...


Revision tags: llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1
# 1fd1f4c0 06-Mar-2024 Mirko Brkušanin <Mirko.Brkusanin@amd.com>

[AMDGPU] Handle amdgpu.last.use metadata (#83816)

Convert !amdgpu.last.use metadata into MachineMemOperand for last use
and handle it in SIMemoryLegalizer similar to nontemporal and volatile.


# 27ce5121 04-Mar-2024 Mirko Brkušanin <Mirko.Brkusanin@amd.com>

[AMDGPU] Fix setting nontemporal in memory legalizer (#83815)

Iterator MI can advance in insertWait() but we need original instruction
to set temporal hint. Just move it before handling volatile.


# 3e35ba53 28-Feb-2024 Petar Avramovic <Petar.Avramovic@amd.com>

AMDGPU/GFX12: Insert waitcnts before stores with scope_sys (#82996)

Insert waitcnts for loads and atomics before stores with system scope.
Scope is field in instruction encoding and corresponds to

AMDGPU/GFX12: Insert waitcnts before stores with scope_sys (#82996)

Insert waitcnts for loads and atomics before stores with system scope.
Scope is field in instruction encoding and corresponds to desired
coherence level in cache hierarchy.
Intrinsic stores can set scope in cache policy operand.
If volatile keyword is used on generic stores memory legalizer will set
scope to system. Generic stores, by default, get lowest scope level.
Waitcnts are not required if it is guaranteed that memory is cached.
For example vulkan shaders can guarantee this.
TODO: implement flag for frontends to give us a hint not to insert
waits.
Expecting vulkan flag to be implemented as vulkan:private MMRA.

show more ...


Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3
# 87d77119 13-Feb-2024 Pierre van Houtryve <pierre.vanhoutryve@amd.com>

[AMDGPU][SIMemoryLegalizer] Fix order of GL0/1_INV on GFX10/11 (#81450)

Fixes SWDEV-443292


Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# ba52f06f 18-Jan-2024 Jay Foad <jay.foad@amd.com>

[AMDGPU] CodeGen for GFX12 S_WAIT_* instructions (#77438)

Update SIMemoryLegalizer and SIInsertWaitcnts to use separate wait
instructions per counter (e.g. S_WAIT_LOADCNT) and split VMCNT into
sep

[AMDGPU] CodeGen for GFX12 S_WAIT_* instructions (#77438)

Update SIMemoryLegalizer and SIInsertWaitcnts to use separate wait
instructions per counter (e.g. S_WAIT_LOADCNT) and split VMCNT into
separate LOADCNT, SAMPLECNT and BVHCNT counters.

show more ...


# 7ca4473d 08-Jan-2024 Mirko Brkušanin <Mirko.Brkusanin@amd.com>

[AMDGPU] Add new cache flushing instructions for GFX12 (#76944)

Co-authored-by: Diana Picus <Diana-Magda.Picus@amd.com>


# ef067f52 15-Dec-2023 Pierre van Houtryve <pierre.vanhoutryve@amd.com>

[AMDGPU][SIInsertWaitcnts] Do not add s_waitcnt when the counters are known to be 0 already (#72830)

Co-authored-by: Juan Manuel MARTINEZ CAAMAÑO <juamarti@amd.com>


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4
# 42bd8141 12-May-2023 Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>

AMDGPU: Force sc0 and sc1 on stores for gfx940 and gfx941

Differential Revision: https://reviews.llvm.org/D149986


Revision tags: llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4
# 59162e38 07-Mar-2023 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Skip buffer_wbl2 before atomic fence acquire

Memory models for gfx90a and gfx940 do not require buffer_wbl2
before the fence for acquire ordering, but we do insert the full
release.

Fixes:

[AMDGPU] Skip buffer_wbl2 before atomic fence acquire

Memory models for gfx90a and gfx940 do not require buffer_wbl2
before the fence for acquire ordering, but we do insert the full
release.

Fixes: SWDEV-386785

Differential Revision: https://reviews.llvm.org/D145524

show more ...


Revision tags: llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2
# 8e3d7cf5 07-Feb-2023 Archibald Elliott <archibald.elliott@arm.com>

[NFC][TargetParser] Remove llvm/Support/TargetParser.h


Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# 21c4dc79 17-Dec-2022 Fangrui Song <i@maskray.me>

std::optional::value => operator*/operator->

value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_E

std::optional::value => operator*/operator->

value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).

This fixes clang.

show more ...


# 6443c0ee 12-Dec-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Stop using make_pair and make_tuple. NFC.

C++17 allows us to call constructors pair and tuple instead of helper
functions make_pair and make_tuple.

Differential Revision: https://reviews.l

[AMDGPU] Stop using make_pair and make_tuple. NFC.

C++17 allows us to call constructors pair and tuple instead of helper
functions make_pair and make_tuple.

Differential Revision: https://reviews.llvm.org/D139828

show more ...


# 67819a72 13-Dec-2022 Fangrui Song <i@maskray.me>

[CodeGen] llvm::Optional => std::optional


# 8a7cbea5 09-Dec-2022 Kazu Hirata <kazu@google.com>

[llvm] Use std::nullopt instead of None in comments (NFC)

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalu

[llvm] Use std::nullopt instead of None in comments (NFC)

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716

show more ...


# 20cde154 03-Dec-2022 Kazu Hirata <kazu@google.com>

[Target] Use std::nullopt instead of None (NFC)

This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of

[Target] Use std::nullopt instead of None (NFC)

This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716

show more ...


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0
# ee761374 02-Sep-2022 Juan Manuel MARTINEZ CAAMAÑO <juamarti@amd.com>

[AMDGPU][NFC] Fix typo in commment: replace SiMemOpInfo by SIMemOpInfo


Revision tags: llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 611ffcf4 14-Jul-2022 Kazu Hirata <kazu@google.com>

[llvm] Use value instead of getValue (NFC)


# 3b7c3a65 25-Jun-2022 Kazu Hirata <kazu@google.com>

Revert "Don't use Optional::hasValue (NFC)"

This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.


# aa8feeef 25-Jun-2022 Kazu Hirata <kazu@google.com>

Don't use Optional::hasValue (NFC)


1234