Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2 |
|
#
924a64a3 |
| 07-Oct-2024 |
Pierre van Houtryve <pierre.vanhoutryve@amd.com> |
[AMDGPU] Only emit SCOPE_SYS global_wb (#110636)
global_wb with scopes lower than SCOPE_SYS is unnecessary for
correctness.
I was initially optimistic they would be very cheap no-ops but they ca
[AMDGPU] Only emit SCOPE_SYS global_wb (#110636)
global_wb with scopes lower than SCOPE_SYS is unnecessary for
correctness.
I was initially optimistic they would be very cheap no-ops but they can
actually be quite expensive so let's avoid them.
show more ...
|
Revision tags: llvmorg-19.1.1, llvmorg-19.1.0 |
|
#
eaac4a26 |
| 09-Sep-2024 |
Pierre van Houtryve <pierre.vanhoutryve@amd.com> |
[AMDGPU] Document & Finalize GFX12 Memory Model (#98599)
Documents the memory model implemented as of #98591, with some
fixes/optimizations to the implementation.
|
Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init |
|
#
7b28cc0c |
| 22-Jul-2024 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Query MachineModuleInfo from PM instead of MachineFunction (#99679)
|
#
74b87b02 |
| 16-Jul-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Fix and add namespace closing comments. NFC.
|
#
b3a44665 |
| 16-Jul-2024 |
Pierre van Houtryve <pierre.vanhoutryve@amd.com> |
[AMDGPU] Implement GFX12 Memory Model (#98591)
- Emit GLOBAL_WB instructions
- Reflect synscope on instructions's `scope:` operand
Fixes SWDEV-468508
Fixes SWDEV-470735
Fixes SWDEV-468392
Fix
[AMDGPU] Implement GFX12 Memory Model (#98591)
- Emit GLOBAL_WB instructions
- Reflect synscope on instructions's `scope:` operand
Fixes SWDEV-468508
Fixes SWDEV-470735
Fixes SWDEV-468392
Fixes SWDEV-469622
show more ...
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7 |
|
#
c1ac6d2d |
| 27-May-2024 |
Pierre van Houtryve <pierre.vanhoutryve@amd.com> |
[AMDGPU] Add amdgpu-as MMRA for fences (#78572)
Using MMRAs, allow `builtin_amdgcn_fence` to emit fences that only
target one or more address spaces, instead of fencing all address spaces
at once.
[AMDGPU] Add amdgpu-as MMRA for fences (#78572)
Using MMRAs, allow `builtin_amdgcn_fence` to emit fences that only
target one or more address spaces, instead of fencing all address spaces
at once.
This is done through a `amdgpu-as` MMRA. Currently focused on OpenCL
fences, but can very easily support more AS names and codegen on more
than just fences.
show more ...
|
Revision tags: llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1 |
|
#
1fd1f4c0 |
| 06-Mar-2024 |
Mirko Brkušanin <Mirko.Brkusanin@amd.com> |
[AMDGPU] Handle amdgpu.last.use metadata (#83816)
Convert !amdgpu.last.use metadata into MachineMemOperand for last use
and handle it in SIMemoryLegalizer similar to nontemporal and volatile.
|
#
27ce5121 |
| 04-Mar-2024 |
Mirko Brkušanin <Mirko.Brkusanin@amd.com> |
[AMDGPU] Fix setting nontemporal in memory legalizer (#83815)
Iterator MI can advance in insertWait() but we need original instruction
to set temporal hint. Just move it before handling volatile.
|
#
3e35ba53 |
| 28-Feb-2024 |
Petar Avramovic <Petar.Avramovic@amd.com> |
AMDGPU/GFX12: Insert waitcnts before stores with scope_sys (#82996)
Insert waitcnts for loads and atomics before stores with system scope.
Scope is field in instruction encoding and corresponds to
AMDGPU/GFX12: Insert waitcnts before stores with scope_sys (#82996)
Insert waitcnts for loads and atomics before stores with system scope.
Scope is field in instruction encoding and corresponds to desired
coherence level in cache hierarchy.
Intrinsic stores can set scope in cache policy operand.
If volatile keyword is used on generic stores memory legalizer will set
scope to system. Generic stores, by default, get lowest scope level.
Waitcnts are not required if it is guaranteed that memory is cached.
For example vulkan shaders can guarantee this.
TODO: implement flag for frontends to give us a hint not to insert
waits.
Expecting vulkan flag to be implemented as vulkan:private MMRA.
show more ...
|
Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3 |
|
#
87d77119 |
| 13-Feb-2024 |
Pierre van Houtryve <pierre.vanhoutryve@amd.com> |
[AMDGPU][SIMemoryLegalizer] Fix order of GL0/1_INV on GFX10/11 (#81450)
Fixes SWDEV-443292
|
Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init |
|
#
ba52f06f |
| 18-Jan-2024 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] CodeGen for GFX12 S_WAIT_* instructions (#77438)
Update SIMemoryLegalizer and SIInsertWaitcnts to use separate wait
instructions per counter (e.g. S_WAIT_LOADCNT) and split VMCNT into
sep
[AMDGPU] CodeGen for GFX12 S_WAIT_* instructions (#77438)
Update SIMemoryLegalizer and SIInsertWaitcnts to use separate wait
instructions per counter (e.g. S_WAIT_LOADCNT) and split VMCNT into
separate LOADCNT, SAMPLECNT and BVHCNT counters.
show more ...
|
#
7ca4473d |
| 08-Jan-2024 |
Mirko Brkušanin <Mirko.Brkusanin@amd.com> |
[AMDGPU] Add new cache flushing instructions for GFX12 (#76944)
Co-authored-by: Diana Picus <Diana-Magda.Picus@amd.com>
|
#
ef067f52 |
| 15-Dec-2023 |
Pierre van Houtryve <pierre.vanhoutryve@amd.com> |
[AMDGPU][SIInsertWaitcnts] Do not add s_waitcnt when the counters are known to be 0 already (#72830)
Co-authored-by: Juan Manuel MARTINEZ CAAMAÑO <juamarti@amd.com>
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4 |
|
#
42bd8141 |
| 12-May-2023 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
AMDGPU: Force sc0 and sc1 on stores for gfx940 and gfx941
Differential Revision: https://reviews.llvm.org/D149986
|
Revision tags: llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4 |
|
#
59162e38 |
| 07-Mar-2023 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Skip buffer_wbl2 before atomic fence acquire
Memory models for gfx90a and gfx940 do not require buffer_wbl2 before the fence for acquire ordering, but we do insert the full release.
Fixes:
[AMDGPU] Skip buffer_wbl2 before atomic fence acquire
Memory models for gfx90a and gfx940 do not require buffer_wbl2 before the fence for acquire ordering, but we do insert the full release.
Fixes: SWDEV-386785
Differential Revision: https://reviews.llvm.org/D145524
show more ...
|
Revision tags: llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2 |
|
#
8e3d7cf5 |
| 07-Feb-2023 |
Archibald Elliott <archibald.elliott@arm.com> |
[NFC][TargetParser] Remove llvm/Support/TargetParser.h
|
Revision tags: llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
21c4dc79 |
| 17-Dec-2022 |
Fangrui Song <i@maskray.me> |
std::optional::value => operator*/operator->
value() has undesired exception checking semantics and calls __throw_bad_optional_access in libc++. Moreover, the API is unavailable without _LIBCPP_NO_E
std::optional::value => operator*/operator->
value() has undesired exception checking semantics and calls __throw_bad_optional_access in libc++. Moreover, the API is unavailable without _LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
This fixes clang.
show more ...
|
#
6443c0ee |
| 12-Dec-2022 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Stop using make_pair and make_tuple. NFC.
C++17 allows us to call constructors pair and tuple instead of helper functions make_pair and make_tuple.
Differential Revision: https://reviews.l
[AMDGPU] Stop using make_pair and make_tuple. NFC.
C++17 allows us to call constructors pair and tuple instead of helper functions make_pair and make_tuple.
Differential Revision: https://reviews.llvm.org/D139828
show more ...
|
#
67819a72 |
| 13-Dec-2022 |
Fangrui Song <i@maskray.me> |
[CodeGen] llvm::Optional => std::optional
|
#
8a7cbea5 |
| 09-Dec-2022 |
Kazu Hirata <kazu@google.com> |
[llvm] Use std::nullopt instead of None in comments (NFC)
This is part of an effort to migrate from llvm::Optional to std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalu
[llvm] Use std::nullopt instead of None in comments (NFC)
This is part of an effort to migrate from llvm::Optional to std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
show more ...
|
#
20cde154 |
| 03-Dec-2022 |
Kazu Hirata <kazu@google.com> |
[Target] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of
[Target] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional.
This is part of an effort to migrate from llvm::Optional to std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
show more ...
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0 |
|
#
ee761374 |
| 02-Sep-2022 |
Juan Manuel MARTINEZ CAAMAÑO <juamarti@amd.com> |
[AMDGPU][NFC] Fix typo in commment: replace SiMemOpInfo by SIMemOpInfo
|
Revision tags: llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
#
611ffcf4 |
| 14-Jul-2022 |
Kazu Hirata <kazu@google.com> |
[llvm] Use value instead of getValue (NFC)
|
#
3b7c3a65 |
| 25-Jun-2022 |
Kazu Hirata <kazu@google.com> |
Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.
|
#
aa8feeef |
| 25-Jun-2022 |
Kazu Hirata <kazu@google.com> |
Don't use Optional::hasValue (NFC)
|