History log of /llvm-project/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp (Results 26 – 50 of 167)
Revision Date Author Comments
Revision tags: llvmorg-15.0.7
# 6443c0ee 12-Dec-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] Stop using make_pair and make_tuple. NFC.

C++17 allows us to call constructors pair and tuple instead of helper
functions make_pair and make_tuple.

Differential Revision: https://reviews.llvm.org/D139828
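
A minimal sketch of the idiom this change adopts, not taken from the patch itself. The helper names `makeAddress`, `makeKey`, and `checkCtad` are hypothetical and exist only for illustration; the point is that C++17 class template argument deduction lets `std::pair(...)` and `std::tuple(...)` deduce their element types just as the helper functions did.

```cpp
#include <cassert>
#include <tuple>
#include <utility>

// Hypothetical helper: returns a (base, offset) pair.
std::pair<int, unsigned> makeAddress(int Base, unsigned Offset) {
  // Before C++17: return std::make_pair(Base, Offset);
  return std::pair(Base, Offset); // element types are deduced
}

// Hypothetical helper: builds a three-field key.
std::tuple<int, unsigned, bool> makeKey(int Class, unsigned Width, bool Flag) {
  // Before C++17: return std::make_tuple(Class, Width, Flag);
  return std::tuple(Class, Width, Flag);
}

// Usage: the constructor spelling produces the same values as the helpers.
void checkCtad() {
  assert(makeAddress(4, 8) == std::make_pair(4, 8u));
  assert(makeKey(1, 2, true) == std::make_tuple(1, 2u, true));
}
```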


# 67819a72 13-Dec-2022 Fangrui Song <i@maskray.me>

[CodeGen] llvm::Optional => std::optional


# 20cde154 03-Dec-2022 Kazu Hirata <kazu@google.com>

[Target] Use std::nullopt instead of None (NFC)

This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
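
As a rough illustration of the mechanical replacement described above (a sketch, not code from the patch): a function that used to return `None` from an `llvm::Optional` returns `std::nullopt` from a `std::optional` instead. The function `findIndex` and its usage helper are hypothetical.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical lookup: index of Needle in Table, or an empty optional.
std::optional<unsigned> findIndex(const std::vector<int> &Table, int Needle) {
  for (unsigned I = 0; I != Table.size(); ++I)
    if (Table[I] == Needle)
      return I;
  // Before the migration this line would have read: return None;
  return std::nullopt;
}

// Usage: callers test the optional before reading the value.
void checkFindIndex() {
  assert(findIndex({3, 5, 7}, 5).has_value());
  assert(!findIndex({3, 5, 7}, 9).has_value());
}
```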


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5
# 1b560e6a 14-Nov-2022 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][MC] Support TFE modifiers in MUBUF loads and stores.

Reviewed By: dp, arsenm

Differential Revision: https://reviews.llvm.org/D137783


# 7425077e 07-Nov-2022 Pierre van Houtryve <pierre.vanhoutryve@amd.com>

[AMDGPU] Add & use `hasNamedOperand`, NFC

In a lot of places, we were just calling `getNamedOperandIdx` to check if the result was != or == to -1.
This is fine in itself, but it's verbose and doesn't make the intention clear, IMHO. I added a `hasNamedOperand` and replaced all cases I could find with regexes and manually.

Reviewed By: arsenm, foad

Differential Revision: https://reviews.llvm.org/D137540
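
The shape of the wrapper can be sketched with a toy model. Everything in the `toy` namespace below is an assumption made for illustration: the operand table, the integer opcodes, and the string operand names are invented and do not reflect the real AMDGPU signatures. Only the convention that -1 means "no such operand" comes from the message above.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

namespace toy {
// Invented operand table: (opcode, operand name) -> operand index.
inline const std::map<std::pair<int, std::string>, int> OperandTable = {
    {{1, "offset"}, 2},
    {{1, "vaddr"}, 0},
    {{2, "vaddr"}, 0},
};

// Returns the operand index, or -1 when the opcode has no such operand.
int getNamedOperandIdx(int Opcode, const std::string &Name) {
  assert(!Name.empty() && "operand name must not be empty");
  auto It = OperandTable.find({Opcode, Name});
  return It == OperandTable.end() ? -1 : It->second;
}

// States the intent directly instead of comparing against -1 at each call site.
bool hasNamedOperand(int Opcode, const std::string &Name) {
  return getNamedOperandIdx(Opcode, Name) != -1;
}
} // namespace toy
```

With such a wrapper, a call site reads `if (toy::hasNamedOperand(Opc, "offset"))` rather than `if (toy::getNamedOperandIdx(Opc, "offset") != -1)`.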


Revision tags: llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1
# 693f8162 15-Sep-2022 Ivan Kosarev <ivan.kosarev@amd.com>

[AMDGPU][SILoadStoreOptimizer] Merge SGPR_IMM scalar buffer loads.

Reviewed By: foad, rampitec

Differential Revision: https://reviews.llvm.org/D133787


Revision tags: llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2
# de9d80c1 08-Aug-2022 Fangrui Song <i@maskray.me>

[llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC

With C++17 there is no Clang pedantic warning or MSVC C5051.
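
For reference, a small sketch of the standard attribute in use. The function `countSteps` is hypothetical and unrelated to the optimizer; it only shows where `[[fallthrough]]` replaces the old macro.

```cpp
#include <cassert>

// Hypothetical function: counts how many case labels are passed through.
unsigned countSteps(unsigned Width) {
  unsigned Steps = 0;
  switch (Width) {
  case 4:
    ++Steps;
    [[fallthrough]]; // previously spelled LLVM_FALLTHROUGH
  case 2:
    ++Steps;
    [[fallthrough]];
  case 1:
    ++Steps;
    break;
  default:
    break;
  }
  return Steps;
}

// Usage: each width falls through all smaller cases.
void checkCountSteps() {
  assert(countSteps(4) == 3);
  assert(countSteps(1) == 1);
}
```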


Revision tags: llvmorg-15.0.0-rc1
# 4c4db816 30-Jul-2022 Carl Ritson <carl.ritson@amd.com>

[AMDGPU] Extend SILoadStoreOptimizer to s_load instructions

Apply merging to s_load as is done for s_buffer_load.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D130742


Revision tags: llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2
# 33fb23f7 24-Feb-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Merge flat with global in the SILoadStoreOptimizer

Flat can be merged with flat global since address cast is a no-op.
A combined memory operation needs to be promoted to flat.

Differential Revision: https://reviews.llvm.org/D120431


# 517171ce 24-Feb-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Extend SILoadStoreOptimizer to handle flat load/stores

TODO: merge flat with global promoting to flat.

Differential Revision: https://reviews.llvm.org/D120351


# 3279e440 22-Feb-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Extend SILoadStoreOptimizer to handle global stores

TODO: merge flat load/stores.
TODO: merge flat with global promoting to flat.

Differential Revision: https://reviews.llvm.org/D120346


# cefa1c5c 23-Feb-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Fix combined MMO in load-store merge

Loads and stores can be out of order in the SILoadStoreOptimizer.
When combining MachineMemOperands of two instructions operands are
sent in the IR order into the combineKnownAdjacentMMOs. At the
moment it picks the first operand and just replaces its offset and
size. This essentially loses alignment information and may generally
result in an incorrect base pointer to be used.

Use a base pointer in memory addresses order instead and only adjust
size.

Differential Revision: https://reviews.llvm.org/D120370
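
The idea of the fix can be sketched with a toy model. `ToyMMO` and `combineAdjacent` are hypothetical stand-ins and not the real `MachineMemOperand` or `combineKnownAdjacentMMOs` interfaces; the sketch assumes each operand is described only by a byte offset, a size, and an alignment.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a memory operand: byte offset from a common base pointer,
// access size in bytes, and known alignment in bytes.
struct ToyMMO {
  std::int64_t Offset;
  std::uint64_t Size;
  std::uint64_t Align;
};

// Combine two adjacent accesses that may arrive in either order.
ToyMMO combineAdjacent(const ToyMMO &A, const ToyMMO &B) {
  // Order by memory address, not by the order the operands were passed in.
  const ToyMMO &Lo = A.Offset <= B.Offset ? A : B;
  const ToyMMO &Hi = A.Offset <= B.Offset ? B : A;
  assert(Lo.Offset + static_cast<std::int64_t>(Lo.Size) == Hi.Offset &&
         "accesses must be adjacent");
  // Keep the lower access's offset and alignment; only the size changes.
  return {Lo.Offset, Lo.Size + Hi.Size, Lo.Align};
}
```

Passing the higher-addressed access first, as in `combineAdjacent({8, 4, 4}, {0, 8, 8})`, still yields a combined operand based at offset 0 with the lower access's alignment.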


# 9e055c0f 21-Feb-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Extend SILoadStoreOptimizer to handle global saddr loads

This adds handling of the _SADDR forms to the GLOBAL_LOAD combining.

TODO: merge global stores.
TODO: merge flat load/stores.
TODO: merge flat with global promoting to flat.

Differential Revision: https://reviews.llvm.org/D120285


# ba17bd26 21-Feb-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Extend SILoadStoreOptimizer to handle global loads

There can be situations where global and flat loads and stores are not
combined by the vectorizer, in particular if their address spaces
differ in the IR but they end up as the same class of instructions after
selection. For example, a divergent load from the constant address space
ends up being the same global_load as a load from the global address space.

TODO: merge global stores.
TODO: handle SADDR forms.
TODO: merge flat load/stores.
TODO: merge flat with global promoting to flat.

Differential Revision: https://reviews.llvm.org/D120279


# dc098156 21-Feb-2022 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Remove redundand check in the SILoadStoreOptimizer

Differential Revision: https://reviews.llvm.org/D120268


Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init
# 359a792f 28-Jan-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] SILoadStoreOptimizer: avoid unbounded register pressure increases

Previously when combining two loads this pass would sink the
first one down to the second one, putting the combined load
where the second one was. It would also sink any intervening
instructions which depended on the first load down to just
after the combined load.

For example, if we started with this sequence of
instructions (code flowing from left to right):

X A B C D E F Y

After combining loads X and Y into XY we might end up with:

A B C D E F XY

But if B D and F depended on X, we would get:

A C E XY B D F

Now if the original code had some short disjoint live ranges
from A to B, C to D and E to F, in the transformed code
these live ranges will be long and overlapping. In this way
a single merge of two loads could cause an unbounded
increase in register pressure.

To fix this, change the way that loads are moved in
order to merge them so that:
- The second load is moved up to the first one. (But when
  merging stores, we still move the first store down to the
  second one.)
- Intervening instructions are never moved.
- Instead, if we find an intervening instruction that would
  need to be moved, give up on the merge. But this case
  should now be pretty rare because normal stores have no
  outputs, and normal loads only have address register
  inputs, but these will be identical for any pair of loads
  that we try to merge.

As well as fixing the unbounded register pressure increase
problem, moving loads up and stores down seems like it
should usually be a win for memory latency reasons.

Differential Revision: https://reviews.llvm.org/D119006
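
The two placement schemes for loads can be sketched on the X ... Y example above. This is a toy model over instruction names, not the pass's code: `mergeBySinking`, `mergeByHoisting`, and the dependency sets passed to them are hypothetical, and the first and last elements of the sequence are assumed to be the two loads being merged.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <set>
#include <string>
#include <vector>

using Seq = std::vector<std::string>;

// Old scheme: sink the first load down to the second one, along with every
// intervening instruction that depends on the first load.
Seq mergeBySinking(const Seq &Insts, const std::set<std::string> &DependsOnFirst) {
  assert(Insts.size() >= 2);
  Seq Result, Sunk;
  for (std::size_t I = 1; I + 1 < Insts.size(); ++I)
    (DependsOnFirst.count(Insts[I]) ? Sunk : Result).push_back(Insts[I]);
  Result.push_back(Insts.front() + Insts.back()); // the merged load
  Result.insert(Result.end(), Sunk.begin(), Sunk.end());
  return Result;
}

// New scheme: hoist the second load up to the first one. Intervening
// instructions never move; if one would have to, the merge is abandoned.
std::optional<Seq> mergeByHoisting(const Seq &Insts,
                                   const std::set<std::string> &MustMove) {
  assert(Insts.size() >= 2);
  Seq Result{Insts.front() + Insts.back()};
  for (std::size_t I = 1; I + 1 < Insts.size(); ++I) {
    if (MustMove.count(Insts[I]))
      return std::nullopt; // give up rather than reorder
    Result.push_back(Insts[I]);
  }
  return Result;
}
```

In this model, `mergeBySinking({"X","A","B","C","D","E","F","Y"}, {"B","D","F"})` reproduces the A C E XY B D F order from the message, while `mergeByHoisting` leaves A through F in their original order after XY.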


# 6527b2a4 18-Feb-2022 Sebastian Neubauer <Sebastian.Neubauer@amd.com>

[AMDGPU][NFC] Fix typos

Fix some typos in the amdgpu backend.

Differential Revision: https://reviews.llvm.org/D119235


# a456ace9 27-Jan-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] SILoadStoreOptimizer: rewrite checkAndPrepareMerge. NFCI.

Separate the function clearly into:
- Checks that can be done on CI and Paired before the loop.
- The loop over all instructions between CI and Paired.
- Checks that must be done on InstsToMove after the loop.

Previously these were mostly done inside the loop in a very
confusing way.

Differential Revision: https://reviews.llvm.org/D118994


# 001cb431 04-Feb-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] SILoadStoreOptimizer: fewer calls to offsetsCanBeCombined

Only call offsetsCanBeCombined with Modify = true in cases
where it will really do something. NFC.


# 00bbda07 28-Jan-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] SILoadStoreOptimizer: simplify class/subclass checks

Also add a comment explaining the difference between class
and subclass. NFCI.


# 33ef8bdf 04-Feb-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] SILoadStoreOptimizer: simplify optimizeInstsWithSameBaseAddr

Common up all the calls to CI.setMI. NFCI.


# ca05edd9 04-Feb-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] SILoadStoreOptimizer: simplify OptimizeListAgain test

At this point CI represents the combined access (original CI combined
with Paired) so it doesn't make any sense to add in Paired.width again.
NFCI.


# 68e39462 27-Jan-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] SILoadStoreOptimizer: break lists on instructions with side effects

This just helps to keep the lists shorter and faster to sort. NFCI.

Differential Revision: https://reviews.llvm.org/D118384
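
One way to picture the list breaking, as a sketch under stated assumptions: `ToyInst` and `splitAtSideEffects` are hypothetical, instructions are reduced to an id and a side-effect flag, and a side-effecting instruction is assumed to end the current list without joining any list.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Toy instruction: an id plus a flag for "has side effects".
struct ToyInst {
  int Id;
  bool HasSideEffects;
};

// Split a block into lists of merge candidates, starting a new list at
// every instruction with side effects.
std::vector<std::vector<int>> splitAtSideEffects(const std::vector<ToyInst> &Insts) {
  std::vector<std::vector<int>> Lists;
  std::vector<int> Current;
  for (const ToyInst &I : Insts) {
    if (I.HasSideEffects) {
      if (!Current.empty())
        Lists.push_back(std::move(Current));
      Current.clear(); // restore a known-empty state after the move
      continue;
    }
    Current.push_back(I.Id);
  }
  if (!Current.empty())
    Lists.push_back(std::move(Current));
  assert(Current.empty() || !Lists.empty());
  return Lists;
}
```

Each resulting list is shorter than one list for the whole block, so sorting each list separately does less work.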


# 4b133cee 27-Jan-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] SILoadStoreOptimizer: reject AGPR DS_WRITE sooner

Rejecting AGPR DS_WRITE instructions before adding them to any mergeable
list seems cleaner than adding them to the list and rejecting them
later.

Differential Revision: https://reviews.llvm.org/D118368


# 94a4594c 27-Jan-2022 Jay Foad <jay.foad@amd.com>

[AMDGPU] SILoadStoreOptimizer: use separate lists for AGPR instructions

Using separate lists for AGPR and non-AGPR instructions seems like a
cleaner solution than putting them all in the same list and then later
refusing to merge instructions of different AGPR-ness.

Differential Revision: https://reviews.llvm.org/D118367

