History log of /llvm-project/llvm/lib/Transforms/Utils/LowerMemIntrinsics.cpp (Results 1 – 25 of 49)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init
# 6292a808 24-Jan-2025 Jeremy Morse <jeremy.morse@sony.com>

[NFC][DebugInfo] Use iterator-flavour getFirstNonPHI at many call-sites (#123737)

As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNo

[NFC][DebugInfo] Use iterator-flavour getFirstNonPHI at many call-sites (#123737)

As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety however, we'd like to
have all calls to getFirstNonPHI use the iterator-returning version.

This patch changes a bunch of call-sites calling getFirstNonPHI to use
getFirstNonPHIIt, which returns an iterator. All these call sites are
where it's obviously safe to fetch the iterator then dereference it. A
follow-up patch will contain less-obviously-safe changes.

We'll eventually deprecate and remove the instruction-pointer
getFirstNonPHI, but not before adding concise documentation of what
considerations are needed (very few).

---------

Co-authored-by: Stephen Tozer <Melamoto@gmail.com>

show more ...


Revision tags: llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 298127dc 15-Nov-2024 Alex Bradbury <asb@igalia.com>

Reapply [IR] Initial introduction of llvm.experimental.memset_pattern (#97583)

Relands 7ff3a9acd84654c9ec2939f45ba27f162ae7fbc3 after regenerating the
test case.

Supersedes the draft PR #94992, tak

Reapply [IR] Initial introduction of llvm.experimental.memset_pattern (#97583)

Relands 7ff3a9acd84654c9ec2939f45ba27f162ae7fbc3 after regenerating the
test case.

Supersedes the draft PR #94992, taking a different approach following
feedback:
* Lower in PreISelIntrinsicLowering
* Don't require that the number of bytes to set is a compile-time
constant
* Define llvm.memset_pattern rather than llvm.memset_pattern.inline

As discussed in the [RFC
thread](https://discourse.llvm.org/t/rfc-introducing-an-llvm-memset-pattern-inline-intrinsic/79496),
the intent is that the intrinsic will be lowered to loops, a sequence of
stores, or libcalls depending on the expected cost and availability of
libcalls on the target. Right now, there's just a single lowering path
that aims to handle all cases. My intent would be to follow up with
additional PRs that add additional optimisations when possible (e.g.
when libcalls are available, when arguments are known to be constant
etc).

show more ...


# 0fb8fac5 15-Nov-2024 Alex Bradbury <asb@igalia.com>

Revert "[IR] Initial introduction of llvm.experimental.memset_pattern (#97583)"

This reverts commit 7ff3a9acd84654c9ec2939f45ba27f162ae7fbc3.

Recent scheduling changes means tests need to be re-gen

Revert "[IR] Initial introduction of llvm.experimental.memset_pattern (#97583)"

This reverts commit 7ff3a9acd84654c9ec2939f45ba27f162ae7fbc3.

Recent scheduling changes means tests need to be re-generated. Reverting
to green while I do that.

show more ...


# 7ff3a9ac 15-Nov-2024 Alex Bradbury <asb@igalia.com>

[IR] Initial introduction of llvm.experimental.memset_pattern (#97583)

Supersedes the draft PR #94992, taking a different approach following
feedback:
* Lower in PreISelIntrinsicLowering
* Don't

[IR] Initial introduction of llvm.experimental.memset_pattern (#97583)

Supersedes the draft PR #94992, taking a different approach following
feedback:
* Lower in PreISelIntrinsicLowering
* Don't require that the number of bytes to set is a compile-time
constant
* Define llvm.memset_pattern rather than llvm.memset_pattern.inline

As discussed in the [RFC
thread](https://discourse.llvm.org/t/rfc-introducing-an-llvm-memset-pattern-inline-intrinsic/79496),
the intent is that the intrinsic will be lowered to loops, a sequence of
stores, or libcalls depending on the expected cost and availability of
libcalls on the target. Right now, there's just a single lowering path
that aims to handle all cases. My intent would be to follow up with
additional PRs that add additional optimisations when possible (e.g.
when libcalls are available, when arguments are known to be constant
etc).

show more ...


Revision tags: llvmorg-19.1.3
# 4c697f70 22-Oct-2024 Fabian Ritter <fabian.ritter@amd.com>

[LowerMemIntrinsics] Use i8 GEPs in memcpy/memmove lowering (#112707)

The IR lowering of memcpy/memmove intrinsics uses a target-specific type
for its load/store operations. So far, the loaded and

[LowerMemIntrinsics] Use i8 GEPs in memcpy/memmove lowering (#112707)

The IR lowering of memcpy/memmove intrinsics uses a target-specific type
for its load/store operations. So far, the loaded and stored addresses
are computed with GEPs based on this type. That is wrong if the
allocation size of the type differs from its store size: The width of
the accesses is determined by the store size, while the GEP stride is
determined by the allocation size. If the allocation size is greater
than the store size, some bytes are not copied/moved.

This patch changes the GEPs to use i8 addressing, with offsets based on
the type's store size. The correctness of the lowering therefore no
longer depends on the type's allocation size.

This is in support of PR #112332, which allows adjusting the memcpy loop
lowering type through a command line argument in the AMDGPU backend.

show more ...


Revision tags: llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2
# edf46f36 03-Aug-2024 Florian Hahn <flo@fhahn.com>

[SCEV] Use const SCEV * explicitly in more places.

Use const SCEV * explicitly in more places to prepare for
https://github.com/llvm/llvm-project/pull/91961. Split off as suggested.


# 9e462b7e 29-Jul-2024 Fabian Ritter <fabian.ritter@amd.com>

[LowerMemIntrinsics][NFC] Use Align in TTI::getMemcpyLoopLoweringType (#100984)

...and also in TTI::getMemcpyLoopResidualLoweringType.


Revision tags: llvmorg-19.1.0-rc1
# 92a06546 26-Jul-2024 Fabian Ritter <fabian.ritter@amd.com>

[LowerMemIntrinsics] Lower llvm.memmove to wide memory accesses (#100122)

So far, the IR-level lowering of llvm.memmove intrinsics generates loops
that copy each byte individually. This can be wast

[LowerMemIntrinsics] Lower llvm.memmove to wide memory accesses (#100122)

So far, the IR-level lowering of llvm.memmove intrinsics generates loops
that copy each byte individually. This can be wasteful for targets that
provide wider memory access operations.

This patch makes the memmove lowering more similar to the lowering of
memcpy with unknown length.
TargetTransformInfo::getMemcpyLoopLoweringType() is queried for an
adequate type for the memory accesses, and if it is wider than a single
byte, the greatest multiple of the type's size that is less than or
equal to the length is copied with corresponding wide memory accesses. A
residual loop with byte-wise accesses (or a sequence of suitable memory
accesses in case the length is statically known) is introduced for the
remaining bytes.

For memmove, this construct is required in two variants: one for copying
forward and one for copying backwards, to handle overlapping memory
ranges. For the backwards case, the residual code still covers the bytes
at the end of the copied region and is therefore executed before the
wide main loop. This implementation choice is based on the assumption
that we are more likely to encounter memory ranges whose start aligns
with the access width than ones whose end does.

In microbenchmarks on gfx1030 (AMDGPU), this change yields speedups up
to 16x for memmoves with variable or large constant lengths.

Part of SWDEV-455845.

show more ...


Revision tags: llvmorg-20-init
# cbc96b9e 12-Jul-2024 Fabian Ritter <fabian.ritter@amd.com>

Reapply "[LowerMemIntrinsics] Use correct alignment in residual loop for variable llvm.memcpy" (#98482)

Reverts llvm/llvm-project#98295, which reverted llvm/llvm-project#97998

The failure in the

Reapply "[LowerMemIntrinsics] Use correct alignment in residual loop for variable llvm.memcpy" (#98482)

Reverts llvm/llvm-project#98295, which reverted llvm/llvm-project#97998

The failure in the "InOneWeekend" test of the HIP test suite on
clang-hip-vega20
(https://lab.llvm.org/buildbot/#/builders/123/builds/1498) seems to be
unrelated; I observed it (and a similar failure for the "TheNextWeek"
test in the same suite) intermittently on my system, with and without
the patch applied. (It occurred in 2 out of 50 repeated runs without the
patch and in 1 out of 50 runs with the patch.)

show more ...


# 17316a59 10-Jul-2024 Fabian Ritter <fabian.ritter@amd.com>

Revert "[LowerMemIntrinsics] Use correct alignment in residual loop for variable llvm.memcpy" (#98295)

Reverts llvm/llvm-project#97998
This seems to cause a buildbot failure on clang-hip-vega20, in

Revert "[LowerMemIntrinsics] Use correct alignment in residual loop for variable llvm.memcpy" (#98295)

Reverts llvm/llvm-project#97998
This seems to cause a buildbot failure on clang-hip-vega20, in the HIP
test-suite, need to investigate.

show more ...


# 6c84bba2 10-Jul-2024 Fabian Ritter <fabian.ritter@amd.com>

[LowerMemIntrinsics] Use correct alignment in residual loop for variable llvm.memcpy (#97998)

Memcpy intrinsics with statically unknown loop sizes are lowered with
two load/store loops: one with ac

[LowerMemIntrinsics] Use correct alignment in residual loop for variable llvm.memcpy (#97998)

Memcpy intrinsics with statically unknown loop sizes are lowered with
two load/store loops: one with access widths specified by the target,
and a residual loop that copies remaining bytes individually.

As the residual loop operates byte-wise, its accesses are only
1-aligned. However, we currently use the alignment that is optimal for
the first loop in both, which is unsound. With this patch, we use the
correct alignment in the residual loop.

The lowering of memcpy with a static size already handles alignments for
the residual correctly.

show more ...


# d37e7ec2 03-Jul-2024 Fabian Ritter <fabian.ritter@amd.com>

[LowerMemIntrinsics] Respect the volatile argument of llvm.memmove (#97545)

So far, we ignored if a memmove intrinsic is volatile when lowering it
to loops in the IR. This change generates volatile

[LowerMemIntrinsics] Respect the volatile argument of llvm.memmove (#97545)

So far, we ignored if a memmove intrinsic is volatile when lowering it
to loops in the IR. This change generates volatile loads and stores in
this case (similar to how memcpy is handled) and adds tests for volatile
memmoves and memcpys.

show more ...


# 9df71d76 28-Jun-2024 Nikita Popov <npopov@redhat.com>

[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)

Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, re

[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)

Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.

show more ...


Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1
# 6b62a913 04-Mar-2024 Jeremy Morse <jeremy.morse@sony.com>

[RemoveDIs] Reapply 3fda50d3915, insert instructions using iterators

I'd reverted this in 6c7805d5d1 after a bad stage. Original commit
messsage follows:

[NFC][RemoveDIs] Bulk update utilities to i

[RemoveDIs] Reapply 3fda50d3915, insert instructions using iterators

I'd reverted this in 6c7805d5d1 after a bad stage. Original commit
messsage follows:

[NFC][RemoveDIs] Bulk update utilities to insert with iterators

As part of the RemoveDIs project we need LLVM to insert instructions using
iterators wherever possible, so that the iterators can carry a bit of
debug-info. This commit implements some of that by updating the contents of
llvm/lib/Transforms/Utils to always use iterator-versions of instruction
constructors.

There are two general flavours of update:
* Almost all call-sites just call getIterator on an instruction
* Several make use of an existing iterator (scenarios where the code is
actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar
APIs that identify the start of a block need to have that iterator passed
directly to the insertion function, without being converted to a bare
Instruction pointer along the way.

I've also switched DemotePHIToStack to take an optional iterator: it needs
to take an iterator, and having a no-insert-location behaviour appears to
be important. The constructors for ICmpInst and FCmpInst have been updated
too. They're the only instructions that take block _references_ rather than
pointers for certain calls, and a future patch is going to make use of
default-null block insertion locations.

All of this should be NFC.

show more ...


# 6c7805d5 29-Feb-2024 Jeremy Morse <jeremy.morse@sony.com>

Revert "[NFC][RemoveDIs] Bulk update utilities to insert with iterators"

This reverts commit 3fda50d3915b2163a54a37b602be7783a89dd808.

Apparently I've missed a hunk while staging this; will back ou

Revert "[NFC][RemoveDIs] Bulk update utilities to insert with iterators"

This reverts commit 3fda50d3915b2163a54a37b602be7783a89dd808.

Apparently I've missed a hunk while staging this; will back out for now.

Picked up here: https://lab.llvm.org/buildbot/#/builders/139/builds/60429/steps/6/logs/stdio

show more ...


# 3fda50d3 29-Feb-2024 Jeremy Morse <jeremy.morse@sony.com>

[NFC][RemoveDIs] Bulk update utilities to insert with iterators

As part of the RemoveDIs project we need LLVM to insert instructions using
iterators wherever possible, so that the iterators can carr

[NFC][RemoveDIs] Bulk update utilities to insert with iterators

As part of the RemoveDIs project we need LLVM to insert instructions using
iterators wherever possible, so that the iterators can carry a bit of
debug-info. This commit implements some of that by updating the contents of
llvm/lib/Transforms/Utils to always use iterator-versions of instruction
constructors.

There are two general flavours of update:
* Almost all call-sites just call getIterator on an instruction
* Several make use of an existing iterator (scenarios where the code is
actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar
APIs that identify the start of a block need to have that iterator passed
directly to the insertion function, without being converted to a bare
Instruction pointer along the way.

I've also switched DemotePHIToStack to take an optional iterator: it needs
to take an iterator, and having a no-insert-location behaviour appears to
be important. The constructors for ICmpInst and FCmpInst have been updated
too. They're the only instructions that take block _references_ rather than
pointers for certain calls, and a future patch is going to make use of
default-null block insertion locations.

All of this should be NFC.

show more ...


Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3
# 1e36d92b 12-Feb-2024 Pierre van Houtryve <pierre.vanhoutryve@amd.com>

[LowerMemIntrinsics] Avoid udiv/urem when type size is a power of 2 (#81238)

See #64620 - does not fix the issue but improves the generated code a
bit.


Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6
# f432a004 15-Nov-2023 Youngsuk Kim <youngsuk.kim@hpe.com>

[llvm] Remove no-op ptr-to-ptr bitcasts (NFC)

Opaque ptr cleanup effort (NFC).


Revision tags: llvmorg-17.0.5, llvmorg-17.0.4
# 4c60c0cb 25-Oct-2023 Youngsuk Kim <youngsuk.kim@hpe.com>

[LowerMemIntrinsics] Remove no-op ptr-to-ptr bitcasts (NFC)

Remove ptr-to-ptr bitcasts, which are unnecessary with opaque pointers
enabled.

Opaque pointer clean-up effort. NFC.


Revision tags: llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6
# 06962403 10-Jun-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

LowerMemIntrinsics: Check address space aliasing for memmove expansion

For cases where we cannot insert an addrspacecast, we can still expand
like a memcpy if we know the address spaces cannot alias

LowerMemIntrinsics: Check address space aliasing for memmove expansion

For cases where we cannot insert an addrspacecast, we can still expand
like a memcpy if we know the address spaces cannot alias. Normally
non-aliasing memmoves are optimized to memcpy, but we cannot rely on
that for lowering. If a target has aliasing address spaces that cannot
be casted between, we still have to give up lowering this.

show more ...


# ee19fabc 10-Jun-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

LowerMemIntrinsics: Handle inserting addrspacecast for memmove lowering

We're missing a trivial non-AA way to check for non-aliasing address
spaces.


# 6d2e5c34 10-Jun-2023 Matt Arsenault <Matthew.Arsenault@amd.com>

LowerMemIntrinsics: Skip memmove with different address spaces

This is a quick fix for an assert when the source and dest have
different address spaces. The pointer compare needs to have matching
ty

LowerMemIntrinsics: Skip memmove with different address spaces

This is a quick fix for an assert when the source and dest have
different address spaces. The pointer compare needs to have matching
types, but we can't generically introduce addrspacecast and we don't
know if the address spaces alias.

show more ...


Revision tags: llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# 86fe4dfd 02-Dec-2022 Krzysztof Parzyszek <kparzysz@quicinc.com>

TargetTransformInfo: convert Optional to std::optional

Recommit: added missing "#include <cstdint>".


# 4e12d183 02-Dec-2022 Krzysztof Parzyszek <kparzysz@quicinc.com>

Revert "TargetTransformInfo: convert Optional to std::optional"

This reverts commit b83711248cb12639e7ef7303cfbb4452b4067e85.

Some buildbots are failing.


# b8371124 02-Dec-2022 Krzysztof Parzyszek <kparzysz@quicinc.com>

TargetTransformInfo: convert Optional to std::optional


12