Revision tags: llvmorg-21-init |
|
#
6292a808 |
| 24-Jan-2025 |
Jeremy Morse <jeremy.morse@sony.com> |
[NFC][DebugInfo] Use iterator-flavour getFirstNonPHI at many call-sites (#123737)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNo
[NFC][DebugInfo] Use iterator-flavour getFirstNonPHI at many call-sites (#123737)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety however, we'd like to
have all calls to getFirstNonPHI use the iterator-returning version.
This patch changes a bunch of call-sites calling getFirstNonPHI to use
getFirstNonPHIIt, which returns an iterator. All these call sites are
where it's obviously safe to fetch the iterator then dereference it. A
follow-up patch will contain less-obviously-safe changes.
We'll eventually deprecate and remove the instruction-pointer
getFirstNonPHI, but not before adding concise documentation of what
considerations are needed (very few).
---------
Co-authored-by: Stephen Tozer <Melamoto@gmail.com>
show more ...
|
Revision tags: llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
298127dc |
| 15-Nov-2024 |
Alex Bradbury <asb@igalia.com> |
Reapply [IR] Initial introduction of llvm.experimental.memset_pattern (#97583)
Relands 7ff3a9acd84654c9ec2939f45ba27f162ae7fbc3 after regenerating the test case.
Supersedes the draft PR #94992, tak
Reapply [IR] Initial introduction of llvm.experimental.memset_pattern (#97583)
Relands 7ff3a9acd84654c9ec2939f45ba27f162ae7fbc3 after regenerating the test case.
Supersedes the draft PR #94992, taking a different approach following feedback: * Lower in PreISelIntrinsicLowering * Don't require that the number of bytes to set is a compile-time constant * Define llvm.memset_pattern rather than llvm.memset_pattern.inline
As discussed in the [RFC thread](https://discourse.llvm.org/t/rfc-introducing-an-llvm-memset-pattern-inline-intrinsic/79496), the intent is that the intrinsic will be lowered to loops, a sequence of stores, or libcalls depending on the expected cost and availability of libcalls on the target. Right now, there's just a single lowering path that aims to handle all cases. My intent would be to follow up with additional PRs that add additional optimisations when possible (e.g. when libcalls are available, when arguments are known to be constant etc).
show more ...
|
#
0fb8fac5 |
| 15-Nov-2024 |
Alex Bradbury <asb@igalia.com> |
Revert "[IR] Initial introduction of llvm.experimental.memset_pattern (#97583)"
This reverts commit 7ff3a9acd84654c9ec2939f45ba27f162ae7fbc3.
Recent scheduling changes means tests need to be re-gen
Revert "[IR] Initial introduction of llvm.experimental.memset_pattern (#97583)"
This reverts commit 7ff3a9acd84654c9ec2939f45ba27f162ae7fbc3.
Recent scheduling changes means tests need to be re-generated. Reverting to green while I do that.
show more ...
|
#
7ff3a9ac |
| 15-Nov-2024 |
Alex Bradbury <asb@igalia.com> |
[IR] Initial introduction of llvm.experimental.memset_pattern (#97583)
Supersedes the draft PR #94992, taking a different approach following
feedback:
* Lower in PreISelIntrinsicLowering
* Don't
[IR] Initial introduction of llvm.experimental.memset_pattern (#97583)
Supersedes the draft PR #94992, taking a different approach following
feedback:
* Lower in PreISelIntrinsicLowering
* Don't require that the number of bytes to set is a compile-time
constant
* Define llvm.memset_pattern rather than llvm.memset_pattern.inline
As discussed in the [RFC
thread](https://discourse.llvm.org/t/rfc-introducing-an-llvm-memset-pattern-inline-intrinsic/79496),
the intent is that the intrinsic will be lowered to loops, a sequence of
stores, or libcalls depending on the expected cost and availability of
libcalls on the target. Right now, there's just a single lowering path
that aims to handle all cases. My intent would be to follow up with
additional PRs that add additional optimisations when possible (e.g.
when libcalls are available, when arguments are known to be constant
etc).
show more ...
|
Revision tags: llvmorg-19.1.3 |
|
#
4c697f70 |
| 22-Oct-2024 |
Fabian Ritter <fabian.ritter@amd.com> |
[LowerMemIntrinsics] Use i8 GEPs in memcpy/memmove lowering (#112707)
The IR lowering of memcpy/memmove intrinsics uses a target-specific type
for its load/store operations. So far, the loaded and
[LowerMemIntrinsics] Use i8 GEPs in memcpy/memmove lowering (#112707)
The IR lowering of memcpy/memmove intrinsics uses a target-specific type
for its load/store operations. So far, the loaded and stored addresses
are computed with GEPs based on this type. That is wrong if the
allocation size of the type differs from its store size: The width of
the accesses is determined by the store size, while the GEP stride is
determined by the allocation size. If the allocation size is greater
than the store size, some bytes are not copied/moved.
This patch changes the GEPs to use i8 addressing, with offsets based on
the type's store size. The correctness of the lowering therefore no
longer depends on the type's allocation size.
This is in support of PR #112332, which allows adjusting the memcpy loop
lowering type through a command line argument in the AMDGPU backend.
show more ...
|
Revision tags: llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2 |
|
#
edf46f36 |
| 03-Aug-2024 |
Florian Hahn <flo@fhahn.com> |
[SCEV] Use const SCEV * explicitly in more places.
Use const SCEV * explicitly in more places to prepare for https://github.com/llvm/llvm-project/pull/91961. Split off as suggested.
|
#
9e462b7e |
| 29-Jul-2024 |
Fabian Ritter <fabian.ritter@amd.com> |
[LowerMemIntrinsics][NFC] Use Align in TTI::getMemcpyLoopLoweringType (#100984)
...and also in TTI::getMemcpyLoopResidualLoweringType.
|
Revision tags: llvmorg-19.1.0-rc1 |
|
#
92a06546 |
| 26-Jul-2024 |
Fabian Ritter <fabian.ritter@amd.com> |
[LowerMemIntrinsics] Lower llvm.memmove to wide memory accesses (#100122)
So far, the IR-level lowering of llvm.memmove intrinsics generates loops
that copy each byte individually. This can be wast
[LowerMemIntrinsics] Lower llvm.memmove to wide memory accesses (#100122)
So far, the IR-level lowering of llvm.memmove intrinsics generates loops
that copy each byte individually. This can be wasteful for targets that
provide wider memory access operations.
This patch makes the memmove lowering more similar to the lowering of
memcpy with unknown length.
TargetTransformInfo::getMemcpyLoopLoweringType() is queried for an
adequate type for the memory accesses, and if it is wider than a single
byte, the greatest multiple of the type's size that is less than or
equal to the length is copied with corresponding wide memory accesses. A
residual loop with byte-wise accesses (or a sequence of suitable memory
accesses in case the length is statically known) is introduced for the
remaining bytes.
For memmove, this construct is required in two variants: one for copying
forward and one for copying backwards, to handle overlapping memory
ranges. For the backwards case, the residual code still covers the bytes
at the end of the copied region and is therefore executed before the
wide main loop. This implementation choice is based on the assumption
that we are more likely to encounter memory ranges whose start aligns
with the access width than ones whose end does.
In microbenchmarks on gfx1030 (AMDGPU), this change yields speedups up
to 16x for memmoves with variable or large constant lengths.
Part of SWDEV-455845.
show more ...
|
Revision tags: llvmorg-20-init |
|
#
cbc96b9e |
| 12-Jul-2024 |
Fabian Ritter <fabian.ritter@amd.com> |
Reapply "[LowerMemIntrinsics] Use correct alignment in residual loop for variable llvm.memcpy" (#98482)
Reverts llvm/llvm-project#98295, which reverted llvm/llvm-project#97998
The failure in the
Reapply "[LowerMemIntrinsics] Use correct alignment in residual loop for variable llvm.memcpy" (#98482)
Reverts llvm/llvm-project#98295, which reverted llvm/llvm-project#97998
The failure in the "InOneWeekend" test of the HIP test suite on
clang-hip-vega20
(https://lab.llvm.org/buildbot/#/builders/123/builds/1498) seems to be
unrelated; I observed it (and a similar failure for the "TheNextWeek"
test in the same suite) intermittently on my system, with and without
the patch applied. (It occurred in 2 out of 50 repeated runs without the
patch and in 1 out of 50 runs with the patch.)
show more ...
|
#
17316a59 |
| 10-Jul-2024 |
Fabian Ritter <fabian.ritter@amd.com> |
Revert "[LowerMemIntrinsics] Use correct alignment in residual loop for variable llvm.memcpy" (#98295)
Reverts llvm/llvm-project#97998
This seems to cause a buildbot failure on clang-hip-vega20, in
Revert "[LowerMemIntrinsics] Use correct alignment in residual loop for variable llvm.memcpy" (#98295)
Reverts llvm/llvm-project#97998
This seems to cause a buildbot failure on clang-hip-vega20, in the HIP
test-suite, need to investigate.
show more ...
|
#
6c84bba2 |
| 10-Jul-2024 |
Fabian Ritter <fabian.ritter@amd.com> |
[LowerMemIntrinsics] Use correct alignment in residual loop for variable llvm.memcpy (#97998)
Memcpy intrinsics with statically unknown loop sizes are lowered with
two load/store loops: one with ac
[LowerMemIntrinsics] Use correct alignment in residual loop for variable llvm.memcpy (#97998)
Memcpy intrinsics with statically unknown loop sizes are lowered with
two load/store loops: one with access widths specified by the target,
and a residual loop that copies remaining bytes individually.
As the residual loop operates byte-wise, its accesses are only
1-aligned. However, we currently use the alignment that is optimal for
the first loop in both, which is unsound. With this patch, we use the
correct alignment in the residual loop.
The lowering of memcpy with a static size already handles alignments for
the residual correctly.
show more ...
|
#
d37e7ec2 |
| 03-Jul-2024 |
Fabian Ritter <fabian.ritter@amd.com> |
[LowerMemIntrinsics] Respect the volatile argument of llvm.memmove (#97545)
So far, we ignored if a memmove intrinsic is volatile when lowering it
to loops in the IR. This change generates volatile
[LowerMemIntrinsics] Respect the volatile argument of llvm.memmove (#97545)
So far, we ignored if a memmove intrinsic is volatile when lowering it
to loops in the IR. This change generates volatile loads and stores in
this case (similar to how memcpy is handled) and adds tests for volatile
memmoves and memcpys.
show more ...
|
#
9df71d76 |
| 28-Jun-2024 |
Nikita Popov <npopov@redhat.com> |
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, re
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
show more ...
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1 |
|
#
6b62a913 |
| 04-Mar-2024 |
Jeremy Morse <jeremy.morse@sony.com> |
[RemoveDIs] Reapply 3fda50d3915, insert instructions using iterators
I'd reverted this in 6c7805d5d1 after a bad stage. Original commit messsage follows:
[NFC][RemoveDIs] Bulk update utilities to i
[RemoveDIs] Reapply 3fda50d3915, insert instructions using iterators
I'd reverted this in 6c7805d5d1 after a bad stage. Original commit messsage follows:
[NFC][RemoveDIs] Bulk update utilities to insert with iterators
As part of the RemoveDIs project we need LLVM to insert instructions using iterators wherever possible, so that the iterators can carry a bit of debug-info. This commit implements some of that by updating the contents of llvm/lib/Transforms/Utils to always use iterator-versions of instruction constructors.
There are two general flavours of update: * Almost all call-sites just call getIterator on an instruction * Several make use of an existing iterator (scenarios where the code is actually significant for debug-info) The underlying logic is that any call to getFirstInsertionPt or similar APIs that identify the start of a block need to have that iterator passed directly to the insertion function, without being converted to a bare Instruction pointer along the way.
I've also switched DemotePHIToStack to take an optional iterator: it needs to take an iterator, and having a no-insert-location behaviour appears to be important. The constructors for ICmpInst and FCmpInst have been updated too. They're the only instructions that take block _references_ rather than pointers for certain calls, and a future patch is going to make use of default-null block insertion locations.
All of this should be NFC.
show more ...
|
#
6c7805d5 |
| 29-Feb-2024 |
Jeremy Morse <jeremy.morse@sony.com> |
Revert "[NFC][RemoveDIs] Bulk update utilities to insert with iterators"
This reverts commit 3fda50d3915b2163a54a37b602be7783a89dd808.
Apparently I've missed a hunk while staging this; will back ou
Revert "[NFC][RemoveDIs] Bulk update utilities to insert with iterators"
This reverts commit 3fda50d3915b2163a54a37b602be7783a89dd808.
Apparently I've missed a hunk while staging this; will back out for now.
Picked up here: https://lab.llvm.org/buildbot/#/builders/139/builds/60429/steps/6/logs/stdio
show more ...
|
#
3fda50d3 |
| 29-Feb-2024 |
Jeremy Morse <jeremy.morse@sony.com> |
[NFC][RemoveDIs] Bulk update utilities to insert with iterators
As part of the RemoveDIs project we need LLVM to insert instructions using iterators wherever possible, so that the iterators can carr
[NFC][RemoveDIs] Bulk update utilities to insert with iterators
As part of the RemoveDIs project we need LLVM to insert instructions using iterators wherever possible, so that the iterators can carry a bit of debug-info. This commit implements some of that by updating the contents of llvm/lib/Transforms/Utils to always use iterator-versions of instruction constructors.
There are two general flavours of update: * Almost all call-sites just call getIterator on an instruction * Several make use of an existing iterator (scenarios where the code is actually significant for debug-info) The underlying logic is that any call to getFirstInsertionPt or similar APIs that identify the start of a block need to have that iterator passed directly to the insertion function, without being converted to a bare Instruction pointer along the way.
I've also switched DemotePHIToStack to take an optional iterator: it needs to take an iterator, and having a no-insert-location behaviour appears to be important. The constructors for ICmpInst and FCmpInst have been updated too. They're the only instructions that take block _references_ rather than pointers for certain calls, and a future patch is going to make use of default-null block insertion locations.
All of this should be NFC.
show more ...
|
Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3 |
|
#
1e36d92b |
| 12-Feb-2024 |
Pierre van Houtryve <pierre.vanhoutryve@amd.com> |
[LowerMemIntrinsics] Avoid udiv/urem when type size is a power of 2 (#81238)
See #64620 - does not fix the issue but improves the generated code a
bit.
|
Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6 |
|
#
f432a004 |
| 15-Nov-2023 |
Youngsuk Kim <youngsuk.kim@hpe.com> |
[llvm] Remove no-op ptr-to-ptr bitcasts (NFC)
Opaque ptr cleanup effort (NFC).
|
Revision tags: llvmorg-17.0.5, llvmorg-17.0.4 |
|
#
4c60c0cb |
| 25-Oct-2023 |
Youngsuk Kim <youngsuk.kim@hpe.com> |
[LowerMemIntrinsics] Remove no-op ptr-to-ptr bitcasts (NFC)
Remove ptr-to-ptr bitcasts, which are unnecessary with opaque pointers enabled.
Opaque pointer clean-up effort. NFC.
|
Revision tags: llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6 |
|
#
06962403 |
| 10-Jun-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
LowerMemIntrinsics: Check address space aliasing for memmove expansion
For cases where we cannot insert an addrspacecast, we can still expand like a memcpy if we know the address spaces cannot alias
LowerMemIntrinsics: Check address space aliasing for memmove expansion
For cases where we cannot insert an addrspacecast, we can still expand like a memcpy if we know the address spaces cannot alias. Normally non-aliasing memmoves are optimized to memcpy, but we cannot rely on that for lowering. If a target has aliasing address spaces that cannot be casted between, we still have to give up lowering this.
show more ...
|
#
ee19fabc |
| 10-Jun-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
LowerMemIntrinsics: Handle inserting addrspacecast for memmove lowering
We're missing a trivial non-AA way to check for non-aliasing address spaces.
|
#
6d2e5c34 |
| 10-Jun-2023 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
LowerMemIntrinsics: Skip memmove with different address spaces
This is a quick fix for an assert when the source and dest have different address spaces. The pointer compare needs to have matching ty
LowerMemIntrinsics: Skip memmove with different address spaces
This is a quick fix for an assert when the source and dest have different address spaces. The pointer compare needs to have matching types, but we can't generically introduce addrspacecast and we don't know if the address spaces alias.
show more ...
|
Revision tags: llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
86fe4dfd |
| 02-Dec-2022 |
Krzysztof Parzyszek <kparzysz@quicinc.com> |
TargetTransformInfo: convert Optional to std::optional
Recommit: added missing "#include <cstdint>".
|
#
4e12d183 |
| 02-Dec-2022 |
Krzysztof Parzyszek <kparzysz@quicinc.com> |
Revert "TargetTransformInfo: convert Optional to std::optional"
This reverts commit b83711248cb12639e7ef7303cfbb4452b4067e85.
Some buildbots are failing.
|
#
b8371124 |
| 02-Dec-2022 |
Krzysztof Parzyszek <kparzysz@quicinc.com> |
TargetTransformInfo: convert Optional to std::optional
|