Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4 |
|
#
2e6deb1d |
| 14-Nov-2024 |
Sjoerd Meijer <smeijer@nvidia.com> |
[LoopInterchange] Fix overflow in cost calculation (#111807)
If the iteration count is really large, e.g. UINT_MAX, then the cost
calculation can overflow and trigger an assert. So saturate the cost to
INT_MAX if this is the case by using InstructionCost as a type which
already supports this kind of overflow handling.
This fixes #104761
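The saturating behaviour described above can be illustrated with plain integers. This is a minimal sketch, not LLVM's `InstructionCost`: the helper name `saturatingMul` and the choice of `std::int64_t` are assumptions made for illustration, and both inputs are assumed to be non-negative.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not an LLVM API): multiply two non-negative costs and
// clamp to the maximum representable value instead of overflowing.
std::int64_t saturatingMul(std::int64_t a, std::int64_t b) {
  const std::int64_t maxCost = std::numeric_limits<std::int64_t>::max();
  if (a == 0 || b == 0)
    return 0;
  // For positive a and b, a * b overflows exactly when a > maxCost / b.
  if (a > maxCost / b)
    return maxCost;
  return a * b;
}
```

With a helper like this, a very large iteration count multiplied into the cost yields the maximum value rather than undefined behaviour or a failed assertion.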
|
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7 |
|
#
67025946 |
| 27-May-2024 |
Rouzbeh <rouzbeh.paktinat1@huawei.com> |
[LoopCacheAnalysis] Fix loop cache cost to always round the cost up to the nearest integer number (#88915)
Currently loop cache analysis uses the following formula to evaluate the cost of
a RefGroup for a consecutive memory access:
`RefCost=(TripCount*Stride)/CLS`
This cost evaluates to zero when `TripCount*Stride` is smaller than
cache-line-size. This results in a wrong cost value for a loop and
misleads LoopInterchange decisions, as shown in [this
case](https://llvm.godbolt.org/z/jTz1vn4hn).
This patch fixes the problem by rounding the cost up to 1 when this
happens.
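The effect of truncating versus rounding up can be seen with plain unsigned integers. This is only a sketch under assumptions: the real analysis works on SCEV expressions, the function names below are hypothetical, and the rounded variant follows the commit title (round up to the nearest integer) with `cls` assumed to be non-zero.

```cpp
#include <cstdint>

// Hypothetical stand-in for RefCost = (TripCount * Stride) / CLS using
// truncating integer division.
std::uint64_t refCostTruncating(std::uint64_t tripCount, std::uint64_t stride,
                                std::uint64_t cls) {
  return (tripCount * stride) / cls;
}

// Same quantity, rounded up so that any non-zero amount of accessed memory
// costs at least one cache line.
std::uint64_t refCostRoundedUp(std::uint64_t tripCount, std::uint64_t stride,
                               std::uint64_t cls) {
  const std::uint64_t bytes = tripCount * stride;
  return bytes / cls + (bytes % cls != 0 ? 1 : 0);
}
```

For example, 4 iterations with a 4-byte stride and a 64-byte cache line touch 16 bytes: the truncating form reports a cost of 0, while the rounded form reports 1.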
|
Revision tags: llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4 |
|
#
900be901 |
| 12-Apr-2024 |
Victor Toni <ViToni@users.noreply.github.com> |
Fix typos (#88565)
|
Revision tags: llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1 |
|
#
585742cb |
| 29-Mar-2023 |
Joshua Cao <cao.joshua@yahoo.com> |
[SCEV] When computing trip count, only zext if necessary
This patch improves on https://reviews.llvm.org/D110587. To summarize that patch: given backedge-taken count BC, trip count TC is `BC + 1`. However, we don't know whether `BC + 1` might overflow, so that patch changes the TC computation to `1 + zext(BC)`.
This patch only adds the zext if necessary by looking at the constant range. If we can determine that BC cannot be the max value for its bitwidth, then we know adding 1 will not overflow, and the zext is not needed. We apply loop guards before computing TC to get more data.
The primary motivation is to support my work on more precise trip multiples in https://reviews.llvm.org/D141823. For example:
```
void foo();

void test(unsigned n) {
  __builtin_assume(n % 6 == 0);
  for (unsigned i = 0; i < n; ++i)
    foo();
}
```
Prior to this patch, we had `TC = 1 + zext(-1 + 6 * ((6 umax %n) /u 6))<nuw>`. SCEV range computation is able to determine that the BC cannot be the max value, so the zext is not needed. The result is `TC -> (6 * ((6 umax %n) /u 6))<nuw>`. From here, we would be able to determine that %n is a multiple of 6.
There was one change in LoopCacheAnalysis/LoopInterchange required. Before this patch, if a loop had BC = false, it would compute `TC -> 1 + zext(false) -> 1`, which was fine. After this patch, it computes `TC -> 1 + false = true`. CacheAnalysis would then sign extend the `true`, which was not the intended behavior. I modified CacheAnalysis such that it only zero extends trip counts.
This patch is not NFC, but also does not change any SCEV outputs. I would like to get this patch out first to make work with trip multiples easier.
Differential Revision: https://reviews.llvm.org/D147117
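The two points in this message, why `BC + 1` can wrap and why a 1-bit `true` must be zero extended, can be reproduced with fixed-width integers. This is a sketch under assumptions: the 8-bit width and all function names are hypothetical and chosen only to keep the numbers small; none of this is SCEV code.

```cpp
#include <cstdint>

// TC = BC + 1 computed in the same 8-bit width: wraps to 0 when BC is 255.
std::uint8_t tripCountNarrow(std::uint8_t bc) {
  return static_cast<std::uint8_t>(bc + 1);
}

// TC = 1 + zext(BC): widen to 16 bits first, so the addition cannot wrap.
std::uint16_t tripCountWidened(std::uint8_t bc) {
  return static_cast<std::uint16_t>(1u + bc);
}

// The widening is only needed if the known upper bound of BC is the maximum
// value of its bit width.
bool needsZext(std::uint8_t bcUpperBound) { return bcUpperBound == 0xFF; }

// Extending a 1-bit value to 32 bits: sign extension turns `true` into -1,
// zero extension turns it into 1.
std::int32_t signExtendI1(bool bit) { return bit ? -1 : 0; }
std::uint32_t zeroExtendI1(bool bit) { return bit ? 1u : 0u; }
```

A trip count of `true` in a 1-bit type means one iteration, so only the zero-extended result (1) is meaningful as a count; the sign-extended result (-1) is not.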
|
Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
d4b6fcb3 |
| 14-Dec-2022 |
Fangrui Song <i@maskray.me> |
[Analysis] llvm::Optional => std::optional
|
#
19aff0f3 |
| 03-Dec-2022 |
Kazu Hirata <kazu@google.com> |
[Analysis] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional.
This is part of an effort to migrate from llvm::Optional to std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
|
Revision tags: llvmorg-15.0.6 |
|
#
7b91798a |
| 20-Nov-2022 |
Kazu Hirata <kazu@google.com> |
[Analysis] Use llvm::Optional::value_or (NFC)
|
Revision tags: llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0 |
|
#
6ed2cb4a |
| 29-Aug-2022 |
Kazu Hirata <kazu@google.com> |
Revert "[llvm] Use llvm::is_contained (NFC)"
This reverts commit ebf574f59a80ca00e234eee0b047e5f0df99587d.
This patch seems to cause build failures on Windows.
|
#
ebf574f5 |
| 29-Aug-2022 |
Kazu Hirata <kazu@google.com> |
[llvm] Use llvm::is_contained (NFC)
|
Revision tags: llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
#
05ccde80 |
| 21-Jul-2022 |
Congzhe Cao <congzhe.cao@huawei.com> |
[LoopCacheAnalysis] Fix a type mismatch problem in cost calculation
There is a problem in loop cache analysis that the types of SCEV variables `Coeff` and `ElemSize` in function `isConsecutive()` may not match. The mismatch would cause SCEV failures when `Coeff` is multiplied with `ElemSize`.
The fix in this patch is to extend the type of both `Coeff` and `ElemSize` to whichever of the two is wider. As a clean-up, duplicate calculations of `Stride` in `computeRefCost()` are then removed.
Reviewed By: Meinersbur, #loopoptwg
Differential Revision: https://reviews.llvm.org/D128877
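The widening step can be modelled outside SCEV with a small value-plus-width struct. This is a sketch under assumptions: `TypedInt` and the functions below are hypothetical, values are unsigned so extension leaves them unchanged, and the product is not masked to the result width.

```cpp
#include <algorithm>
#include <cstdint>
#include <stdexcept>

// Hypothetical model of a fixed-width unsigned integer (not an LLVM type).
struct TypedInt {
  unsigned bits;        // bit width, e.g. 32 or 64
  std::uint64_t value;  // value, assumed to fit in `bits`
};

// Strict multiply: rejects operands of different widths, mirroring the
// failure that the mismatch caused.
TypedInt mulSameWidth(TypedInt a, TypedInt b) {
  if (a.bits != b.bits)
    throw std::invalid_argument("operand widths differ");
  return {a.bits, a.value * b.value};
}

// Extend a value to at least `bits` wide; never narrows.
TypedInt extendTo(TypedInt v, unsigned bits) {
  return {std::max(v.bits, bits), v.value};
}

// Widen both operands to the wider of the two widths, then multiply.
TypedInt mulWidened(TypedInt coeff, TypedInt elemSize) {
  const unsigned wider = std::max(coeff.bits, elemSize.bits);
  return mulSameWidth(extendTo(coeff, wider), extendTo(elemSize, wider));
}
```

Calling `mulWidened` with a 32-bit coefficient and a 64-bit element size produces a 64-bit result, whereas `mulSameWidth` on the same operands would throw.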
|
#
a7938c74 |
| 26-Jun-2022 |
Kazu Hirata <kazu@google.com> |
[llvm] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool in conditionals only.
|
#
3b7c3a65 |
| 25-Jun-2022 |
Kazu Hirata <kazu@google.com> |
Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.
|
#
aa8feeef |
| 25-Jun-2022 |
Kazu Hirata <kazu@google.com> |
Don't use Optional::hasValue (NFC)
|
Revision tags: llvmorg-14.0.6 |
|
#
4c77d027 |
| 16-Jun-2022 |
Congzhe Cao <congzhe.cao@huawei.com> |
[Delinearization] Refactoring of fixed-size array delinearization
This is a follow-up patch to D122857 where we added delinearization of fixed-size arrays to loop cache analysis, which resulted in some duplicate code, i.e., "tryDelinearizeFixedSize()", in LoopCacheCost.cpp and DependenceAnalysis.cpp. Refactoring is done in this patch.
This patch refactors out the main logic of "tryDelinearizeFixedSize()" as "tryDelinearizeFixedSizeImpl()" and moves it to Delinearization.cpp, such that clients can reuse "llvm::tryDelinearizeFixedSizeImpl()" wherever they would like to delinearize fixed-size arrays. Currently it has two users, i.e., DependenceAnalysis.cpp and LoopCacheCost.cpp.
Reviewed By: Meinersbur, #loopoptwg
Differential Revision: https://reviews.llvm.org/D124745
|
Revision tags: llvmorg-14.0.5, llvmorg-14.0.4 |
|
#
363b3a64 |
| 02-May-2022 |
Bardia Mahjour <bmahjour@ca.ibm.com> |
fix warning caused by ef4ecc3ceffcf3ef129640c813f823c974f9ba22
|
#
ef4ecc3c |
| 02-May-2022 |
Bardia Mahjour <bmahjour@ca.ibm.com> |
[LoopCacheAnalysis] Consider dimension depth of the subscript reference when calculating cost
Reviewed By: congzhe, etiotto
Differential Revision: https://reviews.llvm.org/D123400
|
#
c428a3d2 |
| 29-Apr-2022 |
Congzhe Cao <congzhe.cao@huawei.com> |
[LoopCacheAnalysis] Enable delinearization of fixed sized arrays
Currently loop cache cost (LCC) cannot analyze fixed-size arrays since it cannot delinearize them. This patch adds the capability to delinearize fixed-size arrays to LCC. Most of the code is ported from DependenceAnalysis.cpp and some refactoring will be done in a follow-up patch.
Reviewed By: #loopoptwg, Meinersbur
Differential Revision: https://reviews.llvm.org/D122857
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
#
9aa52ba5 |
| 21-Mar-2022 |
Kazu Hirata <kazu@google.com> |
[Analysis] Apply clang-tidy fixes for readability-redundant-smartptr-get (NFC)
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
#
b932bdf5 |
| 08-Jan-2022 |
Kazu Hirata <kazu@google.com> |
[llvm] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
|
#
e5947760 |
| 03-Jan-2022 |
Kazu Hirata <kazu@google.com> |
Revert "[llvm] Remove redundant member initialization (NFC)"
This reverts commit fd4808887ee47f3ec8a030e9211169ef4fb094c3.
This patch causes gcc to issue a lot of warnings like:
warning: base class ‘class llvm::MCParsedAsmOperand’ should be explicitly initialized in the copy constructor [-Wextra]
|
#
fd480888 |
| 02-Jan-2022 |
Kazu Hirata <kazu@google.com> |
[llvm] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
|
Revision tags: llvmorg-13.0.1-rc1 |
|
#
4bd46501 |
| 25-Oct-2021 |
Kazu Hirata <kazu@google.com> |
Use llvm::any_of and llvm::none_of (NFC)
|
#
d464a9d4 |
| 16-Oct-2021 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[Analysis] Replace assert(isa)/dyn_cast with cast. NFC.
cast<> will perform the assertion for us.
Removes a static analysis null dereference warning.
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3 |
|
#
585c594d |
| 08-Sep-2021 |
Philip Reames <listmail@philipreames.com> |
Move delinearization logic out of SCEV [NFC]
None of this logic has anything to do with SCEV's internals, it just uses the existing public APIs. As a result, we can move the code from ScalarEvolution.cpp/hpp to Delinearization.cpp/hpp with only minor changes.
This was discussed in advance on today's loop opt call. It turned out to be as easy as hoped.
|
Revision tags: llvmorg-13.0.0-rc2 |
|
#
30b0c455 |
| 06-Aug-2021 |
Zheng Chen <czhengsz@cn.ibm.com> |
[LoopCacheAnalysis]: handle mismatch type for Numerator and CacheLineSize
Fix an assertion failure caused by mismatched types for Numerator and CacheLineSize in the loop cache analysis pass.
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D107618
|