#
10f2a0d6 |
| 30-Sep-2020 |
Arthur Eubanks <aeubanks@google.com> |
Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64
Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights.
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D88609
show more ...
|
#
2a4e704c |
| 27-Oct-2020 |
Nico Weber <thakis@chromium.org> |
Revert "Use uint64_t for branch weights instead of uint32_t"
This reverts commit e5766f25c62c185632e3a75bf45b313eadab774b. Makes clang assert when building Chromium, see https://crbug.com/1142813 fo
Revert "Use uint64_t for branch weights instead of uint32_t"
This reverts commit e5766f25c62c185632e3a75bf45b313eadab774b. Makes clang assert when building Chromium, see https://crbug.com/1142813 for a repro.
show more ...
|
#
e5766f25 |
| 30-Sep-2020 |
Arthur Eubanks <aeubanks@google.com> |
Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64
Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights.
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D88609
show more ...
|
#
b7926ce6 |
| 23-Oct-2020 |
Nick Desaulniers <ndesaulniers@google.com> |
[IR] add fn attr for no_stack_protector; prevent inlining on mismatch
It's currently ambiguous in IR whether the source language explicitly did not want a stack a stack protector (in C, via function
[IR] add fn attr for no_stack_protector; prevent inlining on mismatch
It's currently ambiguous in IR whether the source language explicitly did not want a stack a stack protector (in C, via function attribute no_stack_protector) or doesn't care for any given function.
It's common for code that manipulates the stack via inline assembly or that has to set up its own stack canary (such as the Linux kernel) would like to avoid stack protectors in certain functions. In this case, we've been bitten by numerous bugs where a callee with a stack protector is inlined into an __attribute__((__no_stack_protector__)) caller, which generally breaks the caller's assumptions about not having a stack protector. LTO exacerbates the issue.
While developers can avoid this by putting all no_stack_protector functions in one translation unit together and compiling those with -fno-stack-protector, it's generally not very ergonomic or as ergonomic as a function attribute, and still doesn't work for LTO. See also: https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/ https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u
Typically, when inlining a callee into a caller, the caller will be upgraded in its level of stack protection (see adjustCallerSSPLevel()). By adding an explicit attribute in the IR when the function attribute is used in the source language, we can now identify such cases and prevent inlining. Block inlining when the callee and caller differ in the case that one contains `nossp` when the other has `ssp`, `sspstrong`, or `sspreq`.
Fixes pr/47479.
Reviewed By: void
Differential Revision: https://reviews.llvm.org/D87956
show more ...
|
#
099bffe7 |
| 22-Oct-2020 |
Vedant Kumar <vsk@apple.com> |
Revert "[CodeExtractor] Don't create bitcasts when inserting lifetime markers (NFCI)"
This reverts commit 26ee8aff2b85ee28a2b2d0b1860d878b512fbdef.
It's necessary to insert bitcast the pointer oper
Revert "[CodeExtractor] Don't create bitcasts when inserting lifetime markers (NFCI)"
This reverts commit 26ee8aff2b85ee28a2b2d0b1860d878b512fbdef.
It's necessary to insert bitcast the pointer operand of a lifetime marker if it has an opaque pointer type.
rdar://70560161
show more ...
|
#
595c6156 |
| 20-Oct-2020 |
Atmn Patel <atmndp@gmail.com> |
[IR] Adds mustprogress as a LLVM IR attribute
This adds the LLVM IR attribute `mustprogress` as defined in LangRef through D86233. This attribute will be applied to functions with in languages like
[IR] Adds mustprogress as a LLVM IR attribute
This adds the LLVM IR attribute `mustprogress` as defined in LangRef through D86233. This attribute will be applied to functions with in languages like C++ where forward progress is guaranteed. Functions without this attribute are not required to make progress.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D85393
show more ...
|
#
8f0658ae |
| 08-Oct-2020 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[Transforms] CodeExtractor::verifyAssumptionCache - don't dereference a dyn_cast<>. NFCI.
Use cast<> as we immediately dereference the pointer afterwards - cast<> will assert if we fail.
Prevents c
[Transforms] CodeExtractor::verifyAssumptionCache - don't dereference a dyn_cast<>. NFCI.
Use cast<> as we immediately dereference the pointer afterwards - cast<> will assert if we fail.
Prevents clang static analyzer warning that we could deference a null pointer.
show more ...
|
Revision tags: llvmorg-11.0.0-rc5 |
|
#
26ee8aff |
| 29-Sep-2020 |
Vedant Kumar <vsk@apple.com> |
[CodeExtractor] Don't create bitcasts when inserting lifetime markers (NFCI)
Lifetime marker intrinsics support any pointer type, so CodeExtractor does not need to bitcast to `i8*` in order to use t
[CodeExtractor] Don't create bitcasts when inserting lifetime markers (NFCI)
Lifetime marker intrinsics support any pointer type, so CodeExtractor does not need to bitcast to `i8*` in order to use these markers.
show more ...
|
Revision tags: llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3 |
|
#
6913812a |
| 20-Sep-2020 |
Fangrui Song <i@maskray.me> |
Fix some clang-tidy bugprone-argument-comment issues
|
Revision tags: llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
#
5e999cbe |
| 05-Jun-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
IR: Define byref parameter attribute
This allows tracking the in-memory type of a pointer argument to a function for ABI purposes. This is essentially a stripped down version of byval to remove some
IR: Define byref parameter attribute
This allows tracking the in-memory type of a pointer argument to a function for ABI purposes. This is essentially a stripped down version of byval to remove some of the stack-copy implications in its definition.
This includes the base IR changes, and some tests for places where it should be treated similarly to byval. Codegen support will be in a future patch.
My original attempt at solving some of these problems was to repurpose byval with a different address space from the stack. However, it is technically permitted for the callee to introduce a write to the argument, although nothing does this in reality. There is also talk of removing and replacing the byval attribute, so a new attribute would need to take its place anyway.
This is intended avoid some optimization issues with the current handling of aggregate arguments, as well as fixes inflexibilty in how frontends can specify the kernel ABI. The most honest representation of the amdgpu_kernel convention is to expose all kernel arguments as loads from constant memory. Today, these are raw, SSA Argument values and codegen is responsible for turning these into loads.
Background:
There currently isn't a satisfactory way to represent how arguments for the amdgpu_kernel calling convention are passed. In reality, arguments are passed in a single, flat, constant memory buffer implicitly passed to the function. It is also illegal to call this function in the IR, and this is only ever invoked by a driver of some kind.
It does not make sense to have a stack passed parameter in this context as is implied by byval. It is never valid to write to the kernel arguments, as this would corrupt the inputs seen by other dispatches of the kernel. These argumets are also not in the same address space as the stack, so a copy is needed to an alloca. From a source C-like language, the kernel parameters are invisible. Semantically, a copy is always required from the constant argument memory to a mutable variable.
The current clang calling convention lowering emits raw values, including aggregates into the function argument list, since using byval would not make sense. This has some unfortunate consequences for the optimizer. In the aggregate case, we end up with an aggregate store to alloca, which both SROA and instcombine turn into a store of each aggregate field. The optimizer never pieces this back together to see that this is really just a copy from constant memory, so we end up stuck with expensive stack usage.
This also means the backend dictates the alignment of arguments, and arbitrarily picks the LLVM IR ABI type alignment. By allowing an explicit alignment, frontends can make better decisions. For example, there's real no advantage to an aligment higher than 4, so a frontend could choose to compact the argument layout. Similarly, there is a high penalty to using an alignment lower than 4, so a frontend could opt into more padding for small arguments.
Another design consideration is when it is appropriate to expose the fact that these arguments are all really passed in adjacent memory. Currently we have a late IR optimization pass in codegen to rewrite the kernel argument values into explicit loads to enable vectorization. In most programs, unrelated argument loads can be merged together. However, exposing this property directly from the frontend has some disadvantages. We still need a way to track the original argument sizes and alignments to report to the driver. I find using some side-channel, metadata mechanism to track this unappealing. If the kernel arguments were exposed as a single buffer to begin with, alias analysis would be unaware that the padding bits betewen arguments are meaningless. Another family of problems is there are still some gaps in replacing all of the available parameter attributes with metadata equivalents once lowered to loads.
The immediate plan is to start using this new attribute to handle all aggregate argumets for kernels. Long term, it makes sense to migrate all kernel arguments, including scalars, to be passed indirectly in the same manner.
Additional context is in D79744.
show more ...
|
#
ff7900d5 |
| 08-Jul-2020 |
Gui Andrade <guiand@google.com> |
[LLVM] Accept `noundef` attribute in function definitions/calls
The `noundef` attribute indicates an argument or return value which may never have an undef value representation.
This patch allows L
[LLVM] Accept `noundef` attribute in function definitions/calls
The `noundef` attribute indicates an argument or return value which may never have an undef value representation.
This patch allows LLVM to parse the attribute.
Differential Revision: https://reviews.llvm.org/D83412
show more ...
|
#
81384874 |
| 21-May-2020 |
Yevgeny Rouban <yrouban@azul.com> |
[BrachProbablityInfo] Set edge probabilities at once and fix calcMetadataWeights()
Hide the method that allows setting probability for particular edge and introduce a public method that sets probabi
[BrachProbablityInfo] Set edge probabilities at once and fix calcMetadataWeights()
Hide the method that allows setting probability for particular edge and introduce a public method that sets probabilities for all outgoing edges at once. Setting individual edge probability is error prone. More over it is difficult to check that the total probability is 1.0 because there is no easy way to know when the user finished setting all the probabilities.
Related bug is fixed in BranchProbabilityInfo::calcMetadataWeights(). Changing unreachable branch probabilities to raw(1) and distributing the rest (oldProbability - raw(1)) over the reachable branches could introduce total probability inaccuracy bigger than 1/numOfBranches.
Reviewers: yamauchi, ebrevnov Tags: #llvm Differential Revision: https://reviews.llvm.org/D79396
show more ...
|
Revision tags: llvmorg-10.0.1-rc1 |
|
#
11aa3707 |
| 14-May-2020 |
Eli Friedman <efriedma@quicinc.com> |
StoreInst should store Align, not MaybeAlign
This is D77454, except for stores. All the infrastructure work was done for loads, so the remaining changes necessary are relatively small.
Differentia
StoreInst should store Align, not MaybeAlign
This is D77454, except for stores. All the infrastructure work was done for loads, so the remaining changes necessary are relatively small.
Differential Revision: https://reviews.llvm.org/D79968
show more ...
|
#
f89f7da9 |
| 25-Apr-2020 |
Nikita Popov <nikita.ppv@gmail.com> |
[IR] Convert null-pointer-is-valid into an enum attribute
The "null-pointer-is-valid" attribute needs to be checked by many pointer-related combines. To make the check more efficient, convert it fro
[IR] Convert null-pointer-is-valid into an enum attribute
The "null-pointer-is-valid" attribute needs to be checked by many pointer-related combines. To make the check more efficient, convert it from a string into an enum attribute.
In the future, this attribute may be replaced with data layout properties.
Differential Revision: https://reviews.llvm.org/D78862
show more ...
|
#
1370757d |
| 13-May-2020 |
Reid Kleckner <rnk@google.com> |
Revert "[BrachProbablityInfo] Set edge probabilities at once. NFC."
This reverts commit eef95f2746c3347b8dad19091ffb82a88d73acd3.
The new assertion about branch propability sums does not hold.
|
#
eef95f27 |
| 13-May-2020 |
Yevgeny Rouban <yrouban@azul.com> |
[BrachProbablityInfo] Set edge probabilities at once. NFC.
Hide the method that allows setting probability for particular edge and introduce a public method that sets probabilities for all outgoing
[BrachProbablityInfo] Set edge probabilities at once. NFC.
Hide the method that allows setting probability for particular edge and introduce a public method that sets probabilities for all outgoing edges at once. Setting individual edge probability is error prone. More over it is difficult to check that the total probability is 1.0 because there is no easy way to know when the user finished setting all the probabilities.
Reviewers: yamauchi, ebrevnov Tags: #llvm Differential Revision: https://reviews.llvm.org/D79396
show more ...
|
#
cb22ab74 |
| 12-May-2020 |
Zequan Wu <zequanwu@google.com> |
Add nomerge function attribute to supress tail merge optimization in simplifyCFG
We want to add a way to avoid merging identical calls so as to keep the separate debug-information for those calls. T
Add nomerge function attribute to supress tail merge optimization in simplifyCFG
We want to add a way to avoid merging identical calls so as to keep the separate debug-information for those calls. There is also an asan usecase where having this attribute would be beneficial to avoid alternative work-arounds.
Here is the link to the feature request: https://bugs.llvm.org/show_bug.cgi?id=42783.
`nomerge` is different from `noline`. `noinline` prevents function from inlining at callsites, but `nomerge` prevents multiple identical calls from being merged into one.
This patch adds `nomerge` to disable the optimization in IR level. A followup patch will be needed to let backend understands `nomerge` and avoid tail merge at backend.
Reviewed By: asbirlea, rnk
Differential Revision: https://reviews.llvm.org/D78659
show more ...
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3 |
|
#
3b0450ac |
| 14-Feb-2020 |
Arthur Eubanks <aeubanks@google.com> |
Add IR constructs for preallocated (inalloca replacement)
Add llvm.call.preallocated.{setup,arg} instrinsics. Add "preallocated" operand bundle which takes a token produced by llvm.call.preallocated
Add IR constructs for preallocated (inalloca replacement)
Add llvm.call.preallocated.{setup,arg} instrinsics. Add "preallocated" operand bundle which takes a token produced by llvm.call.preallocated.setup. Add "preallocated" parameter attribute, which is like byval but without the copy.
Verifier changes for these IR constructs.
See https://github.com/rnk/llvm-project/blob/call-setup-docs/llvm/docs/CallSetup.md
Subscribers: hiraditya, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74651
show more ...
|
#
64249f17 |
| 25-Apr-2020 |
Ehud Katz <ehudkatz@gmail.com> |
[CodeExtractor] Fix extraction of a value used only by intrinsics outside of region
We should only skip `lifetime` and `dbg` intrinsics when searching for users. Other intrinsics are legit users tha
[CodeExtractor] Fix extraction of a value used only by intrinsics outside of region
We should only skip `lifetime` and `dbg` intrinsics when searching for users. Other intrinsics are legit users that can't be ignored.
Without this fix, the testcase would result in an invalid IR. `memcpy` will have a reference to the, now, external value (local to the extracted loop function).
Fix PR42194
Differential Revision: https://reviews.llvm.org/D78749
show more ...
|
#
565b56a7 |
| 07-Apr-2020 |
Eli Friedman <efriedma@quicinc.com> |
[NFC] Clean up uses of LoadInst constructor.
|
#
c5ec8890 |
| 03-Mar-2020 |
Tyker <tyker1@outlook.com> |
[NFC] Try fix ubsan buildbot after 876d13378931bee3dcefafff8729c40d5457ff31
|
Revision tags: llvmorg-10.0.0-rc2 |
|
#
70cac41a |
| 13-Feb-2020 |
Johannes Doerfert <johannes@jdoerfert.de> |
Reapply "[OpenMP][IRBuilder] Perform finalization (incl. outlining) late"
Reapply 8a56d64d7620b3764f10f03f3a1e307fcdd72c2f with minor fixes.
The problem was that cancellation can cause new edges to
Reapply "[OpenMP][IRBuilder] Perform finalization (incl. outlining) late"
Reapply 8a56d64d7620b3764f10f03f3a1e307fcdd72c2f with minor fixes.
The problem was that cancellation can cause new edges to the parallel region exit block which is not outlined. The CodeExtractor will encode the information which "exit" was taken as a return value. The fix is to ensure we do not return any value from the outlined function, to prevent control to value conversion we ensure a single exit block for the outlined region.
This reverts commit 3aac953afa34885a72df96f2b703b65f85cbb149.
show more ...
|
#
3aac953a |
| 13-Feb-2020 |
Johannes Doerfert <johannes@jdoerfert.de> |
Revert "[OpenMP][IRBuilder] Perform finalization (incl. outlining) late"
This reverts commit 8a56d64d7620b3764f10f03f3a1e307fcdd72c2f.
Will be recommitted once the clang test problem is addressed.
|
#
8a56d64d |
| 10-Feb-2020 |
Johannes Doerfert <johannes@jdoerfert.de> |
[OpenMP][IRBuilder] Perform finalization (incl. outlining) late
In order to fix PR44560 and to prepare for loop transformations we now finalize a function late, which will also do the outlining late
[OpenMP][IRBuilder] Perform finalization (incl. outlining) late
In order to fix PR44560 and to prepare for loop transformations we now finalize a function late, which will also do the outlining late. The logic is as before but the actual outlining step happens now after the function was fully constructed. Once we have loop transformations we can apply them in the finalize step before the outlining.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D74372
show more ...
|
Revision tags: llvmorg-10.0.0-rc1 |
|
#
8359511c |
| 29-Jan-2020 |
Vedant Kumar <vsk@apple.com> |
[CodeExtractor] Remove stale llvm.assume calls from extracted region
During extraction, stale llvm.assume handles may be retained in the original function. The setup is:
1) CodeExtractor unregister
[CodeExtractor] Remove stale llvm.assume calls from extracted region
During extraction, stale llvm.assume handles may be retained in the original function. The setup is:
1) CodeExtractor unregisters assumptions in the blocks that are to be extracted.
2) Extraction happens. There are now two functions: f1 and f1.extracted.
3) Leftover assumptions in f1 (/not/ removed as they were not in the set of blocks to be extracted) now have affected-value llvm.assume handles in f1.extracted.
When assumptions for a value used in f1 are looked up, ValueTracking can assert as some of the handles are in the wrong function. To fix this, simply erase the llvm.assume calls in the extracted function.
Alternatives include flushing the assumption cache in the original function, or walking all values used in the original function to prune stale affected-value handles. Both seem more expensive.
Testing: check-llvm, LNT run with -mllvm -hot-cold-split enabled
rdar://58460728
show more ...
|