#
1b4d2491 |
| 27-Jul-2020 |
Guillaume Chatelet <gchatelet@google.com> |
[NFC] Replace ".size() < 1" with ".empty()"
|
#
d9bbe859 |
| 27-Jul-2020 |
Guillaume Chatelet <gchatelet@google.com> |
[Alignment][NFC] Update Bitcodewriter to use Align
Differential Revision: https://reviews.llvm.org/D83533
|
#
ac375c2f |
| 23-Jul-2020 |
Steven Wu <stevenwu@apple.com> |
[Bitcode] Avoid duplicating linker option when upgrading
Summary: The upgrading path from old ModuleFlag based linker options to the new NamedMetadata based linker option in in materializeMetadata()
[Bitcode] Avoid duplicating linker option when upgrading
Summary: The upgrading path from old ModuleFlag based linker options to the new NamedMetadata based linker option in in materializeMetadata() which gets called once for the module and once for every GV. The linker options are getting dup'ed every time and it can create massive amount of the linker options in the object file that gets created from old bitcode. Fix the problem by checking if the new option exists or not before upgrade again.
rdar://64543389
Reviewers: pcc, t.p.northover, dexonsmith, arphaman
Reviewed By: arphaman
Subscribers: hiraditya, jkorous, ributzka, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83688
show more ...
|
#
78709345 |
| 23-Jul-2020 |
Steven Wu <stevenwu@apple.com> |
[Bitcode] Drop invalid branch_weight in BitcodeReader
Summary: If bitcode reader gets an invalid branch weight, drop that from the inputs. This allows us to read the broken modules we generated befo
[Bitcode] Drop invalid branch_weight in BitcodeReader
Summary: If bitcode reader gets an invalid branch weight, drop that from the inputs. This allows us to read the broken modules we generated before the verifier was able to catch this.
rdar://64870641
Reviewers: yrouban, t.p.northover, dexonsmith, arphaman, aprantl
Reviewed By: aprantl
Subscribers: aprantl, hiraditya, jkorous, ributzka, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83699
show more ...
|
Revision tags: llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
#
5e999cbe |
| 05-Jun-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
IR: Define byref parameter attribute
This allows tracking the in-memory type of a pointer argument to a function for ABI purposes. This is essentially a stripped down version of byval to remove some
IR: Define byref parameter attribute
This allows tracking the in-memory type of a pointer argument to a function for ABI purposes. This is essentially a stripped down version of byval to remove some of the stack-copy implications in its definition.
This includes the base IR changes, and some tests for places where it should be treated similarly to byval. Codegen support will be in a future patch.
My original attempt at solving some of these problems was to repurpose byval with a different address space from the stack. However, it is technically permitted for the callee to introduce a write to the argument, although nothing does this in reality. There is also talk of removing and replacing the byval attribute, so a new attribute would need to take its place anyway.
This is intended avoid some optimization issues with the current handling of aggregate arguments, as well as fixes inflexibilty in how frontends can specify the kernel ABI. The most honest representation of the amdgpu_kernel convention is to expose all kernel arguments as loads from constant memory. Today, these are raw, SSA Argument values and codegen is responsible for turning these into loads.
Background:
There currently isn't a satisfactory way to represent how arguments for the amdgpu_kernel calling convention are passed. In reality, arguments are passed in a single, flat, constant memory buffer implicitly passed to the function. It is also illegal to call this function in the IR, and this is only ever invoked by a driver of some kind.
It does not make sense to have a stack passed parameter in this context as is implied by byval. It is never valid to write to the kernel arguments, as this would corrupt the inputs seen by other dispatches of the kernel. These argumets are also not in the same address space as the stack, so a copy is needed to an alloca. From a source C-like language, the kernel parameters are invisible. Semantically, a copy is always required from the constant argument memory to a mutable variable.
The current clang calling convention lowering emits raw values, including aggregates into the function argument list, since using byval would not make sense. This has some unfortunate consequences for the optimizer. In the aggregate case, we end up with an aggregate store to alloca, which both SROA and instcombine turn into a store of each aggregate field. The optimizer never pieces this back together to see that this is really just a copy from constant memory, so we end up stuck with expensive stack usage.
This also means the backend dictates the alignment of arguments, and arbitrarily picks the LLVM IR ABI type alignment. By allowing an explicit alignment, frontends can make better decisions. For example, there's real no advantage to an aligment higher than 4, so a frontend could choose to compact the argument layout. Similarly, there is a high penalty to using an alignment lower than 4, so a frontend could opt into more padding for small arguments.
Another design consideration is when it is appropriate to expose the fact that these arguments are all really passed in adjacent memory. Currently we have a late IR optimization pass in codegen to rewrite the kernel argument values into explicit loads to enable vectorization. In most programs, unrelated argument loads can be merged together. However, exposing this property directly from the frontend has some disadvantages. We still need a way to track the original argument sizes and alignments to report to the driver. I find using some side-channel, metadata mechanism to track this unappealing. If the kernel arguments were exposed as a single buffer to begin with, alias analysis would be unaware that the padding bits betewen arguments are meaningless. Another family of problems is there are still some gaps in replacing all of the available parameter attributes with metadata equivalents once lowered to loads.
The immediate plan is to start using this new attribute to handle all aggregate argumets for kernels. Long term, it makes sense to migrate all kernel arguments, including scalars, to be passed indirectly in the same manner.
Additional context is in D79744.
show more ...
|
#
cc28058c |
| 10-Jul-2020 |
Eric Christopher <echristo@gmail.com> |
Temporarily revert "[NFC] Separate bitcode reading for FUNC_CODE_INST_CMPXCHG(_OLD)" as it wasn't NFC and is causing issues with thinlto bitcode reading.
I've followed up offline with reproduction i
Temporarily revert "[NFC] Separate bitcode reading for FUNC_CODE_INST_CMPXCHG(_OLD)" as it wasn't NFC and is causing issues with thinlto bitcode reading.
I've followed up offline with reproduction instructions and testcases.
This reverts commit 30582457b47004dec8a78144abc919a13ccbd08c.
show more ...
|
#
30582457 |
| 10-Jul-2020 |
Guillaume Chatelet <gchatelet@google.com> |
[NFC] Separate bitcode reading for FUNC_CODE_INST_CMPXCHG(_OLD)
This is preparatory work to unable storing alignment for AtomicCmpXchgInst. See D83136 for context and bug: https://bugs.llvm.org/show
[NFC] Separate bitcode reading for FUNC_CODE_INST_CMPXCHG(_OLD)
This is preparatory work to unable storing alignment for AtomicCmpXchgInst. See D83136 for context and bug: https://bugs.llvm.org/show_bug.cgi?id=27168
Differential Revision: https://reviews.llvm.org/D83375
show more ...
|
#
ff7900d5 |
| 08-Jul-2020 |
Gui Andrade <guiand@google.com> |
[LLVM] Accept `noundef` attribute in function definitions/calls
The `noundef` attribute indicates an argument or return value which may never have an undef value representation.
This patch allows L
[LLVM] Accept `noundef` attribute in function definitions/calls
The `noundef` attribute indicates an argument or return value which may never have an undef value representation.
This patch allows LLVM to parse the attribute.
Differential Revision: https://reviews.llvm.org/D83412
show more ...
|
#
74c72375 |
| 07-Jul-2020 |
Guillaume Chatelet <gchatelet@google.com> |
[NFC] Adding the align attribute on Atomic{CmpXchg|RMW}Inst
This is the first step to add support for the align attribute to AtomicRMWInst and AtomicCmpXchgInst. Next step is to add support in IRBui
[NFC] Adding the align attribute on Atomic{CmpXchg|RMW}Inst
This is the first step to add support for the align attribute to AtomicRMWInst and AtomicCmpXchgInst. Next step is to add support in IRBuilder and BitcodeReader. Bug: https://bugs.llvm.org/show_bug.cgi?id=27168
Differential Revision: https://reviews.llvm.org/D83136
show more ...
|
#
0ec712af |
| 20-May-2020 |
Eli Friedman <efriedma@quicinc.com> |
[BitcodeReader] Fix DelayedShuffle handling for ConstantExpr shuffles.
The indexing was messed up, so the result was completely broken.
Shuffle constant exprs are rare in practice; without vscale t
[BitcodeReader] Fix DelayedShuffle handling for ConstantExpr shuffles.
The indexing was messed up, so the result was completely broken.
Shuffle constant exprs are rare in practice; without vscale types, constant folding generally elminates them. So sort of hard to trip over.
Fixes regression from D72467.
(Recommitting after fix for memory leak.)
Differential Revision: https://reviews.llvm.org/D80330
show more ...
|
#
4abf0243 |
| 25-Jun-2020 |
Mehdi Amini <joker.eph@gmail.com> |
Remove references to the 4.0 release as a major breaking (NFC)
This is cleaning up comments (mostly in the bitcode handling) about removing some backward compatibility aspect in the 4.0 release. His
Remove references to the 4.0 release as a major breaking (NFC)
This is cleaning up comments (mostly in the bitcode handling) about removing some backward compatibility aspect in the 4.0 release. Historically, "4.0" was used during the development of the 3.x versions as "this future major breaking change version". At the time the major number was used to indicate the compatibility. When we reached 3.9 we decided to change the numbering, instead of going to 3.10 we went to 4.0 but after changing the meaning of the major number to not mean anything anymore with respect to bitcode backward compatibility.
The current policy (https://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility) indicates only now:
The current LLVM version supports loading any bitcode since version 3.0.
Differential Revision: https://reviews.llvm.org/D82514
show more ...
|
#
10045cbe |
| 24-Jun-2020 |
Mitch Phillips <31459023+hctim@users.noreply.github.com> |
Revert "[BitcodeReader] Fix DelayedShuffle handling for ConstantExpr shuffles."
Patch has a memory leak bug that broke the ASan buildbots. More info available at: https://reviews.llvm.org/D80330
Th
Revert "[BitcodeReader] Fix DelayedShuffle handling for ConstantExpr shuffles."
Patch has a memory leak bug that broke the ASan buildbots. More info available at: https://reviews.llvm.org/D80330
This reverts commit b5740105d270a2d76da8812cafb63e4b799ada73.
show more ...
|
#
b5740105 |
| 20-May-2020 |
Eli Friedman <efriedma@quicinc.com> |
[BitcodeReader] Fix DelayedShuffle handling for ConstantExpr shuffles.
The indexing was messed up, so the result was completely broken.
Shuffle constant exprs are rare in practice; without vscale t
[BitcodeReader] Fix DelayedShuffle handling for ConstantExpr shuffles.
The indexing was messed up, so the result was completely broken.
Shuffle constant exprs are rare in practice; without vscale types, constant folding generally elminates them. So sort of hard to trip over.
Fixes regression from D72467.
Differential Revision: https://reviews.llvm.org/D80330
show more ...
|
Revision tags: llvmorg-10.0.1-rc1 |
|
#
07f33512 |
| 14-May-2020 |
Kevin P. Neal <kevin.neal@sas.com> |
[strictfp] Replace dangling strictfp attrs with nobuiltin
In preparation for a patch that will enforce new rules for the usage of the strictfp attribute, this patch introduces auto-upgrade behavior
[strictfp] Replace dangling strictfp attrs with nobuiltin
In preparation for a patch that will enforce new rules for the usage of the strictfp attribute, this patch introduces auto-upgrade behavior that will replace the strictfp attribute on callsites with nobuiltin if the enclosing function declaration doesn't also have the strictfp attribute.
This auto-upgrade isn't being performed on .ll files because that would prevent us from writing a test for the forthcoming verifier behavior.
Differential Revision: https://reviews.llvm.org/D70096
show more ...
|
#
4666953c |
| 01-Jun-2020 |
Vitaly Buka <vitalybuka@google.com> |
[StackSafety] Add info into function summary
Summary: This patch adds optional field into function summary, implements asm and bitcode serialization. YAML serialization is omitted and can be added l
[StackSafety] Add info into function summary
Summary: This patch adds optional field into function summary, implements asm and bitcode serialization. YAML serialization is omitted and can be added later if needed.
This patch includes this information into summary only if module contains at least one sanitize_memtag function. In a near future MTE is the user of the analysis. Later if needed we can provede more direct control on when information is included into summary.
Reviewers: eugenis
Subscribers: hiraditya, steven_wu, dexonsmith, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80908
show more ...
|
#
a7fa35a6 |
| 21-May-2020 |
Hiroshi Yamauchi <yamauchi@google.com> |
[ThinLTO] Compute the basic block count across modules.
Summary: Count the per-module number of basic blocks when the module summary is computed and sum them up during Thin LTO indexing.
This is us
[ThinLTO] Compute the basic block count across modules.
Summary: Count the per-module number of basic blocks when the module summary is computed and sum them up during Thin LTO indexing.
This is used to estimate the working set size under the partial sample PGO.
This is split off of D79831.
Reviewers: davidxl, espindola
Subscribers: emaste, inglorion, hiraditya, MaskRay, steven_wu, dexonsmith, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80403
show more ...
|
#
c476abfd |
| 21-May-2020 |
Benjamin Kramer <benny.kra@googlemail.com> |
[BitcodeReader] Simplify code. NFCI.
|
#
4f04db4b |
| 15-May-2020 |
Eli Friedman <efriedma@quicinc.com> |
AllocaInst should store Align instead of MaybeAlign.
Along the lines of D77454 and D79968. Unlike loads and stores, the default alignment is getPrefTypeAlign, to match the existing handling in vari
AllocaInst should store Align instead of MaybeAlign.
Along the lines of D77454 and D79968. Unlike loads and stores, the default alignment is getPrefTypeAlign, to match the existing handling in various places, including SelectionDAG and InstCombine.
Differential Revision: https://reviews.llvm.org/D80044
show more ...
|
#
0ec5f501 |
| 16-May-2020 |
Eli Friedman <efriedma@quicinc.com> |
Harden IR and bitcode parsers against infinite size types.
If isSized is passed a SmallPtrSet, it uses that set to catch infinitely recursive types (for example, a struct that has itself as a member
Harden IR and bitcode parsers against infinite size types.
If isSized is passed a SmallPtrSet, it uses that set to catch infinitely recursive types (for example, a struct that has itself as a member). Otherwise, it just crashes on such types.
show more ...
|
#
11aa3707 |
| 14-May-2020 |
Eli Friedman <efriedma@quicinc.com> |
StoreInst should store Align, not MaybeAlign
This is D77454, except for stores. All the infrastructure work was done for loads, so the remaining changes necessary are relatively small.
Differentia
StoreInst should store Align, not MaybeAlign
This is D77454, except for stores. All the infrastructure work was done for loads, so the remaining changes necessary are relatively small.
Differential Revision: https://reviews.llvm.org/D79968
show more ...
|
#
f89f7da9 |
| 25-Apr-2020 |
Nikita Popov <nikita.ppv@gmail.com> |
[IR] Convert null-pointer-is-valid into an enum attribute
The "null-pointer-is-valid" attribute needs to be checked by many pointer-related combines. To make the check more efficient, convert it fro
[IR] Convert null-pointer-is-valid into an enum attribute
The "null-pointer-is-valid" attribute needs to be checked by many pointer-related combines. To make the check more efficient, convert it from a string into an enum attribute.
In the future, this attribute may be replaced with data layout properties.
Differential Revision: https://reviews.llvm.org/D78862
show more ...
|
#
8c24f331 |
| 31-Mar-2020 |
Ties Stuij <ties.stuij@arm.com> |
[IR][BFloat] Add BFloat IR type
Summary: The BFloat IR type is introduced to provide support for, initially, the BFloat16 datatype introduced with the Armv8.6 architecture (optional from Armv8.2 onw
[IR][BFloat] Add BFloat IR type
Summary: The BFloat IR type is introduced to provide support for, initially, the BFloat16 datatype introduced with the Armv8.6 architecture (optional from Armv8.2 onwards). It has an 8-bit exponent and a 7-bit mantissa and behaves like an IEEE 754 floating point IR type.
This is part of a patch series upstreaming Armv8.6 features. Subsequent patches will upstream intrinsics support and C-lang support for BFloat.
Reviewers: SjoerdMeijer, rjmccall, rsmith, liutianle, RKSimon, craig.topper, jfb, LukeGeeson, sdesmalen, deadalnix, ctetreau
Subscribers: hiraditya, llvm-commits, danielkiss, arphaman, kristof.beyls, dexonsmith
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78190
show more ...
|
#
4532a508 |
| 14-May-2020 |
Eli Friedman <efriedma@quicinc.com> |
Infer alignment of unmarked loads in IR/bitcode parsing.
For IR generated by a compiler, this is really simple: you just take the datalayout from the beginning of the file, and apply it to all the I
Infer alignment of unmarked loads in IR/bitcode parsing.
For IR generated by a compiler, this is really simple: you just take the datalayout from the beginning of the file, and apply it to all the IR later in the file. For optimization testcases that don't care about the datalayout, this is also really simple: we just use the default datalayout.
The complexity here comes from the fact that some LLVM tools allow overriding the datalayout: some tools have an explicit flag for this, some tools will infer a datalayout based on the code generation target. Supporting this properly required plumbing through a bunch of new machinery: we want to allow overriding the datalayout after the datalayout is parsed from the file, but before we use any information from it. Therefore, IR/bitcode parsing now has a callback to allow tools to compute the datalayout at the appropriate time.
Not sure if I covered all the LLVM tools that want to use the callback. (clang? lli? Misc IR manipulation tools like llvm-link?). But this is at least enough for all the LLVM regression tests, and IR without a datalayout is not something frontends should generate.
This change had some sort of weird effects for certain CodeGen regression tests: if the datalayout is overridden with a datalayout with a different program or stack address space, we now parse IR based on the overridden datalayout, instead of the one written in the file (or the default one, if none is specified). This broke a few AVR tests, and one AMDGPU test.
Outside the CodeGen tests I mentioned, the test changes are all just fixing CHECK lines and moving around datalayout lines in weird places.
Differential Revision: https://reviews.llvm.org/D78403
show more ...
|
#
44ecaabc |
| 13-May-2020 |
Eli Friedman <efriedma@quicinc.com> |
[BitcodeReader] datalayout must be specified before it is queried.
This isn't really a new invariant; it effectively already existed due to existing DataLayout queries. But this makes it explicit.
[BitcodeReader] datalayout must be specified before it is queried.
This isn't really a new invariant; it effectively already existed due to existing DataLayout queries. But this makes it explicit.
This is technically not backward-compatible with the existing bitcode reader, but it's backward-compatible with the output of the bitcode writer, which is what matters in practice.
No testcase because I don't know a good way to write one: there are no existing tools that can generate a bitcode file that would trigger the error.
Split off from D78403.
Differential Revision: https://reviews.llvm.org/D79900
show more ...
|
#
cb22ab74 |
| 12-May-2020 |
Zequan Wu <zequanwu@google.com> |
Add nomerge function attribute to supress tail merge optimization in simplifyCFG
We want to add a way to avoid merging identical calls so as to keep the separate debug-information for those calls. T
Add nomerge function attribute to supress tail merge optimization in simplifyCFG
We want to add a way to avoid merging identical calls so as to keep the separate debug-information for those calls. There is also an asan usecase where having this attribute would be beneficial to avoid alternative work-arounds.
Here is the link to the feature request: https://bugs.llvm.org/show_bug.cgi?id=42783.
`nomerge` is different from `noline`. `noinline` prevents function from inlining at callsites, but `nomerge` prevents multiple identical calls from being merged into one.
This patch adds `nomerge` to disable the optimization in IR level. A followup patch will be needed to let backend understands `nomerge` and avoid tail merge at backend.
Reviewed By: asbirlea, rnk
Differential Revision: https://reviews.llvm.org/D78659
show more ...
|