Revision tags: llvmorg-21-init, llvmorg-19.1.7 |
|
#
a487b792 |
| 17-Dec-2024 |
Florian Hahn <flo@fhahn.com> |
[TySan] Add initial Type Sanitizer (LLVM) (#76259)
This patch introduces the LLVM components of a type sanitizer: a
sanitizer for type-based aliasing violations.
It is based on Hal Finkel's http
[TySan] Add initial Type Sanitizer (LLVM) (#76259)
This patch introduces the LLVM components of a type sanitizer: a
sanitizer for type-based aliasing violations.
It is based on Hal Finkel's https://reviews.llvm.org/D32198.
C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit
these given TBAA metadata added by Clang. Roughly, a pointer of given
type cannot be used to access an object of a different type (with, of
course, certain exceptions). Unfortunately, there's a lot of code in the
wild that violates these rules (e.g. for type punning), and such code
often must be built with -fno-strict-aliasing. Performance is often
sacrificed as a result. Part of the problem is the difficulty of finding
TBAA violations. Hopefully, this sanitizer will help.
For each TBAA type-access descriptor, encoded in LLVM's IR using
metadata, the corresponding instrumentation pass generates descriptor
tables. Thus, for each type (and access descriptor), we have a unique
pointer representation. Excepting anonymous-namespace types, these
tables are comdat, so the pointer values should be unique across the
program. The descriptors refer to other descriptors to form a type
aliasing tree (just like LLVM's TBAA metadata does). The instrumentation
handles the "fast path" (where the types match exactly and no
partial-overlaps are detected), and defers to the runtime to handle all
of the more-complicated cases. The runtime, of course, is also
responsible for reporting errors when those are detected.
The runtime uses essentially the same shadow memory region as tsan, and
we use 8 bytes of shadow memory, the size of the pointer to the type
descriptor, for every byte of accessed data in the program. The value 0
is used to represent an unknown type. The value -1 is used to represent
an interior byte (a byte that is part of a type, but not the first
byte). The instrumentation first checks for an exact match between the
type of the current access and the type for that address recorded in the
shadow memory. If it matches, it then checks the shadow for the
remainder of the bytes in the type to make sure that they're all -1. If
not, we call the runtime. If the exact match fails, we next check if the
value is 0 (i.e. unknown). If it is, then we check the shadow for the
remainder of the byes in the type (to make sure they're all 0). If
they're not, we call the runtime. We then set the shadow for the access
address and set the shadow for the remaining bytes in the type to -1
(i.e. marking them as interior bytes). If the type indicated by the
shadow memory for the access address is neither an exact match nor 0, we
call the runtime.
The instrumentation pass inserts calls to the memset intrinsic to set
the memory updated by memset, memcpy, and memmove, as well as
allocas/byval (and for lifetime.start/end) to reset the shadow memory to
reflect that the type is now unknown. The runtime intercepts memset,
memcpy, etc. to perform the same function for the library calls.
The runtime essentially repeats these checks, but uses the full TBAA
algorithm, just as the compiler does, to determine when two types are
permitted to alias. In a situation where access overlap has occurred and
aliasing is not permitted, an error is generated.
Clang's TBAA representation currently has a problem representing unions,
as demonstrated by the one XFAIL'd test in the runtime patch. We'll
update the TBAA representation to fix this, and at the same time, update
the sanitizer.
When the sanitizer is active, we disable actually using the TBAA
metadata for AA. This way we're less likely to use TBAA to remove memory
accesses that we'd like to verify.
As a note, this implementation does not use the compressed shadow-memory
scheme discussed previously
(http://lists.llvm.org/pipermail/llvm-dev/2017-April/111766.html). That
scheme would not handle the struct-path (i.e. structure offset)
information that our TBAA represents. I expect we'll want to further
work on compressing the shadow-memory representation, but I think it
makes sense to do that as follow-up work.
It goes together with the corresponding clang changes
(https://github.com/llvm/llvm-project/pull/76260) and compiler-rt
changes (https://github.com/llvm/llvm-project/pull/76261)
PR: https://github.com/llvm/llvm-project/pull/76259
show more ...
|
Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0 |
|
#
222d3b03 |
| 06-Sep-2024 |
Ivan Kosarev <ivan.kosarev@amd.com> |
[TBAA] Fix the case where a subobject gets accessed at a non-zero offset. (#101485)
|
Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2 |
|
#
7df9da7d |
| 04-Aug-2024 |
Kazu Hirata <kazu@google.com> |
[llvm] Construct SmallVector with ArrayRef (NFC) (#101872)
|
Revision tags: llvmorg-19.1.0-rc1 |
|
#
6ce7b1f8 |
| 25-Jul-2024 |
Antonio Frighetto <me@antoniofrighetto.com> |
[TBAA] Do not rewrite TBAA if exists, always null out `!tbaa.struct`
Retrieve `!tbaa` metadata via `!tbaa.struct` in `adjustForAccess` unless it already exists, as struct-path aware `MDNodes` emitte
[TBAA] Do not rewrite TBAA if exists, always null out `!tbaa.struct`
Retrieve `!tbaa` metadata via `!tbaa.struct` in `adjustForAccess` unless it already exists, as struct-path aware `MDNodes` emitted via `new-struct-path-tbaa` may be leveraged. As `!tbaa.struct` carries memcpy padding semantics among struct fields and `!tbaa` is already meant to aid to alias semantics, it should be possible to zero out `!tbaa.struct` once the memcpy has been simplified. `SROA/tbaa-struct.ll` test has gone out of scope, as `!tbaa` has already replaced `!tbaa.struct` in SROA.
Fixes: https://github.com/llvm/llvm-project/issues/95661.
show more ...
|
Revision tags: llvmorg-20-init |
|
#
4169338e |
| 28-Jun-2024 |
Nikita Popov <npopov@redhat.com> |
[IR] Don't include Module.h in Analysis.h (NFC) (#97023)
Replace it with a forward declaration instead. Analysis.h is pulled in
by all passes, but not all passes need to access the module.
|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3 |
|
#
dc85719d |
| 16-Feb-2024 |
Florian Hahn <flo@fhahn.com> |
[TBAA] Use !tbaa for first accessed field if it is an exact match in offset and size. (#81313)
Motivation for this and follow-on patches is to improve codegen for
libc++, where using memcpy limits
[TBAA] Use !tbaa for first accessed field if it is an exact match in offset and size. (#81313)
Motivation for this and follow-on patches is to improve codegen for
libc++, where using memcpy limits optimizations, like vectorization for
code iteration over std::vector<std::complex<float>>:
https://godbolt.org/z/f3vqYos3c
Depends on https://github.com/llvm/llvm-project/pull/81289.
PR: https://github.com/llvm/llvm-project/pull/81313
show more ...
|
#
53c0e809 |
| 16-Feb-2024 |
Florian Hahn <flo@fhahn.com> |
[SROA] Use !tbaa instead of !tbaa.struct if op matches field. (#81289)
If a split memory access introduced by SROA accesses precisely a single
field of the original operation's !tbaa.struct, use th
[SROA] Use !tbaa instead of !tbaa.struct if op matches field. (#81289)
If a split memory access introduced by SROA accesses precisely a single
field of the original operation's !tbaa.struct, use the !tbaa tag for
the accessed field directly instead of the full !tbaa.struct.
InstCombine already had a similar logic.
Motivation for this and follow-on patches is to improve codegen for
libc++, where using memcpy limits optimizations, like vectorization for
code iteration over std::vector<std::complex<float>>:
https://godbolt.org/z/f3vqYos3c
Depends on https://github.com/llvm/llvm-project/pull/81285.
show more ...
|
#
3b6e2504 |
| 16-Feb-2024 |
Florian Hahn <flo@fhahn.com> |
[TBAA] Only clear TBAAStruct if field can be extracted. (#81285)
Retain TBAAStruct if we fail to match the access to a single field. All
users at the moment use this when using the full size of the
[TBAA] Only clear TBAAStruct if field can be extracted. (#81285)
Retain TBAAStruct if we fail to match the access to a single field. All
users at the moment use this when using the full size of the original
access. SROA also retains the original TBAAStruct when accessing parts
at offset 0.
Motivation for this and follow-on patches is to improve codegen for
libc++, where using memcpy limits optimizations, like vectorization for
code iteration over std::vector<std::complex<float>>:
https://godbolt.org/z/f3vqYos3c
Depends on https://github.com/llvm/llvm-project/pull/81284
show more ...
|
#
c6098461 |
| 12-Feb-2024 |
Florian Hahn <flo@fhahn.com> |
[TBAA] Extract logic to use TBAA tag for field of !tbaa.struct (NFC). (#81284)
|
Revision tags: llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3 |
|
#
d020fa2b |
| 24-Apr-2023 |
David Goldblatt <davidgoldblatt@fb.com> |
[AA] Skip the layer of indirection in returning conservative results.
Historically, AA implementations chained to a following implementation to answer recursive queries. This is no longer the case,
[AA] Skip the layer of indirection in returning conservative results.
Historically, AA implementations chained to a following implementation to answer recursive queries. This is no longer the case, but the legacy lives on in a confusing phrasing of the return-a-conservative-value paths. Let's just return "don't know" directly, where appropriate; the current two-step way is confusing.
Differential Revision: https://reviews.llvm.org/D149100
show more ...
|
Revision tags: llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
02988fce |
| 16-Dec-2022 |
David Goldblatt <davidgoldblatt@fb.com> |
[AA] Allow for flow-sensitive analyses.
All current analyses ignore the context. We make the argument mandatory for analyses, but optional for the query interface.
Reviewed By: nikic
Differential
[AA] Allow for flow-sensitive analyses.
All current analyses ignore the context. We make the argument mandatory for analyses, but optional for the query interface.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D136512
show more ...
|
#
11138e5c |
| 09-Dec-2022 |
Nikita Popov <npopov@redhat.com> |
[TBAA] Avoid duplicate set lookup (NFC)
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4 |
|
#
01859da8 |
| 24-Oct-2022 |
Patrick Walton <pcwalton@fb.com> |
[AliasAnalysis] Introduce getModRefInfoMask() as a generalization of pointsToConstantMemory().
The pointsToConstantMemory() method returns true only if the memory pointed to by the memory location i
[AliasAnalysis] Introduce getModRefInfoMask() as a generalization of pointsToConstantMemory().
The pointsToConstantMemory() method returns true only if the memory pointed to by the memory location is globally invariant. However, the LLVM memory model also has the semantic notion of *locally-invariant*: memory that is known to be invariant for the life of the SSA value representing that pointer. The most common example of this is a pointer argument that is marked readonly noalias, which the Rust compiler frequently emits.
It'd be desirable for LLVM to treat locally-invariant memory the same way as globally-invariant memory when it's safe to do so. This patch implements that, by introducing the concept of a *ModRefInfo mask*. A ModRefInfo mask is a bound on the Mod/Ref behavior of an instruction that writes to a memory location, based on the knowledge that the memory is globally-constant memory (in which case the mask is NoModRef) or locally-constant memory (in which case the mask is Ref). ModRefInfo values for an instruction can be combined with the ModRefInfo mask by simply using the & operator. Where appropriate, this patch has modified uses of pointsToConstantMemory() to instead examine the mask.
The most notable optimization change I noticed with this patch is that now redundant loads from readonly noalias pointers can be eliminated across calls, even when the pointer is captured. Internally, before this patch, AliasAnalysis was assigning Ref to reads from constant memory; now AA can assign NoModRef, which is a tighter bound.
Differential Revision: https://reviews.llvm.org/D136659
show more ...
|
#
747f27d9 |
| 19-Oct-2022 |
Nikita Popov <npopov@redhat.com> |
[AA] Rename getModRefBehavior() to getMemoryEffects() (NFC)
Follow up on D135962, renaming the method name to match the new type name.
|
#
1a9d9823 |
| 19-Oct-2022 |
Nikita Popov <npopov@redhat.com> |
[AA] Rename uses of FunctionModRefBehavior (NFC)
Followup to D135962 to rename remaining uses of FunctionModRefBehavior to MemoryEffects. Does not touch API names yet, but also updates variables nam
[AA] Rename uses of FunctionModRefBehavior (NFC)
Followup to D135962 to rename remaining uses of FunctionModRefBehavior to MemoryEffects. Does not touch API names yet, but also updates variables names FMRB/MRB to ME, to match the new type name.
show more ...
|
Revision tags: llvmorg-15.0.3 |
|
#
03f9d0ff |
| 13-Oct-2022 |
Nikita Popov <npopov@redhat.com> |
[TBAA] Model call accessing immutable type as readnone
Accesses to constant memory are not observable and should be reported as readnone, not readonly. This is consistent with what we do for normal
[TBAA] Model call accessing immutable type as readnone
Accesses to constant memory are not observable and should be reported as readnone, not readonly. This is consistent with what we do for normal (non-call) instructions: For those, the TBAA metadata will result in pointsToConstantMemory() returning true, which will then result in a NoModRef result, not a Ref result.
Differential Revision: https://reviews.llvm.org/D135864
show more ...
|
Revision tags: working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2 |
|
#
6053b37e |
| 15-Jan-2021 |
Nikita Popov <nikita.ppv@gmail.com> |
[AA] Thread AAQI through getModRefBehavior() (NFC)
This is in preparation for D94363, as we will need AAQI to perform the recursive call to the function variant.
|
#
b1cd393f |
| 28-Jul-2022 |
Nikita Popov <npopov@redhat.com> |
[AA] Tracking per-location ModRef info in FunctionModRefBehavior (NFCI)
Currently, FunctionModRefBehavior tracks whether the function reads or writes memory (ModRefInfo) and which locations it can a
[AA] Tracking per-location ModRef info in FunctionModRefBehavior (NFCI)
Currently, FunctionModRefBehavior tracks whether the function reads or writes memory (ModRefInfo) and which locations it can access (argmem, inaccessiblemem and other). This patch changes it to track ModRef information per-location instead.
To give two examples of why this is useful:
* D117095 highlights a weakness of ModRef modelling in the presence of operand bundles. For a memcpy call with deopt operand bundle, we want to say that it can read any memory, but only write argument memory. This would allow them to be treated like any other calls. However, we currently can't express this and have to say that it can read or write any memory. * D127383 would ideally be modelled as a separate threadid location, where threadid Refs outside pre-split coroutines can be ignored (like other accesses to constant memory). The current representation does not allow modelling this precisely.
The patch as implemented is intended to be NFC, but there are some obvious opportunities for improvements and simplification. To fully capitalize on this we would also want to change the way we represent memory attributes on functions, but that's a larger change, and I think it makes sense to separate out the FunctionModRefBehavior refactoring.
Differential Revision: https://reviews.llvm.org/D130896
show more ...
|
#
1cfbbba1 |
| 14-Sep-2022 |
Nikita Popov <npopov@redhat.com> |
[AA] Remove unnecessary intersections from getModRefBehavior() (NFC)
Intersection with other providers is performed by AAResults. Doing this here is both pointless and confusing.
|
#
d71128d9 |
| 14-Jul-2022 |
Dawid Jurczak <dawid_jurek@vp.pl> |
[NFC][Metadata] Change MDNode::operands()'s return type from op_range to ArrayRef<MDOperand>
This patch is https://reviews.llvm.org/D129468 follow-up and address one of comment coming from that revi
[NFC][Metadata] Change MDNode::operands()'s return type from op_range to ArrayRef<MDOperand>
This patch is https://reviews.llvm.org/D129468 follow-up and address one of comment coming from that review: https://reviews.llvm.org/D129468#3643295
Differential Revision: https://reviews.llvm.org/D129565
show more ...
|
#
165240fe |
| 12-Jul-2022 |
Dawid Jurczak <dawid_jurek@vp.pl> |
[NFC] Fix compile time regression seen on some benchmarks after a630ea3003 commit
The goal of this change is fixing most of compile time slowdown seen after a630ea3003 commit on lencod and sqlite3 b
[NFC] Fix compile time regression seen on some benchmarks after a630ea3003 commit
The goal of this change is fixing most of compile time slowdown seen after a630ea3003 commit on lencod and sqlite3 benchmarks. There are 3 improvements included in this patch:
1. In getNumOperands when possible get value directly from SmallNumOps. 2. Inline getLargePtr by moving its definition to header. 3. In TBAAStructTypeNode::getField get all operands once instead taking operands in loop one after one.
Differential Revision: https://reviews.llvm.org/D129468
show more ...
|
#
71c3a551 |
| 28-Feb-2022 |
serge-sans-paille <sguelton@redhat.com> |
Cleanup includes: LLVMAnalysis
Number of lines output by preprocessor: before: 1065940348 after: 1065307662
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Diff
Cleanup includes: LLVMAnalysis
Number of lines output by preprocessor: before: 1065940348 after: 1065307662
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120659
show more ...
|
#
8cb9c736 |
| 17-Oct-2021 |
William S. Moses <gh@wsmoses.com> |
[LoopIdiom] Keep TBAA when creating memcpy/memmove
When upgrading a loop of load/store to a memcpy, the existing pass does not keep existing aliasing information. This patch allows existing aliasing
[LoopIdiom] Keep TBAA when creating memcpy/memmove
When upgrading a loop of load/store to a memcpy, the existing pass does not keep existing aliasing information. This patch allows existing aliasing information to be kept.
Reviewed By: jeroen.dobbelaere
Differential Revision: https://reviews.llvm.org/D108221
show more ...
|
#
2d1ffad0 |
| 21-Sep-2021 |
Michael Liao <michael.hliao@gmail.com> |
[IR] Re-group AAMDNodes relevant interfaces. NFC.
|
#
5fb3ae52 |
| 13-May-2021 |
Michael Liao <michael.hliao@gmail.com> |
[SelectionDAG] Re-calculate scoped AA metadata when merging stores.
Reviewed By: jeroen.dobbelaere
Differential Revision: https://reviews.llvm.org/D102821
|