#
e71bae94 |
| 24-Jul-2023 |
Jie Fu <jiefu@tencent.com> |
[clang][dataflow] Fix build failure due to -Wunused-variable in DataflowEnvironment.cpp (NFC)
/data/llvm-project/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp:125:11: error: unused variab
[clang][dataflow] Fix build failure due to -Wunused-variable in DataflowEnvironment.cpp (NFC)
/data/llvm-project/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp:125:11: error: unused variable 'StructVal2' [-Werror,-Wunused-variable] auto *StructVal2 = cast<StructValue>(&Val2); ^ 1 error generated.
show more ...
|
#
44f98d01 |
| 20-Jul-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] Eliminate duplication between `AggregateStorageLocation` and `StructValue`.
After this change, `StructValue` is just a wrapper for an `AggregateStorageLocation`. For the wider cont
[clang][dataflow] Eliminate duplication between `AggregateStorageLocation` and `StructValue`.
After this change, `StructValue` is just a wrapper for an `AggregateStorageLocation`. For the wider context, see https://discourse.llvm.org/t/70086.
## How to review
- Start by looking at the comments added / changed in Value.h, StorageLocation.h, and DataflowEnvironment.h. This will give you a good overview of the semantic changes.
- Look at the corresponding .cpp files that implement the semantic changes.
- Transfer.cpp, TypeErasedDataflowAnalysis.cpp, and RecordOps.cpp show how the core of the framework is affected by the semantic changes.
- UncheckedOptionalAccessModel.cpp shows how this complex model is affected by the changes.
- Many of the changes in the rest of the patch are mechanical in nature.
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D155446
show more ...
|
#
0f6cf555 |
| 17-Jul-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] Bugfix for `refreshStructValue()`.
In the case where the expression was not yet associated with a storage location, we created a new storage location but failed to associate it wit
[clang][dataflow] Bugfix for `refreshStructValue()`.
In the case where the expression was not yet associated with a storage location, we created a new storage location but failed to associate it with the expression.
The newly added test fails without the fix.
Reviewed By: xazax.hun
Differential Revision: https://reviews.llvm.org/D155465
show more ...
|
#
d17f455a |
| 17-Jul-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] Add `refreshStructValue()`.
Besides being a useful abstraction, this function will help insulate existing clients of the framework from upcoming changes to the API of `StructValue`
[clang][dataflow] Add `refreshStructValue()`.
Besides being a useful abstraction, this function will help insulate existing clients of the framework from upcoming changes to the API of `StructValue` and `AggregateStorageLocation`.
Depends On D155202
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D155204
show more ...
|
#
6d768548 |
| 17-Jul-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] Add `DataflowEnvironment::createObject()`.
This consolidates the code used in various places to initialize objects (usually for variables) into one central location.
It will also
[clang][dataflow] Add `DataflowEnvironment::createObject()`.
This consolidates the code used in various places to initialize objects (usually for variables) into one central location.
It will also help reduce the number of changes needed when we make the upcoming API changes to `AggregateStorageLocation` and `StructValue`.
Depends On D155074
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D155075
show more ...
|
#
6272226b |
| 13-Jul-2023 |
Sam McCall <sam.mccall@gmail.com> |
[dataflow] Remove deprecated BoolValue flow condition accessors
Use the Formula versions instead, now.
Differential Revision: https://reviews.llvm.org/D155152
|
#
7d935d08 |
| 11-Jul-2023 |
Sam McCall <sam.mccall@gmail.com> |
[dataflow] improve determinism of generated SAT system
Fixes two places where we relied on map iteration order when processing values, which leaked nondeterminism into the generated SAT formulas. Ad
[dataflow] improve determinism of generated SAT system
Fixes two places where we relied on map iteration order when processing values, which leaked nondeterminism into the generated SAT formulas. Adds a couple of tests that directly assert that the SAT system is equivalent on each run.
It's desirable that the formulas are deterministic based on the input:
- our SAT solver is naive and perfermance is sensitive to even simple semantics-preserving transformations like A|B to B|A. (e.g. it's likely to choose a different variable to split on). Timeout failures are bad, but *flaky* ones are terrible to debug. - similarly when debugging, it's important to have a consistent understanding of what e.g. "V23" means across runs.
---
Both changes in this patch were isolated from a nullability analysis of real-world code which was extremely slow, spending ages in the SAT solver at "random" points that varied on each run. I've included a reduced version of the code as a regression test.
One of the changes shows up directly as flow-condition nondeterminism with a no-op analysis, the other relied on bits of the nullability analysis but I found a synthetic example to show the problem.
Differential Revision: https://reviews.llvm.org/D154948
show more ...
|
#
b47bdcbc |
| 12-Jul-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] Include fields initialized in an `InitListExpr` in `getModeledFields()`.
Previously, we were including these fields only in the specific instance that was initialized by the `InitL
[clang][dataflow] Include fields initialized in an `InitListExpr` in `getModeledFields()`.
Previously, we were including these fields only in the specific instance that was initialized by the `InitListExpr`, but not in other instances of the same type. This is inconsistent and error-prone.
Depends On D154952
Reviewed By: xazax.hun, gribozavr2
Differential Revision: https://reviews.llvm.org/D154961
show more ...
|
#
e53da3ea |
| 10-Jul-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow][NFC] Expand a comment.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D154834
|
#
e8a1560d |
| 06-Jul-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] Various changes to handling of modeled fields.
- Rename `getReferencedFields()` to `getModeledFields()`. Move the logic that returns all object fields when doing a context-sensit
[clang][dataflow] Various changes to handling of modeled fields.
- Rename `getReferencedFields()` to `getModeledFields()`. Move the logic that returns all object fields when doing a context-sensitive analysis to here from `DataflowAnalysisContext::createStorageLocation()`. I think all callers of the previous `getReferencedFields()` should use this logic; the fact that they were not doing so looks like a bug.
- Make `getModeledFields()` public. I have an upcoming patch that will need to use this function from Transfer.cpp, and there doesn't seem to be any reason why this function should not be public.
- Use a `SmallSetVector` to get deterministic iteration order. I have a pending patch where I'm getting flaky tests because `Environment::createValueUnlessSelfReferential()` is non-deterministically populating different fields depending on the iteration order. This change fixes those flaky tests.
Reviewed By: gribozavr2
Differential Revision: https://reviews.llvm.org/D154586
show more ...
|
#
fc9821a8 |
| 05-Jul-2023 |
Sam McCall <sam.mccall@gmail.com> |
Reland "[dataflow] Add dedicated representation of boolean formulas"
This reverts commit 7a72ce98224be76d9328e65eee472381f7c8e7fe.
Test problems were due to unspecified order of function arg evalua
Reland "[dataflow] Add dedicated representation of boolean formulas"
This reverts commit 7a72ce98224be76d9328e65eee472381f7c8e7fe.
Test problems were due to unspecified order of function arg evaluation.
Reland "[dataflow] Replace most BoolValue subclasses with references to Formula (and AtomicBoolValue => Atom and BoolValue => Formula where appropriate)"
This properly frees the Value hierarchy from managing boolean formulas.
We still distinguish AtomicBoolValue; this type is used in client code. However we expect to convert such uses to BoolValue (where the distinction is not needed) or Atom (where atomic identity is intended), and then fold AtomicBoolValue into FormulaBoolValue.
We also distinguish TopBoolValue; this has distinct rules for widen/join/equivalence, and top-ness is not represented in Formula. It'd be nice to find a cleaner representation (e.g. the absence of a formula), but no immediate plans.
For now, BoolValues with the same Formula are deduplicated. This doesn't seem desirable, as Values are mutable by their creators (properties). We can probably drop this for FormulaBoolValue immediately (not in this patch, to isolate changes). For AtomicBoolValue we first need to update clients to stop using value pointers for atom identity.
The data structures around flow conditions are updated: - flow condition tokens are Atom, rather than AtomicBoolValue* - conditions are Formula, rather than BoolValue Most APIs were changed directly, some with many clients had a new version added and the existing one deprecated.
The factories for BoolValues in Environment keep their existing signatures for now (e.g. makeOr(BoolValue, BoolValue) => BoolValue) and are not deprecated. These have very many clients and finding the most ergonomic API & migration path still needs some thought.
Differential Revision: https://reviews.llvm.org/D153469
show more ...
|
#
2d8cd195 |
| 05-Jul-2023 |
Sam McCall <sam.mccall@gmail.com> |
Revert "Reland "[dataflow] Add dedicated representation of boolean formulas" and followups
These changes are OK, but they break downstream stuff that needs more time to adapt :-(
This reverts commi
Revert "Reland "[dataflow] Add dedicated representation of boolean formulas" and followups
These changes are OK, but they break downstream stuff that needs more time to adapt :-(
This reverts commit 71579569f4399d3cf6bc618dcd449b5388d749cc. This reverts commit 5e4ad816bf100b0325ed45ab1cfea18deb3ab3d1. This reverts commit 1c3ac8dfa16c42a631968aadd0396cfe7f7778e0.
show more ...
|
#
5e4ad816 |
| 21-Jun-2023 |
Sam McCall <sam.mccall@gmail.com> |
[dataflow] Replace most BoolValue subclasses with references to Formula (and AtomicBoolValue => Atom and BoolValue => Formula where appropriate)
This properly frees the Value hierarchy from managing
[dataflow] Replace most BoolValue subclasses with references to Formula (and AtomicBoolValue => Atom and BoolValue => Formula where appropriate)
This properly frees the Value hierarchy from managing boolean formulas.
We still distinguish AtomicBoolValue; this type is used in client code. However we expect to convert such uses to BoolValue (where the distinction is not needed) or Atom (where atomic identity is intended), and then fold AtomicBoolValue into FormulaBoolValue.
We also distinguish TopBoolValue; this has distinct rules for widen/join/equivalence, and top-ness is not represented in Formula. It'd be nice to find a cleaner representation (e.g. the absence of a formula), but no immediate plans.
For now, BoolValues with the same Formula are deduplicated. This doesn't seem desirable, as Values are mutable by their creators (properties). We can probably drop this for FormulaBoolValue immediately (not in this patch, to isolate changes). For AtomicBoolValue we first need to update clients to stop using value pointers for atom identity.
The data structures around flow conditions are updated: - flow condition tokens are Atom, rather than AtomicBoolValue* - conditions are Formula, rather than BoolValue Most APIs were changed directly, some with many clients had a new version added and the existing one deprecated.
The factories for BoolValues in Environment keep their existing signatures for now (e.g. makeOr(BoolValue, BoolValue) => BoolValue) and are not deprecated. These have very many clients and finding the most ergonomic API & migration path still needs some thought.
Differential Revision: https://reviews.llvm.org/D153469
show more ...
|
#
1e7329cd |
| 04-Jul-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] Model variables / fields / funcs used in default initializers.
The newly added test fails without the other changes in this patch.
Reviewed By: sammccall, gribozavr2
Differential
[clang][dataflow] Model variables / fields / funcs used in default initializers.
The newly added test fails without the other changes in this patch.
Reviewed By: sammccall, gribozavr2
Differential Revision: https://reviews.llvm.org/D154420
show more ...
|
#
74d8455b |
| 28-Jun-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] Make `getThisPointeeStorageLocation()` return an `AggregateStorageLocation`.
This avoids the need for casts at callsites.
Depends On D153852
Reviewed By: sammccall, xazax.hun, gr
[clang][dataflow] Make `getThisPointeeStorageLocation()` return an `AggregateStorageLocation`.
This avoids the need for casts at callsites.
Depends On D153852
Reviewed By: sammccall, xazax.hun, gribozavr2
Differential Revision: https://reviews.llvm.org/D153854
show more ...
|
#
d3632486 |
| 28-Jun-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] Initialize fields of anonymous records correctly.
Previously, the newly added test would crash.
Depends On D153851
Reviewed By: gribozavr2
Differential Revision: https://reviews
[clang][dataflow] Initialize fields of anonymous records correctly.
Previously, the newly added test would crash.
Depends On D153851
Reviewed By: gribozavr2
Differential Revision: https://reviews.llvm.org/D153852
show more ...
|
#
6f22de67 |
| 27-Jun-2023 |
Sam McCall <sam.mccall@gmail.com> |
[dataflow] Use consistent, symmetrical, non-mutating erased signature for join()
Mutating join() isn't used and so appears to be an anti-optimization. Having Lattice vs Environment inconsistent is a
[dataflow] Use consistent, symmetrical, non-mutating erased signature for join()
Mutating join() isn't used and so appears to be an anti-optimization. Having Lattice vs Environment inconsistent is awkward, particularly when trying to minimize copies while joining.
This patch eliminates the difference, but doesn't actually change the signature of join on concrete lattice types (as that's a breaking change).
Differential Revision: https://reviews.llvm.org/D153908
show more ...
|
#
1e010c5c |
| 26-Jun-2023 |
Sam McCall <sam.mccall@gmail.com> |
Reland "[dataflow] avoid more accidental copies of Environment"
This reverts commit fb13d027eae405a7be96fd7da0d72422e48a0719.
|
#
fb13d027 |
| 26-Jun-2023 |
Sam McCall <sam.mccall@gmail.com> |
Revert "[dataflow] avoid more accidental copies of Environment"
This reverts commit ae54f01dd8c53d18c276420b23f0d0ab7afefff1.
Accidentally committed without review :-(
|
#
ae54f01d |
| 22-Jun-2023 |
Sam McCall <sam.mccall@gmail.com> |
[dataflow] avoid more accidental copies of Environment
This is clunky but greatly improves debugging of flow conditions - each copy adds more indirections in the form of flow condition tokens.
(Lat
[dataflow] avoid more accidental copies of Environment
This is clunky but greatly improves debugging of flow conditions - each copy adds more indirections in the form of flow condition tokens.
(LatticeEffect presumably once did something here, but it's now both unused and untested.)
For the exit flow condition of: ``` void target(base::Optional<int*> opt) { if (opt.value_or(nullptr) != nullptr) { opt.value(); } else { opt.value(); // unsafe } } ```
Before: ``` (B0:1 = V15) (B1:1 = V8) (B2:1 = V10) (B3:1 = (V4 & (!V7 => V6))) (V10 = (B3:1 & !V7)) (V12 = B1:1) (V13 = B2:1) (V15 = (V12 | V13)) (V3 = V2) (V4 = V3) (V8 = (B3:1 & !!V7)) B0:1 V2 ```
After D153491: ``` (B0:1 = (V9 | V10)) (B1:1 = (B3:1 & !!V6)) (B2:1 = (B3:1 & !V6)) (B3:1 = (V3 & (!V6 => V5))) (V10 = B2:1) (V3 = V2) (V9 = B1:1) B0:1 V2 ```
After this patch, we can finally see the relations between the flow conditions directly:
``` (B0:1 = (B2:1 | B1:1)) (B1:1 = (B3:1 & !!V6)) (B2:1 = (B3:1 & !V6)) (B3:1 = (V3 & (!V6 => V5))) (V3 = V2) B0:1 V2 ```
(I believe V2 is the FC for the InitEnv, and V3 is introduced when computing the input state for B3 - not sure how to eliminate it)
Differential Revision: https://reviews.llvm.org/D153493
show more ...
|
#
f2123af1 |
| 20-Jun-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] Perform deep copies in copy and move operations.
This serves two purposes:
- Because, today, we only copy the `StructValue`, modifying the destination of the copy also modifies
[clang][dataflow] Perform deep copies in copy and move operations.
This serves two purposes:
- Because, today, we only copy the `StructValue`, modifying the destination of the copy also modifies the source. This is demonstrated by the new checks added to `CopyConstructor` and `MoveConstructor`, which fail without the deep copy.
- It lays the groundwork for eliminating the redundancy between `AggregateStorageLocation` and `StructValue`, which will happen as part of the ongoing migration to strict handling of value categories (seeo https://discourse.llvm.org/t/70086 for details). This will involve turning `StructValue` into essentially just a wrapper for `AggregateStorageLocation`; under this scheme, the current "shallow" copy (copying a `StructValue` from one `AggregateStorageLocation` to another) will no longer be possible.
Because we now perform deep copies, tests need to perform a deep equality comparison instead of just comparing for equal identity of the `StructValue`s. The new function `recordsEqual()` provides such a deep equality comparison.
Reviewed By: xazax.hun
Differential Revision: https://reviews.llvm.org/D153006
show more ...
|
#
c2bb6807 |
| 24-Jun-2023 |
Sam McCall <sam.mccall@gmail.com> |
[dataflow] Disallow implicit copy of Environment, use fork() instead
Environments are heavyweight, and copies are observably different from the original: they introduce new SAT variables, which degr
[dataflow] Disallow implicit copy of Environment, use fork() instead
Environments are heavyweight, and copies are observably different from the original: they introduce new SAT variables, which degrade performance & debugging. Copies should only be done deliberately, where justified.
Empirically there are several places in the framework where we perform dubious copies, sometimes entirely accidentally. (see e.g. D153491). Making these explicit makes this mistake harder.
This patch forces copies to go through fork(), the copy-constructor is private. This requires changes to existing callsites: some are correct and call fork(), some are incorrect and are fixed, others are difficult and I left a FIXME.
Differential Revision: https://reviews.llvm.org/D153674
show more ...
|
#
583371be |
| 12-Jun-2023 |
Kazu Hirata <kazu@google.com> |
[FlowSensitive] Use {DenseMapBase,StringMap}::lookup (NFC)
|
#
64413584 |
| 23-May-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] Add support for return values of reference type.
This patch changes the way `Environment::ReturnLoc` is set: Whereas previously it was set by the caller, it is now set by the calle
[clang][dataflow] Add support for return values of reference type.
This patch changes the way `Environment::ReturnLoc` is set: Whereas previously it was set by the caller, it is now set by the callee (obviously, as we otherwise would not be able to return references).
The patch also introduces `Environment::ReturnVal`, which is used for non-reference-type return values. This allows these to be handled with the correct value category semantics; see also https://discourse.llvm.org/t/70086, which describes the ongoing migration to strict value category semantics.
Depends On D150776
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D151194
show more ...
|
#
080ee850 |
| 17-May-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] Add `Strict` versions of `Value` and `StorageLocation` accessors.
This is part of the gradual migration to strict handling of value categories, as described in the RFC at https://d
[clang][dataflow] Add `Strict` versions of `Value` and `StorageLocation` accessors.
This is part of the gradual migration to strict handling of value categories, as described in the RFC at https://discourse.llvm.org/t/70086.
This patch migrates some representative calls of the newly deprecated accessors to the new `Strict` functions. Followup patches will migrate the remaining callers. (There are a large number of callers, with some subtlety involved in some of them, so it makes sense to split this up into multiple patches rather than migrating all callers in one go.)
The `Strict` accessors as implemented here have some differences in semantics compared to the semantics originally proposed in the RFC; specifically:
* `setStorageLocationStrict()`: The RFC proposes to create an intermediate `ReferenceValue` that then refers to the `StorageLocation` for the glvalue. It turns out though that, even today, most places in the code are not doing this but are instead associating glvalues directly with their `StorageLocation`. It therefore didn't seem to make sense to introduce new `ReferenceValue`s where there were none previously, so I have chosen to instead make `setStorageLocationStrict()` simply call through to `setStorageLocation(const Expr &, StorageLocation &)` and merely add the assertion that the expression must be a glvalue.
* `getStorageLocationStrict()`: The RFC proposes that this should assert that the storage location for the glvalue expression is associated with an intermediate `ReferenceValue`, but, as explained, this is often not true. The current state is inconsistent: Sometimes the intermediate `ReferenceValue` is there, sometimes it isn't. For this reason, `getStorageLocationStrict()` skips past a `ReferenceValue` if it is there but otherwise directly returns the storage location associated with the expression. This behavior is equivalent to the existing behavior of `SkipPast::Reference`.
* `setValueStrict()`: The RFC proposes that this should always create the same `StorageLocation` for a given `Value`, but, in fact, the transfer functions that exist today don't guarantee this; almost all transfer functions unconditionally create a new `StorageLocation` when associating an expression with a `Value`.
There appears to be one special case: `TerminatorVisitor::extendFlowCondition()` checks whether the expression is already associated with a `StorageLocation` and, if so, reuses the existing `StorageLocation` instead of creating a new one.
For this reason, `setValueStrict()` implements this logic (preserve an existing `StorageLocation`) but makes no attempt to always associate the same `StorageLocation` with a given `Value`, as nothing in the framework appers to require this.
As `TerminatorVisitor::extendFlowCondition()` is an interesting special case, the `setValue()` call there is among the ones that this patch migrates to `setValueStrict()`.
Reviewed By: sammccall, ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D150653
show more ...
|