#
3b6d63c5 |
| 04-Dec-2023 |
martinboehme <mboehme@google.com> |
Revert "[clang][dataflow] Retrieve members from accessors called using member…" (#74299)
Reverts llvm/llvm-project#73978
|
#
a3fe9cb2 |
| 04-Dec-2023 |
Samira Bazuzi <bazuzi@users.noreply.github.com> |
[clang][dataflow] Retrieve members from accessors called using member… (#73978)
… pointers.
getMethodDecl does not handle pointers to members and returns nullptr
for them. getMethodDecl contains
[clang][dataflow] Retrieve members from accessors called using member… (#73978)
… pointers.
getMethodDecl does not handle pointers to members and returns nullptr
for them. getMethodDecl contains a decade-plus-old FIXME to handle
pointers to members, but two approaches I looked at for fixing it are
more invasive or complex than simply swapping to getCalleeDecl.
The first, have getMethodDecl call getCalleeDecl, creates a large tree
of const-ness mismatches due to getMethodDecl returning a non-const
value while being a const member function and getCalleeDecl only being a
const member function when it returns a const value.
The second, implementing an AST walk to match how
CXXMemberCallExpr::getImplicitObjectArgument grabs the LHS of the binary
operator, is basically reimplementing Expr::getReferencedDeclOfCallee,
which is used by Expr::getCalleeDecl. We don't need another copy of that
code.
show more ...
|
#
71f2ec2d |
| 04-Dec-2023 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Add synthetic fields to `RecordStorageLocation` (#73860)
Synthetic fields are intended to model the internal state of a class
(e.g. the value stored in a `std::optional`) without
[clang][dataflow] Add synthetic fields to `RecordStorageLocation` (#73860)
Synthetic fields are intended to model the internal state of a class
(e.g. the value stored in a `std::optional`) without having to depend on
that class's implementation details.
Today, this is typically done with properties on `RecordValue`s, but
these have several drawbacks:
* Care must be taken to call `refreshRecordValue()` before modifying a
property so that the modified property values aren’t seen by other
environments that may have access to the same `RecordValue`.
* Properties aren’t associated with a storage location. If an analysis
needs to associate a location with the value stored in a property (e.g.
to model the reference returned by `std::optional::value()`), it needs
to manually add an indirection using a `PointerValue`. (See for example
the way this is done in UncheckedOptionalAccessModel.cpp, specifically
in `maybeInitializeOptionalValueMember()`.)
* Properties don’t participate in the builtin compare, join, and widen
operations. If an analysis needs to apply these operations to
properties, it needs to override the corresponding methods of
`ValueModel`.
* Longer-term, we plan to eliminate `RecordValue`, as by-value
operations on records aren’t really “a thing” in C++ (see
https://discourse.llvm.org/t/70086#changed-structvalue-api-14). This
would obviously eliminate the ability to set properties on
`RecordValue`s.
To demonstrate the advantages of synthetic fields, this patch converts
UncheckedOptionalAccessModel.cpp to synthetic fields. This greatly
simplifies the implementation of the check.
This PR is pretty big; to make it easier to review, I have broken it
down into a stack of three commits, each of which contains a set of
logically related changes. I considered submitting each of these as a
separate PR, but the commits only really make sense when taken together.
To review, I suggest first looking at the changes in
UncheckedOptionalAccessModel.cpp. This gives a flavor for how the
various API changes work together in the context of an analysis. Then,
review the rest of the changes.
show more ...
|
Revision tags: llvmorg-17.0.6 |
|
#
5bd643e1 |
| 27-Nov-2023 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Strengthen widening of boolean values. (#73484)
Before we widen to top, we now check if both values can be proved either true or false in their respective environments; if so, wide
[clang][dataflow] Strengthen widening of boolean values. (#73484)
Before we widen to top, we now check if both values can be proved either true or false in their respective environments; if so, widening returns a true or false literal. The idea is that we avoid losing information if posssible.
This patch includes a test that fails without this change to widening.
This change does mean that we call the SAT solver in more places, but this seems acceptable given the additional precision we gain.
In tests on an internal codebase, the number of SAT solver timeouts we observe with Crubit's nullability checker does increase by about 25%. They can be brought back to the previous level by doubling the SAT solver work limit.
show more ...
|
#
c4c59192 |
| 22-Nov-2023 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Clear `ExprToLoc` and `ExprToVal` at the start of a block. (#72985)
We never need to access entries from these maps outside of the current
basic
block. This could only ever becom
[clang][dataflow] Clear `ExprToLoc` and `ExprToVal` at the start of a block. (#72985)
We never need to access entries from these maps outside of the current
basic
block. This could only ever become a consideration when flow control
happens
inside a full-expression (i.e. we have multiple basic blocks for a full
expression); there are two kinds of expression where this can happen,
but we
already deal with these in other ways:
* Short-circuiting logical operators (`&&` and `||`) have operands that
live in
different basic blocks than the operator itself, but we already have
code in
the framework to retrieve the value of these operands from the
environment
for the block they are computed in, rather than in the environment of
the
block containing the operator.
* The conditional operator similarly has operands that live in different
basic
blocks. However, we currently don't implement a transfer function for
the
conditional operator. When we do this, we need to retrieve the values of
the
operands from the environments of the basic blocks they live in, as we
already do for logical operators. This patch adds a comment to this
effect
to the code.
Clearing out `ExprToLoc` and `ExprToVal` has two benefits:
* We avoid performing joins on boolean expressions contained in
`ExprToVal` and
hence extending the flow condition in cases where this is not needed.
Simpler
flow conditions should reduce the amount of work we do in the SAT
solver.
* Debugging becomes easier when flow conditions are simpler and
`ExprToLoc` /
`ExprToVal` don’t contain any extraneous entries.
Benchmark results on Crubit's `pointer_nullability_analysis_benchmark
show a
slight runtime increase for simple benchmarks, offset by substantial
runtime
reductions for more complex benchmarks:
```
name old cpu/op new cpu/op delta
BM_PointerAnalysisCopyPointer 29.8µs ± 1% 29.9µs ± 4% ~ (p=0.879 n=46+49)
BM_PointerAnalysisIntLoop 101µs ± 3% 104µs ± 4% +2.96% (p=0.000 n=55+57)
BM_PointerAnalysisPointerLoop 378µs ± 3% 245µs ± 3% -35.09% (p=0.000 n=47+55)
BM_PointerAnalysisBranch 118µs ± 2% 122µs ± 3% +3.37% (p=0.000 n=59+59)
BM_PointerAnalysisLoopAndBranch 779µs ± 3% 413µs ± 5% -47.01% (p=0.000 n=56+45)
BM_PointerAnalysisTwoLoops 187µs ± 3% 192µs ± 5% +2.80% (p=0.000 n=57+58)
BM_PointerAnalysisJoinFilePath 17.4ms ± 3% 7.2ms ± 3% -58.75% (p=0.000 n=58+57)
BM_PointerAnalysisCallInLoop 14.7ms ± 4% 10.3ms ± 2% -29.87% (p=0.000 n=56+58)
```
show more ...
|
Revision tags: llvmorg-17.0.5 |
|
#
a0700532 |
| 08-Nov-2023 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Replace one remaining call to deprecated `addToFlowCondition()`. (#71547)
|
#
7c636728 |
| 07-Nov-2023 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Simplify flow conditions displayed in HTMLLogger. (#70848)
This can make the flow condition significantly easier to interpret; see below for an example.
I had hoped that adding th
[clang][dataflow] Simplify flow conditions displayed in HTMLLogger. (#70848)
This can make the flow condition significantly easier to interpret; see below for an example.
I had hoped that adding the simplification as a preprocessing step before the SAT solver (in `DataflowAnalysisContext::querySolver()`) would also speed up SAT solving and maybe even eliminate SAT solver timeouts, but in my testing, this actually turns out to be a pessimization. It appears that these simplifications are easy enough for the SAT solver to perform itself.
Nevertheless, the improvement in debugging alone makes this a worthwhile change.
Example of flow condition output with these changes:
``` Flow condition token: V37 Constraints: (V16 = (((V15 & (V19 = V12)) & V22) & V25)) (V15 = ((V12 & ((V14 = V9) | (V14 = V4))) & (V13 = V14))) True atoms: (V0, V1, V2, V5, V6, V7, V29, V30, V32, V34, V35, V37) False atoms: (V3, V8, V17) Equivalent atoms: (V11, V15)
Flow condition constraints before simplification: V37 ((!V3 & !V8) & !V17) (V37 = V34) (V34 = (V29 & (V35 = V30))) (V29 = (((V16 | V2) & V32) & (V30 = V32))) (V16 = (((V15 & (V19 = V12)) & V22) & V25)) (V15 = V11) (V11 = ((((V7 | V2) & V12) & ((V7 & (V14 = V9)) | (V2 & (V14 = V4)))) & (V13 = V14))) (V2 = V1) (V1 = V0) V0 (V7 = V6) (V6 = V5) (V5 = V2) ```
show more ...
|
Revision tags: llvmorg-17.0.4 |
|
#
d1f59544 |
| 25-Oct-2023 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Add `Environment::allows()`. (#70046)
This allows querying whether, given the flow condition, a certain formula still has a solution (though it is not necessarily implied by the fl
[clang][dataflow] Add `Environment::allows()`. (#70046)
This allows querying whether, given the flow condition, a certain formula still has a solution (though it is not necessarily implied by the flow condition, as `flowConditionImplies()` would check).
This can be checked today, but only with a double negation, i.e. to check whether, given the flow condition, a formula F has a solution, you can check `!Env.flowConditionImplies(Arena.makeNot(F))`. The double negation makes this hard to reason about, and it would be nicer to have a way of directly checking this.
For consistency, this patch also renames `flowConditionImplies()` to `proves()`; the old name is kept around for compatibility but deprecated.
show more ...
|
#
14b039c1 |
| 24-Oct-2023 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Remove `declToLocConsistent()` assertion. (#69819)
As described [here](https://discourse.llvm.org/t/70086/6), there are legitimate non-bug scenarios where two `DeclToLoc` maps to b
[clang][dataflow] Remove `declToLocConsistent()` assertion. (#69819)
As described [here](https://discourse.llvm.org/t/70086/6), there are legitimate non-bug scenarios where two `DeclToLoc` maps to be joined contain different storage locations for the same declaration. This patch also adds a test containing an example of such a situation. (The test fails without the other changes in this patch.)
With the assertion removed, the existing logic in `intersectDenseMaps()` will remove the corresponding declaration from the joined DeclToLoc map.
We also remove `removeDecl()`'s precondition (that the declaration must be associated with a storage location) because this may no longer hold if the declaration was previously removed during a join, as described above.
show more ...
|
Revision tags: llvmorg-17.0.3 |
|
#
52d06963 |
| 11-Oct-2023 |
Stanislav Gatev <sgatev@google.com> |
[clang][dataflow] Add support for lambda captures (#68558)
This adds support for copy, ref, and this lambda captures to the core
framework and also adds relevant tests in UncheckedOptionalAccessTes
[clang][dataflow] Add support for lambda captures (#68558)
This adds support for copy, ref, and this lambda captures to the core
framework and also adds relevant tests in UncheckedOptionalAccessTest.
show more ...
|
Revision tags: llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init |
|
#
af475173 |
| 27-Jan-2022 |
Corentin Jabot <corentinjabot@gmail.com> |
[C++] Implement "Deducing this" (P0847R7)
This patch implements P0847R7 (partially), CWG2561 and CWG2653.
Reviewed By: aaron.ballman, #clang-language-wg
Differential Revision: https://reviews.llvm
[C++] Implement "Deducing this" (P0847R7)
This patch implements P0847R7 (partially), CWG2561 and CWG2653.
Reviewed By: aaron.ballman, #clang-language-wg
Differential Revision: https://reviews.llvm.org/D140828
show more ...
|
#
834cb919 |
| 26-Sep-2023 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Remove declarations from `DeclToLoc` when their lifetime ends. (#67300)
After https://reviews.llvm.org/D153273, we're now able to use `CFGLifetimeEnds` together with the other CFG
[clang][dataflow] Remove declarations from `DeclToLoc` when their lifetime ends. (#67300)
After https://reviews.llvm.org/D153273, we're now able to use `CFGLifetimeEnds` together with the other CFG options we use.
show more ...
|
#
1d7b59ca |
| 19-Sep-2023 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Fix two null pointer dereferences in `getMemberForAccessor()`. (#66742)
The additions to the test trigger crashes without the fixes.
|
#
03be486e |
| 18-Sep-2023 |
Kinuko Yasuda <kinuko@chromium.org> |
[clang][dataflow] Model the fields that are accessed via inline accessors (#66368)
So that the values that are accessed via such accessors can be analyzed
as a limited version of context-sensitive
[clang][dataflow] Model the fields that are accessed via inline accessors (#66368)
So that the values that are accessed via such accessors can be analyzed
as a limited version of context-sensitive analysis. We can potentially
do this only when some option is set, but doing additional modeling like
this won't be expensive and intrusive, so we do it by default for now.
show more ...
|
#
7f66cc7d |
| 12-Sep-2023 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Merge `RecordValue`s with different locations correctly. (#65319)
Now that prvalue expressions map directly to values (see https://reviews.llvm.org/D158977), it's no longer guarant
[clang][dataflow] Merge `RecordValue`s with different locations correctly. (#65319)
Now that prvalue expressions map directly to values (see https://reviews.llvm.org/D158977), it's no longer guaranteed that `RecordValue`s associated with the same expression will always have the same storage location.
In other words, D158977 invalidated the assertion in `mergeDistinctValues()`. The newly added test causes this assertion to fail without the other changes in the patch.
This patch fixes the issue. However, the real fix will be to eliminate the `StorageLocation` from `RecordValue` entirely.
show more ...
|
#
80f0dc3a |
| 01-Sep-2023 |
Yitzhak Mandelbaum <yitzhakm@google.com> |
[clang][dataflow] Unsoundly treat "Unknown" as "Equivalent" in widening.
This change makes widening act the same as equivalence checking. When the analysis does not provide an answer regarding the e
[clang][dataflow] Unsoundly treat "Unknown" as "Equivalent" in widening.
This change makes widening act the same as equivalence checking. When the analysis does not provide an answer regarding the equivalence of two distinct values, the framework treats them as equivalent. This is an unsound choice that enables convergence.
Differential Revision: https://reviews.llvm.org/D159355
show more ...
|
#
266c12a1 |
| 31-Aug-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] When dumping `ExprToVal`, dump the `Value`, not just its location.
This makes `ExprToVal` dumping consistent with `LocToVal` dumping.
Reviewed By: ymandel, xazax.hun
Differential
[clang][dataflow] When dumping `ExprToVal`, dump the `Value`, not just its location.
This makes `ExprToVal` dumping consistent with `LocToVal` dumping.
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D159274
show more ...
|
#
330d5bcb |
| 28-Aug-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] Don't associate prvalue expressions with storage locations.
Instead, map prvalue expressions directly to values in a newly introduced map `Environment::ExprToVal`.
This change int
[clang][dataflow] Don't associate prvalue expressions with storage locations.
Instead, map prvalue expressions directly to values in a newly introduced map `Environment::ExprToVal`.
This change introduces an additional member variable in `Environment` but is an overall win:
- It is more conceptually correctly, since prvalues don't have storage locations.
- It eliminates complexity from `Environment::setValue(const Expr &E, Value &Val)`.
- It reduces the amount of data stored in `Environment`: A prvalue now has a single entry in `ExprToVal` instead of one in `ExprToLoc` and one in `LocToVal`.
- Not allocating `StorageLocation`s for prvalues additionally reduces memory usage.
This patch is the last step in the migration to strict handling of value categories (see https://discourse.llvm.org/t/70086 for details). The changes here are almost entirely internal to `Environment`.
The only externally observable change is that when associating a `RecordValue` with the location returned by `Environment::getResultObjectLocation()` for a given expression, callers additionally need to associate the `RecordValue` with the expression themselves.
Reviewed By: xazax.hun
Differential Revision: https://reviews.llvm.org/D158977
show more ...
|
#
9ecdbe38 |
| 01-Aug-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] Rename `AggregateStorageLocation` to `RecordStorageLocation` and `StructValue` to `RecordValue`.
- Both of these constructs are used to represent structs, classes, and unions; Cl
[clang][dataflow] Rename `AggregateStorageLocation` to `RecordStorageLocation` and `StructValue` to `RecordValue`.
- Both of these constructs are used to represent structs, classes, and unions; Clang uses the collective term "record" for these.
- The term "aggregate" in `AggregateStorageLocation` implies that, at some point, the intention may have been to use it also for arrays, but it don't think it's possible to use it for arrays. Records and arrays are very different and therefore need to be modeled differently. Records have a fixed set of named fields, which can have different type; arrays have a variable number of elements, but they all have the same type.
- Futhermore, "aggregate" has a very specific meaning in C++ (https://en.cppreference.com/w/cpp/language/aggregate_initialization). Aggregates of class type may not have any user-declared or inherited constructors, no private or protected non-static data members, no virtual member functions, and so on, but we use `AggregateStorageLocations` to model all objects of class type.
In addition, for consistency, we also rename the following:
- `getAggregateLoc()` (in `RecordValue`, formerly known as `StructValue`) to simply `getLoc()`.
- `refreshStructValue()` to `refreshRecordValue()`
We keep the old names around as deprecated synonyms to enable clients to be migrated to the new names.
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D156788
show more ...
|
#
b244b6ae |
| 31-Jul-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] Remove `Strict` suffix from accessors.
For the time being, we're keeping the `Strict` versions around as deprecated synonyms so that clients can be migrated, but these synonyms wil
[clang][dataflow] Remove `Strict` suffix from accessors.
For the time being, we're keeping the `Strict` versions around as deprecated synonyms so that clients can be migrated, but these synonyms will be removed soon.
Depends On D156673
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D156674
show more ...
|
#
17ba278f |
| 31-Jul-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] Remove deprecated accessors as well as `SkipPast`.
Depends On D156672
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D156673
|
#
f76f6674 |
| 31-Jul-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] Use `Strict` accessors where we weren't using them yet.
This eliminates all uses of the deprecated accessors.
Reviewed By: ymandel, xazax.hun
Differential Revision: https://revie
[clang][dataflow] Use `Strict` accessors where we weren't using them yet.
This eliminates all uses of the deprecated accessors.
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D156672
show more ...
|
#
8c0acbf8 |
| 28-Jul-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] Avoid -Wunused-variable.
|
#
e95134b9 |
| 26-Jul-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] Reverse course on `getValue()` deprecation.
In the [value categories RFC](https://discourse.llvm.org/t/70086), I proposed that the end state of the migration should be that `getVal
[clang][dataflow] Reverse course on `getValue()` deprecation.
In the [value categories RFC](https://discourse.llvm.org/t/70086), I proposed that the end state of the migration should be that `getValue()` should only be legal to call on prvalues.
As a stepping stone, to allow migrating off existing calls to `getValue()`, I proposed introducing `getValueStrict()`, which would already have the new semantics.
However, I've now reconsidered this. Any expression, whether prvalue or glvalue, has a value, so really there isn't any reason to forbid calling `getValue()` on glvalues. I'm therefore removing the deprecation from `getValue()` and transitioning existing `getValueStrict()` calls back to `getValue()`.
The other "strict" accessors are a different case. `setValueStrict()` should only be called on prvalues because glvalues need to have a storage location associated with them; it doesn't make sense to only set a value for them. And, of course, `getStorageLocationStrict()` and `setStorageLocationStrict()` should obviously only be called on glvalues because prvalues don't have storage locations.
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D155921
show more ...
|
#
1b334a2a |
| 26-Jul-2023 |
Martin Braenne <mboehme@google.com> |
[clang][dataflow] Eliminate `ReferenceValue`.
There are no remaining uses of this class in the framework.
This patch is part of the ongoing migration to strict handling of value categories (see htt
[clang][dataflow] Eliminate `ReferenceValue`.
There are no remaining uses of this class in the framework.
This patch is part of the ongoing migration to strict handling of value categories (see https://discourse.llvm.org/t/70086 for details).
Reviewed By: ymandel, xazax.hun, gribozavr2
Differential Revision: https://reviews.llvm.org/D155922
show more ...
|