#
71f1932b |
| 11-Apr-2024 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Reland #87320: Propagate locations from result objects to initializers. (#88316)
This relands #87320 and additionally removes the now-unused function `isOriginalRecordConstructor()
[clang][dataflow] Reland #87320: Propagate locations from result objects to initializers. (#88316)
This relands #87320 and additionally removes the now-unused function `isOriginalRecordConstructor()`, which was causing buildbots to fail.
show more ...
|
#
7549b458 |
| 10-Apr-2024 |
martinboehme <mboehme@google.com> |
Revert "[clang][dataflow] Propagate locations from result objects to initializers." (#88315)
Reverts llvm/llvm-project#87320
This is causing buildbots to fail because
`isOriginalRecordConstructo
Revert "[clang][dataflow] Propagate locations from result objects to initializers." (#88315)
Reverts llvm/llvm-project#87320
This is causing buildbots to fail because
`isOriginalRecordConstructor()` is now unused.
show more ...
|
#
21009f46 |
| 10-Apr-2024 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Propagate locations from result objects to initializers. (#87320)
Previously, we were propagating storage locations the other way around, i.e. from initializers to result objects,
[clang][dataflow] Propagate locations from result objects to initializers. (#87320)
Previously, we were propagating storage locations the other way around, i.e. from initializers to result objects, using `RecordValue::getLoc()`. This gave the wrong behavior in some cases -- see the newly added or fixed tests in this patch.
In addition, this patch now unblocks removing the `RecordValue` class entirely, as we no longer need `RecordValue::getLoc()`.
With this patch, the test `TransferTest.DifferentReferenceLocInJoin` started to fail because the framework now always uses the same storge location for a `MaterializeTemporaryExpr`, meaning that the code under test no longer set up the desired state where a variable of reference type is mapped to two different storage locations in environments being joined. Rather than trying to modify this test to set up the test condition again, I have chosen to replace the test with an equivalent test in DataflowEnvironmentTest.cpp that sets up the test condition directly; because this test is more direct, it will also be less brittle in the face of future changes.
show more ...
|
#
d08a76d1 |
| 07-Apr-2024 |
NAKAMURA Takumi <geek4civic@gmail.com> |
Fix warnings discovered by #87348 [-Wunused-but-set-variable]
|
#
bbd259af |
| 04-Apr-2024 |
Yitzhak Mandelbaum <ymand@users.noreply.github.com> |
[clang][dataflow] Refactor `widen` API to be explicit about change effect. (#87233)
The previous API relied on pointer equality of inputs and outputs to
signal whether a change occured. This was to
[clang][dataflow] Refactor `widen` API to be explicit about change effect. (#87233)
The previous API relied on pointer equality of inputs and outputs to
signal whether a change occured. This was too subtle and led to bugs in
practice. It was also very limiting: the override could not return an equivalent (but
not identical) value.
show more ...
|
Revision tags: llvmorg-18.1.3 |
|
#
8d77d362 |
| 28-Mar-2024 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Introduce a helper class for handling record initializer lists. (#86675)
This is currently only used in one place, but I'm working on a patch that will use this from a second place
[clang][dataflow] Introduce a helper class for handling record initializer lists. (#86675)
This is currently only used in one place, but I'm working on a patch that will use this from a second place. And I think this already improves the readability of the one place this is used so far.
show more ...
|
#
4c4ea249 |
| 26-Mar-2024 |
smanna12 <soumi.manna@intel.com> |
[NFC][CLANG] Fix static analyzer bugs about unnecessary object copies with auto keyword (#85962)
Reported by Static Analyzer Tool:
In clang::dataflow::Environment::initialize(): Using the auto keyw
[NFC][CLANG] Fix static analyzer bugs about unnecessary object copies with auto keyword (#85962)
Reported by Static Analyzer Tool:
In clang::dataflow::Environment::initialize(): Using the auto keyword
without an & causes the copy of an object of type LambdaCapture
show more ...
|
Revision tags: llvmorg-18.1.2 |
|
#
27d50499 |
| 18-Mar-2024 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Fix `getResultObjectLocation()` on `CXXDefaultArgExpr`. (#85072)
This patch includes a test that causes an assertion failure without the
other
changes in this patch.
|
#
2d539db2 |
| 08-Mar-2024 |
martinboehme <mboehme@google.com> |
[clang][dataflow] When analyzing ctors, don't initialize fields of `*this` with values. (#84164)
This is the constructor's job, and we want to be able to test that it does this.
|
Revision tags: llvmorg-18.1.1 |
|
#
d5aecf0c |
| 07-Mar-2024 |
martinboehme <mboehme@google.com> |
[clang][nullability] Don't discard expression state before end of full-expression. (#82611)
In https://github.com/llvm/llvm-project/pull/72985, I made a change to
discard
expression state (`ExprTo
[clang][nullability] Don't discard expression state before end of full-expression. (#82611)
In https://github.com/llvm/llvm-project/pull/72985, I made a change to
discard
expression state (`ExprToLoc` and `ExprToVal`) at the beginning of each
basic
block. I did so with the claim that "we never need to access entries
from these
maps outside of the current basic block", noting that there are
exceptions to
this claim when control flow happens inside a full-expression (the
operands of
`&&`, `||`, and the conditional operator live in different basic blocks
than the
operator itself) but that we already have a mechanism for retrieving the
values
of these operands from the environment for the block they are computed
in.
It turns out, however, that the operands of these operators aren't the
only
expressions whose values can be accessed from a different basic block;
when
control flow happens within a full-expression, that control flow can be
"interposed" between an expression and its parent. Here is an example:
```cxx
void f(int*, int);
bool cond();
void target() {
int i = 0;
f(&i, cond() ? 1 : 0);
}
```
([godbolt](https://godbolt.org/z/hrbj1Mj3o))
In the CFG[^1] , note how the expression for `&i` is computed in block
B4,
but the parent of this expression (the `CallExpr`) is located in block
B1.
The the argument expression `&i` and the `CallExpr` are essentially
"torn apart"
into different basic blocks by the conditional operator in the second
argument.
In other words, the edge between the `CallExpr` and its argument `&i`
straddles
the boundary between two blocks.
I used to think that this scenario -- where an edge between an
expression and
one of its children straddles a block boundary -- could only happen
between the
expression that triggers the control flow (`&&`, `||`, or the
conditional
operator) and its children, but the example above shows that other
expressions
can be affected as well; the control flow is still triggered by `&&`,
`||` or
the conditional operator, but the expressions affected lie outside these
operators.
Discarding expression state too soon is harmful. For example, an
analysis that
checks the arguments of the `CallExpr` above would not be able to
retrieve a
value for the `&i` argument.
This patch therefore ensures that we don't discard expression state
before the
end of a full-expression. In other cases -- when the evaluation of a
full-expression is complete -- we still want to discard expression state
for the
reasons explained in https://github.com/llvm/llvm-project/pull/72985
(avoid
performing joins on boolean values that are no longer needed, which
unnecessarily extends the flow condition; improve debuggability by
removing
clutter from the expression state).
The impact on performance from this change is about a 1% slowdown in the
Crubit nullability check benchmarks:
```
name old cpu/op new cpu/op delta
BM_PointerAnalysisCopyPointer 71.9µs ± 1% 71.9µs ± 2% ~ (p=0.987 n=15+20)
BM_PointerAnalysisIntLoop 190µs ± 1% 192µs ± 2% +1.06% (p=0.000 n=14+16)
BM_PointerAnalysisPointerLoop 325µs ± 5% 324µs ± 4% ~ (p=0.496 n=18+20)
BM_PointerAnalysisBranch 193µs ± 0% 192µs ± 4% ~ (p=0.488 n=14+18)
BM_PointerAnalysisLoopAndBranch 521µs ± 1% 525µs ± 3% +0.94% (p=0.017 n=18+19)
BM_PointerAnalysisTwoLoops 337µs ± 1% 341µs ± 3% +1.19% (p=0.004 n=17+19)
BM_PointerAnalysisJoinFilePath 1.62ms ± 2% 1.64ms ± 3% +0.92% (p=0.021 n=20+20)
BM_PointerAnalysisCallInLoop 1.14ms ± 1% 1.15ms ± 4% ~ (p=0.135 n=16+18)
```
[^1]:
```
[B5 (ENTRY)]
Succs (1): B4
[B1]
1: [B4.9] ? [B2.1] : [B3.1]
2: [B4.4]([B4.6], [B1.1])
Preds (2): B2 B3
Succs (1): B0
[B2]
1: 1
Preds (1): B4
Succs (1): B1
[B3]
1: 0
Preds (1): B4
Succs (1): B1
[B4]
1: 0
2: int i = 0;
3: f
4: [B4.3] (ImplicitCastExpr, FunctionToPointerDecay, void (*)(int *, int))
5: i
6: &[B4.5]
7: cond
8: [B4.7] (ImplicitCastExpr, FunctionToPointerDecay, _Bool (*)(void))
9: [B4.8]()
T: [B4.9] ? ... : ...
Preds (1): B5
Succs (2): B2 B3
[B0 (EXIT)]
Preds (1): B1
```
show more ...
|
#
128780b0 |
| 01-Mar-2024 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Correctly treat empty initializer lists for unions. (#82986)
This fixes a crash introduced by https://github.com/llvm/llvm-project/pull/82348 but also adds additional handling to m
[clang][dataflow] Correctly treat empty initializer lists for unions. (#82986)
This fixes a crash introduced by https://github.com/llvm/llvm-project/pull/82348 but also adds additional handling to make sure that we treat empty initializer lists for both unions and structs/classes correctly (see tests added in this patch).
show more ...
|
Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4 |
|
#
c4e94633 |
| 26-Feb-2024 |
Samira Bazuzi <bazuzi@google.com> |
Revert "[clang][dataflow] Correctly handle `InitListExpr` of union type." (#82856)
Reverts llvm/llvm-project#82348, which caused crashes when analyzing
empty InitListExprs for unions, e.g.
```cc
Revert "[clang][dataflow] Correctly handle `InitListExpr` of union type." (#82856)
Reverts llvm/llvm-project#82348, which caused crashes when analyzing
empty InitListExprs for unions, e.g.
```cc
union U {
double double_value;
int int_value;
};
void target() {
U value;
value = {};
}
```
Co-authored-by: Samira Bazuzi <bazuzi@users.noreply.github.com>
show more ...
|
#
4725993f |
| 21-Feb-2024 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Correctly handle `InitListExpr` of union type. (#82348)
|
Revision tags: llvmorg-18.1.0-rc3 |
|
#
8c6e96d9 |
| 13-Feb-2024 |
Antonio Frighetto <me@antoniofrighetto.com> |
[clang][Dataflow] Fix unnecessary copy in `initializeFieldsWithValues` (NFC)
|
#
270f2c55 |
| 13-Feb-2024 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Add `Environment::initializeFieldsWithValues()`. (#81239)
This function will be useful when we change the behavior of record-type
prvalues
so that they directly initialize the as
[clang][dataflow] Add `Environment::initializeFieldsWithValues()`. (#81239)
This function will be useful when we change the behavior of record-type
prvalues
so that they directly initialize the associated result object. See also
the
comment here for more details:
https://github.com/llvm/llvm-project/blob/9e73656af524a2c592978aec91de67316c5ce69f/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h#L354
As part of this patch, we document and assert that synthetic fields may
not have
reference type.
There is no practical use case for this: A `StorageLocation` may not
have
reference type, and a synthetic field of the corresponding non-reference
type
can serve the same purpose.
show more ...
|
Revision tags: llvmorg-18.1.0-rc2 |
|
#
672fb27b |
| 06-Feb-2024 |
Yitzhak Mandelbaum <ymand@users.noreply.github.com> |
[clang][dataflow] Add new `join` API and replace existing `merge` implementations. (#80361)
This patch adds a new interface for the join operation, now properly called `join`. Originally, the framew
[clang][dataflow] Add new `join` API and replace existing `merge` implementations. (#80361)
This patch adds a new interface for the join operation, now properly called `join`. Originally, the framework offered a single `merge` operation, which could serve either as a join or a widening. In practice, though we found this conflation didn't work for non-trivial anlyses, and split of the widening operation (`widen`). This change completes the transition by introducing a proper `join` with strict join semantics.
In the process, it drops an odd (and often misused) aspect of `merge` wherein callees could implictly instruct the framework to drop the current entry by returning `false`. This features was never used correctly in analyses and doesn't belong in a join operation, so it is omitted.
---------
Co-authored-by: Dmitri Gribenko <gribozavr@gmail.com> Co-authored-by: martinboehme <mboehme@google.com>
show more ...
|
#
c83ec847 |
| 31-Jan-2024 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Extend debug output for `Environment`. (#79982)
* Print `ReturnLoc`, `ReturnVal`, and `ThisPointeeLoc` if applicable.
* For entries in `LocToVal` that correspond to declarations,
[clang][dataflow] Extend debug output for `Environment`. (#79982)
* Print `ReturnLoc`, `ReturnVal`, and `ThisPointeeLoc` if applicable.
* For entries in `LocToVal` that correspond to declarations, print the names of the declarations next to them.
I've removed the FIXME because all relevant fields are now being dumped. I'm not sure we actually need the capability for the caller to specify which fields to dump, so I've simply deleted this part of the comment.
Some examples of the output:


show more ...
|
Revision tags: llvmorg-18.1.0-rc1 |
|
#
7a6c2628 |
| 24-Jan-2024 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Eliminate two uses of `RecordValue::getLoc()`. (#79163)
This is a small step towards eventually eliminating `RecordValue` entirely.
|
Revision tags: llvmorg-19-init |
|
#
a2caa492 |
| 22-Jan-2024 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Treat comma operator correctly in `getResultObjectLocation()`. (#78427)
|
#
f1226eea |
| 18-Jan-2024 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Consider `CXXDefaultInitExpr` to be an "original record ctor". (#78423)
The CFG doesn't contain a CFGElement for the `CXXDefaultInitExpr::getInit()`, so it makes sense to consider
[clang][dataflow] Consider `CXXDefaultInitExpr` to be an "original record ctor". (#78423)
The CFG doesn't contain a CFGElement for the `CXXDefaultInitExpr::getInit()`, so it makes sense to consider the `CXXDefaultInitExpr` to be the expression that originally constructs the object.
show more ...
|
#
23bfc271 |
| 16-Jan-2024 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Use `ignoreCFGOmittedNodes()` in `setValue()`. (#78245)
This is to be consistent with `getValue()`, which also uses `ignoreCFGOmittedNodes()`.
Before this fix, it was not possible
[clang][dataflow] Use `ignoreCFGOmittedNodes()` in `setValue()`. (#78245)
This is to be consistent with `getValue()`, which also uses `ignoreCFGOmittedNodes()`.
Before this fix, it was not possible to retrieve a `Value` from a "CFG omitted" node that had previously been set using `setValue()`; see the accompanying test, which fails without the fix.
I discovered this issue while running internal integration tests on https://github.com/llvm/llvm-project/pull/78127.
show more ...
|
#
c19cacfa |
| 16-Jan-2024 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Tighten checking for existence of a function body. (#78163)
In various places, we would previously call `FunctionDecl::hasBody()` (which checks whether any redeclaration of the fun
[clang][dataflow] Tighten checking for existence of a function body. (#78163)
In various places, we would previously call `FunctionDecl::hasBody()` (which checks whether any redeclaration of the function has a body, not necessarily the one on which `hasBody()` is being called).
This is bug-prone, as a recent bug in Crubit's nullability checker has shown
([fix](https://github.com/google/crubit/commit/4b01ed0f14d953cda20f92d62256e7365d206b2e), [fix for the fix](https://github.com/google/crubit/commit/e0c5d8ddd7d647da483c2ae198ff91d131c12055)).
Instead, we now use `FunctionDecl::doesThisDeclarationHaveABody()` which, as the name implies, checks whether the specific redeclaration it is being called on has a body.
Alternatively, I considered being more lenient and "canonicalizing" to the `FunctionDecl` that has the body if the `FunctionDecl` being passed is a different redeclaration. However, this also risks hiding bugs: A caller might inadverently perform the analysis for all redeclarations of a function and end up duplicating work without realizing it. By accepting only the redeclaration that contains the body, we prevent this.
I've checked, and all clients that I'm aware of do currently pass in the redeclaration that contains the function body. Typically this is because they use the `ast_matchers::hasBody()` matcher which, unlike `FunctionDecl::hasBody()`, only matches for the redeclaration containing the body.
show more ...
|
#
2ee396b0 |
| 21-Dec-2023 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Add `Environment::get<>()`. (#76027)
This template function casts the result of `getValue()` or
`getStorageLocation()` to a given subclass of `Value` or
`StorageLocation` (using
[clang][dataflow] Add `Environment::get<>()`. (#76027)
This template function casts the result of `getValue()` or
`getStorageLocation()` to a given subclass of `Value` or
`StorageLocation` (using `cast_or_null`).
It's a common pattern to do something like this:
```cxx
auto *Val = cast_or_null<PointerValue>(Env.getValue(E));
```
This can now be expressed more concisely like this:
```cxx
auto *Val = Env.get<PointerValue>(E);
```
Instead of adding a new method `get()`, I had originally considered
simply adding a template parameter to `getValue()` and
`getStorageLocation()` (with a default argument of `Value` or
`StorageLocation`), but this results in an undesirable repetition at the
callsite, e.g. `getStorageLocation<RecordStorageLocation>(...)`. The
`Value` and `StorageLocation` in the method name adds nothing of value
when the template argument already contains this information, so it
seemed best to shorten the method name to simply `get()`.
show more ...
|
#
ca103434 |
| 18-Dec-2023 |
martinboehme <mboehme@google.com> |
[clang][dataflow] Fix an issue with `Environment::getResultObjectLocation()`. (#75483)
So far, if there was a chain of record type prvalues, `getResultObjectLocation()` would assign a different resu
[clang][dataflow] Fix an issue with `Environment::getResultObjectLocation()`. (#75483)
So far, if there was a chain of record type prvalues, `getResultObjectLocation()` would assign a different result object location to each one. This makes no sense, of course, as all of these prvalues end up initializing the same result object.
This patch fixes this by propagating storage locations up through the entire chain of prvalues.
The new implementation also has the desirable effect of making it possible to make `getResultObjectLocation()` const, which seems appropriate given that, logically, it is just an accessor.
show more ...
|
#
40381d12 |
| 05-Dec-2023 |
Samira Bazuzi <bazuzi@users.noreply.github.com> |
[clang][dataflow] Re-land: Retrieve members from accessors called usi… (#74336)
…ng member pointers.
This initially landed with a broken test due to a mid-air collision with
a new requirement fo
[clang][dataflow] Re-land: Retrieve members from accessors called usi… (#74336)
…ng member pointers.
This initially landed with a broken test due to a mid-air collision with
a new requirement for Environment initialization before field modeling.
Have added that initialization in the test.
From first landing:
getMethodDecl does not handle pointers to members and returns nullptr
for them. getMethodDecl contains a decade-plus-old FIXME to handle
pointers to members, but two approaches I looked at for fixing it are
more invasive or complex than simply swapping to getCalleeDecl.
The first, have getMethodDecl call getCalleeDecl, creates a large tree
of const-ness mismatches due to getMethodDecl returning a non-const
value while being a const member function and getCalleeDecl only being a
const member function when it returns a const value.
The second, implementing an AST walk to match how
CXXMemberCallExpr::getImplicitObjectArgument grabs the LHS of the binary
operator, is basically reimplementing Expr::getReferencedDeclOfCallee,
which is used by Expr::getCalleeDecl. We don't need another copy of that
code.
show more ...
|