<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
          "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>Open Projects</title>
  <link type="text/css" rel="stylesheet" href="menu.css">
  <link type="text/css" rel="stylesheet" href="content.css">
  <script type="text/javascript" src="scripts/menu.js"></script>
</head>
<body>

<div id="page">
<!--#include virtual="menu.html.incl"-->
<div id="content">

<h1>Open Projects</h1>

<p>This page lists several projects that would boost the analyzer's usability
and power. Most of the projects listed here are infrastructure-related, so this
list complements the <a href="potential_checkers.html">potential checkers
list</a>. If you are interested in tackling one of these, please send an email
to the <a href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev">cfe-dev
mailing list</a> to notify other members of the community.</p>

<ul>
  <li>Core Analyzer Infrastructure
  <ul>
    <li>Explicitly model standard library functions with <tt>BodyFarm</tt>.
    <p><tt><a href="http://clang.llvm.org/doxygen/classclang_1_1BodyFarm.html">BodyFarm</a></tt>
    allows the analyzer to explicitly model functions whose definitions are
    not available during analysis. Modeling more of the widely used functions
    (such as the members of <tt>std::string</tt>) will improve the precision
    of the analysis.
    <i>(Difficulty: Easy, ongoing)</i></p>
    </li>

    <li>Handle floating-point values.
    <p>Currently, the analyzer treats all floating-point values as unknown.
    However, we already have most of the infrastructure we need to handle
    floats: RangeConstraintManager. This would involve adding a new SVal kind
    for constant floats, generalizing the constraint manager to handle floats
    and integers equally, and auditing existing code to make sure it doesn't
    make untoward assumptions.
    <i>(Difficulty: Medium)</i></p>
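    <p>A minimal sketch of the kind of code this would help with. The function
    <code>classify</code> and its thresholds are hypothetical and not taken
    from the analyzer's tests. The inner branch is infeasible, but ruling it
    out requires a range constraint on a floating-point value; as long as such
    values are treated as unknown, the analyzer presumably cannot discard that
    path.</p>

```cpp
// Hypothetical example; the function name and thresholds are illustrative.
int classify(double ratio) {
  if (ratio > 1.0) {
    // Infeasible: ratio cannot be both above 1.0 and below 0.5.
    // Proving that needs a constraint such as "ratio > 1.0" on a float.
    if (ratio < 0.5)
      return -1;
    return 1;
  }
  return 0;
}
```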
    </li>

    <li>Implement generalized loop execution modeling.
    <p>Currently, the analyzer simply unrolls each loop <tt>N</tt> times. This
    means that it will not execute any code after the loop if the loop is
    guaranteed to execute more than <tt>N</tt> times. This results in lost
    basic block coverage. We could continue exploring the path if we could
    model a generic <tt>i</tt>-th iteration of a loop.
    <i>(Difficulty: Hard)</i></p>
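    <p>As an illustration, consider the hypothetical function below. It
    assumes, purely for the sake of the example, that the loop's constant trip
    count (100) exceeds the unroll limit <tt>N</tt>. Under that assumption no
    explored path leaves the loop, so the dereference after it would not be
    analyzed.</p>

```cpp
// Hypothetical example; assumes the trip count of 100 is larger than N.
int sumThenDeref(const int *data, const int *scale) {
  int sum = 0;
  for (int i = 0; i < 100; ++i) // always runs exactly 100 iterations
    sum += data[i];
  // Code after the loop: not reached by the analyzer if it stops unrolling
  // before the loop exits, so a null 'scale' would not be reported here.
  return sum * *scale;
}
```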
    </li>

    <li>Enhance CFG to model C++ temporaries properly.
    <p>There is an existing implementation of this, but it's not complete and
    is disabled in the analyzer.
    <i>(Difficulty: Medium; current contact: Alex McCarthy)</i></p>

    <li>Enhance CFG to model exception-handling properly.
    <p>Currently, exceptions are treated as "black holes", and
    exception-handling control structures are poorly modeled (to be
    conservative). This could be much improved for both C++ and Objective-C
    exceptions.
    <i>(Difficulty: Medium)</i></p>

    <li>Enhance CFG to model C++ <code>new</code> more precisely.
    <p>The current representation of <code>new</code> does not provide an easy
    way for the analyzer to model the call to a memory allocation function
    (<code>operator new</code>) followed by the initialization of the result
    with a constructor call. The problem is discussed at length in
    <a href="http://llvm.org/bugs/show_bug.cgi?id=12014">PR12014</a>.
    <i>(Difficulty: Easy; current contact: Karthik Bhat)</i></p>

    <li>Enhance CFG to model C++ <code>delete</code> more precisely.
    <p>Similarly, the representation of <code>delete</code> does not include
    the call to the destructor, followed by the call to the deallocation
    function (<code>operator delete</code>). One particular issue
    (<tt>noreturn</tt> destructors) is discussed in
    <a href="http://llvm.org/bugs/show_bug.cgi?id=15599">PR15599</a>.
    <i>(Difficulty: Easy; current contact: Karthik Bhat)</i></p>
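    <p>The order of operations that the <code>new</code> and
    <code>delete</code> items ask the CFG to expose can be observed at run
    time. The sketch below uses a hypothetical class <code>Node</code> whose
    class-specific <code>operator new</code> and <code>operator delete</code>
    record each step; every name in it is illustrative and unrelated to the
    analyzer's own code.</p>

```cpp
using SizeT = decltype(sizeof(0)); // same type as std::size_t, no header needed

// Event log: 'a' allocate, 'c' construct, 'd' destruct, 'f' free.
static char eventLog[8] = {};
static int eventCount = 0;

static void record(char event) {
  if (eventCount < 7)
    eventLog[eventCount++] = event;
}

// Hypothetical class that logs every step of its own new/delete expressions.
struct Node {
  int value;
  explicit Node(int v) : value(v) { record('c'); }
  ~Node() { record('d'); }
  static void *operator new(SizeT size) {
    record('a');
    return ::operator new(size);
  }
  static void operator delete(void *ptr) {
    record('f');
    ::operator delete(ptr);
  }
};

// Runs one new/delete pair and returns the recorded order of events.
const char *runNewDelete() {
  eventCount = 0;
  Node *node = new Node(42); // allocation function first, then constructor
  delete node;               // destructor first, then deallocation function
  eventLog[eventCount] = '\0';
  return eventLog;
}
```

    <p>Each expression therefore stands for two separate calls, which is the
    structure a more precise CFG would need to make visible.</p>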
    <li>Implement a BitwiseConstraintManager to handle <a href="http://llvm.org/bugs/show_bug.cgi?id=3098">PR3098</a>.
    <p>Constraints on the bits of an integer are not easily representable as
    ranges. A bitwise constraint manager would model constraints such as "bit 32
    is known to be 1". This would help when analyzing code that makes use of
    bitmasks.
    <i>(Difficulty: Medium)</i></p>
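    <p>One possible representation, offered as an assumption rather than a
    proposed design, is to track two masks per symbol: the bits known to be 0
    and the bits known to be 1. The type and function names below are
    hypothetical, and the sketch handles only values of type
    <code>unsigned</code>.</p>

```cpp
// Hypothetical representation of bitwise constraints on one symbol.
struct KnownBits {
  unsigned zeros; // bits known to be 0
  unsigned ones;  // bits known to be 1
};

// Record that bit 'index' is 1 (index must be below the width of unsigned).
constexpr KnownBits assumeBitSet(KnownBits known, unsigned index) {
  return {known.zeros, known.ones | (1u << index)};
}

// Record that bit 'index' is 0 (index must be below the width of unsigned).
constexpr KnownBits assumeBitClear(KnownBits known, unsigned index) {
  return {known.zeros | (1u << index), known.ones};
}

// A state is infeasible if some bit is required to be both 0 and 1.
constexpr bool isFeasible(KnownBits known) {
  return (known.zeros & known.ones) == 0;
}
```

    <p>A fact such as "bit 5 is 1" fits in a single mask here, whereas as a
    set of values it corresponds to many disjoint ranges.</p>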
    </li>

    <li>Track type info through casts more precisely.
    <p>The DynamicTypePropagation checker is in charge of inferring a region's
    dynamic type based on what operations the code is performing. Casts are a
    rich source of type information that the analyzer currently ignores. They
    are tricky to get right, but getting them right could be very useful.
    <i>(Difficulty: Medium)</i></p>

    <li>Design and implement alpha-renaming.
    <p>Unify two symbolic values along a path once a comparison has determined
    them to be equal. This would allow us to reduce the number of false
    positives and would be a stepping stone toward more advanced analyses,
    such as summary-based interprocedural and cross-translation-unit
    analysis.
    <i>(Difficulty: Hard)</i></p>
    </li>
  </ul>
  </li>

  <li>Bug Reporting
  <ul>
    <li>Add support for displaying cross-file diagnostic paths in HTML output
    (used by <tt>scan-build</tt>).
    <p>Currently, <tt>scan-build</tt> output does not display reports that span
    multiple files. The main problem is that we do not have a good format to
    display such paths in HTML output. <i>(Difficulty: Medium)</i></p>
    </li>

    <li>Refactor path diagnostic generation in <a href="http://clang.llvm.org/doxygen/BugReporter_8cpp_source.html">BugReporter.cpp</a>.
    <p>It would be great to have more code reuse between the "Minimal" and
    "Extensive" PathDiagnostic generation algorithms. One idea is to create an
    IR for representing path diagnostics, which would later be used to
    generate minimal or extensive report output. <i>(Difficulty: Medium)</i></p>
    </li>
  </ul>
  </li>

  <li>Other Infrastructure
  <ul>
    <li>Rewrite <tt>scan-build</tt> (in Python).
    <p><i>(Difficulty: Easy)</i></p>
    </li>

    <li>Do a better job interposing on a compilation.
    <p>Currently, <tt>scan-build</tt> just sets the <tt>CC</tt> and <tt>CXX</tt>
    environment variables to its wrapper scripts, which then call into an
    underlying platform compiler. This is problematic for any project that
    doesn't exclusively use <tt>CC</tt> and <tt>CXX</tt> to control its
    compilers.
    <i>(Difficulty: Medium-Hard)</i></p>
    </li>

    <li>Create an <tt>analyzer_annotate</tt> attribute for the analyzer
    annotations.
    <p>We would like to put all analyzer attributes behind a fence so that we
    could add/remove them without worrying that compiler (not analyzer) users
    depend on them. Design and implement such a generic analyzer attribute in
    the compiler. <i>(Difficulty: Medium)</i></p>
    </li>
  </ul>
  </li>

  <li>Enhanced Checks
  <ul>
    <li>Implement a production-ready StreamChecker.
    <p>A SimpleStreamChecker was presented in the "Building a Checker in 24
    Hours" talk
    (<a href="http://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf">slides</a>,
    <a href="http://llvm.org/devmtg/2012-11/videos/Zaks-Rose-Checker24Hours.mp4">video</a>).
    We need to implement a production version of the checker with a richer set
    of APIs and evaluate it by running it on real codebases.
    <i>(Difficulty: Easy)</i></p>
    </li>

    <li>Extend the Malloc checker to reason about custom allocator,
    deallocator, and ownership-transfer functions.
    <p>This would require extending the MallocPessimistic checker to reason
    about annotated functions. Ideally, the implementation would rely on
    the <tt>analyzer_annotate</tt> attribute, as described above.
    <i>(Difficulty: Easy)</i></p>
    </li>

    <li>Implement a BitwiseMaskingChecker to handle <a href="http://llvm.org/bugs/show_bug.cgi?id=16615">PR16615</a>.
    <p>Symbolic expressions of the form <code>$sym &amp; CONSTANT</code> can
    range from 0 to <code>CONSTANT</code> if <code>CONSTANT</code> is
    <code>2^n-1</code>, e.g. 0xFF (0b11111111), 0x7F (0b01111111), 0x3
    (0b0011), 0xFFFF, etc. Even without handling general bitwise operations on
    symbols, we can at least bound the value of the resulting expression. Bonus
    points for handling masks followed by shifts, e.g.
    <code>($sym &amp; 0b1100) &gt;&gt; 2</code>.
    <i>(Difficulty: Easy)</i></p>
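    <p>The bound itself is simple to state: no bit outside the mask survives
    the AND, so for an unsigned symbol the result can never exceed the mask,
    and a following right shift scales that bound down. The sketch below shows
    only this arithmetic; <code>Range</code> and both function names are
    hypothetical and do not correspond to existing analyzer APIs.</p>

```cpp
// Hypothetical helper type; not part of the analyzer.
struct Range {
  unsigned long long lo;
  unsigned long long hi;
};

// Bound on (sym & mask) for an arbitrary unsigned sym: the result cannot
// have bits set outside mask, so it is at most mask.
constexpr Range boundAfterMask(unsigned long long mask) {
  return {0, mask};
}

// Bound on ((sym & mask) >> shift); shift must be below 64.
constexpr Range boundAfterMaskAndShift(unsigned long long mask,
                                       unsigned shift) {
  return {0, mask >> shift};
}
```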
    </li>

    <li>Implement an iterator invalidation checker.
    <p><i>(Difficulty: Easy)</i></p>
    </li>

    <li>Write checkers that catch copy-and-paste errors.
    <p>Take a look at the
    <a href="http://pages.cs.wisc.edu/~shanlu/paper/TSE-CPMiner.pdf">CP-Miner</a>
    paper for inspiration.
    <i>(Difficulty: Medium-Hard; current contacts: Daniel Marjam&auml;ki and Daniel Fahlgren)</i></p>
    </li>
  </ul>
  </li>
</ul>

</div>
</div>
</body>
</html>