| <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" |
| "http://www.w3.org/TR/html4/strict.dtd"> |
| <html> |
| <head> |
| <title>Open Projects</title> |
| <link type="text/css" rel="stylesheet" href="menu.css"> |
| <link type="text/css" rel="stylesheet" href="content.css"> |
| <script type="text/javascript" src="scripts/menu.js"></script> |
| </head> |
| <body> |
| |
| <div id="page"> |
| <!--#include virtual="menu.html.incl"--> |
| <div id="content"> |
| |
| <h1>Open Projects</h1> |
| |
| <p>This page lists several projects that would boost the analyzer's usability and |
| power. Most of the projects listed here are infrastructure-related so this list |
| is an addition to the <a href="potential_checkers.html">potential checkers |
| list</a>. If you are interested in tackling one of these, please send an email |
| to the <a href="https://lists.llvm.org/mailman/listinfo/cfe-dev">cfe-dev |
| mailing list</a> to notify other members of the community.</p> |
| |
| <ul> |
| <li>Release checkers from "alpha" |
| <p>New checkers which were contributed to the analyzer, |
| but have not passed a rigorous evaluation process, |
| are committed as "alpha checkers" (from "alpha version"), |
| and are not enabled by default.</p> |
| |
| <p>Ideally, only the checkers which are actively being worked on should be in |
| "alpha", |
| but over the years the development of many of those has stalled. |
| Such checkers should either be improved |
| up to a point where they can be enabled by default, |
| or removed from the analyzer entirely.</p> |
| |
| <ul> |
| <li><code>alpha.security.ArrayBound</code> and |
| <code>alpha.security.ArrayBoundV2</code> |
| <p>Array bounds checking is a desired feature, |
| but having an acceptable rate of false positives might not be possible |
| without proper |
| <a href="https://en.wikipedia.org/wiki/Widening_(computer_science)">loop widening</a> support. |
| Additionally, it might be more promising to perform index checking based on |
| <a href="https://en.wikipedia.org/wiki/Taint_checking">tainted</a> index values.</p> |
| <p><i>(Difficulty: Medium)</i></p> |
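<p>As a rough illustration of the two situations mentioned above, the sketch
below shows an index that arrives from outside the program (and so would be
considered tainted) and a loop whose bound is symbolic. The names
<code>kTable</code>, <code>lookup</code> and <code>sum_prefix</code> are
hypothetical and are not taken from the analyzer's test suite.</p>

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical lookup table, used only for illustration.
static const int kTable[4] = {10, 20, 30, 40};

// The index is supplied by the caller and may come from untrusted input.
// A taint-based check would look for a range test like this one before
// the index is used.
int lookup(std::size_t untrusted_index) {
  if (untrusted_index >= sizeof(kTable) / sizeof(kTable[0]))
    return -1;  // reject out-of-range input instead of reading past the end
  return kTable[untrusted_index];
}

// The loop bound n is symbolic. Proving that data[i] stays in bounds for
// every iteration is the kind of reasoning that needs loop widening.
int sum_prefix(const int *data, std::size_t n) {
  assert(data != nullptr || n == 0);
  int sum = 0;
  for (std::size_t i = 0; i < n; ++i)
    sum += data[i];
  return sum;
}
```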
| </li> |
| |
| <li><code>alpha.unix.StreamChecker</code> |
| <p>A SimpleStreamChecker has been presented in the "Building a Checker in 24 |
| Hours" talk |
| (<a href="https://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf">slides</a>, |
| <a href="https://youtu.be/kdxlsP5QVPw">video</a>).</p> |
| |
| <p>This alpha checker is an attempt to write a production-grade stream checker. |
| However, it was found to have an unacceptably high false positive rate. |
| One of the problems found was that eagerly splitting the state |
| based on whether the system call may fail leads to too many reports. |
| A <em>delayed</em> split where the implication is stored in the state |
| (similarly to nullability implications in <code>TrustNonnullChecker</code>) |
| may produce much better results.</p> |
| <p><i>(Difficulty: Medium)</i></p> |
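<p>For context, the kind of code a stream checker has to reason about looks
roughly like the sketch below. Each stream call may fail, and each is a point
where a checker could split the state into a success case and a failure case.
The function <code>first_byte</code> is a hypothetical example, not part of
the analyzer.</p>

```cpp
#include <cassert>
#include <cstdio>

// Returns the first byte of the file at path, or -1 on any failure.
int first_byte(const char *path) {
  assert(path != nullptr);
  std::FILE *f = std::fopen(path, "rb");  // may fail and return a null pointer
  if (!f)
    return -1;
  int c = std::fgetc(f);  // may fail and return EOF
  std::fclose(f);
  return c == EOF ? -1 : c;
}
```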
| </li> |
| </ul> |
| </li> |
| |
| <li>Improve C++ support |
| <ul> |
| <li>Handle construction as part of aggregate initialization. |
| <p><a href="https://en.cppreference.com/w/cpp/language/aggregate_initialization">Aggregates</a> |
| are objects that can be brace-initialized without calling a |
| constructor (that is, <code><a href="https://clang.llvm.org/doxygen/classclang_1_1CXXConstructExpr.html"> |
| CXXConstructExpr</a></code> does not occur in the AST), |
| while still potentially calling |
| constructors for their fields and base classes. |
| These |
| constructors of sub-objects need to know what object they are constructing. |
| Moreover, if the aggregate contains |
| references, lifetime extension needs to be properly modeled.</p> |
| |
| <p>One can start untangling this problem by trying to replace the |
| current ad-hoc <code><a href="https://clang.llvm.org/doxygen/classclang_1_1ParentMap.html"> |
| ParentMap</a></code> lookup in the <a href="https://clang.llvm.org/doxygen/ExprEngineCXX_8cpp_source.html#l00430"> |
| <code>CXXConstructExpr::CK_NonVirtualBase</code></a> branch of |
| <code>ExprEngine::VisitCXXConstructExpr()</code> |
| with proper support for the feature.</p> |
| <p><i>(Difficulty: Medium)</i></p> |
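<p>A minimal sketch of the pattern described above, using the hypothetical
types <code>Field</code> and <code>Aggregate</code>: the aggregate itself has
no constructor, but brace-initializing it runs a constructor for one member
and binds a reference member to a lifetime-extended temporary.</p>

```cpp
#include <cassert>

struct Field {
  int value;
  Field(int v) : value(v + 1) {}  // user-provided constructor
};

struct Aggregate {  // no constructors of its own, so this is an aggregate
  Field f;
  const int &ref;   // reference member, bound to a temporary below
};

int read_aggregate() {
  // No constructor of Aggregate is called here, but Field's constructor
  // runs for the member f, and the temporary bound to ref lives as long
  // as a does.
  Aggregate a{Field(1), 40};
  return a.f.value + a.ref;
}
```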
| </li> |
| |
| <li>Handle array constructors. |
| <p>When an array of objects is allocated (say, using the |
| <code>operator new[]</code> or defining a stack array), |
| constructors for all elements of the array are called. |
| We should model (potentially some of) such evaluations, |
| and the same applies for destructors called from |
| <code>operator delete[]</code>. |
| See test cases in <a href="https://github.com/llvm/llvm-project/tree/main/clang/test/Analysis/handle_constructors_with_new_array.cpp">handle_constructors_with_new_array.cpp</a>. |
| </p> |
| <p> |
| Constructing an array requires invoking a (potentially unknown) |
| number of constructors with the same construct-expression. |
| Apart from the technical difficulties of juggling program points around |
| correctly to avoid accidentally merging paths together, we'll have to |
| decide when to exit the loop and how to widen it. |
| Given that the constructor is going to be a default constructor, |
| a nice 95% solution might be to execute exactly one constructor and |
| then default-bind the resulting LazyCompoundVal to the whole array; |
| it'll work whenever the default constructor doesn't touch global state |
| but only initializes the object to various default values. |
| But if, say, we're making an array of strings, |
| depending on the implementation you might have to allocate a new buffer |
| for each string, and in this case default-binding won't cut it. |
| We might want to come up with an auxiliary analysis in order to perform |
| widening of these simple loops more precisely. |
| </p> |
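<p>The sketch below shows a case where running a single constructor and
default-binding its result would not be enough, because the default
constructor also updates global state. <code>Counter</code> and
<code>make_and_destroy</code> are hypothetical names used only for
illustration.</p>

```cpp
#include <cassert>

struct Counter {
  static int live;  // number of currently constructed objects (global state)
  int value;
  Counter() : value(7) { ++live; }
  ~Counter() { --live; }
};
int Counter::live = 0;

int make_and_destroy(int n) {
  assert(n > 0);
  Counter *arr = new Counter[n];  // runs the default constructor n times
  int seen = Counter::live * arr[n - 1].value;
  delete[] arr;                   // runs the destructor n times
  return seen;
}
```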
| </li> |
| |
| <li>Handle constructors that can be elided due to Named Return Value Optimization (NRVO) |
| <p>Local variables which are returned by value in all return statements |
| may be stored directly at the address for the return value, |
| eliding the copy or move constructor call. |
| Such variables can be identified using the AST call <code>VarDecl::isNRVOVariable</code>. |
| </p> |
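<p>A small sketch of a function whose local variable is an NRVO candidate.
<code>Tracked</code> is a hypothetical type; its counter stays at zero only
when the compiler actually applies NRVO, which is permitted but not
required.</p>

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Tracked {
  static int copies_or_moves;  // counts copy and move constructor calls
  std::string name;
  explicit Tracked(const char *n) : name(n) {}
  Tracked(const Tracked &o) : name(o.name) { ++copies_or_moves; }
  Tracked(Tracked &&o) : name(std::move(o.name)) { ++copies_or_moves; }
};
int Tracked::copies_or_moves = 0;

Tracked make_tracked() {
  Tracked result("nrvo");  // returned by name on every path: NRVO candidate
  return result;           // the copy or move may be elided here
}
```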
| </li> |
| |
| <li>Handle constructors of lambda captures |
| <p>Variables which are captured by value into a lambda require a call to |
| a copy constructor. |
| This call is not currently modeled. |
| </p> |
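<p>For illustration, the hypothetical type <code>Noisy</code> below makes the
copy observable: capturing <code>n</code> by value runs its copy constructor
when the lambda object is created.</p>

```cpp
#include <cassert>

struct Noisy {
  static int copies;  // counts copy constructor calls
  int v;
  explicit Noisy(int x) : v(x) {}
  Noisy(const Noisy &o) : v(o.v) { ++copies; }
};
int Noisy::copies = 0;

int capture_by_value() {
  Noisy n(5);
  auto f = [n]() { return n.v; };  // n is copied into the closure object
  return f();
}
```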
| </li> |
| |
| <li>Handle constructors for default arguments |
| <p>Default arguments in C++ are recomputed at every call, |
| and are therefore local, and not static, variables. |
| See test cases in <a href="https://github.com/llvm/llvm-project/tree/main/clang/test/Analysis/handle_constructors_for_default_arguments.cpp">handle_constructors_for_default_arguments.cpp</a>. |
| </p> |
| <p> |
| Default arguments are annoying because the initializer expression is |
| evaluated at the call site but doesn't syntactically belong to the |
| caller's AST; instead it belongs to the ParmVarDecl for the default |
| parameter. This can lead to situations where the same expression has to |
| carry different values simultaneously, namely |
| when multiple calls to the same function are evaluated as part of the |
| same full-expression without specifying the default arguments. |
| Even simply calling the function twice (not necessarily within the |
| same full-expression) may lead to program points being merged because |
| it's the same expression. There are some nasty test cases already |
| in temporaries.cpp (struct DefaultParam and so on). I recommend adding a |
| new LocationContext kind specifically to deal with this problem. It'll |
| also help you figure out the construction context when you evaluate the |
| construct-expression (though you might still need to do some additional |
| CFG work to get construction contexts right). |
| </p> |
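<p>The situation described above can be reproduced with a sketch like the
following, where one default-argument expression is evaluated twice within a
single full-expression. <code>Widget</code> and <code>get_value</code> are
hypothetical names and are not the test cases from temporaries.cpp.</p>

```cpp
#include <cassert>

struct Widget {
  static int constructed;  // counts constructor calls
  int value;
  Widget(int v) : value(v) { ++constructed; }
};
int Widget::constructed = 0;

// The default argument Widget(3) belongs to the parameter declaration,
// but it is evaluated anew at every call site that omits the argument.
int get_value(const Widget &w = Widget(3)) { return w.value; }

int sum_two_calls() {
  // The same default-argument expression produces two distinct temporaries
  // within one full-expression.
  return get_value() + get_value();
}
```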
| </li> |
| |
| <li>Enhance the modeling of the standard library. |
| <p>The analyzer needs a better understanding of STL in order to be more |
| useful on C++ codebases. |
| While full library modeling is not an easy task, |
| large gains can be achieved by supporting only a few cases: |
| e.g. calling <code>.length()</code> on an empty |
| <code>std::string</code> always yields zero.</p> |
| <p><i>(Difficulty: Medium)</i></p> |
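<p>A sketch of the <code>.length()</code> case: if the analyzer knew that the
length of a default-constructed string is zero, it could tell that the
division below is never reached. <code>safe_ratio</code> is a hypothetical
function.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

int safe_ratio(int total) {
  std::string s;                // default-constructed, therefore empty
  std::size_t n = s.length();   // always 0 for an empty string
  if (n == 0)
    return 0;                   // the only path actually taken
  return total / static_cast<int>(n);
}
```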
| </li> |
| |
| <li>Enhance CFG to model exception-handling. |
| <p>Currently exceptions are treated as "black holes", and exception-handling |
| control structures are poorly modeled in order to be conservative. |
| This could be improved for both C++ and Objective-C exceptions.</p> |
| <p><i>(Difficulty: Hard)</i></p> |
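<p>For reference, a minimal example of the control flow involved: the edge
from the <code>throw</code> to the matching <code>catch</code> is what a more
precise CFG would need to represent. The function names are hypothetical.</p>

```cpp
#include <cassert>
#include <stdexcept>

int parse_positive(int x) {
  if (x < 0)
    throw std::invalid_argument("negative");  // leaves the function abnormally
  return x;
}

int guarded(int x) {
  try {
    return parse_positive(x);
  } catch (const std::invalid_argument &) {
    return -1;  // reached only through the exceptional edge
  }
}
```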
| </li> |
| </ul> |
| </li> |
| |
| <li>Core Analyzer Infrastructure |
| <ul> |
| <li>Handle unions. |
| <p>Currently in the analyzer the value of a union is always regarded as |
| unknown. |
| This problem was |
| previously <a href="https://lists.llvm.org/pipermail/cfe-dev/2017-March/052864.html">discussed</a> |
| on the mailing list, but no solution was implemented.</p> |
| <p><i>(Difficulty: Medium)</i></p> |
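<p>Even the simplest use of a union is affected, as in this sketch where a
member is written and then read back. <code>IntOrBytes</code> is a
hypothetical type.</p>

```cpp
#include <cassert>

union IntOrBytes {
  int i;
  unsigned char bytes[sizeof(int)];
};

int write_then_read(int v) {
  IntOrBytes u;
  u.i = v;
  // The value read here equals v, but an analysis that treats every union
  // value as unknown cannot conclude that.
  return u.i;
}
```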
| </li> |
| |
| <li>Floating-point support. |
| <p>Currently, the analyzer treats all floating-point values as unknown. |
| This project would involve adding a new <code>SVal</code> kind |
| for constant floats, generalizing the constraint manager to handle floats, |
| and auditing existing code to make sure it doesn't |
| make incorrect assumptions (most notably, that <code>X == X</code> |
| is always true, since it does not hold for <code>NaN</code>).</p> |
| <p><i>(Difficulty: Medium)</i></p> |
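<p>The <code>NaN</code> caveat can be checked directly. The helper
<code>equals_itself</code> is hypothetical, and the sketch assumes IEEE 754
semantics (in particular, no fast-math style compiler options).</p>

```cpp
#include <cassert>
#include <limits>

// Returns false only when x is NaN, because NaN compares unequal to itself.
bool equals_itself(double x) { return x == x; }
```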
| </li> |
| |
| <li>Improved loop execution modeling. |
| <p>The analyzer simply unrolls each loop <tt>N</tt> times before |
| dropping the path, for a fixed constant <tt>N</tt>. |
| However, that results in lost coverage in cases where the loop always |
| executes more than <tt>N</tt> times. |
| A Google Summer of Code |
| <a href="https://summerofcode.withgoogle.com/archive/2017/projects/6071606019358720/">project</a> |
| was completed to make the loop bound parameterizable, |
| but the <a href="https://en.wikipedia.org/wiki/Widening_(computer_science)">widening</a> |
| problem still remains open.</p> |
| <p><i>(Difficulty: Hard)</i></p> |
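<p>A sketch of the coverage problem: the loop below always runs 1000 times, so
if the unrolling limit <tt>N</tt> is smaller than that, no analyzed path gets
past the loop and the code after it is never examined.
<code>after_long_loop</code> is a hypothetical function.</p>

```cpp
#include <cassert>

int after_long_loop() {
  int sum = 0;
  for (int i = 0; i < 1000; ++i)  // constant trip count of 1000
    sum += i;
  // Everything from here on is reached only after 1000 iterations.
  return sum;
}
```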
| </li> |
| |
| <li>Basic function summarization support |
| <p>The analyzer performs inter-procedural analysis using |
| either inlining or "conservative evaluation" (invalidating all data |
| passed to the function). |
| Often, a very simple summary |
| (e.g. "this function is <a href="https://en.wikipedia.org/wiki/Pure_function">pure</a>") would be |
| enough to be a large improvement over conservative evaluation. |
| Such summaries could be obtained either syntactically, |
| or using a dataflow framework.</p> |
| <p><i>(Difficulty: Hard)</i></p> |
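<p>To make the benefit concrete, consider the sketch below. If
<code>length_squared</code> could not be inlined, conservative evaluation
would invalidate the array passed to it, whereas a "pure" summary would let
the analyzer keep what it knows about <code>v</code>. Both function names are
hypothetical.</p>

```cpp
#include <cassert>

// Hypothetical pure function: it reads its argument but writes nothing.
int length_squared(int *vec) {
  assert(vec != nullptr);
  return vec[0] * vec[0] + vec[1] * vec[1];
}

int compute() {
  int v[2] = {3, 4};
  int r = length_squared(v);
  // With a "pure" summary, v[0] would still be known to be 3 after the call.
  return r + v[0];
}
```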
| </li> |
| |
| <li>Implement a dataflow framework. |
| <p>The analyzer core |
| implements a <a href="https://en.wikipedia.org/wiki/Symbolic_execution">symbolic execution</a> |
| engine, which performs checks |
| (use-after-free, uninitialized value read, etc.) |
| over a <em>single</em> program path. |
| However, many useful properties |
| (dead code, check-after-use, etc.) require |
| reasoning over <em>all</em> possible paths in a program. |
| Such reasoning requires a |
| <a href="https://en.wikipedia.org/wiki/Data-flow_analysis">dataflow analysis</a> framework. |
| Clang already implements |
| a few dataflow analyses (most notably, liveness), |
| but they are implemented in an ad-hoc fashion. |
| A proper framework would enable us to write many more useful checkers.</p> |
| <p><i>(Difficulty: Hard)</i></p> |
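<p>As an example of a check-after-use pattern, the hypothetical function below
dereferences a pointer and only afterwards tests it for null. Spotting this
reliably means knowing that the dereference happens on every path leading to
the test.</p>

```cpp
#include <cassert>

int deref_then_check(int *p) {
  int v = *p;        // use: p is dereferenced unconditionally
  if (p == nullptr)  // check: too late, p was already used above
    return -1;
  return v;
}
```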
| </li> |
| |
| <li>Track type information through casts more precisely. |
| <p>The <code>DynamicTypePropagation</code> |
| checker is in charge of inferring a region's |
| dynamic type based on what operations the code is performing. |
| Casts are a rich source of type information that the analyzer currently ignores.</p> |
| <p><i>(Difficulty: Medium)</i></p> |
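<p>A sketch of the kind of hint a cast carries: the downcast below only makes
sense if the object is a <code>Square</code> (or derived from it), which
narrows down the target of the following virtual call. The class and function
names are hypothetical.</p>

```cpp
#include <cassert>

struct Shape {
  virtual ~Shape() {}
  virtual int sides() const { return 0; }
};

struct Square : Shape {
  int sides() const override { return 4; }
};

int sides_of(Shape *s) {
  assert(s != nullptr);
  // The cast expresses the programmer's belief about the dynamic type of *s.
  Square *sq = static_cast<Square *>(s);
  return sq->sides();
}
```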
| </li> |
| |
| </ul> |
| </li> |
| |
| <li>Fixing miscellaneous bugs |
| <p>Apart from the open projects listed above, |
| contributors are welcome to fix any of the outstanding |
| <a href="https://bugs.llvm.org/buglist.cgi?component=Static%20Analyzer&list_id=147756&product=clang&resolution=---">bugs</a> |
| in Bugzilla.</p> |
| <p><i>(Difficulty: Anything)</i></p> |
| </li> |
| |
| </ul> |
| |
| </div> |
| </div> |
| </body> |
| </html> |