============================
Taint Analysis Configuration
============================

The Clang Static Analyzer uses taint analysis to detect security-related issues in code.
The backbone of taint analysis in the Clang SA is the `GenericTaintChecker`, which the user can access via the :ref:`alpha-security-taint-TaintPropagation` checker alias.
This checker has a default taint-related configuration.
The built-in default settings are defined in code, and they are always in effect once the checker is enabled, either directly or via the alias.
The checker also provides a configuration interface for extending the default settings by providing a configuration file in `YAML <http://llvm.org/docs/YamlIO.html#introduction-to-yaml>`_ format.
This documentation describes the syntax of the configuration file and gives the informal semantics of the configuration options.

.. contents::
   :local:

.. _clangsa-taint-configuration-overview:

Overview
________

Taint analysis works by checking for the occurrence of special operations during the symbolic execution of the program.
Taint analysis defines sources, sinks, and propagation rules. It identifies errors by detecting a flow of information that originates from a taint source, reaches a taint sink, and propagates through the program paths via propagation rules.
Which operations count as a source, a sink, or a propagator of taint is mainly domain-specific knowledge, but there are some built-in defaults provided by :ref:`alpha-security-taint-TaintPropagation`.
It is possible to express that a statement sanitizes tainted values by providing a ``Filters`` section in the external configuration (see :ref:`clangsa-taint-configuration-example` and :ref:`clangsa-taint-filter-details`).
There are no default filters defined in the built-in settings.
The checker's documentation also specifies how to provide a custom taint configuration with command-line options.

.. _clangsa-taint-configuration-example:

Example configuration file
__________________________

.. code-block:: yaml

  # The entries that specify arguments use 0-based indexing when specifying
  # input arguments, and -1 is used to denote the return value.

  Filters:
    # Filter functions
    # Taint is sanitized when tainted variables are passed as arguments to filters.

    # Filter function
    #   void cleanse_first_arg(int* arg)
    #
    # Result example:
    #   int x; // x is tainted
    #   cleanse_first_arg(&x); // x is not tainted after the call
    - Name: cleanse_first_arg
      Args: [0]

  Propagations:
    # Source functions
    # The omission of SrcArgs key indicates unconditional taint propagation,
    # which is conceptually what a source does.

    # Source function
    #   size_t fread(void *ptr, size_t size, size_t nmemb, FILE * stream)
    #
    # Result example:
    #   FILE* f = fopen("file.txt", "r");
    #   char buf[1024];
    #   size_t read = fread(buf, sizeof(buf[0]), sizeof(buf)/sizeof(buf[0]), f);
    #   // both read and buf are tainted
    - Name: fread
      DstArgs: [0, -1]

    # Propagation functions
    # The presence of SrcArgs key indicates conditional taint propagation,
    # which is conceptually what a propagator does.

    # Propagation function
    #   char *dirname(char *path)
    #
    # Result example:
    #   char* path = read_path();
    #   char* dir = dirname(path);
    #   // dir is tainted if path was tainted
    - Name: dirname
      SrcArgs: [0]
      DstArgs: [-1]

  Sinks:
    # Sink functions
    # If taint reaches any of the arguments specified, a warning is emitted.

    # Sink function
    #   int system(const char* command)
    #
    # Result example:
    #   const char* command = read_command();
    #   system(command); // emit diagnostic if command is tainted
    - Name: system
      Args: [0]

In the example file above, the entries under the `Propagations` key implement the conceptual sources and propagations, and sinks have their dedicated `Sinks` key.
The user can define operations (function calls) where the tainted values should be cleansed by listing entries under the `Filters` key.
Filters model the sanitization of values done by the programmer, and providing these is key to avoiding false-positive findings.

Configuration file syntax and semantics
_______________________________________

The configuration file should have valid `YAML <http://llvm.org/docs/YamlIO.html#introduction-to-yaml>`_ syntax.

The configuration file can have the following top-level keys:
 - Filters
 - Propagations
 - Sinks

Under the `Filters` key, the user can specify a list of operations that remove taint (see :ref:`clangsa-taint-filter-details` for details).

Under the `Propagations` key, the user can specify a list of operations that introduce and propagate taint (see :ref:`clangsa-taint-propagation-details` for details).
The user marks an entry under the `Propagations` key as a taint source by omitting the `SrcArgs` key, while propagation entries include it.
The lack of the `SrcArgs` key means unconditional propagation, which is how sources are modeled.
The semantics of propagations are such that if any of the source arguments (specified by indexes in `SrcArgs`) are tainted, then all of the destination arguments (specified by indexes in `DstArgs`) also become tainted.

Under the `Sinks` key, the user can specify a list of operations where the checker should emit a bug report if tainted data reaches it (see :ref:`clangsa-taint-sink-details` for details).

.. _clangsa-taint-filter-details:

Filter syntax and semantics
###########################

An entry under `Filters` is a `YAML <http://llvm.org/docs/YamlIO.html#introduction-to-yaml>`_ object with the following mandatory keys:
 - `Name` is a string that specifies the name of a function.
   When the checker encounters this function during symbolic execution, it sanitizes taint from the memory region referred to by the given arguments or returns a sanitized value.
 - `Args` is a list of numbers in the range of ``[-1..int_max]``.
   It indicates the indexes of arguments in the function call.
   The number ``-1`` signifies the return value; other numbers identify call arguments.
   The values of these arguments are considered clean after the function call.

The following keys are optional:
 - `Scope` is a string that specifies the prefix of the function's name in its fully qualified name. This option restricts the set of matching function calls. It can encode not only namespaces but struct/class names as well to match member functions.

.. _clangsa-taint-propagation-details:

Propagation syntax and semantics
################################

An entry under `Propagations` is a `YAML <http://llvm.org/docs/YamlIO.html#introduction-to-yaml>`_ object with the following mandatory keys:
 - `Name` is a string that specifies the name of a function.
   When the checker encounters this function during symbolic execution, it propagates taint from one or more arguments to other arguments and possibly the return value.
   It helps model the taint-related behavior of functions that are not analyzable otherwise.

The following keys are optional:
 - `Scope` is a string that specifies the prefix of the function's name in its fully qualified name. This option restricts the set of matching function calls.
 - `SrcArgs` is a list of numbers in the range of ``[0..int_max]`` that indicates the indexes of arguments in the function call.
   Taint-propagation considers the values of these arguments during the evaluation of the function call.
   If any `SrcArgs` arguments are tainted, the checker will consider all `DstArgs` arguments tainted after the call.
 - `DstArgs` is a list of numbers in the range of ``[-1..int_max]`` that indicates the indexes of arguments in the function call.
   The number ``-1`` specifies the return value of the function.
   If any `SrcArgs` arguments are tainted, the checker will consider all `DstArgs` arguments tainted after the call.
 - `VariadicType` is a string that can be one of ``None``, ``Dst``, ``Src``.
   It is used in conjunction with `VariadicIndex` to specify arguments inside a variadic argument.
   The value of ``Src`` will treat every call site argument that is part of a variadic argument list as a source concerning propagation rules (as if specified by `SrcArgs`).
   The value of ``Dst`` will treat every call site argument that is part of a variadic argument list as a destination concerning propagation rules.
   The value of ``None`` will not consider the arguments that are part of a variadic argument list (this option is redundant but can be used to temporarily switch off handling of a particular variadic argument option without removing the `VariadicIndex` key).
 - `VariadicIndex` is a number in the range of ``[0..int_max]``. It indicates the starting index of the variadic argument in the signature of the function.


.. _clangsa-taint-sink-details:

Sink syntax and semantics
#########################

An entry under `Sinks` is a `YAML <http://llvm.org/docs/YamlIO.html#introduction-to-yaml>`_ object with the following mandatory keys:
 - `Name` is a string that specifies the name of a function.
   When the checker encounters this function during symbolic execution, it emits a taint-related diagnostic if any of the arguments specified with `Args` are tainted at the call site.
 - `Args` is a list of numbers in the range of ``[0..int_max]`` that indicates the indexes of arguments in the function call.
   The checker reports an error if any of the specified arguments are tainted.

The following keys are optional:
 - `Scope` is a string that specifies the prefix of the function's name in its fully qualified name. This option restricts the set of matching function calls.
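
Example with optional keys
##########################

The optional keys described in the sections above (`Scope`, `VariadicType`, and `VariadicIndex`) can be combined with the mandatory ones roughly as in the following sketch.
All function and namespace names here (``my_scanf``, ``my_snprintf``, ``myNamespace``, ``escape``, ``run_command``) are hypothetical and only serve as illustration.
The sketch also assumes that a `Scope` prefix is written with a trailing ``::``; check the checker's documentation for the exact spelling it expects.

.. code-block:: yaml

  Propagations:
    # Hypothetical source function
    #   int my_scanf(const char *fmt, ...)
    #
    # There is no SrcArgs key, so taint is introduced unconditionally.
    # Every variadic argument, starting at index 1, is a destination.
    - Name: my_scanf
      VariadicType: Dst
      VariadicIndex: 1

    # Hypothetical propagation function
    #   int my_snprintf(char *buf, size_t size, const char *fmt, ...)
    #
    # If the format string (index 2) or any variadic argument (index 3 and
    # later) is tainted, buf (index 0) is considered tainted after the call.
    - Name: my_snprintf
      SrcArgs: [2]
      DstArgs: [0]
      VariadicType: Src
      VariadicIndex: 3

  Filters:
    # Hypothetical filter function
    #   void myNamespace::escape(char *str)
    #
    # Scope restricts the match to the function inside myNamespace;
    # an unrelated ::escape is not treated as a filter.
    - Name: escape
      Scope: "myNamespace::"
      Args: [0]

  Sinks:
    # Hypothetical sink function
    #   int myNamespace::run_command(const char *command)
    - Name: run_command
      Scope: "myNamespace::"
      Args: [0]

With such a configuration, a value read through ``my_scanf`` and formatted into a buffer by ``my_snprintf`` would be expected to trigger a diagnostic when the buffer is passed to ``myNamespace::run_command``, unless it was passed through ``myNamespace::escape`` first.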