==========================
Clang Transformer Tutorial
==========================

A tutorial on how to write a source-to-source translation tool using Clang Transformer.

.. contents::
   :local:

What is Clang Transformer?
--------------------------

Clang Transformer is a framework for writing C++ diagnostics and program
transformations. It is built on the clang toolchain and the LibTooling library,
but aims to hide much of the complexity of clang's native, low-level libraries.

The core abstraction of Transformer is the *rewrite rule*, which specifies how
to change a given program pattern into a new form. Here are some examples of
tasks you can achieve with Transformer:

*   warn against using the name ``MkX`` for a declared function,
*   change ``MkX`` to ``MakeX``, where ``MkX`` is the name of a declared function,
*   change ``s.size()`` to ``Size(s)``, where ``s`` is a ``string``,
*   collapse ``e.child().m()`` to ``e.m()``, for any expression ``e`` and method named
    ``m``.

All of the examples have a common form: they identify a pattern that is the
target of the transformation, they specify an *edit* to the code identified by
the pattern, and their pattern and edit refer to common variables, like ``s``,
``e``, and ``m``, that range over code fragments. Some examples also specify
constraints on the pattern that aren't apparent from the syntax alone, like
"``s`` is a ``string``" in the third example. Even the first example
("warn ...") shares this form, even though it doesn't change any of the code --
its "edit" is simply a no-op.

Transformer helps users succinctly specify rules of this sort and easily execute
them locally over a collection of files, apply them to selected portions of
a codebase, or even bundle them as a clang-tidy check for ongoing application.

Who is Clang Transformer for?
-----------------------------

Clang Transformer is for developers who want to write clang-tidy checks or write
tools to modify a large number of C++ files in (roughly) the same way. What
qualifies as "large" really depends on the nature of the change and your
patience for repetitive editing. In our experience, automated solutions become
worthwhile somewhere between 100 and 500 files.

Getting Started
---------------

Patterns in Transformer are expressed with :doc:`clang's AST matchers <LibASTMatchers>`.
Matchers are a language of combinators for describing portions of a clang
Abstract Syntax Tree (AST). Since clang's AST includes complete type information
(within the limits of a single `Translation Unit (TU)`_),
these patterns can even encode rich constraints on the type properties of AST
nodes.

.. _`Translation Unit (TU)`: https://en.wikipedia.org/wiki/Translation_unit_\(programming\)

We assume a familiarity with the clang AST and the corresponding AST matchers
for the purpose of this tutorial. Users who are unfamiliar with either are
encouraged to start with the recommended references in `Related Reading`_.

Example: style-checking names
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Assume you have a style-guide rule which forbids functions from being named
"MkX" and you want to write a check that catches any violations of this rule. We
can express this as a Transformer rewrite rule:

.. code-block:: c++

   makeRule(functionDecl(hasName("MkX")).bind("fun"),
            noopEdit(node("fun")),
            cat("The name ``MkX`` is not allowed for functions; please rename"));

``makeRule`` is our go-to function for generating rewrite rules. It takes three
arguments: the pattern, the edit, and (optionally) an explanatory note. In our
example, the pattern (``functionDecl(...)``) identifies the declaration of the
function ``MkX``. Since we're just diagnosing the problem, but not suggesting a
fix, our edit is a no-op. However, it contains an *anchor* for the diagnostic
message: ``node("fun")`` says to associate the message with the source range of
the AST node bound to "fun"; in this case, the ill-named function declaration.
Finally, we use ``cat`` to build a message that describes the problem. Regarding
the name ``cat`` -- we'll discuss it in more detail below, but suffice it to say
that it can also take multiple arguments and concatenate their results.

Note that the result of ``makeRule`` is a value of type
``clang::transformer::RewriteRule``, but most users don't need to care about the
details of this type.

Example: renaming a function
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Now, let's extend this example to a *transformation*; specifically, the second
example above:

.. code-block:: c++

   makeRule(declRefExpr(to(functionDecl(hasName("MkX")))),
            changeTo(cat("MakeX")),
            cat("MkX has been renamed MakeX"));

In this example, the pattern (``declRefExpr(...)``) identifies any *reference* to
the function ``MkX``, rather than the declaration itself, as in our previous
example. Our edit (``changeTo(...)``) says to *change* the code matched by the
pattern *to* the text "MakeX". Finally, we use ``cat`` again to build a message
that explains the change.

Here are some example changes that this rule would make:

+--------------------------+----------------------------+
| Original                 | Result                     |
+==========================+============================+
| ``X x = MkX(3);``        | ``X x = MakeX(3);``        |
+--------------------------+----------------------------+
| ``CallFactory(MkX, 3);`` | ``CallFactory(MakeX, 3);`` |
+--------------------------+----------------------------+
| ``auto f = MkX;``        | ``auto f = MakeX;``        |
+--------------------------+----------------------------+

Example: method to function
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Next, let's write a rule to replace a method call with a (free) function call,
applied to the original method call's target object. Specifically, "change
``s.size()`` to ``Size(s)``, where ``s`` is a ``string``." We start with a simpler
change that ignores the type of ``s``. That is, it will modify *any* method call
where the method is named "size":

.. code-block:: c++

   llvm::StringRef s = "str";
   makeRule(
     cxxMemberCallExpr(
       on(expr().bind(s)),
       callee(cxxMethodDecl(hasName("size")))),
     changeTo(cat("Size(", node(s), ")")),
     cat("Method ``size`` is deprecated in favor of free function ``Size``"));

We express the pattern with the given AST matcher, which binds the method call's
target to ``s`` [#f1]_. For the edit, we again use ``changeTo``, but this
time we construct the term from multiple parts, which we compose with ``cat``. The
second part of our term is ``node(s)``, which selects the source code
corresponding to the AST node ``s`` that was bound when a match was found in the
AST for our rule's pattern. ``node(s)`` constructs a ``RangeSelector``, which, when
used in ``cat``, indicates that the selected source should be inserted in the
output at that point.

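The way ``cat`` assembles its output can be illustrated with a toy model in
plain C++. This is only a sketch and does not use the Transformer API: it
assumes that a match is simply a map from bound ids to the source text they
cover, and the names ``Part``, ``rawText``, ``boundNode`` and ``evalCat`` are
hypothetical.

.. code-block:: c++

   #include <cassert>
   #include <map>
   #include <string>
   #include <utility>
   #include <vector>

   // Toy model only: a "match" maps bound ids to the source text they cover.
   using Bindings = std::map<std::string, std::string>;

   // One piece of a stencil: either raw text or a reference to a bound id.
   struct Part {
     bool IsNode;
     std::string Value;
   };

   Part rawText(std::string Text) { return {false, std::move(Text)}; }
   Part boundNode(std::string Id) { return {true, std::move(Id)}; }

   // Concatenates the parts, replacing each node reference with its source text.
   std::string evalCat(const std::vector<Part> &Parts, const Bindings &Match) {
     std::string Out;
     for (const Part &P : Parts) {
       if (!P.IsNode) {
         Out += P.Value;
         continue;
       }
       auto It = Match.find(P.Value);
       assert(It != Match.end() && "stencil refers to an unbound id");
       Out += It->second;
     }
     return Out;
   }

In this model, with the id ``str`` bound to the source text ``name``, evaluating
the parts ``"Size("``, the node ``str`` and ``")"`` produces ``Size(name)``,
which mirrors what the rule above substitutes for ``name.size()``.
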
Now, we probably don't want to rewrite *all* invocations of "size" methods, just
those on ``std::string``\ s. We can achieve this change simply by refining our
matcher. The rest of the rule remains unchanged:

.. code-block:: c++

   llvm::StringRef s = "str";
   makeRule(
     cxxMemberCallExpr(
       on(expr(hasType(namedDecl(hasName("std::string"))))
         .bind(s)),
       callee(cxxMethodDecl(hasName("size")))),
     changeTo(cat("Size(", node(s), ")")),
     cat("Method ``size`` is deprecated in favor of free function ``Size``"));

Example: rewriting method calls
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In this example, we delete an "intermediary" method call in a string of
invocations. This scenario can arise, for example, if you want to collapse a
substructure into its parent.

.. code-block:: c++

   llvm::StringRef e = "expr", m = "member";
   auto child_call = cxxMemberCallExpr(on(expr().bind(e)),
                                       callee(cxxMethodDecl(hasName("child"))));
   makeRule(cxxMemberCallExpr(on(child_call), callee(memberExpr().bind(m))),
            changeTo(cat(node(e), ".", member(m), "()")),
            cat("``child`` accessor is being removed; call ",
                member(m), " directly on parent"));

This rule isn't quite what we want: it will rewrite ``my_object.child().foo()`` to
``my_object.foo()``, but it will also rewrite ``my_ptr->child().foo()`` to
``my_ptr.foo()``, which is not what we intend. We could fix this by restricting
the pattern with ``unless(isArrow())`` in the definition of ``child_call``. Yet, we
*want* to rewrite calls through pointers.

To capture this idiom, we provide the ``access`` combinator to intelligently
construct a field/method access. In our example, the member access is expressed
as:

.. code-block:: c++

   access(e, cat(member(m)))

The first argument specifies the object being accessed and the second, a
description of the field/method name. In this case, we specify that the method
name should be copied from the source -- specifically, the source range of ``m``'s
member. To construct the method call, we would use this expression in ``cat``:

.. code-block:: c++

   cat(access(e, cat(member(m))), "()")

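The choice that ``access`` makes between ``.`` and ``->`` can be pictured with
a small stand-alone sketch. It is a toy model, not the Transformer
implementation: the function ``buildAccess`` is hypothetical, and its
``BaseIsPointer`` flag stands in for the type information that the real
combinator reads from the AST.

.. code-block:: c++

   #include <cassert>
   #include <string>

   // Toy model only: builds a member access from already-extracted source text.
   // BaseIsPointer stands in for the type information available in the AST.
   std::string buildAccess(const std::string &Base, bool BaseIsPointer,
                           const std::string &Member) {
     assert(!Base.empty() && !Member.empty() && "both parts are required");
     return Base + (BaseIsPointer ? "->" : ".") + Member;
   }

Under these assumptions, ``buildAccess("my_object", false, "foo") + "()"`` gives
``my_object.foo()``, while ``buildAccess("my_ptr", true, "foo") + "()"`` gives
``my_ptr->foo()``, which is the behavior the hand-written ``"."`` could not
provide.
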
Reference: ranges, stencils, edits, rules
-----------------------------------------

The above examples demonstrate just the basics of rewrite rules. Every element
we touched on has more available constructors: range selectors, stencils, edits
and rules. In this section, we'll briefly review each in turn, with references
to the source headers for up-to-date information. First, though, we clarify what
rewrite rules are actually rewriting.

Rewriting ASTs to... Text?
^^^^^^^^^^^^^^^^^^^^^^^^^^

The astute reader may have noticed that we've been somewhat vague in our
explanation of what the rewrite rules are actually rewriting. We've referred to
"code", but code can be represented both as raw source text and as an abstract
syntax tree. So, which one is it?

Ideally, we'd be rewriting the input AST to a new AST, but clang's AST is not
terribly amenable to this kind of transformation. So, we compromise: we express
our patterns and the names that they bind in terms of the AST, but our changes
in terms of source code text. We've designed Transformer's language to bridge
the gap between the two representations, in an attempt to minimize the user's
need to reason about source code locations and other, low-level syntactic
details.

Range Selectors
^^^^^^^^^^^^^^^

Transformer provides a small API for describing source ranges: the
``RangeSelector`` combinators. These ranges are most commonly used to specify the
source code affected by an edit and to extract source code in constructing new
text.

Roughly, there are two kinds of range combinators: ones that select a source
range based on the AST, and others that combine existing ranges into new ranges.
For example, ``node`` selects the range of source spanned by a particular AST
node, as we've seen, while ``after`` selects the (empty) range located immediately
after its argument range. So, ``after(node("id"))`` is the empty range immediately
following the AST node bound to ``id``.

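To illustrate what an empty range such as ``after(...)`` is good for, here is a
toy model that works on character offsets in a plain string. It is a sketch
under stated assumptions rather than the Transformer API: the real selectors
operate on clang source locations, and ``Range``, ``afterRange`` and
``replaceRange`` are hypothetical names.

.. code-block:: c++

   #include <cassert>
   #include <cstddef>
   #include <string>

   // Toy model only: a half-open range [Begin, End) of character offsets.
   struct Range {
     std::size_t Begin;
     std::size_t End;
   };

   // Models ``after``: the empty range located right after R.
   Range afterRange(Range R) { return {R.End, R.End}; }

   // Replaces the text covered by R; an empty range makes this an insertion.
   std::string replaceRange(const std::string &Source, Range R,
                            const std::string &Text) {
     assert(R.Begin <= R.End && R.End <= Source.size() && "range out of bounds");
     return Source.substr(0, R.Begin) + Text + Source.substr(R.End);
   }

For the source ``f(x)``, the argument ``x`` covers the range ``{2, 3}``.
Replacing ``afterRange`` of that range with ``", y"`` yields ``f(x, y)``:
because the range is empty, nothing is removed and the edit acts as an
insertion.
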
For the full collection of ``RangeSelector``\ s, see the header,
`clang/Tooling/Transformer/RangeSelector.h <https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Tooling/Transformer/RangeSelector.h>`_.

Stencils
^^^^^^^^

Transformer offers a large and growing collection of combinators for
constructing output. Above, we demonstrated ``cat``, the core function for
constructing stencils. It takes a series of arguments, of three possible kinds:

#.  Raw text, to be copied directly to the output.
#.  Selector: specified with a ``RangeSelector``, indicates a range of source text
    to copy to the output.
#.  Builder: an operation that constructs a code snippet from its arguments. For
    example, the ``access`` function we saw above.

Data of these different types are all represented (generically) by a ``Stencil``.
``cat`` takes text and ``RangeSelector``\ s directly as arguments, rather than
requiring that they be constructed with a builder; other builders are
constructed explicitly.

In general, ``Stencil``\ s produce text from a match result. So, they are not
limited to generating source code, but can also be used to generate diagnostic
messages that reference (named) elements of the matched code, like we saw in the
example of rewriting method calls.

Further details of the ``Stencil`` type are documented in the header file
`clang/Tooling/Transformer/Stencil.h <https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Tooling/Transformer/Stencil.h>`_.

Edits
^^^^^

Transformer supports additional forms of edits. First, in a ``changeTo``, we can
specify the particular portion of code to be replaced, using the same
``RangeSelector`` we saw earlier. For example, we could change the function name
in a function declaration with:

.. code-block:: c++

   makeRule(functionDecl(hasName("bad")).bind(f),
            changeTo(name(f), cat("good")),
            cat("bad is now good"));

We also provide simpler editing primitives for insertion and deletion:
``insertBefore``, ``insertAfter`` and ``remove``. These can all be found in the header
file
`clang/Tooling/Transformer/RewriteRule.h <https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Tooling/Transformer/RewriteRule.h>`_.

We are not limited to one edit per match found. Some situations require making
multiple edits for each match. For example, suppose we wanted to swap two
arguments of a function call.

For this, we provide an overload of ``makeRule`` that takes a list of edits,
rather than just a single one. Our example might look like:

.. code-block:: c++

   makeRule(callExpr(...),
            {changeTo(node(arg0), cat(node(arg2))),
             changeTo(node(arg2), cat(node(arg0)))},
            cat("swap the first and third arguments of the call"));

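Why several edits from one match can coexist is easiest to see on plain text.
The following toy model is a sketch, not Transformer code: it assumes each edit
is an offset, a length and a replacement string, all measured against the
original source, and the names ``TextEdit`` and ``applyEdits`` are
hypothetical.

.. code-block:: c++

   #include <algorithm>
   #include <cassert>
   #include <cstddef>
   #include <string>
   #include <vector>

   // Toy model only: replace Length characters starting at Offset with Text.
   struct TextEdit {
     std::size_t Offset;
     std::size_t Length;
     std::string Text;
   };

   // Applies non-overlapping edits. All offsets refer to the original source, so
   // the edits are applied from the last one to the first to keep them valid.
   std::string applyEdits(std::string Source, std::vector<TextEdit> Edits) {
     std::sort(Edits.begin(), Edits.end(),
               [](const TextEdit &A, const TextEdit &B) {
                 return A.Offset > B.Offset;
               });
     for (const TextEdit &E : Edits) {
       assert(E.Offset + E.Length <= Source.size() && "edit out of bounds");
       Source.replace(E.Offset, E.Length, E.Text);
     }
     return Source;
   }

In ``f(a, b, c)``, the first argument sits at offset 2 and the third at offset
8. Applying the two edits "replace offset 2 with ``c``" and "replace offset 8
with ``a``" gives ``f(c, b, a)``. Both edits are computed against the original
text, which is why the swap does not need a temporary.
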
``EditGenerator``\ s (Advanced)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The particular edits we've seen so far are all instances of the ``ASTEdit`` class,
or a list of such. But, not all edits can be expressed as ``ASTEdit``\ s. So, we
also support a very general signature for edit generators:

.. code-block:: c++

   using EditGenerator = MatchConsumer<llvm::SmallVector<Edit, 1>>;

That is, an ``EditGenerator`` is a function that maps a ``MatchResult`` to a set
of edits, or fails. This signature supports a very general form of computation
over match results. Transformer provides a number of functions for working with
``EditGenerator``\ s, most notably
`flatten <https://github.com/llvm/llvm-project/blob/1fabe6e51917bcd7a1242294069c682fe6dffa45/clang/include/clang/Tooling/Transformer/RewriteRule.h#L165-L167>`_,
which combines multiple ``EditGenerator``\ s into one, much like list flattening.
For the full list, see the header file
`clang/Tooling/Transformer/RewriteRule.h <https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Tooling/Transformer/RewriteRule.h>`_.

Rules
^^^^^

We can also compose multiple *rules*, rather than just edits within a rule,
using ``applyFirst``: it composes a list of rules as an ordered choice, where
Transformer applies the first rule whose pattern matches, ignoring others in the
list that follow. If the matchers are independent, then order doesn't matter. In
that case, ``applyFirst`` is simply joining the set of rules into one.

The benefit of ``applyFirst`` is that, for some problems, it allows the user to
more concisely formulate later rules in the list, since their patterns need not
explicitly exclude the earlier patterns of the list. For example, consider a set
of rules that rewrite compound statements, where one rule handles the case of an
empty compound statement and the other handles non-empty compound statements.
With ``applyFirst``, these rules can be expressed compactly as:

.. code-block:: c++

   applyFirst({
     makeRule(compoundStmt(statementCountIs(0)).bind("empty"), ...),
     makeRule(compoundStmt().bind("non-empty"),...)
   })

The second rule does not need to explicitly specify that the compound statement
is non-empty -- it follows from the rule's position in ``applyFirst``. For more
complicated examples, this can lead to substantially more readable code.

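The ordered-choice semantics can be sketched in a few lines of plain C++. This
is a toy model rather than the Transformer API: a rule is reduced to a
predicate on a statement count plus a label, and the names ``ToyRule``,
``applyFirstModel`` and ``compoundRules`` are hypothetical.

.. code-block:: c++

   #include <cassert>
   #include <functional>
   #include <string>
   #include <vector>

   // Toy model only: a rule pairs a predicate on the input with a label.
   struct ToyRule {
     std::function<bool(int)> Matches;
     std::string Name;
   };

   // Ordered choice: returns the name of the first rule that matches, or an
   // empty string when none does. Later rules are not consulted after a hit.
   std::string applyFirstModel(const std::vector<ToyRule> &Rules,
                               int StatementCount) {
     assert(StatementCount >= 0 && "a statement count cannot be negative");
     for (const ToyRule &R : Rules)
       if (R.Matches(StatementCount))
         return R.Name;
     return "";
   }

   // Mirrors the two compound-statement rules: the second pattern is broad, but
   // it only fires when the first one did not.
   std::vector<ToyRule> compoundRules() {
     std::vector<ToyRule> Rules;
     Rules.push_back({[](int N) { return N == 0; }, "empty"});
     Rules.push_back({[](int) { return true; }, "non-empty"});
     return Rules;
   }

In this model, a statement count of 0 selects ``empty`` even though the second
predicate would also accept it, and any positive count selects ``non-empty``.
Reversing the order of the two rules would make the first one unreachable,
which is why order matters whenever the patterns overlap.
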
Sometimes, a modification to the code might require the inclusion of a
particular header file. To this end, users can modify rules to specify include
directives with ``addInclude``.

For additional documentation on these functions, see the header file
`clang/Tooling/Transformer/RewriteRule.h <https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Tooling/Transformer/RewriteRule.h>`_.

Using a RewriteRule as a clang-tidy check
-----------------------------------------

Transformer supports executing a rewrite rule as a
`clang-tidy <https://clang.llvm.org/extra/clang-tidy/>`_ check, with the class
``clang::tidy::utils::TransformerClangTidyCheck``. It is designed to require
minimal code in the definition. For example, given a rule
``MyCheckAsRewriteRule``, one can define a tidy check as follows:

.. code-block:: c++

   class MyCheck : public TransformerClangTidyCheck {
    public:
     MyCheck(StringRef Name, ClangTidyContext *Context)
         : TransformerClangTidyCheck(MyCheckAsRewriteRule, Name, Context) {}
   };

``TransformerClangTidyCheck`` implements the virtual ``registerMatchers`` and
``check`` methods based on your rule specification, so you don't need to implement
them yourself. If the rule needs to be configured based on the language options
and/or the clang-tidy configuration, it can be expressed as a function taking
these as parameters and (optionally) returning a ``RewriteRule``. This would be
useful, for example, for our method-renaming rule, which is parameterized by the
original name and the target. For details, see
`clang-tools-extra/clang-tidy/utils/TransformerClangTidyCheck.h <https://github.com/llvm/llvm-project/blob/main/clang-tools-extra/clang-tidy/utils/TransformerClangTidyCheck.h>`_.

Related Reading
---------------

A good place to start understanding the clang AST and its matchers is with the
introductions on clang's site:

*   :doc:`Introduction to the Clang AST <IntroductionToTheClangAST>`
*   :doc:`Matching the Clang AST <LibASTMatchers>`
*   `AST Matcher Reference <https://clang.llvm.org/docs/LibASTMatchersReference.html>`_

.. rubric:: Footnotes

.. [#f1] Technically, it binds it to the string "str", to which our
    variable ``s`` is bound. But, the choice of that id string is
    irrelevant, so we elide the difference.