==========================
Clang Transformer Tutorial
==========================

A tutorial on how to write a source-to-source translation tool using Clang Transformer.

.. contents::
   :local:

What is Clang Transformer?
--------------------------

Clang Transformer is a framework for writing C++ diagnostics and program
transformations. It is built on the clang toolchain and the LibTooling library,
but aims to hide much of the complexity of clang's native, low-level libraries.

The core abstraction of Transformer is the *rewrite rule*, which specifies how
to change a given program pattern into a new form. Here are some examples of
tasks you can achieve with Transformer:

* warn against using the name ``MkX`` for a declared function,
* change ``MkX`` to ``MakeX``, where ``MkX`` is the name of a declared function,
* change ``s.size()`` to ``Size(s)``, where ``s`` is a ``string``,
* collapse ``e.child().m()`` to ``e.m()``, for any expression ``e`` and method named
  ``m``.

All of the examples have a common form: they identify a pattern that is the
target of the transformation, they specify an *edit* to the code identified by
the pattern, and their pattern and edit refer to common variables, like ``s``,
``e``, and ``m``, that range over code fragments. Our second and third examples also
specify constraints on the pattern that aren't apparent from the syntax alone,
like "``s`` is a ``string``." Even the first example ("warn ...") shares this form,
even though it doesn't change any of the code -- its "edit" is simply a no-op.

Transformer helps users succinctly specify rules of this sort and easily execute
them locally over a collection of files, apply them to selected portions of
a codebase, or even bundle them as a clang-tidy check for ongoing application.

Who is Clang Transformer for?
-----------------------------

Clang Transformer is for developers who want to write clang-tidy checks or write
tools to modify a large number of C++ files in (roughly) the same way. What
qualifies as "large" really depends on the nature of the change and your
patience for repetitive editing. In our experience, automated solutions become
worthwhile somewhere between 100 and 500 files.

Getting Started
---------------

Patterns in Transformer are expressed with :doc:`clang's AST matchers <LibASTMatchers>`.
Matchers are a language of combinators for describing portions of a clang
Abstract Syntax Tree (AST). Since clang's AST includes complete type information
(within the limits of a single `Translation Unit (TU)`_),
these patterns can even encode rich constraints on the type properties of AST
nodes.

.. _`Translation Unit (TU)`: https://en.wikipedia.org/wiki/Translation_unit_\(programming\)

We assume a familiarity with the clang AST and the corresponding AST matchers
for the purpose of this tutorial. Users who are unfamiliar with either are
encouraged to start with the recommended references in `Related Reading`_.

Example: style-checking names
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Assume you have a style-guide rule which forbids functions from being named
"MkX" and you want to write a check that catches any violations of this rule. We
can express this as a Transformer rewrite rule:

.. code-block:: c++

    makeRule(functionDecl(hasName("MkX")).bind("fun"),
             noopEdit(node("fun")),
             cat("The name ``MkX`` is not allowed for functions; please rename"));

``makeRule`` is our go-to function for generating rewrite rules. It takes three
arguments: the pattern, the edit, and (optionally) an explanatory note. In our
example, the pattern (``functionDecl(...)``) identifies the declaration of the
function ``MkX``. Since we're just diagnosing the problem, but not suggesting a
fix, our edit is a no-op. But, it contains an *anchor* for the diagnostic
message: ``node("fun")`` says to associate the message with the source range of
the AST node bound to "fun"; in this case, the ill-named function declaration.
Finally, we use ``cat`` to build a message that explains the problem. Regarding the
name ``cat`` -- we'll discuss it in more detail below, but suffice it to say that
it can also take multiple arguments and concatenate their results.

Note that the result of ``makeRule`` is a value of type
``clang::transformer::RewriteRule``, but most users don't need to care about the
details of this type.

Example: renaming a function
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Now, let's extend this example to a *transformation*; specifically, the second
example above:

.. code-block:: c++

    makeRule(declRefExpr(to(functionDecl(hasName("MkX")))),
             changeTo(cat("MakeX")),
             cat("MkX has been renamed MakeX"));

In this example, the pattern (``declRefExpr(...)``) identifies any *reference* to
the function ``MkX``, rather than the declaration itself, as in our previous
example. Our edit (``changeTo(...)``) says to *change* the code matched by the
pattern *to* the text "MakeX". Finally, we use ``cat`` again to build a message
that explains the change.

Here are some example changes that this rule would make:

+--------------------------+----------------------------+
| Original                 | Result                     |
+==========================+============================+
| ``X x = MkX(3);``        | ``X x = MakeX(3);``        |
+--------------------------+----------------------------+
| ``CallFactory(MkX, 3);`` | ``CallFactory(MakeX, 3);`` |
+--------------------------+----------------------------+
| ``auto f = MkX;``        | ``auto f = MakeX;``        |
+--------------------------+----------------------------+

Example: method to function
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Next, let's write a rule to replace a method call with a (free) function call,
applied to the original method call's target object. Specifically, "change
``s.size()`` to ``Size(s)``, where ``s`` is a ``string``." We start with a simpler
change that ignores the type of ``s``. That is, it will modify *any* method call
where the method is named "size":

.. code-block:: c++

    llvm::StringRef s = "str";
    makeRule(
        cxxMemberCallExpr(
            on(expr().bind(s)),
            callee(cxxMethodDecl(hasName("size")))),
        changeTo(cat("Size(", node(s), ")")),
        cat("Method ``size`` is deprecated in favor of free function ``Size``"));

We express the pattern with the given AST matcher, which binds the method call's
target to ``s`` [#f1]_. For the edit, we again use ``changeTo``, but this
time we construct the term from multiple parts, which we compose with ``cat``. The
second part of our term is ``node(s)``, which selects the source code
corresponding to the AST node ``s`` that was bound when a match was found in the
AST for our rule's pattern. ``node(s)`` constructs a ``RangeSelector``, which, when
used in ``cat``, indicates that the selected source should be inserted in the
output at that point.

Now, we probably don't want to rewrite *all* invocations of "size" methods, just
those on ``std::string``\ s. We can achieve this change simply by refining our
matcher. The rest of the rule remains unchanged:

.. code-block:: c++

    llvm::StringRef s = "str";
    makeRule(
        cxxMemberCallExpr(
            on(expr(hasType(namedDecl(hasName("std::string"))))
                   .bind(s)),
            callee(cxxMethodDecl(hasName("size")))),
        changeTo(cat("Size(", node(s), ")")),
        cat("Method ``size`` is deprecated in favor of free function ``Size``"));

Example: rewriting method calls
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In this example, we delete an "intermediary" method call in a chain of
invocations. This scenario can arise, for example, if you want to collapse a
substructure into its parent.

.. code-block:: c++

    llvm::StringRef e = "expr", m = "member";
    auto child_call = cxxMemberCallExpr(on(expr().bind(e)),
                                        callee(cxxMethodDecl(hasName("child"))));
    // node(e) copies the receiver's source text; a bare string passed to cat
    // is emitted as literal text.
    makeRule(cxxMemberCallExpr(on(child_call), callee(memberExpr().bind(m))),
             changeTo(cat(node(e), ".", member(m), "()")),
             cat("``child`` accessor is being removed; call ",
                 member(m), " directly on parent"));

This rule isn't quite what we want: it will rewrite ``my_object.child().foo()`` to
``my_object.foo()``, but it will also rewrite ``my_ptr->child().foo()`` to
``my_ptr.foo()``, which is not what we intend. We could fix this by restricting
the pattern with ``not(isArrow())`` in the definition of ``child_call``. Yet, we
*want* to rewrite calls through pointers.

To capture this idiom, we provide the ``access`` combinator to intelligently
construct a field/method access. In our example, the member access is expressed
as:

.. code-block:: c++

    access(e, cat(member(m)))

The first argument specifies the object being accessed and the second, a
description of the field/method name. In this case, we specify that the method
name should be copied from the source -- specifically, the source range of ``m``'s
member.
To construct the method call, we would use this expression in ``cat``:

.. code-block:: c++

    cat(access(e, cat(member(m))), "()")

Reference: ranges, stencils, edits, rules
-----------------------------------------

The above examples demonstrate just the basics of rewrite rules. Every element
we touched on has more available constructors: range selectors, stencils, edits
and rules. In this section, we'll briefly review each in turn, with references
to the source headers for up-to-date information. First, though, we clarify what
rewrite rules are actually rewriting.

Rewriting ASTs to... Text?
^^^^^^^^^^^^^^^^^^^^^^^^^^

The astute reader may have noticed that we've been somewhat vague in our
explanation of what the rewrite rules are actually rewriting. We've referred to
"code", but code can be represented both as raw source text and as an abstract
syntax tree. So, which one is it?

Ideally, we'd be rewriting the input AST to a new AST, but clang's AST is not
terribly amenable to this kind of transformation. So, we compromise: we express
our patterns and the names that they bind in terms of the AST, but our changes
in terms of source code text.
We've designed Transformer's language to bridge
the gap between the two representations, in an attempt to minimize the user's
need to reason about source code locations and other, low-level syntactic
details.

Range Selectors
^^^^^^^^^^^^^^^

Transformer provides a small API for describing source ranges: the
``RangeSelector`` combinators. These ranges are most commonly used to specify the
source code affected by an edit and to extract source code in constructing new
text.

Roughly, there are two kinds of range combinators: ones that select a source
range based on the AST, and others that combine existing ranges into new ranges.
For example, ``node`` selects the range of source spanned by a particular AST
node, as we've seen, while ``after`` selects the (empty) range located immediately
after its argument range. So, ``after(node("id"))`` is the empty range immediately
following the AST node bound to ``id``.
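
To make the difference between a node's range and the empty range after it
concrete, here is a minimal, standalone sketch that models a resolved range as a
pair of character offsets. It deliberately does not use the Transformer API:
``Range``, ``afterRange`` and ``replaceRange`` are hypothetical names introduced
only for illustration, and real ``RangeSelector``\ s are resolved against a match
result rather than built from raw offsets.

.. code-block:: c++

    #include <cstddef>
    #include <string>

    // Hypothetical stand-in for a resolved source range: half-open character
    // offsets [begin, end) into a source buffer. Not a Transformer type.
    struct Range {
      std::size_t begin;
      std::size_t end;
    };

    // Models the behavior described for ``after``: the empty range located
    // immediately after the argument range.
    Range afterRange(Range r) { return {r.end, r.end}; }

    // Replaces the text covered by ``r`` with ``text``. When ``r`` is empty,
    // this is a pure insertion at that position.
    std::string replaceRange(const std::string &source, Range r,
                             const std::string &text) {
      return source.substr(0, r.begin) + text + source.substr(r.end);
    }

With the source ``int x = f(a);`` and the range ``{8, 12}`` covering ``f(a)``,
``afterRange`` yields the empty range ``{12, 12}``. Replacing that empty range
with ``.get()`` inserts text right after the call, while replacing the original
range overwrites the call itself.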

For the full collection of ``RangeSelector``\ s, see the header,
`clang/Tooling/Transformer/RangeSelector.h <https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Tooling/Transformer/RangeSelector.h>`_.

Stencils
^^^^^^^^

Transformer offers a large and growing collection of combinators for
constructing output. Above, we demonstrated ``cat``, the core function for
constructing stencils. It takes a series of arguments, of three possible kinds:

#. Raw text, to be copied directly to the output.
#. Selector: specified with a ``RangeSelector``, indicates a range of source text
   to copy to the output.
#. Builder: an operation that constructs a code snippet from its arguments. For
   example, the ``access`` function we saw above.

Data of these different types are all represented (generically) by a ``Stencil``.
``cat`` takes text and ``RangeSelector``\ s directly as arguments, rather than
requiring that they be constructed with a builder; other builders are
constructed explicitly.

In general, ``Stencil``\ s produce text from a match result.
So, they are not
limited to generating source code, but can also be used to generate diagnostic
messages that reference (named) elements of the matched code, like we saw in the
example of rewriting method calls.

Further details of the ``Stencil`` type are documented in the header file
`clang/Tooling/Transformer/Stencil.h <https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Tooling/Transformer/Stencil.h>`_.

Edits
^^^^^

Transformer supports additional forms of edits. First, in a ``changeTo``, we can
specify the particular portion of code to be replaced, using the same
``RangeSelector`` we saw earlier. For example, we could change the function name
in a function declaration with:

.. code-block:: c++

    llvm::StringRef f = "fun";
    makeRule(functionDecl(hasName("bad")).bind(f),
             changeTo(name(f), cat("good")),
             cat("bad is now good"));

We also provide simpler editing primitives for insertion and deletion:
``insertBefore``, ``insertAfter`` and ``remove``. These can all be found in the header
file
`clang/Tooling/Transformer/RewriteRule.h <https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Tooling/Transformer/RewriteRule.h>`_.

We are not limited to one edit per match found.
Some situations require making
multiple edits for each match. For example, suppose we wanted to swap two
arguments of a function call.

For this, we provide an overload of ``makeRule`` that takes a list of edits,
rather than just a single one. Our example might look like:

.. code-block:: c++

    makeRule(callExpr(...),
            {changeTo(node(arg0), cat(node(arg2))),
             changeTo(node(arg2), cat(node(arg0)))},
            cat("swap the first and third arguments of the call"));

``EditGenerator``\ s (Advanced)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The particular edits we've seen so far are all instances of the ``ASTEdit`` class,
or a list of such. But, not all edits can be expressed as ``ASTEdit``\ s. So, we
also support a very general signature for edit generators:

.. code-block:: c++

    using EditGenerator = MatchConsumer<llvm::SmallVector<Edit, 1>>;

That is, an ``EditGenerator`` is a function that maps a ``MatchResult`` to a set
of edits, or fails. This signature supports a very general form of computation
over match results.
Transformer provides a number of functions for working with
``EditGenerator``\ s, most notably
`flatten <https://github.com/llvm/llvm-project/blob/1fabe6e51917bcd7a1242294069c682fe6dffa45/clang/include/clang/Tooling/Transformer/RewriteRule.h#L165-L167>`_,
which combines several ``EditGenerator``\ s into one, much like list flattening.
For the full list, see the header file
`clang/Tooling/Transformer/RewriteRule.h <https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Tooling/Transformer/RewriteRule.h>`_.

Rules
^^^^^

We can also compose multiple *rules*, rather than just edits within a rule,
using ``applyFirst``: it composes a list of rules as an ordered choice, where
Transformer applies the first rule whose pattern matches, ignoring others in the
list that follow. If the matchers are independent then order doesn't matter. In
that case, ``applyFirst`` is simply joining the set of rules into one.

The benefit of ``applyFirst`` is that, for some problems, it allows the user to
more concisely formulate later rules in the list, since their patterns need not
explicitly exclude the earlier patterns of the list. For example, consider a set
of rules that rewrite compound statements, where one rule handles the case of an
empty compound statement and the other handles non-empty compound statements.
With ``applyFirst``, these rules can be expressed compactly as:

.. code-block:: c++

    applyFirst({
      makeRule(compoundStmt(statementCountIs(0)).bind("empty"), ...),
      makeRule(compoundStmt().bind("non-empty"),...)
    })

The second rule does not need to explicitly specify that the compound statement
is non-empty -- it follows from the rule's position in ``applyFirst``. For more
complicated examples, this can lead to substantially more readable code.

Sometimes, a modification to the code might require the inclusion of a
particular header file. To this end, users can modify rules to specify include
directives with ``addInclude``.

For additional documentation on these functions, see the header file
`clang/Tooling/Transformer/RewriteRule.h <https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Tooling/Transformer/RewriteRule.h>`_.

Using a RewriteRule as a clang-tidy check
-----------------------------------------

Transformer supports executing a rewrite rule as a
`clang-tidy <https://clang.llvm.org/extra/clang-tidy/>`_ check, with the class
``clang::tidy::utils::TransformerClangTidyCheck``. It is designed to require
minimal code in the definition.
For example, given a rule
``MyCheckAsRewriteRule``, one can define a tidy check as follows:

.. code-block:: c++

    class MyCheck : public TransformerClangTidyCheck {
     public:
      MyCheck(StringRef Name, ClangTidyContext *Context)
          : TransformerClangTidyCheck(MyCheckAsRewriteRule, Name, Context) {}
    };

``TransformerClangTidyCheck`` implements the virtual ``registerMatchers`` and
``check`` methods based on your rule specification, so you don't need to implement
them yourself. If the rule needs to be configured based on the language options
and/or the clang-tidy configuration, it can be expressed as a function taking
these as parameters and (optionally) returning a ``RewriteRule``. This would be
useful, for example, for our function-renaming rule, which is parameterized by the
original name and the target.
For details, see
`clang-tools-extra/clang-tidy/utils/TransformerClangTidyCheck.h <https://github.com/llvm/llvm-project/blob/main/clang-tools-extra/clang-tidy/utils/TransformerClangTidyCheck.h>`_.

Related Reading
---------------

A good place to start understanding the clang AST and its matchers is with the
introductions on clang's site:

* :doc:`Introduction to the Clang AST <IntroductionToTheClangAST>`
* :doc:`Matching the Clang AST <LibASTMatchers>`
* `AST Matcher Reference <https://clang.llvm.org/docs/LibASTMatchersReference.html>`_

.. rubric:: Footnotes

.. [#f1] Technically, it binds it to the string "str", to which our
    variable ``s`` is bound. But, the choice of that id string is
    irrelevant, so we elide the difference.