xref: /llvm-project/llvm/docs/tutorial/MyFirstLanguageFrontend/LangImpl07.rst (revision 6f8a363a483489687597e29b8bda0975e821f188)
1=======================================================
2Kaleidoscope: Extending the Language: Mutable Variables
3=======================================================
4
5.. contents::
6   :local:
7
8Chapter 7 Introduction
9======================
10
11Welcome to Chapter 7 of the "`Implementing a language with
12LLVM <index.html>`_" tutorial. In chapters 1 through 6, we've built a
13very respectable, albeit simple, `functional programming
14language <http://en.wikipedia.org/wiki/Functional_programming>`_. In our
15journey, we learned some parsing techniques, how to build and represent
16an AST, how to build LLVM IR, and how to optimize the resultant code as
17well as JIT compile it.
18
19While Kaleidoscope is interesting as a functional language, the fact
20that it is functional makes it "too easy" to generate LLVM IR for it. In
21particular, a functional language makes it very easy to build LLVM IR
22directly in `SSA
23form <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
24Since LLVM requires that the input code be in SSA form, this is a very
25nice property and it is often unclear to newcomers how to generate code
26for an imperative language with mutable variables.
27
28The short (and happy) summary of this chapter is that there is no need
29for your front-end to build SSA form: LLVM provides highly tuned and
30well tested support for this, though the way it works is a bit
31unexpected for some.
32
33Why is this a hard problem?
34===========================
35
36To understand why mutable variables cause complexities in SSA
37construction, consider this extremely simple C example:
38
39.. code-block:: c
40
41    int G, H;
42    int test(_Bool Condition) {
43      int X;
44      if (Condition)
45        X = G;
46      else
47        X = H;
48      return X;
49    }
50
51In this case, we have the variable "X", whose value depends on the path
52executed in the program. Because there are two different possible values
53for X before the return instruction, a PHI node is inserted to merge the
54two values. The LLVM IR that we want for this example looks like this:
55
56.. code-block:: llvm
57
58    @G = weak global i32 0   ; type of @G is i32*
59    @H = weak global i32 0   ; type of @H is i32*
60
61    define i32 @test(i1 %Condition) {
62    entry:
63      br i1 %Condition, label %cond_true, label %cond_false
64
65    cond_true:
66      %X.0 = load i32, i32* @G
67      br label %cond_next
68
69    cond_false:
70      %X.1 = load i32, i32* @H
71      br label %cond_next
72
73    cond_next:
74      %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
75      ret i32 %X.2
76    }
77
78In this example, the loads from the G and H global variables are
79explicit in the LLVM IR, and they live in the then/else branches of the
80if statement (cond\_true/cond\_false). In order to merge the incoming
81values, the X.2 phi node in the cond\_next block selects the right value
82to use based on where control flow is coming from: if control flow comes
83from the cond\_false block, X.2 gets the value of X.1. Alternatively, if
84control flow comes from cond\_true, it gets the value of X.0. The intent
85of this chapter is not to explain the details of SSA form. For more
86information, see one of the many `online
87references <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
88
89The question for this article is "who places the phi nodes when lowering
90assignments to mutable variables?". The issue here is that LLVM
91*requires* that its IR be in SSA form: there is no "non-ssa" mode for
92it. However, SSA construction requires non-trivial algorithms and data
93structures, so it is inconvenient and wasteful for every front-end to
94have to reproduce this logic.
95
96Memory in LLVM
97==============
98
99The 'trick' here is that while LLVM does require all register values to
100be in SSA form, it does not require (or permit) memory objects to be in
101SSA form. In the example above, note that the loads from G and H are
102direct accesses to G and H: they are not renamed or versioned. This
103differs from some other compiler systems, which do try to version memory
104objects. In LLVM, instead of encoding dataflow analysis of memory into
105the LLVM IR, it is handled with `Analysis
106Passes <../../WritingAnLLVMPass.html>`_ which are computed on demand.
107
108With this in mind, the high-level idea is that we want to make a stack
109variable (which lives in memory, because it is on the stack) for each
110mutable object in a function. To take advantage of this trick, we need
111to talk about how LLVM represents stack variables.
112
113In LLVM, all memory accesses are explicit with load/store instructions,
114and it is carefully designed not to have (or need) an "address-of"
115operator. Notice how the type of the @G/@H global variables is actually
116"i32\*" even though the variable is defined as "i32". What this means is
117that @G defines *space* for an i32 in the global data area, but its
118*name* actually refers to the address for that space. Stack variables
119work the same way, except that instead of being declared with global
120variable definitions, they are declared with the `LLVM alloca
121instruction <../../LangRef.html#alloca-instruction>`_:
122
123.. code-block:: llvm
124
125    define i32 @example() {
126    entry:
127      %X = alloca i32           ; type of %X is i32*.
128      ...
129      %tmp = load i32, i32* %X  ; load the stack value %X from the stack.
130      %tmp2 = add i32 %tmp, 1   ; increment it
131      store i32 %tmp2, i32* %X  ; store it back
132      ...
133
134This code shows an example of how you can declare and manipulate a stack
135variable in the LLVM IR. Stack memory allocated with the alloca
136instruction is fully general: you can pass the address of the stack slot
137to functions, you can store it in other variables, etc. In our example
138above, we could rewrite the example to use the alloca technique to avoid
139using a PHI node:
140
141.. code-block:: llvm
142
143    @G = weak global i32 0   ; type of @G is i32*
144    @H = weak global i32 0   ; type of @H is i32*
145
146    define i32 @test(i1 %Condition) {
147    entry:
148      %X = alloca i32           ; type of %X is i32*.
149      br i1 %Condition, label %cond_true, label %cond_false
150
151    cond_true:
152      %X.0 = load i32, i32* @G
153      store i32 %X.0, i32* %X   ; Update X
154      br label %cond_next
155
156    cond_false:
157      %X.1 = load i32, i32* @H
158      store i32 %X.1, i32* %X   ; Update X
159      br label %cond_next
160
161    cond_next:
162      %X.2 = load i32, i32* %X  ; Read X
163      ret i32 %X.2
164    }
165
166With this, we have discovered a way to handle arbitrary mutable
167variables without the need to create Phi nodes at all:
168
169#. Each mutable variable becomes a stack allocation.
170#. Each read of the variable becomes a load from the stack.
171#. Each update of the variable becomes a store to the stack.
172#. Taking the address of a variable just uses the stack address
173   directly.
174
175While this solution has solved our immediate problem, it introduced
176another one: we have now apparently introduced a lot of stack traffic
177for very simple and common operations, a major performance problem.
178Fortunately for us, the LLVM optimizer has a highly-tuned optimization
179pass named "mem2reg" that handles this case, promoting allocas like this
180into SSA registers, inserting Phi nodes as appropriate. If you run this
181example through the pass, for example, you'll get:
182
183.. code-block:: bash
184
185    $ llvm-as < example.ll | opt -passes=mem2reg | llvm-dis
186    @G = weak global i32 0
187    @H = weak global i32 0
188
189    define i32 @test(i1 %Condition) {
190    entry:
191      br i1 %Condition, label %cond_true, label %cond_false
192
193    cond_true:
194      %X.0 = load i32, i32* @G
195      br label %cond_next
196
197    cond_false:
198      %X.1 = load i32, i32* @H
199      br label %cond_next
200
201    cond_next:
202      %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
203      ret i32 %X.01
204    }
205
206The mem2reg pass implements the standard "iterated dominance frontier"
207algorithm for constructing SSA form and has a number of optimizations
208that speed up (very common) degenerate cases. The mem2reg optimization
209pass is the answer to dealing with mutable variables, and we highly
210recommend that you depend on it. Note that mem2reg only works on
211variables in certain circumstances:
212
213#. mem2reg is alloca-driven: it looks for allocas and if it can handle
214   them, it promotes them. It does not apply to global variables or heap
215   allocations.
216#. mem2reg only looks for alloca instructions in the entry block of the
217   function. Being in the entry block guarantees that the alloca is only
218   executed once, which makes analysis simpler.
219#. mem2reg only promotes allocas whose uses are direct loads and stores.
220   If the address of the stack object is passed to a function, or if any
221   funny pointer arithmetic is involved, the alloca will not be
222   promoted.
223#. mem2reg only works on allocas of `first
224   class <../../LangRef.html#first-class-types>`_ values (such as pointers,
225   scalars and vectors), and only if the array size of the allocation is
226   1 (or missing in the .ll file). mem2reg is not capable of promoting
227   structs or arrays to registers. Note that the "sroa" pass is
228   more powerful and can promote structs, "unions", and arrays in many
229   cases.
230
231All of these properties are easy to satisfy for most imperative
232languages, and we'll illustrate it below with Kaleidoscope. The final
233question you may be asking is: should I bother with this nonsense for my
234front-end? Wouldn't it be better if I just did SSA construction
235directly, avoiding use of the mem2reg optimization pass? In short, we
236strongly recommend that you use this technique for building SSA form,
237unless there is an extremely good reason not to. Using this technique
238is:
239
240-  Proven and well tested: clang uses this technique
241   for local mutable variables. As such, the most common clients of LLVM
242   are using this to handle a bulk of their variables. You can be sure
243   that bugs are found fast and fixed early.
244-  Extremely Fast: mem2reg has a number of special cases that make it
245   fast in common cases as well as fully general. For example, it has
246   fast-paths for variables that are only used in a single block,
247   variables that only have one assignment point, good heuristics to
248   avoid insertion of unneeded phi nodes, etc.
249-  Needed for debug info generation: `Debug information in
250   LLVM <../../SourceLevelDebugging.html>`_ relies on having the address of
251   the variable exposed so that debug info can be attached to it. This
252   technique dovetails very naturally with this style of debug info.
253
254If nothing else, this makes it much easier to get your front-end up and
255running, and is very simple to implement. Let's extend Kaleidoscope with
256mutable variables now!
257
258Mutable Variables in Kaleidoscope
259=================================
260
261Now that we know the sort of problem we want to tackle, let's see what
262this looks like in the context of our little Kaleidoscope language.
263We're going to add two features:
264
265#. The ability to mutate variables with the '=' operator.
266#. The ability to define new variables.
267
268While the first item is really what this is about, we only have
269variables for incoming arguments as well as for induction variables, and
270redefining those only goes so far :). Also, the ability to define new
271variables is a useful thing regardless of whether you will be mutating
272them. Here's a motivating example that shows how we could use these:
273
274::
275
276    # Define ':' for sequencing: as a low-precedence operator that ignores operands
277    # and just returns the RHS.
278    def binary : 1 (x y) y;
279
280    # Recursive fib, we could do this before.
281    def fib(x)
282      if (x < 3) then
283        1
284      else
285        fib(x-1)+fib(x-2);
286
287    # Iterative fib.
288    def fibi(x)
289      var a = 1, b = 1, c in
290      (for i = 3, i < x in
291         c = a + b :
292         a = b :
293         b = c) :
294      b;
295
296    # Call it.
297    fibi(10);
298
299In order to mutate variables, we have to change our existing variables
300to use the "alloca trick". Once we have that, we'll add our new
301operator, then extend Kaleidoscope to support new variable definitions.
302
303Adjusting Existing Variables for Mutation
304=========================================
305
306The symbol table in Kaleidoscope is managed at code generation time by
307the '``NamedValues``' map. This map currently keeps track of the LLVM
308"Value\*" that holds the double value for the named variable. In order
309to support mutation, we need to change this slightly, so that
310``NamedValues`` holds the *memory location* of the variable in question.
311Note that this change is a refactoring: it changes the structure of the
312code, but does not (by itself) change the behavior of the compiler. All
313of these changes are isolated in the Kaleidoscope code generator.
314
315At this point in Kaleidoscope's development, it only supports variables
316for two things: incoming arguments to functions and the induction
317variable of 'for' loops. For consistency, we'll allow mutation of these
318variables in addition to other user-defined variables. This means that
319these will both need memory locations.
320
321To start our transformation of Kaleidoscope, we'll change the
322``NamedValues`` map so that it maps to AllocaInst\* instead of Value\*. Once
323we do this, the C++ compiler will tell us what parts of the code we need
324to update:
325
326.. code-block:: c++
327
328    static std::map<std::string, AllocaInst*> NamedValues;
329
330Also, since we will need to create these allocas, we'll use a helper
331function that ensures that the allocas are created in the entry block of
332the function:
333
334.. code-block:: c++
335
336    /// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
337    /// the function.  This is used for mutable variables etc.
338    static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
339                                              StringRef VarName) {
340      IRBuilder<> TmpB(&TheFunction->getEntryBlock(),
341                     TheFunction->getEntryBlock().begin());
342      return TmpB.CreateAlloca(Type::getDoubleTy(*TheContext), nullptr,
343                               VarName);
344    }
345
346This funny looking code creates an IRBuilder object that is pointing at
347the first instruction (.begin()) of the entry block. It then creates an
348alloca with the expected name and returns it. Because all values in
349Kaleidoscope are doubles, there is no need to pass in a type to use.
350
351With this in place, the first functionality change we want to make belongs to
352variable references. In our new scheme, variables live on the stack, so
353code generating a reference to them actually needs to produce a load
354from the stack slot:
355
356.. code-block:: c++
357
358    Value *VariableExprAST::codegen() {
359      // Look this variable up in the function.
360      AllocaInst *A = NamedValues[Name];
361      if (!A)
362        return LogErrorV("Unknown variable name");
363
364      // Load the value.
365      return Builder->CreateLoad(A->getAllocatedType(), A, Name.c_str());
366    }
367
368As you can see, this is pretty straightforward. Now we need to update
369the things that define the variables to set up the alloca. We'll start
370with ``ForExprAST::codegen()`` (see the `full code listing <#id1>`_ for
371the unabridged code):
372
373.. code-block:: c++
374
375      Function *TheFunction = Builder->GetInsertBlock()->getParent();
376
377      // Create an alloca for the variable in the entry block.
378      AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
379
380      // Emit the start code first, without 'variable' in scope.
381      Value *StartVal = Start->codegen();
382      if (!StartVal)
383        return nullptr;
384
385      // Store the value into the alloca.
386      Builder->CreateStore(StartVal, Alloca);
387      ...
388
389      // Compute the end condition.
390      Value *EndCond = End->codegen();
391      if (!EndCond)
392        return nullptr;
393
394      // Reload, increment, and restore the alloca.  This handles the case where
395      // the body of the loop mutates the variable.
396      Value *CurVar = Builder->CreateLoad(Alloca->getAllocatedType(), Alloca,
397                                          VarName.c_str());
398      Value *NextVar = Builder->CreateFAdd(CurVar, StepVal, "nextvar");
399      Builder->CreateStore(NextVar, Alloca);
400      ...
401
402This code is virtually identical to the code `before we allowed mutable
403variables <LangImpl05.html#code-generation-for-the-for-loop>`_. The big difference is that we
404no longer have to construct a PHI node, and we use load/store to access
405the variable as needed.
406
407To support mutable argument variables, we need to also make allocas for
408them. The code for this is also pretty simple:
409
410.. code-block:: c++
411
412    Function *FunctionAST::codegen() {
413      ...
414      Builder->SetInsertPoint(BB);
415
416      // Record the function arguments in the NamedValues map.
417      NamedValues.clear();
418      for (auto &Arg : TheFunction->args()) {
419        // Create an alloca for this variable.
420        AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, Arg.getName());
421
422        // Store the initial value into the alloca.
423        Builder->CreateStore(&Arg, Alloca);
424
425        // Add arguments to variable symbol table.
426        NamedValues[std::string(Arg.getName())] = Alloca;
427      }
428
429      if (Value *RetVal = Body->codegen()) {
430        ...
431
432For each argument, we make an alloca, store the input value to the
433function into the alloca, and register the alloca as the memory location
434for the argument. This method gets invoked by ``FunctionAST::codegen()``
435right after it sets up the entry block for the function.
436
437The final missing piece is adding the mem2reg pass, which allows us to
438get good codegen once again:
439
440.. code-block:: c++
441
442        // Promote allocas to registers.
443        TheFPM->addPass(PromotePass());
444        // Do simple "peephole" optimizations and bit-twiddling optzns.
445        TheFPM->addPass(InstCombinePass());
446        // Reassociate expressions.
447        TheFPM->addPass(ReassociatePass());
448        ...
449
450It is interesting to see what the code looks like before and after the
451mem2reg optimization runs. For example, this is the before/after code
452for our recursive fib function. Before the optimization:
453
454.. code-block:: llvm
455
456    define double @fib(double %x) {
457    entry:
458      %x1 = alloca double
459      store double %x, double* %x1
460      %x2 = load double, double* %x1
461      %cmptmp = fcmp ult double %x2, 3.000000e+00
462      %booltmp = uitofp i1 %cmptmp to double
463      %ifcond = fcmp one double %booltmp, 0.000000e+00
464      br i1 %ifcond, label %then, label %else
465
466    then:       ; preds = %entry
467      br label %ifcont
468
469    else:       ; preds = %entry
470      %x3 = load double, double* %x1
471      %subtmp = fsub double %x3, 1.000000e+00
472      %calltmp = call double @fib(double %subtmp)
473      %x4 = load double, double* %x1
474      %subtmp5 = fsub double %x4, 2.000000e+00
475      %calltmp6 = call double @fib(double %subtmp5)
476      %addtmp = fadd double %calltmp, %calltmp6
477      br label %ifcont
478
479    ifcont:     ; preds = %else, %then
480      %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
481      ret double %iftmp
482    }
483
484Here there is only one variable (x, the input argument) but you can
485still see the extremely simple-minded code generation strategy we are
486using. In the entry block, an alloca is created, and the initial input
487value is stored into it. Each reference to the variable does a reload
488from the stack. Also, note that we didn't modify the if/then/else
489expression, so it still inserts a PHI node. While we could make an
490alloca for it, it is actually easier to create a PHI node for it, so we
491still just make the PHI.
492
493Here is the code after the mem2reg pass runs:
494
495.. code-block:: llvm
496
497    define double @fib(double %x) {
498    entry:
499      %cmptmp = fcmp ult double %x, 3.000000e+00
500      %booltmp = uitofp i1 %cmptmp to double
501      %ifcond = fcmp one double %booltmp, 0.000000e+00
502      br i1 %ifcond, label %then, label %else
503
504    then:
505      br label %ifcont
506
507    else:
508      %subtmp = fsub double %x, 1.000000e+00
509      %calltmp = call double @fib(double %subtmp)
510      %subtmp5 = fsub double %x, 2.000000e+00
511      %calltmp6 = call double @fib(double %subtmp5)
512      %addtmp = fadd double %calltmp, %calltmp6
513      br label %ifcont
514
515    ifcont:     ; preds = %else, %then
516      %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
517      ret double %iftmp
518    }
519
520This is a trivial case for mem2reg, since there are no redefinitions of
521the variable. The point of showing this is to calm your tension about
522inserting such blatant inefficiencies :).
523
524After the rest of the optimizers run, we get:
525
526.. code-block:: llvm
527
528    define double @fib(double %x) {
529    entry:
530      %cmptmp = fcmp ult double %x, 3.000000e+00
531      %booltmp = uitofp i1 %cmptmp to double
532      %ifcond = fcmp ueq double %booltmp, 0.000000e+00
533      br i1 %ifcond, label %else, label %ifcont
534
535    else:
536      %subtmp = fsub double %x, 1.000000e+00
537      %calltmp = call double @fib(double %subtmp)
538      %subtmp5 = fsub double %x, 2.000000e+00
539      %calltmp6 = call double @fib(double %subtmp5)
540      %addtmp = fadd double %calltmp, %calltmp6
541      ret double %addtmp
542
543    ifcont:
544      ret double 1.000000e+00
545    }
546
547Here we see that the simplifycfg pass decided to clone the return
548instruction into the end of the 'else' block. This allowed it to
549eliminate some branches and the PHI node.
550
551Now that all symbol table references are updated to use stack variables,
552we'll add the assignment operator.
553
554New Assignment Operator
555=======================
556
557With our current framework, adding a new assignment operator is really
558simple. We will parse it just like any other binary operator, but handle
559it internally (instead of allowing the user to define it). The first
560step is to set a precedence:
561
562.. code-block:: c++
563
564     int main() {
565       // Install standard binary operators.
566       // 1 is lowest precedence.
567       BinopPrecedence['='] = 2;
568       BinopPrecedence['<'] = 10;
569       BinopPrecedence['+'] = 20;
570       BinopPrecedence['-'] = 20;
571
572Now that the parser knows the precedence of the binary operator, it
573takes care of all the parsing and AST generation. We just need to
574implement codegen for the assignment operator. This looks like:
575
576.. code-block:: c++
577
578    Value *BinaryExprAST::codegen() {
579      // Special case '=' because we don't want to emit the LHS as an expression.
580      if (Op == '=') {
581        // This assume we're building without RTTI because LLVM builds that way by
582        // default. If you build LLVM with RTTI this can be changed to a
583        // dynamic_cast for automatic error checking.
584        VariableExprAST *LHSE = static_cast<VariableExprAST*>(LHS.get());
585        if (!LHSE)
586          return LogErrorV("destination of '=' must be a variable");
587
588Unlike the rest of the binary operators, our assignment operator doesn't
589follow the "emit LHS, emit RHS, do computation" model. As such, it is
590handled as a special case before the other binary operators are handled.
591The other strange thing is that it requires the LHS to be a variable. It
592is invalid to have "(x+1) = expr" - only things like "x = expr" are
593allowed.
594
595.. code-block:: c++
596
597        // Codegen the RHS.
598        Value *Val = RHS->codegen();
599        if (!Val)
600          return nullptr;
601
602        // Look up the name.
603        Value *Variable = NamedValues[LHSE->getName()];
604        if (!Variable)
605          return LogErrorV("Unknown variable name");
606
607        Builder->CreateStore(Val, Variable);
608        return Val;
609      }
610      ...
611
612Once we have the variable, codegen'ing the assignment is
613straightforward: we emit the RHS of the assignment, create a store, and
614return the computed value. Returning a value allows for chained
615assignments like "X = (Y = Z)".
616
617Now that we have an assignment operator, we can mutate loop variables
618and arguments. For example, we can now run code like this:
619
620::
621
622    # Function to print a double.
623    extern printd(x);
624
625    # Define ':' for sequencing: as a low-precedence operator that ignores operands
626    # and just returns the RHS.
627    def binary : 1 (x y) y;
628
629    def test(x)
630      printd(x) :
631      x = 4 :
632      printd(x);
633
634    test(123);
635
636When run, this example prints "123" and then "4", showing that we did
637actually mutate the value! Okay, we have now officially implemented our
638goal: getting this to work requires SSA construction in the general
639case. However, to be really useful, we want the ability to define our
640own local variables, let's add this next!
641
642User-defined Local Variables
643============================
644
645Adding var/in is just like any other extension we made to
646Kaleidoscope: we extend the lexer, the parser, the AST and the code
647generator. The first step for adding our new 'var/in' construct is to
648extend the lexer. As before, this is pretty trivial, the code looks like
649this:
650
651.. code-block:: c++
652
653    enum Token {
654      ...
655      // var definition
656      tok_var = -13
657    ...
658    }
659    ...
660    static int gettok() {
661    ...
662        if (IdentifierStr == "in")
663          return tok_in;
664        if (IdentifierStr == "binary")
665          return tok_binary;
666        if (IdentifierStr == "unary")
667          return tok_unary;
668        if (IdentifierStr == "var")
669          return tok_var;
670        return tok_identifier;
671    ...
672
673The next step is to define the AST node that we will construct. For
674var/in, it looks like this:
675
676.. code-block:: c++
677
678    /// VarExprAST - Expression class for var/in
679    class VarExprAST : public ExprAST {
680      std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames;
681      std::unique_ptr<ExprAST> Body;
682
683    public:
684      VarExprAST(std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames,
685                 std::unique_ptr<ExprAST> Body)
686        : VarNames(std::move(VarNames)), Body(std::move(Body)) {}
687
688      Value *codegen() override;
689    };
690
691var/in allows a list of names to be defined all at once, and each name
692can optionally have an initializer value. As such, we capture this
693information in the VarNames vector. Also, var/in has a body, this body
694is allowed to access the variables defined by the var/in.
695
696With this in place, we can define the parser pieces. The first thing we
697do is add it as a primary expression:
698
699.. code-block:: c++
700
701    /// primary
702    ///   ::= identifierexpr
703    ///   ::= numberexpr
704    ///   ::= parenexpr
705    ///   ::= ifexpr
706    ///   ::= forexpr
707    ///   ::= varexpr
708    static std::unique_ptr<ExprAST> ParsePrimary() {
709      switch (CurTok) {
710      default:
711        return LogError("unknown token when expecting an expression");
712      case tok_identifier:
713        return ParseIdentifierExpr();
714      case tok_number:
715        return ParseNumberExpr();
716      case '(':
717        return ParseParenExpr();
718      case tok_if:
719        return ParseIfExpr();
720      case tok_for:
721        return ParseForExpr();
722      case tok_var:
723        return ParseVarExpr();
724      }
725    }
726
727Next we define ParseVarExpr:
728
729.. code-block:: c++
730
731    /// varexpr ::= 'var' identifier ('=' expression)?
732    //                    (',' identifier ('=' expression)?)* 'in' expression
733    static std::unique_ptr<ExprAST> ParseVarExpr() {
734      getNextToken();  // eat the var.
735
736      std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames;
737
738      // At least one variable name is required.
739      if (CurTok != tok_identifier)
740        return LogError("expected identifier after var");
741
742The first part of this code parses the list of identifier/expr pairs
743into the local ``VarNames`` vector.
744
745.. code-block:: c++
746
747      while (true) {
748        std::string Name = IdentifierStr;
749        getNextToken();  // eat identifier.
750
751        // Read the optional initializer.
752        std::unique_ptr<ExprAST> Init;
753        if (CurTok == '=') {
754          getNextToken(); // eat the '='.
755
756          Init = ParseExpression();
757          if (!Init) return nullptr;
758        }
759
760        VarNames.push_back(std::make_pair(Name, std::move(Init)));
761
762        // End of var list, exit loop.
763        if (CurTok != ',') break;
764        getNextToken(); // eat the ','.
765
766        if (CurTok != tok_identifier)
767          return LogError("expected identifier list after var");
768      }
769
770Once all the variables are parsed, we then parse the body and create the
771AST node:
772
773.. code-block:: c++
774
775      // At this point, we have to have 'in'.
776      if (CurTok != tok_in)
777        return LogError("expected 'in' keyword after 'var'");
778      getNextToken();  // eat 'in'.
779
780      auto Body = ParseExpression();
781      if (!Body)
782        return nullptr;
783
784      return std::make_unique<VarExprAST>(std::move(VarNames),
785                                           std::move(Body));
786    }
787
788Now that we can parse and represent the code, we need to support
789emission of LLVM IR for it. This code starts out with:
790
791.. code-block:: c++
792
793    Value *VarExprAST::codegen() {
794      std::vector<AllocaInst *> OldBindings;
795
796      Function *TheFunction = Builder->GetInsertBlock()->getParent();
797
798      // Register all variables and emit their initializer.
799      for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
800        const std::string &VarName = VarNames[i].first;
801        ExprAST *Init = VarNames[i].second.get();
802
803Basically it loops over all the variables, installing them one at a
804time. For each variable we put into the symbol table, we remember the
805previous value that we replace in OldBindings.
806
807.. code-block:: c++
808
809        // Emit the initializer before adding the variable to scope, this prevents
810        // the initializer from referencing the variable itself, and permits stuff
811        // like this:
812        //  var a = 1 in
813        //    var a = a in ...   # refers to outer 'a'.
814        Value *InitVal;
815        if (Init) {
816          InitVal = Init->codegen();
817          if (!InitVal)
818            return nullptr;
819        } else { // If not specified, use 0.0.
820          InitVal = ConstantFP::get(*TheContext, APFloat(0.0));
821        }
822
823        AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
824        Builder->CreateStore(InitVal, Alloca);
825
826        // Remember the old variable binding so that we can restore the binding when
827        // we unrecurse.
828        OldBindings.push_back(NamedValues[VarName]);
829
830        // Remember this binding.
831        NamedValues[VarName] = Alloca;
832      }
833
834There are more comments here than code. The basic idea is that we emit
835the initializer, create the alloca, then update the symbol table to
836point to it. Once all the variables are installed in the symbol table,
837we evaluate the body of the var/in expression:
838
839.. code-block:: c++
840
841      // Codegen the body, now that all vars are in scope.
842      Value *BodyVal = Body->codegen();
843      if (!BodyVal)
844        return nullptr;
845
846Finally, before returning, we restore the previous variable bindings:
847
848.. code-block:: c++
849
850      // Pop all our variables from scope.
851      for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
852        NamedValues[VarNames[i].first] = OldBindings[i];
853
854      // Return the body computation.
855      return BodyVal;
856    }
857
858The end result of all of this is that we get properly scoped variable
859definitions, and we even (trivially) allow mutation of them :).
860
861With this, we completed what we set out to do. Our nice iterative fib
862example from the intro compiles and runs just fine. The mem2reg pass
863optimizes all of our stack variables into SSA registers, inserting PHI
864nodes where needed, and our front-end remains simple: no "iterated
865dominance frontier" computation anywhere in sight.
866
867Full Code Listing
868=================
869
870Here is the complete code listing for our running example, enhanced with
871mutable variables and var/in support. To build this example, use:
872
873.. code-block:: bash
874
875    # Compile
876    clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orcjit native` -O3 -o toy
877    # Run
878    ./toy
879
880Here is the code:
881
882.. literalinclude:: ../../../examples/Kaleidoscope/Chapter7/toy.cpp
883   :language: c++
884
885`Next: Compiling to Object Code <LangImpl08.html>`_
886
887