=======================================================
Kaleidoscope: Extending the Language: Debug Information
=======================================================

.. contents::
   :local:

Chapter 8 Introduction
======================

Welcome to Chapter 8 of the "`Implementing a language with
LLVM <index.html>`_" tutorial. In chapters 1 through 7, we've built a
decent little programming language with functions and variables.
What happens if something goes wrong, though? How do you debug your
program?

Source level debugging uses formatted data that helps a debugger
translate from binary and the state of the machine back to the
source that the programmer wrote. In LLVM we generally use a format
called `DWARF <http://dwarfstd.org>`_. DWARF is a compact encoding
that represents types, source locations, and variable locations.

The short summary of this chapter is that we'll go through the
various things you have to add to a programming language to
support debug info, and how you translate that into DWARF.

Caveat: For now we can't debug via the JIT, so we'll need to compile
our program down to something small and standalone. As part of this
we'll make a few modifications to how the language runs and how
programs are compiled. This means that we'll have a source file
with a simple program written in Kaleidoscope rather than the
interactive JIT. To reduce the number of changes necessary, we also
accept a limitation: there can be only one "top level" command at a
time.

Here's the sample program we'll be compiling:

.. code-block:: python

   def fib(x)
     if x < 3 then
       1
     else
       fib(x-1)+fib(x-2);

   fib(10)


Why is this a hard problem?
===========================

Debug information is a hard problem for a few different reasons, mostly
centered around optimized code. First, optimization makes keeping source
locations more difficult. In LLVM IR we keep the original source location
for each IR level instruction on the instruction itself. Optimization passes
should keep the source locations for newly created instructions, but merged
instructions only get to keep a single location, which can cause jumping
around when stepping through optimized programs. Secondly, optimization
can move variables in ways that leave them optimized out, sharing memory
with other variables, or difficult to track. For the purposes of this
tutorial we're going to avoid optimization (as you'll see with one of the
next sets of patches).

Ahead-of-Time Compilation Mode
==============================

To highlight only the aspects of adding debug information to a source
language without needing to worry about the complexities of JIT debugging,
we're going to make a few changes to Kaleidoscope to support compiling
the IR emitted by the front end into a simple standalone program that
you can execute, debug, and see results.

First we make our anonymous function that contains our top level
statement be our "main":

.. code-block:: udiff

  -    PrototypeAST *Proto = new PrototypeAST("", std::vector<std::string>());
  +    PrototypeAST *Proto = new PrototypeAST("main", std::vector<std::string>());

just with the simple change of giving it a name.

Then we're going to remove the interactive prompt code (the ``ready>``
output) wherever it exists:

.. code-block:: udiff

  @@ -1129,7 +1129,6 @@ static void HandleTopLevelExpression() {
   /// top ::= definition | external | expression | ';'
   static void MainLoop() {
     while (1) {
  -    fprintf(stderr, "ready> ");
       switch (CurTok) {
       case tok_eof:
         return;
  @@ -1184,7 +1183,6 @@ int main() {
     BinopPrecedence['*'] = 40; // highest.

     // Prime the first token.
  -  fprintf(stderr, "ready> ");
     getNextToken();

Lastly we're going to disable all of the optimization passes and the JIT so
that the only thing that happens after we're done parsing and generating
code is that the LLVM IR goes to standard error:

.. code-block:: udiff

  @@ -1108,17 +1108,8 @@ static void HandleExtern() {
   static void HandleTopLevelExpression() {
     // Evaluate a top-level expression into an anonymous function.
     if (FunctionAST *F = ParseTopLevelExpr()) {
  -    if (Function *LF = F->Codegen()) {
  -      // We're just doing this to make sure it executes.
  -      TheExecutionEngine->finalizeObject();
  -      // JIT the function, returning a function pointer.
  -      void *FPtr = TheExecutionEngine->getPointerToFunction(LF);
  -
  -      // Cast it to the right type (takes no arguments, returns a double) so we
  -      // can call it as a native function.
  -      double (*FP)() = (double (*)())(intptr_t)FPtr;
  -      // Ignore the return value for this.
  -      (void)FP;
  +    if (!F->Codegen()) {
  +      fprintf(stderr, "Error generating code for top level expr");
       }
     } else {
       // Skip token for error recovery.
  @@ -1439,11 +1459,11 @@ int main() {
     // target lays out data structures.
     TheModule->setDataLayout(TheExecutionEngine->getDataLayout());
     OurFPM.add(new DataLayoutPass());
  +#if 0
     OurFPM.add(createBasicAliasAnalysisPass());
     // Promote allocas to registers.
     OurFPM.add(createPromoteMemoryToRegisterPass());
  @@ -1218,7 +1210,7 @@ int main() {
     OurFPM.add(createGVNPass());
     // Simplify the control flow graph (deleting unreachable blocks, etc).
     OurFPM.add(createCFGSimplificationPass());
  -
  +  #endif
     OurFPM.doInitialization();

     // Set the global so the code gen can use this.

This relatively small set of changes gets us to the point that we can compile
our piece of Kaleidoscope language down to an executable program via this
command line:

.. code-block:: bash

  Kaleidoscope-Ch8 < fib.ks |& clang -x ir -

which gives an a.out/a.exe in the current working directory. The ``|&``
pipes standard error, where the IR is written, into ``clang``; it is
shorthand for ``2>&1 |`` in shells such as bash.

Compile Unit
============

The top level container for a section of code in DWARF is a compile unit.
This contains the type and function data for an individual translation unit
(read: one file of source code). So the first thing we need to do is
construct one for our fib.ks file.

DWARF Emission Setup
====================

Similar to the ``IRBuilder`` class we have a
`DIBuilder <http://llvm.org/doxygen/classllvm_1_1DIBuilder.html>`_ class
that helps in constructing debug metadata for an LLVM IR file. Much as
``IRBuilder`` corresponds 1:1 to LLVM IR, it corresponds 1:1 to the debug
metadata, but with nicer names.
Using it does require that you be more familiar with DWARF terminology than
you needed to be with ``IRBuilder`` and ``Instruction`` names, but if you
read through the general documentation on the
`Metadata Format <http://llvm.org/docs/SourceLevelDebugging.html>`_ it
should be a little more clear. We'll be using this class to construct all
of our IR level descriptions. Its constructor takes a module, so we
need to construct it shortly after we construct our module. We've left it
as a global static variable to make it a bit easier to use.

Next we're going to create a small container to cache some of our frequent
data. The first will be our compile unit, but we'll also write a bit of
code for our one type, since we won't have to worry about expressions of
more than one type:

.. code-block:: c++

  static DIBuilder *DBuilder;

  struct DebugInfo {
    DICompileUnit TheCU;
    DIType DblTy;

    DIType getDoubleTy();
  } KSDbgInfo;

  DIType DebugInfo::getDoubleTy() {
    if (DblTy.isValid())
      return DblTy;

    DblTy = DBuilder->createBasicType("double", 64, 64, dwarf::DW_ATE_float);
    return DblTy;
  }

And then later on in ``main`` when we're constructing our module:

.. code-block:: c++

  DBuilder = new DIBuilder(*TheModule);

  KSDbgInfo.TheCU = DBuilder->createCompileUnit(
      dwarf::DW_LANG_C, "fib.ks", ".", "Kaleidoscope Compiler", 0, "", 0);

There are a couple of things to note here. First, while we're producing a
compile unit for a language called Kaleidoscope, we used the language
constant for C. This is because a debugger wouldn't necessarily understand
the calling conventions or default ABI for a language it doesn't recognize,
and we follow the C ABI in our LLVM code generation, so it's the closest
thing to accurate. This ensures we can actually call functions from the
debugger and have them execute. Secondly, you'll see the "fib.ks" in the
call to ``createCompileUnit``. This is a default hard coded value since
we're using shell redirection to put our source into the Kaleidoscope
compiler. In a usual front end you'd have an input file name and it would
go there.
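
As a rough sketch of what a front end with a real input file name might do,
the helper below splits a path into the directory and file name that would
replace ``"."`` and ``"fib.ks"``. The names ``SourceFileName`` and
``splitSourcePath`` are made up for this illustration and are not part of the
tutorial code, and the sketch assumes ``/``-separated paths only:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of the tutorial code: the two strings a
// front end would hand to createCompileUnit instead of "fib.ks" and ".".
struct SourceFileName {
  std::string Directory;
  std::string Filename;
};

// Split a path such as "src/fib.ks" at its last '/'.
// A path with no directory part is reported as living in ".".
SourceFileName splitSourcePath(const std::string &Path) {
  assert(!Path.empty() && "expected an input file name");
  std::string::size_type Slash = Path.find_last_of('/');
  if (Slash == std::string::npos)
    return {".", Path};
  if (Slash == 0) // file directly under the root directory
    return {"/", Path.substr(1)};
  return {Path.substr(0, Slash), Path.substr(Slash + 1)};
}
```

With that in place, ``splitSourcePath("src/fib.ks")`` yields the directory
``src`` and the file name ``fib.ks``, which could then be passed to
``createCompileUnit`` in place of the hard coded strings.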

One last thing as part of emitting debug information via DIBuilder is that
we need to "finalize" the debug information. The reasons are part of the
underlying API for DIBuilder, but make sure you do this near the end of
main:

.. code-block:: c++

  DBuilder->finalize();

before you dump out the module.

Functions
=========

Now that we have our ``Compile Unit``, we can add function definitions to
the debug info. So in ``PrototypeAST::Codegen`` we
add a few lines of code to describe a context for our subprogram, in this
case the "File", and the actual definition of the function itself.

So the context:

.. code-block:: c++

  DIFile Unit = DBuilder->createFile(KSDbgInfo.TheCU.getFilename(),
                                     KSDbgInfo.TheCU.getDirectory());

giving us a DIFile and asking the ``Compile Unit`` we created above for the
directory and filename where we are currently. Then, for now, we use some
source locations of 0 (since our AST doesn't currently have source location
information) and construct our function definition:

.. code-block:: c++

  DIDescriptor FContext(Unit);
  unsigned LineNo = 0;
  unsigned ScopeLine = 0;
  DISubprogram SP = DBuilder->createFunction(
      FContext, Name, StringRef(), Unit, LineNo,
      CreateFunctionType(Args.size(), Unit), false /* internal linkage */,
      true /* definition */, ScopeLine, DIDescriptor::FlagPrototyped, false, F);

and we now have a DISubprogram that contains a reference to all of our metadata
for the function.
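
The ``CreateFunctionType`` helper used above isn't shown in the excerpts in
this chapter. As an assumption about its general shape: LLVM's subroutine type
metadata lists the return type first, followed by one entry per parameter, and
every one of them is ``double`` in Kaleidoscope. The sketch below models just
that list with plain strings; ``functionTypeElements`` and ``TypeName`` are
hypothetical stand-ins, not LLVM API:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for a debug-info type handle; the real code would use whatever
// KSDbgInfo.getDoubleTy() returns.
using TypeName = std::string;

// Hypothetical model of the element list behind a function type:
// slot 0 is the return type, followed by one slot per argument.
std::vector<TypeName> functionTypeElements(unsigned NumArgs) {
  const TypeName DblTy = "double"; // Kaleidoscope's only type
  std::vector<TypeName> Elements;
  Elements.reserve(NumArgs + 1);
  Elements.push_back(DblTy); // return type
  for (unsigned I = 0; I != NumArgs; ++I)
    Elements.push_back(DblTy); // argument I
  assert(Elements.size() == NumArgs + 1 && "one slot per argument plus result");
  return Elements;
}
```

For ``fib(x)`` this gives two entries, one for the result and one for ``x``;
for the argument-less ``main`` it gives just the result entry.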

Source Locations
================

The most important thing for debug information is accurate source location:
this makes it possible to map the generated code back to your source. We have
a problem though: Kaleidoscope really doesn't have any source location
information in the lexer or parser, so we'll need to add it.

.. code-block:: c++

   struct SourceLocation {
     int Line;
     int Col;
   };
   static SourceLocation CurLoc;
   static SourceLocation LexLoc = {1, 0};

   static int advance() {
     int LastChar = getchar();

     if (LastChar == '\n' || LastChar == '\r') {
       LexLoc.Line++;
       LexLoc.Col = 0;
     } else
       LexLoc.Col++;
     return LastChar;
   }

In this set of code we've added some functionality to keep track of the
line and column of the "source file". As we lex every token we set our
current "lexical location" to the associated line and column for the beginning
of the token. We do this by replacing all of the previous calls to
``getchar()`` with our new ``advance()``, which keeps track of the information,
and then we have added to all of our AST classes a source location:

.. code-block:: c++

   class ExprAST {
     SourceLocation Loc;

     public:
       int getLine() const { return Loc.Line; }
       int getCol() const { return Loc.Col; }
       ExprAST(SourceLocation Loc = CurLoc) : Loc(Loc) {}
       virtual std::ostream &dump(std::ostream &out, int ind) {
         return out << ':' << getLine() << ':' << getCol() << '\n';
       }

that we pass down through when we create a new expression:

.. code-block:: c++

   LHS = new BinaryExprAST(BinLoc, BinOp, LHS, RHS);

giving us locations for each of our expressions and variables.
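
To see the bookkeeping in isolation, here is a small sketch of the same idea
that reads from a string instead of ``stdin``. ``LocationTracker`` and
``locationAfter`` are hypothetical names introduced for this example. It also
makes one assumption that differs from the snippet above: a ``\r\n`` pair is
counted as a single line break rather than two.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct SourceLocation {
  int Line;
  int Col;
};

// Hypothetical variant of advance() that reads from a string instead of
// stdin so that it can be exercised in isolation.
class LocationTracker {
  std::string Text;
  std::string::size_type Pos = 0;
  SourceLocation Loc = {1, 0};
  bool LastWasCR = false;

public:
  explicit LocationTracker(std::string Input) : Text(std::move(Input)) {}

  // Return the next character (or -1 at end of input), updating Loc.
  int advance() {
    if (Pos == Text.size())
      return -1;
    char C = Text[Pos++];
    if (C == '\n' && LastWasCR) {
      // Second half of "\r\n": the line was already counted at '\r'.
    } else if (C == '\n' || C == '\r') {
      Loc.Line++;
      Loc.Col = 0;
    } else {
      Loc.Col++;
    }
    LastWasCR = (C == '\r');
    return static_cast<unsigned char>(C);
  }

  SourceLocation location() const { return Loc; }
};

// Consume the first N characters of Input and report where the lexer ends up.
SourceLocation locationAfter(const std::string &Input, unsigned N) {
  assert(N <= Input.size() && "cannot read past the end of the input");
  LocationTracker Tracker(Input);
  for (unsigned I = 0; I != N; ++I)
    Tracker.advance();
  return Tracker.location();
}
```

For the start of our sample program, ``locationAfter("def fib(x)\n  if", 3)``
ends at line 1, column 3, and reading 13 characters ends at line 2, column 2.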

From this we can make sure to tell the ``IRBuilder`` when we're at a new source
location so it can use that when we generate the rest of our code and make
sure that each instruction has source location information. We do this
by constructing another small function:

.. code-block:: c++

  void DebugInfo::emitLocation(ExprAST *AST) {
    // A null AST clears the current location (used for function prologues).
    if (!AST)
      return Builder.SetCurrentDebugLocation(DebugLoc());
    DIScope *Scope;
    if (LexicalBlocks.empty())
      Scope = &TheCU;
    else
      Scope = LexicalBlocks.back();
    Builder.SetCurrentDebugLocation(
        DebugLoc::get(AST->getLine(), AST->getCol(), DIScope(*Scope)));
  }

that tells the main ``IRBuilder`` both where we are and what scope
we're in. Since we've just created a function above we can either be in
the main file scope (like when we created our function), or now we can be
in the function scope we just created. To represent this we create a stack
of scopes:

.. code-block:: c++

   std::vector<DIScope *> LexicalBlocks;
   std::map<const PrototypeAST *, DIScope> FnScopeMap;

and keep a map of each function to the scope that it represents (a DISubprogram
is also a DIScope).

Then we make sure to:

.. code-block:: c++

   KSDbgInfo.emitLocation(this);

emit the location every time we start to generate code for a new AST, and
also:

.. code-block:: c++

  KSDbgInfo.FnScopeMap[this] = SP;

store the scope (function) when we create it and use it:

.. code-block:: c++

  KSDbgInfo.LexicalBlocks.push_back(&KSDbgInfo.FnScopeMap[Proto]);

when we start generating the code for each function.

Also, don't forget to pop the scope back off of your scope stack at the
end of the code generation for the function:

.. code-block:: c++

  // Pop off the lexical block for the function since we added it
  // unconditionally.
  KSDbgInfo.LexicalBlocks.pop_back();
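
Since a forgotten ``pop_back()`` leaves every later location in the wrong
scope, one option is to tie the push and pop to an object's lifetime. The
following is only a sketch of that idea using strings in place of ``DIScope``;
``ScopeStack``, ``ScopedFunction`` and ``scopesAroundFunction`` are
hypothetical names and the tutorial's code does not use them:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Stand-in for DIScope; a plain name is enough to show the stack discipline.
using Scope = std::string;

struct ScopeStack {
  Scope FileScope;                  // used when no function is being emitted
  std::vector<Scope> LexicalBlocks; // innermost scope is at the back

  explicit ScopeStack(Scope File) : FileScope(std::move(File)) {}

  // Mirrors the choice made in emitLocation: innermost block, else the file.
  const Scope &current() const {
    return LexicalBlocks.empty() ? FileScope : LexicalBlocks.back();
  }
};

// Hypothetical RAII helper: pushes on construction, pops on destruction,
// so the pop cannot be forgotten on an early return.
class ScopedFunction {
  ScopeStack &Stack;

public:
  ScopedFunction(ScopeStack &S, Scope Fn) : Stack(S) {
    Stack.LexicalBlocks.push_back(std::move(Fn));
  }
  ~ScopedFunction() {
    assert(!Stack.LexicalBlocks.empty() && "unbalanced scope stack");
    Stack.LexicalBlocks.pop_back();
  }
  ScopedFunction(const ScopedFunction &) = delete;
  ScopedFunction &operator=(const ScopedFunction &) = delete;
};

// Report the scope seen while "generating code" for Fn and the one seen after.
std::vector<Scope> scopesAroundFunction(const Scope &File, const Scope &Fn) {
  ScopeStack Stack(File);
  std::vector<Scope> Seen;
  {
    ScopedFunction Guard(Stack, Fn);
    Seen.push_back(Stack.current()); // inside the function body
  }
  Seen.push_back(Stack.current()); // guard destroyed, back at file scope
  return Seen;
}
```

Calling ``scopesAroundFunction("fib.ks", "fib")`` reports ``fib`` while the
guard is alive and ``fib.ks`` once it has gone out of scope, which is the same
push and pop pairing the tutorial code performs by hand.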

Variables
=========

Now that we have functions, we need to be able to print out the variables
we have in scope. Let's get our function arguments set up so we can get
decent backtraces and see how our functions are being called. It isn't
a lot of code, and we generally handle it when we're creating the
argument allocas in ``PrototypeAST::CreateArgumentAllocas``.

.. code-block:: c++

  DIScope *Scope = KSDbgInfo.LexicalBlocks.back();
  DIFile Unit = DBuilder->createFile(KSDbgInfo.TheCU.getFilename(),
                                     KSDbgInfo.TheCU.getDirectory());
  DIVariable D = DBuilder->createLocalVariable(dwarf::DW_TAG_arg_variable,
                                               *Scope, Args[Idx], Unit, Line,
                                               KSDbgInfo.getDoubleTy(), Idx);

  Instruction *Call = DBuilder->insertDeclare(
      Alloca, D, DBuilder->createExpression(), Builder.GetInsertBlock());
  Call->setDebugLoc(DebugLoc::get(Line, 0, *Scope));

Here we're doing a few things. First, we're grabbing our current scope
for the variable so we can say what range of code our variable is valid
through. Second, we're creating the variable, giving it the scope,
the name, source location, type, and, since it's an argument, the argument
index. Third, we create an ``llvm.dbg.declare`` call to indicate at the IR
level that we've got a variable in an alloca (and it gives a starting
location for the variable). Lastly, we set a source location for the
beginning of the scope on the declare.

One interesting thing to note at this point is that various debuggers have
assumptions based on how code and debug information was generated for them
in the past. In this case we need to do a little bit of a hack to avoid
generating line information for the function prologue so that the debugger
knows to skip over those instructions when setting a breakpoint. So in
``FunctionAST::Codegen`` we add a couple of lines:

.. code-block:: c++

  // Unset the location for the prologue emission (leading instructions with no
  // location in a function are considered part of the prologue and the debugger
  // will run past them when breaking on a function)
  KSDbgInfo.emitLocation(nullptr);

and then emit a new location when we actually start generating code for the
body of the function:

.. code-block:: c++

  KSDbgInfo.emitLocation(Body);

With this we have enough debug information to set breakpoints in functions,
print out argument variables, and call functions. Not too bad for just a
few simple lines of code!
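
The effect of leaving the prologue without locations can be pictured with a
toy model. The sketch below is not how any particular debugger is implemented;
it simply assumes, as the comment in the code above describes, that a
breakpoint on a function lands on the first instruction that carries a source
line. ``Instr`` and ``breakpointAddress`` are hypothetical:

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of one emitted instruction: Line == 0 means
// "no source location attached".
struct Instr {
  unsigned Address;
  unsigned Line;
};

// Pick the address a debugger might use for a breakpoint on the function:
// skip the leading run of instructions without a location (the prologue)
// and stop at the first instruction that carries one.
unsigned breakpointAddress(const std::vector<Instr> &Function) {
  assert(!Function.empty() && "function has no instructions");
  for (const Instr &I : Function)
    if (I.Line != 0)
      return I.Address;
  return Function.front().Address; // no line info at all: use the entry
}
```

For a function whose instructions at addresses 0 and 4 have no line and whose
instruction at address 8 is on line 2, ``breakpointAddress`` returns 8, so the
argument stores in the prologue have already run when the breakpoint is hit.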

Full Code Listing
=================

Here is the complete code listing for our running example, enhanced with
debug information. To build this example, use:

.. code-block:: bash

    # Compile
    clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy
    # Run
    ./toy

Here is the code:

.. literalinclude:: ../../examples/Kaleidoscope/Chapter8/toy.cpp
   :language: c++

`Next: Conclusion and other useful LLVM tidbits <LangImpl9.html>`_