=======================================================
Kaleidoscope: Extending the Language: Debug Information
=======================================================

.. contents::
   :local:

Chapter 8 Introduction
======================

Welcome to Chapter 8 of the "`Implementing a language with
LLVM <index.html>`_" tutorial. In chapters 1 through 7, we've built a
decent little programming language with functions and variables.
What happens if something goes wrong, though? How do you debug your
program?

Source-level debugging uses formatted data that helps a debugger
translate from the binary and the state of the machine back to the
source that the programmer wrote. In LLVM we generally use a format
called `DWARF <http://dwarfstd.org>`_. DWARF is a compact encoding
that represents types, source locations, and variable locations.

The short summary of this chapter is that we'll go through the
various things you have to add to a programming language to
support debug info, and how you translate that into DWARF.

Caveat: For now we can't debug via the JIT, so we'll need to compile
our program down to something small and standalone. As part of this
we'll make a few modifications to the running of the language and
how programs are compiled.
This means that we'll have a source file
with a simple program written in Kaleidoscope rather than the
interactive JIT. It does involve a limitation that we can only
have one "top level" command at a time to reduce the number of
changes necessary.

Here's the sample program we'll be compiling:

.. code-block:: python

    def fib(x)
      if x < 3 then
        1
      else
        fib(x-1)+fib(x-2);

    fib(10)


Why is this a hard problem?
===========================

Debug information is a hard problem for a few different reasons - mostly
centered around optimized code. First, optimization makes keeping source
locations more difficult. In LLVM IR we keep the original source location
for each IR level instruction on the instruction. Optimization passes
should keep the source locations for newly created instructions, but merged
instructions only get to keep a single location - this can cause jumping
around when stepping through optimized programs. Secondly, optimization
can move variables around so that they are optimized out, share memory
with other variables, or become difficult to track. For the purposes of this
tutorial we're going to avoid optimization (as you'll see with one of the
next sets of patches).
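
To make the "single location" problem concrete, here is a toy model in
plain C++. It is only a sketch and does not use any LLVM API: ``ToyInst``,
``mergeAdjacentDuplicates`` and ``steppedLines`` are hypothetical names
invented for this illustration. Two identical computations that came from
different source lines are merged into one, and the surviving instruction
can only remember one of the two lines:

.. code-block:: c++

    #include <string>
    #include <vector>

    // Toy stand-in for an IR instruction: an operation plus the source line
    // it was generated from. This is NOT an LLVM type.
    struct ToyInst {
      std::string Op;
      int Line;
    };

    // Toy "merge" pass: when two adjacent instructions compute the same
    // thing, keep only the first. The survivor carries a single source line,
    // so the line of the dropped instruction is lost.
    std::vector<ToyInst> mergeAdjacentDuplicates(const std::vector<ToyInst> &In) {
      std::vector<ToyInst> Out;
      for (const ToyInst &I : In) {
        if (!Out.empty() && Out.back().Op == I.Op)
          continue; // duplicate: the earlier instruction (and its line) wins
        Out.push_back(I);
      }
      return Out;
    }

    // The sequence of source lines a debugger would report while stepping
    // through the instructions one at a time.
    std::vector<int> steppedLines(const std::vector<ToyInst> &Insts) {
      std::vector<int> Lines;
      for (const ToyInst &I : Insts)
        Lines.push_back(I.Line);
      return Lines;
    }

Given the instructions ``{"load x", 1}``, ``{"mul x, x", 2}``,
``{"mul x, x", 4}`` and ``{"ret", 5}``, stepping through the merged program
visits lines 1, 2 and 5. Line 4 never shows up, even though the programmer
wrote code there, which is the kind of surprise a debugger user sees in
optimized code.
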

Ahead-of-Time Compilation Mode
==============================

To highlight only the aspects of adding debug information to a source
language without needing to worry about the complexities of JIT debugging
we're going to make a few changes to Kaleidoscope to support compiling
the IR emitted by the front end into a simple standalone program that
you can execute, debug, and see results.

First we make our anonymous function that contains our top level
statement be our "main":

.. code-block:: udiff

    -    PrototypeAST *Proto = new PrototypeAST("", std::vector<std::string>());
    +    PrototypeAST *Proto = new PrototypeAST("main", std::vector<std::string>());

just with the simple change of giving it a name.

Then we're going to remove the command line code wherever it exists:

.. code-block:: udiff

    @@ -1129,7 +1129,6 @@ static void HandleTopLevelExpression() {
     /// top ::= definition | external | expression | ';'
     static void MainLoop() {
       while (1) {
    -    fprintf(stderr, "ready> ");
         switch (CurTok) {
         case tok_eof:
           return;
    @@ -1184,7 +1183,6 @@ int main() {
       BinopPrecedence['*'] = 40; // highest.

       // Prime the first token.
    -  fprintf(stderr, "ready> ");
       getNextToken();

Lastly we're going to disable all of the optimization passes and the JIT so
that the only thing that happens after we're done parsing and generating
code is that the LLVM IR goes to standard error:

.. code-block:: udiff

    @@ -1108,17 +1108,8 @@ static void HandleExtern() {
     static void HandleTopLevelExpression() {
       // Evaluate a top-level expression into an anonymous function.
       if (FunctionAST *F = ParseTopLevelExpr()) {
    -    if (Function *LF = F->Codegen()) {
    -      // We're just doing this to make sure it executes.
    -      TheExecutionEngine->finalizeObject();
    -      // JIT the function, returning a function pointer.
    -      void *FPtr = TheExecutionEngine->getPointerToFunction(LF);
    -
    -      // Cast it to the right type (takes no arguments, returns a double) so we
    -      // can call it as a native function.
    -      double (*FP)() = (double (*)())(intptr_t)FPtr;
    -      // Ignore the return value for this.
    -      (void)FP;
    +    if (!F->Codegen()) {
    +      fprintf(stderr, "Error generating code for top level expr");
         }
       } else {
         // Skip token for error recovery.
    @@ -1439,11 +1459,11 @@ int main() {
       // target lays out data structures.
       TheModule->setDataLayout(TheExecutionEngine->getDataLayout());
       OurFPM.add(new DataLayoutPass());
    +#if 0
       OurFPM.add(createBasicAliasAnalysisPass());
       // Promote allocas to registers.
       OurFPM.add(createPromoteMemoryToRegisterPass());
    @@ -1218,7 +1210,7 @@ int main() {
       OurFPM.add(createGVNPass());
       // Simplify the control flow graph (deleting unreachable blocks, etc).
       OurFPM.add(createCFGSimplificationPass());
    -
    +  #endif
       OurFPM.doInitialization();

       // Set the global so the code gen can use this.

This relatively small set of changes gets us to the point that we can compile
our piece of Kaleidoscope language down to an executable program via this
command line (``|&`` pipes standard error, where the IR is written, along
with standard output into clang):

.. code-block:: bash

    Kaleidoscope-Ch8 < fib.ks |& clang -x ir -

which gives an a.out/a.exe in the current working directory.

Compile Unit
============

The top level container for a section of code in DWARF is a compile unit.
This contains the type and function data for an individual translation unit
(read: one file of source code).
So the first thing we need to do is
construct one for our fib.ks file.

DWARF Emission Setup
====================

Similar to the ``IRBuilder`` class we have a
`DIBuilder <http://llvm.org/doxygen/classllvm_1_1DIBuilder.html>`_ class
that helps in constructing debug metadata for an LLVM IR file. It
corresponds to debug metadata 1:1, much as ``IRBuilder`` corresponds to
LLVM IR, but with nicer names.
Using it does require that you be more familiar with DWARF terminology than
you needed to be with ``IRBuilder`` and ``Instruction`` names, but if you
read through the general documentation on the
`Metadata Format <http://llvm.org/docs/SourceLevelDebugging.html>`_ it
should be a little more clear. We'll be using this class to construct all
of our IR level descriptions. Construction for it takes a module so we
need to construct it shortly after we construct our module. We've left it
as a global static variable to make it a bit easier to use.

Next we're going to create a small container to cache some of our frequent
data. The first will be our compile unit, but we'll also write a bit of
code for our one type since we won't have to worry about multiple typed
expressions:

.. code-block:: c++

    static DIBuilder *DBuilder;

    struct DebugInfo {
      DICompileUnit TheCU;
      DIType DblTy;

      DIType getDoubleTy();
    } KSDbgInfo;

    DIType DebugInfo::getDoubleTy() {
      if (DblTy.isValid())
        return DblTy;

      DblTy = DBuilder->createBasicType("double", 64, 64, dwarf::DW_ATE_float);
      return DblTy;
    }

And then later on in ``main`` when we're constructing our module:

.. code-block:: c++

    DBuilder = new DIBuilder(*TheModule);

    KSDbgInfo.TheCU = DBuilder->createCompileUnit(
        dwarf::DW_LANG_C, "fib.ks", ".", "Kaleidoscope Compiler", 0, "", 0);

There are a couple of things to note here. First, while we're producing a
compile unit for a language called Kaleidoscope we used the language
constant for C. This is because a debugger wouldn't necessarily understand
the calling conventions or default ABI for a language it doesn't recognize
and we follow the C ABI in our LLVM code generation so it's the closest
thing to accurate. This ensures we can actually call functions from the
debugger and have them execute.
Secondly, you'll see the "fib.ks" in the
call to ``createCompileUnit``. This is a default hard coded value since
we're using shell redirection to put our source into the Kaleidoscope
compiler. In a usual front end you'd have an input file name and it would
go there.

One last thing as part of emitting debug information via DIBuilder is that
we need to "finalize" the debug information. The reasons are part of the
underlying API for DIBuilder, but make sure you do this near the end of
main:

.. code-block:: c++

    DBuilder->finalize();

before you dump out the module.

Functions
=========

Now that we have our ``Compile Unit`` and our source locations, we can add
function definitions to the debug info. So in ``PrototypeAST::Codegen`` we
add a few lines of code to describe a context for our subprogram, in this
case the "File", and the actual definition of the function itself.

So the context:

.. code-block:: c++

    DIFile Unit = DBuilder->createFile(KSDbgInfo.TheCU.getFilename(),
                                       KSDbgInfo.TheCU.getDirectory());

giving us a DIFile and asking the ``Compile Unit`` we created above for the
directory and filename where we are currently. Then, for now, we use some
source locations of 0 (since our AST doesn't currently have source location
information) and construct our function definition:

.. code-block:: c++

    DIDescriptor FContext(Unit);
    unsigned LineNo = 0;
    unsigned ScopeLine = 0;
    DISubprogram SP = DBuilder->createFunction(
        FContext, Name, StringRef(), Unit, LineNo,
        CreateFunctionType(Args.size(), Unit), false /* internal linkage */,
        true /* definition */, ScopeLine, DIDescriptor::FlagPrototyped, false, F);

and we now have a DISubprogram that contains a reference to all of our metadata
for the function.

Source Locations
================

The most important thing for debug information is accurate source location -
this makes it possible to map your source code back. We have a problem though,
Kaleidoscope really doesn't have any source location information in the lexer
or parser so we'll need to add it.

.. code-block:: c++

    struct SourceLocation {
      int Line;
      int Col;
    };
    static SourceLocation CurLoc;
    static SourceLocation LexLoc = {1, 0};

    static int advance() {
      int LastChar = getchar();

      if (LastChar == '\n' || LastChar == '\r') {
        LexLoc.Line++;
        LexLoc.Col = 0;
      } else
        LexLoc.Col++;
      return LastChar;
    }

In this set of code we've added some functionality on how to keep track of the
line and column of the "source file". As we lex every token we set our
current "lexical location" to the associated line and column for the beginning
of the token. We do this by replacing all of the previous calls to
``getchar()`` with our new ``advance()`` that keeps track of the information
and then we have added to all of our AST classes a source location:

.. code-block:: c++

    class ExprAST {
      SourceLocation Loc;

      public:
        int getLine() const { return Loc.Line; }
        int getCol() const { return Loc.Col; }
        ExprAST(SourceLocation Loc = CurLoc) : Loc(Loc) {}
        virtual std::ostream &dump(std::ostream &out, int ind) {
          return out << ':' << getLine() << ':' << getCol() << '\n';
        }

that we pass down through when we create a new expression:

.. code-block:: c++

    LHS = new BinaryExprAST(BinLoc, BinOp, LHS, RHS);

giving us locations for each of our expressions and variables.

From this we can make sure to tell ``DIBuilder`` when we're at a new source
location so it can use that when we generate the rest of our code and make
sure that each instruction has source location information. We do this
by constructing another small function:

.. code-block:: c++

    void DebugInfo::emitLocation(ExprAST *AST) {
      // A null AST clears the current location (used for the function prologue).
      if (!AST)
        return Builder.SetCurrentDebugLocation(DebugLoc());
      DIScope *Scope;
      if (LexicalBlocks.empty())
        Scope = &TheCU;
      else
        Scope = LexicalBlocks.back();
      Builder.SetCurrentDebugLocation(
          DebugLoc::get(AST->getLine(), AST->getCol(), DIScope(*Scope)));
    }

that tells the main ``IRBuilder`` both where we are and what scope
we're in. Since we've just created a function above we can either be in
the main file scope (like when we created our function), or now we can be
in the function scope we just created. To represent this we create a stack
of scopes:

.. code-block:: c++

    std::vector<DIScope *> LexicalBlocks;
    std::map<const PrototypeAST *, DIScope> FnScopeMap;

and keep a map of each function to the scope that it represents (a DISubprogram
is also a DIScope).

Then we make sure to:

.. code-block:: c++

    KSDbgInfo.emitLocation(this);

emit the location every time we start to generate code for a new AST, and
also:

.. code-block:: c++

    KSDbgInfo.FnScopeMap[this] = SP;

store the scope (function) when we create it and use it:

.. code-block:: c++

    KSDbgInfo.LexicalBlocks.push_back(&KSDbgInfo.FnScopeMap[Proto]);

when we start generating the code for each function.

Also, don't forget to pop the scope back off of your scope stack at the
end of the code generation for the function:

.. code-block:: c++

    // Pop off the lexical block for the function since we added it
    // unconditionally.
    KSDbgInfo.LexicalBlocks.pop_back();

Variables
=========

Now that we have functions, we need to be able to print out the variables
we have in scope. Let's get our function arguments set up so we can get
decent backtraces and see how our functions are being called. It isn't
a lot of code, and we generally handle it when we're creating the
argument allocas in ``PrototypeAST::CreateArgumentAllocas``.

.. code-block:: c++

    DIScope *Scope = KSDbgInfo.LexicalBlocks.back();
    DIFile Unit = DBuilder->createFile(KSDbgInfo.TheCU.getFilename(),
                                       KSDbgInfo.TheCU.getDirectory());
    DIVariable D = DBuilder->createLocalVariable(dwarf::DW_TAG_arg_variable,
                                                 *Scope, Args[Idx], Unit, Line,
                                                 KSDbgInfo.getDoubleTy(), Idx);

    Instruction *Call = DBuilder->insertDeclare(
        Alloca, D, DBuilder->createExpression(), Builder.GetInsertBlock());
    Call->setDebugLoc(DebugLoc::get(Line, 0, *Scope));

Here we're doing a few things. First, we're grabbing our current scope
for the variable so we can say what range of code our variable is valid
through. Second, we're creating the variable, giving it the scope,
the name, source location, type, and since it's an argument, the argument
index. Third, we create an ``llvm.dbg.declare`` call to indicate at the IR
level that we've got a variable in an alloca (and it gives a starting
location for the variable). Lastly, we set a source location for the
beginning of the scope on the declare.

One interesting thing to note at this point is that various debuggers have
assumptions based on how code and debug information was generated for them
in the past.
In this case we need to do a little bit of a hack to avoid
generating line information for the function prologue so that the debugger
knows to skip over those instructions when setting a breakpoint. So in
``FunctionAST::Codegen`` we add a couple of lines:

.. code-block:: c++

    // Unset the location for the prologue emission (leading instructions with no
    // location in a function are considered part of the prologue and the debugger
    // will run past them when breaking on a function)
    KSDbgInfo.emitLocation(nullptr);

and then emit a new location when we actually start generating code for the
body of the function:

.. code-block:: c++

    KSDbgInfo.emitLocation(Body);

With this we have enough debug information to set breakpoints in functions,
print out argument variables, and call functions. Not too bad for just a
few simple lines of code!

Full Code Listing
=================

Here is the complete code listing for our running example, enhanced with
debug information. To build this example, use:

.. code-block:: bash

    # Compile
    clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy
    # Run
    ./toy

Here is the code:

.. literalinclude:: ../../examples/Kaleidoscope/Chapter8/toy.cpp
   :language: c++

`Next: Conclusion and other useful LLVM tidbits <LangImpl9.html>`_