13a799deeSJacques Pienaar# Chapter 1: Toy Language and AST 25b4a01d4SMehdi Amini 35b4a01d4SMehdi Amini[TOC] 45b4a01d4SMehdi Amini 55b4a01d4SMehdi Amini## The Language 65b4a01d4SMehdi Amini 75b4a01d4SMehdi AminiThis tutorial will be illustrated with a toy language that we’ll call “Toy” 85b4a01d4SMehdi Amini(naming is hard...). Toy is a tensor-based language that allows you to define 95b4a01d4SMehdi Aminifunctions, perform some math computation, and print results. 105b4a01d4SMehdi Amini 115b4a01d4SMehdi AminiGiven that we want to keep things simple, the codegen will be limited to tensors 125b4a01d4SMehdi Aminiof rank <= 2, and the only datatype in Toy is a 64-bit floating point type (aka 135b4a01d4SMehdi Amini‘double’ in C parlance). As such, all values are implicitly double precision, 145b4a01d4SMehdi Amini`Values` are immutable (i.e. every operation returns a newly allocated value), 155b4a01d4SMehdi Aminiand deallocation is automatically managed. But enough with the long description; 165b4a01d4SMehdi Amininothing is better than walking through an example to get a better understanding: 175b4a01d4SMehdi Amini 18430bba2aSJacques Pienaar```toy 195b4a01d4SMehdi Aminidef main() { 205b4a01d4SMehdi Amini # Define a variable `a` with shape <2, 3>, initialized with the literal value. 215b4a01d4SMehdi Amini # The shape is inferred from the supplied literal. 225b4a01d4SMehdi Amini var a = [[1, 2, 3], [4, 5, 6]]; 235b4a01d4SMehdi Amini 245b4a01d4SMehdi Amini # b is identical to a, the literal tensor is implicitly reshaped: defining new 255b4a01d4SMehdi Amini # variables is the way to reshape tensors (element count must match). 265b4a01d4SMehdi Amini var b<2, 3> = [1, 2, 3, 4, 5, 6]; 275b4a01d4SMehdi Amini 285b4a01d4SMehdi Amini # transpose() and print() are the only builtin, the following will transpose 295b4a01d4SMehdi Amini # a and b and perform an element-wise multiplication before printing the result. 305b4a01d4SMehdi Amini print(transpose(a) * transpose(b)); 315b4a01d4SMehdi Amini} 325b4a01d4SMehdi Amini``` 335b4a01d4SMehdi Amini 345b4a01d4SMehdi AminiType checking is statically performed through type inference; the language only 355b4a01d4SMehdi Aminirequires type declarations to specify tensor shapes when needed. Functions are 365b4a01d4SMehdi Aminigeneric: their parameters are unranked (in other words, we know these are 375b4a01d4SMehdi Aminitensors, but we don't know their dimensions). They are specialized for every 385b4a01d4SMehdi Amininewly discovered signature at call sites. Let's revisit the previous example by 395b4a01d4SMehdi Aminiadding a user-defined function: 405b4a01d4SMehdi Amini 41430bba2aSJacques Pienaar```toy 425b4a01d4SMehdi Amini# User defined generic function that operates on unknown shaped arguments. 435b4a01d4SMehdi Aminidef multiply_transpose(a, b) { 445b4a01d4SMehdi Amini return transpose(a) * transpose(b); 455b4a01d4SMehdi Amini} 465b4a01d4SMehdi Amini 475b4a01d4SMehdi Aminidef main() { 485b4a01d4SMehdi Amini # Define a variable `a` with shape <2, 3>, initialized with the literal value. 495b4a01d4SMehdi Amini var a = [[1, 2, 3], [4, 5, 6]]; 505b4a01d4SMehdi Amini var b<2, 3> = [1, 2, 3, 4, 5, 6]; 515b4a01d4SMehdi Amini 525b4a01d4SMehdi Amini # This call will specialize `multiply_transpose` with <2, 3> for both 535b4a01d4SMehdi Amini # arguments and deduce a return type of <3, 2> in initialization of `c`. 545b4a01d4SMehdi Amini var c = multiply_transpose(a, b); 555b4a01d4SMehdi Amini 565b4a01d4SMehdi Amini # A second call to `multiply_transpose` with <2, 3> for both arguments will 575b4a01d4SMehdi Amini # reuse the previously specialized and inferred version and return <3, 2>. 585b4a01d4SMehdi Amini var d = multiply_transpose(b, a); 595b4a01d4SMehdi Amini 605b4a01d4SMehdi Amini # A new call with <3, 2> (instead of <2, 3>) for both dimensions will 615b4a01d4SMehdi Amini # trigger another specialization of `multiply_transpose`. 62bfedf169SThomas Preud'homme var e = multiply_transpose(c, d); 635b4a01d4SMehdi Amini 64*444bb1f1SAndrey Portnoy # Finally, calling into `multiply_transpose` with incompatible shapes 65*444bb1f1SAndrey Portnoy # (<2, 3> and <3, 2>) will trigger a shape inference error. 66*444bb1f1SAndrey Portnoy var f = multiply_transpose(a, c); 675b4a01d4SMehdi Amini} 685b4a01d4SMehdi Amini``` 695b4a01d4SMehdi Amini 705b4a01d4SMehdi Amini## The AST 715b4a01d4SMehdi Amini 725b4a01d4SMehdi AminiThe AST from the above code is fairly straightforward; here is a dump of it: 735b4a01d4SMehdi Amini 745b4a01d4SMehdi Amini``` 755b4a01d4SMehdi AminiModule: 765b4a01d4SMehdi Amini Function 77*444bb1f1SAndrey Portnoy Proto 'multiply_transpose' @test/Examples/Toy/Ch1/ast.toy:4:1 78495cf272SLucy Fox Params: [a, b] 795b4a01d4SMehdi Amini Block { 805b4a01d4SMehdi Amini Return 81495cf272SLucy Fox BinOp: * @test/Examples/Toy/Ch1/ast.toy:5:25 82495cf272SLucy Fox Call 'transpose' [ @test/Examples/Toy/Ch1/ast.toy:5:10 83495cf272SLucy Fox var: a @test/Examples/Toy/Ch1/ast.toy:5:20 845b4a01d4SMehdi Amini ] 85495cf272SLucy Fox Call 'transpose' [ @test/Examples/Toy/Ch1/ast.toy:5:25 86495cf272SLucy Fox var: b @test/Examples/Toy/Ch1/ast.toy:5:35 875b4a01d4SMehdi Amini ] 885b4a01d4SMehdi Amini } // Block 895b4a01d4SMehdi Amini Function 90*444bb1f1SAndrey Portnoy Proto 'main' @test/Examples/Toy/Ch1/ast.toy:8:1 91495cf272SLucy Fox Params: [] 925b4a01d4SMehdi Amini Block { 93495cf272SLucy Fox VarDecl a<> @test/Examples/Toy/Ch1/ast.toy:11:3 94495cf272SLucy Fox Literal: <2, 3>[ <3>[ 1.000000e+00, 2.000000e+00, 3.000000e+00], <3>[ 4.000000e+00, 5.000000e+00, 6.000000e+00]] @test/Examples/Toy/Ch1/ast.toy:11:11 95495cf272SLucy Fox VarDecl b<2, 3> @test/Examples/Toy/Ch1/ast.toy:15:3 96495cf272SLucy Fox Literal: <6>[ 1.000000e+00, 2.000000e+00, 3.000000e+00, 4.000000e+00, 5.000000e+00, 6.000000e+00] @test/Examples/Toy/Ch1/ast.toy:15:17 97495cf272SLucy Fox VarDecl c<> @test/Examples/Toy/Ch1/ast.toy:19:3 98495cf272SLucy Fox Call 'multiply_transpose' [ @test/Examples/Toy/Ch1/ast.toy:19:11 99495cf272SLucy Fox var: a @test/Examples/Toy/Ch1/ast.toy:19:30 100495cf272SLucy Fox var: b @test/Examples/Toy/Ch1/ast.toy:19:33 1015b4a01d4SMehdi Amini ] 102495cf272SLucy Fox VarDecl d<> @test/Examples/Toy/Ch1/ast.toy:22:3 103495cf272SLucy Fox Call 'multiply_transpose' [ @test/Examples/Toy/Ch1/ast.toy:22:11 104495cf272SLucy Fox var: b @test/Examples/Toy/Ch1/ast.toy:22:30 105495cf272SLucy Fox var: a @test/Examples/Toy/Ch1/ast.toy:22:33 1065b4a01d4SMehdi Amini ] 107495cf272SLucy Fox VarDecl e<> @test/Examples/Toy/Ch1/ast.toy:25:3 108495cf272SLucy Fox Call 'multiply_transpose' [ @test/Examples/Toy/Ch1/ast.toy:25:11 109bfedf169SThomas Preud'homme var: c @test/Examples/Toy/Ch1/ast.toy:25:30 110bfedf169SThomas Preud'homme var: d @test/Examples/Toy/Ch1/ast.toy:25:33 1115b4a01d4SMehdi Amini ] 112495cf272SLucy Fox VarDecl f<> @test/Examples/Toy/Ch1/ast.toy:28:3 113495cf272SLucy Fox Call 'multiply_transpose' [ @test/Examples/Toy/Ch1/ast.toy:28:11 114*444bb1f1SAndrey Portnoy var: a @test/Examples/Toy/Ch1/ast.toy:28:30 115*444bb1f1SAndrey Portnoy var: c @test/Examples/Toy/Ch1/ast.toy:28:33 1165b4a01d4SMehdi Amini ] 1175b4a01d4SMehdi Amini } // Block 1185b4a01d4SMehdi Amini``` 1195b4a01d4SMehdi Amini 1205b4a01d4SMehdi AminiYou can reproduce this result and play with the example in the 1215b4a01d4SMehdi Amini`examples/toy/Ch1/` directory; try running `path/to/BUILD/bin/toyc-ch1 1225b4a01d4SMehdi Aminitest/Examples/Toy/Ch1/ast.toy -emit=ast`. 1235b4a01d4SMehdi Amini 1245b4a01d4SMehdi AminiThe code for the lexer is fairly straightforward; it is all in a single header: 1255b4a01d4SMehdi Amini`examples/toy/Ch1/include/toy/Lexer.h`. The parser can be found in 1265b4a01d4SMehdi Amini`examples/toy/Ch1/include/toy/Parser.h`; it is a recursive descent parser. If 1275b4a01d4SMehdi Aminiyou are not familiar with such a Lexer/Parser, these are very similar to the 1285b4a01d4SMehdi AminiLLVM Kaleidoscope equivalent that are detailed in the first two chapters of the 1295b4a01d4SMehdi Amini[Kaleidoscope Tutorial](https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl02.html). 1305b4a01d4SMehdi Amini 1315b4a01d4SMehdi AminiThe [next chapter](Ch-2.md) will demonstrate how to convert this AST into MLIR. 132