xref: /llvm-project/mlir/docs/Tutorials/Toy/Ch-1.md (revision 444bb1f1bbbd7450fd30acc4ac1091fe916a740f)
13a799deeSJacques Pienaar# Chapter 1: Toy Language and AST
25b4a01d4SMehdi Amini
35b4a01d4SMehdi Amini[TOC]
45b4a01d4SMehdi Amini
55b4a01d4SMehdi Amini## The Language
65b4a01d4SMehdi Amini
75b4a01d4SMehdi AminiThis tutorial will be illustrated with a toy language that we’ll call “Toy”
85b4a01d4SMehdi Amini(naming is hard...). Toy is a tensor-based language that allows you to define
95b4a01d4SMehdi Aminifunctions, perform some math computation, and print results.
105b4a01d4SMehdi Amini
115b4a01d4SMehdi AminiGiven that we want to keep things simple, the codegen will be limited to tensors
125b4a01d4SMehdi Aminiof rank <= 2, and the only datatype in Toy is a 64-bit floating point type (aka
135b4a01d4SMehdi Amini‘double’ in C parlance). As such, all values are implicitly double precision,
145b4a01d4SMehdi Amini`Values` are immutable (i.e. every operation returns a newly allocated value),
155b4a01d4SMehdi Aminiand deallocation is automatically managed. But enough with the long description;
165b4a01d4SMehdi Amininothing is better than walking through an example to get a better understanding:
175b4a01d4SMehdi Amini
18430bba2aSJacques Pienaar```toy
195b4a01d4SMehdi Aminidef main() {
205b4a01d4SMehdi Amini  # Define a variable `a` with shape <2, 3>, initialized with the literal value.
215b4a01d4SMehdi Amini  # The shape is inferred from the supplied literal.
225b4a01d4SMehdi Amini  var a = [[1, 2, 3], [4, 5, 6]];
235b4a01d4SMehdi Amini
245b4a01d4SMehdi Amini  # b is identical to a, the literal tensor is implicitly reshaped: defining new
255b4a01d4SMehdi Amini  # variables is the way to reshape tensors (element count must match).
265b4a01d4SMehdi Amini  var b<2, 3> = [1, 2, 3, 4, 5, 6];
275b4a01d4SMehdi Amini
285b4a01d4SMehdi Amini  # transpose() and print() are the only builtin, the following will transpose
295b4a01d4SMehdi Amini  # a and b and perform an element-wise multiplication before printing the result.
305b4a01d4SMehdi Amini  print(transpose(a) * transpose(b));
315b4a01d4SMehdi Amini}
325b4a01d4SMehdi Amini```
335b4a01d4SMehdi Amini
345b4a01d4SMehdi AminiType checking is statically performed through type inference; the language only
355b4a01d4SMehdi Aminirequires type declarations to specify tensor shapes when needed. Functions are
365b4a01d4SMehdi Aminigeneric: their parameters are unranked (in other words, we know these are
375b4a01d4SMehdi Aminitensors, but we don't know their dimensions). They are specialized for every
385b4a01d4SMehdi Amininewly discovered signature at call sites. Let's revisit the previous example by
395b4a01d4SMehdi Aminiadding a user-defined function:
405b4a01d4SMehdi Amini
41430bba2aSJacques Pienaar```toy
425b4a01d4SMehdi Amini# User defined generic function that operates on unknown shaped arguments.
435b4a01d4SMehdi Aminidef multiply_transpose(a, b) {
445b4a01d4SMehdi Amini  return transpose(a) * transpose(b);
455b4a01d4SMehdi Amini}
465b4a01d4SMehdi Amini
475b4a01d4SMehdi Aminidef main() {
485b4a01d4SMehdi Amini  # Define a variable `a` with shape <2, 3>, initialized with the literal value.
495b4a01d4SMehdi Amini  var a = [[1, 2, 3], [4, 5, 6]];
505b4a01d4SMehdi Amini  var b<2, 3> = [1, 2, 3, 4, 5, 6];
515b4a01d4SMehdi Amini
525b4a01d4SMehdi Amini  # This call will specialize `multiply_transpose` with <2, 3> for both
535b4a01d4SMehdi Amini  # arguments and deduce a return type of <3, 2> in initialization of `c`.
545b4a01d4SMehdi Amini  var c = multiply_transpose(a, b);
555b4a01d4SMehdi Amini
565b4a01d4SMehdi Amini  # A second call to `multiply_transpose` with <2, 3> for both arguments will
575b4a01d4SMehdi Amini  # reuse the previously specialized and inferred version and return <3, 2>.
585b4a01d4SMehdi Amini  var d = multiply_transpose(b, a);
595b4a01d4SMehdi Amini
605b4a01d4SMehdi Amini  # A new call with <3, 2> (instead of <2, 3>) for both dimensions will
615b4a01d4SMehdi Amini  # trigger another specialization of `multiply_transpose`.
62bfedf169SThomas Preud'homme  var e = multiply_transpose(c, d);
635b4a01d4SMehdi Amini
64*444bb1f1SAndrey Portnoy  # Finally, calling into `multiply_transpose` with incompatible shapes
65*444bb1f1SAndrey Portnoy  # (<2, 3> and <3, 2>) will trigger a shape inference error.
66*444bb1f1SAndrey Portnoy  var f = multiply_transpose(a, c);
675b4a01d4SMehdi Amini}
685b4a01d4SMehdi Amini```
695b4a01d4SMehdi Amini
705b4a01d4SMehdi Amini## The AST
715b4a01d4SMehdi Amini
725b4a01d4SMehdi AminiThe AST from the above code is fairly straightforward; here is a dump of it:
735b4a01d4SMehdi Amini
745b4a01d4SMehdi Amini```
755b4a01d4SMehdi AminiModule:
765b4a01d4SMehdi Amini  Function
77*444bb1f1SAndrey Portnoy    Proto 'multiply_transpose' @test/Examples/Toy/Ch1/ast.toy:4:1
78495cf272SLucy Fox    Params: [a, b]
795b4a01d4SMehdi Amini    Block {
805b4a01d4SMehdi Amini      Return
81495cf272SLucy Fox        BinOp: * @test/Examples/Toy/Ch1/ast.toy:5:25
82495cf272SLucy Fox          Call 'transpose' [ @test/Examples/Toy/Ch1/ast.toy:5:10
83495cf272SLucy Fox            var: a @test/Examples/Toy/Ch1/ast.toy:5:20
845b4a01d4SMehdi Amini          ]
85495cf272SLucy Fox          Call 'transpose' [ @test/Examples/Toy/Ch1/ast.toy:5:25
86495cf272SLucy Fox            var: b @test/Examples/Toy/Ch1/ast.toy:5:35
875b4a01d4SMehdi Amini          ]
885b4a01d4SMehdi Amini    } // Block
895b4a01d4SMehdi Amini  Function
90*444bb1f1SAndrey Portnoy    Proto 'main' @test/Examples/Toy/Ch1/ast.toy:8:1
91495cf272SLucy Fox    Params: []
925b4a01d4SMehdi Amini    Block {
93495cf272SLucy Fox      VarDecl a<> @test/Examples/Toy/Ch1/ast.toy:11:3
94495cf272SLucy Fox        Literal: <2, 3>[ <3>[ 1.000000e+00, 2.000000e+00, 3.000000e+00], <3>[ 4.000000e+00, 5.000000e+00, 6.000000e+00]] @test/Examples/Toy/Ch1/ast.toy:11:11
95495cf272SLucy Fox      VarDecl b<2, 3> @test/Examples/Toy/Ch1/ast.toy:15:3
96495cf272SLucy Fox        Literal: <6>[ 1.000000e+00, 2.000000e+00, 3.000000e+00, 4.000000e+00, 5.000000e+00, 6.000000e+00] @test/Examples/Toy/Ch1/ast.toy:15:17
97495cf272SLucy Fox      VarDecl c<> @test/Examples/Toy/Ch1/ast.toy:19:3
98495cf272SLucy Fox        Call 'multiply_transpose' [ @test/Examples/Toy/Ch1/ast.toy:19:11
99495cf272SLucy Fox          var: a @test/Examples/Toy/Ch1/ast.toy:19:30
100495cf272SLucy Fox          var: b @test/Examples/Toy/Ch1/ast.toy:19:33
1015b4a01d4SMehdi Amini        ]
102495cf272SLucy Fox      VarDecl d<> @test/Examples/Toy/Ch1/ast.toy:22:3
103495cf272SLucy Fox        Call 'multiply_transpose' [ @test/Examples/Toy/Ch1/ast.toy:22:11
104495cf272SLucy Fox          var: b @test/Examples/Toy/Ch1/ast.toy:22:30
105495cf272SLucy Fox          var: a @test/Examples/Toy/Ch1/ast.toy:22:33
1065b4a01d4SMehdi Amini        ]
107495cf272SLucy Fox      VarDecl e<> @test/Examples/Toy/Ch1/ast.toy:25:3
108495cf272SLucy Fox        Call 'multiply_transpose' [ @test/Examples/Toy/Ch1/ast.toy:25:11
109bfedf169SThomas Preud'homme          var: c @test/Examples/Toy/Ch1/ast.toy:25:30
110bfedf169SThomas Preud'homme          var: d @test/Examples/Toy/Ch1/ast.toy:25:33
1115b4a01d4SMehdi Amini        ]
112495cf272SLucy Fox      VarDecl f<> @test/Examples/Toy/Ch1/ast.toy:28:3
113495cf272SLucy Fox        Call 'multiply_transpose' [ @test/Examples/Toy/Ch1/ast.toy:28:11
114*444bb1f1SAndrey Portnoy          var: a @test/Examples/Toy/Ch1/ast.toy:28:30
115*444bb1f1SAndrey Portnoy          var: c @test/Examples/Toy/Ch1/ast.toy:28:33
1165b4a01d4SMehdi Amini        ]
1175b4a01d4SMehdi Amini    } // Block
1185b4a01d4SMehdi Amini```
1195b4a01d4SMehdi Amini
1205b4a01d4SMehdi AminiYou can reproduce this result and play with the example in the
1215b4a01d4SMehdi Amini`examples/toy/Ch1/` directory; try running `path/to/BUILD/bin/toyc-ch1
1225b4a01d4SMehdi Aminitest/Examples/Toy/Ch1/ast.toy -emit=ast`.
1235b4a01d4SMehdi Amini
1245b4a01d4SMehdi AminiThe code for the lexer is fairly straightforward; it is all in a single header:
1255b4a01d4SMehdi Amini`examples/toy/Ch1/include/toy/Lexer.h`. The parser can be found in
1265b4a01d4SMehdi Amini`examples/toy/Ch1/include/toy/Parser.h`; it is a recursive descent parser. If
1275b4a01d4SMehdi Aminiyou are not familiar with such a Lexer/Parser, these are very similar to the
1285b4a01d4SMehdi AminiLLVM Kaleidoscope equivalent that are detailed in the first two chapters of the
1295b4a01d4SMehdi Amini[Kaleidoscope Tutorial](https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl02.html).
1305b4a01d4SMehdi Amini
1315b4a01d4SMehdi AminiThe [next chapter](Ch-2.md) will demonstrate how to convert this AST into MLIR.
132