.. Copyright (C) 2015-2020 Free Software Foundation, Inc.
   Originally contributed by David Malcolm <dmalcolm@redhat.com>

   This is free software: you can redistribute it and/or modify it
   under the terms of the GNU General Public License as published by
   the Free Software Foundation, either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful, but
   WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see
   <http://www.gnu.org/licenses/>.

Tutorial part 5: Implementing an Ahead-of-Time compiler
-------------------------------------------------------

If you have a pre-existing language frontend that's compatible with
libgccjit's license, it's possible to hook it up to libgccjit as a
backend.  In the previous example we showed
how to do that for in-memory JIT-compilation, but libgccjit can also
compile code directly to a file, allowing you to implement a more
traditional ahead-of-time compiler ("JIT" is something of a misnomer
for this use-case).

The essential difference is to compile the context using
:c:func:`gcc_jit_context_compile_to_file` rather than
:c:func:`gcc_jit_context_compile`.
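
As a minimal sketch of that difference, the following builds a context
containing only a ``main`` that returns 0 and writes it out as an
executable.  The helper name ``emit_trivial_executable`` is a hypothetical
name chosen for illustration and is not part of the tutorial's example
code; the sketch assumes libgccjit is installed and that the program is
linked with ``-lgccjit``.

.. code-block:: c

    #include <stddef.h>
    #include <libgccjit.h>

    /* Hypothetical helper: emit an executable at OUTPUT_PATH whose
       main () just returns 0.  Returns 0 on success, or -1 if libgccjit
       reported an error.  */
    int
    emit_trivial_executable (const char *output_path)
    {
      gcc_jit_context *ctxt = gcc_jit_context_acquire ();
      if (!ctxt)
        return -1;

      gcc_jit_type *int_type
        = gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_INT);

      /* An exported function named "main" is what the linker needs
         in order to produce an executable.  */
      gcc_jit_function *func
        = gcc_jit_context_new_function (ctxt, NULL,
                                        GCC_JIT_FUNCTION_EXPORTED,
                                        int_type, "main",
                                        0, NULL, 0);
      gcc_jit_block *block = gcc_jit_function_new_block (func, "entry");
      gcc_jit_block_end_with_return (block, NULL,
                                     gcc_jit_context_zero (ctxt, int_type));

      /* In-memory JIT-compilation would call
         gcc_jit_context_compile (ctxt) at this point; ahead-of-time
         compilation writes a file instead.  */
      gcc_jit_context_compile_to_file (ctxt,
                                       GCC_JIT_OUTPUT_KIND_EXECUTABLE,
                                       output_path);

      /* compile_to_file returns void, so errors are read back from
         the context.  */
      int result = gcc_jit_context_get_first_error (ctxt) ? -1 : 0;
      gcc_jit_context_release (ctxt);
      return result;
    }

Calling ``emit_trivial_executable ("a.out")`` should leave a runnable
``a.out`` in the current directory, provided a working toolchain is
available for the link step.
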

The "brainf" language
*********************

In this example we use libgccjit to construct an ahead-of-time compiler
for an esoteric programming language that we shall refer to as "brainf".

brainf scripts operate on an array of bytes, with a notional data pointer
within the array.

brainf is hard for humans to read, but it's trivial to write a parser for
it, as there is no lexing; just a stream of bytes.  The operations are:

====================== =============================
Character              Meaning
====================== =============================
``>``                  ``idx += 1``
``<``                  ``idx -= 1``
``+``                  ``data[idx] += 1``
``-``                  ``data[idx] -= 1``
``.``                  ``output (data[idx])``
``,``                  ``data[idx] = input ()``
``[``                  loop until ``data[idx] == 0``
``]``                  end of loop
Anything else          ignored
====================== =============================

Unlike the previous example, we'll implement an ahead-of-time compiler,
which reads ``.bf`` scripts and outputs executables (though it would
be trivial to have it run them JIT-compiled in-process).

Here's what a simple ``.bf`` script looks like:

   .. literalinclude:: ../examples/emit-alphabet.bf
    :lines: 1-

.. note::

   This example makes use of whitespace and comments for legibility, but
   could have been written as::

     ++++++++++++++++++++++++++
     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
     [>.+<-]

   It's not a particularly useful language, except for providing
   compiler-writers with a test case that's easy to parse.  The point
   is that you can use :c:func:`gcc_jit_context_compile_to_file`
   to use libgccjit as a backend for a pre-existing language frontend
   (provided that the pre-existing frontend is compatible with libgccjit's
   license).

Converting a brainf script to libgccjit IR
******************************************

As before we write simple code to populate a :c:type:`gcc_jit_context *`.

   .. literalinclude:: ../examples/tut05-bf.c
    :start-after: #define MAX_OPEN_PARENS 16
    :end-before: /* Entrypoint to the compiler. */
    :language: c

Compiling a context to a file
*****************************

Unlike the previous tutorial, this time we'll compile the context
directly to an executable, using :c:func:`gcc_jit_context_compile_to_file`:

.. code-block:: c

    gcc_jit_context_compile_to_file (ctxt,
                                     GCC_JIT_OUTPUT_KIND_EXECUTABLE,
                                     output_file);

Here's the top-level of the compiler, which is what actually calls into
:c:func:`gcc_jit_context_compile_to_file`:

   .. literalinclude:: ../examples/tut05-bf.c
    :start-after: /* Entrypoint to the compiler. */
    :end-before: /* Use the built compiler to compile the example to an executable:
    :language: c

Note how once the context is populated you could trivially instead compile
it to memory using :c:func:`gcc_jit_context_compile` and run it in-process
as in the previous tutorial.

To create an executable, we need to export a ``main`` function.  Here's
how to create one from the JIT API:

   .. literalinclude:: ../examples/tut05-bf.c
    :start-after: #include "libgccjit.h"
    :end-before: #define MAX_OPEN_PARENS 16
    :language: c

.. note::

   The above implementation ignores ``argc`` and ``argv``, but you could
   make use of them by exposing ``param_argc`` and ``param_argv`` to the
   caller.

Upon compiling this C code, we obtain a bf-to-machine-code compiler;
let's call it ``bfc``:

.. code-block:: console

  $ gcc \
      tut05-bf.c \
      -o bfc \
      -lgccjit

We can now use ``bfc`` to compile .bf files into machine code executables:

.. code-block:: console

  $ ./bfc \
       emit-alphabet.bf \
       a.out

which we can run directly:

.. code-block:: console

  $ ./a.out
  ABCDEFGHIJKLMNOPQRSTUVWXYZ

Success!

We can also inspect the generated executable using standard tools:

.. code-block:: console

  $ objdump -d a.out |less

which shows that libgccjit has managed to optimize the function
somewhat (for example, the runs of 26 and 65 increment operations
have become integer constants 0x1a and 0x41):

.. code-block:: console

  0000000000400620 <main>:
    400620:     80 3d 39 0a 20 00 00    cmpb   $0x0,0x200a39(%rip)        # 601060 <data
    400627:     74 07                   je     400630 <main
    400629:     eb fe                   jmp    400629 <main+0x9>
    40062b:     0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
    400630:     48 83 ec 08             sub    $0x8,%rsp
    400634:     0f b6 05 26 0a 20 00    movzbl 0x200a26(%rip),%eax        # 601061 <data_cells+0x1>
    40063b:     c6 05 1e 0a 20 00 1a    movb   $0x1a,0x200a1e(%rip)       # 601060 <data_cells>
    400642:     8d 78 41                lea    0x41(%rax),%edi
    400645:     40 88 3d 15 0a 20 00    mov    %dil,0x200a15(%rip)        # 601061 <data_cells+0x1>
    40064c:     0f 1f 40 00             nopl   0x0(%rax)
    400650:     40 0f b6 ff             movzbl %dil,%edi
    400654:     e8 87 fe ff ff          callq  4004e0 <putchar@plt>
    400659:     0f b6 05 01 0a 20 00    movzbl 0x200a01(%rip),%eax        # 601061 <data_cells+0x1>
    400660:     80 2d f9 09 20 00 01    subb   $0x1,0x2009f9(%rip)        # 601060 <data_cells>
    400667:     8d 78 01                lea    0x1(%rax),%edi
    40066a:     40 88 3d f0 09 20 00    mov    %dil,0x2009f0(%rip)        # 601061 <data_cells+0x1>
    400671:     75 dd                   jne    400650 <main+0x30>
    400673:     31 c0                   xor    %eax,%eax
    400675:     48 83 c4 08             add    $0x8,%rsp
    400679:     c3                      retq
    40067a:     66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)

We also set up debugging information (via
:c:func:`gcc_jit_context_new_location` and
:c:macro:`GCC_JIT_BOOL_OPTION_DEBUGINFO`),
so it's possible to use ``gdb``
to singlestep through the generated binary and inspect the internal
state ``idx`` and ``data_cells``:

.. code-block:: console

  (gdb) break main
  Breakpoint 1 at 0x400790
  (gdb) run
  Starting program: a.out

  Breakpoint 1, 0x0000000000400790 in main (argc=1, argv=0x7fffffffe448)
  (gdb) stepi
  0x0000000000400797 in main (argc=1, argv=0x7fffffffe448)
  (gdb) stepi
  0x00000000004007a0 in main (argc=1, argv=0x7fffffffe448)
  (gdb) stepi
  9     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
  (gdb) list
  4
  5     cell 0 = 26
  6     ++++++++++++++++++++++++++
  7
  8     cell 1 = 65
  9     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
  10
  11    while cell#0 != 0
  12    [
  13      >
  (gdb) n
  6     ++++++++++++++++++++++++++
  (gdb) n
  9     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
  (gdb) p idx
  $1 = 1
  (gdb) p data_cells
  $2 = "\032", '\000' <repeats 29998 times>
  (gdb) p data_cells[0]
  $3 = 26 '\032'
  (gdb) p data_cells[1]
  $4 = 0 '\000'
  (gdb) list
  4
  5     cell 0 = 26
  6     ++++++++++++++++++++++++++
  7
  8     cell 1 = 65
  9     >+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<
  10
  11    while cell#0 != 0
  12    [
  13      >

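
The two debugging hooks named above can be sketched in isolation as
follows.  This is an illustration under stated assumptions rather than
the code from ``tut05-bf.c``: the helper name ``emit_with_debuginfo`` is
hypothetical, and it attaches one fixed location (line 1, column 1 of
the given source path) to everything it generates, whereas a real
frontend would create a location for each construct it translates.

.. code-block:: c

    #include <stddef.h>
    #include <libgccjit.h>

    /* Hypothetical helper: emit an executable at OUTPUT_PATH with
       debugging information that attributes main () to line 1,
       column 1 of SOURCE_PATH.  Returns 0 on success, -1 on error.  */
    int
    emit_with_debuginfo (const char *source_path, const char *output_path)
    {
      gcc_jit_context *ctxt = gcc_jit_context_acquire ();
      if (!ctxt)
        return -1;

      /* Ask for debugging information to be generated.  */
      gcc_jit_context_set_bool_option (ctxt,
                                       GCC_JIT_BOOL_OPTION_DEBUGINFO,
                                       1);

      /* A location ties generated code back to a file, line and
         column in the frontend's source language.  */
      gcc_jit_location *loc
        = gcc_jit_context_new_location (ctxt, source_path, 1, 1);

      gcc_jit_type *int_type
        = gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_INT);
      gcc_jit_function *func
        = gcc_jit_context_new_function (ctxt, loc,
                                        GCC_JIT_FUNCTION_EXPORTED,
                                        int_type, "main",
                                        0, NULL, 0);
      gcc_jit_block *block = gcc_jit_function_new_block (func, "entry");

      /* Passing LOC here is what lets a debugger map this statement
         back to SOURCE_PATH.  */
      gcc_jit_block_end_with_return (block, loc,
                                     gcc_jit_context_zero (ctxt, int_type));

      gcc_jit_context_compile_to_file (ctxt,
                                       GCC_JIT_OUTPUT_KIND_EXECUTABLE,
                                       output_path);

      int result = gcc_jit_context_get_first_error (ctxt) ? -1 : 0;
      gcc_jit_context_release (ctxt);
      return result;
    }

The source path does not need to exist for compilation to succeed, but
``gdb`` can only list source lines if it can find the file.
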

Other forms of ahead-of-time-compilation
****************************************

The above demonstrates compiling a :c:type:`gcc_jit_context *` directly
to an executable.  It's also possible to compile it to an object file,
and to a dynamic library.  See the documentation of
:c:func:`gcc_jit_context_compile_to_file` for more information.
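
As a rough sketch of those other output kinds, the following writes one
context out both as an object file and as a dynamic library.  The helper
name ``emit_object_and_library``, the exported function ``answer`` and
the ``.o``/``.so`` suffixes are assumptions made for illustration; the
sketch also assumes that a context may be compiled more than once.

.. code-block:: c

    #include <stdio.h>
    #include <libgccjit.h>

    /* Hypothetical helper: build a context exporting
       "int answer (void)", which returns 42, and write it to STEM.o
       and STEM.so.  Returns 0 on success, -1 on error.  */
    int
    emit_object_and_library (const char *stem)
    {
      char path[4096];

      gcc_jit_context *ctxt = gcc_jit_context_acquire ();
      if (!ctxt)
        return -1;

      gcc_jit_type *int_type
        = gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_INT);

      /* No "main" is needed for an object file or a library; any
         exported function will do.  */
      gcc_jit_function *func
        = gcc_jit_context_new_function (ctxt, NULL,
                                        GCC_JIT_FUNCTION_EXPORTED,
                                        int_type, "answer",
                                        0, NULL, 0);
      gcc_jit_block *block = gcc_jit_function_new_block (func, "entry");
      gcc_jit_block_end_with_return
        (block, NULL,
         gcc_jit_context_new_rvalue_from_int (ctxt, int_type, 42));

      /* Object file, suitable for linking into a larger program.  */
      snprintf (path, sizeof path, "%s.o", stem);
      gcc_jit_context_compile_to_file (ctxt,
                                       GCC_JIT_OUTPUT_KIND_OBJECT_FILE,
                                       path);

      /* Dynamic library, loadable at run time.  */
      snprintf (path, sizeof path, "%s.so", stem);
      gcc_jit_context_compile_to_file (ctxt,
                                       GCC_JIT_OUTPUT_KIND_DYNAMIC_LIBRARY,
                                       path);

      int result = gcc_jit_context_get_first_error (ctxt) ? -1 : 0;
      gcc_jit_context_release (ctxt);
      return result;
    }

The resulting object file could then be linked into a conventional C
program that declares ``int answer (void);``.
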