1\input texinfo  @c -*-texinfo-*-
2
3@c NOTE THIS IS NOT A GOOD EXAMPLE OF HOW TO DO A MANUAL. FIXME!!!
4@c NOTE THIS IS NOT A GOOD EXAMPLE OF HOW TO DO A MANUAL. FIXME!!!
5@c NOTE THIS IS NOT A GOOD EXAMPLE OF HOW TO DO A MANUAL. FIXME!!!
6
7
8@c %**start of header
9@setfilename treelang.info
10
11@include gcc-common.texi
12
13@set version-treelang 1.0
14
15@set last-update 2001-07-30
16@set copyrights-treelang 1995,1996,1997,1998,1999,2000,2001,2002
17
18@set email-general gcc@@gcc.gnu.org
19@set email-bugs gcc-bugs@@gcc.gnu.org or bug-gcc@@gnu.org
20@set email-patches gcc-patches@@gcc.gnu.org
21@set path-treelang gcc/gcc/treelang
22
23@set which-treelang GCC-@value{version-GCC}
24@set which-GCC GCC
25
26@set email-josling tej@@melbpc.org.au
27@set www-josling http://www.geocities.com/timjosling
28
29@c This tells @include'd files that they're part of the overall TREELANG doc
30@c set.  (They might be part of a higher-level doc set too.)
31@set DOC-TREELANG
32
33@c @setfilename usetreelang.info
34@c @setfilename maintaintreelang.info
35@c To produce the full manual, use the "treelang.info" setfilename, and
36@c make sure the following do NOT begin with '@c' (and the @clear lines DO)
37@set INTERNALS
38@set USING
39@c To produce a user-only manual, use the "usetreelang.info" setfilename, and
40@c make sure the following does NOT begin with '@c':
41@c @clear INTERNALS
42@c To produce a maintainer-only manual, use the "maintaintreelang.info" setfilename,
43@c and make sure the following does NOT begin with '@c':
44@c @clear USING
45
46@ifset INTERNALS
47@ifset USING
48@settitle Using and Maintaining GNU Treelang
49@end ifset
50@end ifset
51@c seems reasonable to assume at least one of INTERNALS or USING is set...
52@ifclear INTERNALS
53@settitle Using GNU Treelang
54@end ifclear
55@ifclear USING
56@settitle Maintaining GNU Treelang
57@end ifclear
58@c then again, have some fun
59@ifclear INTERNALS
60@ifclear USING
61@settitle Doing Very Little at all with GNU Treelang
62@end ifclear
63@end ifclear
64
65@syncodeindex fn cp
66@syncodeindex vr cp
67@c %**end of header
68
69@c Cause even numbered pages to be printed on the left hand side of
70@c the page and odd numbered pages to be printed on the right hand
71@c side of the page.  Using this, you can print on both sides of a
72@c sheet of paper and have the text on the same part of the sheet.
73
74@c The text on right hand pages is pushed towards the right hand
75@c margin and the text on left hand pages is pushed toward the left
76@c hand margin.
77@c (To provide the reverse effect, set bindingoffset to -0.75in.)
78
79@c @tex
80@c \global\bindingoffset=0.75in
81@c \global\normaloffset =0.75in
82@c @end tex
83
84@copying
85Copyright @copyright{} @value{copyrights-treelang} Free Software Foundation, Inc.
86
87Permission is granted to copy, distribute and/or modify this document
88under the terms of the GNU Free Documentation License, Version 1.2 or
89any later version published by the Free Software Foundation; with the
90Invariant Sections being ``GNU General Public License'', the Front-Cover
91texts being (a) (see below), and with the Back-Cover Texts being (b)
92(see below).  A copy of the license is included in the section entitled
93``GNU Free Documentation License''.
94
95(a) The FSF's Front-Cover Text is:
96
97     A GNU Manual
98
99(b) The FSF's Back-Cover Text is:
100
101     You have freedom to copy and modify this GNU Manual, like GNU
102     software.  Copies published by the Free Software Foundation raise
103     funds for GNU development.
104@end copying
105
106@ifnottex
107@dircategory Programming
108@direntry
109* treelang: (treelang).                  The GNU Treelang compiler.
110@end direntry
111@ifset INTERNALS
112@ifset USING
113This file documents the use and the internals of the GNU Treelang
114(@code{treelang}) compiler. At the moment this manual is not
115incorporated into the main GCC manual as it is too incomplete. It
116corresponds to the @value{which-treelang} version of @code{treelang}.
117@end ifset
118@end ifset
119@ifclear USING
120This file documents the internals of the GNU Treelang (@code{treelang}) compiler.
121It corresponds to the @value{which-treelang} version of @code{treelang}.
122@end ifclear
123@ifclear INTERNALS
124This file documents the use of the GNU Treelang (@code{treelang}) compiler.
125It corresponds to the @value{which-treelang} version of @code{treelang}.
126@end ifclear
127
128Published by the Free Software Foundation
12959 Temple Place - Suite 330
130Boston, MA 02111-1307 USA
131
132@insertcopying
133@end ifnottex
134
Treelang was contributed by Tim Josling (@email{@value{email-josling}}).
It was inspired by and based on the 'toy' language written by Richard Kenner.
137
138This document was written by Tim Josling, based on the GNU C++
139documentation.
140
141@setchapternewpage odd
142@c @finalout
143@titlepage
144@ifset INTERNALS
145@ifset USING
146@center @titlefont{Using and Maintaining GNU Treelang}
147
148@end ifset
149@end ifset
150@ifclear INTERNALS
151@title Using GNU Treelang
152@end ifclear
153@ifclear USING
154@title Maintaining GNU Treelang
155@end ifclear
156@sp 2
157@center Tim Josling
158@sp 3
159@center Last updated @value{last-update}
160@sp 1
161@center for version @value{version-treelang}
162@page
163@vskip 0pt plus 1filll
164For the @value{which-treelang} Version*
165@sp 1
166Published by the Free Software Foundation @*
16759 Temple Place - Suite 330@*
168Boston, MA 02111-1307, USA@*
169@c Last printed ??ber, 19??.@*
170@c Printed copies are available for $? each.@*
171@c ISBN ???
172@sp 1
173@insertcopying
174@end titlepage
175@page
176
177@ifnottex
178
179@node Top, Copying,, (dir)
180@top Introduction
181@cindex Introduction
182
183@ifset INTERNALS
184@ifset USING
185This manual documents how to run, install and maintain @code{treelang},
186as well as its new features and incompatibilities,
187and how to report bugs.
188It corresponds to the @value{which-treelang} version of @code{treelang}.
189@end ifset
190@end ifset
191
192@ifclear INTERNALS
193This manual documents how to run and install @code{treelang},
194as well as its new features and incompatibilities, and how to report
195bugs.
196It corresponds to the @value{which-treelang} version of @code{treelang}.
197@end ifclear
198@ifclear USING
199This manual documents how to maintain @code{treelang}, as well as its
200new features and incompatibilities, and how to report bugs.  It
201corresponds to the @value{which-treelang} version of @code{treelang}.
202@end ifclear
203
204@end ifnottex
205
206@ifset DEVELOPMENT
207@emph{Warning:} This document is still under development, and might not
208accurately reflect the @code{treelang} code base of which it is a part.
209@end ifset
210
211@menu
212* Copying::
213* Contributors::
214* GNU Free Documentation License::
215* Funding::
216* Getting Started::
217* What is GNU Treelang?::
218* Lexical Syntax::
219* Parsing Syntax::
220* Compiler Overview::
221* TREELANG and GCC::
222* Compiler::
223* Other Languages::
224* treelang internals::
225* Open Questions::
226* Bugs::
227* Service::
228* Projects::
229* Index::
230
231@detailmenu
232 --- The Detailed Node Listing ---
233
234Other Languages
235
236* Interoperating with C and C++::
237
238treelang internals
239
240* treelang files::
241* treelang compiler interfaces::
242* Hints and tips::
243
244treelang compiler interfaces
245
246* treelang driver::
247* treelang main compiler::
248
249treelang main compiler
250
251* Interfacing to toplev.c::
252* Interfacing to the garbage collection::
253* Interfacing to the code generation code. ::
254
255Reporting Bugs
256
257* Sending Patches::
258
259@end detailmenu
260@end menu
261
262@include gpl.texi
263
264@include fdl.texi
265
266@node Contributors
267
268@unnumbered Contributors to GNU Treelang
269@cindex contributors
270@cindex credits
271
272Treelang was based on 'toy' by Richard Kenner, and also uses code from
273the GCC core code tree. Tim Josling first created the language and
274documentation, based on the GCC Fortran compiler's documentation
275framework.
276
277@itemize @bullet
278@item
279The packaging and compiler portions of GNU Treelang are based largely
280on the GCC compiler.
281@xref{Contributors,,Contributors to GCC,GCC,Using and Maintaining GCC},
282for more information.
283
284@item
285There is no specific run-time library for treelang, other than the
286standard C runtime.
287
288@item
289It would have been difficult to build treelang without access to Joachim
290Nadler's guide to writing a front end to GCC (written in German). A
291translation of this document into English is available via the
292CobolForGCC project or via the documentation links from the GCC home
293page @uref{http://GCC.gnu.org}.
294@end itemize
295
296@include funding.texi
297
298@node Getting Started
299@chapter Getting Started
300@cindex getting started
301@cindex new users
302@cindex newbies
303@cindex beginners
304
305Treelang is a sample language, useful only to help people understand how
306to implement a new language front end to GCC. It is not a useful
307language in itself other than as an example or basis for building a new
308language. Therefore only language developers are likely to have an
309interest in it.
310
311This manual assumes familiarity with GCC, which you can obtain by using
312it and by reading the manual @samp{Using and Porting GCC}.
313
314To install treelang, follow the GCC installation instructions,
315taking care to ensure you specify treelang in the configure step.
316
317If you're generally curious about the future of
318@code{treelang}, see @ref{Projects}.
319If you're curious about its past,
320see @ref{Contributors}.
321
322To see a few of the questions maintainers of @code{treelang} have,
323and that you might be able to answer,
324see @ref{Open Questions}.
325
326@ifset USING
327@node What is GNU Treelang?, Lexical Syntax, Getting Started, Top
328@chapter What is GNU Treelang?
329@cindex concepts, basic
330@cindex basic concepts
331
GNU Treelang, or @code{treelang}, was designed initially as a free
replacement for, or alternative to, the 'toy' language, in a form that
is amenable to inclusion within the GCC source tree.
335
@code{treelang} is largely a cut-down version of C, designed to showcase
the features of the GCC code generation back end. Only features that the
back end supports directly are implemented, and each is implemented in
whichever way is easiest and clearest. So far, not even most of the back
end features are implemented. The intention is to add features
incrementally until most features of the GCC back end are available in
treelang.
343
344The main features missing are structures, arrays and pointers.
345
346A sample program follows:
347
@example
// function prototypes
// function 'add' taking two ints and returning an int
external_definition int add(int arg1, int arg2);
external_definition int subtract(int arg3, int arg4);
external_definition int first_nonzero(int arg5, int arg6);
external_definition int double_plus_one(int arg7);

// function definition
add
@{
// return the sum of arg1 and arg2
  return arg1 + arg2;
@}


subtract
@{
  return arg3 - arg4;
@}

double_plus_one
@{
// aaa is a variable, of type integer and allocated at the start of the function
  automatic int aaa;
// set aaa to the value returned from add, when passed arg7 and arg7 as the two parameters
  aaa=add(arg7, arg7);
  aaa=add(aaa, aaa);
  aaa=subtract(subtract(aaa, arg7), arg7) + 1;
  return aaa;
@}

first_nonzero
@{
// C-like if statement
  if (arg5)
    @{
      return arg5;
    @}
  else
    @{
    @}
  return arg6;
@}
@end example
393
394@node Lexical Syntax, Parsing Syntax, What is GNU Treelang?, Top
395@chapter Lexical Syntax
396@cindex Lexical Syntax
397
398Treelang programs consist of whitespace, comments, keywords and names.
399@itemize @bullet
400
401@item
402Whitespace consists of the space character and the end of line
403character. Tabs are not allowed. Line terminations are as defined by the
404standard C library. Whitespace is ignored except within comments,
405and where it separates parts of the program. In the example below, A and
406B are two separate names separated by whitespace.
407
408@smallexample
409A B
410@end smallexample
411
412@item
413Comments consist of @samp{//} followed by any characters up to the end
414of the line. C style comments (/* */) are not supported. For example,
415the assignment below is followed by a not very helpful comment.
416
417@smallexample
418x=1; // Set X to 1
419@end smallexample
420
421@item
422Keywords consist of any reserved words or symbols as described
423later. The list of keywords follows:
424
425@smallexample
426@{ - used to start the statements in a function
427@} - used to end the statements in a function
428( - start list of function arguments, or to change the precedence of operators in an expression
429) - end list or prioritized operators in expression
430, - used to separate parameters in a function prototype or in a function call
431; - used to end a statement
432+ - addition
433- - subtraction
434= - assignment
435== - equality test
436if - begin IF statement
437else - begin 'else' portion of IF statement
438static - indicate variable is permanent, or function has file scope only
439automatic - indicate that variable is allocated for the life of the function
440external_reference - indicate that variable or function is defined in another file
441external_definition - indicate that variable or function is to be accessible from other files
442int - variable is an integer (same as C int)
443char - variable is a character (same as C char)
444unsigned - variable is unsigned. If this is not present, the variable is signed
445return - start function return statement
446void - used as function type to indicate function returns nothing
447@end smallexample
448
449
450@item
451Names consist of any letter or "_" followed by any number of letters or
452numbers or "_". "$" is not allowed in a name. All names must be globally
453unique - the same name may not be used twice in any context - and must
454not be a keyword. Names and keywords are case sensitive. For example:
455
456@smallexample
457a A _a a_ IF_X
458@end smallexample
459
460are all different names.
461
462@end itemize
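
As an illustration of these rules, the fragment below is a sketch that
uses only the keywords listed above; the names @code{Total},
@code{total} and @code{_flag} are invented for the example.  Because
names are case sensitive, @code{Total} and @code{total} are distinct
names.

@smallexample
// file-scope variables; each name is used only once in the file
static int Total = 0;
static int total = 1;
static unsigned char _flag = 0; // a comment runs to the end of the line
@end smallexample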
463
464@node Parsing Syntax, Compiler Overview, Lexical Syntax, Top
465@chapter Parsing Syntax
466@cindex Parsing Syntax
467
Declarations are built up from the lexical elements described above. A
file may contain one or more declarations.
470
471@itemize @bullet
472
473@item
474declaration: variable declaration OR function prototype OR function declaration
475
476@item
Function Prototype: storage type NAME ( parameter_list );

@smallexample
static int add (int a, int b);
@end smallexample
482
483@item
484variable_declaration: storage type NAME initial;
485
486Example:
487
488@smallexample
static int temp1=1;
490@end smallexample
491
492A variable declaration can be outside a function, or at the start of a function.
493
494@item
495storage: automatic OR static OR external_reference OR external_definition
496
This defines the scope, duration and visibility of a function or variable.
498
499@enumerate 1
500
501@item
502automatic: This means a variable is allocated at start of function and
503released when the function returns. This can only be used for variables
504within functions. It cannot be used for functions.
505
506@item
507static: This means a variable is allocated at start of program and
508remains allocated until the program as a whole ends. For a function, it
509means that the function is only visible within the current file.
510
511@item
512external_definition: For a variable, which must be defined outside a
513function, it means that the variable is visible from other files. For a
514function, it means that the function is visible from another file.
515
516@item
517external_reference: For a variable, which must be defined outside a
518function, it means that the variable is defined in another file. For a
519function, it means that the function is defined in another file.
520
521@end enumerate
522
523@item
524type: int OR unsigned int OR char OR unsigned char OR void
525
526This defines the data type of a variable or the return type of a function.
527
528@enumerate a
529
530@item
531int: The variable is a signed integer. The function returns a signed integer.
532
533@item
534unsigned int: The variable is an unsigned integer. The function returns an unsigned integer.
535
536@item
537char: The variable is a signed character. The function returns a signed character.
538
539@item
540unsigned char: The variable is an unsigned character. The function returns an unsigned character.
541
542@end enumerate
543
544@item
parameter_list: parameter [, parameter]...
546
547@item
parameter: type NAME

A parameter is written like a variable declaration, but without the
storage class, the initialisation and the trailing semicolon, as in
@code{int arg1} in the sample program.
551
552@item
553initial: = value
554
555@item
556value: integer_constant
557
558@smallexample
559eg 1 +2 -3
560@end smallexample
561
562@item
563function_declaration: name @{variable_declarations statements @}
564
565A function consists of the function name then the declarations (if any)
566and statements (if any) within one pair of braces.
567
568The details of the function arguments come from the function
569prototype. The function prototype must precede the function declaration
570in the file.
571
572@item
573statement: if_statement OR expression_statement OR return_statement
574
575@item
576if_statement: if (expression) @{ statements @} else @{ statements @}
577
578The first lot of statements is executed if the expression is
579nonzero. Otherwise the second lot of statements is executed. Either
580list of statements may be empty, but both sets of braces and the else must be present.
581
582@smallexample
583if (a==b)
584@{
585// nothing
586@}
587else
588@{
589a=b;
590@}
591@end smallexample
592
593@item
594expression_statement: expression;
595
The expression is evaluated for its side effects, such as an assignment
or a function call; the resulting value is discarded.
597
598@item
599return_statement: return expression_opt;
600
601Returns from the function. If the function is void, the expression must
602be absent, and if the function is not void the expression must be
603present.
604
605@item
606expression: variable OR integer_constant OR expression+expression OR expression-expression
607 OR expression==expression OR (expression) OR variable=expression OR function_call
608
An expression can be a constant or a variable reference or a
function_call. Expressions can be combined as a sum of two expressions
or the difference of two expressions, or an equality test of two
expressions. An assignment is also an expression. Expressions and operator
precedence work as in C.
614
615@item
616function_call: function_name (comma_separated_expressions)
617
618This invokes the function, passing to it the values of the expressions
619as actual parameters.
620
621@end itemize
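
The rules above can be combined as in the following sketch.  It is
written only from the grammar as stated in this chapter and has not been
checked against a particular build of @code{tree1}; the names
@code{call_count}, @code{same_or_zero}, @code{note_call}, and the
parameter names are invented for the example.

@example
// variable declaration outside any function, with an initial value
static int call_count = 0;

// function prototypes; each must precede the matching function body
static int same_or_zero (int left_arg, int right_arg);
external_definition void note_call (int step_arg);

// returns left_arg when both arguments are equal, otherwise 0
same_or_zero
@{
  automatic int result;
  result = 0;
  if (left_arg == right_arg)
    @{
      result = left_arg;
    @}
  else
    @{
    @}
  return result;
@}

// void function: the return statement carries no expression
note_call
@{
  call_count = call_count + step_arg;
  return;
@}
@end example

Note that every name, including each parameter name, appears in only one
declaration, in keeping with the rule that names must be globally
unique.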
622
623@cindex compilers
624@node Compiler Overview, TREELANG and GCC, Parsing Syntax, Top
625@chapter Compiler Overview
626treelang is run as part of the GCC compiler.
627
628@itemize @bullet
629@cindex source code
630@cindex file, source
631@cindex code, source
632@cindex source file
633@item
634It reads a user's program, stored in a file and containing instructions
635written in the appropriate language (Treelang, C, and so on).  This file
636contains @dfn{source code}.
637
638@cindex translation of user programs
639@cindex machine code
640@cindex code, machine
641@cindex mistakes
642@item
643It translates the user's program into instructions a computer can carry
644out more quickly than it takes to translate the instructions in the
645first place.  These instructions are called @dfn{machine code}---code
646designed to be efficiently translated and processed by a machine such as
a computer.  Humans usually aren't as good at writing machine code as they
648are at writing Treelang or C, because it is easy to make tiny mistakes
649writing machine code.  When writing Treelang or C, it is easy to make
650big mistakes. But you can only make one mistake, because the compiler
651stops after it finds any problem.
652
653@cindex debugger
654@cindex bugs, finding
655@cindex @code{gdb}, command
656@cindex commands, @code{gdb}
657@item
658It provides information in the generated machine code
659that can make it easier to find bugs in the program
660(using a debugging tool, called a @dfn{debugger},
661such as @code{gdb}).
662
663@cindex libraries
664@cindex linking
665@cindex @code{ld} command
666@cindex commands, @code{ld}
667@item
668It locates and gathers machine code already generated to perform actions
669requested by statements in the user's program.  This machine code is
670organized into @dfn{libraries} and is located and gathered during the
671@dfn{link} phase of the compilation process.  (Linking often is thought
672of as a separate step, because it can be directly invoked via the
673@code{ld} command.  However, the @code{gcc} command, as with most
674compiler commands, automatically performs the linking step by calling on
675@code{ld} directly, unless asked to not do so by the user.)
676
677@cindex language, incorrect use of
678@cindex incorrect use of language
679@item
680It attempts to diagnose cases where the user's program contains
681incorrect usages of the language.  The @dfn{diagnostics} produced by the
682compiler indicate the problem and the location in the user's source file
683where the problem was first noticed.  The user can use this information
684to locate and fix the problem.
685
686The compiler stops after the first error. There are no plans to fix
687this, ever, as it would vastly complicate the implementation of treelang
688to little or no benefit.
689
690@cindex diagnostics, incorrect
691@cindex incorrect diagnostics
692@cindex error messages, incorrect
693@cindex incorrect error messages
694(Sometimes an incorrect usage of the language leads to a situation where
the compiler cannot make any sense of what it reads---while a human
696might be able to---and thus ends up complaining about an incorrect
697``problem'' it encounters that, in fact, reflects a misunderstanding of
698the programmer's intention.)
699
700@cindex warnings
701@cindex questionable instructions
702@item
703There are no warnings in treelang. A program is either correct or in
704error.
705@end itemize
706
707@cindex components of treelang
708@cindex @code{treelang}, components of
709@code{treelang} consists of several components:
710
711@cindex @code{gcc}, command
712@cindex commands, @code{gcc}
713@itemize @bullet
714@item
715A modified version of the @code{gcc} command, which also might be
716installed as the system's @code{cc} command.
717(In many cases, @code{cc} refers to the
718system's ``native'' C compiler, which
719might be a non-GNU compiler, or an older version
720of @code{GCC} considered more stable or that is
721used to build the operating system kernel.)
722
723@cindex @code{treelang}, command
724@cindex commands, @code{treelang}
725@item
726The @code{treelang} command itself.
727
728@item
729The @code{libc} run-time library.  This library contains the machine
730code needed to support capabilities of the Treelang language that are
731not directly provided by the machine code generated by the
@code{treelang} compilation phase. This is the same library that the
main C compiler uses.
734
735@cindex @code{tree1}, program
736@cindex programs, @code{tree1}
737@cindex assembler
738@cindex @code{as} command
739@cindex commands, @code{as}
740@cindex assembly code
741@cindex code, assembly
742@item
The compiler itself, which is internally named @code{tree1}.
744
745Note that @code{tree1} does not generate machine code directly---it
746generates @dfn{assembly code} that is a more readable form
747of machine code, leaving the conversion to actual machine code
748to an @dfn{assembler}, usually named @code{as}.
749@end itemize
750
751@code{GCC} is often thought of as ``the C compiler'' only,
752but it does more than that.
753Based on command-line options and the names given for files
754on the command line, @code{gcc} determines which actions to perform, including
755preprocessing, compiling (in a variety of possible languages), assembling,
756and linking.
757
758@cindex driver, gcc command as
759@cindex @code{gcc}, command as driver
760@cindex executable file
761@cindex files, executable
762@cindex cc1 program
763@cindex programs, cc1
764@cindex preprocessor
765@cindex cpp program
766@cindex programs, cpp
767For example, the command @samp{gcc foo.c} @dfn{drives} the file
768@file{foo.c} through the preprocessor @code{cpp}, then
769the C compiler (internally named
770@code{cc1}), then the assembler (usually @code{as}), then the linker
771(@code{ld}), producing an executable program named @file{a.out} (on
772UNIX systems).
773
774@cindex treelang program
775@cindex programs, treelang
776As another example, the command @samp{gcc foo.tree} would do much the
777same as @samp{gcc foo.c}, but instead of using the C compiler named
778@code{cc1}, @code{gcc} would use the treelang compiler (named
779@code{tree1}). However there is no preprocessor for treelang.
780
781@cindex @code{tree1}, program
782@cindex programs, @code{tree1}
783In a GNU Treelang installation, @code{gcc} recognizes Treelang source
784files by name just like it does C and C++ source files.  It knows to use
785the Treelang compiler named @code{tree1}, instead of @code{cc1} or
786@code{cc1plus}, to compile Treelang files. If a file's name ends in
787@code{.tree} then GCC knows that the program is written in treelang. You
788can also manually override the language.
789
790@cindex @code{gcc}, not recognizing Treelang source
791@cindex unrecognized file format
792@cindex file format not recognized
793Non-Treelang-related operation of @code{gcc} is generally
794unaffected by installing the GNU Treelang version of @code{gcc}.
795However, without the installed version of @code{gcc} being the
796GNU Treelang version, @code{gcc} will not be able to compile
797and link Treelang programs.
798
799@cindex printing version information
800@cindex version information, printing
801The command @samp{gcc -v x.tree} where @samp{x.tree} is a file which
802must exist but whose contents are ignored, is a quick way to display
803version information for the various programs used to compile a typical
804Treelang source file.
805
806The @code{tree1} program represents most of what is unique to GNU
807Treelang; @code{tree1} is a combination of two rather large chunks of
808code.
809
810@cindex GCC Back End (GBE)
811@cindex GBE
812@cindex @code{GCC}, back end
813@cindex back end, GCC
814@cindex code generator
815One chunk is the so-called @dfn{GNU Back End}, or GBE,
816which knows how to generate fast code for a wide variety of processors.
817The same GBE is used by the C, C++, and Treelang compiler programs @code{cc1},
818@code{cc1plus}, and @code{tree1}, plus others.
819Often the GBE is referred to as the ``GCC back end'' or
820even just ``GCC''---in this manual, the term GBE is used
821whenever the distinction is important.
822
823@cindex GNU Treelang Front End (TFE)
824@cindex tree1
825@cindex @code{treelang}, front end
826@cindex front end, @code{treelang}
827The other chunk of @code{tree1} is the majority of what is unique about
828GNU Treelang---the code that knows how to interpret Treelang programs to
829determine what they are intending to do, and then communicate that
830knowledge to the GBE for actual compilation of those programs.  This
831chunk is called the @dfn{Treelang Front End} (TFE).  The @code{cc1} and
832@code{cc1plus} programs have their own front ends, for the C and C++
languages, respectively.  These front ends are responsible for
diagnosing incorrect usage of their respective languages by the programs
they process, and are responsible for most of the warnings about
836questionable constructs as well.  (The GBE in principle handles
837producing some warnings, like those concerning possible references to
838undefined variables, but these warnings should not occur in treelang
839programs as the front end is meant to pick them up first).
840
841Because so much is shared among the compilers for various languages,
842much of the behavior and many of the user-selectable options for these
843compilers are similar.
844For example, diagnostics (error messages and
845warnings) are similar in appearance; command-line
846options like @samp{-Wall} have generally similar effects; and the quality
847of generated code (in terms of speed and size) is roughly similar
848(since that work is done by the shared GBE).
849
850@node TREELANG and GCC, Compiler, Compiler Overview, Top
851@chapter Compile Treelang, C, or Other Programs
852@cindex compiling programs
853@cindex programs, compiling
854
855@cindex @code{gcc}, command
856@cindex commands, @code{gcc}
857A GNU Treelang installation includes a modified version of the @code{gcc}
858command.
859
860In a non-Treelang installation, @code{gcc} recognizes C, C++,
861and Objective-C source files.
862
863In a GNU Treelang installation, @code{gcc} also recognizes Treelang source
864files and accepts Treelang-specific command-line options, plus some
865command-line options that are designed to cater to Treelang users
866but apply to other languages as well.
867
868@xref{G++ and GCC,,Compile C; C++; or Objective-C,GCC,Using and Porting GCC},
869for information on the way different languages are handled
870by the GCC compiler (@code{gcc}).
871
You can use this, combined with the output of the @samp{gcc -v x.tree}
command, to get the options applicable to treelang. The names of
treelang source files must end with the suffix @samp{.tree}.
875
876@cindex preprocessor
877
878Treelang programs are not by default run through the C
879preprocessor by @code{gcc}. There is no reason why they cannot be run through the
880preprocessor manually, but you would need to prevent the preprocessor
881from generating #line directives, using the @samp{-P} option, otherwise
882tree1 will not accept the input.
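
As a sketch of the manual route, suppose a hypothetical file
@file{limit.tree.in} contains the text below.  Running it through the
preprocessor with a command along the lines of
@samp{gcc -E -P -x c limit.tree.in -o limit.tree} should expand the
macro and, because of @samp{-P}, leave no #line directives in
@file{limit.tree}, which can then be compiled as usual.  The file names,
the macro @code{STEP} and the function @code{add_step} are assumptions
made for the example.

@example
#define STEP 2

external_definition int add_step (int start_arg);

add_step
@{
  return start_arg + STEP;
@}
@end example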
883
884@node Compiler, Other Languages, TREELANG and GCC, Top
885@chapter The GNU Treelang Compiler
886
887The GNU Treelang compiler, @code{treelang}, supports programs written
888in the GNU Treelang language.
889
890@node Other Languages, treelang internals, Compiler, Top
891@chapter Other Languages
892
893@menu
894* Interoperating with C and C++::
895@end menu
896
897@node Interoperating with C and C++,  , Other Languages, Other Languages
898@section Tools and advice for interoperating with C and C++
899
900The output of treelang programs looks like C program code to the linker
901and everybody else, so you should be able to freely mix treelang and C
902(and C++) code, with one proviso.
903
C promotes small integer types to @code{int} when they are used as function
parameters and return values. The treelang compiler does not do this, so to
interface to C you need to declare the promoted type, not the nominal type.
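
For instance, assume a C file defines a hypothetical function
@code{char next_letter (char c)}.  Following the rule above, a treelang
program that calls it would declare both the parameter and the return
value as @code{int} rather than @code{char}.  This is a sketch; the
function and parameter names are invented.

@example
// C side (for reference):  char next_letter (char c);
// treelang side: use the promoted type int, not char
external_reference int next_letter (int letter_arg);

external_definition int two_letters_on (int first_arg);

two_letters_on
@{
  return next_letter (next_letter (first_arg));
@}
@end example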
907
908@ifset INTERNALS
909@node treelang internals, Open Questions, Other Languages, Top
910@chapter treelang internals
911
912@menu
913* treelang files::
914* treelang compiler interfaces::
915* Hints and tips::
916@end menu
917
918@node treelang files, treelang compiler interfaces, treelang internals, treelang internals
919@section treelang files
920
To create a compiler that integrates into GCC, you need to create many
files. Some of the files are integrated into the main GCC makefile, to
build the various parts of the compiler and to run the test
suite. Others are incorporated into various GCC programs such as
@file{gcc.c}. Finally you must provide the actual programs comprising your
compiler.
927
928@cindex files
929
930The files are:
931
932@enumerate 1
933
934@item
935COPYING. This is the copyright file, assuming you are going to use the
936GNU General Public Licence. You probably need to use the GPL because if
937you use the GCC back end your program and the back end are one program,
938and the back end is GPLed.
939
940This need not be present if the language is incorporated into the main
941GCC tree, as the main GCC directory has this file.
942
943@item
944COPYING.LIB. This is the copyright file for those parts of your program
945that are not to be covered by the GPL, but are instead to be covered by
946the LGPL (Library or Lesser GPL). This licence may be appropriate for
947the library routines associated with your compiler. These are the
948routines that are linked with the @emph{output} of the compiler. Using
949the LGPL for these programs allows programs written using your compiler
to be closed source. For example, the GNU C library is under the LGPL.
951
952This need not be present if the language is incorporated into the main
953GCC tree, as the main GCC directory has this file.
954
955@item
ChangeLog. Record all the changes to your compiler. Use the same format
as used in treelang, as it is supported by an emacs editing mode and is
part of the FSF coding standard. Normally each directory has its own
changelog. The FSF standard allows but does not require a meaningful
comment on why the changes were made, above and beyond @emph{what}
changes were made. In the author's opinion it is useful to provide this
information.
963
964@item
treelang.texi. The manual, written in texinfo. Your manual would have a
different file name. You need not write it in texinfo if you don't want
to, but a lot of GNU software does use texinfo.
968
969@cindex Make-lang.in
970@item
Make-lang.in. This file is part of the make file which is incorporated
with the GCC make file skeleton (Makefile.in in the GCC directory) to
make Makefile, as part of the configuration process.
974
975Makefile in turn is the main instruction to actually build
976everything. The build instructions are held in the main GCC manual and
977web site so they are not repeated here.
978
979There are some comments at the top which will help you understand what
980you need to do.
981
982There are make commands to build things, remove generated files with
983various degrees of thoroughness, count the lines of code (so you know
984how much progress you are making), build info and html files from the
985texinfo source, run the tests etc.
986
987@item
988README. Just a brief informative text file saying what is in this
989directory.
990
991@cindex config-lang.in
992@item
config-lang.in. This file is read by the configuration process and must
be present. In it you specify the name of your language, the name(s) of
the compiler(s), including preprocessors, that you are going to build,
and any files (usually generated ones) that should be excluded from
diffs, that is, when making diff files to send in patches. Whether the
equate 'stagestuff' is used is unknown (???).
999
1000@cindex lang-options
1001@item
lang-options. This file is included into gcc.c, the main GCC driver, and
tells it what options your language supports. This is only used to
display help (is this true ???).
1005
1006@cindex lang-specs
1007@item
lang-specs. This file is also included in gcc.c. It tells gcc.c when to
call your programs and what options to send them. The mini-language
'specs' is documented in the source of gcc.c. Do not attempt to write a
specs file from scratch; use an existing one as the base and enhance
it.
1013
1014@item
1015Your texi files. Texinfo can be used to build documentation in HTML,
1016info, dvi and postscript formats. It is a tagged language, is documented
1017in its own manual, and has its own emacs mode.
1018
1019@item
1020Your programs. The relationships between all the programs are explained
1021in the next section. You need to write or use the following programs:
1022
1023@itemize @bullet
1024
1025@item
1026lexer. This breaks the input into words and passes these to the
1027parser. This is lex.l in treelang, which is passed through flex, a lex
1028variant, to produce C code lex.c. Note there is a school of thought that
1029says real men hand code their own lexers, however you may prefer to
1030write far less code and use flex, as was done with treelang.
1031
1032@item
1033parser. This breaks the program into recognizable constructs such as
1034expressions, statements etc. This is parse.y in treelang, which is
1035passed through bison, which is a yacc variant, to produce C code parse.c.
1036
1037@item
back end interface. This interfaces to the code generation back end. In
treelang, this is tree1.c, which mainly interfaces to toplev.c, and
treetree.c, which mainly interfaces to everything else. Many languages
mix up the back end interface with the parser, as in the C compiler for
example. It is a matter of taste which way to do it, but with treelang
it is separated out to make the back end interface cleaner and easier to
understand.
1045
1046@item
header files. For function prototypes and common data items. One point
to note here is that bison can generate a header file with all the
numbers it has assigned to the keywords and symbols, and you can include
the same header in your lexer. This technique is demonstrated in
treelang.
1052
1053@item
1054compiler main file. GCC comes with a program toplev.c which is a
1055perfectly serviceable main program for your compiler. treelang uses
1056toplev.c but other languages have been known to replace it with their
1057own main program. Again this is a matter of taste and how much code you
1058want to write.
1059
1060@end itemize
1061
1062@end enumerate
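
The division of labour between lexer and parser described above can be
sketched in plain C. This is a hand-written toy, not the flex-generated
lex.c from treelang; the token names and the function
@code{next_token} are invented for illustration, and the @code{enum}
merely stands in for the token numbers that bison would write into its
generated header.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Stand-in for the token numbers a bison-generated header would
   provide; these names are hypothetical.  */
enum token_kind { TOK_EOF = 0, TOK_NAME, TOK_INTEGER, TOK_PLUS, TOK_OTHER };

struct token
{
  enum token_kind kind;
  const char *start;   /* First character of the token text.  */
  size_t length;       /* Number of characters in the token.  */
};

/* Scan one token starting at *CURSOR, advance *CURSOR past it, and
   return the token.  A parser would call this repeatedly.  */
struct token
next_token (const char **cursor)
{
  const char *p;
  struct token tok;

  assert (cursor != NULL && *cursor != NULL);
  p = *cursor;

  /* Skip white space between words.  */
  while (isspace ((unsigned char) *p))
    p++;

  tok.start = p;
  if (*p == '\0')
    tok.kind = TOK_EOF;
  else if (isalpha ((unsigned char) *p) || *p == '_')
    {
      tok.kind = TOK_NAME;
      while (isalnum ((unsigned char) *p) || *p == '_')
        p++;
    }
  else if (isdigit ((unsigned char) *p))
    {
      tok.kind = TOK_INTEGER;
      while (isdigit ((unsigned char) *p))
        p++;
    }
  else
    {
      /* Any other single character is a token of its own.  */
      tok.kind = (*p == '+') ? TOK_PLUS : TOK_OTHER;
      p++;
    }

  tok.length = (size_t) (p - tok.start);
  *cursor = p;
  return tok;
}
```

Because lexer and parser share one set of token numbers, the parser can
switch on @code{kind} without knowing how the lexer recognised the text.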
1063
1064@node treelang compiler interfaces, Hints and tips, treelang files, treelang internals
1065@section treelang compiler interfaces
1066
1067@cindex driver
1068@cindex toplev.c
1069
1070@menu
1071* treelang driver::
1072* treelang main compiler::
1073@end menu
1074
1075@node treelang driver, treelang main compiler, treelang compiler interfaces, treelang compiler interfaces
1076@subsection treelang driver
1077
1078The GCC compiler consists of a driver, which then executes the various
1079compiler phases based on the instructions in the specs files.
1080
Typically a program's language will be identified from its suffix (for
example, @samp{.tree} for treelang programs).
1083
The driver (gcc.c) will then drive (exec) in turn a preprocessor, the
main compiler, the assembler and the link editor. Options to GCC allow
you to override all of this. In the case of treelang programs there is
no preprocessor, and these days the C preprocessor is mostly run within
the main C compiler rather than as a separate process, apparently for
reasons of speed.
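
The suffix-based dispatch can be pictured with the following sketch. It
is only a model of the idea: the real driver takes this mapping from
the specs, and the table and the function @code{language_for_file}
shown here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical suffix table; the real driver derives this from specs.  */
struct suffix_entry
{
  const char *suffix;
  const char *language;
};

static const struct suffix_entry suffix_table[] = {
  { ".c", "c" },
  { ".tree", "treelang" },
};

/* Return the language name for FILENAME based on its suffix, or NULL
   if the file has no suffix or the suffix is not recognised.  */
const char *
language_for_file (const char *filename)
{
  const char *dot;
  size_t i;

  assert (filename != NULL);
  dot = strrchr (filename, '.');
  if (dot == NULL)
    return NULL;
  for (i = 0; i < sizeof suffix_table / sizeof suffix_table[0]; i++)
    if (strcmp (dot, suffix_table[i].suffix) == 0)
      return suffix_table[i].language;
  return NULL;
}
```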
1089
1090You will be using the standard assembler and linkage editor so these are
1091ignored from now on.
1092
You have to write your own preprocessor if you want one. This is usually
totally language specific. The main point to be aware of is that you
must find some way to pass file name and line number information
through to the main compiler, so that it can pass this information on
to the back end and the debugger can find the right source line for each
piece of code. Beyond that, the preprocessor will probably not be the
slowest part of the compiler and will probably not use the most memory,
so don't waste too much time tuning it until you know you need to do so.
1102
1103@node treelang main compiler,  , treelang driver, treelang compiler interfaces
1104@subsection treelang main compiler
1105
1106The main compiler for treelang consists of toplev.c from the main GCC
1107compiler, the parser, lexer and back end interface routines, and the
1108back end routines themselves, of which there are many.
1109
toplev.c does a lot of work for you and you should almost certainly use it.
1111
1112Writing this code is the hard part of creating a compiler using GCC. The
1113back end interface documentation is incomplete and the interface is
1114complex.
1115
1116There are three main aspects to interfacing to the other GCC code.
1117
1118@menu
1119* Interfacing to toplev.c::
1120* Interfacing to the garbage collection::
1121* Interfacing to the code generation code. ::
1122@end menu
1123
1124@node Interfacing to toplev.c, Interfacing to the garbage collection, treelang main compiler, treelang main compiler
1125@subsubsection Interfacing to toplev.c
1126
1127In treelang this is handled mainly in tree1.c
1128and partly in treetree.c. Peruse toplev.c for details of what you need
1129to do.
1130
1131@node Interfacing to the garbage collection, Interfacing to the code generation code. , Interfacing to toplev.c, treelang main compiler
1132@subsubsection Interfacing to the garbage collection
1133
In treelang the interface to the garbage collection is mainly in
tree1.c.
1136
Memory allocation in the compiler should be done using the ggc_alloc and
kindred routines in ggc*.*. At the end of every 'function' in your
language, toplev.c calls the garbage collector several times. The
garbage collector calls mark routines which go through the memory that
is still in use, telling the garbage collector not to free it. Then all
the memory not in use is freed.
1143
What this means is that you need a way to hook into this marking
process. This is done by calling ggc_add_root. This provides the address
of a callback routine which will be called during garbage collection and
which can call ggc_mark to save the storage. If storage is only
used within the parsing of a function, you do not need to provide a way
to mark it.
1150
Note that you can also call ggc_mark_tree to mark any of the back end
internal 'tree' nodes. This routine will follow the branches of the
trees and mark all the subordinate structures. This is useful for
example when you have created a variable declaration that will be used
across multiple functions, or for a function declaration (from a
prototype) that may be used later on. See the next section for more on
the tree nodes.
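
The mark-then-free cycle described in this section can be modelled with
a toy collector. The sketch below is NOT the GCC ggc interface: every
name in it (@code{toy_alloc}, @code{toy_add_root}, @code{toy_mark},
@code{toy_collect}, @code{toy_demo}) is hypothetical, and the real
routines have different signatures. It only shows why a front end must
register a callback that marks its long-lived data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy collector, not the GCC ggc API; all names are hypothetical.  */
struct toy_object
{
  struct toy_object *next;   /* Chain of all allocated objects.  */
  int marked;                /* Set during the mark phase.  */
  int payload;
};

typedef void (*toy_root_fn) (void);

#define TOY_MAX_ROOTS 8
static struct toy_object *toy_all_objects;
static toy_root_fn toy_roots[TOY_MAX_ROOTS];
static size_t toy_root_count;

/* Allocate an object and chain it so the collector can find it.  */
struct toy_object *
toy_alloc (int payload)
{
  struct toy_object *obj = malloc (sizeof *obj);
  if (obj == NULL)
    abort ();
  obj->next = toy_all_objects;
  obj->marked = 0;
  obj->payload = payload;
  toy_all_objects = obj;
  return obj;
}

/* Register a callback that marks everything the front end still needs.  */
void
toy_add_root (toy_root_fn fn)
{
  assert (toy_root_count < TOY_MAX_ROOTS);
  toy_roots[toy_root_count++] = fn;
}

/* Called from a root callback to keep OBJ alive.  */
void
toy_mark (struct toy_object *obj)
{
  if (obj != NULL)
    obj->marked = 1;
}

/* Run every root callback, then free whatever was left unmarked.
   Returns the number of objects freed.  */
size_t
toy_collect (void)
{
  struct toy_object **link = &toy_all_objects;
  size_t freed = 0;
  size_t i;

  for (i = 0; i < toy_root_count; i++)
    toy_roots[i] ();

  while (*link != NULL)
    {
      struct toy_object *obj = *link;
      if (obj->marked)
        {
          obj->marked = 0;      /* Reset for the next collection.  */
          link = &obj->next;
        }
      else
        {
          *link = obj->next;    /* Unlink and free the unmarked object.  */
          free (obj);
          freed++;
        }
    }
  return freed;
}

/* Example front end state: one long-lived object and its marker.  */
static struct toy_object *toy_global_decl;

static void
toy_mark_global_decl (void)
{
  toy_mark (toy_global_decl);
}

/* Allocate one long-lived and two temporary objects, then collect.
   Only the two temporaries are left unmarked by the root callback.  */
size_t
toy_demo (void)
{
  toy_global_decl = toy_alloc (1);
  toy_alloc (2);
  toy_alloc (3);
  toy_add_root (toy_mark_global_decl);
  return toy_collect ();
}
```

The two temporaries play the part of storage used only while parsing
one function: nothing marks them, so they are freed, while the object
reachable from the registered root survives.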
1158
1159@node Interfacing to the code generation code. ,  , Interfacing to the garbage collection, treelang main compiler
1160@subsubsection Interfacing to the code generation code.
1161
1162In treelang this is done in treetree.c. A typedef called 'tree' which is
1163defined in tree.h and tree.def in the GCC directory and largely
1164implemented in tree.c and stmt.c forms the basic interface to the
1165compiler back end.
1166
1167In general you call various tree routines to generate code, either
1168directly or through toplev.c. You build up data structures and
1169expressions in similar ways.
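
To give a feel for what building up expressions out of nodes means, here
is a toy model in plain C. It is deliberately not the interface from
tree.h: the type @code{mini_tree}, the codes, and the functions
@code{mini_build_int}, @code{mini_build_binary} and
@code{mini_evaluate} are all invented for this sketch, which only
assumes that an expression is a node carrying a code and operands.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for the idea of a tree node; not the GCC tree.h API.  */
enum mini_code { MINI_INTEGER_CST, MINI_PLUS_EXPR, MINI_MULT_EXPR };

struct mini_tree
{
  enum mini_code code;
  long value;                      /* Used by MINI_INTEGER_CST only.  */
  struct mini_tree *operands[2];   /* Used by the binary expressions.  */
};

/* Allocate a zeroed node with the given code.  */
static struct mini_tree *
mini_new_node (enum mini_code code)
{
  struct mini_tree *node = calloc (1, sizeof *node);
  if (node == NULL)
    abort ();
  node->code = code;
  return node;
}

/* Build a leaf node holding an integer constant.  */
struct mini_tree *
mini_build_int (long value)
{
  struct mini_tree *node = mini_new_node (MINI_INTEGER_CST);
  node->value = value;
  return node;
}

/* Build a binary expression node from two operand nodes.  */
struct mini_tree *
mini_build_binary (enum mini_code code, struct mini_tree *lhs,
                   struct mini_tree *rhs)
{
  struct mini_tree *node;

  assert (code != MINI_INTEGER_CST && lhs != NULL && rhs != NULL);
  node = mini_new_node (code);
  node->operands[0] = lhs;
  node->operands[1] = rhs;
  return node;
}

/* Walk the tree bottom-up, as a consumer of the nodes would.  */
long
mini_evaluate (const struct mini_tree *node)
{
  assert (node != NULL);
  switch (node->code)
    {
    case MINI_INTEGER_CST:
      return node->value;
    case MINI_PLUS_EXPR:
      return (mini_evaluate (node->operands[0])
              + mini_evaluate (node->operands[1]));
    case MINI_MULT_EXPR:
      return (mini_evaluate (node->operands[0])
              * mini_evaluate (node->operands[1]));
    }
  abort ();
}
```

The front end's job is the building half of this picture; in the real
compiler the consumer is the back end, which generates code from the
nodes rather than evaluating them.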
1170
You can read some documentation on this which can be found via the GCC
main web page. In particular, the documentation produced by Joachim
Nadler and translated by Tim Josling can be quite useful. The C compiler
also has documentation in the main GCC manual (particularly the current
CVS version) which is useful on a lot of the details.
1176
1177In time it is hoped to enhance this document to provide a more
1178comprehensive overview of this topic. The main gap is in explaining how
1179it all works together.
1180
1181@node Hints and tips,  , treelang compiler interfaces, treelang internals
1182@section Hints and tips
1183
1184@itemize @bullet
1185
1186@item
1187TAGS: Use the make ETAGS commands to create TAGS files which can be used in
1188emacs to jump to any symbol quickly.
1189
1190@item
1191GREP: grep is also a useful way to find all uses of a symbol.
1192
1193@item
TREE: The main files to look at are tree.h and tree.def. You will
probably want a hardcopy of these.
1196
1197@item
1198SAMPLE: look at the sample interfacing code in treetree.c. You can use
1199gdb to trace through the code and learn about how it all works.
1200
1201@item
1202GDB: the GCC back end works well with gdb. It traps abort() and allows
1203you to trace back what went wrong.
1204
1205@item
1206Error Checking: The compiler back end does some error and consistency
1207checking. Often the result of an error is just no code being
1208generated. You will then need to trace through and find out what is
1209going wrong. The rtl dump files can help here also.
1210
1211@item
rtl dump files: The main compiler documents these files, which are dumps
of the rtl (intermediate code) that is manipulated during the code
generation process. This can provide useful clues about what is going
wrong. The rtl 'language' is documented in the main GCC manual.
1216
1217@end itemize
1218
1219@end ifset
1220
1221@node Open Questions, Bugs, treelang internals, Top
1222@chapter Open Questions
1223
1224If you know GCC well, please consider looking at the file treetree.c and
1225resolving any questions marked "???".
1226
1227@node Bugs, Service, Open Questions, Top
1228@chapter Reporting Bugs
1229@cindex bugs
1230@cindex reporting bugs
1231
1232You can report bugs to @email{@value{email-bugs}}. Please make
1233sure bugs are real before reporting them. Follow the guidelines in the
1234main GCC manual for submitting bug reports.
1235
1236@menu
1237* Sending Patches::
1238@end menu
1239
1240@node Sending Patches,  , Bugs, Bugs
1241@section Sending Patches for GNU Treelang
1242
1243If you would like to write bug fixes or improvements for the GNU
1244Treelang compiler, that is very helpful.  Send suggested fixes to
1245@email{@value{email-patches}}.
1246
1247@node Service, Projects, Bugs, Top
1248@chapter How To Get Help with GNU Treelang
1249
1250If you need help installing, using or changing GNU Treelang, there are two
1251ways to find it:
1252
1253@itemize @bullet
1254
1255@item
1256Look in the service directory for someone who might help you for a fee.
1257The service directory is found in the file named @file{SERVICE} in the
1258GCC distribution.
1259
1260@item
1261Send a message to @email{@value{email-general}}.
1262
1263@end itemize
1264
1265@end ifset
1266@ifset INTERNALS
1267
1268@node Projects, Index, Service, Top
1269@chapter Projects
1270@cindex projects
1271
1272If you want to contribute to @code{treelang} by doing research,
1273design, specification, documentation, coding, or testing,
1274the following information should give you some ideas.
1275
1276Send a message to @email{@value{email-general}} if you plan to add a
1277feature.
1278
1279The main requirement for treelang is to add features and to add
1280documentation. Features are things that the GCC back end can do but
1281which are not reflected in treelang. Examples include structures,
1282unions, pointers, arrays.
1283
1284@end ifset
1285
1286@node Index,  , Projects, Top
1287@unnumbered Index
1288
1289@printindex cp
1290@summarycontents
1291@contents
1292@bye
1293