5/18/78
A new version of Yacc has been installed which contains some new
features relating to error recovery, detection of funny conditions in the
grammar, and strong typing.  Existing grammars should continue to work,
with the possible exception of somewhat better error recovery behavior.
More details follow:

***	Ratfor and EFL Yacc are dead.  Long live C!

***	The y.tab.c file now uses the #line feature to reflect
	most error conditions in actions, etc., back to the yacc source
	file, rather than the y.tab.c file.  As always with such features,
	lookahead may cause the line number to be one too large
	occasionally.
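
	As a rough illustration of the #line mechanism (this is not actual
	Yacc output; the file name gram.y, the line number 42, and the
	function names are made up), C text following such a directive is
	attributed to the named file and line:

```c
/* Hypothetical excerpt in the style of a generated parser file. */
/* The directive makes the compiler count the next line as line 42 of gram.y. */
#line 42 "gram.y"
int action_line(void) { return __LINE__; }          /* counted as gram.y, line 42 */
const char *action_file(void) { return __FILE__; }  /* counted as gram.y, line 43 */
```

	A compiler diagnostic inside action_line would therefore be reported
	against gram.y, line 42, rather than against the generated file.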

***	The error recovery algorithm has been changed to cause the
	parser never to reduce on a state where there is a shift
	on the special token `error'.  This has the effect of causing
	the error recovery action to take place somewhat closer to the
	location of the error than previously.  It does not affect the
	behavior of the parser in the absence of errors.  The parse
	tables may be 1-2% larger as a result of this change.

***	Yacc now detects the existence of nonterminals in the grammar
	which can never derive any strings of tokens (even the empty string).
	The simplest example is the grammar:
		%%
		s	:	s 'a' ;
	Here, one must reduce `s' in order to reduce `s': the
	parser would always report an error.  If such nonterminals are
	present, Yacc reports all of them, then terminates.
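
	One way such a check can be made is sketched below.  This is only an
	illustration under assumed data structures, not Yacc's own code; the
	names struct rule, find_productive, and s_is_productive are invented.
	A nonterminal is marked once some rule for it has a right side made
	up entirely of tokens and already-marked nonterminals; whatever is
	still unmarked when nothing more changes can derive no string at all.

```c
#include <string.h>

#define MAXRHS 8

/* One grammar rule.  Right-side symbols >= 0 are nonterminal indices;
   negative values stand for tokens. */
struct rule {
    int lhs;
    int nrhs;
    int rhs[MAXRHS];
};

/* Set productive[i] to 1 for every nonterminal i that can derive some
   string of tokens (possibly empty), and to 0 otherwise. */
void find_productive(const struct rule *rules, int nrules,
                     int *productive, int nnonterm)
{
    int changed = 1;

    memset(productive, 0, (size_t)nnonterm * sizeof productive[0]);
    while (changed) {                       /* repeat until a full pass adds nothing */
        changed = 0;
        for (int r = 0; r < nrules; r++) {
            int ok = 1;

            if (productive[rules[r].lhs])
                continue;
            for (int k = 0; k < rules[r].nrhs; k++) {
                int sym = rules[r].rhs[k];
                if (sym >= 0 && !productive[sym]) {
                    ok = 0;                 /* depends on an unmarked nonterminal */
                    break;
                }
            }
            if (ok) {
                productive[rules[r].lhs] = 1;
                changed = 1;
            }
        }
    }
}

/* The grammar from the text, optionally with a second rule added. */
int s_is_productive(int with_base_case)
{
    struct rule rules[2] = {
        { 0, 2, { 0, -'a' } },              /* s : s 'a' ; */
        { 0, 1, { -'a' } },                 /* s : 'a' ;  (used only if requested) */
    };
    int productive[1];

    find_productive(rules, with_base_case ? 2 : 1, productive, 1);
    return productive[0];
}
```

	With only the rule from the text, s_is_productive(0) yields 0; adding
	the rule s : 'a' ; makes `s' derivable, and s_is_productive(1) yields 1.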

***	There is a new reserved word, %start.  It may be used in the
	declarations section to declare the start symbol of the grammar.
	If %start does not appear, the start symbol is, as at present, the
	first nonterminal symbol encountered.

***	Yacc-produced parsers are notorious for drawing many, many
	comments from lint.  The problem is the value stack of the
	parser, which typically may contain integers, pointers, and
	possibly even floating point, etc., values.  The lack
	of tight specification of this stack leads to potential
	nonportability, and considerable loss of the diagnostic power
	of lint.  Thus, some new features have been added which make use
	of the new structure and union facilities of C.  In effect,
	the user of Yacc may `honestly' declare the value stack, as
	well as the lexical interface variable, yylval, to be unions
	of all the types desired.  Yacc will keep track of the types
	declared for all terminals and nonterminals, and automatically
	insert the appropriate union tag for all constructions such
	as $1, $$, etc.  It is up to the user to supply the appropriate
	union declaration, and to declare the type of all the terminal
	and nonterminal symbols which will have values.  If the type
	declaration feature is used at all, it must be used correctly;
	if it is not used, the default values are integers, as at present.
	The new type declaration features are described below:

***	There is a new keyword, %union.  A construction such as
		%union {
			int inttag;
			float floattag;
			struct mumble *ptrtag;
			}
	can be used, in the declarations section, to declare
	the type of the yacc stack.  The declaration is
	effectively copied to the y.tab.c file, and, if the -d
	option is present, to the y.tab.h file as well.  The
	declaration is used to declare the typedef YYSTYPE, which is the
	type of the value stack.  If the -d option is present,
	the declaration
		extern YYSTYPE yylval;
	is also placed onto the y.tab.h file.  Note that the lexical
	analyzer must be changed to use the appropriate union tag when
	assigning values.  It is not necessary that the %union
	mechanism be used, as long as there is a union type YYSTYPE
	defined in the declarations section.
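
	In C terms, the example above amounts to roughly the following.  This
	is a sketch only: the token name NUMBER, its code 257, and the routine
	lex_number are hypothetical, and the text Yacc actually emits may be
	laid out differently.  It shows the change required in a lexical
	analyzer, which must now assign through a union tag.

```c
struct mumble;                      /* left incomplete, as in the example above */

/* The effect of the %union declaration: YYSTYPE becomes a union type. */
typedef union {
    int inttag;
    float floattag;
    struct mumble *ptrtag;
} YYSTYPE;

YYSTYPE yylval;                     /* a lexical analyzer in its own file would
                                       see this as extern, via y.tab.h */

#define NUMBER 257                  /* hypothetical token code */

/* Hypothetical piece of a lexical analyzer: convert a run of digits and
   hand the value to the parser through the integer member of yylval. */
int lex_number(const char *text)
{
    int n = 0;

    while (*text >= '0' && *text <= '9')
        n = n * 10 + (*text++ - '0');
    yylval.inttag = n;              /* formerly simply: yylval = n; */
    return NUMBER;
}
```

	Here lex_number("123") returns NUMBER and leaves 123 in yylval.inttag.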

***	The %token, %left, %right, and %nonassoc declarations now
	accept a union tag, enclosed in angle brackets (<...>), immediately
	after the keyword.  All tokens mentioned in that declaration are
	taken to have the appropriate type.

***	There is a new keyword, %type, also followed by a union tag
	in angle brackets, which may be used in the declarations section to
	declare nonterminal symbols to have a particular type.

	In both cases, whenever a $$ or $n is encountered in an action,
	the appropriate union tag is supplied by Yacc.  Once any type is
	declared, it is an error to use a $$ or $n whose type is unknown.
	It is also illegal for a grammar rule whose LHS has a type to
	have no action when the default action { $$ = $1; } would be
	inapplicable because $1 has a different type.
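
	The effect of the supplied tags can be pictured with a small C
	sketch.  Everything in it is assumed for illustration: the
	declarations shown in the comment, the routine names, and the array
	standing in for the value stack.  The code Yacc really generates
	will differ.

```c
typedef union {
    int inttag;
    float floattag;
} YYSTYPE;

/* Suppose the yacc source contained (hypothetically):
 *
 *      %token <inttag> NUMBER
 *      %type  <inttag> expr
 *      %%
 *      expr : expr '+' NUMBER   { $$ = $1 + $3; } ;
 *
 * Both expr and NUMBER carry the tag inttag, so each $ reference in the
 * action is read as the inttag member of the corresponding value.
 * vals[0], vals[1], vals[2] stand for $1, $2, $3. */
YYSTYPE reduce_add(const YYSTYPE vals[3])
{
    YYSTYPE result;                                     /* stands for $$ */

    result.inttag = vals[0].inttag + vals[2].inttag;    /* $$ = $1 + $3 */
    return result;
}

/* Convenience wrapper: build the three values and run the action. */
int add_via_union(int left, int right)
{
    YYSTYPE vals[3];

    vals[0].inttag = left;
    vals[1].inttag = '+';           /* value of the '+' token, unused by the action */
    vals[2].inttag = right;
    return reduce_add(vals).inttag;
}
```

	For instance, add_via_union(2, 3) yields 5.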

***	There are occasional times when the type of something is
	not known (for example, when an action within a rule returns a
	value).  In this case, the $$ and $n syntax is extended
	to permit the declaration of the type: the syntax is
		$<tag>$
	and
		$<tag>n
	respectively.  This rather strange syntax is necessitated by the
	need to distinguish the <> surrounding the tag from the < and >
	operators of C in the action.  It is anticipated that the usage
	will be rare.

***	As always, report gripes, bugs, suggestions to SCJ ***

12/01/76
A newer version of Yacc has been installed which copies the actions directly
into the parser, rather than gathering them into a separate routine.
The advantages include
1.  It's faster
2.  You can return a value from yyparse (and stop parsing...) by
    saying `return(x);' in an action
3.  There are macros which simulate various interesting parsing
    actions:
      YYERROR  causes the parser to behave as if a syntax
               error had been encountered (i.e., do error recovery)
      YYACCEPT causes a return from yyparse with a value of 0
      YYABORT  causes a return from yyparse with a value of 1
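
Because the actions now sit inside the parsing routine itself, a return
statement in an action leaves that routine directly.  The sketch below
imitates this with a stand-in routine, toy_parse, which is hypothetical
and is not a parser; the two macro definitions are assumptions chosen to
match the return values listed above, and YYERROR is not modeled.

```c
/* Assumed definitions, consistent with the return values stated above. */
#define YYACCEPT return(0)
#define YYABORT  return(1)

/* Hypothetical stand-in for yyparse.  Each case plays the part of an
   action copied into the body of the parsing routine. */
int toy_parse(const char *input)
{
    for (; *input != '\0'; input++) {
        switch (*input) {
        case '.':
            YYACCEPT;               /* leave the routine with value 0 */
        case '!':
            YYABORT;                /* leave the routine with value 1 */
        case '#':
            return(42);             /* `return(x);' written in an action */
        default:
            break;                  /* any other character: keep going */
        }
    }
    return(0);                      /* end of input */
}
```

Here toy_parse("ab.") returns 0, toy_parse("a!b.") returns 1, and
toy_parse("a#") returns 42.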

The repositioning of the actions may cause scope problems
for some people who include lexical analyzers in funny places.
This can probably be avoided by using another
new feature: the `-d' option.
Invoking Yacc with the -d option causes the #defines
generated by Yacc to be written out onto a file
called "y.tab.h".  This can then be included as desired
in lexical analyzers, etc.

11/28/76
A new version of Yacc has been installed which permits actions within
rules.  For such actions, $$ and $1, $2, etc. continue to have their
usual meanings.  An error message is issued if any $n refers to
a value lying to the right of the action in the rule.

These internal actions are assumed to return a value, which is accessed
through the $n mechanism.

In the y.output file, the actions are referred to by created nonterminal
names of the form $$nnn.

All actions within rules are assumed to be distinct.  If some actions
are the same, Yacc might report reduce/reduce conflicts which could
be resolved by explicitly identifying identical actions; does anyone
have a good idea for a syntax to do this?

In the new Yacc, the = sign may now be omitted in action constructions
of the form    ={  ...   }