5/18/78
A new version of Yacc has been installed which contains some new
features relating to error recovery, detection of funny conditions in the
grammar, and strong typing. Existing grammars should continue to work,
with the possible exception of somewhat better error recovery behavior.
More details follow:

*** Ratfor and EFL Yacc are dead. Long live C!

*** The y.tab.c file now uses the # line feature to reflect
    most error conditions in actions, etc., back to the yacc source
    file, rather than the y.tab.c file. As always with such features,
    lookahead may cause the line number to be one too large
    occasionally.

*** The error recovery algorithm has been changed to cause the
    parser never to reduce on a state where there is a shift
    on the special token `error'. This has the effect of causing
    the error recovery action to take place somewhat closer to the
    location of the error than previously. It does not affect the
    behavior of the parser in the absence of errors. The parse
    tables may be 1-2% larger as a result of this change.

*** Yacc now detects the existence of nonterminals in the grammar
    which can never derive any strings of tokens (even the empty string).
    The simplest example is the grammar:
        %%
        s : s 'a' ;
    Here, one must reduce `s' in order to reduce `s': the
    parser would always report error. If such nonterminals are
    present, Yacc reports all such, then terminates.

*** There is a new reserved word, %start. When used in the declarations
    section, it may be used to declare the start symbol of the grammar.
    If %start does not appear, the start symbol is, as at present, the
    first nonterminal symbol encountered.

*** Yacc produced parsers are notorious for producing many many
    comments from lint. The problem is the value stack of the
    parser, which typically may contain integers, pointers, and
    possibly even floating point, etc., values. The lack
    of tight specification of this stack leads to potential
    nonportability, and considerable loss of the diagnostic power
    of lint. Thus, some new features have been added which make use
    of the new structure and union facilities of C. In effect,
    the user of Yacc may `honestly' declare the value stack, as
    well as the lexical interface variable, yylval, to be unions
    of all the types desired. Yacc will keep track of the types
    declared for all terminals and nonterminals, and automatically
    insert the appropriate union tag for all constructions such
    as $1, $$, etc. It is up to the user to supply the appropriate
    union declaration, and to declare the type of all the terminal
    and nonterminal symbols which will have values. If the type
    declaration feature is used at all, it must be used correctly;
    if it is not used, the default values are integers, as at present.
    The new type declaration features are described below:

*** There is a new keyword, %union. A construction such as
        %union {
            int inttag;
            float floattag;
            struct mumble *ptrtag;
        }
    can be used, in the declarations section, to declare
    the type of the yacc stack.
    The declaration is
    effectively copied to the y.tab.c file, and, if the -d
    option is present, to the y.tab.h file as well. The
    declaration is used to declare the typedef YYSTYPE, which is the
    type of the value stack. If the -d option is present,
    the declaration
        extern YYSTYPE yylval;
    is also placed onto the y.tab.h file. Note that the lexical
    analyzer must be changed to use the appropriate union tag when
    assigning values. It is not necessary that the %union
    mechanism be used, as long as there is a union type YYSTYPE
    defined in the declarations section.

*** The %token, %left, %right, and %nonassoc declarations now
    accept a union tag, enclosed in angle brackets (<...>), immediately
    after the keyword. All tokens mentioned in that declaration are
    taken to have the appropriate type.

*** There is a new keyword, %type, also followed by a union tag
    in angle brackets, which may be used in the declarations section to
    declare nonterminal symbols to have a particular type.

    In both cases, whenever a $$ or $n is encountered in an action,
    the appropriate union tag is supplied by Yacc. Once any type is
    declared, it is an error to use a $$ or $n whose type is unknown.
    It is also illegal to have a grammar rule whose LHS has a type,
    but the rule has no action and the default action { $$ = $1; }
    would be inapplicable because $1 had a different type.

*** There are occasional times when the type of something is
    not known (for example, when an action within a rule returns a
    value).
    In this case, the $$ and $n syntax is extended
    to permit the declaration of the type: the syntax is
        $<tag>$
    and
        $<tag>n
    respectively. This rather strange syntax is necessitated by the
    need to distinguish the <> surrounding the tag from the < and >
    operators of C in the action. It is anticipated that the usage
    will be rare.

*** As always, report gripes, bugs, suggestions to SCJ ***

12/01/76
A newer version of Yacc has been installed which copies the actions directly
into the parser, rather than gathering them into a separate routine.
The advantages include
1. It's faster
2. You can return a value from yyparse (and stop parsing...) by
   saying `return(x);' in an action
3. There are macros which simulate various interesting parsing
   actions:
       YYERROR   causes the parser to behave as if a syntax
                 error had been encountered (i.e., do error recovery)
       YYACCEPT  causes a return from yyparse with a value of 0
       YYABORT   causes a return from yyparse with a value of 1

The repositioning of the actions may cause scope problems
for some people who include lexical analyzers in funny places.
This can probably be avoided by using another
new feature: the `-d' option.
Invoking Yacc with the -d option causes the #defines
generated by Yacc to be written out onto a file
called "y.tab.h". This can then be included as desired
in lexical analyzers, etc.

11/28/76
A new version of Yacc has been installed which permits actions within
rules. For such actions, $$ and $1, $2, etc.
continue to have their
usual meanings. An error message is returned if any $n refers to
a value lying to the right of the action in the rule.

These internal actions are assumed to return a value, which is accessed
through the $n mechanism.

In the y.output file, the actions are referred to by created nonterminal
names of the form $$nnn.

All actions within rules are assumed to be distinct. If some actions
are the same, Yacc might report reduce/reduce conflicts which could
be resolved by explicitly identifying identical actions; does anyone
have a good idea for a syntax to do this?

In the new Yacc, the = sign may now be omitted in action constructions
of the form ={ ... }
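
As an illustration of the 11/28/76 notes above, here is a minimal sketch of
an action within a rule. It is not taken from the Yacc distribution: the
token name NUM, the rule names, and the toy lexical analyzer are all made up
for the example, and the value stack is left as the default integer type.
The action in the middle of `sum' occupies position 2 in the rule, so the
value it returns is read as $2 and the second NUM becomes $4. The actions
are also written without the = sign.

```yacc
%{
/* Sketch only: NUM, line, sum and the lexer below are hypothetical. */
#include <stdio.h>
#include <ctype.h>
int yylex(void);
void yyerror(const char *s);
%}

%token NUM

%%
line : sum '\n'
            { printf("%d\n", $1); }
     ;

/* Inside the embedded action, $$ is the value returned by that action
   and $1 is the NUM to its left.  It may not refer to $3 or $4, which
   lie to its right. */
sum  : NUM { $$ = $1 * 10; } '+' NUM
            { $$ = $2 + $4; }
     ;
%%

/* Toy lexical analyzer: decimal integers become NUM, blanks are skipped,
   every other character is returned as itself. */
int yylex(void)
{
    int c = getchar();

    while (c == ' ' || c == '\t')
        c = getchar();
    if (c == EOF)
        return 0;
    if (isdigit(c)) {
        yylval = c - '0';
        while (isdigit(c = getchar()))
            yylval = yylval * 10 + (c - '0');
        ungetc(c, stdin);
        return NUM;
    }
    return c;
}

void yyerror(const char *s)
{
    fprintf(stderr, "%s\n", s);
}

int main(void)
{
    return yyparse();
}
```

Assuming the file is run through Yacc and the resulting y.tab.c is compiled,
the input line `3+4' should print 34: the embedded action returns 30, and the
final action adds the second number to it. In y.output the embedded action
should show up under a created nonterminal name of the $$nnn form described
above.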