Lines Matching +full:start +full:- +full:up

4 flex, lex \- fast lexical analyzer generator
7 .B [\-bcdfhilnpstvwBFILTV78+? \-C[aefFmr] \-ooutput \-Pprefix \-Sskeleton]
8 .B [\-\-help \-\-version]
13 a tool for generating programs that perform pattern-matching on text.
37 Start Conditions
39 managing "mini-scanners"
45 End-of-file Rules
58 flex command-line options, and the "%option"
107 .B \-ll
178 /* scanner for a toy Pascal-like language */
185 DIGIT [0-9]
186 ID [a-z][a-z0-9]*
206 "+"|"-"|"*"|"/" printf( "An operator: %s\\n", yytext );
208 "{"[^}\\n]*"}" /* eat up one-line comments */
210 [ \\t\\n]+ /* eat up whitespace */
220 ++argv, --argc; /* skip over program name */
258 .I start conditions,
268 followed by zero or more letters, digits, '_', or '-' (dash).
269 The definition is taken to begin at the first non-white-space character
276 DIGIT [0-9]
277 ID [a-z][a-z0-9]*
283 followed by zero-or-more letters-or-digits.
293 ([0-9])+"."([0-9])*
296 and matches one-or-more digits followed by a '.' followed
297 by zero-or-more digits.
339 but its meaning is not well-defined and it may well cause compile-time
346 beginning with "/*") is also copied verbatim to the output up
358 [abj-oZ] a "character class" with a range in it; matches
361 [^A-Z] a "negated character class", i.e., any character
364 [^A-Z\\n] any character EXCEPT an uppercase letter or
377 then the ANSI-C interpretation of \\x.
418 <s>r an r, but only in start condition s (see
419 below for discussion of start conditions)
421 same, but in any of start conditions s1,
423 <*>r an r in any start condition, even an exclusive one.
426 <<EOF>> an end-of-file
428 an end-of-file when in start condition s1 or s2
433 operators, '-', ']', and, at the beginning of the class, '^'.
457 the string "ba" followed by zero-or-more r's.
458 To match "foo" or zero-or-more "bar"'s, use:
464 and to match zero-or-more "foo"'s-or-"bar"'s:
497 returns true - i.e., any alphabetic or numeric.
509 [[:alpha:]0-9]
510 [a-zA-Z0-9]
513 If your scanner is case-insensitive (the
514 .B \-i
523 .IP -
524 A negated character class such as the example "[^A-Z]"
529 (e.g., "[^A-Z\\n]").
535 .IP -
538 The start condition, '^', and "<<EOF>>" patterns
561 If what's wanted is a "foo" or a bar-followed-by-a-newline, the following
570 bar-at-the-beginning-of-a-line.
628 .B \-l
696 results in too much text being pushed back; instead, a run-time error results.
707 The pattern ends at the first non-escaped
739 and will consider the action to be all the text up to the next
759 characters to its end--these will overwrite later characters in the
775 .IP -
778 .IP -
780 followed by the name of a start condition places the scanner in the
781 corresponding start condition (see below).
782 .IP -
791 set up appropriately.
824 .|\\n /* eat up any unmatched character */
839 .I \-Cf
841 .I \-CF
851 .IP -
859 For example, given the input "mega-kludge"
860 the following will write "mega-mega-kludge" to the output:
864 mega- ECHO; yymore();
868 First "mega-" is matched and echoed to the output.
870 is matched, but the previous "mega-" is still hanging around at the
875 for the "kludge" rule will actually write "mega-kludge".
892 .IP -
912 [a-z]+ ECHO;
927 .IP -
942 for ( i = yyleng - 1; i >= 0; --i )
953 of the input stream, pushing back strings must be done back-to-front.
976 to attempt to mark the input stream with an end-of-file.
977 .IP -
981 the following is one way to eat up C comments:
992 ; /* eat up text of comment */
1021 .IP -
1032 .IP -
1039 is also called when an end-of-file is encountered.
1075 K&R-style/non-prototyped function declaration, you must terminate
1076 the definition with a semi-colon (;).
1084 an end-of-file (at which point it returns the value 0) or
1089 If the scanner reaches an end-of-file, subsequent calls are undefined
1099 pointer (which can be nil, if you've set up
1125 reset the start condition to
1127 (see Start Conditions, below).
1137 block-reads rather than simple
1145 Its action is to place up to
1155 global file-pointer "yyin".
1173 When the scanner receives an end-of-file indication from YY_INPUT,
1180 function has gone ahead and set up
1184 true (non-zero), then the scanner terminates, returning 0 to its
1186 Note that in either case, the start condition remains unchanged;
1199 .B \-ll
1202 Three routines are available for scanning from in-memory buffers rather
1217 .SH START CONDITIONS
1222 the scanner is in the start condition named "sc".
1226 <STRING>[^"]* { /* eat up the string body ... */
1231 will be active only when the scanner is in the "STRING" start
1240 will be active only when the current start condition is
1243 Start conditions
1252 start conditions, the latter
1254 start conditions.
1255 A start condition is activated using the
1260 action is executed, rules with the given start
1262 rules with other start conditions will be inactive.
1263 If the start condition is
1265 then rules with no start conditions at all will also be active.
1270 rules qualified with the start condition will be active.
1271 A set of rules contingent on the same exclusive start condition
1276 exclusive start conditions make it easy to specify "mini-scanners"
1280 If the distinction between inclusive and exclusive start conditions
1310 when in start condition
1323 start condition is an
1326 start condition.
1328 Also note that the special start-condition specifier
1330 matches every start condition.
1345 any unmatched character) remains active in start conditions.
1356 no start conditions are active.
1358 referred to as the start-condition "INITIAL", so
1362 (The parentheses around the start condition name are not required but
1369 the scanner to enter the "SPECIAL" start condition whenever
1388 To illustrate the uses of start conditions,
1394 "expect-floats"
1395 it will treat it as a single token, the floating-point number
1405 expect-floats BEGIN(expect);
1407 <expect>[0-9]+"."[0-9]+ {
1413 * we need another "expect-number"
1420 [0-9]+ {
1439 <comment>"*"+[^*/\\n]* /* eat up '*'s not followed by '/'s */
1447 a high-speed scanner try to match as much as possible in each rule, as
1450 Note that start-condition names are really integer values and
1474 <comment>"*"+[^*/\\n]* /* eat up '*'s not followed by '/'s */
1479 Furthermore, you can access the current start condition using
1480 the integer-valued
1498 Note that start conditions do not have their own name-space; %s's and %x's
1501 Finally, here's an example of how to match C-style quoted strings using
1502 exclusive start conditions, including expanded escape sequences (but
1515 <str>\\" { /* saw closing quote - all done */
1524 /* error - unterminated string constant */
1528 <str>\\\\[0-7]{1,3} {
1535 /* error, constant is out-of-bounds */
1540 <str>\\\\[0-9]+ {
1541 /* generate error - bad escape sequence; something
1563 Often, such as in some of the examples above, you wind up writing a
1564 whole bunch of rules all preceded by the same start condition(s).
1566 start condition
1568 A start condition scope is begun with:
1576 is a list of one or more start conditions.
1577 Inside the start condition
1604 Start condition scopes may be nested.
1606 Three routines are available for manipulating stacks of start conditions:
1609 pushes the current start condition onto the top of the start condition
1614 (recall that start condition names are also integers).
1623 The start condition stack grows dynamically and so has no built-in
1627 To use start condition stacks, your scanner must include a
1701 may be used by yywrap() to set things up for continued scanning, instead
1711 change the start condition.
1753 /* the "incl" state is used for picking up the name
1767 [a-z]+ ECHO;
1768 [^a-z\\n]*\\n? ECHO;
1793 if ( --include_stack_ptr < 0 )
1807 Three routines are available for setting up input buffers for
1808 scanning in-memory strings instead of files.
1819 will start scanning the string.
1822 scans a NUL-terminated string.
1853 .B base[size-2],
1856 If you fail to set up
1868 .SH END-OF-FILE RULES
1870 actions which are to be taken when an end-of-file is
1871 encountered and yywrap() returns non-zero (i.e., indicates
1875 .IP -
1882 .IP -
1886 .IP -
1890 .IP -
1896 patterns; they may only be qualified with a list of start
1901 start conditions which do not already have <<EOF>> actions.
1903 specify an <<EOF>> rule for only the initial start condition, use
1937 it could be #define'd to call a routine to convert yytext to lower-case.
1957 .B \-s),
1982 .B \-I
1984 A non-zero value
1986 value as non-interactive.
1989 .B %option always-interactive
1991 .B %option never-interactive
2002 A non-zero macro argument makes rules anchored with
2028 .IP -
2065 .B \-+
2067 .IP -
2070 .IP -
2082 Once scanning terminates because an end-of-file
2086 .IP -
2091 The switch-over to the new file is immediate
2092 (any previously buffered-up input is lost).
2099 .IP -
2105 .IP -
2110 .IP -
2112 returns an integer value corresponding to the current start
2116 to return to that start condition.
2122 parser-generator.
2136 .B \-d
2159 [0-9]+ yylval = atoi( yytext ); return TOK_NUMBER;
2166 .B \-b, \-\-backup
2167 Generate backing-up information to
2169 This is a list of scanner states which require backing up
2172 can remove backing-up states.
2175 backing-up states are eliminated and
2176 .B \-Cf
2178 .B \-CF
2180 .B \-p
2186 .B \-c
2187 is a do-nothing, deprecated option included for POSIX compliance.
2189 .B \-d, \-\-debug
2195 is non-zero (which is the default),
2201 --accepting rule at line 53 ("the matched text")
2206 Messages are also generated when the scanner backs up, accepts the
2209 or reaches an end-of-file.
2211 .B \-f, \-\-full
2217 .B \-Cfr
2220 .B \-h, \-\-help
2226 .B \-?
2228 .B \-\-help
2230 .B \-h.
2232 .B \-i, \-\-case\-insensitive
2236 .I case-insensitive
2246 .B \-l, \-\-lex\-compat
2255 .B \-+, \-f, \-F, \-Cf,
2257 .B \-CF
2266 .B \-n
2267 is another do-nothing, deprecated option included only for
2270 .B \-p, \-\-perf\-report
2289 .B \-I
2292 .B \-s, \-\-no\-default
2303 .B \-t, \-\-stdout
2310 .B \-v, \-\-verbose
2321 .B \-V),
2325 .B \-w, \-\-nowarn
2328 .B \-B, \-\-batch
2336 .B \-I
2339 .B \-B
2349 .B \-Cf
2351 .B \-CF
2353 .B \-B
2356 .B \-F, \-\-fast
2362 .B (\-f),
2366 and a catch-all, "identifier" rule, such as in the set:
2373 [a-z]+ return TOK_ID;
2380 .B \-F.
2383 .B \-CFr
2386 .B \-+.
2388 .B \-I, \-\-interactive
2410 .B \-Cf
2412 .B \-CF
2413 table-compression options (see below).
2415 for high-performance you should be using one of these options, so if you
2418 assumes you'd rather trade off a bit of run-time performance for intuitive
2423 .B \-I
2425 .B \-Cf
2427 .B \-CF.
2434 .B \-I
2437 .B %option always-interactive
2443 .B \-B
2446 .B \-L, \-\-noline
2462 fault -- you should report these sorts of errors to the email address
2465 .B \-T, \-\-trace
2474 the form of the input and the resultant non-deterministic and deterministic
2479 .B \-V, \-\-version
2483 .B \-\-version
2485 .B \-V.
2487 .B \-7, \-\-7bit
2490 to generate a 7-bit scanner, i.e., one which can only recognize 7-bit
2493 .B \-7
2494 is that the scanner's tables can be up to half the size of those generated
2496 .B \-8
2499 or crash if their input contains an 8-bit character.
2502 .B \-Cf
2504 .B \-CF
2506 .B \-7
2510 default behavior is to generate an 8-bit scanner unless you use the
2511 .B \-Cf
2513 .B \-CF,
2516 defaults to generating 7-bit scanners unless your site was always
2517 configured to generate 8-bit scanners (as will often be the case
2518 with non-USA sites).
2519 You can tell whether flex generated a 7-bit
2520 or an 8-bit scanner by inspecting the flag summary in the
2521 .B \-v
2525 .B \-Cfe
2527 .B \-CFe
2529 discussed below), flex still defaults to generating an 8-bit
2530 scanner, since usually with these compression options full 8-bit tables
2531 are not much more expensive than 7-bit tables.
2533 .B \-8, \-\-8bit
2536 to generate an 8-bit scanner, i.e., one which can recognize 8-bit
2539 .B \-Cf
2541 .B \-CF,
2542 as otherwise flex defaults to generating an 8-bit scanner anyway.
2545 .B \-7
2546 above for flex's default behavior and the tradeoffs between 7-bit
2547 and 8-bit scanners.
2549 .B \-+, \-\-c++
2555 .B \-C[aefFmr]
2556 controls the degree of table compression and, more generally, trade-offs
2559 .B \-Ca, \-\-align
2565 than with smaller-sized units such as shortwords.
2569 .B \-Ce, \-\-ecs
2579 "[0-9]" then the digits '0', '1', ..., '9' will all be put
2583 a factor of 2-5) and are pretty cheap performance-wise (one array
2584 look-up per character scanned).
2586 .B \-Cf
2589 scanner tables should be generated -
2595 .B \-CF
2598 .B \-F
2602 .B \-+.
2604 .B \-Cm, \-\-meta\-ecs
2608 .I meta-equivalence classes,
2611 Meta-equivalence
2614 array look-up per character scanned).
2616 .B \-Cr, \-\-read
2628 .B \-Cf
2630 .B \-CF.
2632 .B \-Cr
2638 .B \-Cr
2644 .B \-C
2646 equivalence classes nor meta-equivalence classes should be used.
2649 .B \-Cf
2651 .B \-CF
2653 .B \-Cm
2654 do not make sense together - there is no opportunity for meta-equivalence
2660 .B \-Cem,
2664 and meta-equivalence classes.
2667 faster-executing scanners at the cost of larger tables with
2672 -Cem
2673 -Cm
2674 -Ce
2675 -C
2676 -C{f,F}e
2677 -C{f,F}
2678 -C{f,F}a
2687 .B \-Cfe
2691 .B \-ooutput, \-\-outputfile=FILE
2697 .B \-o
2699 .B \-t
2705 .B \-L
2709 .B \-Pprefix, \-\-prefix=STRING
2714 for all globally-visible variable and function names to instead be
2717 .B \-Pfoo
2763 provide your own (appropriately-named) version of the routine for your
2767 .B \-ll
2770 .B \-Sskeleton_file, \-\-skel=FILE
2778 .B \-X, \-\-posix\-compat
2781 .B \-\-yylineno
2784 .B \-\-yyclass=NAME
2787 .B \-\-header\-file=FILE
2790 .B \-\-tables\-file[=FILE]
2793 .B \-Dmacro[=defn]
2796 .B \-R, \-\-reentrant
2799 .B \-\-bison\-bridge
2802 .B \-\-bison\-locations
2805 .B \-\-stdinit
2808 .B \-\-noansi\-definitions old\-style function definitions.
2810 .B \-\-noansi\-prototypes
2813 .B \-\-nounistd
2816 .B \-\-noFUNCTION
2821 scanner specification itself, rather than from the flex command-line.
2835 7bit -7 option
2836 8bit -8 option
2837 align -Ca option
2838 backup -b option
2839 batch -B option
2840 c++ -+ option
2843 case-sensitive opposite of -i (default)
2845 case-insensitive or
2846 caseless -i option
2848 debug -d option
2849 default opposite of -s option
2850 ecs -Ce option
2851 fast -F option
2852 full -f option
2853 interactive -I option
2854 lex-compat -l option
2855 meta-ecs -Cm option
2856 perf-report -p option
2857 read -Cr option
2858 stdout -t option
2859 verbose -v option
2860 warn opposite of -w option
2861 (use "%option nowarn" for -w)
2871 .B always-interactive
2891 .B never-interactive
2896 .B always-interactive.
2899 enables the use of start condition stacks (see Start Conditions above).
2921 to be compile-time constant.
2930 .B %option lex-compat.
2937 upon an end-of-file, but simply assume that there are no more
2962 Three options take string-delimited values, offset with '=':
2969 .B \-oABC,
2977 .B \-PXYZ.
2985 .B \-+
3001 member function that emits a run-time error (by invoking
3027 is that it generate high-performance scanners.
3031 .B \-C
3041 pattern sets that require backing up
3044 %option always-interactive
3046 '^' beginning-of-line operator
3057 is a quite-cheap macro; so if just putting back some excess text you
3065 Getting rid of backing up is messy and often may be an enormous
3068 .B \-b
3083 State #6 is non-accepting -
3086 out-transitions: [ o ]
3087 jam-transitions: EOF [ \\001-n p-\\177 ]
3089 State #8 is non-accepting -
3092 out-transitions: [ a ]
3093 jam-transitions: EOF [ \\001-` b-\\177 ]
3095 State #9 is non-accepting -
3098 out-transitions: [ r ]
3099 jam-transitions: EOF [ \\001-q s-\\177 ]
3101 Compressed tables always back up.
3111 something other than an 'o', it will have to back up to find
3117 have to back up to simply match the 'f' (by the default rule).
3122 than an 'a', the scanner will have to back up to accept "foo".
3127 all the trouble of removing backing up from the rules unless
3129 .B \-Cf
3131 .B \-CF,
3134 The way to remove the backing up is to add "error" rules:
3150 Eliminating backing up among a list of keywords can also be
3151 done using a "catch-all" rule:
3158 [a-z]+ return TOK_ID;
3163 Backing up messages tend to cascade.
3167 only takes a dozen or so rules to eliminate the backing up (though
3172 feature will be to automatically add rules to eliminate backing up).
3175 backing up only if you eliminate
3177 instance of backing up.
3218 does not often have to go through the additional work of setting up
3237 This could be sped up by writing it as:
3265 A final example in speeding up a scanner: suppose you want to scan
3283 To eliminate the back-tracking, introduce a catch-all rule:
3294 [a-z]+ |
3312 [a-z]+\\n |
3316 One has to be careful here, as we have now reintroduced backing up
3323 can't figure this out, and it will plan for possibly needing to back up
3329 To eliminate the possibility of backing up,
3332 how it's classified, we can introduce one more catch-all rule, this
3344 [a-z]+\\n |
3345 [a-z]+ |
3350 .B \-Cf,
3394 .B \-+
3515 (if non-nil)
3538 reads up to
3543 To indicate end-of-input, return 0 characters.
3545 .B \-B
3547 .B \-I
3563 which, while NUL-terminated, may also contain "internal" NUL's if
3583 .B \-P
3605 alpha [A-Za-z]
3606 dig [0-9]
3607 name ({alpha}|{dig}|\\$)({alpha}|{dig}|[_.\\-/$])*
3608 num1 [-+]?{dig}+\\.?([eE][-+]?{dig}+)?
3609 num2 [-+]?{dig}*\\.{dig}+([eE][-+]?{dig}+)?
3647 while(lexer->yylex() != 0)
3653 .B \-P
3707 .B \-l
3714 .B \-l
3721 .IP -
3727 .B \-l
3733 should be maintained on a per-buffer basis, rather than a per-scanner
3738 .IP -
3745 encounters an end-of-file the normal
3748 A ``real'' end-of-file is returned by
3765 .IP -
3770 .IP -
3776 an interrupt handler which long-jumps out of the scanner, and
3781 fatal flex scanner internal error--end of buffer missed
3798 .IP -
3803 macro is done to the file-pointer
3810 .IP -
3812 does not support exclusive start conditions (%x), though they
3814 .IP -
3821 NAME [A-Z][A-Z0-9]*
3828 is expanded the rule is equivalent to "foo[A-Z][A-Z0-9]*?"
3830 "[A-Z0-9]*".
3834 "foo([A-Z][A-Z0-9]*)?" and so the string "foo" will match.
3853 .B \-l
3859 .IP -
3873 .IP -
3880 .IP -
3891 .B \-l
3893 .IP -
3904 .IP -
3915 .IP -
3916 The special table-size declarations such as
3925 .IP -
3951 start condition scopes
3952 start condition stacks
3953 interactive/non-interactive scanners
3974 semi-colons, while with
3998 an identifier "catch-all" rule:
4001 [a-z]+ got_identifier();
4010 .B \-s
4013 means that it is possible (perhaps only in a particular start condition)
4017 .B \-s
4022 .I yymore_used_but_not_detected undefined -
4041 .I flex scanner jammed -
4043 .B \-s
4048 .I token too large, exceeds YYLMAX -
4061 .I scanner requires \-8 flag to
4062 .I use the character 'x' -
4063 Your scanner specification includes recognizing the 8-bit character
4065 and you did not specify the \-8 flag, and your scanner defaulted to 7-bit
4067 .B \-Cf
4069 .B \-CF
4072 .B \-7
4075 .I flex scanner push-back overflow -
4079 both the pushed-back text and the current token in
4085 input buffer overflow, can't enlarge buffer because scanner uses REJECT -
4093 fatal flex scanner internal error--end of buffer missed -
4094 This can occur in a scanner which is reentered after a long-jump
4104 .I too many start conditions in <> construct! -
4105 you listed more start conditions in a <> construct than exist (so
4109 .B \-ll
4119 .B \-+.
4132 backing-up information for
4133 .B \-b
4148 For some trailing context rules, parts which are actually fixed-length are
4151 considered variable-length.
4173 .B \-l
4176 Pattern-matching of NUL's is substantially slower than matching other
4182 Due to both buffering of input and read-ahead, you cannot intermix
4193 .B \-v
4203 .B \-f
4205 .B \-F
4220 .I LEX \- Lexical Analyzer Generator
4224 Addison-Wesley (1986).
4225 Describes the pattern-matching techniques used by
4239 beta-testers, feedbackers, and contributors, especially Francois Pinard,
4242 Stan Adermann, Terry Allen, David Barker-Plummer, John Basrai,
4273 Larry Schwimmer, Alex Siegel, Eckehard Stolz, Jan-Erik Strömquist,
4279 mail-archiving skills but whose contributions are appreciated all the
4287 Thanks to Esmond Pitt and Earle Horton for 8-bit character support; to