This directory contains some examples illustrating techniques for extracting
high performance from flex scanners.  Each program implements a simplified
version of the Unix "wc" tool: read text from stdin and print the number of
characters, words, and lines present in the text.  All programs were compiled
using gcc (version unavailable, sorry) with the -O flag, and run on a
SPARCstation 1+.  The input used was a PostScript file, mainly containing
figures, with the following "wc" counts:

	lines  words  characters
	214217 635954 2592172


The basic principles illustrated by these programs are:

	- match as much text with each rule as possible
	- adding rules does not slow you down!
	- avoid backing up

and the big caveat that comes with them is:

	- you buy performance with decreased maintainability; make
	  sure you really need it before applying the above techniques.

See the "Performance Considerations" section of flexdoc for more
details regarding these principles.


The different versions of "wc":

	mywc.c
		a simple but fairly efficient C version

	wc1.l	a naive flex "wc" implementation

	wc2.l	somewhat faster; adds rules to match multiple tokens at once

	wc3.l	faster still; adds more rules to match longer runs of tokens

	wc4.l	fastest; still more rules added; hard to do much better
		using flex (or, I suspect, hand-coding)

	wc5.l	identical to wc3.l except one rule has been slightly
		shortened, introducing backing-up

Timing results (all times in user CPU seconds):

	program   time   notes
	-------   ----   -----
	wc1       16.4   default flex table compression (= -Cem)
	wc1        6.7   -Cf compression option
	/bin/wc    5.8   Sun's standard "wc" tool
	mywc       4.6   simple but better C implementation!
	wc2        4.6   as good as C implementation; built using -Cf
	wc3        3.8   -Cf
	wc4        3.3   -Cf
	wc5        5.7   -Cf; ouch, backing up is expensive