README
NAME
  testregex - regex(3) test harness

SYNOPSIS
  testregex [ options ]

DESCRIPTION
  testregex reads regex(3) test specifications, one per line, from the
  standard input and writes one output line for each failed test. A
  summary line is written after all tests are done. Each successful
  test is run again with REG_NOSUB. Unsupported features are noted
  before the first test, and tests requiring these features are
  silently ignored.

OPTIONS
  -c    catch signals and non-terminating calls
  -e    ignore error return mismatches
  -h    list help on standard error
  -n    do not repeat successful tests with regnexec()
  -o    ignore match[] overrun errors
  -p    ignore negative position mismatches
  -s    use stack instead of malloc
  -x    do not repeat successful tests with REG_NOSUB
  -v    list each test line
  -A    list failed test lines with actual answers
  -B    list all test lines with actual answers
  -F    list failed test lines
  -P    list passed test lines
  -S    output one summary line

INPUT FORMAT
  Input lines may be blank, a comment beginning with #, or a test
  specification. A specification is five fields separated by one
  or more tabs. NULL denotes the empty string and NIL denotes the
  0 pointer.
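  As an illustration of the line format, a minimal Python sketch of a
  reader might look like the following. The name parse_spec_line and
  the returned list are hypothetical and are not part of testregex,
  which is a C program. The sketch assumes that NULL and NIL are only
  decoded in fields 2 and 3 (the pattern and the subject string).

```python
import re

def parse_spec_line(line):
    """Split one input line into fields; return None for blank or comment lines."""
    line = line.rstrip("\n")
    if not line.strip() or line.startswith("#"):
        return None
    # Fields are separated by one or more tabs.
    fields = re.split(r"\t+", line)

    def decode(field):
        # NULL denotes the empty string, NIL the 0 pointer (None here).
        if field == "NULL":
            return ""
        if field == "NIL":
            return None
        return field

    # Assumption: only the pattern (index 1) and string (index 2) are decoded.
    return [decode(f) if i in (1, 2) else f for i, f in enumerate(fields)]
```

  A caller would skip lines for which the function returns None and
  treat a missing fifth element as "no comment".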

  Field 1: the regex(3) flags to apply, one character per REG_feature
  flag. The test is skipped if REG_feature is not supported by the
  implementation. If the first character is not [BEASKLP] then the
  specification is a global control line. One or more of [BEASKLP] may be
  specified; the test will be repeated for each mode.

    B   basic                     BRE (grep, ed, sed)
    E   REG_EXTENDED              ERE (egrep)
    A   REG_AUGMENTED             ARE (egrep with negation)
    S   REG_SHELL                 SRE (sh glob)
    K   REG_SHELL|REG_AUGMENTED   KRE (ksh glob)
    L   REG_LITERAL               LRE (fgrep)

    a   REG_LEFT|REG_RIGHT        implicit ^...$
    b   REG_NOTBOL                lhs does not match ^
    c   REG_COMMENT               ignore space and #...\n
    d   REG_SHELL_DOT             explicit leading . match
    e   REG_NOTEOL                rhs does not match $
    f   REG_MULTIPLE              multiple \n separated patterns
    g   FNM_LEADING_DIR           testfnmatch only -- match until /
    h   REG_MULTIREF              multiple digit backref
    i   REG_ICASE                 ignore case
    j   REG_SPAN                  . matches \n
    k   REG_ESCAPE                \ to escape [...] delimiter
    l   REG_LEFT                  implicit ^...
    m   REG_MINIMAL               minimal match
    n   REG_NEWLINE               explicit \n match
    o   REG_ENCLOSED              (|&) magic inside [@|&](...)
    p   REG_SHELL_PATH            explicit / match
    q   REG_DELIMITED             delimited pattern
    r   REG_RIGHT                 implicit ...$
    s   REG_SHELL_ESCAPED         \ not special
    t   REG_MUSTDELIM             all delimiters must be specified
    u   standard unspecified behavior -- errors not counted
    v   REG_CLASS_ESCAPE          \ special inside [...]
    w   REG_NOSUB                 no subexpression match array
    x   REG_LENIENT               let some errors slide
    y   REG_LEFT                  regexec() implicit ^...
    z   REG_NULL                  NULL subexpressions ok
    $   expand C \c escapes in fields 2 and 3
    /   field 2 is a regsubcomp() expression
    =   field 3 is a regdecomp() expression
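  The repetition of a test across modes can be sketched in Python as
  follows. The name expand_modes is hypothetical, and the sketch
  assumes that mode letters are collected wherever they appear in
  field 1; the actual harness may parse the field differently.

```python
MODES = "BEASKLP"

def expand_modes(field1):
    """Return one (mode, other_flags) pair per mode letter in field 1.

    An empty list means the line is a control line, because its first
    character is not one of the mode letters.
    """
    if not field1 or field1[0] not in MODES:
        return []
    modes = [c for c in field1 if c in MODES]
    others = "".join(c for c in field1 if c not in MODES)
    return [(mode, others) for mode in modes]
```

  For example, a field 1 value of BEi would yield two runs, one as a
  BRE and one as an ERE, each with the i (REG_ICASE) flag.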

  Field 1 control lines:

    C                       set LC_COLLATE and LC_CTYPE to locale in field 2

    ?test ...               output field 5 if passed and != EXPECTED, silent otherwise
    &test ...               output field 5 if current and previous passed
    |test ...               output field 5 if current passed and previous failed
    ; ...                   output field 2 if previous failed
    {test ...               skip if failed until }
    }                       end of skip

    : comment               comment copied as output NOTE
    :comment:test           :comment: ignored
    N[OTE] comment          comment copied as output NOTE
    T[EST] comment          comment

    number                  use number for nmatch (20 by default)

  Field 2: the regular expression pattern; SAME uses the pattern from
  the previous specification.

  Field 3: the string to match.

  Field 4: the test outcome. This is either one of the POSIX error
  codes (with REG_ omitted) or the match array, a list of (m,n)
  entries with m and n being first and last+1 positions in the
  field 3 string, or NULL if REG_NOSUB is in effect and success
  is expected. BADPAT is acceptable in place of any regcomp(3)
  error code. The match[] array is initialized to (-2,-2) before
  each test. All array elements from 0 to nmatch-1 must be specified
  in the outcome. Unspecified endpoints (offset -1) are denoted by ?.
  Unset endpoints (offset -2) are denoted by X. {x}(o:n) denotes a
  matched (?{...}) expression, where x is the text enclosed by {...},
  o is the expression ordinal counting from 1, and n is the length of
  the unmatched portion of the subject string. If x starts with a
  number then that is the return value of re_execf(), otherwise 0 is
  returned.
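  To make the outcome notation concrete, the following Python sketch
  decodes a plain match array. The name parse_outcome is hypothetical.
  The sketch handles only (m,n) entries; {x}(o:n) entries are not
  decoded, and any value that does not start with ( is returned
  unchanged as an error code or NULL.

```python
import re

def parse_outcome(field4):
    """Decode a field 4 match array into a list of (m, n) offset pairs."""
    if not field4.startswith("("):
        # An error code such as NOMATCH or BADPAT, or NULL for REG_NOSUB.
        return field4

    def offset(token):
        if token == "?":
            return -1   # unspecified endpoint
        if token == "X":
            return -2   # unset endpoint, the initial match[] value
        return int(token)

    pairs = re.findall(r"\(([^,()]+),([^,()]+)\)", field4)
    return [(offset(m), offset(n)) for m, n in pairs]
```

  With this decoding, (0,3)(?,?) describes an overall match of the
  first three characters and a subexpression that did not participate.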

  Field 5: optional comment appended to the report.

CAVEAT
  If a regex implementation misbehaves with memory then all bets are off.

CONTRIBUTORS
  Glenn Fowler     glenn.s.fowler@gmail.com      (ksh strmatch, regex extensions)
  David Korn       dgkorn@gmail.com              (ksh glob matcher)
  Doug McIlroy     mcilroy@dartmouth.edu         (ast regex/testre in C++)
  Tom Lord         lord@regexps.com              (rx tests)
  Henry Spencer    henry@zoo.toronto.edu         (original public regex)
  Andrew Hume      andrew@research.att.com       (gre tests)
  John Maddock     John_Maddock@compuserve.com   (regex++ tests)
  Philip Hazel     ph10@cam.ac.uk                (pcre tests)
  Ville Laurikari  vl@iki.fi                     (libtre tests)

WEB SITE
  http://www2.research.att.com/~astopen/testregex/
  AT&T Research regex(3) regression tests

  Glenn Fowler <glenn.s.fowler@gmail.com>
  AT&T Research - Florham Park NJ

  testregex.c 2004-05-31 is the latest source for the AT&T Research regression
  test harness for the X/Open regex pattern match interface.
  The source and test data posted here are license free.

  testregex can:
  - verify stability for a particular implementation in the face of source
    code and/or compilation environment changes
  - verify standard compliance for all implementations
  - provide a basis for discussions on what compliance means

150