.\"
.\" Sun Microsystems, Inc. gratefully acknowledges The Open Group for
.\" permission to reproduce portions of its copyrighted documentation.
.\" Original documentation from The Open Group can be obtained online at
.\" http://www.opengroup.org/bookstore/.
.\"
.\" The Institute of Electrical and Electronics Engineers and The Open
.\" Group, have given us permission to reprint portions of their
.\" documentation.
.\"
.\" In the following statement, the phrase ``this text'' refers to portions
.\" of the system documentation.
.\"
.\" Portions of this text are reprinted and reproduced in electronic form
.\" in the SunOS Reference Manual, from IEEE Std 1003.1, 2004 Edition,
.\" Standard for Information Technology -- Portable Operating System
.\" Interface (POSIX), The Open Group Base Specifications Issue 6,
.\" Copyright (C) 2001-2004 by the Institute of Electrical and Electronics
.\" Engineers, Inc and The Open Group. In the event of any discrepancy
.\" between these versions and the original IEEE and The Open Group
.\" Standard, the original IEEE and The Open Group Standard is the referee
.\" document. The original Standard can be obtained online at
.\" http://www.opengroup.org/unix/online.html.
.\"
.\" This notice shall appear on any product containing this material.
.\"
.\" The contents of this file are subject to the terms of the
.\" Common Development and Distribution License (the "License").
.\" You may not use this file except in compliance with the License.
.\"
.\" You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
.\" or http://www.opensolaris.org/os/licensing.
.\" See the License for the specific language governing permissions
.\" and limitations under the License.
.\"
.\" When distributing Covered Code, include this CDDL HEADER in each
.\" file and include the License file at usr/src/OPENSOLARIS.LICENSE.
.\" If applicable, add the following below this CDDL HEADER, with the
.\" fields enclosed by brackets "[]" replaced with your own identifying
.\" information: Portions Copyright [yyyy] [name of copyright owner]
.\"
.\"
.\" Copyright (c) 1992, X/Open Company Limited All Rights Reserved
.\" Portions Copyright (c) 1999, Sun Microsystems, Inc. All Rights Reserved
.\" Copyright 2017 Nexenta Systems, Inc.
.\"
.Dd August 14, 2020
.Dt REGEX 7
.Os
.Sh NAME
.Nm regex
.Nd internationalized basic and extended regular expression matching
.Sh DESCRIPTION
Regular Expressions
.Pq REs
provide a mechanism to select specific strings from a set of character strings.
The Internationalized Regular Expressions described below differ from the Simple
Regular Expressions described on the
.Xr regexp 7
manual page in the following ways:
.Bl -bullet
.It
both Basic and Extended Regular Expressions are supported
.It
the Internationalization features -- character class, equivalence class, and
multi-character collation -- are supported.
.El
.Pp
The Basic Regular Expression
.Pq BRE
notation and construction rules described in the
.Sx BASIC REGULAR EXPRESSIONS
section apply to most utilities supporting regular expressions.
Some utilities, instead, support the Extended Regular Expressions
.Pq ERE
described in the
.Sx EXTENDED REGULAR EXPRESSIONS
section; any exceptions for both cases are noted in the descriptions of the
specific utilities using regular expressions.
Both BREs and EREs are supported by the Regular Expression Matching interfaces
.Xr regcomp 3C
and
.Xr regexec 3C .
.Sh BASIC REGULAR EXPRESSIONS
.Ss BREs Matching a Single Character
A BRE ordinary character, a special character preceded by a backslash, or a
period matches a single character.
A bracket expression matches a single character or a single collating element.
See
.Sx RE Bracket Expression ,
below.
.Ss BRE Ordinary Characters
An ordinary character is a BRE that matches itself: any character in the
supported character set, except for the BRE special characters listed in
.Sx BRE Special Characters ,
below.
.Pp
The interpretation of an ordinary character preceded by a backslash
.Pq Qq \e
is undefined, except for:
.Bl -enum
.It
the characters
.Qq \&) ,
.Qq \&( ,
.Qq { ,
and
.Qq }
.It
the digits 1 to 9 inclusive
.Po see
.Sx BREs Matching Multiple Characters ,
below
.Pc
.It
a character inside a bracket expression.
.El
.Ss BRE Special Characters
A BRE special character has special properties in certain contexts.
Outside those contexts, or when preceded by a backslash, such a character will
be a BRE that matches the special character itself.
The BRE special characters and the contexts in which they have their special
meaning are:
.Bl -tag -width Ds
.It Sy \&. \&[ \&\e
The period, left-bracket, and backslash are special except when used in a
bracket expression
.Po see
.Sx RE Bracket Expression ,
below
.Pc .
An expression containing a
.Qq \&[
that is not preceded by a backslash and is not part of a bracket expression
produces undefined results.
.It Sy *
The asterisk is special except when used:
.Bl -bullet
.It
in a bracket expression
.It
as the first character of an entire BRE
.Po after an initial
.Qq ^ ,
if any
.Pc
.It
as the first character of a subexpression
.Po after an initial
.Qq ^ ,
if any; see
.Sx BREs Matching Multiple Characters ,
below
.Pc .
.El
.It Sy ^
The circumflex is special when used:
.Bl -bullet
.It
as an anchor
.Po see
.Sx BRE Expression Anchoring ,
below
.Pc .
.It
as the first character of a bracket expression
.Po see
.Sx RE Bracket Expression ,
below
.Pc .
.El
.It Sy $
The dollar sign is special when used as an anchor.
.El
.Ss Periods in BREs
A period
.Pq Qq \&. ,
when used outside a bracket expression, is a BRE that matches any character in
the supported character set except NUL.
.Ss RE Bracket Expression
A bracket expression
.Po an expression enclosed in square brackets,
.Qq []
.Pc
is an RE that matches a single collating element contained in the non-empty set
of collating elements represented by the bracket expression.
.Pp
The following rules and definitions apply to bracket expressions:
.Bl -enum
.It
A
.Em bracket expression
is either a matching list expression or a non-matching list expression.
It consists of one or more expressions: collating elements, collating symbols,
equivalence classes, character classes, or range expressions
.Pq see rule 7 below .
Portable applications must not use range expressions, even though all
implementations support them.
The right-bracket
.Pq Qq \&]
loses its special meaning and represents itself in a bracket expression if it
occurs first in the list
.Po after an initial circumflex
.Pq Qq ^ ,
if any
.Pc .
Otherwise, it terminates the bracket expression, unless it appears in a
collating symbol
.Po such as
.Qq [.].]
.Pc
or is the ending right-bracket for a collating symbol, equivalence class, or
character class.
.Pp
The special characters
.Qq \&. ,
.Qq * ,
.Qq \&[ ,
.Qq \&\e
.Pq period, asterisk, left-bracket and backslash, respectively
lose their special meaning within a bracket expression.
.Pp
The character sequences
.Qq [. ,
.Qq [= ,
.Qq [:
.Pq left-bracket followed by a period, equals-sign, or colon
are special inside a bracket expression and are used to delimit collating
symbols, equivalence class expressions, and character class expressions.
These symbols must be followed by a valid expression and the matching
terminating sequence
.Qq .] ,
.Qq =]
or
.Qq :] ,
as described in the following items.
.It
A
.Em matching list expression
specifies a list that matches any one of the expressions represented in the
list.
The first character in the list must not be the circumflex.
For example,
.Qq [abc]
is an RE that matches any of the characters
.Qq a ,
.Qq b
or
.Qq c .
.It
A
.Em non-matching list expression
begins with a circumflex
.Pq Qq ^ ,
and specifies a list that matches any character or collating element except for
the expressions represented in the list after the leading circumflex.
For example,
.Qq [^abc]
is an RE that matches any character or collating element except the characters
.Qq a ,
.Qq b ,
or
.Qq c .
The circumflex will have this special meaning only when it occurs first in the
list, immediately following the left-bracket.
.It
A
.Em collating symbol
is a collating element enclosed within bracket-period
.Pq Qq [..]
delimiters.
Multi-character collating elements must be represented as collating symbols when
it is necessary to distinguish them from a list of the individual characters
that make up the multi-character collating element.
For example, if the string
.Qq ch
is a collating element in the current collation sequence with the associated
collating symbol
.Qq Aq ch ,
the expression
.Qq [[.ch.]]
will be treated as an RE matching the character sequence
.Qq ch ,
while
.Qq [ch]
will be treated as an RE matching
.Qq c
or
.Qq h .
Collating symbols will be recognized only inside bracket expressions.
This implies that the RE
.Qq [[.ch.]]*c
matches the first to fifth character in the string
.Qq chchch .
If the string is not a collating element in the current collating sequence
definition, or if the collating element has no characters associated with it,
the symbol will be treated as an invalid expression.
.It
An
.Em equivalence class expression
represents the set of collating elements belonging to an equivalence class.
Only primary equivalence classes will be recognized.
The class is expressed by enclosing any one of the collating elements in the
equivalence class within bracket-equal
.Pq Qq [==]
delimiters.
For example, if
.Qq a
and
.Qq b
belong to the same equivalence class, then
.Qq [[=a=]b] ,
.Qq [[==]a]
and
.Qq [[==]b]
will each be equivalent to
.Qq [ab] .
If the collating element does not belong to an equivalence class, the
equivalence class expression will be treated as a
.Em collating symbol .
.It
A
.Em character class expression
represents the set of characters belonging to a character class, as defined in
the
.Ev LC_CTYPE
category in the current locale.
All character classes specified in the current locale will be recognized.
A character class expression is expressed as a character class name enclosed
within bracket-colon
.Pq Qq [::]
delimiters.
.Pp
The following character class expressions are supported in all locales:
.Bl -column "[:alnum:]" "[:cntrl:]" "[:lower:]" "[:xdigit:]"
.It [:alnum:] Ta [:cntrl:] Ta [:lower:] Ta [:space:]
.It [:alpha:] Ta [:digit:] Ta [:print:] Ta [:upper:]
.It [:blank:] Ta [:graph:] Ta [:punct:] Ta [:xdigit:]
.El
.Pp
In addition, character class expressions of the form
.Qq [:name:]
are recognized in those locales where the
.Em name
keyword has been given a
.Em charclass
definition in the
.Ev LC_CTYPE
category.
.It
A
.Em range expression
represents the set of collating elements that fall between two elements in the
current collation sequence, inclusively.
It is expressed as the starting point and the ending point separated by a hyphen
.Pq Qq - .
.Pp
Range expressions must not be used in portable applications because their
behavior is dependent on the collating sequence.
Ranges will be treated according to the current collating sequence, and include
such characters that fall within the range based on that collating sequence,
regardless of character values.
This, however, means that the interpretation will differ depending on collating
sequence.
If, for instance, one collating sequence defines
.Qq \(:a
as a variant of
.Qq a ,
while another defines it as a letter following
.Qq z ,
then the expression
.Qq [\(:a-z]
is valid in the first language and invalid in the second.
.Pp
In the following, all examples assume the collation sequence specified for the
POSIX locale, unless another collation sequence is specifically defined.
.Pp
The starting range point and the ending range point must be a collating element
or collating symbol.
An equivalence class expression used as a starting or ending point of a range
expression produces unspecified results.
An equivalence class can be used portably within a bracket expression, but only
outside the range.
For example, the unspecified expression
.Qq [[=e=]-f]
should be given as
.Qq [[=e=]e-f] .
The ending range point must collate equal to or higher than the starting range
point; otherwise, the expression will be treated as invalid.
The order used is the order in which the collating elements are specified in the
current collation definition.
One-to-many mappings
.Po see
.Xr locale 7
.Pc
will not be performed.
For example, assuming that the character
.Qq eszet
is placed in the collation sequence after
.Qq r
and
.Qq s ,
but before
.Qq t ,
and that it maps to the sequence
.Qq ss
for collation purposes, then the expression
.Qq [r-s]
matches only
.Qq r
and
.Qq s ,
but the expression
.Qq [s-t]
matches
.Qq s ,
.Qq eszet ,
or
.Qq t .
.Pp
The interpretation of range expressions where the ending range point is also
the starting range point of a subsequent range expression
.Po for instance
.Qq [a-m-o]
.Pc
is undefined.
.Pp
The hyphen character will be treated as itself if it occurs first
.Po after an initial
.Qq ^ ,
if any
.Pc
or last in the list, or as an ending range point in a range expression.
As examples, the expressions
.Qq [-ac]
and
.Qq [ac-]
are equivalent and match any of the characters
.Qq a ,
.Qq c ,
or
.Qq - ;
.Qq [^-ac]
and
.Qq [^ac-]
are equivalent and match any characters except
.Qq a ,
.Qq c ,
or
.Qq - ;
the expression
.Qq [%--]
matches any of the characters between
.Qq %
and
.Qq -
inclusive; the expression
.Qq [--@]
matches any of the characters between
.Qq -
and
.Qq @
inclusive; and the expression
.Qq [a--@]
is invalid, because the letter
.Qq a
follows the symbol
.Qq -
in the POSIX locale.
To use a hyphen as the starting range point, it must either come first in the
bracket expression or be specified as a collating symbol, for example:
.Qq [][.-.]-0] ,
which matches either a right bracket or any character or collating element that
collates between hyphen and 0, inclusive.
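.Pp
As a hedged illustration of the hyphen rules above, the following C sketch
compiles a few of these bracket expressions with
.Xr regcomp 3C
and tests them with
.Xr regexec 3C .
The helper names
.Fn bre_match
and
.Fn report
are hypothetical and not part of any library.
The program does not call
.Xr setlocale 3C ,
so it is assumed to run in the POSIX locale.
.Bd -literal -offset indent
#include <regex.h>
#include <stdio.h>

/*
 * Hypothetical helper: return 1 if the BRE matches anywhere in str,
 * 0 if it does not, and -1 if regcomp() rejects the pattern.
 */
static int
bre_match(const char *pattern, const char *str)
{
	regex_t re;
	int rc;

	/* Without REG_EXTENDED, regcomp() compiles a BRE. */
	if (regcomp(&re, pattern, REG_NOSUB) != 0)
		return (-1);
	rc = regexec(&re, str, 0, NULL, 0);
	regfree(&re);
	return (rc == 0);
}

/* Hypothetical helper: print one line describing the outcome. */
static void
report(const char *pattern, const char *str)
{
	fputs(pattern, stdout);
	fputs(bre_match(pattern, str) == 1 ?
	    " matches " : " does not match ", stdout);
	puts(str);
}

int
main(void)
{
	/* A hyphen first or last in the list stands for itself. */
	report("[-ac]", "-");
	report("[ac-]", "-");
	/* The same holds after a leading circumflex. */
	report("[^-ac]", "-");
	report("[^ac-]", "b");
	/* A hyphen as the ending range point: percent sign to hyphen. */
	report("[%--]", "+");
	return (0);
}
.Ed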
.Pp
If a bracket expression must specify both
.Qq -
and
.Qq \&] ,
the
.Qq \&]
must be placed first
.Po after the
.Qq ^ ,
if any
.Pc
and the
.Qq -
last within the bracket expression.
.El
.Pp
Note: Latin-1 characters such as
.Qq \(ga
or
.Qq ^
are not printable in some locales, for example, the
.Em ja
locale.
.Ss BREs Matching Multiple Characters
The following rules can be used to construct BREs matching multiple characters
from BREs matching a single character:
.Bl -enum
.It
The concatenation of BREs matches the concatenation of the strings matched
by each component of the BRE.
.It
A
.Em subexpression
can be defined within a BRE by enclosing it between the character pairs
.Qq \e(
and
.Qq \e) .
Such a subexpression matches whatever it would have matched without the
.Qq \e(
and
.Qq \e) ,
except that anchoring within subexpressions is optional behavior; see
.Sx BRE Expression Anchoring ,
below.
Subexpressions can be arbitrarily nested.
.It
The
.Em back-reference
expression
.Qq \e Ns Em n
matches the same
.Pq possibly empty
string of characters as was matched by a subexpression enclosed between
.Qq \e(
and
.Qq \e)
preceding the
.Qq \e Ns Em n .
The character
.Qq Em n
must be a digit from 1 to 9 inclusive, specifying the
.Em n Ns th
subexpression
.Po the one that begins with the
.Em n Ns th
.Qq \e(
and ends with the corresponding paired
.Qq \e)
.Pc .
The expression is invalid if fewer than
.Em n
subexpressions precede the
.Qq \e Ns Em n .
For example, the expression
.Qq ^\e(.*\e)\e1$
matches a line consisting of two adjacent appearances of the same string, and
the expression
.Qq \e(a\e)*\e1
fails to match
.Qq a .
The limit of nine back-references to subexpressions in the RE is based on the
use of a single digit identifier.
This does not imply that only nine subexpressions are allowed in REs.
.It
When a BRE matching a single character, a subexpression or a back-reference is
followed by the special character asterisk
.Pq Qq * ,
together with that asterisk it matches what zero or more consecutive occurrences
of the BRE would match.
For example,
.Qq [ab]*
and
.Qq [ab][ab]
are equivalent when matching the string
.Qq ab .
.It
When a BRE matching a single character, a subexpression, or a back-reference
is followed by an
.Em interval expression
of the format
.Qq \e{ Ns Em m Ns \e} ,
.Qq \e{ Ns Em m Ns ,\e}
or
.Qq \e{ Ns Em m Ns \&, Ns Em n Ns \e} ,
together with that interval expression it matches what repeated consecutive
occurrences of the BRE would match.
The values of
.Em m
and
.Em n
will be decimal integers in the range 0 <=
.Em m
<=
.Em n
<=
.Dv BRE_DUP_MAX ,
where
.Em m
specifies the exact or minimum number of occurrences and
.Em n
specifies the maximum number of occurrences.
The expression
.Qq \e{ Ns Em m Ns \e}
matches exactly
.Em m
occurrences of the preceding BRE,
.Qq \e{ Ns Em m Ns ,\e}
matches at least
.Em m
occurrences and
.Qq \e{ Ns Em m Ns \&, Ns Em n Ns \e}
matches any number of occurrences between
.Em m
and
.Em n ,
inclusive.
.Pp
For example, in the string
.Qq abababccccccd ,
the BRE
.Qq c\e{3\e}
is matched by characters seven to nine, the BRE
.Qq \e(ab\e)\e{4,\e}
is not matched at all and the BRE
.Qq c\e{1,3\e}d
is matched by characters ten to thirteen.
.El
.Pp
The behavior of multiple adjacent duplication symbols
.Po Qq *
and intervals
.Pc
produces undefined results.
.Ss BRE Precedence
The order of precedence is as shown in the following table:
.Bl -column "BRE Precedence (from high to low)" ""
.It Sy BRE Precedence (from high to low) Ta
.It collation-related bracket symbols Ta [= =] [: :] [. .]
.It escaped characters Ta \e< Ns Em special character Ns >
.It bracket expression Ta [ ]
.It subexpressions/back-references Ta \e( \e) \e Ns Em n
.It single-character-BRE duplication Ta * \e{ Ns Em m Ns \&, Ns Em n Ns \e}
.It concatenation Ta
.It anchoring Ta ^ $
.El
.Ss BRE Expression Anchoring
A BRE can be limited to matching strings that begin or end a line; this is
called
.Em anchoring .
The circumflex and dollar sign special characters will be considered BRE anchors
in the following contexts:
.Bl -enum
.It
A circumflex
.Pq Qq ^
is an anchor when used as the first character of an entire BRE.
The implementation may treat circumflex as an anchor when used as the first
character of a subexpression.
The circumflex will anchor the expression to the beginning of a string;
only sequences starting at the first character of a string will be matched by
the BRE.
For example, the BRE
.Qq ^ab
matches
.Qq ab
in the string
.Qq abcdef ,
but fails to match in the string
.Qq cdefab .
A portable BRE must escape a leading circumflex in a subexpression to match a
literal circumflex.
.It
A dollar sign
.Pq Qq $
is an anchor when used as the last character of an entire BRE.
The implementation may treat a dollar sign as an anchor when used as the last
character of a subexpression.
The dollar sign will anchor the expression to the end of the string being
matched; the dollar sign can be said to match the end-of-string following the
last character.
.It
A BRE anchored by both
.Qq ^
and
.Qq $
matches only an entire string.
For example, the BRE
.Qq ^abcdef$
matches strings consisting only of
.Qq abcdef .
.It
.Qq ^
and
.Qq $
are not special in subexpressions.
.El
.Pp
Note: The Solaris implementation does not support anchoring in BRE
subexpressions.
.Sh EXTENDED REGULAR EXPRESSIONS
The rules specified for BREs apply to Extended Regular Expressions
.Pq EREs
with the following exceptions:
.Bl -bullet
.It
The characters
.Qq | ,
.Qq + ,
and
.Qq \&?
have special meaning, as defined below.
.It
The
.Qq {
and
.Qq }
characters, when used as the duplication operator, are not preceded by
backslashes.
The constructs
.Qq \e{
and
.Qq \e}
simply match the characters
.Qq {
and
.Qq } ,
respectively.
.It
The back-reference operator is not supported.
.It
Anchoring
.Pq Qq ^$
is supported in subexpressions.
.El
.Ss EREs Matching a Single Character
An ERE ordinary character, a special character preceded by a backslash, or a
period matches a single character.
A bracket expression matches a single character or a single collating element.
An
.Em ERE matching a single character
enclosed in parentheses matches the same as the ERE without parentheses would
have matched.
.Ss ERE Ordinary Characters
An
.Em ordinary character
is an ERE that matches itself.
An ordinary character is any character in the supported character set, except
for the ERE special characters listed in
.Sx ERE Special Characters
below.
The interpretation of an ordinary character preceded by a backslash
.Pq Qq \&\e
is undefined.
.Ss ERE Special Characters
An
.Em ERE special character
has special properties in certain contexts.
Outside those contexts, or when preceded by a backslash, such a character is an
ERE that matches the special character itself.
The extended regular expression special characters and the contexts in which
they have their special meaning are:
.Bl -tag -width Ds
.It Sy \&. \&[ \&\e \&(
The period, left-bracket, backslash, and left-parenthesis are special except
when used in a bracket expression
.Po see
.Sx RE Bracket Expression ,
above
.Pc .
Outside a bracket expression, a left-parenthesis immediately followed by a
right-parenthesis produces undefined results.
.It Sy \&)
The right-parenthesis is special when matched with a preceding
left-parenthesis, both outside a bracket expression.
.It Sy * + \&? {
The asterisk, plus-sign, question-mark, and left-brace are special except when
used in a bracket expression
.Po see
.Sx RE Bracket Expression ,
above
.Pc .
Any of the following uses produce undefined results:
.Bl -bullet
.It
if these characters appear first in an ERE, or immediately following a
vertical-line, circumflex or left-parenthesis
.It
if a left-brace is not part of a valid interval expression.
.El
.It Sy \&|
The vertical-line is special except when used in a bracket expression
.Po see
.Sx RE Bracket Expression ,
above
.Pc .
A vertical-line appearing first or last in an ERE, or immediately following a
vertical-line or a left-parenthesis, or immediately preceding a
right-parenthesis, produces undefined results.
.It Sy ^
The circumflex is special when used:
.Bl -bullet
.It
as an anchor
.Po see
.Sx ERE Expression Anchoring ,
below
.Pc .
.It
as the first character of a bracket expression
.Po see
.Sx RE Bracket Expression ,
above
.Pc .
.El
.It Sy $
The dollar sign is special when used as an anchor.
.El
.Ss Periods in EREs
A period
.Pq Qq \&. ,
when used outside a bracket expression, is an ERE that matches any character in
the supported character set except NUL.
.Ss ERE Bracket Expression
The rules for ERE Bracket Expressions are the same as for Basic Regular
Expressions; see
.Sx RE Bracket Expression ,
above.
.Ss EREs Matching Multiple Characters
The following rules will be used to construct EREs matching multiple characters
from EREs matching a single character:
.Bl -enum
.It
A
.Em concatenation of EREs
matches the concatenation of the character sequences matched by each component
of the ERE.
A concatenation of EREs enclosed in parentheses matches whatever the
concatenation without the parentheses matches.
For example, both the ERE
.Qq cd
and the ERE
.Qq (cd)
are matched by the third and fourth character of the string
.Qq abcdefabcdef .
.It
When an ERE matching a single character or an ERE enclosed in parentheses is
followed by the special character plus-sign
.Pq Qq + ,
together with that plus-sign it matches what one or more consecutive occurrences
of the ERE would match.
For example, the ERE
.Qq b+(bc)
matches the fourth to seventh characters in the string
.Qq acabbbcde ;
.Qq [ab]+
and
.Qq [ab][ab]*
are equivalent.
.It
When an ERE matching a single character or an ERE enclosed in parentheses is
followed by the special character asterisk
.Pq Qq * ,
together with that asterisk it matches what zero or more consecutive occurrences
of the ERE would match.
For example, the ERE
.Qq b*c
matches the first character in the string
.Qq cabbbcde ,
and the ERE
.Qq b*cd
matches the third to seventh characters in the string
.Qq cabbbcdebbbbbbcdbc .
And,
.Qq [ab]*
and
.Qq [ab][ab]
are equivalent when matching the string
.Qq ab .
.It
When an ERE matching a single character or an ERE enclosed in parentheses is
followed by the special character question-mark
.Pq Qq \&? ,
together with that question-mark it matches what zero or one consecutive
occurrences of the ERE would match.
For example, the ERE
.Qq b?c
matches the second character in the string
.Qq acabbbcde .
.It
When an ERE matching a single character or an ERE enclosed in parentheses is
followed by an
.Em interval expression
of the format
.Qq { Ns Em m Ns } ,
.Qq { Ns Em m Ns ,}
or
.Qq { Ns Em m Ns \&, Ns Em n Ns } ,
together with that interval expression it matches what repeated consecutive
occurrences of the ERE would match.
The values of
.Em m
and
.Em n
will be decimal integers in the range 0 <=
.Em m
<=
.Em n
<=
.Dv RE_DUP_MAX ,
where
.Em m
specifies the exact or minimum number of occurrences and
.Em n
specifies the maximum number of occurrences.
The expression
.Qq { Ns Em m Ns }
matches exactly
.Em m
occurrences of the preceding ERE,
.Qq { Ns Em m Ns ,}
matches at least
.Em m
occurrences and
.Qq { Ns Em m Ns \&, Ns Em n Ns }
matches any number of occurrences between
.Em m
and
.Em n ,
inclusive.
.El
.Pp
For example, in the string
.Qq abababccccccd ,
the ERE
.Qq c{3}
is matched by characters seven to nine and the ERE
.Qq (ab){2,}
is matched by characters one to six.
.Pp
The behavior of multiple adjacent duplication symbols
.Po
.Qq + ,
.Qq * ,
.Qq \&?
and intervals
.Pc
produces undefined results.
.Ss ERE Alternation
Two EREs separated by the special character vertical-line
.Pq Qq |
match a string that is matched by either.
For example, the ERE
.Qq a((bc)|d)
matches the string
.Qq abc
and the string
.Qq ad .
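.Pp
The alternation example can be exercised with a minimal C sketch, assuming the
.Xr regcomp 3C
interfaces with the
.Dv REG_EXTENDED
flag.
The helper name
.Fn ere_matches_all
is hypothetical and exists only for this illustration.

```c
#include <regex.h>
#include <string.h>

/*
 * Hypothetical helper (not part of any library): returns 1 if the match
 * that regexec() reports for the ERE `pattern` covers all of `text`,
 * 0 if there is no match or the match covers only part of `text`,
 * and -1 if the pattern does not compile.
 */
int
ere_matches_all(const char *pattern, const char *text)
{
	regex_t re;
	regmatch_t m[1];
	int rc;

	/* REG_EXTENDED selects ERE syntax, so "|" and "(" are special. */
	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return (-1);

	rc = regexec(&re, text, 1, m, 0);
	regfree(&re);
	if (rc != 0)
		return (0);

	/* The match must start at offset 0 and end at the terminator. */
	return (m[0].rm_so == 0 && (size_t)m[0].rm_eo == strlen(text));
}
```

With the pattern from the example, the helper is expected to accept both
.Qq abc
and
.Qq ad ,
and to reject a string such as
.Qq ab ,
which satisfies neither alternative.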
Single characters, or expressions matching single characters, separated by the
vertical bar and enclosed in parentheses, will be treated as an ERE matching a
single character.
.Ss ERE Precedence
The order of precedence will be as shown in the following table:
.Bl -column "ERE Precedence (from high to low)" ""
.It Sy ERE Precedence (from high to low) Ta
.It collation-related bracket symbols Ta [= =] [: :] [. .]
.It escaped characters Ta \e< Ns Em special character Ns >
.It bracket expression Ta \&[ \&]
.It grouping Ta \&( \&)
.It single-character-ERE duplication Ta * + \&? { Ns Em m Ns \&, Ns Em n Ns }
.It concatenation Ta
.It anchoring Ta ^ $
.It alternation Ta |
.El
.Pp
For example, the ERE
.Qq abba|cde
matches either the string
.Qq abba
or the string
.Qq cde
.Po rather than the string
.Qq abbade
or
.Qq abbcde ,
because concatenation has a higher order of precedence than alternation
.Pc .
.Ss ERE Expression Anchoring
An ERE can be limited to matching strings that begin or end a line; this is
called
.Em anchoring .
The circumflex and dollar sign special characters are considered ERE anchors
when used anywhere outside a bracket expression.
This has the following effects:
.Bl -enum
.It
A circumflex
.Pq Qq ^
outside a bracket expression anchors the expression or subexpression it begins
to the beginning of a string; such an expression or subexpression can match only
a sequence starting at the first character of a string.
For example, the EREs
.Qq ^ab
and
.Qq (^ab)
match
.Qq ab
in the string
.Qq abcdef ,
but fail to match in the string
.Qq cdefab ,
and the ERE
.Qq a^b
is valid, but can never match because the
.Qq a
prevents the expression
.Qq ^b
from matching starting at the first character.
.It
A dollar sign
.Pq Qq $
outside a bracket expression anchors the expression or subexpression it ends to
the end of a string; such an expression or subexpression can match only a
sequence ending at the last character of a string.
For example, the EREs
.Qq ef$
and
.Qq (ef$)
match
.Qq ef
in the string
.Qq abcdef ,
but fail to match in the string
.Qq cdefab ,
and the ERE
.Qq e$f
is valid, but can never match because the
.Qq f
prevents the expression
.Qq e$
from matching ending at the last character.
.El
.Sh SEE ALSO
.Xr localedef 1 ,
.Xr regcomp 3C ,
.Xr attributes 7 ,
.Xr environ 7 ,
.Xr locale 7 ,
.Xr regexp 7