=head1 NAME

perlop - Perl operators and precedence

=head1 DESCRIPTION

=head2 Operator Precedence and Associativity

Operator precedence and associativity work in Perl more or less like
they do in mathematics.

I<Operator precedence> means some operators are evaluated before
others. For example, in C<2 + 4 * 5>, the multiplication has higher
precedence so C<4 * 5> is evaluated first yielding C<2 + 20 == 22>
and not C<6 * 5 == 30>.

I<Operator associativity> defines what happens if a sequence of the
same operators is used one after another: whether the evaluator will
evaluate the left operations first or the right. For example, in
C<8 - 4 - 2>, subtraction is left associative so Perl evaluates the
expression left to right. C<8 - 4> is evaluated first making the
expression C<4 - 2 == 2> and not C<8 - 2 == 6>.

Perl operators have the following associativity and precedence,
listed from highest precedence to lowest. Operators borrowed from
C keep the same precedence relationship with each other, even where
C's precedence is slightly screwy. (This makes learning Perl easier
for C folks.) With very few exceptions, these all operate on scalar
values only, not array values.

    left        terms and list operators (leftward)
    left        ->
    nonassoc    ++ --
    right       **
    right       ! ~ \ and unary + and -
    left        =~ !~
    left        * / % x
    left        + - .
    left        << >>
    nonassoc    named unary operators
    nonassoc    < > <= >= lt gt le ge
    nonassoc    == != <=> eq ne cmp
    left        &
    left        | ^
    left        &&
    left        ||
    nonassoc    ..  ...
    right       ?:
    right       = += -= *= etc.
    left        , =>
    nonassoc    list operators (rightward)
    right       not
    left        and
    left        or xor

In the following sections, these operators are covered in precedence order.

Many operators can be overloaded for objects. See L<overload>.

=head2 Terms and List Operators (Leftward)

A TERM has the highest precedence in Perl. TERMs include variables,
quote and quote-like operators, any expression in parentheses,
and any function whose arguments are parenthesized. Actually, there
aren't really functions in this sense, just list operators and unary
operators behaving as functions because you put parentheses around
the arguments. These are all documented in L<perlfunc>.

If any list operator (print(), etc.) or any unary operator (chdir(), etc.)
is followed by a left parenthesis as the next token, the operator and
arguments within parentheses are taken to be of highest precedence,
just like a normal function call.

In the absence of parentheses, the precedence of list operators such as
C<print>, C<sort>, or C<chmod> is either very high or very low depending on
whether you are looking at the left side or the right side of the operator.
For example, in

    @ary = (1, 3, sort 4, 2);
    print @ary;         # prints 1324

the commas on the right of the sort are evaluated before the sort,
but the commas on the left are evaluated after. In other words,
list operators tend to gobble up all arguments that follow, and
then act like a simple TERM with regard to the preceding expression.
Be careful with parentheses:

    # These evaluate exit before doing the print:
    print($foo, exit);  # Obviously not what you want.
    print $foo, exit;   # Nor is this.

    # These do the print before evaluating exit:
    (print $foo), exit; # This is what you want.
    print($foo), exit;  # Or this.
    print ($foo), exit; # Or even this.

Also note that

    print ($foo & 255) + 1, "\n";

probably doesn't do what you expect at first glance.
The parentheses enclose the argument list for C<print> which is
evaluated (printing the result of C<$foo & 255>). Then one is added
to the return value of C<print> (usually 1). The result is something
like this:

    1 + 1, "\n";    # Obviously not what you meant.

To do what you meant properly, you must write:

    print(($foo & 255) + 1, "\n");

See L<Named Unary Operators> for more discussion of this.

Also parsed as terms are the C<do {}> and C<eval {}> constructs, as
well as subroutine and method calls, and the anonymous
constructors C<[]> and C<{}>.

See also L<Quote and Quote-like Operators> toward the end of this section,
as well as L<"I/O Operators">.

=head2 The Arrow Operator

"C<< -> >>" is an infix dereference operator, just as it is in C
and C++. If the right side is either a C<[...]>, C<{...}>, or a
C<(...)> subscript, then the left side must be either a hard or
symbolic reference to an array, a hash, or a subroutine respectively.
(Or technically speaking, a location capable of holding a hard
reference, if it's an array or hash reference being used for
assignment.) See L<perlreftut> and L<perlref>.

Otherwise, the right side is a method name or a simple scalar
variable containing either the method name or a subroutine reference,
and the left side must be either an object (a blessed reference)
or a class name (that is, a package name). See L<perlobj>.

=head2 Auto-increment and Auto-decrement

"++" and "--" work as in C. That is, if placed before a variable,
they increment or decrement the variable by one before returning the
value, and if placed after, increment or decrement after returning the
value.

    $i = 0;  $j = 0;
    print $i++;  # prints 0
    print ++$j;  # prints 1

The auto-increment operator has a little extra builtin magic to it. If
you increment a variable that is numeric, or that has ever been used in
a numeric context, you get a normal increment.
If, however, the variable has been used in only string contexts since
it was set, and has a value that is not the empty string and matches
the pattern C</^[a-zA-Z]*[0-9]*\z/>, the increment is done as a string,
preserving each character within its range, with carry:

    print ++($foo = '99');      # prints '100'
    print ++($foo = 'a0');      # prints 'a1'
    print ++($foo = 'Az');      # prints 'Ba'
    print ++($foo = 'zz');      # prints 'aaa'

C<undef> is always treated as numeric, and in particular is changed
to C<0> before incrementing (so that a post-increment of an undef value
will return C<0> rather than C<undef>).

The auto-decrement operator is not magical.

=head2 Exponentiation

Binary "**" is the exponentiation operator. It binds even more
tightly than unary minus, so -2**4 is -(2**4), not (-2)**4. (This is
implemented using C's pow(3) function, which actually works on doubles
internally.)

=head2 Symbolic Unary Operators

Unary "!" performs logical negation, i.e., "not". See also C<not> for a lower
precedence version of this.

Unary "-" performs arithmetic negation if the operand is numeric. If
the operand is an identifier, a string consisting of a minus sign
concatenated with the identifier is returned.
Otherwise, if the string starts with a plus or minus, a string starting
with the opposite sign is returned. One effect of these rules is that
C<-bareword> is equivalent to C<"-bareword">.

Unary "~" performs bitwise negation, i.e., 1's complement. For
example, C<0666 & ~027> is 0640. (See also L<Integer Arithmetic> and
L<Bitwise String Operators>.) Note that the width of the result is
platform-dependent: ~0 is 32 bits wide on a 32-bit platform, but 64
bits wide on a 64-bit platform, so if you are expecting a certain bit
width, remember to use the & operator to mask off the excess bits.

Unary "+" has no effect whatsoever, even on strings. It is useful
syntactically for separating a function name from a parenthesized expression
that would otherwise be interpreted as the complete list of function
arguments. (See examples above under L<Terms and List Operators (Leftward)>.)

Unary "\" creates a reference to whatever follows it. See L<perlreftut>
and L<perlref>. Do not confuse this behavior with the behavior of
backslash within a string, although both forms do convey the notion
of protecting the next thing from interpolation.

=head2 Binding Operators

Binary "=~" binds a scalar expression to a pattern match. Certain operations
search or modify the string $_ by default. This operator makes that kind
of operation work on some other string.
The right argument is a search pattern, substitution, or
transliteration. The left argument is what is supposed to be searched,
substituted, or transliterated instead of the default $_. When used in
scalar context, the return value generally indicates the success of the
operation. Behavior in list context depends on the particular
operator. See L</"Regexp Quote-Like Operators"> for details.

If the right argument is an expression rather than a search pattern,
substitution, or transliteration, it is interpreted as a search pattern at run
time.

Binary "!~" is just like "=~" except the return value is negated in
the logical sense.

=head2 Multiplicative Operators

Binary "*" multiplies two numbers.

Binary "/" divides two numbers.

Binary "%" computes the modulus of two numbers. Given integer
operands C<$a> and C<$b>: If C<$b> is positive, then C<$a % $b> is
C<$a> minus the largest multiple of C<$b> that is not greater than
C<$a>. If C<$b> is negative, then C<$a % $b> is C<$a> minus the
smallest multiple of C<$b> that is not less than C<$a> (i.e. the
result will be less than or equal to zero).
Note that when C<use integer> is in scope, "%" gives you direct access
to the modulus operator as implemented by your C compiler. This
operator is not as well defined for negative operands, but it will
execute faster.
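The sign rule for "%" described above can be illustrated with a small
sketch. The array name C<@results> is chosen only for this illustration,
and C<use integer> is deliberately not in scope, so the result takes the
sign of the right operand:

```perl
# Sketch only: "%" without "use integer" in scope.
my @results = (
     7 %  3,    # 7 minus 6    (largest multiple of 3 not greater than 7)
    -7 %  3,    # -7 minus -9  (largest multiple of 3 not greater than -7)
     7 % -3,    # 7 minus 9    (smallest multiple of -3 not less than 7)
    -7 % -3,    # -7 minus -6  (smallest multiple of -3 not less than -7)
);
print "@results\n";    # prints "1 2 -2 -1"
```

Under C<use integer> the results for negative operands may differ,
since they then depend on the C compiler's modulus operator.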

Binary "x" is the repetition operator. In scalar context or if the left
operand is not enclosed in parentheses, it returns a string consisting
of the left operand repeated the number of times specified by the right
operand. In list context, if the left operand is enclosed in
parentheses, it repeats the list. If the right operand is zero or
negative, it returns an empty string or an empty list, depending on the
context.

    print '-' x 80;             # print row of dashes

    print "\t" x ($tab/8), ' ' x ($tab%8);      # tab over

    @ones = (1) x 80;           # a list of 80 1's
    @ones = (5) x @ones;        # set all elements to 5

=head2 Additive Operators

Binary "+" returns the sum of two numbers.

Binary "-" returns the difference of two numbers.

Binary "." concatenates two strings.

=head2 Shift Operators

Binary "<<" returns the value of its left argument shifted left by the
number of bits specified by the right argument. Arguments should be
integers. (See also L<Integer Arithmetic>.)

Binary ">>" returns the value of its left argument shifted right by
the number of bits specified by the right argument. Arguments should
be integers. (See also L<Integer Arithmetic>.)
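As a minimal sketch of the two shift operators, using small values that
stay well inside the integer width (the variable names C<$left> and
C<$right> are chosen only for this illustration):

```perl
# Sketch only: shifts of small non-negative integers.
my $left  = 1 << 4;         # 1 shifted left by 4 bits
my $right = 256 >> 2;       # 256 shifted right by 2 bits
print "$left $right\n";     # prints "16 64"
```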

Note that both "<<" and ">>" in Perl are implemented directly using
"<<" and ">>" in C. If C<use integer> (see L<Integer Arithmetic>) is
in force then signed C integers are used, else unsigned C integers are
used. Either way, the implementation isn't going to generate results
larger than the size of the integer type Perl was built with (32 bits
or 64 bits).

The result of overflowing the range of the integers is undefined
because it is undefined also in C. In other words, using 32-bit
integers, C<< 1 << 32 >> is undefined. Shifting by a negative number
of bits is also undefined.

=head2 Named Unary Operators

The various named unary operators are treated as functions with one
argument, with optional parentheses.

If any list operator (print(), etc.) or any unary operator (chdir(), etc.)
is followed by a left parenthesis as the next token, the operator and
arguments within parentheses are taken to be of highest precedence,
just like a normal function call.
For example,
because named unary operators are higher precedence than ||:

    chdir $foo    || die;       # (chdir $foo) || die
    chdir($foo)   || die;       # (chdir $foo) || die
    chdir ($foo)  || die;       # (chdir $foo) || die
    chdir +($foo) || die;       # (chdir $foo) || die

but, because * is higher precedence than named operators:

    chdir $foo * 20;    # chdir ($foo * 20)
    chdir($foo) * 20;   # (chdir $foo) * 20
    chdir ($foo) * 20;  # (chdir $foo) * 20
    chdir +($foo) * 20; # chdir ($foo * 20)

    rand 10 * 20;       # rand (10 * 20)
    rand(10) * 20;      # (rand 10) * 20
    rand (10) * 20;     # (rand 10) * 20
    rand +(10) * 20;    # rand (10 * 20)

Regarding precedence, the filetest operators, like C<-f>, C<-M>, etc. are
treated like named unary operators, but they don't follow this functional
parenthesis rule. That means, for example, that C<-f($file).".bak"> is
equivalent to C<-f "$file.bak">.

See also L<"Terms and List Operators (Leftward)">.

=head2 Relational Operators

Binary "<" returns true if the left argument is numerically less than
the right argument.

Binary ">" returns true if the left argument is numerically greater
than the right argument.

Binary "<=" returns true if the left argument is numerically less than
or equal to the right argument.

Binary ">=" returns true if the left argument is numerically greater
than or equal to the right argument.

Binary "lt" returns true if the left argument is stringwise less than
the right argument.

Binary "gt" returns true if the left argument is stringwise greater
than the right argument.

Binary "le" returns true if the left argument is stringwise less than
or equal to the right argument.

Binary "ge" returns true if the left argument is stringwise greater
than or equal to the right argument.

=head2 Equality Operators

Binary "==" returns true if the left argument is numerically equal to
the right argument.

Binary "!=" returns true if the left argument is numerically not equal
to the right argument.

Binary "<=>" returns -1, 0, or 1 depending on whether the left
argument is numerically less than, equal to, or greater than the right
argument. If your platform supports NaNs (not-a-numbers) as numeric
values, using them with "<=>" returns undef. NaN is not "<", "==", ">",
"<=" or ">=" anything (even NaN), so those 5 return false. NaN != NaN
returns true, as does NaN != anything else.
If your platform doesn't support NaNs then NaN is just a string with
numeric value 0.

    perl -le '$a = NaN; print "No NaN support here" if $a == $a'
    perl -le '$a = NaN; print "NaN support here" if $a != $a'

Binary "eq" returns true if the left argument is stringwise equal to
the right argument.

Binary "ne" returns true if the left argument is stringwise not equal
to the right argument.

Binary "cmp" returns -1, 0, or 1 depending on whether the left
argument is stringwise less than, equal to, or greater than the right
argument.

"lt", "le", "ge", "gt" and "cmp" use the collation (sort) order specified
by the current locale if C<use locale> is in effect. See L<perllocale>.

=head2 Bitwise And

Binary "&" returns its operands ANDed together bit by bit.
(See also L<Integer Arithmetic> and L<Bitwise String Operators>.)

Note that "&" has lower priority than relational operators, so for example
the parentheses are essential in a test like

    print "Even\n" if ($x & 1) == 0;

=head2 Bitwise Or and Exclusive Or

Binary "|" returns its operands ORed together bit by bit.
(See also L<Integer Arithmetic> and L<Bitwise String Operators>.)

Binary "^" returns its operands XORed together bit by bit.
(See also L<Integer Arithmetic> and L<Bitwise String Operators>.)

Note that "|" and "^" have lower priority than relational operators, so
for example the parentheses are essential in a test like

    print "false\n" if (8 | 2) != 10;

=head2 C-style Logical And

Binary "&&" performs a short-circuit logical AND operation. That is,
if the left operand is false, the right operand is not even evaluated.
Scalar or list context propagates down to the right operand if it
is evaluated.

=head2 C-style Logical Or

Binary "||" performs a short-circuit logical OR operation. That is,
if the left operand is true, the right operand is not even evaluated.
Scalar or list context propagates down to the right operand if it
is evaluated.

The C<||> and C<&&> operators return the last value evaluated
(unlike C's C<||> and C<&&>, which return 0 or 1). Thus, a reasonably
portable way to find out the home directory might be:

    $home = $ENV{'HOME'} || $ENV{'LOGDIR'} ||
        (getpwuid($<))[7] || die "You're homeless!\n";

In particular, this means that you shouldn't use this
for selecting between two aggregates for assignment:

    @a = @b || @c;              # this is wrong
    @a = scalar(@b) || @c;      # really meant this
    @a = @b ? @b : @c;          # this works fine, though

As more readable alternatives to C<&&> and C<||> when used for
control flow, Perl provides C<and> and C<or> operators (see below).
The short-circuit behavior is identical. The precedence of "and" and
"or" is much lower, however, so that you can safely use them after a
list operator without the need for parentheses:

    unlink "alpha", "beta", "gamma"
            or gripe(), next LINE;

With the C-style operators that would have been written like this:

    unlink("alpha", "beta", "gamma")
            || (gripe(), next LINE);

Using "or" for assignment is unlikely to do what you want; see below.

=head2 Range Operators

Binary ".." is the range operator, which is really two different
operators depending on the context. In list context, it returns a
list of values counting (up by ones) from the left value to the right
value. If the left value is greater than the right value then it
returns the empty list. The range operator is useful for writing
C<foreach (1..10)> loops and for doing slice operations on arrays. In
the current implementation, no temporary array is created when the
range operator is used as the expression in C<foreach> loops, but older
versions of Perl might burn a lot of memory when you write something
like this:

    for (1 .. 1_000_000) {
        # code
    }

The range operator also works on strings, using the magical auto-increment,
see below.

In scalar context, ".." returns a boolean value. The operator is
bistable, like a flip-flop, and emulates the line-range (comma) operator
of B<sed>, B<awk>, and various editors. Each ".." operator maintains its
own boolean state. It is false as long as its left operand is false.
Once the left operand is true, the range operator stays true until the
right operand is true, I<AFTER> which the range operator becomes false
again. It doesn't become false till the next time the range operator is
evaluated. It can test the right operand and become false on the same
evaluation it became true (as in B<awk>), but it still returns true once.
If you don't want it to test the right operand till the next
evaluation, as in B<sed>, just use three dots ("...") instead of
two. In all other regards, "..." behaves just like ".." does.

The right operand is not evaluated while the operator is in the
"false" state, and the left operand is not evaluated while the
operator is in the "true" state. The precedence is a little lower
than || and &&. The value returned is either the empty string for
false, or a sequence number (beginning with 1) for true. The
sequence number is reset for each range encountered.
The final 482*0Sstevel@tonic-gatesequence number in a range has the string "E0" appended to it, which 483*0Sstevel@tonic-gatedoesn't affect its numeric value, but gives you something to search 484*0Sstevel@tonic-gatefor if you want to exclude the endpoint. You can exclude the 485*0Sstevel@tonic-gatebeginning point by waiting for the sequence number to be greater 486*0Sstevel@tonic-gatethan 1. 487*0Sstevel@tonic-gate 488*0Sstevel@tonic-gateIf either operand of scalar ".." is a constant expression, 489*0Sstevel@tonic-gatethat operand is considered true if it is equal (C<==>) to the current 490*0Sstevel@tonic-gateinput line number (the C<$.> variable). 491*0Sstevel@tonic-gate 492*0Sstevel@tonic-gateTo be pedantic, the comparison is actually C<int(EXPR) == int(EXPR)>, 493*0Sstevel@tonic-gatebut that is only an issue if you use a floating point expression; when 494*0Sstevel@tonic-gateimplicitly using C<$.> as described in the previous paragraph, the 495*0Sstevel@tonic-gatecomparison is C<int(EXPR) == int($.)> which is only an issue when C<$.> 496*0Sstevel@tonic-gateis set to a floating point value and you are not reading from a file. 497*0Sstevel@tonic-gateFurthermore, C<"span" .. "spat"> or C<2.18 .. 3.14> will not do what 498*0Sstevel@tonic-gateyou want in scalar context because each of the operands are evaluated 499*0Sstevel@tonic-gateusing their integer representation. 500*0Sstevel@tonic-gate 501*0Sstevel@tonic-gateExamples: 502*0Sstevel@tonic-gate 503*0Sstevel@tonic-gateAs a scalar operator: 504*0Sstevel@tonic-gate 505*0Sstevel@tonic-gate if (101 .. 200) { print; } # print 2nd hundred lines, short for 506*0Sstevel@tonic-gate # if ($. == 101 .. $. == 200) ... 507*0Sstevel@tonic-gate next line if (1 .. /^$/); # skip header lines, short for 508*0Sstevel@tonic-gate # ... if ($. == 1 .. /^$/); 509*0Sstevel@tonic-gate s/^/> / if (/^$/ .. 
eof()); # quote body 510*0Sstevel@tonic-gate 511*0Sstevel@tonic-gate # parse mail messages 512*0Sstevel@tonic-gate while (<>) { 513*0Sstevel@tonic-gate $in_header = 1 .. /^$/; 514*0Sstevel@tonic-gate $in_body = /^$/ .. eof; 515*0Sstevel@tonic-gate if ($in_header) { 516*0Sstevel@tonic-gate # ... 517*0Sstevel@tonic-gate } else { # in body 518*0Sstevel@tonic-gate # ... 519*0Sstevel@tonic-gate } 520*0Sstevel@tonic-gate } continue { 521*0Sstevel@tonic-gate close ARGV if eof; # reset $. each file 522*0Sstevel@tonic-gate } 523*0Sstevel@tonic-gate 524*0Sstevel@tonic-gateAs a list operator: 525*0Sstevel@tonic-gate 526*0Sstevel@tonic-gate for (101 .. 200) { print; } # print $_ 100 times 527*0Sstevel@tonic-gate @foo = @foo[0 .. $#foo]; # an expensive no-op 528*0Sstevel@tonic-gate @foo = @foo[$#foo-4 .. $#foo]; # slice last 5 items 529*0Sstevel@tonic-gate 530*0Sstevel@tonic-gateThe range operator (in list context) makes use of the magical 531*0Sstevel@tonic-gateauto-increment algorithm if the operands are strings. You 532*0Sstevel@tonic-gatecan say 533*0Sstevel@tonic-gate 534*0Sstevel@tonic-gate @alphabet = ('A' .. 'Z'); 535*0Sstevel@tonic-gate 536*0Sstevel@tonic-gateto get all normal letters of the English alphabet, or 537*0Sstevel@tonic-gate 538*0Sstevel@tonic-gate $hexdigit = (0 .. 9, 'a' .. 'f')[$num & 15]; 539*0Sstevel@tonic-gate 540*0Sstevel@tonic-gateto get a hexadecimal digit, or 541*0Sstevel@tonic-gate 542*0Sstevel@tonic-gate @z2 = ('01' .. '31'); print $z2[$mday]; 543*0Sstevel@tonic-gate 544*0Sstevel@tonic-gateto get dates with leading zeros. If the final value specified is not 545*0Sstevel@tonic-gatein the sequence that the magical increment would produce, the sequence 546*0Sstevel@tonic-gategoes until the next value would be longer than the final value 547*0Sstevel@tonic-gatespecified. 548*0Sstevel@tonic-gate 549*0Sstevel@tonic-gateBecause each operand is evaluated in integer form, C<2.18 .. 3.14> will 550*0Sstevel@tonic-gatereturn two elements in list context. 

    @list = (2.18 .. 3.14); # same as @list = (2 .. 3);

=head2 Conditional Operator

Ternary "?:" is the conditional operator, just as in C.  It works much
like an if-then-else.  If the argument before the ? is true, the
argument before the : is returned, otherwise the argument after the :
is returned.  For example:

    printf "I have %d dog%s.\n", $n,
            ($n == 1) ? '' : "s";

Scalar or list context propagates downward into the 2nd
or 3rd argument, whichever is selected.

    $a = $ok ? $b : $c;  # get a scalar
    @a = $ok ? @b : @c;  # get an array
    $a = $ok ? @b : @c;  # oops, that's just a count!

The operator may be assigned to if both the 2nd and 3rd arguments are
legal lvalues (meaning that you can assign to them):

    ($a_or_b ? $a : $b) = $c;

Because this operator produces an assignable result, using assignments
without parentheses will get you in trouble.  For example, this:

    $a % 2 ? $a += 10 : $a += 2

Really means this:

    (($a % 2) ? ($a += 10) : $a) += 2

Rather than this:

    ($a % 2) ? ($a += 10) : ($a += 2)

That should probably be written more simply as:

    $a += ($a % 2) ? 10 : 2;

=head2 Assignment Operators

"=" is the ordinary assignment operator.

Assignment operators work as in C.  That is,

    $a += 2;

is equivalent to

    $a = $a + 2;

although without duplicating any side effects that dereferencing the lvalue
might trigger, such as from tie().  Other assignment operators work similarly.
The following are recognized:

    **=    +=    *=    &=    <<=    &&=
           -=    /=    |=    >>=    ||=
           .=    %=    ^=
                 x=

Although these are grouped by family, they all have the precedence
of assignment.

Unlike in C, the scalar assignment operator produces a valid lvalue.
Modifying an assignment is equivalent to doing the assignment and
then modifying the variable that was assigned to.
This is useful
for modifying a copy of something, like this:

    ($tmp = $global) =~ tr [A-Z] [a-z];

Likewise,

    ($a += 2) *= 3;

is equivalent to

    $a += 2;
    $a *= 3;

Similarly, a list assignment in list context produces the list of
lvalues assigned to, and a list assignment in scalar context returns
the number of elements produced by the expression on the right hand
side of the assignment.

=head2 Comma Operator

Binary "," is the comma operator.  In scalar context it evaluates
its left argument, throws that value away, then evaluates its right
argument and returns that value.  This is just like C's comma operator.

In list context, it's just the list argument separator, and inserts
both its arguments into the list.

The C<< => >> operator is a synonym for the comma, but forces any word
to its left to be interpreted as a string (as of 5.001).  It is helpful
in documenting the correspondence between keys and values in hashes,
and other paired elements in lists.

=head2 List Operators (Rightward)

On the right side of a list operator, the comma has very low precedence,
such that it controls all comma-separated expressions found there.
The only operators with lower precedence are the logical operators
"and", "or", "xor", and "not", which may be used to evaluate calls to list
operators without the need for extra parentheses:

    open HANDLE, "filename"
        or die "Can't open: $!\n";

See also discussion of list operators in L<Terms and List Operators (Leftward)>.

=head2 Logical Not

Unary "not" returns the logical negation of the expression to its right.
It's the equivalent of "!" except for the very low precedence.

=head2 Logical And

Binary "and" returns the logical conjunction of the two surrounding
expressions.  It's equivalent to && except for the very low
precedence.  This means that it short-circuits: i.e., the right
expression is evaluated only if the left expression is true.

=head2 Logical or and Exclusive Or

Binary "or" returns the logical disjunction of the two surrounding
expressions.  It's equivalent to || except for the very low precedence.
This makes it useful for control flow:

    print FH $data              or die "Can't write to FH: $!";

This means that it short-circuits: i.e., the right expression is evaluated
only if the left expression is false.  Due to its precedence, you should
probably avoid using this for assignment, only for control flow.

    $a = $b or $c;              # bug: this is wrong
    ($a = $b) or $c;            # really means this
    $a = $b || $c;              # better written this way

However, when it's a list-context assignment and you're trying to use
"||" for control flow, you probably need "or" so that the assignment
takes higher precedence.

    @info = stat($file) || die;     # oops, scalar sense of stat!
    @info = stat($file) or die;     # better, now @info gets its due

Then again, you could always use parentheses.

Binary "xor" returns the exclusive-OR of the two surrounding expressions.
It cannot short circuit, of course.

=head2 C Operators Missing From Perl

Here is what C has that Perl doesn't:

=over 8

=item unary &

Address-of operator.  (But see the "\" operator for taking a reference.)

=item unary *

Dereference-address operator.  (Perl's prefix dereferencing
operators are typed: $, @, %, and &.)

=item (TYPE)

Type-casting operator.

=back

=head2 Quote and Quote-like Operators

While we usually think of quotes as literal values, in Perl they
function as operators, providing various kinds of interpolating and
pattern matching capabilities.  Perl provides customary quote characters
for these behaviors, but also provides a way for you to choose your
quote character for any of them.  In the following table, a C<{}> represents
any pair of delimiters you choose.

    Customary  Generic        Meaning        Interpolates
        ''       q{}          Literal             no
        ""      qq{}          Literal             yes
        ``      qx{}          Command             yes*
                qw{}         Word list            no
        //       m{}       Pattern match          yes*
                qr{}          Pattern             yes*
                 s{}{}      Substitution          yes*
                tr{}{}    Transliteration         no (but see below)
        <<EOF                 here-doc            yes*

        * unless the delimiter is ''.

Non-bracketing delimiters use the same character fore and aft, but the four
sorts of brackets (round, angle, square, curly) will all nest, which means
that

    q{foo{bar}baz}

is the same as

    'foo{bar}baz'

Note, however, that this does not always work for quoting Perl code:

    $s = q{ if($a eq "}") ... }; # WRONG

is a syntax error.
The C<Text::Balanced> module (from CPAN, and
starting from Perl 5.8 part of the standard distribution) is able
to do this properly.

There can be whitespace between the operator and the quoting
characters, except when C<#> is being used as the quoting character.
C<q#foo#> is parsed as the string C<foo>, while C<q #foo#> is the
operator C<q> followed by a comment.  Its argument will be taken
from the next line.  This allows you to write:

    s {foo}  # Replace foo
      {bar}  # with bar.

The following escape sequences are available in constructs that interpolate
and in transliterations.

    \t          tab             (HT, TAB)
    \n          newline         (NL)
    \r          return          (CR)
    \f          form feed       (FF)
    \b          backspace       (BS)
    \a          alarm (bell)    (BEL)
    \e          escape          (ESC)
    \033        octal char      (ESC)
    \x1b        hex char        (ESC)
    \x{263a}    wide hex char   (SMILEY)
    \c[         control char    (ESC)
    \N{name}    named Unicode character

B<NOTE>: Unlike C and other languages, Perl has no \v escape sequence for
the vertical tab (VT - ASCII 11).

The following escape sequences are available in constructs that interpolate
but not in transliterations.

    \l          lowercase next char
    \u          uppercase next char
    \L          lowercase till \E
    \U          uppercase till \E
    \E          end case modification
    \Q          quote non-word characters till \E

If C<use locale> is in effect, the case map used by C<\l>, C<\L>,
C<\u> and C<\U> is taken from the current locale.  See L<perllocale>.
If Unicode (for example, C<\N{}> or wide hex characters of 0x100 or
beyond) is being used, the case map used by C<\l>, C<\L>, C<\u> and
C<\U> is as defined by Unicode.  For documentation of C<\N{name}>,
see L<charnames>.

All systems use the virtual C<"\n"> to represent a line terminator,
called a "newline".  There is no such thing as an unvarying, physical
newline character.  It is only an illusion that the operating system,
device drivers, C libraries, and Perl all conspire to preserve.  Not all
systems read C<"\r"> as ASCII CR and C<"\n"> as ASCII LF.  For example,
on a classic (pre-OS X) Mac, these are reversed, and on systems without
a line terminator, printing C<"\n"> may emit no actual data.  In general,
use C<"\n"> when you mean a "newline" for your system, but use the
literal ASCII when you need an exact character.  For example, most
networking protocols expect
and prefer a CR+LF (C<"\015\012"> or C<"\cM\cJ">) for line terminators,
and although they often accept just C<"\012">, they seldom tolerate just
C<"\015">.
If you get in the habit of using C<"\n"> for networking,
you may be burned some day.

For constructs that do interpolate, variables beginning with "C<$>"
or "C<@>" are interpolated.  Subscripted variables such as C<$a[3]> or
C<< $href->{key}[0] >> are also interpolated, as are array and hash slices.
But method calls such as C<< $obj->meth >> are not.

Interpolating an array or slice interpolates the elements in order,
separated by the value of C<$">, so is equivalent to interpolating
C<join $", @array>.  "Punctuation" arrays such as C<@+> are only
interpolated if the name is enclosed in braces C<@{+}>.

You cannot include a literal C<$> or C<@> within a C<\Q> sequence.
An unescaped C<$> or C<@> interpolates the corresponding variable,
while escaping will cause the literal string C<\$> to be inserted.
You'll need to write something like C<m/\Quser\E\@\Qhost/>.

Patterns are subject to an additional level of interpretation as a
regular expression.  This is done as a second pass, after variables are
interpolated, so that regular expressions may be incorporated into the
pattern from the variables.  If this is not what you want, use C<\Q> to
interpolate a variable literally.

Apart from the behavior described above, Perl does not expand
multiple levels of interpolation.
In particular, contrary to the
expectations of shell programmers, back-quotes do I<NOT> interpolate
within double quotes, nor do single quotes impede evaluation of
variables when used within double quotes.

=head2 Regexp Quote-Like Operators

Here are the quote-like operators that apply to pattern
matching and related activities.

=over 8

=item ?PATTERN?

This is just like the C</pattern/> search, except that it matches only
once between calls to the reset() operator.  This is a useful
optimization when you want to see only the first occurrence of
something in each file of a set of files, for instance.  Only C<??>
patterns local to the current package are reset.

    while (<>) {
        if (?^$?) {
                            # blank line between header and body
        }
    } continue {
        reset if eof;       # clear ?? status for next file
    }

This usage is vaguely deprecated, which means it just might possibly
be removed in some distant future version of Perl, perhaps somewhere
around the year 2168.

=item m/PATTERN/cgimosx

=item /PATTERN/cgimosx

Searches a string for a pattern match, and in scalar context returns
true if it succeeds, false if it fails.
If no string is specified
via the C<=~> or C<!~> operator, the $_ string is searched.  (The
string specified with C<=~> need not be an lvalue--it may be the
result of an expression evaluation, but remember the C<=~> binds
rather tightly.)  See also L<perlre>.  See L<perllocale> for
discussion of additional considerations that apply when C<use locale>
is in effect.

Options are:

    c   Do not reset search position on a failed match when /g is in effect.
    g   Match globally, i.e., find all occurrences.
    i   Do case-insensitive pattern matching.
    m   Treat string as multiple lines.
    o   Compile pattern only once.
    s   Treat string as single line.
    x   Use extended regular expressions.

If "/" is the delimiter then the initial C<m> is optional.  With the C<m>
you can use any pair of non-alphanumeric, non-whitespace characters
as delimiters.  This is particularly useful for matching path names
that contain "/", to avoid LTS (leaning toothpick syndrome).  If "?" is
the delimiter, then the match-only-once rule of C<?PATTERN?> applies.
If "'" is the delimiter, no interpolation is performed on the PATTERN.

PATTERN may contain variables, which will be interpolated (and the
pattern recompiled) every time the pattern search is evaluated, except
for when the delimiter is a single quote.
(Note that C<$(>, C<$)>, and
C<$|> are not interpolated because they look like end-of-string tests.)
If you want such a pattern to be compiled only once, add a C</o> after
the trailing delimiter.  This avoids expensive run-time recompilations,
and is useful when the value you are interpolating won't change over
the life of the script.  However, mentioning C</o> constitutes a promise
that you won't change the variables in the pattern.  If you change them,
Perl won't even notice.  See also L<"qr/STRING/imosx">.

If the PATTERN evaluates to the empty string, the last
I<successfully> matched regular expression is used instead.  In this
case, only the C<g> and C<c> flags on the empty pattern are honoured;
the other flags are taken from the original pattern.  If no match has
previously succeeded, this will (silently) act instead as a genuine
empty pattern (which will always match).

If the C</g> option is not used, C<m//> in list context returns a
list consisting of the subexpressions matched by the parentheses in the
pattern, i.e., (C<$1>, C<$2>, C<$3>...).  (Note that here C<$1> etc. are
also set, and that this differs from Perl 4's behavior.)  When there are
no parentheses in the pattern, the return value is the list C<(1)> for
success.  With or without parentheses, an empty list is returned upon
failure.
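
The three list-context return shapes described above (captures, the
list C<(1)>, and the empty list) can be checked with a short script.
This is only a sketch; the sample string and the variable names are
made up for illustration.

```perl
use strict;
use warnings;

my $line = "alpha beta the rest of the line";

# With capturing parentheses, list context yields the captured substrings.
my @fields = ($line =~ /^(\S+)\s+(\S+)\s*(.*)/);

# Without parentheses, a successful match yields the one-element list (1).
my @ok = ($line =~ /beta/);

# A failed match yields the empty list, with or without parentheses.
my @fail = ($line =~ /^(\d+)/);

print scalar(@fields), " fields: ", join(" | ", @fields), "\n";
print "match without parens returned ", scalar(@ok), " element\n";
print "failed match returned ", scalar(@fail), " elements\n";
```

Because a failed match returns an empty list, a list assignment such
as the one to C<@fields> is false in a condition exactly when the
pattern did not match.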

Examples:

    open(TTY, '/dev/tty');
    <TTY> =~ /^y/i && foo();    # do foo if desired

    if (/Version: *([0-9.]*)/) { $version = $1; }

    next if m#^/usr/spool/uucp#;

    # poor man's grep
    $arg = shift;
    while (<>) {
        print if /$arg/o;       # compile only once
    }

    if (($F1, $F2, $Etc) = ($foo =~ /^(\S+)\s+(\S+)\s*(.*)/))

This last example splits $foo into the first two words and the
remainder of the line, and assigns those three fields to $F1, $F2, and
$Etc.  The conditional is true if any variables were assigned, i.e., if
the pattern matched.

The C</g> modifier specifies global pattern matching--that is,
matching as many times as possible within the string.  How it behaves
depends on the context.  In list context, it returns a list of the
substrings matched by any capturing parentheses in the regular
expression.  If there are no parentheses, it returns a list of all
the matched strings, as if there were parentheses around the whole
pattern.

In scalar context, each execution of C<m//g> finds the next match,
returning true if it matches, and false if there is no further match.
The position after the last match can be read or set using the pos()
function; see L<perlfunc/pos>.
A failed match normally resets the
search position to the beginning of the string, but you can avoid that
by adding the C</c> modifier (e.g. C<m//gc>).  Modifying the target
string also resets the search position.

You can intermix C<m//g> matches with C<m/\G.../g>, where C<\G> is a
zero-width assertion that matches the exact position where the previous
C<m//g>, if any, left off.  Without the C</g> modifier, the C<\G> assertion
still anchors at pos(), but the match is of course only attempted once.
Using C<\G> without C</g> on a target string that has not previously had a
C</g> match applied to it is the same as using the C<\A> assertion to match
the beginning of the string.  Note also that, currently, C<\G> is only
properly supported when anchored at the very beginning of the pattern.
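
As a small sketch of how pos(), C</c> and C<\G> interact, the
following script records the search position after each step.  The
string C<"aXbXc"> and the variable names are assumptions chosen only
for this illustration.

```perl
use strict;
use warnings;

my $text = "aXbXc";
my @positions;

# Each scalar-context m//g resumes where the previous match ended.
while ($text =~ /X/g) {
    push @positions, pos($text);
}

# The failed final attempt of the loop reset the search position.
my $after_loop = pos($text);

# With /c, a failed match leaves pos() where it was.
$text =~ /X/gc;                 # matches the first X
$text =~ /Q/gc;                 # fails, position is kept
my $kept = pos($text);

# \G anchors the next attempt exactly at pos().
my $next = ($text =~ /\G(.)/gc) ? $1 : undef;

print "positions after each X: @positions\n";
print "pos after loop is ", (defined $after_loop ? $after_loop : "undef"), "\n";
print "pos kept by /c: $kept, next char via \\G: $next\n";
```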

Examples:

    # list context
    ($one,$five,$fifteen) = (`uptime` =~ /(\d+\.\d+)/g);

    # scalar context
    $/ = "";
    while (defined($paragraph = <>)) {
        while ($paragraph =~ /[a-z]['")]*[.!?]+['")]*\s/g) {
            $sentences++;
        }
    }
    print "$sentences\n";

    # using m//gc with \G
    $_ = "ppooqppqq";
    while ($i++ < 2) {
        print "1: '";
        print $1 while /(o)/gc; print "', pos=", pos, "\n";
        print "2: '";
        print $1 if /\G(q)/gc;  print "', pos=", pos, "\n";
        print "3: '";
        print $1 while /(p)/gc; print "', pos=", pos, "\n";
    }
    print "Final: '$1', pos=",pos,"\n" if /\G(.)/;

The last example should print:

    1: 'oo', pos=4
    2: 'q', pos=5
    3: 'pp', pos=7
    1: '', pos=7
    2: 'q', pos=8
    3: '', pos=8
    Final: 'q', pos=8

Notice that the final match matched C<q> instead of C<p>, which a match
without the C<\G> anchor would have done.  Also note that the final match
did not update C<pos> -- C<pos> is only updated on a C</g> match.
If the
final match did indeed match C<p>, it's a good bet that you're running an
older (pre-5.6.0) Perl.

A useful idiom for C<lex>-like scanners is C</\G.../gc>.  You can
combine several regexps like this to process a string part-by-part,
doing different actions depending on which regexp matched.  Each
regexp tries to match where the previous one leaves off.

    $_ = <<'EOL';
         $url = new URI::URL "http://www/";   die if $url eq "xXx";
    EOL
    LOOP:
       {
         print(" digits"),         redo LOOP if /\G\d+\b[,.;]?\s*/gc;
         print(" lowercase"),      redo LOOP if /\G[a-z]+\b[,.;]?\s*/gc;
         print(" UPPERCASE"),      redo LOOP if /\G[A-Z]+\b[,.;]?\s*/gc;
         print(" Capitalized"),    redo LOOP if /\G[A-Z][a-z]+\b[,.;]?\s*/gc;
         print(" MiXeD"),          redo LOOP if /\G[A-Za-z]+\b[,.;]?\s*/gc;
         print(" alphanumeric"),   redo LOOP if /\G[A-Za-z0-9]+\b[,.;]?\s*/gc;
         print(" line-noise"),     redo LOOP if /\G[^A-Za-z0-9]+/gc;
         print ". That's all!\n";
       }

Here is the output (split into several lines):

    line-noise lowercase line-noise lowercase UPPERCASE line-noise
    UPPERCASE line-noise lowercase line-noise lowercase line-noise
    lowercase lowercase line-noise lowercase lowercase line-noise
    MiXeD line-noise. That's all!
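
The same C</\G.../gc> idiom can also collect tokens instead of
printing labels.  The following is a minimal sketch under stated
assumptions: the token classes (NUM, ID, OP), the operator set, and
the input string are all hypothetical and exist only for this example.

```perl
use strict;
use warnings;

my $input = "x1 = 42 + y;";
my @tokens;

for ($input) {                      # alias $_ to $input
    while (1) {
        # Every pattern is anchored with \G, and /c keeps pos() on failure,
        # so each alternative is tried at the same position in turn.
        if    (/\G\s+/gc)            { next }                    # skip blanks
        elsif (/\G(\d+)/gc)          { push @tokens, "NUM($1)" }
        elsif (/\G([A-Za-z_]\w*)/gc) { push @tokens, "ID($1)" }
        elsif (/\G([-+*\/=;])/gc)    { push @tokens, "OP($1)" }
        elsif (/\G\z/gc)             { last }                    # end of input
        else {
            my $at = defined pos($_) ? pos($_) : 0;
            die "unexpected character at offset $at\n";
        }
    }
}

print join(" ", @tokens), "\n";
```

The order of the alternatives matters: the first pattern that matches
at the current position wins, just as in the example with
C<redo LOOP>.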

=item q/STRING/

=item C<'STRING'>

A single-quoted, literal string.  A backslash represents a backslash
unless followed by the delimiter or another backslash, in which case
the delimiter or backslash is interpolated.

    $foo = q!I said, "You said, 'She said it.'"!;
    $bar = q('This is it.');
    $baz = '\n';                # a two-character string

=item qq/STRING/

=item "STRING"

A double-quoted, interpolated string.

    $_ .= qq
     (*** The previous line contains the naughty word "$1".\n)
                if /\b(tcl|java|python)\b/i;      # :-)
    $baz = "\n";                # a one-character string

=item qr/STRING/imosx

This operator quotes (and possibly compiles) its I<STRING> as a regular
expression.  I<STRING> is interpolated the same way as I<PATTERN>
in C<m/PATTERN/>.  If "'" is used as the delimiter, no interpolation
is done.  Returns a Perl value which may be used instead of the
corresponding C</STRING/imosx> expression.

For example,

    $rex = qr/my.STRING/is;
    s/$rex/foo/;

is equivalent to

    s/my.STRING/foo/is;

The result may be used as a subpattern in a match:

    $re = qr/$pattern/;
    $string =~ /foo${re}bar/;   # can be interpolated in other patterns
    $string =~ $re;             # or used standalone
    $string =~ /$re/;           # or this way

Since Perl may compile the pattern at the moment of execution of the qr()
operator, using qr() may have speed advantages in some situations,
notably if the result of qr() is used standalone:

    sub match {
        my $patterns = shift;
        my @compiled = map qr/$_/i, @$patterns;
        grep {
            my $success = 0;
            foreach my $pat (@compiled) {
                $success = 1, last if /$pat/;
            }
            $success;
        } @_;
    }

Precompilation of the pattern into an internal representation at
the moment of qr() avoids a need to recompile the pattern every
time a match C</$pat/> is attempted.  (Perl has many other internal
optimizations, but none would be triggered in the above example if
we did not use the qr() operator.)
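
A self-contained variation on the idea above, offered as a sketch
rather than a benchmark: the word list and sample lines are invented
for the example, and C<\Q...\E> is added so that the words are matched
literally.  It also shows that flags given to qr() travel with the
compiled pattern when it is interpolated into a larger one.

```perl
use strict;
use warnings;

my @words    = ("Foo", "bar");
my @compiled = map { qr/\b\Q$_\E\b/i } @words;   # compiled once, up front

my @lines = ("a foo here", "nothing", "BAR at start", "foobar joined");

# Keep the lines that match at least one precompiled pattern.
my @hits = grep {
    my $line = $_;
    scalar grep { $line =~ $_ } @compiled;
} @lines;

# The /i given to qr() is kept when the patterns are embedded elsewhere.
my ($re_foo, $re_bar) = @compiled;
my $either      = qr/$re_foo|$re_bar/;
my $embedded_ok = ("xx Bar yy" =~ $either) ? 1 : 0;

print "hits: ", join(", ", @hits), "\n";
print "embedded pattern matched: $embedded_ok\n";
```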

Options are:

    i   Do case-insensitive pattern matching.
    m   Treat string as multiple lines.
    o   Compile pattern only once.
    s   Treat string as single line.
    x   Use extended regular expressions.

See L<perlre> for additional information on valid syntax for STRING, and
for a detailed look at the semantics of regular expressions.

=item qx/STRING/

=item `STRING`

A string which is (possibly) interpolated and then executed as a
system command with C</bin/sh> or its equivalent. Shell wildcards,
pipes, and redirections will be honored. The collected standard
output of the command is returned; standard error is unaffected. In
scalar context, it comes back as a single (potentially multi-line)
string, or undef if the command failed. In list context, returns a
list of lines (however you've defined lines with $/ or
$INPUT_RECORD_SEPARATOR), or an empty list if the command failed.

Because backticks do not affect standard error, use shell file descriptor
syntax (assuming the shell supports this) if you care to address this.
To capture a command's STDERR and STDOUT together:

    $output = `cmd 2>&1`;

To capture a command's STDOUT but discard its STDERR:

    $output = `cmd 2>/dev/null`;

To capture a command's STDERR but discard its STDOUT (ordering is
important here):

    $output = `cmd 2>&1 1>/dev/null`;

To exchange a command's STDOUT and STDERR in order to capture the STDERR
but leave its STDOUT to come out the old STDERR:

    $output = `cmd 3>&1 1>&2 2>&3 3>&-`;

To read both a command's STDOUT and its STDERR separately, it's easiest
to redirect them separately to files, and then read from those files
when the program is done:

    system("program args 1>program.stdout 2>program.stderr");

Using single-quote as a delimiter protects the command from Perl's
double-quote interpolation, passing it on to the shell instead:

    $perl_info  = qx(ps $$);            # that's Perl's $$
    $shell_info = qx'ps $$';            # that's the new shell's $$

How that string gets evaluated is entirely subject to the command
interpreter on your system. On most platforms, you will have to protect
shell metacharacters if you want them treated literally. This is in
practice difficult to do, as it's unclear how to escape which characters.
See L<perlsec> for a clean and safe example of a manual fork() and exec()
to emulate backticks safely.

On some platforms (notably DOS-like ones), the shell may not be
capable of dealing with multiline commands, so putting newlines in
the string may not get you what you want. You may be able to evaluate
multiple commands in a single line by separating them with the command
separator character, if your shell supports that (e.g. C<;> on many Unix
shells; C<&> on the Windows NT C<cmd> shell).

Beginning with v5.6.0, Perl will attempt to flush all files opened for
output before starting the child process, but this may not be supported
on some platforms (see L<perlport>). To be safe, you may need to set
C<$|> ($AUTOFLUSH in English) or call the C<autoflush()> method of
C<IO::Handle> on any open handles.

Beware that some command shells may place restrictions on the length
of the command line. You must ensure your strings don't exceed this
limit after any necessary interpolations. See the platform-specific
release notes for more details about your particular environment.

Using this operator can lead to programs that are difficult to port,
because the shell commands called vary between systems, and may in
fact not be present at all. As one example, the C<type> command under
the POSIX shell is very different from the C<type> command under DOS.
That doesn't mean you should go out of your way to avoid backticks
when they're the right way to get something done. Perl was made to be
a glue language, and one of the things it glues together is commands.
Just understand what you're getting yourself into.

See L<"I/O Operators"> for more discussion.

=item qw/STRING/

Evaluates to a list of the words extracted out of STRING, using embedded
whitespace as the word delimiters. It can be understood as being roughly
equivalent to:

    split(' ', q/STRING/);

the differences being that it generates a real list at compile time, and
in scalar context it returns the last element in the list. So
this expression:

    qw(foo bar baz)

is semantically equivalent to the list:

    'foo', 'bar', 'baz'

Some frequently seen examples:

    use POSIX qw( setlocale localeconv );
    @EXPORT = qw( foo bar baz );

A common mistake is to try to separate the words with commas or to
put comments into a multi-line C<qw>-string. For this reason, the
C<use warnings> pragma and the B<-w> switch (that is, the C<$^W> variable)
produce warnings if the STRING contains the "," or the "#" character.
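The following sketch illustrates the equivalence and the comma pitfall described above. The variable names are invented for the example. The C<no warnings 'qw'> line is there only to silence the warning that the comma would otherwise trigger.

```perl
use strict;
use warnings;

my @words = qw(foo bar baz);
my @same  = ('foo', 'bar', 'baz');
print "identical\n" if "@words" eq "@same";

# Any run of whitespace, including newlines, separates the words.
my @days = qw(
    Mon Tue
    Wed
);
print scalar(@days), "\n";      # 3

# A comma is just another character inside qw, so the first
# element here is 'a,' rather than 'a'.
my @oops = do { no warnings 'qw'; qw(a, b) };
print "[$_]" for @oops;         # [a,][b]
print "\n";
```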

=item s/PATTERN/REPLACEMENT/egimosx

Searches a string for a pattern, and if found, replaces that pattern
with the replacement text and returns the number of substitutions
made. Otherwise it returns false (specifically, the empty string).

If no string is specified via the C<=~> or C<!~> operator, the C<$_>
variable is searched and modified. (The string specified with C<=~> must
be a scalar variable, an array element, a hash element, or an assignment
to one of those, i.e., an lvalue.)

If the delimiter chosen is a single quote, no interpolation is
done on either the PATTERN or the REPLACEMENT. Otherwise, if the
PATTERN contains a $ that looks like a variable rather than an
end-of-string test, the variable will be interpolated into the pattern
at run-time. If you want the pattern compiled only once the first time
the variable is interpolated, use the C</o> option. If the pattern
evaluates to the empty string, the last successfully executed regular
expression is used instead. See L<perlre> for further explanation on these.
See L<perllocale> for discussion of additional considerations that apply
when C<use locale> is in effect.

Options are:

    e   Evaluate the right side as an expression.
    g   Replace globally, i.e., all occurrences.
    i   Do case-insensitive pattern matching.
    m   Treat string as multiple lines.
    o   Compile pattern only once.
    s   Treat string as single line.
    x   Use extended regular expressions.

Any non-alphanumeric, non-whitespace delimiter may replace the
slashes. If single quotes are used, no interpretation is done on the
replacement string (the C</e> modifier overrides this, however). Unlike
Perl 4, Perl 5 treats backticks as normal delimiters; the replacement
text is not evaluated as a command. If the
PATTERN is delimited by bracketing quotes, the REPLACEMENT has its own
pair of quotes, which may or may not be bracketing quotes, e.g.,
C<s(foo)(bar)> or C<< s<foo>/bar/ >>. A C</e> will cause the
replacement portion to be treated as a full-fledged Perl expression
and evaluated right then and there. It is, however, syntax checked at
compile-time. A second C<e> modifier will cause the replacement portion
to be C<eval>ed before being run as a Perl expression.

Examples:

    s/\bgreen\b/mauve/g;                # don't change wintergreen

    $path =~ s|/usr/bin|/usr/local/bin|;

    s/Login: $foo/Login: $bar/;         # run-time pattern

    ($foo = $bar) =~ s/this/that/;      # copy first, then change

    $count = ($paragraph =~ s/Mister\b/Mr./g);  # get change-count

    $_ = 'abc123xyz';
    s/\d+/$&*2/e;                       # yields 'abc246xyz'
    s/\d+/sprintf("%5d",$&)/e;          # yields 'abc  246xyz'
    s/\w/$& x 2/eg;                     # yields 'aabbcc  224466xxyyzz'

    s/%(.)/$percent{$1}/g;              # change percent escapes; no /e
    s/%(.)/$percent{$1} || $&/ge;       # expr now, so /e
    s/^=(\w+)/&pod($1)/ge;              # use function call

    # expand variables in $_, but dynamics only, using
    # symbolic dereferencing
    s/\$(\w+)/${$1}/g;

    # Add one to the value of any numbers in the string
    s/(\d+)/1 + $1/eg;

    # This will expand any embedded scalar variable
    # (including lexicals) in $_ : First $1 is interpolated
    # to the variable name, and then evaluated
    s/(\$\w+)/$1/eeg;

    # Delete (most) C comments.
    $program =~ s {
        /\*     # Match the opening delimiter.
        .*?     # Match a minimal number of characters.
        \*/     # Match the closing delimiter.
    } []gsx;

    s/^\s*(.*?)\s*$/$1/;        # trim white space in $_, expensively

    for ($variable) {           # trim white space in $variable, cheap
        s/^\s+//;
        s/\s+$//;
    }

    s/([^ ]*) *([^ ]*)/$2 $1/;  # reverse 1st two fields

Note the use of $ instead of \ in the last example. Unlike
B<sed>, we use the \<I<digit>> form in only the left hand side.
Anywhere else it's $<I<digit>>.

Occasionally, you can't use just a C</g> to get all the changes
to occur that you might want. Here are two common cases:

    # put commas in the right places in an integer
    1 while s/(\d)(\d\d\d)(?!\d)/$1,$2/g;

    # expand tabs to 8-column spacing
    1 while s/\t+/' ' x (length($&)*8 - length($`)%8)/e;

=item tr/SEARCHLIST/REPLACEMENTLIST/cds

=item y/SEARCHLIST/REPLACEMENTLIST/cds

Transliterates all occurrences of the characters found in the search list
with the corresponding character in the replacement list. It returns
the number of characters replaced or deleted. If no string is
specified via the =~ or !~ operator, the $_ string is transliterated.
(The string specified with =~ must be a scalar variable, an array element, a
hash element, or an assignment to one of those, i.e., an lvalue.)

A character range may be specified with a hyphen, so C<tr/A-J/0-9/>
does the same replacement as C<tr/ACEGIBDFHJ/0246813579/>.
For B<sed> devotees, C<y> is provided as a synonym for C<tr>. If the
SEARCHLIST is delimited by bracketing quotes, the REPLACEMENTLIST has
its own pair of quotes, which may or may not be bracketing quotes,
e.g., C<tr[A-Z][a-z]> or C<tr(+\-*/)/ABCD/>.

Note that C<tr> does B<not> do regular expression character classes
such as C<\d> or C<[:lower:]>. The C<tr> operator is not equivalent to
the tr(1) utility. If you want to map strings between lower/upper
cases, see L<perlfunc/lc> and L<perlfunc/uc>, and in general consider
using the C<s> operator if you need regular expressions.

Note also that the whole range idea is rather unportable between
character sets--and even within character sets ranges may cause results
you probably didn't expect. A sound principle is to use only ranges
that begin from and end at either alphabets of equal case (a-e, A-E),
or digits (0-4). Anything else is unsafe. If in doubt, spell out the
character sets in full.

Options:

    c   Complement the SEARCHLIST.
    d   Delete found but unreplaced characters.
    s   Squash duplicate replaced characters.

If the C</c> modifier is specified, the SEARCHLIST character set
is complemented. If the C</d> modifier is specified, any characters
specified by SEARCHLIST not found in REPLACEMENTLIST are deleted.
(Note that this is slightly more flexible than the behavior of some
B<tr> programs, which delete anything they find in the SEARCHLIST,
period.) If the C</s> modifier is specified, sequences of characters
that were transliterated to the same character are squashed down
to a single instance of the character.

If the C</d> modifier is used, the REPLACEMENTLIST is always interpreted
exactly as specified. Otherwise, if the REPLACEMENTLIST is shorter
than the SEARCHLIST, the final character is replicated till it is long
enough. If the REPLACEMENTLIST is empty, the SEARCHLIST is replicated.
This latter is useful for counting characters in a class or for
squashing character sequences in a class.
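The rules for C</c>, C</d>, C</s>, and short or empty replacement lists can be checked with a short sketch like the one below. The sample strings and variable names are assumptions made for the example.

```perl
use strict;
use warnings;

my $text = "hello, world";

# Empty REPLACEMENTLIST: characters map to themselves, so tr only counts.
my $letters = ($text =~ tr/a-z//);              # 10, $text unchanged

# /d deletes SEARCHLIST characters that have no counterpart.
(my $no_vowels = $text) =~ tr/aeiou//d;         # "hll, wrld"

# Without /d, a short REPLACEMENTLIST is padded with its final character.
(my $padded = "abcde") =~ tr/a-e/xy/;           # "xyyyy"

# With /d, the REPLACEMENTLIST is taken exactly as written.
(my $exact = "abcde") =~ tr/a-e/xy/d;           # "xy"

# /c complements the SEARCHLIST; /s squashes runs mapped to one character.
(my $squashed = "a,,b;;;c") =~ tr/a-z/ /cs;     # "a b c"

print "$letters|$no_vowels|$padded|$exact|$squashed\n";
```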

Examples:

    $ARGV[1] =~ tr/A-Z/a-z/;    # canonicalize to lower case

    $cnt = tr/*/*/;             # count the stars in $_

    $cnt = $sky =~ tr/*/*/;     # count the stars in $sky

    $cnt = tr/0-9//;            # count the digits in $_

    tr/a-zA-Z//s;               # bookkeeper -> bokeper

    ($HOST = $host) =~ tr/a-z/A-Z/;

    tr/a-zA-Z/ /cs;             # change non-alphas to single space

    tr [\200-\377]
       [\000-\177];             # delete 8th bit

If multiple transliterations are given for a character, only the
first one is used:

    tr/AAA/XYZ/

will transliterate any A to X.

Because the transliteration table is built at compile time, neither
the SEARCHLIST nor the REPLACEMENTLIST is subjected to double quote
interpolation. That means that if you want to use variables, you
must use an eval():

    eval "tr/$oldlist/$newlist/";
    die $@ if $@;

    eval "tr/$oldlist/$newlist/, 1" or die $@;

=item <<EOF

A line-oriented form of quoting is based on the shell "here-document"
syntax.
Following a C<< << >> you specify a string to terminate
the quoted material, and all lines following the current line down to
the terminating string are the value of the item. The terminating
string may be either an identifier (a word), or some quoted text. If
quoted, the type of quotes you use determines the treatment of the
text, just as in regular quoting. An unquoted identifier works like
double quotes. There must be no space between the C<< << >> and
the identifier, unless the identifier is quoted. (If you put a space it
will be treated as a null identifier, which is valid, and matches the first
empty line.) The terminating string must appear by itself (unquoted and
with no surrounding whitespace) on the terminating line.

    print <<EOF;
    The price is $Price.
    EOF

    print << "EOF"; # same as above
    The price is $Price.
    EOF

    print << `EOC`; # execute commands
    echo hi there
    echo lo there
    EOC

    print <<"foo", <<"bar"; # you can stack them
    I said foo.
    foo
    I said bar.
    bar

    myfunc(<< "THIS", 23, <<'THAT');
    Here's a line
    or two.
    THIS
    and here's another.
    THAT

Just don't forget that you have to put a semicolon on the end
to finish the statement, as Perl doesn't know you're not going to
try to do this:

    print <<ABC
    179231
    ABC
       + 20;

If you want your here-docs to be indented with the
rest of the code, you'll need to remove leading whitespace
from each line manually:

    ($quote = <<'FINIS') =~ s/^\s+//gm;
       The Road goes ever on and on,
       down from the door where it began.
    FINIS

If you use a here-doc within a delimited construct, such as in C<s///eg>,
the quoted material must come on the lines following the final delimiter.
So instead of

    s/this/<<E . 'that'
    the other
    E
     . 'more '/eg;

you have to write

    s/this/<<E . 'that'
     . 'more '/eg;
    the other
    E

If the terminating identifier is on the last line of the program, you
must be sure there is a newline after it; otherwise, Perl will give the
warning B<Can't find string terminator "END" anywhere before EOF...>.
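A minimal sketch of the here-document rules above: a double-quoted terminator interpolates, a single-quoted one does not, and an indented body can be cleaned up afterwards. The names C<$name>, C<$interp>, C<$literal>, and C<$poem> are assumptions for the example. The terminators sit in column one because the terminating string must appear by itself on its line.

```perl
use strict;
use warnings;

my $name = "World";

# Double-quoted terminator: $name is interpolated.
my $interp = <<"END";
Hello, $name!
END

# Single-quoted terminator: the text is taken literally.
my $literal = <<'END';
Hello, $name!
END

# Indented body, with the leading whitespace stripped from each line.
(my $poem = <<'FINIS') =~ s/^\s+//gm;
    The Road goes ever on and on,
    down from the door where it began.
FINIS

print $interp, $literal, $poem;
```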

Additionally, the quoting rules for the identifier are not related to
Perl's quoting rules -- C<q()>, C<qq()>, and the like are not supported
in place of C<''> and C<"">, and the only interpolation is for backslashing
the quoting character:

    print << "abc\"def";
    testing...
    abc"def

Finally, quoted strings cannot span multiple lines. The general rule is
that the identifier must be a string literal. Stick with that, and you
should be safe.

=back

=head2 Gory details of parsing quoted constructs

When presented with something that might have several different
interpretations, Perl uses the B<DWIM> (that's "Do What I Mean")
principle to pick the most probable interpretation. This strategy
is so successful that Perl programmers often do not suspect the
ambivalence of what they write. But from time to time, Perl's
notions differ substantially from what the author honestly meant.

This section hopes to clarify how Perl handles quoted constructs.
Although the most common reason to learn this is to unravel labyrinthine
regular expressions, because the initial steps of parsing are the
same for all quoting operators, they are all discussed together.

The most important Perl parsing rule is the first one discussed
below: when processing a quoted construct, Perl first finds the end
of that construct, then interprets its contents. If you understand
this rule, you may skip the rest of this section on the first
reading. The other rules are likely to contradict the user's
expectations much less frequently than this first one.

Some passes discussed below are performed concurrently, but because
their results are the same, we consider them individually. For different
quoting constructs, Perl performs different numbers of passes, from
one to five, but these passes are always performed in the same order.

=over 4

=item Finding the end

The first pass is finding the end of the quoted construct, whether
it be a multicharacter delimiter C<"\nEOF\n"> in the C<< <<EOF >>
construct, a C</> that terminates a C<qq//> construct, a C<]> which
terminates a C<qq[]> construct, or a C<< > >> which terminates a
fileglob started with C<< < >>.

When searching for single-character non-pairing delimiters, such
as C</>, combinations of C<\\> and C<\/> are skipped. However,
when searching for a single-character pairing delimiter such as C<[>,
combinations of C<\\>, C<\]>, and C<\[> are all skipped, and nested
C<[>, C<]> are skipped as well.
When searching for multicharacter
delimiters, nothing is skipped.

For constructs with three-part delimiters (C<s///>, C<y///>, and
C<tr///>), the search is repeated once more.

During this search no attention is paid to the semantics of the construct.
Thus:

    "$hash{"$foo/$bar"}"

or:

    m/
      bar       # NOT a comment, this slash / terminated m//!
     /x

do not form legal quoted expressions. The quoted part ends on the
first C<"> and C</>, and the rest happens to be a syntax error.
Because the slash that terminated C<m//> was followed by a C<SPACE>,
the example above is not C<m//x>, but rather C<m//> with no C</x>
modifier. So the embedded C<#> is interpreted as a literal C<#>.

=item Removal of backslashes before delimiters

During the second pass, text between the starting and ending
delimiters is copied to a safe location, and the C<\> is removed
from combinations consisting of C<\> and delimiter--or delimiters,
meaning both starting and ending delimiters will be handled, should these
differ. This removal does not happen for multi-character delimiters.
Note that the combination C<\\> is left intact, just as it was.

Starting from this step no information about the delimiters is
used in parsing.
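The first two passes can be observed with C<q//>, where no other interpolation gets in the way. This is a minimal sketch with made-up variable names. It shows the visible result of these passes and makes no claim about how they are implemented.

```perl
use strict;
use warnings;

# An escaped delimiter does not end the construct, and its
# backslash is removed when the text is copied.
my $path = q/usr\/local\/bin/;          # usr/local/bin

# With pairing delimiters, nested pairs are skipped while
# searching for the end of the construct.
my $nested = q{a {nested} b};           # a {nested} b

# An escaped closing delimiter also loses its backslash.
my $brace = q{close \} here};           # close } here

# With a different delimiter the slashes need no escaping at all.
my $plain = q|usr/local/bin|;           # usr/local/bin

print "$path\n$nested\n$brace\n$plain\n";
```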

=item Interpolation

The next step is interpolation in the text obtained, which is now
delimiter-independent. There are four different cases.

=over 4

=item C<< <<'EOF' >>, C<m''>, C<s'''>, C<tr///>, C<y///>

No interpolation is performed.

=item C<''>, C<q//>

The only interpolation is removal of C<\> from pairs C<\\>.

=item C<"">, C<``>, C<qq//>, C<qx//>, C<< <file*glob> >>

C<\Q>, C<\U>, C<\u>, C<\L>, C<\l> (possibly paired with C<\E>) are
converted to corresponding Perl constructs. Thus, C<"$foo\Qbaz$bar">
is converted to C<$foo . (quotemeta("baz" . $bar))> internally.
The other combinations are replaced with appropriate expansions.

Let it be stressed that I<whatever falls between C<\Q> and C<\E>>
is interpolated in the usual way. Something like C<"\Q\\E"> has
no C<\E> inside. Instead, it has C<\Q>, C<\\>, and C<E>, so the
result is the same as for C<"\\\\E">. As a general rule, backslashes
between C<\Q> and C<\E> may lead to counterintuitive results. So,
C<"\Q\t\E"> is converted to C<quotemeta("\t")>, which is the same
as C<"\\\t"> (since TAB is not alphanumeric).
Note also that:

    $str = '\t';
    return "\Q$str";

may be closer to the conjectural I<intention> of the writer of C<"\Q\t\E">.

Interpolated scalars and arrays are converted internally to the C<join> and
C<.> catenation operations. Thus, C<"$foo XXX '@arr'"> becomes:

    $foo . " XXX '" . (join $", @arr) . "'";

All operations above are performed simultaneously, left to right.

Because the result of C<"\Q STRING \E"> has all metacharacters
quoted, there is no way to insert a literal C<$> or C<@> inside a
C<\Q\E> pair. If protected by C<\>, C<$> will be quoted to become
C<"\\\$">; if not, it is interpreted as the start of an interpolated
scalar.

Note also that the interpolation code needs to make a decision on
where the interpolated scalar ends. For instance, whether
C<< "a $b -> {c}" >> really means:

    "a " . $b . " -> {c}";

or:

    "a " . $b -> {c};

Most of the time, Perl picks the longest possible text that does not
include spaces between components and which contains matching braces or
brackets. Because the outcome may be determined by voting based
on heuristic estimators, the result is not strictly predictable.
Fortunately, it's usually correct for ambiguous cases.
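The conversions described for double-quoted strings can be checked directly. The sketch below compares each interpolated string with the expression the text says it is converted to. The variable names and values are assumptions for the example.

```perl
use strict;
use warnings;

my $foo = "x";
my @arr = (1, 2, 3);

# Array interpolation behaves like join with $" (a space by default).
my $s1 = "$foo XXX '@arr'";
my $s2 = $foo . " XXX '" . (join $", @arr) . "'";
print "join form matches\n" if $s1 eq $s2;

# "\t" is turned into a TAB first, and the TAB is then quoted.
my $tab = "\Q\t\E";
print "quoted TAB\n" if $tab eq "\\\t";

# Interpolating a variable that holds backslash-t quotes the backslash.
my $str = '\t';                 # two characters: backslash, t
my $q   = "\Q$str";             # three characters: backslash, backslash, t
print "quoted backslash\n" if $q eq '\\\\t';
```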

=item C<?RE?>, C</RE/>, C<m/RE/>, C<s/RE/foo/>

Processing of C<\Q>, C<\U>, C<\u>, C<\L>, C<\l>, and interpolation
happens (almost) as with C<qq//> constructs, but the substitution
of C<\> followed by RE-special chars (including C<\>) is not
performed. Moreover, inside C<(?{BLOCK})>, C<(?# comment )>, and
a C<#>-comment in a C<//x>-regular expression, no processing is
performed whatsoever. This is the first step at which the presence
of the C<//x> modifier is relevant.

Interpolation has several quirks: C<$|>, C<$(>, and C<$)> are not
interpolated, and constructs C<$var[SOMETHING]> are voted (by several
different estimators) to be either an array element or C<$var>
followed by an RE alternative. This is where the notation
C<${arr[$bar]}> comes in handy: C</${arr[0-9]}/> is interpreted as
array element C<-9>, not as a regular expression from the variable
C<$arr> followed by a digit, which would be the interpretation of
C</$arr[0-9]/>. Since voting among different estimators may occur,
the result is not predictable.

It is at this step that C<\1> is begrudgingly converted to C<$1> in
the replacement text of C<s///> to correct the incorrigible
I<sed> hackers who haven't picked up the saner idiom yet. A warning
is emitted if the C<use warnings> pragma or the B<-w> command-line flag
(that is, the C<$^W> variable) was set.
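
A minimal sketch of the two points above, using the hypothetical
variables C<$arr>, C<@arr>, and C<$swapped>: braces select between the
array-element and character-class readings without relying on the
estimators, and C<$1> is the preferred spelling in the replacement part
of C<s///>.

    use strict;
    use warnings;

    my $arr = 'x';
    my @arr = ('zero', 'one');

    # ${arr}[0-9]: the scalar $arr followed by a character class.
    my $class_match = 'x5' =~ /^${arr}[0-9]$/ ? 1 : 0;

    # ${arr[1]}: element 1 of the array @arr.
    my $elem_match = 'one' =~ /^${arr[1]}$/ ? 1 : 0;

    # $1 and $2 in the replacement text; \1 and \2 would be converted
    # to the same thing, but draw a warning when warnings are enabled.
    (my $swapped = 'hello world') =~ s/(\w+) (\w+)/$2 $1/;

    print "$class_match $elem_match $swapped\n";
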

The lack of processing of C<\\> creates specific restrictions on
the post-processed text. If the delimiter is C</>, one cannot get
the combination C<\/> into the result of this step. C</> will
finish the regular expression, C<\/> will be stripped to C</> on
the previous step, and C<\\/> will be left as is. Because C</> is
equivalent to C<\/> inside a regular expression, this does not
matter unless the delimiter happens to be a character special to the
RE engine, such as in C<s*foo*bar*>, C<m[foo]>, or C<?foo?>; or an
alphanumeric char, as in:

    m m ^ a \s* b mmx;

In the RE above, which is intentionally obfuscated for illustration, the
delimiter is C<m>, the modifier is C<mx>, and after backslash-removal the
RE is the same as for C<m/ ^ a \s* b /mx>. There's more than one
reason you're encouraged to restrict your delimiters to non-alphanumeric,
non-whitespace choices.

=back

This step is the last one for all constructs except regular expressions,
which are processed further.

=item Interpolation of regular expressions

Previous steps were performed during the compilation of Perl code,
but this one happens at run time--although it may be optimized to
be calculated at compile time if appropriate.
After the preprocessing
described above, and possibly after evaluation if catenation,
joining, casing translation, or metaquoting are involved, the
resulting I<string> is passed to the RE engine for compilation.

Whatever happens in the RE engine might be better discussed in L<perlre>,
but for the sake of continuity, we shall do so here.

This is another step where the presence of the C<//x> modifier is
relevant. The RE engine scans the string from left to right and
converts it to a finite automaton.

Backslashed characters are either replaced with corresponding
literal strings (as with C<\{>), or else they generate special nodes
in the finite automaton (as with C<\b>). Characters special to the
RE engine (such as C<|>) generate corresponding nodes or groups of
nodes. C<(?#...)> comments are ignored. All the rest is either
converted to literal strings to match, or else is ignored (as are
whitespace and C<#>-style comments if C<//x> is present).

Parsing of the bracketed character class construct, C<[...]>, is
rather different than the rule used for the rest of the pattern.
The terminator of this construct is found using the same rules as
for finding the terminator of a C<{}>-delimited construct, the only
exception being that C<]> immediately following C<[> is treated as
though preceded by a backslash.
Similarly, the terminator of
C<(?{...})> is found using the same rules as for finding the
terminator of a C<{}>-delimited construct.

It is possible to inspect both the string given to the RE engine and the
resulting finite automaton. See the arguments C<debug>/C<debugcolor>
in the C<use L<re>> pragma, as well as Perl's B<-Dr> command-line
switch documented in L<perlrun/"Command Switches">.

=item Optimization of regular expressions

This step is listed for completeness only. Since it does not change
semantics, details of this step are not documented and are subject
to change without notice. This step is performed over the finite
automaton that was generated during the previous pass.

It is at this stage that C<split()> silently optimizes C</^/> to
mean C</^/m>.

=back

=head2 I/O Operators

There are several I/O operators you should know about.

A string enclosed by backticks (grave accents) first undergoes
double-quote interpolation. It is then interpreted as an external
command, and the output of that command is the value of the
backtick string, like in a shell. In scalar context, a single string
consisting of all output is returned. In list context, a list of
values is returned, one per line of output.
(You can set C<$/> to use
a different line terminator.) The command is executed each time the
pseudo-literal is evaluated. The status value of the command is
returned in C<$?> (see L<perlvar> for the interpretation of C<$?>).
Unlike in B<csh>, no translation is done on the return data--newlines
remain newlines. Unlike in any of the shells, single quotes do not
hide variable names in the command from interpretation. To pass a
literal dollar-sign through to the shell you need to hide it with a
backslash. The generalized form of backticks is C<qx//>. (Because
backticks always undergo shell expansion as well, see L<perlsec> for
security concerns.)

In scalar context, evaluating a filehandle in angle brackets yields
the next line from that file (the newline, if any, included), or
C<undef> at end-of-file or on error. When C<$/> is set to C<undef>
(sometimes known as file-slurp mode) and the file is empty, it
returns C<''> the first time, followed by C<undef> subsequently.

Ordinarily you must assign the returned value to a variable, but
there is one situation where an automatic assignment happens. If
and only if the input symbol is the only thing inside the conditional
of a C<while> statement (even if disguised as a C<for(;;)> loop),
the value is automatically assigned to the global variable $_,
destroying whatever was there previously.
(This may seem like an
odd thing to you, but you'll use the construct in almost every Perl
script you write.) The $_ variable is not implicitly localized.
You'll have to put a C<local $_;> before the loop if you want that
to happen.

The following lines are equivalent:

    while (defined($_ = <STDIN>)) { print; }
    while ($_ = <STDIN>) { print; }
    while (<STDIN>) { print; }
    for (;<STDIN>;) { print; }
    print while defined($_ = <STDIN>);
    print while ($_ = <STDIN>);
    print while <STDIN>;

This also behaves similarly, but avoids $_:

    while (my $line = <STDIN>) { print $line }

In these loop constructs, the assigned value (whether assignment
is automatic or explicit) is then tested to see whether it is
defined. The defined test avoids problems where a line has a string
value that would be treated as false by Perl, for example a "" or
a "0" with no trailing newline. If you really mean for such values
to terminate the loop, they should be tested for explicitly:

    while (($_ = <STDIN>) ne '0') { ... }
    while (<STDIN>) { last unless $_; ... }

In other boolean contexts, C<< <I<filehandle>> >> without an
explicit C<defined> test or comparison elicits a warning if the
C<use warnings> pragma or the B<-w>
command-line switch (the C<$^W> variable) is in effect.

The filehandles STDIN, STDOUT, and STDERR are predefined. (The
filehandles C<stdin>, C<stdout>, and C<stderr> will also work except
in packages, where they would be interpreted as local identifiers
rather than global.) Additional filehandles may be created with
the open() function, amongst others. See L<perlopentut> and
L<perlfunc/open> for details on this.

If a <FILEHANDLE> is used in a context that is looking for
a list, a list comprising all input lines is returned, one line per
list element. It's easy to grow to a rather large data space this
way, so use with care.

<FILEHANDLE> may also be spelled C<readline(*FILEHANDLE)>.
See L<perlfunc/readline>.

The null filehandle <> is special: it can be used to emulate the
behavior of B<sed> and B<awk>. Input from <> comes either from
standard input, or from each file listed on the command line. Here's
how it works: the first time <> is evaluated, the @ARGV array is
checked, and if it is empty, C<$ARGV[0]> is set to "-", which when opened
gives you standard input. The @ARGV array is then processed as a list
of filenames.
The loop

    while (<>) {
        ...                     # code for each line
    }

is equivalent to the following Perl-like pseudo code:

    unshift(@ARGV, '-') unless @ARGV;
    while ($ARGV = shift) {
        open(ARGV, $ARGV);
        while (<ARGV>) {
            ...                 # code for each line
        }
    }

except that it isn't so cumbersome to say, and will actually work.
It really does shift the @ARGV array and put the current filename
into the $ARGV variable. It also uses filehandle I<ARGV>
internally--<> is just a synonym for <ARGV>, which
is magical. (The pseudo code above doesn't work because it treats
<ARGV> as non-magical.)

You can modify @ARGV before the first <> as long as the array ends up
containing the list of filenames you really want. Line numbers (C<$.>)
continue as though the input were one big happy file. See the example
in L<perlfunc/eof> for how to reset line numbers on each file.

If you want to set @ARGV to your own list of files, go right ahead.
This sets @ARGV to all plain text files if no @ARGV was given:

    @ARGV = grep { -f && -T } glob('*') unless @ARGV;

You can even set them to pipe commands.
For example, this automatically
filters compressed arguments through B<gzip>:

    @ARGV = map { /\.(gz|Z)$/ ? "gzip -dc < $_ |" : $_ } @ARGV;

If you want to pass switches into your script, you can use one of the
Getopts modules or put a loop on the front like this:

    while ($_ = $ARGV[0], /^-/) {
        shift;
        last if /^--$/;
        if (/^-D(.*)/) { $debug = $1 }
        if (/^-v/)     { $verbose++  }
        # ...           # other switches
    }

    while (<>) {
        # ...           # code for each line
    }

The <> symbol will return C<undef> for end-of-file only once.
If you call it again after this, it will assume you are processing another
@ARGV list, and if you haven't set @ARGV, will read input from STDIN.

If what the angle brackets contain is a simple scalar variable (e.g.,
<$foo>), then that variable contains the name of the
filehandle to input from, or its typeglob, or a reference to the
same.
For example:

    $fh = \*STDIN;
    $line = <$fh>;

If what's within the angle brackets is neither a filehandle nor a simple
scalar variable containing a filehandle name, typeglob, or typeglob
reference, it is interpreted as a filename pattern to be globbed, and
either a list of filenames or the next filename in the list is returned,
depending on context. This distinction is determined on syntactic
grounds alone. That means C<< <$x> >> is always a readline() from
an indirect handle, but C<< <$hash{key}> >> is always a glob().
That's because $x is a simple scalar variable, but C<$hash{key}> is
not--it's a hash element.

One level of double-quote interpretation is done first, but you can't
say C<< <$foo> >> because that's an indirect filehandle as explained
in the previous paragraph. (In older versions of Perl, programmers
would insert curly brackets to force interpretation as a filename glob:
C<< <${foo}> >>. These days, it's considered cleaner to call the
internal function directly as C<glob($foo)>, which is probably the right
way to have done it in the first place.)
For example:

    while (<*.c>) {
        chmod 0644, $_;
    }

is roughly equivalent to:

    open(FOO, "echo *.c | tr -s ' \t\r\f' '\\012\\012\\012\\012'|");
    while (<FOO>) {
        chomp;
        chmod 0644, $_;
    }

except that the globbing is actually done internally using the standard
C<File::Glob> extension. Of course, the shortest way to do the above is:

    chmod 0644, <*.c>;

A (file)glob evaluates its (embedded) argument only when it is
starting a new list. All values must be read before it will start
over. In list context, this isn't important because you automatically
get them all anyway. However, in scalar context the operator returns
the next value each time it's called, or C<undef> when the list has
run out. As with filehandle reads, an automatic C<defined> is
generated when the glob occurs in the test part of a C<while>,
because legal glob returns (e.g. a file called F<0>) would otherwise
terminate the loop. Again, C<undef> is returned only once.
So if
you're expecting a single value from a glob, it is much better to
say

    ($file) = <blurch*>;

than

    $file = <blurch*>;

because the latter will alternate between returning a filename and
returning false.

If you're trying to do variable interpolation, it's definitely better
to use the glob() function, because the older notation can cause people
to become confused with the indirect filehandle notation.

    @files = glob("$dir/*.[ch]");
    @files = glob($files[$i]);

=head2 Constant Folding

Like C, Perl does a certain amount of expression evaluation at
compile time whenever it determines that all arguments to an
operator are static and have no side effects. In particular, string
concatenation happens at compile time between literals that don't do
variable substitution. Backslash interpolation also happens at
compile time. You can say

    'Now is the time for all' . "\n" .
        'good men to come to.'

and this all reduces to one string internally.
Likewise, if
you say

    foreach $file (@filenames) {
        if (-s $file > 5 + 100 * 2**16) {  }
    }

the compiler will precompute the number which that expression
represents so that the interpreter won't have to.

=head2 Bitwise String Operators

Bitstrings of any size may be manipulated by the bitwise operators
(C<~ | & ^>).

If the operands to a binary bitwise op are strings of different
sizes, B<|> and B<^> ops act as though the shorter operand had
additional zero bits on the right, while the B<&> op acts as though
the longer operand were truncated to the length of the shorter.
The granularity for such extension or truncation is one or more
bytes.

    # ASCII-based examples
    print "j p \n" ^ " a h";            # prints "JAPH\n"
    print "JA" | "  ph\n";              # prints "japh\n"
    print "japh\nJunk" & '_____';       # prints "JAPH\n";
    print 'p N$' ^ " E<H\n";            # prints "Perl\n";

If you are intending to manipulate bitstrings, be certain that
you're supplying bitstrings: If an operand is a number, that will imply
a B<numeric> bitwise operation. You may explicitly show which type of
operation you intend by using C<""> or C<0+>, as in the examples below.

    $foo =  150  |  105 ;       # yields 255  (0x96 | 0x69 is 0xFF)
    $foo = '150' |  105 ;       # yields 255
    $foo =  150  | '105';       # yields 255
    $foo = '150' | '105';       # yields string '155' (under ASCII)

    $baz = 0+$foo & 0+$bar;     # both ops explicitly numeric
    $biz = "$foo" ^ "$bar";     # both ops explicitly stringy

See L<perlfunc/vec> for information on how to manipulate individual bits
in a bit vector.

=head2 Integer Arithmetic

By default, Perl assumes that it must do most of its arithmetic in
floating point. But by saying

    use integer;

you may tell the compiler that it's okay to use integer operations
(if it feels like it) from here to the end of the enclosing BLOCK.
An inner BLOCK may countermand this by saying

    no integer;

which lasts until the end of that BLOCK. Note that this doesn't
mean everything is only an integer, merely that Perl may use integer
operations if it is so inclined. For example, even under C<use
integer>, if you take the C<sqrt(2)>, you'll still get C<1.4142135623731>
or so.

Used on numbers, the bitwise operators ("&", "|", "^", "~", "<<",
and ">>") always produce integral results. (But see also
L<Bitwise String Operators>.)
However, C<use integer> still has meaning for
them. By default, their results are interpreted as unsigned integers, but
if C<use integer> is in effect, their results are interpreted
as signed integers. For example, C<~0> usually evaluates to a large
integral value. However, C<use integer; ~0> is C<-1> on twos-complement
machines.

=head2 Floating-point Arithmetic

While C<use integer> provides integer-only arithmetic, there is no
analogous mechanism to provide automatic rounding or truncation to a
certain number of decimal places. For rounding to a certain number
of digits, sprintf() or printf() is usually the easiest route.
See L<perlfaq4>.

Floating-point numbers are only approximations to what a mathematician
would call real numbers. There are infinitely more reals than floats,
so some corners must be cut. For example:

    printf "%.20g\n", 123456789123456789;
    #        produces 123456789123456784

Testing for exact floating-point equality or inequality is
not a good idea. Here's a (relatively expensive) work-around to compare
whether two floating-point numbers are equal to a particular number of
decimal places. See Knuth, volume II, for a more robust treatment of
this topic.

    sub fp_equal {
        my ($X, $Y, $POINTS) = @_;
        my ($tX, $tY);
        $tX = sprintf("%.${POINTS}g", $X);
        $tY = sprintf("%.${POINTS}g", $Y);
        return $tX eq $tY;
    }

The POSIX module (part of the standard perl distribution) implements
ceil(), floor(), and other mathematical and trigonometric functions.
The Math::Complex module (part of the standard perl distribution)
defines mathematical functions that work on both the reals and the
imaginary numbers. Math::Complex is not as efficient as POSIX, but
POSIX can't work with complex numbers.

Rounding in financial applications can have serious implications, and
the rounding method used should be specified precisely. In these
cases, it probably pays not to trust whichever system rounding is
being used by Perl, but to instead implement the rounding function you
need yourself.

=head2 Bigger Numbers

The standard Math::BigInt and Math::BigFloat modules provide
variable-precision arithmetic and overloaded operators, although
they're currently pretty slow. At the cost of some space and
considerable speed, they avoid the normal pitfalls associated with
limited-precision representations.

    use Math::BigInt;
    $x = Math::BigInt->new('123456789123456789');
    print $x * $x;

    # prints +15241578780673678515622620750190521

Several modules let you calculate with unlimited precision (bounded
only by memory and CPU time) or with fixed precision. There are also
some non-standard modules that provide faster implementations via
external C libraries.

Here is a short but incomplete summary:

  Math::Fraction           big, unlimited fractions like 9973 / 12967
  Math::String             treat string sequences like numbers
  Math::FixedPrecision     calculate with a fixed precision
  Math::Currency           for currency calculations
  Bit::Vector              manipulate bit vectors fast (uses C)
  Math::BigIntFast         Bit::Vector wrapper for big numbers
  Math::Pari               provides access to the Pari C library
  Math::BigInteger         uses an external C library
  Math::Cephes             uses external Cephes C library (no big numbers)
  Math::Cephes::Fraction   fractions via the Cephes library
  Math::GMP                another one using an external C library

Choose wisely.

=cut