xref: /onnv-gate/usr/src/cmd/perl/5.8.4/distrib/pod/perlop.pod (revision 0:68f95e015346)
=head1 NAME

perlop - Perl operators and precedence

=head1 DESCRIPTION

=head2 Operator Precedence and Associativity

Operator precedence and associativity work in Perl more or less like
they do in mathematics.

I<Operator precedence> means some operators are evaluated before
others.  For example, in C<2 + 4 * 5>, the multiplication has higher
precedence so C<4 * 5> is evaluated first yielding C<2 + 20 ==
22> and not C<6 * 5 == 30>.

I<Operator associativity> defines what happens if a sequence of the
same operators is used one after another: whether the evaluator will
evaluate the left operations first or the right.  For example, in C<8
- 4 - 2>, subtraction is left associative so Perl evaluates the
expression left to right.  C<8 - 4> is evaluated first making the
expression C<4 - 2 == 2> and not C<8 - 2 == 6>.
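
The two rules above can be checked directly.  The following is only a
small sketch; the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Multiplication binds tighter than addition.
my $sum = 2 + 4 * 5;            # parsed as 2 + (4 * 5)

# Subtraction is left associative.
my $diff = 8 - 4 - 2;           # parsed as (8 - 4) - 2

# Parentheses override both rules.
my $forced_sum  = (2 + 4) * 5;
my $forced_diff = 8 - (4 - 2);

print "$sum $diff $forced_sum $forced_diff\n";   # prints "22 2 30 6"
```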

Perl operators have the following associativity and precedence,
listed from highest precedence to lowest.  Operators borrowed from
C keep the same precedence relationship with each other, even where
C's precedence is slightly screwy.  (This makes learning Perl easier
for C folks.)  With very few exceptions, these all operate on scalar
values only, not array values.

    left        terms and list operators (leftward)
    left        ->
    nonassoc    ++ --
    right       **
    right       ! ~ \ and unary + and -
    left        =~ !~
    left        * / % x
    left        + - .
    left        << >>
    nonassoc    named unary operators
    nonassoc    < > <= >= lt gt le ge
    nonassoc    == != <=> eq ne cmp
    left        &
    left        | ^
    left        &&
    left        ||
    nonassoc    ..  ...
    right       ?:
    right       = += -= *= etc.
    left        , =>
    nonassoc    list operators (rightward)
    right       not
    left        and
    left        or xor

In the following sections, these operators are covered in precedence order.

Many operators can be overloaded for objects.  See L<overload>.

=head2 Terms and List Operators (Leftward)

A TERM has the highest precedence in Perl.  TERMs include variables,
quote and quote-like operators, any expression in parentheses,
and any function whose arguments are parenthesized.  Actually, there
aren't really functions in this sense, just list operators and unary
operators behaving as functions because you put parentheses around
the arguments.  These are all documented in L<perlfunc>.

If any list operator (print(), etc.) or any unary operator (chdir(), etc.)
is followed by a left parenthesis as the next token, the operator and
arguments within parentheses are taken to be of highest precedence,
just like a normal function call.

In the absence of parentheses, the precedence of list operators such as
C<print>, C<sort>, or C<chmod> is either very high or very low depending on
whether you are looking at the left side or the right side of the operator.
For example, in

    @ary = (1, 3, sort 4, 2);
    print @ary;         # prints 1324

the commas on the right of the sort are evaluated before the sort,
but the commas on the left are evaluated after.  In other words,
list operators tend to gobble up all arguments that follow, and
then act like a simple TERM with regard to the preceding expression.
Be careful with parentheses:

    # These evaluate exit before doing the print:
    print($foo, exit);  # Obviously not what you want.
    print $foo, exit;   # Nor is this.

    # These do the print before evaluating exit:
    (print $foo), exit; # This is what you want.
    print($foo), exit;  # Or this.
    print ($foo), exit; # Or even this.

Also note that

    print ($foo & 255) + 1, "\n";

probably doesn't do what you expect at first glance.  The parentheses
enclose the argument list for C<print> which is evaluated (printing
the result of C<$foo & 255>).  Then one is added to the return value
of C<print> (usually 1).  The result is something like this:

    1 + 1, "\n";    # Obviously not what you meant.

To do what you meant properly, you must write:

    print(($foo & 255) + 1, "\n");

See L<Named Unary Operators> for more discussion of this.

Also parsed as terms are the C<do {}> and C<eval {}> constructs, as
well as subroutine and method calls, and the anonymous
constructors C<[]> and C<{}>.

See also L<Quote and Quote-like Operators> toward the end of this section,
as well as L<"I/O Operators">.

=head2 The Arrow Operator

"C<< -> >>" is an infix dereference operator, just as it is in C
and C++.  If the right side is either a C<[...]>, C<{...}>, or a
C<(...)> subscript, then the left side must be either a hard or
symbolic reference to an array, a hash, or a subroutine respectively.
(Or technically speaking, a location capable of holding a hard
reference, if it's an array or hash reference being used for
assignment.)  See L<perlreftut> and L<perlref>.

Otherwise, the right side is a method name or a simple scalar
variable containing either the method name or a subroutine reference,
and the left side must be either an object (a blessed reference)
or a class name (that is, a package name).  See L<perlobj>.

=head2 Auto-increment and Auto-decrement

"++" and "--" work as in C.  That is, if placed before a variable,
they increment or decrement the variable by one before returning the
value, and if placed after, increment or decrement after returning the
value.

    $i = 0;  $j = 0;
    print $i++;  # prints 0
    print ++$j;  # prints 1

The auto-increment operator has a little extra builtin magic to it.  If
you increment a variable that is numeric, or that has ever been used in
a numeric context, you get a normal increment.  If, however, the
variable has been used in only string contexts since it was set, and
has a value that is not the empty string and matches the pattern
C</^[a-zA-Z]*[0-9]*\z/>, the increment is done as a string, preserving each
character within its range, with carry:

    print ++($foo = '99');      # prints '100'
    print ++($foo = 'a0');      # prints 'a1'
    print ++($foo = 'Az');      # prints 'Ba'
    print ++($foo = 'zz');      # prints 'aaa'

C<undef> is always treated as numeric, and in particular is changed
to C<0> before incrementing (so that a post-increment of an undef value
will return C<0> rather than C<undef>).

The auto-decrement operator is not magical.
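
As a further sketch of these rules (the sample strings and variable
names are chosen here purely for illustration), the following shows
carry across character classes, the C<undef> case, and the
non-magical decrement:

```perl
use strict;
use warnings;

# String increment with carry; each character stays within its range.
my @results;
for my $start ('a9', 'Zz', 'zz99') {
    my $s = $start;         # only ever used as a string
    $s++;
    push @results, $s;
}
print "@results\n";         # prints "b0 AAa aaa00"

# undef counts as 0, so a post-increment returns 0 rather than undef.
my $u;
my $before = $u++;

# Auto-decrement is not magical: the string is converted to a number.
my $t = 'ab';
{
    no warnings 'numeric';  # 'ab' is not numeric and counts as 0
    $t--;
}
print "$before $u $t\n";    # prints "0 1 -1"
```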

=head2 Exponentiation

Binary "**" is the exponentiation operator.  It binds even more
tightly than unary minus, so -2**4 is -(2**4), not (-2)**4. (This is
implemented using C's pow(3) function, which actually works on doubles
internally.)
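
A short sketch of the binding described above, together with the right
associativity listed for C<**> in the precedence table (variable names
are illustrative only):

```perl
use strict;
use warnings;

my $neg   = -2 ** 4;        # -(2 ** 4)
my $paren = (-2) ** 4;      # parentheses force the negation first
my $tower = 2 ** 3 ** 2;    # right associative: 2 ** (3 ** 2)

print "$neg $paren $tower\n";   # prints "-16 16 512"
```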

=head2 Symbolic Unary Operators

Unary "!" performs logical negation, i.e., "not".  See also C<not> for a lower
precedence version of this.

Unary "-" performs arithmetic negation if the operand is numeric.  If
the operand is an identifier, a string consisting of a minus sign
concatenated with the identifier is returned.  Otherwise, if the string
starts with a plus or minus, a string starting with the opposite sign
is returned.  One effect of these rules is that C<-bareword> is equivalent
to C<"-bareword">.

Unary "~" performs bitwise negation, i.e., 1's complement.  For
example, C<0666 & ~027> is 0640.  (See also L<Integer Arithmetic> and
L<Bitwise String Operators>.)  Note that the width of the result is
platform-dependent: ~0 is 32 bits wide on a 32-bit platform, but 64
bits wide on a 64-bit platform, so if you are expecting a certain bit
width, remember to use the & operator to mask off the excess bits.
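
The masking advice can be sketched as follows.  The 32-bit mask is an
assumption made for the example; use whatever width your data
requires.

```perl
use strict;
use warnings;

# Clear the bits named in a umask-style value.
my $mode = 0666 & ~027;
printf "%04o\n", $mode;         # prints "0640"

# ~0 may be 32 or 64 bits wide; mask it to get a known width.
my $low32 = ~0 & 0xFFFFFFFF;
printf "%x\n", $low32;          # prints "ffffffff"
```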

Unary "+" has no effect whatsoever, even on strings.  It is useful
syntactically for separating a function name from a parenthesized expression
that would otherwise be interpreted as the complete list of function
arguments.  (See examples above under L<Terms and List Operators (Leftward)>.)

Unary "\" creates a reference to whatever follows it.  See L<perlreftut>
and L<perlref>.  Do not confuse this behavior with the behavior of
backslash within a string, although both forms do convey the notion
of protecting the next thing from interpolation.

=head2 Binding Operators

Binary "=~" binds a scalar expression to a pattern match.  Certain operations
search or modify the string $_ by default.  This operator makes that kind
of operation work on some other string.  The right argument is a search
pattern, substitution, or transliteration.  The left argument is what is
supposed to be searched, substituted, or transliterated instead of the default
$_.  When used in scalar context, the return value generally indicates the
success of the operation.  Behavior in list context depends on the particular
operator.  See L</"Regexp Quote-Like Operators"> for details.

If the right argument is an expression rather than a search pattern,
substitution, or transliteration, it is interpreted as a search pattern at run
time.

Binary "!~" is just like "=~" except the return value is negated in
the logical sense.
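
A minimal sketch of the binding operators in use; C<$text>, C<$copy>
and C<$pat> are names invented for this example.

```perl
use strict;
use warnings;

my $text = "Hello World";

# Bind a transliteration to a copy instead of $_.
(my $copy = $text) =~ tr/A-Z/a-z/;

# Scalar-context matches report success or failure.
my $has_world = ($text =~ /World/) ? 1 : 0;
my $no_digits = ($text !~ /\d/)    ? 1 : 0;

# A plain expression on the right is used as a pattern at run time.
my $pat       = 'W.rld';
my $from_expr = ($text =~ $pat)    ? 1 : 0;

print "$copy $has_world $no_digits $from_expr\n";
# prints "hello world 1 1 1"
```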

=head2 Multiplicative Operators

Binary "*" multiplies two numbers.

Binary "/" divides two numbers.

Binary "%" computes the modulus of two numbers.  Given integer
operands C<$a> and C<$b>: If C<$b> is positive, then C<$a % $b> is
C<$a> minus the largest multiple of C<$b> that is not greater than
C<$a>.  If C<$b> is negative, then C<$a % $b> is C<$a> minus the
smallest multiple of C<$b> that is not less than C<$a> (i.e. the
result will be less than or equal to zero).
Note that when C<use integer> is in scope, "%" gives you direct access
to the modulus operator as implemented by your C compiler.  This
operator is not as well defined for negative operands, but it will
execute faster.
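
The sign rules for "%" (without C<use integer>) can be sketched with a
few operand pairs chosen for illustration:

```perl
use strict;
use warnings;

# Each pair is [left operand, right operand].
my @cases   = ([7, 3], [-7, 3], [7, -3], [-7, -3]);
my @moduli  = map { $_->[0] % $_->[1] } @cases;

# The result takes the sign of the right operand.
print "@moduli\n";      # prints "1 2 -2 -1"
```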

Binary "x" is the repetition operator.  In scalar context or if the left
operand is not enclosed in parentheses, it returns a string consisting
of the left operand repeated the number of times specified by the right
operand.  In list context, if the left operand is enclosed in
parentheses, it repeats the list.  If the right operand is zero or
negative, it returns an empty string or an empty list, depending on the
context.

    print '-' x 80;             # print row of dashes

    print "\t" x ($tab/8), ' ' x ($tab%8);      # tab over

    @ones = (1) x 80;           # a list of 80 1's
    @ones = (5) x @ones;        # set all elements to 5

=head2 Additive Operators

Binary "+" returns the sum of two numbers.

Binary "-" returns the difference of two numbers.

Binary "." concatenates two strings.

=head2 Shift Operators

Binary "<<" returns the value of its left argument shifted left by the
number of bits specified by the right argument.  Arguments should be
integers.  (See also L<Integer Arithmetic>.)

Binary ">>" returns the value of its left argument shifted right by
the number of bits specified by the right argument.  Arguments should
be integers.  (See also L<Integer Arithmetic>.)

Note that both "<<" and ">>" in Perl are implemented directly using
"<<" and ">>" in C.  If C<use integer> (see L<Integer Arithmetic>) is
in force then signed C integers are used, else unsigned C integers are
used.  Either way, the implementation isn't going to generate results
larger than the size of the integer type Perl was built with (32 bits
or 64 bits).

The result of overflowing the range of the integers is undefined
because it is undefined also in C.  In other words, using 32-bit
integers, C<< 1 << 32 >> is undefined.  Shifting by a negative number
of bits is also undefined.

=head2 Named Unary Operators

The various named unary operators are treated as functions with one
argument, with optional parentheses.

If any list operator (print(), etc.) or any unary operator (chdir(), etc.)
is followed by a left parenthesis as the next token, the operator and
arguments within parentheses are taken to be of highest precedence,
just like a normal function call.  For example,
because named unary operators are higher precedence than ||:

    chdir $foo    || die;       # (chdir $foo) || die
    chdir($foo)   || die;       # (chdir $foo) || die
    chdir ($foo)  || die;       # (chdir $foo) || die
    chdir +($foo) || die;       # (chdir $foo) || die

but, because * is higher precedence than named operators:

    chdir $foo * 20;    # chdir ($foo * 20)
    chdir($foo) * 20;   # (chdir $foo) * 20
    chdir ($foo) * 20;  # (chdir $foo) * 20
    chdir +($foo) * 20; # chdir ($foo * 20)

    rand 10 * 20;       # rand (10 * 20)
    rand(10) * 20;      # (rand 10) * 20
    rand (10) * 20;     # (rand 10) * 20
    rand +(10) * 20;    # rand (10 * 20)

Regarding precedence, the filetest operators, like C<-f>, C<-M>, etc. are
treated like named unary operators, but they don't follow this functional
parenthesis rule.  That means, for example, that C<-f($file).".bak"> is
equivalent to C<-f "$file.bak">.

See also L<"Terms and List Operators (Leftward)">.

=head2 Relational Operators

Binary "<" returns true if the left argument is numerically less than
the right argument.

Binary ">" returns true if the left argument is numerically greater
than the right argument.

Binary "<=" returns true if the left argument is numerically less than
or equal to the right argument.

Binary ">=" returns true if the left argument is numerically greater
than or equal to the right argument.

Binary "lt" returns true if the left argument is stringwise less than
the right argument.

Binary "gt" returns true if the left argument is stringwise greater
than the right argument.

Binary "le" returns true if the left argument is stringwise less than
or equal to the right argument.

Binary "ge" returns true if the left argument is stringwise greater
than or equal to the right argument.

=head2 Equality Operators

Binary "==" returns true if the left argument is numerically equal to
the right argument.

Binary "!=" returns true if the left argument is numerically not equal
to the right argument.

Binary "<=>" returns -1, 0, or 1 depending on whether the left
argument is numerically less than, equal to, or greater than the right
argument.  If your platform supports NaNs (not-a-numbers) as numeric
values, using them with "<=>" returns undef.  NaN is not "<", "==", ">",
"<=" or ">=" anything (even NaN), so those 5 return false. NaN != NaN
returns true, as does NaN != anything else. If your platform doesn't
support NaNs then NaN is just a string with numeric value 0.

    perl -le '$a = NaN; print "No NaN support here" if $a == $a'
    perl -le '$a = NaN; print "NaN support here" if $a != $a'

Binary "eq" returns true if the left argument is stringwise equal to
the right argument.

Binary "ne" returns true if the left argument is stringwise not equal
to the right argument.

Binary "cmp" returns -1, 0, or 1 depending on whether the left
argument is stringwise less than, equal to, or greater than the right
argument.
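
The difference between the numeric and stringwise three-way
comparisons shows up most clearly in a sort block.  This is only a
sketch with made-up data, and it assumes C<use locale> is not in
effect:

```perl
use strict;
use warnings;

my @values = (10, 9, 2);

my @numeric    = sort { $a <=> $b } @values;   # compare as numbers
my @stringwise = sort { $a cmp $b } @values;   # compare as strings

print "@numeric\n";      # prints "2 9 10"
print "@stringwise\n";   # prints "10 2 9"
```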

"lt", "le", "ge", "gt" and "cmp" use the collation (sort) order specified
by the current locale if C<use locale> is in effect.  See L<perllocale>.

=head2 Bitwise And

Binary "&" returns its operands ANDed together bit by bit.
(See also L<Integer Arithmetic> and L<Bitwise String Operators>.)

Note that "&" has lower priority than the relational and equality
operators, so for example the parentheses are essential in a test like

    print "Even\n" if ($x & 1) == 0;
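
To see why the parentheses matter, compare the two forms below.  This
is a sketch only; C<$right> and C<$wrong> are illustrative names, and
the C<precedence> warning is switched off just to keep the
demonstration quiet.

```perl
use strict;
use warnings;

my $x = 2;      # an even number

# Intended test: mask first, then compare.
my $right = (($x & 1) == 0) ? 1 : 0;

# Without parentheses this is $x & (1 == 0), which is $x & 0.
my $wrong;
{
    no warnings 'precedence';
    $wrong = ($x & 1 == 0) ? 1 : 0;
}

print "$right $wrong\n";    # prints "1 0"
```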

=head2 Bitwise Or and Exclusive Or

Binary "|" returns its operands ORed together bit by bit.
(See also L<Integer Arithmetic> and L<Bitwise String Operators>.)

Binary "^" returns its operands XORed together bit by bit.
(See also L<Integer Arithmetic> and L<Bitwise String Operators>.)

Note that "|" and "^" have lower priority than the relational and
equality operators, so for example the parentheses are essential in a
test like

    print "false\n" if (8 | 2) != 10;

=head2 C-style Logical And

Binary "&&" performs a short-circuit logical AND operation.  That is,
if the left operand is false, the right operand is not even evaluated.
Scalar or list context propagates down to the right operand if it
is evaluated.

=head2 C-style Logical Or

Binary "||" performs a short-circuit logical OR operation.  That is,
if the left operand is true, the right operand is not even evaluated.
Scalar or list context propagates down to the right operand if it
is evaluated.

The C<||> and C<&&> operators return the last value evaluated
(unlike C's C<||> and C<&&>, which return 0 or 1). Thus, a reasonably
portable way to find out the home directory might be:

    $home = $ENV{'HOME'} || $ENV{'LOGDIR'} ||
        (getpwuid($<))[7] || die "You're homeless!\n";

In particular, this means that you shouldn't use this
for selecting between two aggregates for assignment:

    @a = @b || @c;              # this is wrong
    @a = scalar(@b) || @c;      # really meant this
    @a = @b ? @b : @c;          # this works fine, though
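
A self-contained sketch of both points follows.  The C<%settings>
hash, its keys and the paths are hypothetical stand-ins for C<%ENV>.

```perl
use strict;
use warnings;

# || returns the first true operand itself, not just 0 or 1.
my %settings = (LOGDIR => '/export/home/alice');
my $home = $settings{HOME} || $settings{LOGDIR} || '/tmp';

# With aggregates, the left operand of || is evaluated as a scalar.
my @full     = (4, 5, 6);
my @fallback = (7, 8);
my @wrong    = @full || @fallback;          # the element count, not the list
my @chosen   = @full ? @full : @fallback;   # the list itself

print "$home\n";                # prints "/export/home/alice"
print "@wrong | @chosen\n";     # prints "3 | 4 5 6"
```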

As more readable alternatives to C<&&> and C<||> when used for
control flow, Perl provides C<and> and C<or> operators (see below).
The short-circuit behavior is identical.  The precedence of "and" and
"or" is much lower, however, so that you can safely use them after a
list operator without the need for parentheses:

    unlink "alpha", "beta", "gamma"
            or gripe(), next LINE;

With the C-style operators that would have been written like this:

    unlink("alpha", "beta", "gamma")
            || (gripe(), next LINE);

Using "or" for assignment is unlikely to do what you want; see below.

=head2 Range Operators

Binary ".." is the range operator, which is really two different
operators depending on the context.  In list context, it returns a
list of values counting (up by ones) from the left value to the right
value.  If the left value is greater than the right value then it
returns the empty list.  The range operator is useful for writing
C<foreach (1..10)> loops and for doing slice operations on arrays. In
the current implementation, no temporary array is created when the
range operator is used as the expression in C<foreach> loops, but older
versions of Perl might burn a lot of memory when you write something
like this:

    for (1 .. 1_000_000) {
        # code
    }

The range operator also works on strings, using the magical
auto-increment; see below.

In scalar context, ".." returns a boolean value.  The operator is
bistable, like a flip-flop, and emulates the line-range (comma) operator
of B<sed>, B<awk>, and various editors.  Each ".." operator maintains its
own boolean state.  It is false as long as its left operand is false.
Once the left operand is true, the range operator stays true until the
right operand is true, I<AFTER> which the range operator becomes false
again.  It doesn't become false till the next time the range operator is
evaluated.  It can test the right operand and become false on the same
evaluation it became true (as in B<awk>), but it still returns true once.
If you don't want it to test the right operand till the next
evaluation, as in B<sed>, just use three dots ("...") instead of
two.  In all other regards, "..." behaves just like ".." does.

The right operand is not evaluated while the operator is in the
"false" state, and the left operand is not evaluated while the
operator is in the "true" state.  The precedence is a little lower
than || and &&.  The value returned is either the empty string for
false, or a sequence number (beginning with 1) for true.  The
sequence number is reset for each range encountered.  The final
sequence number in a range has the string "E0" appended to it, which
doesn't affect its numeric value, but gives you something to search
for if you want to exclude the endpoint.  You can exclude the
beginning point by waiting for the sequence number to be greater
than 1.
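
The flip-flop behavior can be sketched without reading any file by
looping over an in-memory list.  The markers C<START>, C<END> and
C<x> are arbitrary strings chosen for this example.

```perl
use strict;
use warnings;

# Sequence numbers, with "E0" appended on the final one.
my @tagged;
for (qw(a START b c END d)) {
    my $n = (/START/ .. /END/);     # scalar context: flip-flop
    push @tagged, "$n:$_" if $n;
}
print "@tagged\n";      # prints "1:START 2:b 3:c 4E0:END"

# ".." may end the range on the line that started it; "..." may not.
my (@two_dot, @three_dot);
for (qw(a x b x c)) { push @two_dot,   $_ if /x/ ..  /x/; }
for (qw(a x b x c)) { push @three_dot, $_ if /x/ ... /x/; }
print "@two_dot | @three_dot\n";    # prints "x x | x b x"
```

Testing C<$n !~ /E0/> would exclude the endpoint, and C<< $n > 1 >>
would exclude the starting point.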

If either operand of scalar ".." is a constant expression,
that operand is considered true if it is equal (C<==>) to the current
input line number (the C<$.> variable).

To be pedantic, the comparison is actually C<int(EXPR) == int(EXPR)>,
but that is only an issue if you use a floating point expression; when
implicitly using C<$.> as described in the previous paragraph, the
comparison is C<int(EXPR) == int($.)> which is only an issue when C<$.>
is set to a floating point value and you are not reading from a file.
Furthermore, C<"span" .. "spat"> or C<2.18 .. 3.14> will not do what
you want in scalar context because each of the operands is evaluated
using its integer representation.

Examples:

As a scalar operator:

    if (101 .. 200) { print; } # print 2nd hundred lines, short for
                               #   if ($. == 101 .. $. == 200) ...
    next line if (1 .. /^$/);  # skip header lines, short for
                               #   ... if ($. == 1 .. /^$/);
    s/^/> / if (/^$/ .. eof());        # quote body

    # parse mail messages
    while (<>) {
        $in_header =   1  .. /^$/;
        $in_body   = /^$/ .. eof;
        if ($in_header) {
            # ...
        } else { # in body
            # ...
        }
    } continue {
        close ARGV if eof;             # reset $. each file
    }

As a list operator:

    for (101 .. 200) { print; }         # print $_ 100 times
    @foo = @foo[0 .. $#foo];            # an expensive no-op
    @foo = @foo[$#foo-4 .. $#foo];      # slice last 5 items

The range operator (in list context) makes use of the magical
auto-increment algorithm if the operands are strings.  You
can say

    @alphabet = ('A' .. 'Z');

to get all normal letters of the English alphabet, or

    $hexdigit = (0 .. 9, 'a' .. 'f')[$num & 15];

to get a hexadecimal digit, or

    @z2 = ('01' .. '31');  print $z2[$mday];

to get dates with leading zeros.  If the final value specified is not
in the sequence that the magical increment would produce, the sequence
goes until the next value would be longer than the final value
specified.

Because each operand is evaluated in integer form, C<2.18 .. 3.14> will
return two elements in list context.

    @list = (2.18 .. 3.14); # same as @list = (2 .. 3);

=head2 Conditional Operator

Ternary "?:" is the conditional operator, just as in C.  It works much
like an if-then-else.  If the argument before the ? is true, the
argument before the : is returned, otherwise the argument after the :
is returned.  For example:

    printf "I have %d dog%s.\n", $n,
            ($n == 1) ? '' : "s";

Scalar or list context propagates downward into the 2nd
or 3rd argument, whichever is selected.

    $a = $ok ? $b : $c;  # get a scalar
    @a = $ok ? @b : @c;  # get an array
    $a = $ok ? @b : @c;  # oops, that's just a count!

The operator may be assigned to if both the 2nd and 3rd arguments are
legal lvalues (meaning that you can assign to them):

    ($a_or_b ? $a : $b) = $c;

Because this operator produces an assignable result, using assignments
without parentheses will get you in trouble.  For example, this:

    $a % 2 ? $a += 10 : $a += 2

Really means this:

    (($a % 2) ? ($a += 10) : $a) += 2

Rather than this:

    ($a % 2) ? ($a += 10) : ($a += 2)

That should probably be written more simply as:

    $a += ($a % 2) ? 10 : 2;

=head2 Assignment Operators

"=" is the ordinary assignment operator.

Assignment operators work as in C.  That is,

    $a += 2;

is equivalent to

    $a = $a + 2;

although without duplicating any side effects that dereferencing the lvalue
might trigger, such as from tie().  Other assignment operators work similarly.
The following are recognized:

    **=    +=    *=    &=    <<=    &&=
           -=    /=    |=    >>=    ||=
           .=    %=    ^=
                 x=

Although these are grouped by family, they all have the precedence
of assignment.

Unlike in C, the scalar assignment operator produces a valid lvalue.
Modifying an assignment is equivalent to doing the assignment and
then modifying the variable that was assigned to.  This is useful
for modifying a copy of something, like this:

    ($tmp = $global) =~ tr [A-Z] [a-z];

Likewise,

    ($a += 2) *= 3;

is equivalent to

    $a += 2;
    $a *= 3;

Similarly, a list assignment in list context produces the list of
lvalues assigned to, and a list assignment in scalar context returns
the number of elements produced by the expression on the right hand
side of the assignment.
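
The scalar-context rule for list assignment is what makes the common
counting idiom work.  A small sketch, with invented variable names and
sample data:

```perl
use strict;
use warnings;

# The count reflects the right-hand side, not the number of targets.
my $assigned = (my ($first, $second) = (5, 6, 7));

# Assigning to an empty list still counts the matches on the right.
my $string  = "a1 b22 c333";
my $matches = () = $string =~ /\d+/g;

print "$assigned $first $second $matches\n";    # prints "3 5 6 3"
```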

=head2 Comma Operator

Binary "," is the comma operator.  In scalar context it evaluates
its left argument, throws that value away, then evaluates its right
argument and returns that value.  This is just like C's comma operator.

In list context, it's just the list argument separator, and inserts
both its arguments into the list.

The C<< => >> operator is a synonym for the comma, but forces any word
to its left to be interpreted as a string (as of 5.001). It is helpful
in documenting the correspondence between keys and values in hashes,
and other paired elements in lists.

=head2 List Operators (Rightward)

The right side of a list operator has very low precedence,
such that it controls all comma-separated expressions found there.
The only operators with lower precedence are the logical operators
"and", "or", and "not", which may be used to evaluate calls to list
operators without the need for extra parentheses:

    open HANDLE, "filename"
        or die "Can't open: $!\n";

See also discussion of list operators in L<Terms and List Operators (Leftward)>.

=head2 Logical Not

Unary "not" returns the logical negation of the expression to its right.
It's the equivalent of "!" except for the very low precedence.

=head2 Logical And

Binary "and" returns the logical conjunction of the two surrounding
expressions.  It's equivalent to && except for the very low
precedence.  This means that it short-circuits: i.e., the right
expression is evaluated only if the left expression is true.

=head2 Logical Or and Exclusive Or

Binary "or" returns the logical disjunction of the two surrounding
expressions.  It's equivalent to || except for the very low precedence.
This makes it useful for control flow:

    print FH $data              or die "Can't write to FH: $!";

This means that it short-circuits: i.e., the right expression is evaluated
only if the left expression is false.  Due to its precedence, you should
probably avoid using this for assignment, only for control flow.

    $a = $b or $c;              # bug: this is wrong
    ($a = $b) or $c;            # really means this
    $a = $b || $c;              # better written this way

However, when it's a list-context assignment and you're trying to use
"||" for control flow, you probably need "or" so that the assignment
takes higher precedence.

    @info = stat($file) || die;     # oops, scalar sense of stat!
    @info = stat($file) or die;     # better, now @info gets its due

Then again, you could always use parentheses.

Binary "xor" returns the exclusive-OR of the two surrounding expressions.
It cannot short circuit, of course.

=head2 C Operators Missing From Perl

Here is what C has that Perl doesn't:

=over 8

=item unary &

Address-of operator.  (But see the "\" operator for taking a reference.)

=item unary *

Dereference-address operator. (Perl's prefix dereferencing
operators are typed: $, @, %, and &.)

=item (TYPE)

Type-casting operator.

=back

=head2 Quote and Quote-like Operators

While we usually think of quotes as literal values, in Perl they
function as operators, providing various kinds of interpolating and
pattern matching capabilities.  Perl provides customary quote characters
for these behaviors, but also provides a way for you to choose your
quote character for any of them.  In the following table, a C<{}> represents
any pair of delimiters you choose.

    Customary  Generic        Meaning        Interpolates
        ''       q{}          Literal             no
        ""      qq{}          Literal             yes
        ``      qx{}          Command             yes*
                qw{}         Word list            no
        //       m{}       Pattern match          yes*
                qr{}          Pattern             yes*
                 s{}{}      Substitution          yes*
                tr{}{}    Transliteration         no (but see below)
        <<EOF                 here-doc            yes*

        * unless the delimiter is ''.

Non-bracketing delimiters use the same character fore and aft, but the four
sorts of brackets (round, angle, square, curly) will all nest, which means
that

    q{foo{bar}baz}

is the same as

    'foo{bar}baz'

Note, however, that this does not always work for quoting Perl code:

    $s = q{ if($a eq "}") ... }; # WRONG

is a syntax error. The C<Text::Balanced> module (from CPAN, and
starting from Perl 5.8 part of the standard distribution) is able
to do this properly.
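
As a quick sketch of the delimiter rules (the strings and variable
names are arbitrary), nested bracketing delimiters and a
non-bracketing delimiter behave like this:

```perl
use strict;
use warnings;

my $curly = q{foo{bar}baz};     # curly brackets nest
my $angle = q<a<b>c>;           # so do angle brackets
my $bang  = q!it's!;            # same character fore and aft

print "$curly $angle $bang\n";  # prints "foo{bar}baz a<b>c it's"
```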

There can be whitespace between the operator and the quoting
characters, except when C<#> is being used as the quoting character.
C<q#foo#> is parsed as the string C<foo>, while C<q #foo#> is the
operator C<q> followed by a comment.  Its argument will be taken
from the next line.  This allows you to write:

    s {foo}  # Replace foo
      {bar}  # with bar.

The following escape sequences are available in constructs that interpolate
and in transliterations.

    \t          tab             (HT, TAB)
    \n          newline         (NL)
    \r          return          (CR)
    \f          form feed       (FF)
    \b          backspace       (BS)
    \a          alarm (bell)    (BEL)
    \e          escape          (ESC)
    \033        octal char      (ESC)
    \x1b        hex char        (ESC)
    \x{263a}    wide hex char   (SMILEY)
    \c[         control char    (ESC)
    \N{name}    named Unicode character

B<NOTE>: Unlike C and other languages, Perl has no \v escape sequence for
the vertical tab (VT - ASCII 11).

The following escape sequences are available in constructs that interpolate
but not in transliterations.

    \l          lowercase next char
    \u          uppercase next char
    \L          lowercase till \E
    \U          uppercase till \E
    \E          end case modification
    \Q          quote non-word characters till \E

If C<use locale> is in effect, the case map used by C<\l>, C<\L>,
C<\u> and C<\U> is taken from the current locale.  See L<perllocale>.
If Unicode (for example, C<\N{}> or wide hex characters of 0x100 or
beyond) is being used, the case map used by C<\l>, C<\L>, C<\u> and
C<\U> is as defined by Unicode.  For documentation of C<\N{name}>,
see L<charnames>.
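
A brief sketch of several of these escapes in double-quoted strings,
assuming plain ASCII data and no C<use locale>; the variable names are
illustrative:

```perl
use strict;
use warnings;

# Four spellings of the same ESC character.
my $same_esc = ("\e" eq "\033" && "\033" eq "\x1b" && "\x1b" eq "\c[") ? 1 : 0;

# Case-modifying escapes.
my $cased = "\LABC\E-\Uabc\E-\uabc-\lABC";

# \Q backslashes the non-word characters of an interpolated value.
my $name   = 'a.b*c';
my $quoted = "\Q$name\E";

print "$same_esc $cased $quoted\n";     # prints "1 abc-ABC-Abc-aBC a\.b\*c"
```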
810
811All systems use the virtual C<"\n"> to represent a line terminator,
812called a "newline".  There is no such thing as an unvarying, physical
813newline character.  It is only an illusion that the operating system,
814device drivers, C libraries, and Perl all conspire to preserve.  Not all
815systems read C<"\r"> as ASCII CR and C<"\n"> as ASCII LF.  For example,
on a Mac, these are reversed, and on systems without a line terminator,
817printing C<"\n"> may emit no actual data.  In general, use C<"\n"> when
818you mean a "newline" for your system, but use the literal ASCII when you
819need an exact character.  For example, most networking protocols expect
820and prefer a CR+LF (C<"\015\012"> or C<"\cM\cJ">) for line terminators,
821and although they often accept just C<"\012">, they seldom tolerate just
822C<"\015">.  If you get in the habit of using C<"\n"> for networking,
823you may be burned some day.
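
One way to follow this advice is to keep the terminator in a variable spelled out in octal.  The sketch below only builds a string; the request text and the host name C<www.example.com> are placeholders chosen for the example.

```perl
use strict;
use warnings;

# Spell the line terminator out in octal; "\r\n" is not guaranteed to be
# CR+LF on every platform.
my $CRLF = "\015\012";

# www.example.com is a placeholder host, not something from the text above.
my $request = join $CRLF,
    "GET / HTTP/1.0",
    "Host: www.example.com",
    "", "";                           # blank line ends the header block

printf "%d bytes, ends with CR+LF CR+LF: %s\n",
    length($request),
    ($request =~ /\015\012\015\012\z/ ? "yes" : "no");
```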
824
825For constructs that do interpolate, variables beginning with "C<$>"
826or "C<@>" are interpolated.  Subscripted variables such as C<$a[3]> or
827C<< $href->{key}[0] >> are also interpolated, as are array and hash slices.
828But method calls such as C<< $obj->meth >> are not.
829
830Interpolating an array or slice interpolates the elements in order,
831separated by the value of C<$">, so is equivalent to interpolating
832C<join $", @array>.    "Punctuation" arrays such as C<@+> are only
833interpolated if the name is enclosed in braces C<@{+}>.
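
The rules above can be sketched as follows.  The C<Counter> class is hypothetical and exists only to show that a method call needs a workaround; the C<@{[ ... ]}> construct is a common idiom for that, not something the text above prescribes.

```perl
use strict;
use warnings;

my @list = (1, 2, 3);
my $href = { key => [10, 20] };

my $plain = "@list";                  # elements joined with $" (a space)
my $elem  = "$href->{key}[1]";        # subscripts are interpolated

my $csv;
{
    local $" = ",";                   # change the separator temporarily
    $csv = "@list[0,1]";              # slices are interpolated too
}

# A method call is not interpolated; @{[ ... ]} is a common workaround.
package Counter;                      # hypothetical class for the demo
sub new   { return bless { n => 5 }, shift }
sub count { return $_[0]{n} }

package main;
my $obj  = Counter->new;
my $call = "count is @{[ $obj->count ]}";

print "$plain | $elem | $csv | $call\n";   # 1 2 3 | 20 | 1,2 | count is 5
```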
834
835You cannot include a literal C<$> or C<@> within a C<\Q> sequence.
836An unescaped C<$> or C<@> interpolates the corresponding variable,
837while escaping will cause the literal string C<\$> to be inserted.
838You'll need to write something like C<m/\Quser\E\@\Qhost/>.
839
840Patterns are subject to an additional level of interpretation as a
841regular expression.  This is done as a second pass, after variables are
842interpolated, so that regular expressions may be incorporated into the
843pattern from the variables.  If this is not what you want, use C<\Q> to
844interpolate a variable literally.
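
A minimal sketch of the difference, using made-up data: the same variable is interpolated once as a regular expression and once literally with C<\Q>.

```perl
use strict;
use warnings;

my $var = 'a.c';

my $loose  = ('abc' =~ /^$var\z/)     ? 1 : 0;   # "." acts as a wildcard
my $strict = ('abc' =~ /^\Q$var\E\z/) ? 1 : 0;   # "." must be a literal dot
my $exact  = ('a.c' =~ /^\Q$var\E\z/) ? 1 : 0;

# A literal @ has to sit outside the \Q...\E region.
my $mail   = ('user@host' =~ m/\Quser\E\@\Qhost/) ? 1 : 0;

print "loose=$loose strict=$strict exact=$exact mail=$mail\n";
# loose=1 strict=0 exact=1 mail=1
```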
845
846Apart from the behavior described above, Perl does not expand
847multiple levels of interpolation.  In particular, contrary to the
848expectations of shell programmers, back-quotes do I<NOT> interpolate
849within double quotes, nor do single quotes impede evaluation of
850variables when used within double quotes.
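
For shell programmers, a two-line sketch of both points (the strings are arbitrary):

```perl
use strict;
use warnings;

my $who = "world";

my $greeting = "hello '$who'";        # single quotes do not protect $who
my $no_exec  = "today is `date`";     # backticks are just characters here

print "$greeting\n";                  # hello 'world'
print "$no_exec\n";                   # today is `date`
```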
851
852=head2 Regexp Quote-Like Operators
853
854Here are the quote-like operators that apply to pattern
855matching and related activities.
856
857=over 8
858
859=item ?PATTERN?
860
861This is just like the C</pattern/> search, except that it matches only
862once between calls to the reset() operator.  This is a useful
863optimization when you want to see only the first occurrence of
864something in each file of a set of files, for instance.  Only C<??>
865patterns local to the current package are reset.
866
867    while (<>) {
868	if (?^$?) {
869			    # blank line between header and body
870	}
871    } continue {
872	reset if eof;	    # clear ?? status for next file
873    }
874
875This usage is vaguely deprecated, which means it just might possibly
876be removed in some distant future version of Perl, perhaps somewhere
877around the year 2168.
878
879=item m/PATTERN/cgimosx
880
881=item /PATTERN/cgimosx
882
883Searches a string for a pattern match, and in scalar context returns
884true if it succeeds, false if it fails.  If no string is specified
885via the C<=~> or C<!~> operator, the $_ string is searched.  (The
886string specified with C<=~> need not be an lvalue--it may be the
887result of an expression evaluation, but remember the C<=~> binds
888rather tightly.)  See also L<perlre>.  See L<perllocale> for
889discussion of additional considerations that apply when C<use locale>
890is in effect.
891
892Options are:
893
894    c	Do not reset search position on a failed match when /g is in effect.
895    g	Match globally, i.e., find all occurrences.
896    i	Do case-insensitive pattern matching.
897    m	Treat string as multiple lines.
898    o	Compile pattern only once.
899    s	Treat string as single line.
900    x	Use extended regular expressions.
901
902If "/" is the delimiter then the initial C<m> is optional.  With the C<m>
903you can use any pair of non-alphanumeric, non-whitespace characters
904as delimiters.  This is particularly useful for matching path names
905that contain "/", to avoid LTS (leaning toothpick syndrome).  If "?" is
906the delimiter, then the match-only-once rule of C<?PATTERN?> applies.
907If "'" is the delimiter, no interpolation is performed on the PATTERN.
908
909PATTERN may contain variables, which will be interpolated (and the
910pattern recompiled) every time the pattern search is evaluated, except
911for when the delimiter is a single quote.  (Note that C<$(>, C<$)>, and
912C<$|> are not interpolated because they look like end-of-string tests.)
913If you want such a pattern to be compiled only once, add a C</o> after
914the trailing delimiter.  This avoids expensive run-time recompilations,
915and is useful when the value you are interpolating won't change over
916the life of the script.  However, mentioning C</o> constitutes a promise
917that you won't change the variables in the pattern.  If you change them,
918Perl won't even notice.  See also L<"qr/STRING/imosx">.
919
920If the PATTERN evaluates to the empty string, the last
921I<successfully> matched regular expression is used instead. In this
case, only the C<g> and C<c> flags on the empty pattern are honoured -
923the other flags are taken from the original pattern. If no match has
924previously succeeded, this will (silently) act instead as a genuine
925empty pattern (which will always match).
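
The following sketch shows why an empty pattern can surprise you.  It assumes all three matches run in the same scope; the strings are arbitrary.

```perl
use strict;
use warnings;

my $first = ("foo bar" =~ /bar/) ? 1 : 0;   # succeeds; /bar/ is remembered
my $reuse = ("xbarx"   =~ m//)   ? 1 : 0;   # empty pattern reuses /bar/
my $miss  = ("xyz"     =~ m//)   ? 1 : 0;   # still /bar/, so no match

print "first=$first reuse=$reuse miss=$miss\n";   # first=1 reuse=1 miss=0
```

The same thing happens when an interpolated variable turns out to be empty, which is the usual way people run into this.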
926
927If the C</g> option is not used, C<m//> in list context returns a
928list consisting of the subexpressions matched by the parentheses in the
929pattern, i.e., (C<$1>, C<$2>, C<$3>...).  (Note that here C<$1> etc. are
930also set, and that this differs from Perl 4's behavior.)  When there are
931no parentheses in the pattern, the return value is the list C<(1)> for
932success.  With or without parentheses, an empty list is returned upon
933failure.
934
935Examples:
936
    open(TTY, '/dev/tty') or die "can't open /dev/tty: $!";
938    <TTY> =~ /^y/i && foo();	# do foo if desired
939
940    if (/Version: *([0-9.]*)/) { $version = $1; }
941
942    next if m#^/usr/spool/uucp#;
943
944    # poor man's grep
945    $arg = shift;
946    while (<>) {
947	print if /$arg/o;	# compile only once
948    }
949
950    if (($F1, $F2, $Etc) = ($foo =~ /^(\S+)\s+(\S+)\s*(.*)/))
951
952This last example splits $foo into the first two words and the
953remainder of the line, and assigns those three fields to $F1, $F2, and
954$Etc.  The conditional is true if any variables were assigned, i.e., if
955the pattern matched.
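
The three possible list-context return values described above can be seen side by side in this sketch (the input strings are arbitrary):

```perl
use strict;
use warnings;

my ($major, $minor) = ("Version: 5.8" =~ /(\d+)\.(\d+)/);   # captures
my @no_parens = ("abc" =~ /b/);     # no parentheses: the list (1)
my @failed    = ("abc" =~ /x/);     # no match: the empty list

printf "%s.%s, %d element, %d elements\n",
    $major, $minor, scalar(@no_parens), scalar(@failed);
# 5.8, 1 element, 0 elements
```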
956
957The C</g> modifier specifies global pattern matching--that is,
958matching as many times as possible within the string.  How it behaves
959depends on the context.  In list context, it returns a list of the
960substrings matched by any capturing parentheses in the regular
961expression.  If there are no parentheses, it returns a list of all
962the matched strings, as if there were parentheses around the whole
963pattern.
964
965In scalar context, each execution of C<m//g> finds the next match,
966returning true if it matches, and false if there is no further match.
967The position after the last match can be read or set using the pos()
968function; see L<perlfunc/pos>.   A failed match normally resets the
969search position to the beginning of the string, but you can avoid that
970by adding the C</c> modifier (e.g. C<m//gc>).  Modifying the target
971string also resets the search position.
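
A compact sketch of both contexts, with arbitrary data.  It also shows what C</c> changes about C<pos()> after a failed match.

```perl
use strict;
use warnings;

# List context: all matches (or all captures) at once.
my @nums  = ("10 apples, 20 pears" =~ /\d+/g);
my %pairs = ("a=1 b=2" =~ /(\w+)=(\d+)/g);

# Scalar context: one match per execution, tracked by pos().
my $str = "aXbXc";
my @ends;
while ($str =~ /X/g) {
    push @ends, pos($str);          # offset just past each match
}
my $after_loop = pos($str);         # undef: the failed match reset it

$str =~ /X/gc;                      # matches from the start again; pos is 2
$str =~ /Z/gc;                      # fails, but /c keeps pos where it was
my $kept = pos($str);

print "@nums | @ends | $kept\n";    # 10 20 | 2 4 | 2
```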
972
973You can intermix C<m//g> matches with C<m/\G.../g>, where C<\G> is a
974zero-width assertion that matches the exact position where the previous
975C<m//g>, if any, left off.  Without the C</g> modifier, the C<\G> assertion
976still anchors at pos(), but the match is of course only attempted once.
977Using C<\G> without C</g> on a target string that has not previously had a
978C</g> match applied to it is the same as using the C<\A> assertion to match
979the beginning of the string.  Note also that, currently, C<\G> is only
980properly supported when anchored at the very beginning of the pattern.
981
982Examples:
983
984    # list context
985    ($one,$five,$fifteen) = (`uptime` =~ /(\d+\.\d+)/g);
986
987    # scalar context
988    $/ = "";
989    while (defined($paragraph = <>)) {
990	while ($paragraph =~ /[a-z]['")]*[.!?]+['")]*\s/g) {
991	    $sentences++;
992	}
993    }
994    print "$sentences\n";
995
996    # using m//gc with \G
997    $_ = "ppooqppqq";
998    while ($i++ < 2) {
999        print "1: '";
1000        print $1 while /(o)/gc; print "', pos=", pos, "\n";
1001        print "2: '";
1002        print $1 if /\G(q)/gc;  print "', pos=", pos, "\n";
1003        print "3: '";
1004        print $1 while /(p)/gc; print "', pos=", pos, "\n";
1005    }
1006    print "Final: '$1', pos=",pos,"\n" if /\G(.)/;
1007
1008The last example should print:
1009
1010    1: 'oo', pos=4
1011    2: 'q', pos=5
1012    3: 'pp', pos=7
1013    1: '', pos=7
1014    2: 'q', pos=8
1015    3: '', pos=8
1016    Final: 'q', pos=8
1017
1018Notice that the final match matched C<q> instead of C<p>, which a match
1019without the C<\G> anchor would have done. Also note that the final match
1020did not update C<pos> -- C<pos> is only updated on a C</g> match. If the
1021final match did indeed match C<p>, it's a good bet that you're running an
1022older (pre-5.6.0) Perl.
1023
1024A useful idiom for C<lex>-like scanners is C</\G.../gc>.  You can
1025combine several regexps like this to process a string part-by-part,
1026doing different actions depending on which regexp matched.  Each
1027regexp tries to match where the previous one leaves off.
1028
1029 $_ = <<'EOL';
1030      $url = new URI::URL "http://www/";   die if $url eq "xXx";
1031 EOL
1032 LOOP:
1033    {
1034      print(" digits"),		redo LOOP if /\G\d+\b[,.;]?\s*/gc;
1035      print(" lowercase"),	redo LOOP if /\G[a-z]+\b[,.;]?\s*/gc;
1036      print(" UPPERCASE"),	redo LOOP if /\G[A-Z]+\b[,.;]?\s*/gc;
1037      print(" Capitalized"),	redo LOOP if /\G[A-Z][a-z]+\b[,.;]?\s*/gc;
1038      print(" MiXeD"),		redo LOOP if /\G[A-Za-z]+\b[,.;]?\s*/gc;
1039      print(" alphanumeric"),	redo LOOP if /\G[A-Za-z0-9]+\b[,.;]?\s*/gc;
1040      print(" line-noise"),	redo LOOP if /\G[^A-Za-z0-9]+/gc;
1041      print ". That's all!\n";
1042    }
1043
1044Here is the output (split into several lines):
1045
1046 line-noise lowercase line-noise lowercase UPPERCASE line-noise
1047 UPPERCASE line-noise lowercase line-noise lowercase line-noise
1048 lowercase lowercase line-noise lowercase lowercase line-noise
1049 MiXeD line-noise. That's all!
1050
1051=item q/STRING/
1052
1053=item C<'STRING'>
1054
1055A single-quoted, literal string.  A backslash represents a backslash
1056unless followed by the delimiter or another backslash, in which case
1057the delimiter or backslash is interpolated.
1058
1059    $foo = q!I said, "You said, 'She said it.'"!;
1060    $bar = q('This is it.');
1061    $baz = '\n';		# a two-character string
1062
1063=item qq/STRING/
1064
1065=item "STRING"
1066
1067A double-quoted, interpolated string.
1068
1069    $_ .= qq
1070     (*** The previous line contains the naughty word "$1".\n)
1071		if /\b(tcl|java|python)\b/i;      # :-)
1072    $baz = "\n";		# a one-character string
1073
1074=item qr/STRING/imosx
1075
1076This operator quotes (and possibly compiles) its I<STRING> as a regular
1077expression.  I<STRING> is interpolated the same way as I<PATTERN>
1078in C<m/PATTERN/>.  If "'" is used as the delimiter, no interpolation
1079is done.  Returns a Perl value which may be used instead of the
1080corresponding C</STRING/imosx> expression.
1081
1082For example,
1083
1084    $rex = qr/my.STRING/is;
1085    s/$rex/foo/;
1086
1087is equivalent to
1088
1089    s/my.STRING/foo/is;
1090
1091The result may be used as a subpattern in a match:
1092
1093    $re = qr/$pattern/;
1094    $string =~ /foo${re}bar/;	# can be interpolated in other patterns
1095    $string =~ $re;		# or used standalone
1096    $string =~ /$re/;		# or this way
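
One detail worth a sketch: a C<qr//> value carries its own flags with it when it is interpolated into a larger pattern.  The strings below are arbitrary.

```perl
use strict;
use warnings;

my $word = qr/perl/i;                        # case-insensitive on its own

my $alone = ("PERL" =~ $word)               ? 1 : 0;
my $mixed = ("I like PERL" =~ /like $word/) ? 1 : 0;   # /i travels with $word
my $outer = ("I LIKE perl" =~ /like $word/) ? 1 : 0;   # "like" stays case-sensitive

print "alone=$alone mixed=$mixed outer=$outer\n";      # alone=1 mixed=1 outer=0
```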
1097
Since Perl may compile the pattern at the moment of execution of the qr()
operator, using qr() may have speed advantages in some situations,
1100notably if the result of qr() is used standalone:
1101
1102    sub match {
1103	my $patterns = shift;
1104	my @compiled = map qr/$_/i, @$patterns;
1105	grep {
1106	    my $success = 0;
1107	    foreach my $pat (@compiled) {
1108		$success = 1, last if /$pat/;
1109	    }
1110	    $success;
1111	} @_;
1112    }
1113
1114Precompilation of the pattern into an internal representation at
1115the moment of qr() avoids a need to recompile the pattern every
1116time a match C</$pat/> is attempted.  (Perl has many other internal
optimizations, but none would be triggered in the above example if
we did not use the qr() operator.)
1119
1120Options are:
1121
1122    i	Do case-insensitive pattern matching.
1123    m	Treat string as multiple lines.
1124    o	Compile pattern only once.
1125    s	Treat string as single line.
1126    x	Use extended regular expressions.
1127
1128See L<perlre> for additional information on valid syntax for STRING, and
1129for a detailed look at the semantics of regular expressions.
1130
1131=item qx/STRING/
1132
1133=item `STRING`
1134
1135A string which is (possibly) interpolated and then executed as a
1136system command with C</bin/sh> or its equivalent.  Shell wildcards,
1137pipes, and redirections will be honored.  The collected standard
1138output of the command is returned; standard error is unaffected.  In
1139scalar context, it comes back as a single (potentially multi-line)
1140string, or undef if the command failed.  In list context, returns a
1141list of lines (however you've defined lines with $/ or
1142$INPUT_RECORD_SEPARATOR), or an empty list if the command failed.
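
A sketch of the two contexts.  It runs the current perl binary (C<$^X>) as the external command purely so that the example does not depend on any particular program being installed; that choice, and the assumption that the path in C<$^X> contains no spaces, are mine.

```perl
use strict;
use warnings;

# $^X is the running perl binary, used here only as a convenient command.
my $scalar = `$^X -e "print 6*7"`;           # whole output as one string
my @lines  = `$^X -le "print for 1..3"`;     # one element per line
my $status = $? >> 8;                        # exit status of the last command

chomp @lines;
print "scalar: $scalar\n";
print "lines:  @lines\n";
print "exit:   $status\n";
```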
1143
1144Because backticks do not affect standard error, use shell file descriptor
1145syntax (assuming the shell supports this) if you care to address this.
1146To capture a command's STDERR and STDOUT together:
1147
1148    $output = `cmd 2>&1`;
1149
1150To capture a command's STDOUT but discard its STDERR:
1151
1152    $output = `cmd 2>/dev/null`;
1153
1154To capture a command's STDERR but discard its STDOUT (ordering is
1155important here):
1156
1157    $output = `cmd 2>&1 1>/dev/null`;
1158
1159To exchange a command's STDOUT and STDERR in order to capture the STDERR
1160but leave its STDOUT to come out the old STDERR:
1161
1162    $output = `cmd 3>&1 1>&2 2>&3 3>&-`;
1163
1164To read both a command's STDOUT and its STDERR separately, it's easiest
1165to redirect them separately to files, and then read from those files
1166when the program is done:
1167
1168    system("program args 1>program.stdout 2>program.stderr");
1169
1170Using single-quote as a delimiter protects the command from Perl's
1171double-quote interpolation, passing it on to the shell instead:
1172
1173    $perl_info  = qx(ps $$);            # that's Perl's $$
1174    $shell_info = qx'ps $$';            # that's the new shell's $$
1175
1176How that string gets evaluated is entirely subject to the command
1177interpreter on your system.  On most platforms, you will have to protect
shell metacharacters if you want them treated literally.  This is in
practice difficult to do, as it's unclear which characters need escaping
and how to escape them.
1180See L<perlsec> for a clean and safe example of a manual fork() and exec()
1181to emulate backticks safely.
1182
1183On some platforms (notably DOS-like ones), the shell may not be
1184capable of dealing with multiline commands, so putting newlines in
1185the string may not get you what you want.  You may be able to evaluate
1186multiple commands in a single line by separating them with the command
1187separator character, if your shell supports that (e.g. C<;> on many Unix
1188shells; C<&> on the Windows NT C<cmd> shell).
1189
1190Beginning with v5.6.0, Perl will attempt to flush all files opened for
1191output before starting the child process, but this may not be supported
1192on some platforms (see L<perlport>).  To be safe, you may need to set
1193C<$|> ($AUTOFLUSH in English) or call the C<autoflush()> method of
1194C<IO::Handle> on any open handles.
1195
1196Beware that some command shells may place restrictions on the length
1197of the command line.  You must ensure your strings don't exceed this
1198limit after any necessary interpolations.  See the platform-specific
1199release notes for more details about your particular environment.
1200
1201Using this operator can lead to programs that are difficult to port,
1202because the shell commands called vary between systems, and may in
1203fact not be present at all.  As one example, the C<type> command under
1204the POSIX shell is very different from the C<type> command under DOS.
1205That doesn't mean you should go out of your way to avoid backticks
1206when they're the right way to get something done.  Perl was made to be
1207a glue language, and one of the things it glues together is commands.
1208Just understand what you're getting yourself into.
1209
1210See L<"I/O Operators"> for more discussion.
1211
1212=item qw/STRING/
1213
1214Evaluates to a list of the words extracted out of STRING, using embedded
1215whitespace as the word delimiters.  It can be understood as being roughly
1216equivalent to:
1217
1218    split(' ', q/STRING/);
1219
1220the differences being that it generates a real list at compile time, and
1221in scalar context it returns the last element in the list.  So
1222this expression:
1223
1224    qw(foo bar baz)
1225
1226is semantically equivalent to the list:
1227
1228    'foo', 'bar', 'baz'
1229
1230Some frequently seen examples:
1231
    use POSIX qw( setlocale localeconv );
1233    @EXPORT = qw( foo bar baz );
1234
A common mistake is to try to separate the words with commas or to
put comments into a multi-line C<qw>-string.  For this reason, the
C<use warnings> pragma and the B<-w> switch (that is, the C<$^W> variable)
produce warnings if the STRING contains the "," or the "#" character.
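
A brief sketch of both points: the rough equivalence with C<split>, and what the comma mistake actually produces.  The warning is switched off only so the demonstration runs quietly.

```perl
use strict;
use warnings;

my @words = qw(foo bar baz);
my @split = split(' ', q/foo bar baz/);
my $same  = ("@words" eq "@split") ? 1 : 0;

# The comma mistake: the comma becomes part of the word.
my @oops = do { no warnings 'qw'; qw(a, b) };

print "same=$same, first oops word is '$oops[0]'\n";
# same=1, first oops word is 'a,'
```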
1239
1240=item s/PATTERN/REPLACEMENT/egimosx
1241
1242Searches a string for a pattern, and if found, replaces that pattern
1243with the replacement text and returns the number of substitutions
1244made.  Otherwise it returns false (specifically, the empty string).
1245
1246If no string is specified via the C<=~> or C<!~> operator, the C<$_>
variable is searched and modified.  (The string specified with C<=~> must
be a scalar variable, an array element, a hash element, or an assignment
1249to one of those, i.e., an lvalue.)
1250
1251If the delimiter chosen is a single quote, no interpolation is
1252done on either the PATTERN or the REPLACEMENT.  Otherwise, if the
1253PATTERN contains a $ that looks like a variable rather than an
1254end-of-string test, the variable will be interpolated into the pattern
1255at run-time.  If you want the pattern compiled only once the first time
1256the variable is interpolated, use the C</o> option.  If the pattern
1257evaluates to the empty string, the last successfully executed regular
1258expression is used instead.  See L<perlre> for further explanation on these.
1259See L<perllocale> for discussion of additional considerations that apply
1260when C<use locale> is in effect.
1261
1262Options are:
1263
1264    e	Evaluate the right side as an expression.
1265    g	Replace globally, i.e., all occurrences.
1266    i	Do case-insensitive pattern matching.
1267    m	Treat string as multiple lines.
1268    o	Compile pattern only once.
1269    s	Treat string as single line.
1270    x	Use extended regular expressions.
1271
1272Any non-alphanumeric, non-whitespace delimiter may replace the
1273slashes.  If single quotes are used, no interpretation is done on the
1274replacement string (the C</e> modifier overrides this, however).  Unlike
1275Perl 4, Perl 5 treats backticks as normal delimiters; the replacement
1276text is not evaluated as a command.  If the
1277PATTERN is delimited by bracketing quotes, the REPLACEMENT has its own
1278pair of quotes, which may or may not be bracketing quotes, e.g.,
1279C<s(foo)(bar)> or C<< s<foo>/bar/ >>.  A C</e> will cause the
1280replacement portion to be treated as a full-fledged Perl expression
1281and evaluated right then and there.  It is, however, syntax checked at
1282compile-time. A second C<e> modifier will cause the replacement portion
1283to be C<eval>ed before being run as a Perl expression.
1284
1285Examples:
1286
1287    s/\bgreen\b/mauve/g;		# don't change wintergreen
1288
1289    $path =~ s|/usr/bin|/usr/local/bin|;
1290
1291    s/Login: $foo/Login: $bar/; # run-time pattern
1292
1293    ($foo = $bar) =~ s/this/that/;	# copy first, then change
1294
1295    $count = ($paragraph =~ s/Mister\b/Mr./g);  # get change-count
1296
1297    $_ = 'abc123xyz';
1298    s/\d+/$&*2/e;		# yields 'abc246xyz'
1299    s/\d+/sprintf("%5d",$&)/e;	# yields 'abc  246xyz'
1300    s/\w/$& x 2/eg;		# yields 'aabbcc  224466xxyyzz'
1301
1302    s/%(.)/$percent{$1}/g;	# change percent escapes; no /e
1303    s/%(.)/$percent{$1} || $&/ge;	# expr now, so /e
1304    s/^=(\w+)/&pod($1)/ge;	# use function call
1305
1306    # expand variables in $_, but dynamics only, using
1307    # symbolic dereferencing
1308    s/\$(\w+)/${$1}/g;
1309
1310    # Add one to the value of any numbers in the string
1311    s/(\d+)/1 + $1/eg;
1312
1313    # This will expand any embedded scalar variable
1314    # (including lexicals) in $_ : First $1 is interpolated
1315    # to the variable name, and then evaluated
1316    s/(\$\w+)/$1/eeg;
1317
1318    # Delete (most) C comments.
1319    $program =~ s {
1320	/\*	# Match the opening delimiter.
1321	.*?	# Match a minimal number of characters.
1322	\*/	# Match the closing delimiter.
1323    } []gsx;
1324
1325    s/^\s*(.*?)\s*$/$1/;	# trim white space in $_, expensively
1326
1327    for ($variable) {		# trim white space in $variable, cheap
1328	s/^\s+//;
1329	s/\s+$//;
1330    }
1331
1332    s/([^ ]*) *([^ ]*)/$2 $1/;	# reverse 1st two fields
1333
Note the use of $ instead of \ in the last example.  Unlike
B<sed>, we use the \<I<digit>> form only on the left-hand side.
Anywhere else it's $<I<digit>>.
1337
1338Occasionally, you can't use just a C</g> to get all the changes
1339to occur that you might want.  Here are two common cases:
1340
1341    # put commas in the right places in an integer
1342    1 while s/(\d)(\d\d\d)(?!\d)/$1,$2/g;
1343
1344    # expand tabs to 8-column spacing
1345    1 while s/\t+/' ' x (length($&)*8 - length($`)%8)/e;
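
To see why the loop is needed, here are the same two substitutions applied to named variables with sample input (the values are arbitrary).  Each pass can only insert the rightmost missing comma, or expand the leftmost run of tabs, so the substitution is repeated until it fails.

```perl
use strict;
use warnings;

my $n = "1234567";
1 while $n =~ s/(\d)(\d\d\d)(?!\d)/$1,$2/g;

my $line = "a\tb";
1 while $line =~ s/\t+/' ' x (length($&)*8 - length($`)%8)/e;

print "$n\n";                              # 1,234,567
print "b is at column ", index($line, "b"), "\n";   # b is at column 8
```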
1346
1347=item tr/SEARCHLIST/REPLACEMENTLIST/cds
1348
1349=item y/SEARCHLIST/REPLACEMENTLIST/cds
1350
1351Transliterates all occurrences of the characters found in the search list
1352with the corresponding character in the replacement list.  It returns
1353the number of characters replaced or deleted.  If no string is
1354specified via the =~ or !~ operator, the $_ string is transliterated.  (The
1355string specified with =~ must be a scalar variable, an array element, a
1356hash element, or an assignment to one of those, i.e., an lvalue.)
1357
1358A character range may be specified with a hyphen, so C<tr/A-J/0-9/>
1359does the same replacement as C<tr/ACEGIBDFHJ/0246813579/>.
1360For B<sed> devotees, C<y> is provided as a synonym for C<tr>.  If the
1361SEARCHLIST is delimited by bracketing quotes, the REPLACEMENTLIST has
1362its own pair of quotes, which may or may not be bracketing quotes,
1363e.g., C<tr[A-Z][a-z]> or C<tr(+\-*/)/ABCD/>.
1364
1365Note that C<tr> does B<not> do regular expression character classes
such as C<\d> or C<[:lower:]>.  The C<tr> operator is not equivalent to
1367the tr(1) utility.  If you want to map strings between lower/upper
1368cases, see L<perlfunc/lc> and L<perlfunc/uc>, and in general consider
1369using the C<s> operator if you need regular expressions.
1370
Note also that the whole range idea is rather unportable between
character sets--and even within character sets ranges may cause results
1373you probably didn't expect.  A sound principle is to use only ranges
1374that begin from and end at either alphabets of equal case (a-e, A-E),
1375or digits (0-4).  Anything else is unsafe.  If in doubt, spell out the
1376character sets in full.
1377
1378Options:
1379
1380    c	Complement the SEARCHLIST.
1381    d	Delete found but unreplaced characters.
1382    s	Squash duplicate replaced characters.
1383
1384If the C</c> modifier is specified, the SEARCHLIST character set
1385is complemented.  If the C</d> modifier is specified, any characters
1386specified by SEARCHLIST not found in REPLACEMENTLIST are deleted.
1387(Note that this is slightly more flexible than the behavior of some
1388B<tr> programs, which delete anything they find in the SEARCHLIST,
1389period.) If the C</s> modifier is specified, sequences of characters
1390that were transliterated to the same character are squashed down
1391to a single instance of the character.
1392
1393If the C</d> modifier is used, the REPLACEMENTLIST is always interpreted
1394exactly as specified.  Otherwise, if the REPLACEMENTLIST is shorter
1395than the SEARCHLIST, the final character is replicated till it is long
1396enough.  If the REPLACEMENTLIST is empty, the SEARCHLIST is replicated.
1397This latter is useful for counting characters in a class or for
1398squashing character sequences in a class.
1399
1400Examples:
1401
1402    $ARGV[1] =~ tr/A-Z/a-z/;	# canonicalize to lower case
1403
1404    $cnt = tr/*/*/;		# count the stars in $_
1405
1406    $cnt = $sky =~ tr/*/*/;	# count the stars in $sky
1407
1408    $cnt = tr/0-9//;		# count the digits in $_
1409
1410    tr/a-zA-Z//s;		# bookkeeper -> bokeper
1411
1412    ($HOST = $host) =~ tr/a-z/A-Z/;
1413
1414    tr/a-zA-Z/ /cs;		# change non-alphas to single space
1415
1416    tr [\200-\377]
1417       [\000-\177];		# delete 8th bit
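
The interaction between C</d>, a short REPLACEMENTLIST and C</s> is easier to see with the results written out.  This sketch uses arbitrary strings and copies them first so the originals stay untouched.

```perl
use strict;
use warnings;

my $text = "hello world";

my $letters = ($text =~ tr/a-z//);             # counts only; $text unchanged
(my $no_vowels = $text)    =~ tr/aeiou//d;     # /d deletes:            hll wrld
(my $padded    = "abcde")  =~ tr/a-e/xy/;      # last char replicated:  xyyyy
(my $exact     = "abcde")  =~ tr/a-e/xy/d;     # /d, no replication:    xy
(my $squashed  = "aabbcc") =~ tr/a-c/x/s;      # all become x, squashed: x

print "$letters $no_vowels $padded $exact $squashed\n";
# 10 hll wrld xyyyy xy x
```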
1418
1419If multiple transliterations are given for a character, only the
1420first one is used:
1421
1422    tr/AAA/XYZ/
1423
1424will transliterate any A to X.
1425
Because the transliteration table is built at compile time, neither
the SEARCHLIST nor the REPLACEMENTLIST is subjected to double quote
1428interpolation.  That means that if you want to use variables, you
1429must use an eval():
1430
1431    eval "tr/$oldlist/$newlist/";
1432    die $@ if $@;
1433
1434    eval "tr/$oldlist/$newlist/, 1" or die $@;
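
Applied to a named variable rather than C<$_>, the second form might look like the sketch below.  The lists are pasted into source code before it is compiled, so this assumes they come from a trusted place; the sample values are arbitrary.

```perl
use strict;
use warnings;

my ($oldlist, $newlist) = ('a-c', 'A-C');
my $text = "aabbccdd";

# \$text is escaped so that the variable name, not its value, ends up in
# the generated code: $text =~ tr/a-c/A-C/, 1
eval "\$text =~ tr/$oldlist/$newlist/, 1" or die $@;

print "$text\n";    # AABBCCdd
```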
1435
1436=item <<EOF
1437
1438A line-oriented form of quoting is based on the shell "here-document"
1439syntax.  Following a C<< << >> you specify a string to terminate
1440the quoted material, and all lines following the current line down to
1441the terminating string are the value of the item.  The terminating
1442string may be either an identifier (a word), or some quoted text.  If
1443quoted, the type of quotes you use determines the treatment of the
1444text, just as in regular quoting.  An unquoted identifier works like
1445double quotes.  There must be no space between the C<< << >> and
1446the identifier, unless the identifier is quoted.  (If you put a space it
1447will be treated as a null identifier, which is valid, and matches the first
1448empty line.)  The terminating string must appear by itself (unquoted and
1449with no surrounding whitespace) on the terminating line.
1450
1451       print <<EOF;
1452    The price is $Price.
1453    EOF
1454
1455       print << "EOF"; # same as above
1456    The price is $Price.
1457    EOF
1458
1459       print << `EOC`; # execute commands
1460    echo hi there
1461    echo lo there
1462    EOC
1463
1464       print <<"foo", <<"bar"; # you can stack them
1465    I said foo.
1466    foo
1467    I said bar.
1468    bar
1469
1470       myfunc(<< "THIS", 23, <<'THAT');
1471    Here's a line
1472    or two.
1473    THIS
1474    and here's another.
1475    THAT
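
The effect of the quoting style on interpolation can be checked by capturing the text instead of printing it.  In this sketch C<$Price> is given an arbitrary value.

```perl
use strict;
use warnings;

my $Price = 42;

my $bare = <<EOF;
The price is $Price.
EOF

my $double = <<"EOF";
The price is $Price.
EOF

my $single = <<'EOF';
The price is $Price.
EOF

print $bare;      # The price is 42.
print $double;    # The price is 42.
print $single;    # The price is $Price.
```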
1476
1477Just don't forget that you have to put a semicolon on the end
1478to finish the statement, as Perl doesn't know you're not going to
1479try to do this:
1480
1481       print <<ABC
1482    179231
1483    ABC
1484       + 20;
1485
1486If you want your here-docs to be indented with the
1487rest of the code, you'll need to remove leading whitespace
1488from each line manually:
1489
1490    ($quote = <<'FINIS') =~ s/^\s+//gm;
1491       The Road goes ever on and on,
1492       down from the door where it began.
1493    FINIS
1494
1495If you use a here-doc within a delimited construct, such as in C<s///eg>,
1496the quoted material must come on the lines following the final delimiter.
1497So instead of
1498
1499    s/this/<<E . 'that'
1500    the other
1501    E
1502     . 'more '/eg;
1503
1504you have to write
1505
1506    s/this/<<E . 'that'
1507     . 'more '/eg;
1508    the other
1509    E
1510
1511If the terminating identifier is on the last line of the program, you
must be sure there is a newline after it; otherwise, Perl will fail with the
error B<Can't find string terminator "END" anywhere before EOF...>.
1514
1515Additionally, the quoting rules for the identifier are not related to
1516Perl's quoting rules -- C<q()>, C<qq()>, and the like are not supported
1517in place of C<''> and C<"">, and the only interpolation is for backslashing
1518the quoting character:
1519
1520    print << "abc\"def";
1521    testing...
1522    abc"def
1523
1524Finally, quoted strings cannot span multiple lines.  The general rule is
1525that the identifier must be a string literal.  Stick with that, and you
1526should be safe.
1527
1528=back
1529
1530=head2 Gory details of parsing quoted constructs
1531
1532When presented with something that might have several different
1533interpretations, Perl uses the B<DWIM> (that's "Do What I Mean")
1534principle to pick the most probable interpretation.  This strategy
1535is so successful that Perl programmers often do not suspect the
1536ambivalence of what they write.  But from time to time, Perl's
1537notions differ substantially from what the author honestly meant.
1538
1539This section hopes to clarify how Perl handles quoted constructs.
1540Although the most common reason to learn this is to unravel labyrinthine
1541regular expressions, because the initial steps of parsing are the
1542same for all quoting operators, they are all discussed together.
1543
1544The most important Perl parsing rule is the first one discussed
1545below: when processing a quoted construct, Perl first finds the end
1546of that construct, then interprets its contents.  If you understand
1547this rule, you may skip the rest of this section on the first
1548reading.  The other rules are likely to contradict the user's
1549expectations much less frequently than this first one.
1550
1551Some passes discussed below are performed concurrently, but because
1552their results are the same, we consider them individually.  For different
1553quoting constructs, Perl performs different numbers of passes, from
1554one to five, but these passes are always performed in the same order.
1555
1556=over 4
1557
1558=item Finding the end
1559
1560The first pass is finding the end of the quoted construct, whether
1561it be a multicharacter delimiter C<"\nEOF\n"> in the C<<<EOF>
1562construct, a C</> that terminates a C<qq//> construct, a C<]> which
terminates a C<qq[]> construct, or a C<< > >> which terminates a
1564fileglob started with C<< < >>.
1565
1566When searching for single-character non-pairing delimiters, such
1567as C</>, combinations of C<\\> and C<\/> are skipped.  However,
when searching for a single-character pairing delimiter like C<[>,
1569combinations of C<\\>, C<\]>, and C<\[> are all skipped, and nested
1570C<[>, C<]> are skipped as well.  When searching for multicharacter
1571delimiters, nothing is skipped.
1572
1573For constructs with three-part delimiters (C<s///>, C<y///>, and
1574C<tr///>), the search is repeated once more.
1575
1576During this search no attention is paid to the semantics of the construct.
1577Thus:
1578
1579    "$hash{"$foo/$bar"}"
1580
1581or:
1582
1583    m/
1584      bar	# NOT a comment, this slash / terminated m//!
1585     /x
1586
1587do not form legal quoted expressions.   The quoted part ends on the
1588first C<"> and C</>, and the rest happens to be a syntax error.
1589Because the slash that terminated C<m//> was followed by a C<SPACE>,
1590the example above is not C<m//x>, but rather C<m//> with no C</x>
1591modifier.  So the embedded C<#> is interpreted as a literal C<#>.
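
As a rough illustration of the delimiter search, the following sketch
(the variable names are illustrative only) shows that nested brackets
and backslashed delimiters do not end a construct early, and that the
three-part C<s///> is searched twice:

```perl
use strict;
use warnings;

# Nested [ ] pairs are skipped while Perl looks for the closing ].
my $nested = q[outer [inner] tail];

# A backslashed / is skipped while Perl looks for the closing /.
my $escaped = q/usr\/local\/bin/;

# s/// has three delimiters, so the search runs once more
# to find the end of the replacement part.
(my $swapped = "a/b") =~ s/\//|/;

print "$nested\n";     # outer [inner] tail
print "$escaped\n";    # usr/local/bin
print "$swapped\n";    # a|b
```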

=item Removal of backslashes before delimiters

During the second pass, text between the starting and ending
delimiters is copied to a safe location, and the C<\> is removed
from combinations consisting of C<\> and a delimiter (or delimiters:
both the starting and ending delimiters are handled, should these
differ).  This removal does not happen for multi-character delimiters.
Note that the combination C<\\> is left intact, just as it was.

Starting from this step no information about the delimiters is
used in parsing.
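
A minimal sketch of this pass, using C<q//> so that no later
interpolation obscures the result (the variable names are illustrative
only):

```perl
use strict;
use warnings;

my $slash   = q/a\/b/;       # \/ loses its backslash:        a/b
my $bracket = q[a\]b\[c];    # both \] and \[ lose theirs:    a]b[c
my $other   = q/a\nb/;       # \n involves no delimiter, so it
                             # stays as backslash plus n:     a\nb

printf "%s (%d chars)\n", $_, length $_ for $slash, $bracket, $other;
```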

=item Interpolation

The next step is interpolation in the text obtained, which is now
delimiter-independent.  There are four different cases.

=over 4

=item C<<<'EOF'>, C<m''>, C<s'''>, C<tr///>, C<y///>

No interpolation is performed.

=item C<''>, C<q//>

The only interpolation is removal of C<\> from pairs C<\\>.

=item C<"">, C<``>, C<qq//>, C<qx//>, C<< <file*glob> >>

C<\Q>, C<\U>, C<\u>, C<\L>, C<\l> (possibly paired with C<\E>) are
converted to corresponding Perl constructs.  Thus, C<"$foo\Qbaz$bar">
is converted to C<$foo . (quotemeta("baz" . $bar))> internally.
The other combinations are replaced with appropriate expansions.

Let it be stressed that I<whatever falls between C<\Q> and C<\E>>
is interpolated in the usual way.  Something like C<"\Q\\E"> has
no C<\E> inside.  Instead, it has C<\Q>, C<\\>, and C<E>, so the
result is the same as for C<"\\\\E">.  As a general rule, backslashes
between C<\Q> and C<\E> may lead to counterintuitive results.  So,
C<"\Q\t\E"> is converted to C<quotemeta("\t")>, which is the same
as C<"\\\t"> (since TAB is not alphanumeric).  Note also that:

  $str = '\t';
  return "\Q$str";

may be closer to the conjectural I<intention> of the writer of C<"\Q\t\E">.

Interpolated scalars and arrays are converted internally to the C<join> and
C<.> catenation operations.  Thus, C<"$foo XXX '@arr'"> becomes:

  $foo . " XXX '" . (join $", @arr) . "'";

All operations above are performed simultaneously, left to right.

Because the result of C<"\Q STRING \E"> has all metacharacters
quoted, there is no way to insert a literal C<$> or C<@> inside a
C<\Q\E> pair.  If protected by C<\>, C<$> will be quoted to become
C<"\\\$">; if not, it is interpreted as the start of an interpolated
scalar.

Note also that the interpolation code needs to make a decision on
where the interpolated scalar ends.  For instance, whether
C<< "a $b -> {c}" >> really means:

  "a " . $b . " -> {c}";

or:

  "a " . $b -> {c};

Most of the time, the choice is the longest possible text that does
not include spaces between components and which contains matching
braces or brackets.  Because the outcome may be determined by voting
based on heuristic estimators, the result is not strictly predictable.
Fortunately, it's usually correct for ambiguous cases.

=item C<?RE?>, C</RE/>, C<m/RE/>, C<s/RE/foo/>

Processing of C<\Q>, C<\U>, C<\u>, C<\L>, C<\l>, and interpolation
happens (almost) as with C<qq//> constructs, but the substitution
of C<\> followed by RE-special chars (including C<\>) is not
performed.  Moreover, inside C<(?{BLOCK})>, C<(?# comment )>, and
a C<#>-comment in a C<//x>-regular expression, no processing is
performed whatsoever.  This is the first step at which the presence
of the C<//x> modifier is relevant.

Interpolation has several quirks: C<$|>, C<$(>, and C<$)> are not
interpolated, and constructs C<$var[SOMETHING]> are voted (by several
different estimators) to be either an array element or C<$var>
followed by an RE alternative.  This is where the notation
C<${arr[$bar]}> comes in handy: C</${arr[0-9]}/> is interpreted as
array element C<-9>, not as a regular expression from the variable
C<$arr> followed by a digit, which would be the interpretation of
C</$arr[0-9]/>.  Since voting among different estimators may occur,
the result is not predictable.

It is at this step that C<\1> is begrudgingly converted to C<$1> in
the replacement text of C<s///> to correct the incorrigible
I<sed> hackers who haven't picked up the saner idiom yet.  A warning
is emitted if the C<use warnings> pragma or the B<-w> command-line flag
(that is, the C<$^W> variable) was set.

The lack of processing of C<\\> creates specific restrictions on
the post-processed text.  If the delimiter is C</>, one cannot get
the combination C<\/> into the result of this step.  C</> will
finish the regular expression, C<\/> will be stripped to C</> on
the previous step, and C<\\/> will be left as is.  Because C</> is
equivalent to C<\/> inside a regular expression, this does not
matter unless the delimiter happens to be a character special to the
RE engine, such as in C<s*foo*bar*>, C<m[foo]>, or C<?foo?>; or an
alphanumeric char, as in:

  m m ^ a \s* b mmx;

In the RE above, which is intentionally obfuscated for illustration, the
delimiter is C<m>, the modifier is C<mx>, and after backslash-removal the
RE is the same as for C<m/ ^ a \s* b /mx>.  There's more than one
reason you're encouraged to restrict your delimiters to non-alphanumeric,
non-whitespace choices.

=back

This step is the last one for all constructs except regular expressions,
which are processed further.
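
The conversions described above for double-quoted constructs can be
checked directly.  The sketch below (with illustrative variable names)
compares each interpolated string against the expression it is said to
be converted to:

```perl
use strict;
use warnings;

my $foo = "x";
my $bar = "a.b";
my @arr = (1, 2, 3);

# \Q applies quotemeta to everything after it, interpolated values included.
my $quoted = "$foo\Qbaz$bar";
my $manual = $foo . quotemeta("baz" . $bar);

# Arrays are joined with $" (a single space by default).
my $joined  = "$foo XXX '@arr'";
my $by_hand = $foo . " XXX '" . join($", @arr) . "'";

# "\Q\t\E" is quotemeta of a TAB: a backslash followed by a TAB.
my $tab = "\Q\t\E";

print "$quoted\n";    # xbaza\.b
print "$joined\n";    # x XXX '1 2 3'
```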

=item Interpolation of regular expressions

Previous steps were performed during the compilation of Perl code,
but this one happens at run time--although it may be optimized to
be calculated at compile time if appropriate.  After preprocessing
described above, and possibly after evaluation if catenation,
joining, casing translation, or metaquoting are involved, the
resulting I<string> is passed to the RE engine for compilation.

Whatever happens in the RE engine might be better discussed in L<perlre>,
but for the sake of continuity, we shall do so here.

This is another step where the presence of the C<//x> modifier is
relevant.  The RE engine scans the string from left to right and
converts it to a finite automaton.

Backslashed characters are either replaced with corresponding
literal strings (as with C<\{>), or else they generate special nodes
in the finite automaton (as with C<\b>).  Characters special to the
RE engine (such as C<|>) generate corresponding nodes or groups of
nodes.  C<(?#...)> comments are ignored.  All the rest is either
converted to literal strings to match, or else is ignored (as is
whitespace and C<#>-style comments if C<//x> is present).

Parsing of the bracketed character class construct, C<[...]>, is
rather different from the rule used for the rest of the pattern.
The terminator of this construct is found using the same rules as
for finding the terminator of a C<{}>-delimited construct, the only
exception being that C<]> immediately following C<[> is treated as
though preceded by a backslash.  Similarly, the terminator of
C<(?{...})> is found using the same rules as for finding the
terminator of a C<{}>-delimited construct.

It is possible to inspect both the string given to the RE engine and
the resulting finite automaton.  See the arguments C<debug>/C<debugcolor>
in the C<use L<re>> pragma, as well as Perl's B<-Dr> command-line
switch documented in L<perlrun/"Command Switches">.

=item Optimization of regular expressions

This step is listed for completeness only.  Since it does not change
semantics, details of this step are not documented and are subject
to change without notice.  This step is performed over the finite
automaton that was generated during the previous pass.

It is at this stage that C<split()> silently optimizes C</^/> to
mean C</^/m>.
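
The visible effect of that C<split()> special case can be seen in a
small sketch (the sample string is made up for illustration): a plain
C</^/> splits a multi-line string at the start of every line.

```perl
use strict;
use warnings;

# /^/ is treated as /^/m here, so the string is split before each line.
my @lines = split /^/, "one\ntwo\nthree\n";

print scalar(@lines), "\n";    # 3
print $lines[1];               # two
```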

=back

=head2 I/O Operators

There are several I/O operators you should know about.

A string enclosed by backticks (grave accents) first undergoes
double-quote interpolation.  It is then interpreted as an external
command, and the output of that command is the value of the
backtick string, as in a shell.  In scalar context, a single string
consisting of all output is returned.  In list context, a list of
values is returned, one per line of output.  (You can set C<$/> to use
a different line terminator.)  The command is executed each time the
pseudo-literal is evaluated.  The status value of the command is
returned in C<$?> (see L<perlvar> for the interpretation of C<$?>).
Unlike in B<csh>, no translation is done on the return data--newlines
remain newlines.  Unlike in any of the shells, single quotes do not
hide variable names in the command from interpretation.  To pass a
literal dollar-sign through to the shell you need to hide it with a
backslash.  The generalized form of backticks is C<qx//>.  (Because
backticks always undergo shell expansion as well, see L<perlsec> for
security concerns.)
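
The difference between the two contexts can be sketched as follows.
This example assumes a POSIX-like shell and that C<$^X> (the path of
the running perl) contains no spaces; the child command is just a
convenient stand-in for any external program.

```perl
use strict;
use warnings;

# $^X is interpolated by Perl before the command reaches the shell.
my $all   = `$^X -le "print for 1..3"`;    # scalar context: one string
my @lines = `$^X -le "print for 1..3"`;    # list context: one item per line
my $exit  = $? >> 8;                       # exit status of the last command

print length($all), " chars, ", scalar(@lines), " lines, exit $exit\n";
```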

In scalar context, evaluating a filehandle in angle brackets yields
the next line from that file (the newline, if any, included), or
C<undef> at end-of-file or on error.  When C<$/> is set to C<undef>
(sometimes known as file-slurp mode) and the file is empty, it
returns C<''> the first time, followed by C<undef> subsequently.

Ordinarily you must assign the returned value to a variable, but
there is one situation where an automatic assignment happens.  If
and only if the input symbol is the only thing inside the conditional
of a C<while> statement (even if disguised as a C<for(;;)> loop),
the value is automatically assigned to the global variable $_,
destroying whatever was there previously.  (This may seem like an
odd thing to you, but you'll use the construct in almost every Perl
script you write.)  The $_ variable is not implicitly localized.
You'll have to put a C<local $_;> before the loop if you want that
to happen.

The following lines are equivalent:

    while (defined($_ = <STDIN>)) { print; }
    while ($_ = <STDIN>) { print; }
    while (<STDIN>) { print; }
    for (;<STDIN>;) { print; }
    print while defined($_ = <STDIN>);
    print while ($_ = <STDIN>);
    print while <STDIN>;

This also behaves similarly, but avoids $_:

    while (my $line = <STDIN>) { print $line }

In these loop constructs, the assigned value (whether assignment
is automatic or explicit) is then tested to see whether it is
defined.  The defined test avoids problems where a line has a string
value that would be treated as false by Perl, for example a "" or
a "0" with no trailing newline.  If you really mean for such values
to terminate the loop, they should be tested for explicitly:

    while (($_ = <STDIN>) ne '0') { ... }
    while (<STDIN>) { last unless $_; ... }

In other boolean contexts, C<< <I<filehandle>> >> without an
explicit C<defined> test or comparison elicits a warning if the
C<use warnings> pragma or the B<-w>
command-line switch (the C<$^W> variable) is in effect.
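
The implicit C<defined> test in a C<while> condition can be observed
with an in-memory filehandle whose last line is a bare C<0>.  This is
only a sketch: the data and variable names are invented, and opening a
handle on a scalar reference assumes a perl built with PerlIO.

```perl
use strict;
use warnings;

my $data = "first\n0";    # the last line is "0" with no trailing newline
open(my $fh, '<', \$data) or die "cannot open in-memory file: $!";

my @seen;
while (my $line = <$fh>) {    # tested as defined(my $line = <$fh>)
    chomp $line;
    push @seen, $line;
}
close $fh;

print join(",", @seen), "\n";    # first,0
```

Without the implicit test, the final C<"0"> would be false and the
loop would stop one line early.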

The filehandles STDIN, STDOUT, and STDERR are predefined.  (The
filehandles C<stdin>, C<stdout>, and C<stderr> will also work except
in packages, where they would be interpreted as local identifiers
rather than global.)  Additional filehandles may be created with
the open() function, amongst others.  See L<perlopentut> and
L<perlfunc/open> for details on this.

If a <FILEHANDLE> is used in a context that is looking for
a list, a list comprising all input lines is returned, one line per
list element.  It's easy to grow to a rather large data space this
way, so use with care.

<FILEHANDLE> may also be spelled C<readline(*FILEHANDLE)>.
See L<perlfunc/readline>.

The null filehandle <> is special: it can be used to emulate the
behavior of B<sed> and B<awk>.  Input from <> comes either from
standard input, or from each file listed on the command line.  Here's
how it works: the first time <> is evaluated, the @ARGV array is
checked, and if it is empty, C<$ARGV[0]> is set to "-", which when opened
gives you standard input.  The @ARGV array is then processed as a list
of filenames.  The loop

    while (<>) {
        ...                     # code for each line
    }

is equivalent to the following Perl-like pseudo code:

    unshift(@ARGV, '-') unless @ARGV;
    while ($ARGV = shift) {
        open(ARGV, $ARGV);
        while (<ARGV>) {
            ...         # code for each line
        }
    }

except that it isn't so cumbersome to say, and will actually work.
It really does shift the @ARGV array and put the current filename
into the $ARGV variable.  It also uses filehandle I<ARGV>
internally--<> is just a synonym for <ARGV>, which
is magical.  (The pseudo code above doesn't work because it treats
<ARGV> as non-magical.)

You can modify @ARGV before the first <> as long as the array ends up
containing the list of filenames you really want.  Line numbers (C<$.>)
continue as though the input were one big happy file.  See the example
in L<perlfunc/eof> for how to reset line numbers on each file.
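
The following sketch shows both effects described above: @ARGV is
consumed as <> works through it, and C<$.> keeps counting across
files.  The scratch files are created with the core C<File::Temp>
module purely so that the example is self-contained; their contents
are invented.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create two small scratch files to stand in for command-line arguments.
my @names;
for my $text ("a\nb\n", "c\n") {
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close $fh or die "close failed: $!";
    push @names, $name;
}

@ARGV = @names;             # set before the first <>
my @log;
while (<>) {
    chomp;
    push @log, "$.:$_";     # $. is not reset between files
}

print "@log\n";                          # 1:a 2:b 3:c
print scalar(@ARGV), " names left\n";    # 0 names left
```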

If you want to set @ARGV to your own list of files, go right ahead.
This sets @ARGV to all plain text files if no @ARGV was given:

    @ARGV = grep { -f && -T } glob('*') unless @ARGV;

You can even set them to pipe commands.  For example, this automatically
filters compressed arguments through B<gzip>:

    @ARGV = map { /\.(gz|Z)$/ ? "gzip -dc < $_ |" : $_ } @ARGV;

If you want to pass switches into your script, you can use one of the
Getopts modules or put a loop on the front like this:

    while ($_ = $ARGV[0], /^-/) {
        shift;
        last if /^--$/;
        if (/^-D(.*)/) { $debug = $1 }
        if (/^-v/)     { $verbose++  }
        # ...           # other switches
    }

    while (<>) {
        # ...           # code for each line
    }

The <> symbol will return C<undef> for end-of-file only once.
If you call it again after this, it will assume you are processing another
@ARGV list, and if you haven't set @ARGV, will read input from STDIN.

If what the angle brackets contain is a simple scalar variable (e.g.,
<$foo>), then that variable contains the name of the
filehandle to input from, or its typeglob, or a reference to the
same.  For example:

    $fh = \*STDIN;
    $line = <$fh>;

If what's within the angle brackets is neither a filehandle nor a simple
scalar variable containing a filehandle name, typeglob, or typeglob
reference, it is interpreted as a filename pattern to be globbed, and
either a list of filenames or the next filename in the list is returned,
depending on context.  This distinction is determined on syntactic
grounds alone.  That means C<< <$x> >> is always a readline() from
an indirect handle, but C<< <$hash{key}> >> is always a glob().
That's because $x is a simple scalar variable, but C<$hash{key}> is
not--it's a hash element.

One level of double-quote interpretation is done first, but you can't
say C<< <$foo> >> because that's an indirect filehandle as explained
in the previous paragraph.  (In older versions of Perl, programmers
would insert curly brackets to force interpretation as a filename glob:
C<< <${foo}> >>.  These days, it's considered cleaner to call the
internal function directly as C<glob($foo)>, which is probably the right
way to have done it in the first place.)  For example:

    while (<*.c>) {
        chmod 0644, $_;
    }

is roughly equivalent to:

    open(FOO, "echo *.c | tr -s ' \t\r\f' '\\012\\012\\012\\012'|");
    while (<FOO>) {
        chomp;
        chmod 0644, $_;
    }

except that the globbing is actually done internally using the standard
C<File::Glob> extension.  Of course, the shortest way to do the above is:

    chmod 0644, <*.c>;

A (file)glob evaluates its (embedded) argument only when it is
starting a new list.  All values must be read before it will start
over.  In list context, this isn't important because you automatically
get them all anyway.  However, in scalar context the operator returns
the next value each time it's called, or C<undef> when the list has
run out.  As with filehandle reads, an automatic C<defined> is
generated when the glob occurs in the test part of a C<while>,
because legal glob returns (e.g. a file called F<0>) would otherwise
terminate the loop.  Again, C<undef> is returned only once.  So if
you're expecting a single value from a glob, it is much better to
say

    ($file) = <blurch*>;

than

    $file = <blurch*>;

because the latter will alternate between returning a filename and
returning false.

If you're trying to do variable interpolation, it's definitely better
to use the glob() function, because the older notation can cause people
to become confused with the indirect filehandle notation.

    @files = glob("$dir/*.[ch]");
    @files = glob($files[$i]);
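
The context-dependent behaviour of glob described above can be tried
out safely in a scratch directory.  This is a sketch only: the file
names are invented, C<File::Temp> is used just to keep the example
self-contained, and it assumes the temporary directory path contains
no whitespace or glob metacharacters.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
for my $name (qw(blurch1 blurch2 other)) {
    open(my $fh, '>', "$dir/$name") or die "cannot create $name: $!";
    close $fh;
}

# List context: every match at once; ($first) keeps only the first.
my @all     = glob("$dir/blurch*");
my ($first) = glob("$dir/blurch*");

# Scalar context: one match per call, then undef once the list runs out.
my @seen;
while (my $file = glob("$dir/blurch*")) {
    push @seen, $file;
}

print scalar(@all), " matches, ", scalar(@seen), " seen one at a time\n";
```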

=head2 Constant Folding

Like C, Perl does a certain amount of expression evaluation at
compile time whenever it determines that all arguments to an
operator are static and have no side effects.  In particular, string
concatenation happens at compile time between literals that don't do
variable substitution.  Backslash interpolation also happens at
compile time.  You can say

    'Now is the time for all' . "\n" .
        'good men to come to.'

and this all reduces to one string internally.  Likewise, if
you say

    foreach $file (@filenames) {
        if (-s $file > 5 + 100 * 2**16) {  }
    }

the compiler will precompute the number which that expression
represents so that the interpreter won't have to.
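
One way to observe folding, assuming the core C<B::Deparse> module is
available, is to deparse a small sub and look at what the compiler
kept.  The exact layout of the deparsed text may vary between perl
versions, so this sketch only looks for the folded constant.

```perl
use strict;
use warnings;
use B::Deparse;

# The arithmetic below involves only literals, so it can be folded.
my $code = sub { return 5 + 100 * 2**16 };

# coderef2text returns the source text of the compiled sub body.
my $text = B::Deparse->new->coderef2text($code);

print "$text\n";
print "folded\n" if $text =~ /6553605/;
```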

=head2 Bitwise String Operators

Bitstrings of any size may be manipulated by the bitwise operators
(C<~ | & ^>).

If the operands to a binary bitwise op are strings of different
sizes, B<|> and B<^> ops act as though the shorter operand had
additional zero bits on the right, while the B<&> op acts as though
the longer operand were truncated to the length of the shorter.
The granularity for such extension or truncation is one or more
bytes.

    # ASCII-based examples
    print "j p \n" ^ " a h";            # prints "JAPH\n"
    print "JA" | "  ph\n";              # prints "japh\n"
    print "japh\nJunk" & '_____';       # prints "JAPH\n"
    print 'p N$' ^ " E<H\n";            # prints "Perl\n"
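
The padding and truncation rule can also be seen from the lengths of
the results.  A minimal sketch with made-up operands (both are plain
strings, so the string form of each operator is used):

```perl
use strict;
use warnings;

my $or  = "ab" | "abcd";    # shorter operand padded with zero bits: "abcd"
my $xor = "ab" ^ "abcd";    # "\0\0cd"
my $and = "ab" & "abcd";    # longer operand truncated: "ab"

print length($or), " ", length($xor), " ", length($and), "\n";    # 4 4 2
```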

If you are intending to manipulate bitstrings, be certain that
you're supplying bitstrings: If an operand is a number, that will imply
a B<numeric> bitwise operation.  You may explicitly show which type of
operation you intend by using C<""> or C<0+>, as in the examples below.

    $foo =  150  |  105 ;       # yields 255  (0x96 | 0x69 is 0xFF)
    $foo = '150' |  105 ;       # yields 255
    $foo =  150  | '105';       # yields 255
    $foo = '150' | '105';       # yields string '155' (under ASCII)

    $baz = 0+$foo & 0+$bar;     # both ops explicitly numeric
    $biz = "$foo" ^ "$bar";     # both ops explicitly stringy

See L<perlfunc/vec> for information on how to manipulate individual bits
in a bit vector.

=head2 Integer Arithmetic

By default, Perl assumes that it must do most of its arithmetic in
floating point.  But by saying

    use integer;

you may tell the compiler that it's okay to use integer operations
(if it feels like it) from here to the end of the enclosing BLOCK.
An inner BLOCK may countermand this by saying

    no integer;

which lasts until the end of that BLOCK.  Note that this doesn't
mean everything is only an integer, merely that Perl may use integer
operations if it is so inclined.  For example, even under C<use
integer>, if you take the C<sqrt(2)>, you'll still get C<1.4142135623731>
or so.

Used on numbers, the bitwise operators ("&", "|", "^", "~", "<<",
and ">>") always produce integral results.  (But see also
L<Bitwise String Operators>.)  However, C<use integer> still has meaning for
them.  By default, their results are interpreted as unsigned integers, but
if C<use integer> is in effect, their results are interpreted
as signed integers.  For example, C<~0> usually evaluates to a large
integral value.  However, C<use integer; ~0> is C<-1> on two's-complement
machines.
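
A short sketch of the block scoping and of the signed versus unsigned
interpretation (the variable names are illustrative only):

```perl
use strict;
use warnings;

my $float_div = 10 / 3;     # floating point: 3.333...
my $unsigned  = ~0;         # all bits set, read as an unsigned integer

my ($int_div, $signed);
{
    use integer;            # in effect until the end of this block
    $int_div = 10 / 3;      # 3
    $signed  = ~0;          # -1 on two's-complement machines
}

print "$float_div $int_div $unsigned $signed\n";
```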

=head2 Floating-point Arithmetic

While C<use integer> provides integer-only arithmetic, there is no
analogous mechanism to provide automatic rounding or truncation to a
certain number of decimal places.  For rounding to a certain number
of digits, sprintf() or printf() is usually the easiest route.
See L<perlfaq4>.

Floating-point numbers are only approximations to what a mathematician
would call real numbers.  There are infinitely more reals than floats,
so some corners must be cut.  For example:

    printf "%.20g\n", 123456789123456789;
    #        produces 123456789123456784

Testing for exact floating-point equality or inequality is
not a good idea.  Here's a (relatively expensive) work-around to compare
whether two floating-point numbers are equal to a particular number of
significant digits (which is what the C<%g> precision controls).  See
Knuth, volume II, for a more robust treatment of this topic.

    sub fp_equal {
        my ($X, $Y, $POINTS) = @_;
        my ($tX, $tY);
        $tX = sprintf("%.${POINTS}g", $X);
        $tY = sprintf("%.${POINTS}g", $Y);
        return $tX eq $tY;
    }
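
A usage sketch for this work-around follows.  The function is repeated
in a compact form so that the example runs on its own, and the sample
values are invented.  On typical binary floating-point builds
C<0.1 + 0.2> is not exactly C<0.3>, yet the two compare equal once
formatted to ten digits.

```perl
use strict;
use warnings;

# Compact restatement of fp_equal: compare after formatting with %g.
sub fp_equal {
    my ($X, $Y, $POINTS) = @_;
    return sprintf("%.${POINTS}g", $X) eq sprintf("%.${POINTS}g", $Y);
}

my $sum = 0.1 + 0.2;

print fp_equal($sum, 0.3, 10)     ? "equal\n" : "different\n";    # equal
print fp_equal(1.0001, 1.0002, 3) ? "equal\n" : "different\n";    # equal
print fp_equal(1.0001, 1.0002, 5) ? "equal\n" : "different\n";    # different
```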

The POSIX module (part of the standard perl distribution) implements
ceil(), floor(), and other mathematical and trigonometric functions.
The Math::Complex module (part of the standard perl distribution)
defines mathematical functions that work on both the reals and the
imaginary numbers.  Math::Complex is not as efficient as POSIX, but
POSIX can't work with complex numbers.

Rounding in financial applications can have serious implications, and
the rounding method used should be specified precisely.  In these
cases, it probably pays not to trust whichever system rounding is
being used by Perl, but to instead implement the rounding function you
need yourself.

=head2 Bigger Numbers

The standard Math::BigInt and Math::BigFloat modules provide
variable-precision arithmetic and overloaded operators, although
they're currently pretty slow. At the cost of some space and
considerable speed, they avoid the normal pitfalls associated with
limited-precision representations.

    use Math::BigInt;
    $x = Math::BigInt->new('123456789123456789');
    print $x * $x;

    # prints 15241578780673678515622620750190521

Several modules let you calculate with unlimited precision (bound
only by memory and CPU time) or with fixed precision. There are also
some non-standard modules that provide faster implementations via
external C libraries.

Here is a short but incomplete summary:

        Math::Fraction          big, unlimited fractions like 9973 / 12967
        Math::String            treat string sequences like numbers
        Math::FixedPrecision    calculate with a fixed precision
        Math::Currency          for currency calculations
        Bit::Vector             manipulate bit vectors fast (uses C)
        Math::BigIntFast        Bit::Vector wrapper for big numbers
        Math::Pari              provides access to the Pari C library
        Math::BigInteger        uses an external C library
        Math::Cephes            uses external Cephes C library (no big numbers)
        Math::Cephes::Fraction  fractions via the Cephes library
        Math::GMP               another one using an external C library

Choose wisely.

=cut
