=encoding utf8

=head1 NAME
X<operator>

perlop - Perl expressions: operators, precedence, string literals

=head1 DESCRIPTION

In Perl, the operator determines what operation is performed,
independent of the type of the operands. For example S<C<$x + $y>>
is always a numeric addition, and if C<$x> or C<$y> do not contain
numbers, an attempt is made to convert them to numbers first.

This is in contrast to many other dynamic languages, where the
operation is determined by the type of the first argument. It also
means that Perl has two versions of some operators, one for numeric
and one for string comparison. For example S<C<$x == $y>> compares
two numbers for equality, and S<C<$x eq $y>> compares two strings.

There are a few exceptions though: C<x> can be either string
repetition or list repetition, depending on the type of the left
operand, and C<&>, C<|>, C<^> and C<~> can be either string or numeric bit
operations.

=head2 Operator Precedence and Associativity
X<operator, precedence> X<precedence> X<associativity>

Operator precedence and associativity work in Perl more or less like
they do in mathematics.

I<Operator precedence> means some operators group more tightly than others.
For example, in C<2 + 4 * 5>, the multiplication has higher precedence, so
C<4 * 5> is grouped together as the right-hand operand of the addition, rather
than C<2 + 4> being grouped together as the left-hand operand of the
multiplication. It is as if the expression were written C<2 + (4 * 5)>, not
C<(2 + 4) * 5>. So the expression yields C<2 + 20 == 22>, rather than
C<6 * 5 == 30>.

I<Operator associativity> defines what happens if a sequence of the same
operators is used one after another:
usually that they will be grouped at the left
or the right.
For example, in C<9 - 3 - 2>, subtraction is left associative,
so C<9 - 3> is grouped together as the left-hand operand of the second
subtraction, rather than C<3 - 2> being grouped together as the right-hand
operand of the first subtraction. It is as if the expression were written
C<(9 - 3) - 2>, not C<9 - (3 - 2)>. So the expression yields C<6 - 2 == 4>,
rather than C<9 - 1 == 8>.

For simple operators that evaluate all their operands and then combine the
values in some way, precedence and associativity (and parentheses) imply some
ordering requirements on those combining operations. For example, in
C<2 + 4 * 5>, the grouping implied by precedence means that the multiplication
of 4 and 5 must be performed before the addition of 2 and 20, simply because
the result of that multiplication is required as one of the operands of the
addition. But the order of operations is not fully determined by this: in
C<2 * 2 + 4 * 5> both multiplications must be performed before the addition,
but the grouping does not say anything about the order in which the two
multiplications are performed. In fact Perl has a general rule that the
operands of an operator are evaluated in left-to-right order. A few operators
such as C<&&=> have special evaluation rules that can result in an operand not
being evaluated at all; in general, the top-level operator in an expression
has control of operand evaluation.

Some comparison operators, as their associativity, I<chain> with some
operators of the same precedence (but never with operators of different
precedence). This chaining means that each comparison is performed
on the two arguments surrounding it, with each interior argument taking
part in two comparisons, and the comparison results are implicitly ANDed.
Thus S<C<"$x E<lt> $y E<lt>= $z">> behaves exactly like S<C<"$x E<lt>
$y && $y E<lt>= $z">>, assuming that C<"$y"> is as simple a scalar as
it looks.
The ANDing short-circuits just like C<"&&"> does, stopping
the sequence of comparisons as soon as one yields false.

In a chained comparison, each argument expression is evaluated at most
once, even if it takes part in two comparisons, but the result of the
evaluation is fetched for each comparison. (It is not evaluated
at all if the short-circuiting means that it's not required for any
comparisons.) This matters if the computation of an interior argument
is expensive or non-deterministic. For example,

    if($x < expensive_sub() <= $z) { ...

is not entirely like

    if($x < expensive_sub() && expensive_sub() <= $z) { ...

but instead closer to

    my $tmp = expensive_sub();
    if($x < $tmp && $tmp <= $z) { ...

in that the subroutine is only called once. However, it's not exactly
like this latter code either, because the chained comparison doesn't
actually involve any temporary variable (named or otherwise): there is
no assignment. This doesn't make much difference where the expression
is a call to an ordinary subroutine, but matters more with an lvalue
subroutine, or if the argument expression yields some unusual kind of
scalar by other means. For example, if the argument expression yields
a tied scalar, then the expression is evaluated to produce that scalar
at most once, but the value of that scalar may be fetched up to twice,
once for each comparison in which it is actually used.

In this example, the expression is evaluated only once, and the tied
scalar (the result of the expression) is fetched for each comparison that
uses it.

    if ($x < $tied_scalar < $z) { ...

In the next example, the expression is evaluated only once, and the tied
scalar is fetched once as part of the operation within the expression.
The result of that operation is fetched for each comparison, which
normally doesn't matter unless that expression result is also magical due
to operator overloading.

    if ($x < $tied_scalar + 42 < $z) { ...

Some operators are instead non-associative, meaning that it is a syntax
error to use a sequence of those operators of the same precedence.
For example, S<C<"$x .. $y .. $z">> is an error.

Perl operators have the following associativity and precedence,
listed from highest precedence to lowest. Operators borrowed from
C keep the same precedence relationship with each other, even where
C's precedence is slightly screwy. (This makes learning Perl easier
for C folks.) With very few exceptions, these all operate on scalar
values only, not array values.

    left        terms and list operators (leftward)
    left        ->
    nonassoc    ++ --
    right       **
    right       ! ~ ~. \ and unary + and -
    left        =~ !~
    left        * / % x
    left        + - .
    left        << >>
    nonassoc    named unary operators
    nonassoc    isa
    chained     < > <= >= lt gt le ge
    chain/na    == != eq ne <=> cmp ~~
    left        & &.
    left        | |. ^ ^.
    left        &&
    left        || ^^ //
    nonassoc    ..  ...
    right       ?:
    right       = += -= *= etc. goto last next redo dump
    left        , =>
    nonassoc    list operators (rightward)
    right       not
    left        and
    left        or xor

In the following sections, these operators are covered in detail, in the
same order in which they appear in the table above.

Many operators can be overloaded for objects. See L<overload>.

=head2 Terms and List Operators (Leftward)
X<list operator> X<operator, list> X<term>

A TERM has the highest precedence in Perl. Terms include variables,
quote and quote-like operators, any expression in parentheses,
and any function whose arguments are parenthesized. Actually, there
aren't really functions in this sense, just list operators and unary
operators behaving as functions because you put parentheses around
the arguments. These are all documented in L<perlfunc>.

If any list operator (C<print()>, etc.) or any unary operator (C<chdir()>, etc.)
is followed by a left parenthesis as the next token, the operator and
arguments within parentheses are taken to be of highest precedence,
just like a normal function call.

In the absence of parentheses, the precedence of list operators such as
C<print>, C<sort>, or C<chmod> is either very high or very low depending on
whether you are looking at the left side or the right side of the operator.
For example, in

    @ary = (1, 3, sort 4, 2);
    print @ary;         # prints 1324

the commas on the right of the C<sort> are evaluated before the C<sort>,
but the commas on the left are evaluated after. In other words,
list operators tend to gobble up all arguments that follow, and
then act like a simple TERM with regard to the preceding expression.
Be careful with parentheses:

    # These evaluate exit before doing the print:
    print($foo, exit);  # Obviously not what you want.
    print $foo, exit;   # Nor is this.

    # These do the print before evaluating exit:
    (print $foo), exit; # This is what you want.
    print($foo), exit;  # Or this.
    print ($foo), exit; # Or even this.

Also note that

    print ($foo & 255) + 1, "\n";

probably doesn't do what you expect at first glance. The parentheses
enclose the argument list for C<print> which is evaluated (printing
the result of S<C<$foo & 255>>). Then one is added to the return value
of C<print> (usually 1). The result is something like this:

    1 + 1, "\n";    # Obviously not what you meant.

To do what you meant properly, you must write:

    print(($foo & 255) + 1, "\n");

See L</Named Unary Operators> for more discussion of this.

Also parsed as terms are the S<C<do {}>> and S<C<eval {}>> constructs, as
well as subroutine and method calls, and the anonymous
constructors C<[]> and C<{}>.

See also L</Quote and Quote-like Operators> toward the end of this section,
as well as L</"I/O Operators">.
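
The rules in this section can be tried out with a minimal sketch such as the
following. The variable names C<$sorted> and C<$value>, and the value given to
C<$foo>, are illustrative assumptions and not part of the examples above.

```perl
use strict;
use warnings;

# sort takes everything to its right, then behaves as a single term
# in the enclosing list: this parses as (1, 3, sort(4, 2)).
my @ary    = (1, 3, sort 4, 2);
my $sorted = join '', @ary;         # "1324"

my $foo   = 511;                    # illustrative value: 511 & 255 == 255
my $value = ($foo & 255) + 1;       # 256

# The outer parentheses enclose the whole argument list of print,
# so "+ 1" applies to ($foo & 255), not to print's return value.
print(($foo & 255) + 1, "\n");      # prints 256

# A leading unary plus keeps the parenthesis from being read as the
# start of print's argument list, with the same effect.
print +($foo & 255) + 1, "\n";      # prints 256
```

Both C<print> statements write the same line; the unary plus form is mostly a
matter of taste when the extra pair of parentheses feels noisy.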

=head2 The Arrow Operator
X<arrow> X<dereference> X<< -> >>

"C<< -> >>" is an infix dereference operator, just as it is in C
and C++. If the right side is either a C<[...]>, C<{...}>, or a
C<(...)> subscript, then the left side must be either a hard or
symbolic reference to an array, a hash, or a subroutine respectively.
(Or technically speaking, a location capable of holding a hard
reference, if it's an array or hash reference being used for
assignment.) See L<perlreftut> and L<perlref>.

Otherwise, the right side is a method name or a simple scalar
variable containing either the method name or a subroutine reference,
and (if it is a method name) the left side must be either an object (a
blessed reference) or a class name (that is, a package name). See
L<perlobj>.

The dereferencing cases (as opposed to method-calling cases) are
somewhat extended by the C<postderef> feature. For the
details of that feature, consult L<perlref/Postfix Dereference Syntax>.

=head2 Auto-increment and Auto-decrement
X<increment> X<auto-increment> X<++> X<decrement> X<auto-decrement> X<-->

C<"++"> and C<"--"> work as in C. That is, if placed before a variable,
they increment or decrement the variable by one before returning the
value, and if placed after, increment or decrement after returning the
value.

    $i = 0;  $j = 0;
    print $i++;  # prints 0
    print ++$j;  # prints 1

Note that just as in C, Perl doesn't define B<when> the variable is
incremented or decremented. You just know it will be done sometime
before or after the value is returned. This also means that modifying
a variable twice in the same statement will lead to undefined behavior.
Avoid statements like:

    $i = $i ++;
    print ++ $i + $i ++;

Perl will not guarantee what the result of the above statements is.

The auto-increment operator has a little extra builtin magic to it.
If you increment a variable that is numeric, or that has ever been used in
a numeric context, you get a normal increment. If, however, the
variable has been used in only string contexts since it was set, and
has a value that is not the empty string and matches the pattern
C</^[a-zA-Z]*[0-9]*\z/>, the increment is done as a string, preserving each
character within its range, with carry:

    print ++($foo = "99");      # prints "100"
    print ++($foo = "a0");      # prints "a1"
    print ++($foo = "Az");      # prints "Ba"
    print ++($foo = "zz");      # prints "aaa"

C<undef> is always treated as numeric, and in particular is changed
to C<0> before incrementing (so that a post-increment of an undef value
will return C<0> rather than C<undef>).

The auto-decrement operator is not magical.

=head2 Exponentiation
X<**> X<exponentiation> X<power>

Binary C<"**"> is the exponentiation operator. It binds even more
tightly than unary minus, so C<-2**4> is C<-(2**4)>, not C<(-2)**4>.
(This is
implemented using C's C<pow(3)> function, which actually works on doubles
internally.)

Note that certain exponentiation expressions are ill-defined:
these include C<0**0>, C<1**Inf>, and C<Inf**0>. Do not expect
any particular results from these special cases; the results
are platform-dependent.

=head2 Symbolic Unary Operators
X<unary operator> X<operator, unary>

Unary C<"!"> performs logical negation, that is, "not". See also
L<C<not>|/Logical Not> for a lower precedence version of this.
X<!>

Unary C<"-"> performs arithmetic negation if the operand is numeric,
including any string that looks like a number. If the operand is
an identifier, a string consisting of a minus sign concatenated
with the identifier is returned. Otherwise, if the string starts
with a plus or minus, a string starting with the opposite sign is
returned.
One effect of these rules is that C<-bareword> is equivalent
to the string C<"-bareword">. If, however, the string begins with a
non-alphabetic character (excluding C<"+"> or C<"-">), Perl will attempt
to convert
the string to a numeric, and the arithmetic negation is performed. If the
string cannot be cleanly converted to a numeric, Perl will give the warning
B<Argument "the string" isn't numeric in negation (-) at ...>.
X<-> X<negation, arithmetic>

Unary C<"~"> performs bitwise negation, that is, 1's complement. For
example, S<C<0666 & ~027>> is 0640. (See also L</Integer Arithmetic> and
L</Bitwise String Operators>.) Note that the width of the result is
platform-dependent: C<~0> is 32 bits wide on a 32-bit platform, but 64
bits wide on a 64-bit platform, so if you are expecting a certain bit
width, remember to use the C<"&"> operator to mask off the excess bits.
X<~> X<negation, binary>

Starting in Perl 5.28, it is a fatal error to try to complement a string
containing a character with an ordinal value above 255.

If the "bitwise" feature is enabled via S<C<use
feature 'bitwise'>> or C<use v5.28>, then unary
C<"~"> always treats its argument as a number, and an
alternate form of the operator, C<"~.">, always treats its argument as a
string. So C<~0> and C<~"0"> will both give 2**32-1 on 32-bit platforms,
whereas C<~.0> and C<~."0"> will both yield C<"\xff">. Until Perl 5.28,
this feature produced a warning in the C<"experimental::bitwise"> category.

Unary C<"+"> has no effect whatsoever, even on strings. It is useful
syntactically for separating a function name from a parenthesized expression
that would otherwise be interpreted as the complete list of function
arguments. (See examples above under L</Terms and List Operators (Leftward)>.)
X<+>

Unary C<"\"> creates references. If its operand is a single sigilled
thing, it creates a reference to that object.
If its operand is a
parenthesised list, then it creates references to the things mentioned
in the list. Otherwise it puts its operand in list context, and creates
a list of references to the scalars in the list provided by the operand.
See L<perlreftut>
and L<perlref>. Do not confuse this behavior with the behavior of
backslash within a string, although both forms do convey the notion
of protecting the next thing from interpolation.
X<\> X<reference> X<backslash>

=head2 Binding Operators
X<binding> X<operator, binding> X<=~> X<!~>

Binary C<"=~"> binds a scalar expression to a pattern match. Certain operations
search or modify the string C<$_> by default. This operator makes that kind
of operation work on some other string. The right argument is a search
pattern, substitution, or transliteration. The left argument is what is
supposed to be searched, substituted, or transliterated instead of the default
C<$_>. When used in scalar context, the return value generally indicates the
success of the operation. The exceptions are substitution (C<s///>)
and transliteration (C<y///>) with the C</r> (non-destructive) option,
which cause the B<r>eturn value to be the result of the substitution.
Behavior in list context depends on the particular operator.
See L</"Regexp Quote-Like Operators"> for details and L<perlretut> for
examples using these operators.

If the right argument is an expression rather than a search pattern,
substitution, or transliteration, it is interpreted as a search pattern at run
time. Note that this means that its
contents will be interpolated twice, so

    '\\' =~ q'\\';

is not ok, as the regex engine will end up trying to compile the
pattern C<\>, which it will consider a syntax error.

Binary C<"!~"> is just like C<"=~"> except the return value is negated in
the logical sense.
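
As a short sketch of the binding operators described so far: the variable
names and sample strings below are illustrative assumptions, and the C</r>
forms assume Perl 5.14 or later.

```perl
use v5.14;      # the /r modifier needs 5.14; this also enables strict
use warnings;

my $text = "Hello, World";

# =~ makes the match operate on $text instead of the default $_.
my $matched = ($text =~ /World/) ? 1 : 0;       # 1

# !~ gives the logically negated result.
my $absent  = ($text !~ /Perl/) ? 1 : 0;        # 1

# With /r the return value is the modified copy; $text is untouched.
my $swapped = $text =~ s/World/Perl/r;          # "Hello, Perl"
my $upper   = $text =~ tr/a-z/A-Z/r;            # "HELLO, WORLD"

# An expression on the right is compiled as a pattern at run time.
# In list context the match returns its captures.
my $pattern = '(W\w+)';
my ($word)  = $text =~ $pattern;                # "World"

print "$swapped\n$upper\n$word\n";
```

Because the C</r> forms return the new string rather than a success flag,
their results are usually assigned or passed on, not tested for truth.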

Binary C<"!~"> with a non-destructive substitution (C<s///r>) or transliteration
(C<y///r>) is a syntax error.

=head2 Multiplicative Operators
X<operator, multiplicative>

Binary C<"*"> multiplies two numbers.
X<*>

Binary C<"/"> divides two numbers.
X</> X<slash>

Binary C<"%"> is the modulo operator, which computes the division
remainder of its first argument with respect to its second argument.
Given integer
operands C<$m> and C<$n>: If C<$n> is positive, then S<C<$m % $n>> is
C<$m> minus the largest multiple of C<$n> less than or equal to
C<$m>. If C<$n> is negative, then S<C<$m % $n>> is C<$m> minus the
smallest multiple of C<$n> that is not less than C<$m> (that is, the
result will be less than or equal to zero). If the operands
C<$m> and C<$n> are floating point values and the absolute value of
C<$n> (that is C<abs($n)>) is less than S<C<(UV_MAX + 1)>>, only
the integer portion of C<$m> and C<$n> will be used in the operation
(Note: here C<UV_MAX> means the maximum of the unsigned integer type).
If the absolute value of the right operand (C<abs($n)>) is greater than
or equal to S<C<(UV_MAX + 1)>>, C<"%"> computes the floating-point remainder
C<$r> in the equation S<C<($r = $m - $i*$n)>> where C<$i> is a certain
integer that makes C<$r> have the same sign as the right operand
C<$n> (B<not> as the left operand C<$m> like C function C<fmod()>)
and the absolute value less than that of C<$n>.
Note that when S<C<use integer>> is in scope, C<"%"> gives you direct access
to the modulo operator as implemented by your C compiler. This
operator is not as well defined for negative operands, but it will
execute faster.
X<%> X<remainder> X<modulo> X<mod>

Binary C<x> is the repetition operator. In scalar context, or if the
left operand is neither enclosed in parentheses nor a C<qw//> list,
it performs a string repetition.
In that case it supplies scalar
context to the left operand, and returns a string consisting of the
left operand string repeated the number of times specified by the right
operand. If the C<x> is in list context, and the left operand is either
enclosed in parentheses or a C<qw//> list, it performs a list repetition.
In that case it supplies list context to the left operand, and returns
a list consisting of the left operand list repeated the number of times
specified by the right operand.
If the right operand is zero or negative (raising a warning on
negative), it returns an empty string
or an empty list, depending on the context.
X<x>

    print '-' x 80;             # print row of dashes

    print "\t" x ($tab/8), ' ' x ($tab%8);      # tab over

    @ones = (1) x 80;           # a list of 80 1's
    @ones = (5) x @ones;        # set all elements to 5


=head2 Additive Operators
X<operator, additive>

Binary C<"+"> returns the sum of two numbers.
X<+>

Binary C<"-"> returns the difference of two numbers.
X<->

Binary C<"."> concatenates two strings.
X<string, concatenation> X<concatenation>
X<cat> X<concat> X<concatenate> X<.>

=head2 Shift Operators
X<shift operator> X<operator, shift> X<<< << >>>
X<<< >> >>> X<right shift> X<left shift> X<bitwise shift>
X<shl> X<shr> X<shift, right> X<shift, left>

Binary C<<< "<<" >>> returns the value of its left argument shifted left by the
number of bits specified by the right argument. Arguments should be
integers. (See also L</Integer Arithmetic>.)

Binary C<<< ">>" >>> returns the value of its left argument shifted right by
the number of bits specified by the right argument. Arguments should
be integers. (See also L</Integer Arithmetic>.)
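
A minimal sketch of the two shift operators on small non-negative integers
follows; the variable names and the flag layout are illustrative assumptions.

```perl
use strict;
use warnings;

my $left  = 1 << 4;                 # 16: shifting left by n multiplies by 2**n
my $right = 256 >> 2;               # 64: shifting right by n divides by 2**n

# Shifts are commonly used to build and test bit flags.
my $flags    = (1 << 0) | (1 << 3); # bits 0 and 3 set: 9
my $has_bit3 = ($flags >> 3) & 1;   # 1
my $has_bit1 = ($flags >> 1) & 1;   # 0

print "$left $right $flags $has_bit3 $has_bit1\n";
```

The parentheses around each shift are there for readability; shifts bind more
tightly than C<&> and C<|>, so the results would be the same without them.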

If S<C<use integer>> (see L</Integer Arithmetic>) is in force then
signed C integers are used (I<arithmetic shift>), otherwise unsigned C
integers are used (I<logical shift>), even for negative shiftees.
In arithmetic right shift the sign bit is replicated on the left,
in logical shift zero bits come in from the left.

Either way, the implementation isn't going to generate results larger
than the size of the integer type Perl was built with (32 bits or 64 bits).

Shifting by a negative number of bits means the reverse shift: left
shift becomes right shift, right shift becomes left shift. This is
unlike in C, where negative shift is undefined.

Shifting by more bits than the size of the integers means most of the
time zero (all bits fall off), except that under S<C<use integer>>
right overshifting a negative shiftee results in -1. This is unlike
in C, where shifting by too many bits is undefined. A common C
behavior is "shift by modulo wordbits", so that for example

    1 >> 64 == 1 >> (64 % 64) == 1 >> 0 == 1  # Common C behavior.

but that is completely accidental.

If you get tired of being subject to your platform's native integers,
the S<C<use bigint>> pragma neatly sidesteps the issue altogether:

    print 20 << 20;  # 20971520
    print 20 << 40;  # 5120 on 32-bit machines,
                     # 21990232555520 on 64-bit machines
    use bigint;
    print 20 << 100; # 25353012004564588029934064107520

=head2 Named Unary Operators
X<operator, named unary>

The various named unary operators are treated as functions with one
argument, with optional parentheses.

If any list operator (C<print()>, etc.) or any unary operator (C<chdir()>, etc.)
is followed by a left parenthesis as the next token, the operator and
arguments within parentheses are taken to be of highest precedence,
just like a normal function call.
For example,
because named unary operators are higher precedence than C<||>:

    chdir $foo    || die;       # (chdir $foo) || die
    chdir($foo)   || die;       # (chdir $foo) || die
    chdir ($foo)  || die;       # (chdir $foo) || die
    chdir +($foo) || die;       # (chdir $foo) || die

but, because C<"*"> is higher precedence than named operators:

    chdir $foo * 20;    # chdir ($foo * 20)
    chdir($foo) * 20;   # (chdir $foo) * 20
    chdir ($foo) * 20;  # (chdir $foo) * 20
    chdir +($foo) * 20; # chdir ($foo * 20)

    rand 10 * 20;       # rand (10 * 20)
    rand(10) * 20;      # (rand 10) * 20
    rand (10) * 20;     # (rand 10) * 20
    rand +(10) * 20;    # rand (10 * 20)

Regarding precedence, the filetest operators, like C<-f>, C<-M>, etc. are
treated like named unary operators, but they don't follow this functional
parenthesis rule. That means, for example, that C<-f($file).".bak"> is
equivalent to S<C<-f "$file.bak">>.
X<-X> X<filetest> X<operator, filetest>

See also L</"Terms and List Operators (Leftward)">.

=head2 Relational Operators
X<relational operator> X<operator, relational>

Perl operators that return true or false generally return values
that can be safely used as numbers. For example, the relational
operators in this section and the equality operators in the next
one return C<1> for true and, for false, a special version of the
defined empty string, C<"">, which counts as a zero but is exempt from
warnings about improper numeric conversions, just as S<C<"0 but true">> is.

Binary C<< "<" >> returns true if the left argument is numerically less than
the right argument.
X<< < >>

Binary C<< ">" >> returns true if the left argument is numerically greater
than the right argument.
X<< > >>

Binary C<< "<=" >> returns true if the left argument is numerically less than
or equal to the right argument.
X<< <= >>

Binary C<< ">=" >> returns true if the left argument is numerically greater
than or equal to the right argument.
X<< >= >>

Binary C<"lt"> returns true if the left argument is stringwise less than
the right argument.
X<< lt >>

Binary C<"gt"> returns true if the left argument is stringwise greater
than the right argument.
X<< gt >>

Binary C<"le"> returns true if the left argument is stringwise less than
or equal to the right argument.
X<< le >>

Binary C<"ge"> returns true if the left argument is stringwise greater
than or equal to the right argument.
X<< ge >>

A sequence of relational operators, such as S<C<"$x E<lt> $y E<lt>=
$z">>, performs chained comparisons, in the manner described above in
the section L</"Operator Precedence and Associativity">.
Beware that they do not chain with equality operators, which have lower
precedence.

=head2 Equality Operators
X<equality> X<equal> X<equals> X<operator, equality>

Binary C<< "==" >> returns true if the left argument is numerically equal to
the right argument.
X<==>

Binary C<< "!=" >> returns true if the left argument is numerically not equal
to the right argument.
X<!=>

Binary C<"eq"> returns true if the left argument is stringwise equal to
the right argument.
X<eq>

Binary C<"ne"> returns true if the left argument is stringwise not equal
to the right argument.
X<ne>

A sequence of the above equality operators, such as S<C<"$x == $y ==
$z">>, performs chained comparisons, in the manner described above in
the section L</"Operator Precedence and Associativity">.
Beware that they do not chain with relational operators, which have
higher precedence.

Binary C<< "<=>" >> returns -1, 0, or 1 depending on whether the left
argument is numerically less than, equal to, or greater than the right
argument.
If your platform supports C<NaN>'s (not-a-numbers) as numeric
values, using them with C<< "<=>" >> returns undef. C<NaN> is not
C<< "<" >>, C<< "==" >>, C<< ">" >>, C<< "<=" >> or C<< ">=" >> anything
(even C<NaN>), so those 5 return false. S<C<< NaN != NaN >>> returns
true, as does S<C<NaN !=> I<anything else>>. If your platform doesn't
support C<NaN>'s then C<NaN> is just a string with numeric value 0.
X<< <=> >>
X<spaceship>

    $ perl -le '$x = "NaN"; print "No NaN support here" if $x == $x'
    $ perl -le '$x = "NaN"; print "NaN support here" if $x != $x'

(Note that the L<bigint>, L<bigrat>, and L<bignum> pragmas all
support C<"NaN">.)

Binary C<"cmp"> returns -1, 0, or 1 depending on whether the left
argument is stringwise less than, equal to, or greater than the right
argument.

Here we can see the difference between C<< <=> >> and C<cmp>:

    print 10 <=> 2;     # prints 1
    print 10 cmp 2;     # prints -1

(likewise between C<gt> and C<< > >>, C<lt> and C<< < >>, etc.)
X<cmp>

Binary C<"~~"> does a smartmatch between its arguments. Smart matching
is described in the next section.
X<~~>

The two-sided ordering operators C<"E<lt>=E<gt>"> and C<"cmp">, and the
smartmatch operator C<"~~">, are non-associative with respect to each
other and with respect to the equality operators of the same precedence.

C<"lt">, C<"le">, C<"ge">, C<"gt"> and C<"cmp"> use the collation (sort)
order specified by the current C<LC_COLLATE> locale if a S<C<use
locale>> form that includes collation is in effect. See L<perllocale>.
Do not mix these with Unicode;
only use them with legacy 8-bit locale encodings.
The standard C<L<Unicode::Collate>> and
C<L<Unicode::Collate::Locale>> modules offer much more powerful
solutions to collation issues.

For case-insensitive comparisons, look at the L<perlfunc/fc> case-folding
function, available in Perl v5.16 or later:

    if ( fc($x) eq fc($y) ) { ...
    }

=head2 Class Instance Operator
X<isa operator>

Binary C<isa> evaluates to true when the left argument is an object instance of
the class (or a subclass derived from that class) given by the right argument.
If the left argument is not defined, is not a blessed object instance, or does
not derive from the class given by the right argument, the operator evaluates
as false. The right argument may give the class either as a bareword or a
scalar expression that yields a string class name:

    if( $obj isa Some::Class ) { ... }

    if( $obj isa "Different::Class" ) { ... }
    if( $obj isa $name_of_class ) { ... }

This feature is available from Perl 5.31.6 onwards when enabled by
C<use feature 'isa'>. This feature is enabled automatically by a
C<use v5.36> (or higher) declaration in the current scope.

=head2 Smartmatch Operator

First available in Perl 5.10.1 (the 5.10.0 version behaved differently),
binary C<~~> does a "smartmatch" between its arguments. This is mostly
used implicitly in the C<when> construct described in L<perlsyn>, although
not all C<when> clauses call the smartmatch operator. Unique among all of
Perl's operators, the smartmatch operator can recurse. The smartmatch
operator is L<experimental|perlpolicy/experimental> and its behavior is
subject to change.

It is also unique in that all other Perl operators impose a context
(usually string or numeric context) on their operands, autoconverting
those operands to those imposed contexts. In contrast, smartmatch
I<infers> contexts from the actual types of its operands and uses that
type information to select a suitable comparison mechanism.

The C<~~> operator compares its operands "polymorphically", determining how
to compare them according to their actual types (numeric, string, array,
hash, etc.).
Like the equality operators with which it shares the same
precedence, C<~~> returns 1 for true and C<""> for false. It is often best
read aloud as "in", "inside of", or "is contained in", because the left
operand is often looked for I<inside> the right operand. That makes the
order of the operands to the smartmatch operator often opposite that of
the regular match operator. In other words, the "smaller" thing is usually
placed in the left operand and the larger one in the right.

The behavior of a smartmatch depends on what type of things its arguments
are, as determined by the following table. The first row of the table
whose types apply determines the smartmatch behavior. Because what
actually happens is mostly determined by the type of the second operand,
the table is sorted on the right operand instead of on the left.

    Left      Right      Description and pseudocode
    ===============================================================
    Any       undef      check whether Any is undefined
                         like: !defined Any

    Any       Object     invoke ~~ overloading on Object, or die

Right operand is an ARRAY:

    Left      Right      Description and pseudocode
    ===============================================================
    ARRAY1    ARRAY2     recurse on paired elements of ARRAY1 and ARRAY2[2]
                         like: (ARRAY1[0] ~~ ARRAY2[0])
                                 && (ARRAY1[1] ~~ ARRAY2[1]) && ...
728 HASH ARRAY any ARRAY elements exist as HASH keys 729 like: grep { exists HASH->{$_} } ARRAY 730 Regexp ARRAY any ARRAY elements pattern match Regexp 731 like: grep { /Regexp/ } ARRAY 732 undef ARRAY undef in ARRAY 733 like: grep { !defined } ARRAY 734 Any ARRAY smartmatch each ARRAY element[3] 735 like: grep { Any ~~ $_ } ARRAY 736 737 Right operand is a HASH: 738 739 Left Right Description and pseudocode 740 =============================================================== 741 HASH1 HASH2 all same keys in both HASHes 742 like: keys HASH1 == 743 grep { exists HASH2->{$_} } keys HASH1 744 ARRAY HASH any ARRAY elements exist as HASH keys 745 like: grep { exists HASH->{$_} } ARRAY 746 Regexp HASH any HASH keys pattern match Regexp 747 like: grep { /Regexp/ } keys HASH 748 undef HASH always false (undef cannot be a key) 749 like: 0 == 1 750 Any HASH HASH key existence 751 like: exists HASH->{Any} 752 753 Right operand is CODE: 754 755 Left Right Description and pseudocode 756 =============================================================== 757 ARRAY CODE sub returns true on all ARRAY elements[1] 758 like: !grep { !CODE->($_) } ARRAY 759 HASH CODE sub returns true on all HASH keys[1] 760 like: !grep { !CODE->($_) } keys HASH 761 Any CODE sub passed Any returns true 762 like: CODE->(Any) 763 764 Right operand is a Regexp: 765 766 Left Right Description and pseudocode 767 =============================================================== 768 ARRAY Regexp any ARRAY elements match Regexp 769 like: grep { /Regexp/ } ARRAY 770 HASH Regexp any HASH keys match Regexp 771 like: grep { /Regexp/ } keys HASH 772 Any Regexp pattern match 773 like: Any =~ /Regexp/ 774 775 Other: 776 777 Left Right Description and pseudocode 778 =============================================================== 779 Object Any invoke ~~ overloading on Object, 780 or fall back to... 
781 782 Any Num numeric equality 783 like: Any == Num 784 Num nummy[4] numeric equality 785 like: Num == nummy 786 undef Any check whether undefined 787 like: !defined(Any) 788 Any Any string equality 789 like: Any eq Any 790 791 792Notes: 793 794=over 795 796=item 1. 797Empty hashes or arrays match. 798 799=item 2. 800That is, each element smartmatches the element of the same index in the other array.[3] 801 802=item 3. 803If a circular reference is found, fall back to referential equality. 804 805=item 4. 806Either an actual number, or a string that looks like one. 807 808=back 809 810The smartmatch implicitly dereferences any non-blessed hash or array 811reference, so the C<I<HASH>> and C<I<ARRAY>> entries apply in those cases. 812For blessed references, the C<I<Object>> entries apply. Smartmatches 813involving hashes only consider hash keys, never hash values. 814 815The "like" code entry is not always an exact rendition. For example, the 816smartmatch operator short-circuits whenever possible, but C<grep> does 817not. Also, C<grep> in scalar context returns the number of matches, but 818C<~~> returns only true or false. 819 820Unlike most operators, the smartmatch operator knows to treat C<undef> 821specially: 822 823 use v5.10.1; 824 @array = (1, 2, 3, undef, 4, 5); 825 say "some elements undefined" if undef ~~ @array; 826 827Each operand is considered in a modified scalar context, the modification 828being that array and hash variables are passed by reference to the 829operator, which implicitly dereferences them. 
Both statements in each of the following pairs behave the same:

    use v5.10.1;

    my %hash = (red    => 1, blue   => 2, green  => 3,
                orange => 4, yellow => 5, purple => 6,
                black  => 7, grey   => 8, white  => 9);

    my @array = qw(red blue green);

    say "some array elements in hash keys" if  @array ~~  %hash;
    say "some array elements in hash keys" if \@array ~~ \%hash;

    say "red in array" if "red" ~~  @array;
    say "red in array" if "red" ~~ \@array;

    say "some keys end in e" if /e$/ ~~  %hash;
    say "some keys end in e" if /e$/ ~~ \%hash;

Two arrays smartmatch if each element in the first array smartmatches
(that is, is "in") the corresponding element in the second array,
recursively.

    use v5.10.1;
    my @little = qw(red blue green);
    my @bigger = ("red", "blue", [ "orange", "green" ] );
    if (@little ~~ @bigger) {  # true!
        say "little is contained in bigger";
    }

Because the smartmatch operator recurses on nested arrays, this
will still report that "red" is in the array.

    use v5.10.1;
    my @array = qw(red blue green);
    my $nested_array = [[[[[[[ @array ]]]]]]];
    say "red in array" if "red" ~~ $nested_array;

If two arrays smartmatch each other, then they are deep
copies of each other's values, as this example reports:

    use v5.12.0;
    my @a = (0, 1, 2, [3, [4, 5], 6], 7);
    my @b = (0, 1, 2, [3, [4, 5], 6], 7);

    if (@a ~~ @b && @b ~~ @a) {
        say "a and b are deep copies of each other";
    }
    elsif (@a ~~ @b) {
        say "a smartmatches in b";
    }
    elsif (@b ~~ @a) {
        say "b smartmatches in a";
    }
    else {
        say "a and b don't smartmatch each other at all";
    }

If you were to set S<C<$b[3] = 4>>, then instead of reporting that "a and b
are deep copies of each other", it now reports that C<"b smartmatches in a">.
That's because the corresponding position in C<@a> contains an array that
(eventually) has a 4 in it.
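
As a rough illustration of the recursion described above, the following
sketch compares nested arrays without using C<~~>. The helper name
C<deep_equal> is hypothetical and not part of Perl. It is deliberately
stricter than smartmatch: it accepts only array references of equal length
and defined plain scalars compared with C<eq>, so it does not reproduce the
"in" behavior of the C<Any ~~ ARRAY> row.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration, not a Perl builtin).
# Recurses on paired elements like the ARRAY1 ~~ ARRAY2 row, but treats
# everything else as a plain string comparison.
sub deep_equal {
    my ($x, $y) = @_;

    if (ref($x) eq 'ARRAY' && ref($y) eq 'ARRAY') {
        return 0 unless @$x == @$y;              # lengths must agree
        for my $i (0 .. $#$x) {
            return 0 unless deep_equal($x->[$i], $y->[$i]);
        }
        return 1;
    }

    return 0 if ref($x) || ref($y);              # mixed or unsupported types
    return 0 unless defined $x && defined $y;    # undef never matches here
    return $x eq $y ? 1 : 0;
}

my @a = (0, 1, 2, [3, [4, 5], 6], 7);
my @b = (0, 1, 2, [3, [4, 5], 6], 7);

print deep_equal(\@a, \@b) ? "same structure\n" : "different\n";
```

Because this helper is symmetric, setting S<C<$b[3] = 4>> makes it report a
difference in both directions, whereas C<~~> would still match one way.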

Smartmatching one hash against another reports whether both contain the
same keys, no more and no less. This could be used to see whether two
records have the same field names, without caring what values those fields
might have. For example:

    use v5.10.1;
    sub make_dogtag {
        state $REQUIRED_FIELDS = { name=>1, rank=>1, serial_num=>1 };

        my ($class, $init_fields) = @_;

        die "Must supply (only) name, rank, and serial number"
            unless $init_fields ~~ $REQUIRED_FIELDS;

        ...
    }

However, this only does what you mean if C<$init_fields> is indeed a hash
reference. The condition C<$init_fields ~~ $REQUIRED_FIELDS> also allows the
strings C<"name">, C<"rank">, C<"serial_num"> as well as any array reference
that contains C<"name"> or C<"rank"> or C<"serial_num"> anywhere to pass
through.

The smartmatch operator is most often used as the implicit operator of a
C<when> clause. See the section on "Switch Statements" in L<perlsyn>.

=head3 Smartmatching of Objects

To avoid relying on an object's underlying representation, if the
smartmatch's right operand is an object that doesn't overload C<~~>,
it raises the exception "C<Smartmatching a non-overloaded object
breaks encapsulation>". That's because one has no business digging
around to see whether something is "in" an object. These are all
illegal on objects without a C<~~> overload:

    %hash ~~ $object
       42 ~~ $object
   "fred" ~~ $object

However, you can change the way an object is smartmatched by overloading
the C<~~> operator. This is allowed to
extend the usual smartmatch semantics.
For objects that do have a C<~~> overload, see L<overload>.

Using an object as the left operand is allowed, although not very useful.
Smartmatching rules take precedence over overloading, so even if the
object in the left operand has smartmatch overloading, this will be
ignored.
A left operand that is a non-overloaded object falls back on a
string or numeric comparison of whatever the C<ref> operator returns. That
means that

    $object ~~ X

does I<not> invoke the overload method with C<I<X>> as an argument.
Instead the above table is consulted as normal, and based on the type of
C<I<X>>, overloading may or may not be invoked. For simple strings or
numbers, "in" becomes equivalent to this:

    $object ~~ $number          ref($object) == $number
    $object ~~ $string          ref($object) eq $string

For example, this reports that the handle smells IOish
(but please don't really do this!):

    use IO::Handle;
    my $fh = IO::Handle->new();
    if ($fh ~~ /\bIO\b/) {
        say "handle smells IOish";
    }

That's because it treats C<$fh> as a string like
C<"IO::Handle=GLOB(0x8039e0)">, then pattern matches against that.

=head2 Bitwise And
X<operator, bitwise, and> X<bitwise and> X<&>

Binary C<"&"> returns its operands ANDed together bit by bit. Although no
warning is currently raised, the result is not well defined when this operation
is performed on operands that are neither numbers (see
L</Integer Arithmetic>) nor bitstrings (see L</Bitwise String Operators>).

Note that C<"&"> has lower priority than relational operators, so for example
the parentheses are essential in a test like

    print "Even\n" if ($x & 1) == 0;

If the "bitwise" feature is enabled via S<C<use feature 'bitwise'>> or
C<use v5.28>, then this operator always treats its operands as numbers.
Before Perl 5.28 this feature produced a warning in the
C<"experimental::bitwise"> category.

=head2 Bitwise Or and Exclusive Or
X<operator, bitwise, or> X<bitwise or> X<|> X<operator, bitwise, xor>
X<bitwise xor> X<^>

Binary C<"|"> returns its operands ORed together bit by bit.

Binary C<"^"> returns its operands XORed together bit by bit.
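
A minimal sketch of these operators on small integers follows. The variable
names are illustrative only, and C<&> is included for comparison.

```perl
use strict;
use warnings;

my $flags = 0b1010;            # decimal 10
my $mask  = 0b0110;            # decimal 6

my $or  = $flags | $mask;      # bits set in either operand
my $xor = $flags ^ $mask;      # bits set in exactly one operand
my $and = $flags & $mask;      # bits set in both operands

printf "or=%04b xor=%04b and=%04b\n", $or, $xor, $and;
# prints "or=1110 xor=1100 and=0010"
```

Both operands here are numbers, so the result is well defined whether or not
the "bitwise" feature is enabled.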

Although no warning is currently raised, the results are not well
defined when these operations are performed on operands that are neither
numbers (see L</Integer Arithmetic>) nor bitstrings (see L</Bitwise String
Operators>).

Note that C<"|"> and C<"^"> have lower priority than relational operators, so
for example the parentheses are essential in a test like

    print "false\n" if (8 | 2) != 10;

If the "bitwise" feature is enabled via S<C<use feature 'bitwise'>> or
C<use v5.28>, then these operators always treat their operands as numbers.
Before Perl 5.28 this feature produced a warning in the
C<"experimental::bitwise"> category.

=head2 C-style Logical And
X<&&> X<logical and> X<operator, logical, and>

Binary C<"&&"> performs a short-circuit logical AND operation. That is,
if the left operand is false, the right operand is not even evaluated.
Scalar or list context propagates down to the right operand if it
is evaluated.

=head2 C-style Logical Or
X<||> X<operator, logical, or>

Binary C<"||"> performs a short-circuit logical OR operation. That is,
if the left operand is true, the right operand is not even evaluated.
Scalar or list context propagates down to the right operand if it
is evaluated.

=head2 C-style Logical Xor
X<^^> X<operator, logical, xor>

Binary C<"^^"> performs a logical XOR operation. Both operands are
evaluated and the result is true only if exactly one of the operands is true.
Scalar or list context propagates down to the right operand.

=head2 Logical Defined-Or
X<//> X<operator, logical, defined-or>

Although it has no direct equivalent in C, Perl's C<//> operator is related
to its C-style "or". In fact, it's exactly the same as C<||>, except that it
tests the left hand side's definedness instead of its truth.
Thus,
S<C<< EXPR1 // EXPR2 >>> returns the value of C<< EXPR1 >> if it's defined;
otherwise, the value of C<< EXPR2 >> is returned.
(C<< EXPR1 >> is evaluated in scalar context, C<< EXPR2 >>
in the context of C<< // >> itself). Usually,
this is the same result as S<C<< defined(EXPR1) ? EXPR1 : EXPR2 >>> (except that
the ternary-operator form can be used as an lvalue, while S<C<< EXPR1 // EXPR2 >>>
cannot). This is very useful for
providing default values for variables. If you actually want to test if
at least one of C<$x> and C<$y> is defined, use S<C<defined($x // $y)>>.

The C<||>, C<//> and C<&&> operators return the last value evaluated
(unlike C's C<||> and C<&&>, which return 0 or 1). Thus, a reasonably
portable way to find out the home directory might be:

    $home =  $ENV{HOME}
          // $ENV{LOGDIR}
          // (getpwuid($<))[7]
          // die "You're homeless!\n";

In particular, this means that you shouldn't use C<||>
for selecting between two aggregates for assignment:

    @a = @b || @c;            # This doesn't do the right thing
    @a = scalar(@b) || @c;    # because it really means this.
    @a = @b ? @b : @c;        # This works fine, though.

As alternatives to C<&&> and C<||> when used for
control flow, Perl provides the C<and> and C<or> operators (see below).
The short-circuit behavior is identical.
The precedence of C<"and">
and C<"or"> is much lower, however, so that you can safely use them after a
list operator without the need for parentheses:

    unlink "alpha", "beta", "gamma"
            or gripe(), next LINE;

With the C-style operators that would have been written like this:

    unlink("alpha", "beta", "gamma")
            || (gripe(), next LINE);

It would be even more readable to write that this way:

    unless(unlink("alpha", "beta", "gamma")) {
        gripe();
        next LINE;
    }

Using C<"or"> for assignment is unlikely to do what you want; see below.

=head2 Range Operators
X<operator, range> X<range> X<..> X<...>

Binary C<".."> is the range operator, which is really two different
operators depending on the context. In list context, it returns a
list of values counting (up by ones) from the left value to the right
value. If the left value is greater than the right value then it
returns the empty list. The range operator is useful for writing
S<C<foreach (1..10)>> loops and for doing slice operations on arrays. In
the current implementation, no temporary array is created when the
range operator is used as the expression in C<foreach> loops, but older
versions of Perl might burn a lot of memory when you write something
like this:

    for (1 .. 1_000_000) {
        # code
    }

The range operator also works on strings, using the magical
auto-increment, see below.

In scalar context, C<".."> returns a boolean value. The operator is
bistable, like a flip-flop, and emulates the line-range (comma)
operator of B<sed>, B<awk>, and various editors. Each C<".."> operator
maintains its own boolean state, even across calls to a subroutine
that contains it. It is false as long as its left operand is false.
Once the left operand is true, the range operator stays true until the
right operand is true, I<AFTER> which the range operator becomes false
again. It doesn't become false till the next time the range operator
is evaluated. It can test the right operand and become false on the
same evaluation it became true (as in B<awk>), but it still returns
true once. If you don't want it to test the right operand until the
next evaluation, as in B<sed>, just use three dots (C<"...">) instead of
two. In all other regards, C<"..."> behaves just like C<".."> does.

The right operand is not evaluated while the operator is in the
"false" state, and the left operand is not evaluated while the
operator is in the "true" state. The precedence is a little lower
than C<||> and C<&&>. The value returned is either the empty string for
false, or a sequence number (beginning with 1) for true. The sequence
number is reset for each range encountered. The final sequence number
in a range has the string C<"E0"> appended to it, which doesn't affect
its numeric value, but gives you something to search for if you want
to exclude the endpoint. You can exclude the beginning point by
waiting for the sequence number to be greater than 1.

If either operand of scalar C<".."> is a constant expression,
that operand is considered true if it is equal (C<==>) to the current
input line number (the C<$.> variable).

To be pedantic, the comparison is actually S<C<int(EXPR) == int(EXPR)>>,
but that is only an issue if you use a floating point expression; when
implicitly using C<$.> as described in the previous paragraph, the
comparison is S<C<int(EXPR) == int($.)>> which is only an issue when C<$.>
is set to a floating point value and you are not reading from a file.
Furthermore, S<C<"span" .. "spat">> or S<C<2.18 ..
3.14>> will not do what
you want in scalar context because each of the operands is evaluated
using its integer representation.

Examples:

As a scalar operator:

    if (101 .. 200) { print; } # print 2nd hundred lines, short for
                               #  if ($. == 101 .. $. == 200) { print; }

    next LINE if (1 .. /^$/);  # skip header lines, short for
                               #   next LINE if ($. == 1 .. /^$/);
                               # (typically in a loop labeled LINE)

    s/^/> / if (/^$/ .. eof());  # quote body

    # parse mail messages
    while (<>) {
        $in_header =   1  .. /^$/;
        $in_body   = /^$/ .. eof;
        if ($in_header) {
            # do something
        } else { # in body
            # do something else
        }
    } continue {
        close ARGV if eof;             # reset $. each file
    }

Here's a simple example to illustrate the difference between
the two range operators:

    @lines = ("   - Foo",
              "01 - Bar",
              "1  - Baz",
              "   - Quux");

    foreach (@lines) {
        if (/0/ .. /1/) {
            print "$_\n";
        }
    }

This program will print only the line containing "Bar". If
the range operator is changed to C<...>, it will also print the
"Baz" line.

And now some examples as a list operator:

    for (101 .. 200) { print }      # print $_ 100 times
    @foo = @foo[0 .. $#foo];        # an expensive no-op
    @foo = @foo[$#foo-4 .. $#foo];  # slice last 5 items

Because each operand is evaluated in integer form, S<C<2.18 .. 3.14>> will
return two elements in list context.

    @list = (2.18 .. 3.14); # same as @list = (2 .. 3);

The range operator in list context can make use of the magical
auto-increment algorithm if both operands are strings, subject to the
following rules:

=over

=item *

With one exception (below), if both strings look like numbers to Perl,
the magic increment will not be applied, and the strings will be treated
as numbers (more specifically, integers) instead.

For example, C<"-2".."2"> is the same as C<-2..2>, and
C<"2.18".."3.14"> produces C<2, 3>.

=item *

The exception to the above rule is when the left-hand string begins with
C<0> and is longer than one character; in this case the magic increment
I<will> be applied, even though strings like C<"01"> would normally look
like a number to Perl.

For example, C<"01".."04"> produces C<"01", "02", "03", "04">, and
C<"00".."-1"> produces C<"00"> through C<"99"> - this may seem
surprising, but see the following rules for why it works this way.
To get dates with leading zeros, you can say:

    @z2 = ("01" .. "31");
    print $z2[$mday];

If you want to force strings to be interpreted as numbers, you could say

    @numbers = ( 0+$first .. 0+$last );

B<Note:> In Perl versions 5.30 and below, I<any> string on the left-hand
side beginning with C<"0">, including the string C<"0"> itself, would
cause the magic string increment behavior. This means that on these Perl
versions, C<"0".."-1"> would produce C<"0"> through C<"99">, which was
inconsistent with C<0..-1>, which produces the empty list. This also means
that C<"0".."9"> now produces a list of integers instead of a list of
strings.

=item *

If the initial value specified isn't part of a magical increment
sequence (that is, a non-empty string matching C</^[a-zA-Z]*[0-9]*\z/>),
only the initial value will be returned.

For example, C<"ax".."az"> produces C<"ax", "ay", "az">, but
C<"*x".."az"> produces only C<"*x">.

=item *

For other initial values that are strings that do follow the rules of the
magical increment, the corresponding sequence will be returned.

For example, you can say

    @alphabet = ("A" .. "Z");

to get all normal letters of the English alphabet, or

    $hexdigit = (0 .. 9, "a" .. "f")[$num & 15];

to get a hexadecimal digit.

=item *

If the final value specified is not in the sequence that the magical
increment would produce, the sequence goes until the next value would
be longer than the final value specified. If the length of the final
string is shorter than the first, the empty list is returned.

For example, C<"a".."--"> is the same as C<"a".."zz">, C<"0".."xx">
produces C<"0"> through C<"99">, and C<"aaa".."--"> returns the empty
list.

=back

As of Perl 5.26, the list-context range operator on strings works as expected
in the scope of L<< S<C<"use feature 'unicode_strings'">>|feature/The 'unicode_strings' feature >>. In previous versions, and outside the scope of
that feature, it exhibits L<perlunicode/The "Unicode Bug">: its behavior
depends on the internal encoding of the range endpoint.

Because the magical increment only works on non-empty strings matching
C</^[a-zA-Z]*[0-9]*\z/>, the following will only return an alpha:

    use charnames "greek";
    my @greek_small =  ("\N{alpha}" .. "\N{omega}");

To get the 25 traditional lowercase Greek letters, including both sigmas,
you could use this instead:

    use charnames "greek";
    my @greek_small =  map { chr } ( ord("\N{alpha}")
                                                ..
                                     ord("\N{omega}")
                                   );

However, because there are I<many> other lowercase Greek characters than
just those, to match lowercase Greek characters in a regular expression,
you could use the pattern C</(?:(?=\p{Greek})\p{Lower})+/> (or the
L<experimental feature|perlrecharclass/Extended Bracketed Character
Classes> C<S</(?[ \p{Greek} & \p{Lower} ])+/>>).

=head2 Conditional Operator
X<operator, conditional> X<operator, ternary> X<ternary> X<?:>

Ternary C<"?:"> is the conditional operator, just as in C. It works much
like an if-then-else. If the argument before the C<?> is true, the
argument before the C<:> is returned; otherwise the argument after the
C<:> is returned. For example:

    printf "I have %d dog%s.\n", $n,
            ($n == 1) ? "" : "s";

Scalar or list context propagates downward into the 2nd
or 3rd argument, whichever is selected.

    $x = $ok ? $y : $z;  # get a scalar
    @x = $ok ? @y : @z;  # get an array
    $x = $ok ? @y : @z;  # oops, that's just a count!

The operator may be assigned to if both the 2nd and 3rd arguments are
legal lvalues (meaning that you can assign to them):

    ($x_or_y ? $x : $y) = $z;

Because this operator produces an assignable result, using assignments
without parentheses will get you in trouble. For example, this:

    $x % 2 ? $x += 10 : $x += 2

really means this:

    (($x % 2) ? ($x += 10) : $x) += 2

rather than this:

    ($x % 2) ? ($x += 10) : ($x += 2)

That should probably be written more simply as:

    $x += ($x % 2) ? 10 : 2;

=head2 Assignment Operators
X<assignment> X<operator, assignment> X<=> X<**=> X<+=> X<*=> X<&=>
X<<< <<= >>> X<&&=> X<-=> X</=> X<|=> X<<< >>= >>> X<||=> X<//=> X<.=>
X<%=> X<^=> X<x=> X<&.=> X<|.=> X<^.=>

C<"="> is the ordinary assignment operator.

Assignment operators work as in C.
That is,

    $x += 2;

is equivalent to

    $x = $x + 2;

although without duplicating any side effects that dereferencing the lvalue
might trigger, such as from C<tie()>. Other assignment operators work similarly.
The following are recognized:

    **=    +=    *=    &=    &.=    <<=    &&=
           -=    /=    |=    |.=    >>=    ||=
           .=    %=    ^=    ^.=           //=
                 x=

Although these are grouped by family, they all have the precedence
of assignment. These combined assignment operators can only operate on
scalars, whereas the ordinary assignment operator can assign to arrays,
hashes, lists and even references. (See L<"Context"|perldata/Context>
and L<perldata/List value constructors>, and L<perlref/Assigning to
References>.)

Unlike in C, the scalar assignment operator produces a valid lvalue.
Modifying an assignment is equivalent to doing the assignment and
then modifying the variable that was assigned to. This is useful
for modifying a copy of something, like this:

    ($tmp = $global) =~ tr/13579/24680/;

Although as of 5.14, that can also be accomplished this way:

    use v5.14;
    $tmp = ($global =~ tr/13579/24680/r);

Likewise,

    ($x += 2) *= 3;

is equivalent to

    $x += 2;
    $x *= 3;

Similarly, a list assignment in list context produces the list of
lvalues assigned to, and a list assignment in scalar context returns
the number of elements produced by the expression on the right hand
side of the assignment.

The three dotted bitwise assignment operators (C<&.=> C<|.=> C<^.=>) are new in
Perl 5.22. See L</Bitwise String Operators>.

=head2 Comma Operator
X<comma> X<operator, comma> X<,>

Binary C<","> is the comma operator. In scalar context it evaluates
its left argument, throws that value away, then evaluates its right
argument and returns that value. This is just like C's comma operator.
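
The scalar-context behavior can be seen in a short sketch. The subs
C<left> and C<right> are hypothetical and exist only to record the order
of evaluation.

```perl
use strict;
use warnings;

my @calls;

# Hypothetical subs that log when they are called.
sub left  { push @calls, "left";  return "L" }
sub right { push @calls, "right"; return "R" }

# The parentheses are needed because assignment binds more tightly
# than the comma operator.
my $result = (left(), right());

print "result=$result calls=@calls\n";
# prints "result=R calls=left right"
```

Both subs run, left first, but only the value of the right operand is kept.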

In list context, it's just the list argument separator, and inserts
both its arguments into the list. These arguments are also evaluated
from left to right.

The C<< => >> operator (sometimes pronounced "fat comma") is a synonym
for the comma except that it causes a
word on its left to be interpreted as a string if it begins with a letter
or underscore and is composed only of letters, digits and underscores.
This includes operands that might otherwise be interpreted as operators,
constants, single number v-strings or function calls. If in doubt about
this behavior, the left operand can be quoted explicitly.

Otherwise, the C<< => >> operator behaves exactly as the comma operator
or list argument separator, according to context.

For example:

    use constant FOO => "something";

    my %h = ( FOO => 23 );

is equivalent to:

    my %h = ("FOO", 23);

It is I<NOT>:

    my %h = ("something", 23);

The C<< => >> operator is helpful in documenting the correspondence
between keys and values in hashes, and other paired elements in lists.

    %hash = ( $key => $value );
    login( $username => $password );

The special quoting behavior ignores precedence, and hence may apply to
I<part> of the left operand:

    print time.shift => "bbb";

That example prints something like C<"1314363215shiftbbb">, because the
C<< => >> implicitly quotes the C<shift> immediately on its left, ignoring
the fact that C<time.shift> is the entire left operand.

=head2 List Operators (Rightward)
X<operator, list, rightward> X<list operator>

On the right side of a list operator, the comma has very low precedence,
such that it controls all comma-separated expressions found there.
The only operators with lower precedence are the logical operators
C<"and">, C<"or">, and C<"not">, which may be used to evaluate calls to list
operators without the need for parentheses:

    open HANDLE, "< :encoding(UTF-8)", "filename"
        or die "Can't open: $!\n";

However, some people find that code harder to read than writing
it with parentheses:

    open(HANDLE, "< :encoding(UTF-8)", "filename")
        or die "Can't open: $!\n";

in which case you might as well just use the more customary C<"||"> operator:

    open(HANDLE, "< :encoding(UTF-8)", "filename")
        || die "Can't open: $!\n";

See also discussion of list operators in L</Terms and List Operators (Leftward)>.

=head2 Logical Not
X<operator, logical, not> X<not>

Unary C<"not"> returns the logical negation of the expression to its right.
It's the equivalent of C<"!"> except for the very low precedence.

=head2 Logical And
X<operator, logical, and> X<and>

Binary C<"and"> returns the logical conjunction of the two surrounding
expressions. It's equivalent to C<&&> except for the very low
precedence. This means that it short-circuits: the right
expression is evaluated only if the left expression is true.

=head2 Logical or and Exclusive Or
X<operator, logical, or> X<operator, logical, xor>
X<operator, logical, exclusive or>
X<or> X<xor>

Binary C<"or"> returns the logical disjunction of the two surrounding
expressions. It's equivalent to C<||> except for the very low precedence.
This makes it useful for control flow:

    print FH $data or die "Can't write to FH: $!";

This means that it short-circuits: the right expression is evaluated
only if the left expression is false. Due to its precedence, you must
be careful to avoid using it as a replacement for the C<||> operator.
It usually works out better for flow control than in assignments:

    $x = $y or $z;              # bug: this is wrong
    ($x = $y) or $z;            # really means this
    $x = $y || $z;              # better written this way

However, when it's a list-context assignment and you're trying to use
C<||> for control flow, you probably need C<"or"> so that the assignment
takes higher precedence.

    @info = stat($file) || die;     # oops, scalar sense of stat!
    @info = stat($file) or die;     # better, now @info gets its due

Then again, you could always use parentheses.

Binary C<"xor"> returns the exclusive-OR of the two surrounding expressions.
It cannot short-circuit (of course).

There is no low precedence operator for defined-OR.

=head2 C Operators Missing From Perl
X<operator, missing from perl> X<&> X<*>
X<typecasting> X<(TYPE)>

Here is what C has that Perl doesn't:

=over 8

=item unary &

Address-of operator. (But see the C<"\"> operator for taking a reference.)

=item unary *

Dereference-address operator. (Perl's prefix dereferencing
operators are typed: C<$>, C<@>, C<%>, and C<&>.)

=item (TYPE)

Type-casting operator.

=back

=head2 Quote and Quote-like Operators
X<operator, quote> X<operator, quote-like> X<q> X<qq> X<qx> X<qw> X<m>
X<qr> X<s> X<tr> X<'> X<''> X<"> X<""> X<//> X<`> X<``> X<<< << >>>
X<escape sequence> X<escape>

While we usually think of quotes as literal values, in Perl they
function as operators, providing various kinds of interpolating and
pattern matching capabilities. Perl provides customary quote characters
for these behaviors, but also provides a way for you to choose your
quote character for any of them. In the following table, a C<{}> represents
any pair of delimiters you choose.

    Customary  Generic        Meaning        Interpolates
        ''       q{}          Literal             no
        ""      qq{}          Literal             yes
        ``      qx{}          Command             yes*
                qw{}         Word list            no
        //       m{}       Pattern match          yes*
                qr{}          Pattern             yes*
                 s{}{}      Substitution          yes*
                tr{}{}    Transliteration         no (but see below)
                 y{}{}    Transliteration         no (but see below)
        <<EOF                 here-doc            yes*

        * unless the delimiter is ''.

Non-bracketing delimiters use the same character fore and aft, but the four
sorts of ASCII brackets (round, angle, square, curly) all nest, which means
that

    q{foo{bar}baz}

is the same as

    'foo{bar}baz'

Note, however, that this does not always work for quoting Perl code:

    $s = q{ if($x eq "}") ... }; # WRONG

is a syntax error. The C<L<Text::Balanced>> module (standard as of v5.8,
and from CPAN before then) is able to do this properly.

If the C<extra_paired_delimiters> feature is enabled, then Perl will
additionally recognise a variety of Unicode characters as being paired. For
a full list, see the L</List of Extra Paired Delimiters> at the end of this
document.

There can (and in some cases, must) be whitespace between the operator
and the quoting
characters, except when C<#> is being used as the quoting character.
C<q#foo#> is parsed as the string C<foo>, while S<C<q #foo#>> is the
operator C<q> followed by a comment. Its argument will be taken
from the next line. This allows you to write:

    s {foo}  # Replace foo
      {bar}  # with bar.

The cases where whitespace must be used are when the quoting character
is a word character (meaning it matches C</\w/>):

    q XfooX # Works: means the string 'foo'
    qXfooX  # WRONG!

The following escape sequences are available in constructs that interpolate,
and in transliterations whose delimiters aren't single quotes (C<"'">).
In all the ones with braces, any number of blanks and/or tabs adjoining
and within the braces are allowed (and ignored).
X<\t> X<\n> X<\r> X<\f> X<\b> X<\a> X<\e> X<\x> X<\0> X<\c> X<\N> X<\N{}>
X<\o{}>

    Sequence     Note  Description
    \t                  tab               (HT, TAB)
    \n                  newline           (NL)
    \r                  return            (CR)
    \f                  form feed         (FF)
    \b                  backspace         (BS)
    \a                  alarm (bell)      (BEL)
    \e                  escape            (ESC)
    \x{263A}     [1,8]  hex char          (example shown: SMILEY)
    \x{ 263A }          Same, but shows optional blanks inside and
                        adjoining the braces
    \x1b         [2,8]  restricted range hex char (example: ESC)
    \N{name}     [3]    named Unicode character or character sequence
    \N{U+263D}   [4,8]  Unicode character (example: FIRST QUARTER MOON)
    \c[          [5]    control char      (example: chr(27))
    \o{23072}    [6,8]  octal char        (example: SMILEY)
    \033         [7,8]  restricted range octal char  (example: ESC)

Note that any escape sequence using braces inside interpolated
constructs may have optional blanks (tab or space characters) adjoining
and inside the braces, as illustrated above by the second
S<C<\x{ }>> example.

=over 4

=item [1]

The result is the character specified by the hexadecimal number between
the braces. See L</[8]> below for details on which character.

Blanks (tab or space characters) may separate the number from either or
both of the braces.

Otherwise, only hexadecimal digits are valid between the braces. If an
invalid character is encountered, a warning will be issued and the
invalid character and all subsequent characters (valid or invalid)
within the braces will be discarded.

If there are no valid digits between the braces, the generated character is
the NULL character (C<\x{00}>). However, an explicit empty brace (C<\x{}>)
will not cause a warning (currently).

=item [2]

The result is the character specified by the hexadecimal number in the range
0x00 to 0xFF. See L</[8]> below for details on which character.

Only hexadecimal digits are valid following C<\x>. When C<\x> is followed
by fewer than two valid digits, any valid digits will be zero-padded. This
means that C<\x7> will be interpreted as C<\x07>, and a lone C<"\x"> will be
interpreted as C<\x00>. Except at the end of a string, having fewer than
two valid digits will result in a warning. Note that although the warning
says the illegal character is ignored, it is only ignored as part of the
escape and will still be used as the subsequent character in the string.
For example:

    Original    Result    Warns?
    "\x7"       "\x07"    no
    "\x"        "\x00"    no
    "\x7q"      "\x07q"   yes
    "\xq"       "\x00q"   yes

=item [3]

The result is the Unicode character or character sequence given by I<name>.
See L<charnames>.

=item [4]

S<C<\N{U+I<hexadecimal number>}>> means the Unicode character whose Unicode code
point is I<hexadecimal number>.

=item [5]

The character following C<\c> is mapped to some other character as shown in the
table:

    Sequence   Value
      \c@      chr(0)
      \cA      chr(1)
      \ca      chr(1)
      \cB      chr(2)
      \cb      chr(2)
      ...
      \cZ      chr(26)
      \cz      chr(26)
      \c[      chr(27)
                        # See below for chr(28)
      \c]      chr(29)
      \c^      chr(30)
      \c_      chr(31)
      \c?      chr(127) # (on ASCII platforms; see below for link to
                        #  EBCDIC discussion)

In other words, the result is the character whose code point is that of the
uppercased character xor'd with 64. C<\c?> is DELETE on ASCII platforms because
S<C<ord("?") ^ 64>> is 127, and
C<\c@> is NULL because the ord of C<"@"> is 64, so xor'ing it with 64 produces 0.

Also, C<\c\I<X>> yields S<C< chr(28) . "I<X>">> for any I<X>, but cannot come at the
end of a string, because the backslash would be parsed as escaping the end
quote.

On ASCII platforms, the resulting characters from the list above are the
complete set of ASCII controls. This isn't the case on EBCDIC platforms; see
L<perlebcdic/OPERATOR DIFFERENCES> for a full discussion of the
differences between these for ASCII versus EBCDIC platforms.

Use of any other character following the C<"c"> besides those listed above is
discouraged, and as of Perl v5.20, the only characters actually allowed
are the printable ASCII ones, minus the left brace C<"{">. What happens
for any of the allowed other characters is that the value is derived by
xor'ing with the seventh bit, which is 64, and a warning is raised if
enabled. Using the non-allowed characters generates a fatal error.

To get platform independent controls, you can use C<\N{...}>.

=item [6]

The result is the character specified by the octal number between the braces.
See L</[8]> below for details on which character.

Blanks (tab or space characters) may separate the number from either or
both of the braces.

Otherwise, if a character that isn't an octal digit is encountered, a
warning is raised, and the value is based on the octal digits before it,
discarding it and all following characters up to the closing brace. It
is a fatal error if there are no octal digits at all.

=item [7]

The result is the character specified by the three-digit octal number in the
range 000 to 777 (but best to not use above 077, see next paragraph). See
L</[8]> below for details on which character.

Some contexts allow 2 or even 1 digit, but any usage without exactly
three digits, the first being a zero, may give unintended results.
(For
example, in a regular expression it may be confused with a backreference;
see L<perlrebackslash/Octal escapes>.) Starting in Perl 5.14, you may
use C<\o{}> instead, which avoids all these problems. Otherwise, it is best to
use this construct only for ordinals C<\077> and below, remembering to pad to
the left with zeros to make three digits. For larger ordinals, either use
C<\o{}>, or convert to something else, such as to hex and use C<\N{U+}>
(which is portable between platforms with different character sets) or
C<\x{}> instead.

=item [8]

Several constructs above specify a character by a number. That number
gives the character's position in the character set encoding (indexed from 0).
This is called synonymously its ordinal, code position, or code point. Perl
works on platforms that have a native encoding currently of either ASCII/Latin1
or EBCDIC, each of which allows specification of 256 characters. In general, if
the number is 255 (0xFF, 0377) or below, Perl interprets this in the platform's
native encoding. If the number is 256 (0x100, 0400) or above, Perl interprets
it as a Unicode code point and the result is the corresponding Unicode
character. For example C<\x{50}> and C<\o{120}> both are the number 80 in
decimal, which is less than 256, so the number is interpreted in the native
character set encoding. In ASCII the character in the 80th position (indexed
from 0) is the letter C<"P">, and in EBCDIC it is the ampersand symbol C<"&">.
C<\x{100}> and C<\o{400}> are both 256 in decimal, so the number is interpreted
as a Unicode code point no matter what the native encoding is. The name of the
character in the 256th position (indexed from 0) in Unicode is
C<LATIN CAPITAL LETTER A WITH MACRON>.

An exception to the above rule is that S<C<\N{U+I<hex number>}>> is
always interpreted as a Unicode code point, so that C<\N{U+0050}> is C<"P"> even
on EBCDIC platforms.

=back

B<NOTE>: Unlike C and other languages, Perl has no C<\v> escape sequence for
the vertical tab (VT, which is 11 in both ASCII and EBCDIC), but you may
use C<\N{VT}>, C<\ck>, C<\N{U+0b}>, or C<\x0b>. (C<\v>
does have meaning in regular expression patterns in Perl, see L<perlre>.)

The following escape sequences are available in constructs that interpolate,
but not in transliterations.
X<\l> X<\u> X<\L> X<\U> X<\E> X<\Q> X<\F>

    \l          lowercase next character only
    \u          titlecase (not uppercase!) next character only
    \L          lowercase all characters till \E or end of string
    \U          uppercase all characters till \E or end of string
    \F          foldcase all characters till \E or end of string
    \Q          quote (disable) pattern metacharacters till \E or
                end of string
    \E          end either case modification or quoted section
                (whichever was last seen)

See L<perlfunc/quotemeta> for the exact definition of characters that
are quoted by C<\Q>.

C<\L>, C<\U>, C<\F>, and C<\Q> can stack, in which case you need one
C<\E> for each. For example:

 say "This \Qquoting \ubusiness \Uhere isn't quite\E done yet,\E is it?";
 This quoting\ Business\ HERE\ ISN\'T\ QUITE\ done\ yet\, is it?

If a S<C<use locale>> form that includes C<LC_CTYPE> is in effect (see
L<perllocale>), the case map used by C<\l>, C<\L>, C<\u>, and C<\U> is
taken from the current locale. If Unicode (for example, C<\N{}> or code
points of 0x100 or beyond) is being used, the case map used by C<\l>,
C<\L>, C<\u>, and C<\U> is as defined by Unicode. That means that
case-mapping a single character can sometimes produce a sequence of
several characters.
Under S<C<use locale>>, C<\F> produces the same results as C<\L>
for all locales but a UTF-8 one, where it instead uses the Unicode
definition.

All systems use the virtual C<"\n"> to represent a line terminator,
called a "newline". There is no such thing as an unvarying, physical
newline character. It is only an illusion that the operating system,
device drivers, C libraries, and Perl all conspire to preserve. Not all
systems read C<"\r"> as ASCII CR and C<"\n"> as ASCII LF. For example,
on the ancient Macs (pre-MacOS X) of yesteryear, these used to be reversed,
and on systems without a line terminator,
printing C<"\n"> might emit no actual data. In general, use C<"\n"> when
you mean a "newline" for your system, but use the literal ASCII when you
need an exact character. For example, most networking protocols expect
and prefer a CR+LF (C<"\015\012"> or C<"\cM\cJ">) for line terminators,
and although they often accept just C<"\012">, they seldom tolerate just
C<"\015">. If you get in the habit of using C<"\n"> for networking,
you may be burned some day.
X<newline> X<line terminator> X<eol> X<end of line>
X<\n> X<\r> X<\r\n>

For constructs that do interpolate, variables beginning with "C<$>"
or "C<@>" are interpolated. Subscripted variables such as C<$a[3]> or
C<< $href->{key}[0] >> are also interpolated, as are array and hash slices.
But method calls such as C<< $obj->meth >> are not.

Interpolating an array or slice interpolates the elements in order,
separated by the value of C<$">, so is equivalent to interpolating
S<C<join $", @array>>. "Punctuation" arrays such as C<@*> are usually
interpolated only if the name is enclosed in braces C<@{*}>, but the
arrays C<@_>, C<@+>, and C<@-> are interpolated even without braces.

For double-quoted strings, the quoting from C<\Q> is applied after
interpolation and escapes are processed.

    "abc\Qfoo\tbar$s\Exyz"

is equivalent to

    "abc" . quotemeta("foo\tbar$s") . "xyz"

For the pattern of regex operators (C<qr//>, C<m//> and C<s///>),
the quoting from C<\Q> is applied after interpolation is processed,
but before escapes are processed. This allows the pattern to match
literally (except for C<$> and C<@>). For example, the following matches:

    '\s\t' =~ /\Q\s\t/

Because C<$> or C<@> trigger interpolation, you'll need to use something
like C</\Quser\E\@\Qhost/> to match them literally.

Patterns are subject to an additional level of interpretation as a
regular expression. This is done as a second pass, after variables are
interpolated, so that regular expressions may be incorporated into the
pattern from the variables. If this is not what you want, use C<\Q> to
interpolate a variable literally.

Apart from the behavior described above, Perl does not expand
multiple levels of interpolation. In particular, contrary to the
expectations of shell programmers, back-quotes do I<NOT> interpolate
within double quotes, nor do single quotes impede evaluation of
variables when used within double quotes.

=head2 Regexp Quote-Like Operators
X<operator, regexp>

Here are the quote-like operators that apply to pattern
matching and related activities.

=over 8

=item C<qr/I<STRING>/msixpodualn>
X<qr> X</i> X</m> X</o> X</s> X</x> X</p>

This operator quotes (and possibly compiles) its I<STRING> as a regular
expression. I<STRING> is interpolated the same way as I<PATTERN>
in C<m/I<PATTERN>/>. If C<"'"> is used as the delimiter, no variable
interpolation is done. Returns a Perl value which may be used instead of the
corresponding C</I<STRING>/msixpodualn> expression. The returned value is a
normalized version of the original pattern.
It magically differs from
a string containing the same characters: C<ref(qr/x/)> returns "Regexp";
however, dereferencing it is not well defined (you currently get the
normalized version of the original pattern, but this may change).

For example,

    $rex = qr/my.STRING/is;
    print $rex;                 # prints (?^si:my.STRING) in 5.14 and
                                # later; older perls print
                                # (?si-xm:my.STRING)
    s/$rex/foo/;

is equivalent to

    s/my.STRING/foo/is;

The result may be used as a subpattern in a match:

    $re = qr/$pattern/;
    $string =~ /foo${re}bar/;   # can be interpolated in other
                                # patterns
    $string =~ $re;             # or used standalone
    $string =~ /$re/;           # or this way

Since Perl may compile the pattern at the moment of execution of the C<qr()>
operator, using C<qr()> may have speed advantages in some situations,
notably if the result of C<qr()> is used standalone:

    sub match {
        my $patterns = shift;
        my @compiled = map qr/$_/i, @$patterns;
        grep {
            my $success = 0;
            foreach my $pat (@compiled) {
                $success = 1, last if /$pat/;
            }
            $success;
        } @_;
    }

Precompilation of the pattern into an internal representation at
the moment of C<qr()> avoids the need to recompile the pattern every
time a match C</$pat/> is attempted. (Perl has many other internal
optimizations, but none would be triggered in the above example if
we did not use the C<qr()> operator.)

Options (specified by the following modifiers) are:

    m   Treat string as multiple lines.
    s   Treat string as single line. (Make . match a newline)
    i   Do case-insensitive pattern matching.
    x   Use extended regular expressions; specifying two
        x's means \t and the SPACE character are ignored within
        square-bracketed character classes
    p   When matching preserve a copy of the matched string so
        that ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH} will be
        defined (ignored starting in v5.20 as these are always
        defined starting in that release)
    o   Compile pattern only once.
    a   ASCII-restrict: Use ASCII for \d, \s, \w and [[:posix:]]
        character classes; specifying two a's adds the further
        restriction that no ASCII character will match a
        non-ASCII one under /i.
    l   Use the current run-time locale's rules.
    u   Use Unicode rules.
    d   Use Unicode or native charset, as in 5.12 and earlier.
    n   Non-capture mode. Don't let () fill in $1, $2, etc...

If a precompiled pattern is embedded in a larger pattern then the effect
of C<"msixpluadn"> will be propagated appropriately. The effect that the
C</o> modifier has is not propagated, being restricted to those patterns
explicitly using it.

The C</a>, C</d>, C</l>, and C</u> modifiers (added in Perl 5.14)
control the character set rules, but C</a> is the only one you are likely
to want to specify explicitly; the other three are selected
automatically by various pragmas.

See L<perlre> for additional information on valid syntax for I<STRING>, and
for a detailed look at the semantics of regular expressions. In
particular, all modifiers except the largely obsolete C</o> are further
explained in L<perlre/Modifiers>. C</o> is described in the next section.

=item C<m/I<PATTERN>/msixpodualngc>
X<m> X<operator, match>
X<regexp, options> X<regexp> X<regex, options> X<regex>
X</m> X</s> X</i> X</x> X</p> X</o> X</g> X</c>

=item C</I<PATTERN>/msixpodualngc>

Searches a string for a pattern match, and in scalar context returns
true if it succeeds, false if it fails.
If no string is specified
via the C<=~> or C<!~> operator, the C<$_> string is searched. (The
string specified with C<=~> need not be an lvalue--it may be the
result of an expression evaluation, but remember the C<=~> binds
rather tightly.) See also L<perlre>.

Options are as described in C<qr//> above; in addition, the following match
process modifiers are available:

    g  Match globally, i.e., find all occurrences.
    c  Do not reset search position on a failed match when /g is
       in effect.

If C<"/"> is the delimiter then the initial C<m> is optional. With the C<m>
you can use any pair of non-whitespace (ASCII) characters
as delimiters. This is particularly useful for matching path names
that contain C<"/">, to avoid LTS (leaning toothpick syndrome). If C<"?"> is
the delimiter, then a match-only-once rule applies,
described in C<m?I<PATTERN>?> below. If C<"'"> (single quote) is the delimiter,
no variable interpolation is performed on the I<PATTERN>.
When using a delimiter character valid in an identifier, whitespace is required
after the C<m>.

I<PATTERN> may contain variables, which will be interpolated
every time the pattern search is evaluated, except
for when the delimiter is a single quote. (Note that C<$(>, C<$)>, and
C<$|> are not interpolated because they look like end-of-string tests.)
Perl will not recompile the pattern unless an interpolated
variable that it contains changes. You can force Perl to skip the
test and never recompile by adding a C</o> (which stands for "once")
after the trailing delimiter.
Once upon a time, Perl would recompile regular expressions
unnecessarily, and this modifier was useful to tell it not to do so, in the
interests of speed.
But now, the only reasons to use C</o> are these:

=over

=item 1

The variables are thousands of characters long and you know that they
don't change, and you need to wring out the last little bit of speed by
having Perl skip testing for that. (There is a maintenance penalty for
doing this, as mentioning C</o> constitutes a promise that you won't
change the variables in the pattern. If you do change them, Perl won't
even notice.)

=item 2

You want the pattern to use the initial values of the variables
regardless of whether they change or not. (But there are saner ways
of accomplishing this than using C</o>.)

=item 3

If the pattern contains embedded code, such as

    use re 'eval';
    $code = 'foo(?{ $x })';
    /$code/

then perl will recompile each time, even though the pattern string hasn't
changed, to ensure that the current value of C<$x> is seen each time.
Use C</o> if you want to avoid this.

=back

The bottom line is that using C</o> is almost never a good idea.

=item The empty pattern C<//>

If the I<PATTERN> evaluates to the empty string, the last
I<successfully> matched regular expression in the current dynamic
scope is used instead (see also L<perlvar/Scoping Rules of Regex Variables>).
In this case, only the C<g> and C<c> flags on the empty pattern are
honored; the other flags are taken from the original pattern. If no
match has previously succeeded, this will (silently) act instead as a
genuine empty pattern (which will always match). Using a user-supplied
string as a pattern carries the risk that, if the string is empty, it
triggers the "last successful match" behavior, which can be very
confusing. In such cases it is recommended that you replace C<m/$pattern/>
with C<m/(?:$pattern)/> to avoid this behavior.
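
As a minimal sketch of the pitfall just described (the variable names and
sample strings are illustrative assumptions), an empty C<$pattern> silently
reuses the last successful pattern, while the C<(?:...)> wrapper forces a
genuine empty pattern:

```perl
use strict;
use warnings;

# Hypothetical user-supplied pattern that happens to be empty.
my $pattern = "";

# A successful match records /ell/ as the last successful pattern.
"hello" =~ /ell/ or die "setup match failed";

# The empty interpolated pattern reuses /ell/, which does not match "xyz".
my $plain = ("xyz" =~ /$pattern/) ? 1 : 0;

# Wrapped in (?:...), the pattern is no longer empty, so it matches trivially.
my $wrapped = ("xyz" =~ /(?:$pattern)/) ? 1 : 0;

print "plain=$plain wrapped=$wrapped\n";
```

The two results differ only because an earlier match succeeded in the same
scope, which is why the wrapper is the safer default for untrusted input.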

The last successful pattern may be accessed as a variable via
C<${^LAST_SUCCESSFUL_PATTERN}>. Matching against it, or the empty
pattern should have the same effect, with the exception that when there
is no last successful pattern the empty pattern will silently match,
whereas using the C<${^LAST_SUCCESSFUL_PATTERN}> variable will produce
undefined warnings (if warnings are enabled). You can check
C<defined(${^LAST_SUCCESSFUL_PATTERN})> to test if there is a "last
successful match" in the current scope.

Note that it's possible to confuse Perl into thinking C<//> (the empty
regex) is really C<//> (the defined-or operator). Perl is usually pretty
good about this, but some pathological cases might trigger this, such as
C<$x///> (is that S<C<($x) / (//)>> or S<C<$x // />>?) and S<C<print $fh //>>
(S<C<print $fh(//>> or S<C<print($fh //>>?). In all of these examples, Perl
will assume you meant defined-or. If you meant the empty regex, just
use parentheses or spaces to disambiguate, or even prefix the empty
regex with an C<m> (so C<//> becomes C<m//>).

=item Matching in list context

If the C</g> option is not used, C<m//> in list context returns a
list consisting of the subexpressions matched by the parentheses in the
pattern, that is, (C<$1>, C<$2>, C<$3>...) (Note that here C<$1> etc. are
also set). When there are no parentheses in the pattern, the return
value is the list C<(1)> for success.
With or without parentheses, an empty list is returned upon failure.
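
A minimal sketch of the three list-context return shapes described above
(the sample strings and variable names are illustrative assumptions):

```perl
use strict;
use warnings;

# Capturing parentheses: the captures are returned (and $1, $2 are set).
my ($hour, $min) = "12:34" =~ /^(\d+):(\d+)$/;

# No parentheses: a successful match returns the one-element list (1).
my @hit = "abc" =~ /b/;

# Failure returns the empty list, with or without parentheses.
my @miss = "abc" =~ /(z)/;

print "hour=$hour min=$min hit=@hit miss=", scalar(@miss), "\n";
```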

Examples:

    open(TTY, "+</dev/tty")
        || die "can't access /dev/tty: $!";

    <TTY> =~ /^y/i && foo();    # do foo if desired

    if (/Version: *([0-9.]*)/) { $version = $1; }

    next if m#^/usr/spool/uucp#;

    # poor man's grep
    $arg = shift;
    while (<>) {
        print if /$arg/;
    }
    if (($F1, $F2, $Etc) = ($foo =~ /^(\S+)\s+(\S+)\s*(.*)/))

This last example splits C<$foo> into the first two words and the
remainder of the line, and assigns those three fields to C<$F1>, C<$F2>, and
C<$Etc>. The conditional is true if any variables were assigned; that is,
if the pattern matched.

The C</g> modifier specifies global pattern matching--that is,
matching as many times as possible within the string. How it behaves
depends on the context. In list context, it returns a list of the
substrings matched by any capturing parentheses in the regular
expression. If there are no parentheses, it returns a list of all
the matched strings, as if there were parentheses around the whole
pattern.

In scalar context, each execution of C<m//g> finds the next match,
returning true if it matches, and false if there is no further match.
The position after the last match can be read or set using the C<pos()>
function; see L<perlfunc/pos>. A failed match normally resets the
search position to the beginning of the string, but you can avoid that
by adding the C</c> modifier (for example, C<m//gc>). Modifying the target
string also resets the search position.

=item C<\G I<assertion>>

You can intermix C<m//g> matches with C<m/\G.../g>, where C<\G> is a
zero-width assertion that matches the exact position where the
previous C<m//g>, if any, left off.
Without the C</g> modifier, the
C<\G> assertion still anchors at C<pos()> as it was at the start of
the operation (see L<perlfunc/pos>), but the match is of course only
attempted once. Using C<\G> without C</g> on a target string that has
not previously had a C</g> match applied to it is the same as using
the C<\A> assertion to match the beginning of the string. Note also
that, currently, C<\G> is only properly supported when anchored at the
very beginning of the pattern.

Examples:

    # list context
    ($one,$five,$fifteen) = (`uptime` =~ /(\d+\.\d+)/g);

    # scalar context
    local $/ = "";
    while ($paragraph = <>) {
        while ($paragraph =~ /\p{Ll}['")]*[.!?]+['")]*\s/g) {
            $sentences++;
        }
    }
    say $sentences;

Here's another way to check for sentences in a paragraph:

    my $sentence_rx = qr{
        (?: (?<= ^ ) | (?<= \s ) )  # after start-of-string or
                                    # whitespace
        \p{Lu}                      # capital letter
        .*?                         # a bunch of anything
        (?<= \S )                   # that ends in non-
                                    # whitespace
        (?<! \b [DMS]r  )           # but isn't a common abbr.
        (?<! \b Mrs )
        (?<! \b Sra )
        (?<! \b St  )
        [.?!]                       # followed by a sentence
                                    # ender
        (?= $ | \s )                # in front of end-of-string
                                    # or whitespace
    }sx;
    local $/ = "";
    while (my $paragraph = <>) {
        say "NEW PARAGRAPH";
        my $count = 0;
        while ($paragraph =~ /($sentence_rx)/g) {
            printf "\tgot sentence %d: <%s>\n", ++$count, $1;
        }
    }

Here's how to use C<m//gc> with C<\G>:

    $_ = "ppooqppqq";
    while ($i++ < 2) {
        print "1: '";
        print $1 while /(o)/gc; print "', pos=", pos, "\n";
        print "2: '";
        print $1 if /\G(q)/gc;  print "', pos=", pos, "\n";
        print "3: '";
        print $1 while /(p)/gc; print "', pos=", pos, "\n";
    }
    print "Final: '$1', pos=",pos,"\n" if /\G(.)/;

The last example should print:

    1: 'oo', pos=4
    2: 'q', pos=5
    3: 'pp', pos=7
    1: '', pos=7
    2: 'q', pos=8
    3: '', pos=8
    Final: 'q', pos=8

Notice that the final match matched C<q> instead of C<p>, which a match
without the C<\G> anchor would have done. Also note that the final match
did not update C<pos>. C<pos> is only updated on a C</g> match. If the
final match did indeed match C<p>, it's a good bet that you're running an
ancient (pre-5.6.0) version of Perl.

A useful idiom for C<lex>-like scanners is C</\G.../gc>. You can
combine several regexps like this to process a string part-by-part,
doing different actions depending on which regexp matched. Each
regexp tries to match where the previous one leaves off.

    $_ = <<'EOL';
    $url = URI::URL->new( "http://example.com/" );
    die if $url eq "xXx";
    EOL

    LOOP: {
        print(" digits"),       redo LOOP if /\G\d+\b[,.;]?\s*/gc;
        print(" lowercase"),    redo LOOP
                                        if /\G\p{Ll}+\b[,.;]?\s*/gc;
        print(" UPPERCASE"),    redo LOOP
                                        if /\G\p{Lu}+\b[,.;]?\s*/gc;
        print(" Capitalized"),  redo LOOP
                                  if /\G\p{Lu}\p{Ll}+\b[,.;]?\s*/gc;
        print(" MiXeD"),        redo LOOP if /\G\pL+\b[,.;]?\s*/gc;
        print(" alphanumeric"), redo LOOP
                                if /\G[\p{Alpha}\pN]+\b[,.;]?\s*/gc;
        print(" line-noise"),   redo LOOP if /\G\W+/gc;
        print ". That's all!\n";
    }

Here is the output (split into several lines):

    line-noise lowercase line-noise UPPERCASE line-noise UPPERCASE
    line-noise lowercase line-noise lowercase line-noise lowercase
    lowercase line-noise lowercase lowercase line-noise lowercase
    lowercase line-noise MiXeD line-noise. That's all!

=item C<m?I<PATTERN>?msixpodualngc>
X<?> X<operator, match-once>

This is just like the C<m/I<PATTERN>/> search, except that it matches
only once between calls to the C<reset()> operator. This is a useful
optimization when you want to see only the first occurrence of
something in each file of a set of files, for instance. Only C<m??>
patterns local to the current package are reset.

    while (<>) {
        if (m?^$?) {
                            # blank line between header and body
        }
    } continue {
        reset if eof;       # clear m?? status for next file
    }

Another example switches the first "latin1" encoding it finds
to "utf8" in a pod file:

    s//utf8/ if m? ^ =encoding \h+ \K latin1 ?x;

The match-once behavior is controlled by the match delimiter being
C<?>; with any other delimiter this is the normal C<m//> operator.

In the past, the leading C<m> in C<m?I<PATTERN>?> was optional, but omitting it
would produce a deprecation warning.
As of v5.22.0, omitting it produces a
syntax error. If you encounter this construct in older code, you can just add
C<m>.

=item C<s/I<PATTERN>/I<REPLACEMENT>/msixpodualngcer>
X<s> X<substitute> X<substitution> X<replace> X<regexp, replace>
X<regexp, substitute> X</m> X</s> X</i> X</x> X</p> X</o> X</g> X</c> X</e> X</r>

Searches a string for a pattern, and if found, replaces that pattern
with the replacement text and returns the number of substitutions
made. Otherwise it returns false (a value that is both an empty string (C<"">)
and numeric zero (C<0>) as described in L</Relational Operators>).

If the C</r> (non-destructive) option is used then it runs the
substitution on a copy of the string and instead of returning the
number of substitutions, it returns the copy whether or not a
substitution occurred. The original string is never changed when
C</r> is used. The copy will always be a plain string, even if the
input is an object or a tied variable.

If no string is specified via the C<=~> or C<!~> operator, the C<$_>
variable is searched and modified. Unless the C</r> option is used,
the string specified must be a scalar variable, an array element, a
hash element, or an assignment to one of those; that is, some sort of
scalar lvalue.

If the delimiter chosen is a single quote, no variable interpolation is
done on either the I<PATTERN> or the I<REPLACEMENT>. Otherwise, if the
I<PATTERN> contains a C<$> that looks like a variable rather than an
end-of-string test, the variable will be interpolated into the pattern
at run-time. If you want the pattern compiled only once the first time
the variable is interpolated, use the C</o> option. If the pattern
evaluates to the empty string, the last successfully executed regular
expression is used instead. See L<perlre> for further explanation on these.
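
The return values described above can be sketched as follows. This is an
illustrative example only; the sample string and variable names are
assumptions, and C</r> needs Perl 5.14 or later:

```perl
use strict;
use warnings;

my $text = "a-b-c";

# With /g the return value is the number of substitutions made.
my $count = ($text =~ s/-/+/g);      # $text is now "a+b+c"

# No match: the special false value, which is both "" and 0.
my $none = ($text =~ s/x/y/);

# /r returns the (possibly unchanged) copy and leaves $text alone.
my $copy = $text =~ s/\+/ /gr;
my $same = $text =~ s/x/y/r;

print "count=$count copy='$copy' same='$same' text='$text'\n";
```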

Options are as with C<m//> with the addition of the following replacement
specific options:

    e   Evaluate the right side as an expression.
    ee  Evaluate the right side as a string then eval the
        result.
    r   Return substitution and leave the original string
        untouched.

Any non-whitespace delimiter may replace the slashes. Add space after
the C<s> when using a character allowed in identifiers. If single quotes
are used, no interpretation is done on the replacement string (the C</e>
modifier overrides this, however). Note that Perl treats backticks
as normal delimiters; the replacement text is not evaluated as a command.
If the I<PATTERN> is delimited by bracketing quotes, the I<REPLACEMENT> has
its own pair of quotes, which may or may not be bracketing quotes, for example,
C<s(foo)(bar)> or C<< s<foo>/bar/ >>. A C</e> will cause the
replacement portion to be treated as a full-fledged Perl expression
and evaluated right then and there. It is, however, syntax checked at
compile-time. A second C<e> modifier will cause the replacement portion
to be C<eval>ed before being run as a Perl expression.
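
A brief sketch of the delimiter rules above, under the assumption of made-up
sample strings and variable names (C<$src>, C<$s1> and so on are purely
illustrative):

```perl
use strict;
use warnings;

my $src = 'foo costs $x';

# Bracketing delimiters on the pattern; the replacement gets its own pair.
(my $s1 = $src) =~ s(foo)(bar);
(my $s2 = $src) =~ s<foo>/baz/;

# Single-quote delimiters: neither side is interpolated, so '$y' stays
# literal and no variable named $x or $y needs to exist.
(my $s3 = $src) =~ s'\$x'$y';

# /e evaluates the replacement as a Perl expression.
(my $s4 = "3 apples") =~ s/(\d+)/$1 * 2/e;

print "$s1\n$s2\n$s3\n$s4\n";
```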

Examples:

    s/\bgreen\b/mauve/g;              # don't change wintergreen

    $path =~ s|/usr/bin|/usr/local/bin|;

    s/Login: $foo/Login: $bar/; # run-time pattern

    ($foo = $bar) =~ s/this/that/;      # copy first, then
                                        # change
    ($foo = "$bar") =~ s/this/that/;    # convert to string,
                                        # copy, then change
    $foo = $bar =~ s/this/that/r;       # Same as above using /r
    $foo = $bar =~ s/this/that/r
                =~ s/that/the other/r;  # Chained substitutes
                                        # using /r
    @foo = map { s/this/that/r } @bar;  # /r is very useful in
                                        # maps

    $count = ($paragraph =~ s/Mister\b/Mr./g);  # get change-cnt

    $_ = 'abc123xyz';
    s/\d+/$&*2/e;               # yields 'abc246xyz'
    s/\d+/sprintf("%5d",$&)/e;  # yields 'abc  246xyz'
    s/\w/$& x 2/eg;             # yields 'aabbcc  224466xxyyzz'

    s/%(.)/$percent{$1}/g;      # change percent escapes; no /e
    s/%(.)/$percent{$1} || $&/ge;       # expr now, so /e
    s/^=(\w+)/pod($1)/ge;       # use function call

    $_ = 'abc123xyz';
    $x = s/abc/def/r;           # $x is 'def123xyz' and
                                # $_ remains 'abc123xyz'.

    # expand variables in $_, but dynamics only, using
    # symbolic dereferencing
    s/\$(\w+)/${$1}/g;

    # Add one to the value of any numbers in the string
    s/(\d+)/1 + $1/eg;

    # Titlecase words in the last 30 characters only (presuming
    # that the substring doesn't start in the middle of a word)
    substr($str, -30) =~ s/\b(\p{Alpha})(\p{Alpha}*)\b/\u$1\L$2/g;

    # This will expand any embedded scalar variable
    # (including lexicals) in $_ : First $1 is interpolated
    # to the variable name, and then evaluated
    s/(\$\w+)/$1/eeg;

    # Delete (most) C comments.
    $program =~ s {
        /\*     # Match the opening delimiter.
        .*?     # Match a minimal number of characters.
        \*/     # Match the closing delimiter.
    } []gsx;

    s/^\s*(.*?)\s*$/$1/;        # trim whitespace in $_,
                                # expensively

    for ($variable) {           # trim whitespace in $variable,
                                # cheap
        s/^\s+//;
        s/\s+$//;
    }

    s/([^ ]*) *([^ ]*)/$2 $1/;  # reverse 1st two fields

    $foo !~ s/A/a/g;    # Lowercase all A's in $foo; return
                        # 0 if any were found and changed;
                        # otherwise return 1

Note the use of C<$> instead of C<\> in the example that reverses the
first two fields. Unlike
B<sed>, we use the \<I<digit>> form only in the left hand side.
Anywhere else it's $<I<digit>>.

Occasionally, you can't use just a C</g> to get all the changes
to occur that you might want. Here are two common cases:

    # put commas in the right places in an integer
    1 while s/(\d)(\d\d\d)(?!\d)/$1,$2/g;

    # expand tabs to 8-column spacing
    1 while s/\t+/' ' x (length($&)*8 - length($`)%8)/e;

X</c>While C<s///> accepts the C</c> flag, it has no effect beyond
producing a warning if warnings are enabled.

=back

=head2 Quote-Like Operators
X<operator, quote-like>

=over 4

=item C<q/I<STRING>/>
X<q> X<quote, single> X<'> X<''>

=item C<'I<STRING>'>

A single-quoted, literal string. A backslash represents a backslash
unless followed by the delimiter or another backslash, in which case
the delimiter or backslash is interpolated.

    $foo = q!I said, "You said, 'She said it.'"!;
    $bar = q('This is it.');
    $baz = '\n';                # a two-character string

=item C<qq/I<STRING>/>
X<qq> X<quote, double> X<"> X<"">

=item C<"I<STRING>">

A double-quoted, interpolated string.

    $_ .= qq
     (*** The previous line contains the naughty word "$1".\n)
                if /\b(tcl|java|python)\b/i;      # :-)
    $baz = "\n";                # a one-character string

=item C<qx/I<STRING>/>
X<qx> X<`> X<``> X<backtick>

=item C<`I<STRING>`>

A string which is (possibly) interpolated and then executed as a
system command, via F</bin/sh> or its equivalent if required. Shell
wildcards, pipes, and redirections will be honored. Similarly to
C<system>, if the string contains no shell metacharacters then it will
be executed directly. The collected standard output of the command is
returned; standard error is unaffected. In scalar context, it comes
back as a single (potentially multi-line) string, or C<undef> if the
shell (or command) could not be started. In list context, returns a
list of lines (however you've defined lines with C<$/> or
C<$INPUT_RECORD_SEPARATOR>), or an empty list if the shell (or command)
could not be started.

    print qx/date/; # prints "Sun Jan 28 06:16:19 CST 2024"

Because backticks do not affect standard error, use shell file descriptor
syntax (assuming the shell supports this) if you care to address this.
To capture a command's STDERR and STDOUT together:

    $output = `cmd 2>&1`;

To capture a command's STDOUT but discard its STDERR:

    $output = `cmd 2>/dev/null`;

To capture a command's STDERR but discard its STDOUT (ordering is
important here):

    $output = `cmd 2>&1 1>/dev/null`;

To exchange a command's STDOUT and STDERR in order to capture the STDERR
but leave its STDOUT to come out the old STDERR:

    $output = `cmd 3>&1 1>&2 2>&3 3>&-`;

To read both a command's STDOUT and its STDERR separately, it's easiest
to redirect them separately to files, and then read from those files
when the program is done:

    system("program args 1>program.stdout 2>program.stderr");

The STDIN filehandle used by the command is inherited from Perl's STDIN.
For example:

    open(SPLAT, "stuff")   || die "can't open stuff: $!";
    open(STDIN, "<&SPLAT") || die "can't dupe SPLAT: $!";
    print STDOUT `sort`;

will print the sorted contents of the file named F<"stuff">.

Using single-quote as a delimiter protects the command from Perl's
double-quote interpolation, passing it on to the shell instead:

    $perl_info  = qx(ps $$);            # that's Perl's $$
    $shell_info = qx'ps $$';            # that's the new shell's $$

How that string gets evaluated is entirely subject to the command
interpreter on your system. On most platforms, you will have to protect
shell metacharacters if you want them treated literally. This is in
practice difficult to do, as it's unclear how to escape which characters.
See L<perlsec> for a clean and safe example of a manual C<fork()> and C<exec()>
to emulate backticks safely.

On some platforms (notably DOS-like ones), the shell may not be
capable of dealing with multiline commands, so putting newlines in
the string may not get you what you want.
You may be able to evaluate
multiple commands in a single line by separating them with the command
separator character, if your shell supports that (for example, C<;> on
many Unix shells and C<&> on the Windows NT C<cmd> shell).

Perl will attempt to flush all files opened for
output before starting the child process, but this may not be supported
on some platforms (see L<perlport>). To be safe, you may need to set
C<$|> (C<$AUTOFLUSH> in C<L<English>>) or call the C<autoflush()> method of
C<L<IO::Handle>> on any open handles.

Beware that some command shells may place restrictions on the length
of the command line. You must ensure your strings don't exceed this
limit after any necessary interpolations. See the platform-specific
release notes for more details about your particular environment.

Using this operator can lead to programs that are difficult to port,
because the shell commands called vary between systems, and may in
fact not be present at all. As one example, the C<type> command under
the POSIX shell is very different from the C<type> command under DOS.
That doesn't mean you should go out of your way to avoid backticks
when they're the right way to get something done. Perl was made to be
a glue language, and one of the things it glues together is commands.
Just understand what you're getting yourself into.

Like C<system>, backticks put the child process exit code in C<$?>.
If you'd like to manually inspect failure, you can check all possible
failure modes by inspecting C<$?> like this:

    if ($? == -1) {
        print "failed to execute: $!\n";
    }
    elsif ($? & 127) {
        printf "child died with signal %d, %s coredump\n",
            ($? & 127),  ($? & 128) ? 'with' : 'without';
    }
    else {
        printf "child exited with value %d\n", $?
            >> 8;
    }

Use the L<open> pragma to control the I/O layers used when reading the
output of the command, for example:

    use open IN => ":encoding(UTF-8)";
    my $x = `cmd-producing-utf-8`;

C<qx//> can also be called like a function with L<perlfunc/readpipe>.

See L</"I/O Operators"> for more discussion.

=item C<qw/I<STRING>/>
X<qw> X<quote, list> X<quote, words>

Evaluates to a list of the words extracted out of I<STRING>, using embedded
whitespace as the word delimiters. It can be understood as being roughly
equivalent to:

    split(" ", q/STRING/);

the differences being that it only splits on ASCII whitespace,
generates a real list at compile time, and
in scalar context it returns the last element in the list. So
this expression:

    qw(foo bar baz)

is semantically equivalent to the list:

    "foo", "bar", "baz"

Some frequently seen examples:

    use POSIX qw( setlocale localeconv );
    @EXPORT = qw( foo bar baz );

A common mistake is to try to separate the words with commas or to
put comments into a multi-line C<qw>-string. For this reason, the
S<C<use warnings>> pragma and the B<-w> switch (that is, the C<$^W> variable)
produce warnings if the I<STRING> contains the C<","> or the C<"#"> character.

=item C<tr/I<SEARCHLIST>/I<REPLACEMENTLIST>/cdsr>
X<tr> X<y> X<transliterate> X</c> X</d> X</s>

=item C<y/I<SEARCHLIST>/I<REPLACEMENTLIST>/cdsr>

Transliterates all occurrences of the characters found (or not found
if the C</c> modifier is specified) in the search list with the
positionally corresponding character in the replacement list, possibly
deleting some, depending on the modifiers specified. It returns the
number of characters replaced or deleted. If no string is specified via
the C<=~> or C<!~> operator, the C<$_> string is transliterated.
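
The return value and the default target described above can be sketched
like this. The sample strings and variable names are assumptions made for
illustration.

```perl
use strict;
use warnings;

my $dna = "AACGTTTA";

# An empty REPLACEMENTLIST leaves the string alone; the return value
# is the number of characters that matched SEARCHLIST.
my $a_count = ($dna =~ tr/A//);

# /r returns the transliterated copy instead of a count.
my $rna = $dna =~ tr/T/U/r;

# Without =~ or !~ the operator works on $_.
$_ = "room 101";
my $digits = tr/0-9//;

print "$a_count\n";     # 3
print "$rna\n";         # AACGUUUA
print "$digits\n";      # 3
```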

For B<sed> devotees, C<y> is provided as a synonym for C<tr>.

If the C</r> (non-destructive) option is present, a new copy of the string
is made and its characters transliterated, and this copy is returned no
matter whether it was modified or not: the original string is always
left unchanged. The new copy is always a plain string, even if the input
string is an object or a tied variable.

Unless the C</r> option is used, the string specified with C<=~> must be a
scalar variable, an array element, a hash element, or an assignment to one
of those; in other words, an lvalue.

The characters delimiting I<SEARCHLIST> and I<REPLACEMENTLIST>
can be any printable character, not just forward slashes. If they
are single quotes (C<tr'I<SEARCHLIST>'I<REPLACEMENTLIST>'>), the only
interpolation is removal of C<\> from pairs of C<\\>; so hyphens are
interpreted literally rather than specifying a character range.

Otherwise, a character range may be specified with a hyphen, so
C<tr/A-J/0-9/> does the same replacement as
C<tr/ACEGIBDFHJ/0246813579/>.

If the I<SEARCHLIST> is delimited by bracketing quotes, the
I<REPLACEMENTLIST> must have its own pair of quotes, which may or may
not be bracketing quotes; for example, C<tr(aeiouy)(yuoiea)> or
C<tr[+\-*/]"ABCD">. This final example shows a way to visually clarify
what is going on for people who are more familiar with regular
expression patterns than with C<tr>, and who may think forward slash
delimiters imply that C<tr> is more like a regular expression pattern
than it actually is. (Another option might be to use C<tr[...][...]>.)

C<tr> isn't fully like bracketed character classes, just
(significantly) more like them than it is like full patterns.
For
example, characters appearing more than once in either list behave
differently here than in patterns, and C<tr> lists do not allow
backslashed character classes such as C<\d> or C<\pL>, nor variable
interpolation, so C<"$"> and C<"@"> are always treated as literals.

The allowed elements are literals plus C<\'> (meaning a single quote).
If the delimiters aren't single quotes, also allowed are any of the
escape sequences accepted in double-quoted strings. Escape sequence
details are in L<the table near the beginning of this section|/Quote and
Quote-like Operators>.

A hyphen at the beginning or end, or preceded by a backslash is also
always considered a literal. Precede a delimiter character with a
backslash to allow it.

The C<tr> operator is not equivalent to the C<L<tr(1)>> utility.
C<tr[a-z][A-Z]> will uppercase the 26 letters "a" through "z", but for
case changing not confined to ASCII, use L<C<lc>|perlfunc/lc>,
L<C<uc>|perlfunc/uc>, L<C<lcfirst>|perlfunc/lcfirst>,
L<C<ucfirst>|perlfunc/ucfirst> (all documented in L<perlfunc>), or the
L<substitution operator
C<sE<sol>I<PATTERN>E<sol>I<REPLACEMENT>E<sol>>|/sE<sol>PATTERNE<sol>REPLACEMENTE<sol>msixpodualngcer>
(with C<\U>, C<\u>, C<\L>, and C<\l> string-interpolation escapes in the
I<REPLACEMENT> portion).

Most ranges are unportable between character sets, but certain ones
signal Perl to do special handling to make them portable. There are two
classes of portable ranges. The first are any subsets of the ranges
C<A-Z>, C<a-z>, and C<0-9>, when expressed as literal characters.

    tr/h-k/H-K/

capitalizes the letters C<"h">, C<"i">, C<"j">, and C<"k"> and nothing
else, no matter what the platform's character set is.
In contrast, all
of

    tr/\x68-\x6B/\x48-\x4B/
    tr/h-\x6B/H-\x4B/
    tr/\x68-k/\x48-K/

do the same capitalizations as the previous example when run on ASCII
platforms, but something completely different on EBCDIC ones.

The second class of portable ranges is invoked when one or both of the
range's end points are expressed as C<\N{...}>

    $string =~ tr/\N{U+20}-\N{U+7E}//d;

removes from C<$string> all the platform's characters which are
equivalent to any of Unicode U+0020, U+0021, ... U+007D, U+007E. This
is a portable range, and has the same effect on every platform it is
run on. In this example, these are the ASCII
printable characters. So after this is run, C<$string> has only
controls and characters which have no ASCII equivalents.

But, even for portable ranges, it is not generally obvious what is
included without having to look things up in the manual. A sound
principle is to use only ranges that both begin from, and end at, either
ASCII alphabetics of equal case (C<b-e>, C<B-E>), or digits (C<1-4>).
Anything else is unclear (and unportable unless C<\N{...}> is used). If
in doubt, spell out the character sets in full.

Options:

    c   Complement the SEARCHLIST.
    d   Delete found but unreplaced characters.
    r   Return the modified string and leave the original string
        untouched.
    s   Squash duplicate replaced characters.

If the C</d> modifier is specified, any characters specified by
I<SEARCHLIST> not found in I<REPLACEMENTLIST> are deleted. (Note that
this is slightly more flexible than the behavior of some B<tr> programs,
which delete anything they find in the I<SEARCHLIST>, period.)

If the C</s> modifier is specified, sequences of characters, all in a
row, that were transliterated to the same character are squashed down to
a single instance of that character.

    my $x = "aaabbbca";
    $x =~ tr/ab/dd/s;     # $x now is "dcd"

If the C</d> modifier is used, the I<REPLACEMENTLIST> is always interpreted
exactly as specified. Otherwise, if the I<REPLACEMENTLIST> is shorter
than the I<SEARCHLIST>, the final character, if any, is replicated until
it is long enough. There won't be a final character if and only if the
I<REPLACEMENTLIST> is empty, in which case I<REPLACEMENTLIST> is
copied from I<SEARCHLIST>. An empty I<REPLACEMENTLIST> is useful
for counting characters in a class, or for squashing character sequences
in a class.

    tr/abcd//            tr/abcd/abcd/
    tr/abcd/AB/          tr/abcd/ABBB/
    tr/abcd//d           s/[abcd]//g
    tr/abcd/AB/d         (tr/ab/AB/ + s/[cd]//g)  - but run together

If the C</c> modifier is specified, the characters to be transliterated
are the ones NOT in I<SEARCHLIST>, that is, it is complemented. If
C</d> and/or C</s> are also specified, they apply to the complemented
I<SEARCHLIST>. Recall that if I<REPLACEMENTLIST> is empty (except
under C</d>) a copy of I<SEARCHLIST> is used instead. That copy is made
after complementing under C</c>. I<SEARCHLIST> is sorted by code point
order after complementing, and any I<REPLACEMENTLIST> is applied to
that sorted result. This means that under C</c>, the order of the
characters specified in I<SEARCHLIST> is irrelevant. This can
lead to different results on EBCDIC systems if I<REPLACEMENTLIST>
contains more than one character, hence it is generally non-portable to
use C</c> with such a I<REPLACEMENTLIST>.

Another way of describing the operation is this:
If C</c> is specified, the I<SEARCHLIST> is sorted by code point order,
then complemented.
If I<REPLACEMENTLIST> is empty and C</d> is not
specified, I<REPLACEMENTLIST> is replaced by a copy of I<SEARCHLIST> (as
modified under C</c>), and these potentially modified lists are used as
the basis for what follows. Any character in the target string that
isn't in I<SEARCHLIST> is passed through unchanged. Every other
character in the target string is replaced by the character in
I<REPLACEMENTLIST> that positionally corresponds to its mate in
I<SEARCHLIST>, except that under C</s>, the 2nd and following characters
are squeezed out in a sequence of characters in a row that all translate
to the same character. If I<SEARCHLIST> is longer than
I<REPLACEMENTLIST>, characters in the target string that match a
character in I<SEARCHLIST> that doesn't have a correspondence in
I<REPLACEMENTLIST> are either deleted from the target string if C</d> is
specified; or replaced by the final character in I<REPLACEMENTLIST> if
C</d> isn't specified.

Some examples:

    $ARGV[1] =~ tr/A-Z/a-z/;    # canonicalize to lower case ASCII

    $cnt = tr/*/*/;             # count the stars in $_
    $cnt = tr/*//;              # same thing

    $cnt = $sky =~ tr/*/*/;     # count the stars in $sky
    $cnt = $sky =~ tr/*//;      # same thing

    $cnt = $sky =~ tr/*//c;     # count all the non-stars in $sky
    $cnt = $sky =~ tr/*/*/c;    # same, but transliterate each non-star
                                # into a star, leaving the already-stars
                                # alone. Afterwards, everything in $sky
                                # is a star.

    $cnt = tr/0-9//;            # count the ASCII digits in $_

    tr/a-zA-Z//s;               # bookkeeper -> bokeper
    tr/o/o/s;                   # bookkeeper -> bokkeeper
    tr/oe/oe/s;                 # bookkeeper -> bokkeper
    tr/oe//s;                   # bookkeeper -> bokkeper
    tr/oe/o/s;                  # bookkeeper -> bokkopor

    ($HOST = $host) =~ tr/a-z/A-Z/;
    $HOST = $host  =~ tr/a-z/A-Z/r; # same thing

    $HOST = $host =~ tr/a-z/A-Z/r   # chained with s///r
                  =~ s/:/ -p/r;

    tr/a-zA-Z/ /cs;                 # change non-alphas to single space

    @stripped = map tr/a-zA-Z/ /csr, @original;
                                    # /r with map

    tr [\200-\377]
       [\000-\177];                 # wickedly delete 8th bit

    $foo !~ tr/A/a/    # transliterate all the A's in $foo to 'a',
                       # return 0 if any were found and changed.
                       # Otherwise return 1

If multiple transliterations are given for a character, only the
first one is used:

    tr/AAA/XYZ/

will transliterate any A to X.

Because the transliteration table is built at compile time, neither
the I<SEARCHLIST> nor the I<REPLACEMENTLIST> is subjected to double quote
interpolation. That means that if you want to use variables, you
must use an C<eval()>:

    eval "tr/$oldlist/$newlist/";
    die $@ if $@;

    eval "tr/$oldlist/$newlist/, 1" or die $@;

=item C<< <<I<EOF> >>
X<here-doc> X<heredoc> X<here-document> X<<< << >>>

A line-oriented form of quoting is based on the shell "here-document"
syntax. Following a C<< << >> you specify a string to terminate
the quoted material, and all lines following the current line down to
the terminating string are the value of the item.

Prefixing the terminating string with a C<~> specifies that you
want to use L</Indented Here-docs> (see below).

The terminating string may be either an identifier (a word), or some
quoted text. An unquoted identifier works like double quotes.
There may not be a space between the C<< << >> and the identifier,
unless the identifier is explicitly quoted. The terminating string
must appear by itself (unquoted and with no surrounding whitespace)
on the terminating line.

If the terminating string is quoted, the type of quotes used determines
the treatment of the text.

=over 4

=item Double Quotes

Double quotes indicate that the text will be interpolated using exactly
the same rules as normal double quoted strings.

       print <<EOF;
    The price is $Price.
    EOF

       print << "EOF"; # same as above
    The price is $Price.
    EOF

=item Single Quotes

Single quotes indicate the text is to be treated literally with no
interpolation of its content. This is similar to single quoted
strings except that backslashes have no special meaning, with C<\\>
being treated as two backslashes and not one as they would in every
other quoting construct.

Just as in the shell, a backslashed bareword following the C<<< << >>>
means the same thing as a single-quoted string does:

        $cost = <<'VISTA';  # hasta la ...
    That'll be $10 please, ma'am.
    VISTA

        $cost = <<\VISTA;   # Same thing!
    That'll be $10 please, ma'am.
    VISTA

This is the only form of quoting in perl where there is no need
to worry about escaping content, something that code generators
can and do make good use of.

=item Backticks

The content of the here doc is treated just as it would be if the
string were embedded in backticks. Thus the content is interpolated
as though it were double quoted and then executed via the shell, with
the results of the execution returned.

       print << `EOC`; # execute command and get results
    echo hi there
    EOC

=back

=over 4

=item Indented Here-docs

The here-doc modifier C<~> allows you to indent your here-docs to make
the code more readable:

    if ($some_var) {
      print <<~EOF;
        This is a here-doc
        EOF
    }

This will print...

    This is a here-doc

...with no leading whitespace.

The line containing the delimiter that marks the end of the here-doc
determines the indentation template for the whole thing. Compilation
croaks if any non-empty line inside the here-doc does not begin with the
precise indentation of the terminating line. (An empty line consists of
the single character "\n".) For example, suppose the terminating line
begins with a tab character followed by 4 space characters. Every
non-empty line in the here-doc must begin with a tab followed by 4
spaces. They are stripped from each line, and any leading white space
remaining on a line serves as the indentation for that line. Currently,
only the TAB and SPACE characters are treated as whitespace for this
purpose. Tabs and spaces may be mixed, but are matched exactly; tabs
remain tabs and are not expanded.

Additional beginning whitespace (beyond what preceded the
delimiter) will be preserved:

    print <<~EOF;
      This text is not indented
        This text is indented with two spaces
      		This text is indented with two tabs
      EOF

Finally, the modifier may be used with all of the forms
mentioned above:

    <<~\EOF;
    <<~'EOF'
    <<~"EOF"
    <<~`EOF`

And whitespace may be used between the C<~> and quoted delimiters:

    <<~ 'EOF'; # ... "EOF", `EOF`

=back

It is possible to stack multiple here-docs in a row:

       print <<"foo", <<"bar"; # you can stack them
    I said foo.
    foo
    I said bar.
    bar

       myfunc(<< "THIS", 23, <<'THAT');
    Here's a line
    or two.
    THIS
    and here's another.
    THAT

Just don't forget that you have to put a semicolon on the end
to finish the statement, as Perl doesn't know you're not going to
try to do this:

       print <<ABC
    179231
    ABC
       + 20;

If you want to remove the line terminator from your here-docs,
use C<chomp()>.

    chomp($string = <<'END');
    This is a string.
    END

If you want your here-docs to be indented with the rest of the code,
use the C<<< <<~FOO >>> construct described under L</Indented Here-docs>:

    $quote = <<~'FINIS';
       The Road goes ever on and on,
       down from the door where it began.
       FINIS

If you use a here-doc within a delimited construct, such as in C<s///eg>,
the quoted material must still come on the line following the
C<<< <<FOO >>> marker, which means it may be inside the delimited
construct:

    s/this/<<E . 'that'
    the other
    E
     . 'more '/eg;

It works this way as of Perl 5.18. Historically, it was inconsistent, and
you would have to write

    s/this/<<E . 'that'
     . 'more '/eg;
    the other
    E

outside of string evals.

Additionally, quoting rules for the end-of-string identifier are
unrelated to Perl's quoting rules. C<q()>, C<qq()>, and the like are not
supported in place of C<''> and C<"">, and the only interpolation is for
backslashing the quoting character:

    print << "abc\"def";
    testing...
    abc"def

Finally, quoted strings cannot span multiple lines. The general rule is
that the identifier must be a string literal. Stick with that, and you
should be safe.
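
As a small sketch of how the terminator's quoting affects the body, the
following compares a double-quoted and a single-quoted indented here-doc.
It assumes Perl 5.26 or later for the C<~> modifier, and the variable
names are illustrative.

```perl
use strict;
use warnings;

my $name = "Perl";

# Quoted "EOT": the body interpolates like a double-quoted string.
# chomp() drops the newline that ends the last line of the body.
chomp(my $greeting = <<~"EOT");
    Hello, $name.
      This line keeps two extra spaces.
    EOT

# Quoted 'EOT': nothing is interpolated and backslashes stay literal.
my $raw = <<~'EOT';
    Cost: $5 \n
    EOT

print "$greeting\n";
print $raw;
```

The indentation of the terminating line is removed from every line of
the body, so only the extra two spaces on the second line survive.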

=back

=head2 Gory details of parsing quoted constructs
X<quote, gory details>

When presented with something that might have several different
interpretations, Perl uses the B<DWIM> (that's "Do What I Mean")
principle to pick the most probable interpretation. This strategy
is so successful that Perl programmers often do not suspect the
ambiguity of what they write. But from time to time, Perl's
notions differ substantially from what the author honestly meant.

This section hopes to clarify how Perl handles quoted constructs.
Although the most common reason to learn this is to unravel labyrinthine
regular expressions, because the initial steps of parsing are the
same for all quoting operators, they are all discussed together.

The most important Perl parsing rule is the first one discussed
below: when processing a quoted construct, Perl first finds the end
of that construct, then interprets its contents. If you understand
this rule, you may skip the rest of this section on the first
reading. The other rules are likely to contradict the user's
expectations much less frequently than this first one.

Some passes discussed below are performed concurrently, but because
their results are the same, we consider them individually. For different
quoting constructs, Perl performs different numbers of passes, from
one to four, but these passes are always performed in the same order.

=over 4

=item Finding the end

The first pass is finding the end of the quoted construct. This results
in saving to a safe location a copy of the text (between the starting
and ending delimiters), normalized as necessary to avoid needing to know
what the original delimiters were.

If the construct is a here-doc, the ending delimiter is a line
that has a terminating string as the content.
Therefore C<<<EOF> is
terminated by C<EOF> immediately followed by C<"\n"> and starting
from the first column of the terminating line.
When searching for the terminating line of a here-doc, nothing
is skipped. In other words, lines after the here-doc syntax
are compared with the terminating string line by line.

For the constructs except here-docs, single characters are used as starting
and ending delimiters. If the starting delimiter is an opening punctuation
(that is C<(>, C<[>, C<{>, or C<< < >>), the ending delimiter is the
corresponding closing punctuation (that is C<)>, C<]>, C<}>, or C<< > >>).
If the starting delimiter is an unpaired character like C</> or a closing
punctuation, the ending delimiter is the same as the starting delimiter.
Therefore a C</> terminates a C<qq//> construct, while a C<]> terminates
both C<qq[]> and C<qq]]> constructs.

When searching for single-character delimiters, escaped delimiters
and C<\\> are skipped. For example, while searching for terminating C</>,
combinations of C<\\> and C<\/> are skipped. If the delimiters are
bracketing, nested pairs are also skipped. For example, while searching
for a closing C<]> paired with the opening C<[>, combinations of C<\\>, C<\]>,
and C<\[> are all skipped, and nested C<[> and C<]> are skipped as well.
However, when backslashes are used as the delimiters (like C<qq\\> and
C<tr\\\>), nothing is skipped.
During the search for the end, backslashes that escape delimiters or
other backslashes are removed (exactly speaking, they are not copied to the
safe location).

For constructs with three-part delimiters (C<s///>, C<y///>, and
C<tr///>), the search is repeated once more.
If the first delimiter is not an opening punctuation, the three delimiters must
be the same, such as C<s!!!> and C<tr)))>,
in which case the second delimiter
terminates the left part and starts the right part at once.
If the left part is delimited by bracketing punctuation (that is C<()>,
C<[]>, C<{}>, or C<< <> >>), the right part needs another pair of
delimiters such as C<s(){}> and C<tr[]//>. In these cases, whitespace
and comments are allowed between the two parts, although the comment must follow
at least one whitespace character; otherwise a character expected as the
start of the comment may be regarded as the starting delimiter of the right part.

During this search no attention is paid to the semantics of the construct.
Thus:

    "$hash{"$foo/$bar"}"

or:

    m/
      bar       # NOT a comment, this slash / terminated m//!
     /x

do not form legal quoted expressions. The quoted part ends on the
first C<"> and C</>, and the rest happens to be a syntax error.
Because the slash that terminated C<m//> was followed by a C<SPACE>,
the example above is not C<m//x>, but rather C<m//> with no C</x>
modifier. So the embedded C<#> is interpreted as a literal C<#>.

Also no attention is paid to C<\c\> (multichar control char syntax) during
this search. Thus the second C<\> in C<qq/\c\/> is interpreted as a part
of C<\/>, and the following C</> is not recognized as a delimiter.
Instead, use C<\034> or C<\x1c> at the end of quoted constructs.

=item Interpolation
X<interpolation>

The next step is interpolation in the text obtained, which is now
delimiter-independent. There are multiple cases.

=over 4

=item C<<<'EOF'>

No interpolation is performed.
Note that the combination C<\\> is left intact, since escaped delimiters
are not available for here-docs.
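
The difference from an ordinary single-quoted string can be sketched as
follows (the variable names are illustrative):

```perl
use strict;
use warnings;

# In a single-quoted string, \\ collapses to one backslash.
my $quoted = 'a\\b';

# In a <<'EOF' here-doc, both backslashes are kept.
my $heredoc = <<'EOF';
a\\b
EOF
chomp $heredoc;

print length($quoted), "\n";    # 3
print length($heredoc), "\n";   # 4
```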

=item C<m''>, the pattern of C<s'''>

No interpolation is performed at this stage.
Any backslashed sequences including C<\\> are treated at the stage
of L</"Parsing regular expressions">.

=item C<''>, C<q//>, C<tr'''>, C<y'''>, the replacement of C<s'''>

The only interpolation is removal of C<\> from pairs of C<\\>.
Therefore C<"-"> in C<tr'''> and C<y'''> is treated literally
as a hyphen and no character range is available.
C<\1> in the replacement of C<s'''> does not work as C<$1>.

=item C<tr///>, C<y///>

No variable interpolation occurs. String modifying combinations for
case and quoting such as C<\Q>, C<\U>, and C<\E> are not recognized.
The other escape sequences such as C<\200> and C<\t> and backslashed
characters such as C<\\> and C<\-> are converted to appropriate literals.
The character C<"-"> is treated specially and therefore C<\-> is treated
as a literal C<"-">.

=item C<"">, C<``>, C<qq//>, C<qx//>, C<< <file*glob> >>, C<<<"EOF">

C<\Q>, C<\U>, C<\u>, C<\L>, C<\l>, C<\F> (possibly paired with C<\E>) are
converted to corresponding Perl constructs. Thus, C<"$foo\Qbaz$bar">
is converted to S<C<$foo . (quotemeta("baz" . $bar))>> internally.
The other escape sequences such as C<\200> and C<\t> and backslashed
characters such as C<\\> and C<\-> are replaced with appropriate
expansions.

Let it be stressed that I<whatever falls between C<\Q> and C<\E>>
is interpolated in the usual way. Something like C<"\Q\\E"> has
no C<\E> inside. Instead, it has C<\Q>, C<\\>, and C<E>, so the
result is the same as for C<"\\\\E">. As a general rule, backslashes
between C<\Q> and C<\E> may lead to counterintuitive results. So,
C<"\Q\t\E"> is converted to C<quotemeta("\t")>, which is the same
as C<"\\\t"> (since TAB is not alphanumeric).
Note also that:

    $str = '\t';
    return "\Q$str";

may be closer to the conjectural I<intention> of the writer of C<"\Q\t\E">.

Interpolated scalars and arrays are converted internally to the C<join> and
C<"."> catenation operations. Thus, S<C<"$foo XXX '@arr'">> becomes:

    $foo . " XXX '" . (join $", @arr) . "'";

All operations above are performed simultaneously, left to right.

Because the result of S<C<"\Q I<STRING> \E">> has all metacharacters
quoted, there is no way to insert a literal C<$> or C<@> inside a
C<\Q\E> pair. If protected by C<\>, C<$> will be quoted to become
C<"\\\$">; if not, it is interpreted as the start of an interpolated
scalar.

Note also that the interpolation code needs to make a decision on
where the interpolated scalar ends. For instance, whether
S<C<< "a $x -> {c}" >>> really means:

    "a " . $x . " -> {c}";

or:

    "a " . $x -> {c};

Most of the time, the choice is the longest possible text that does not
include spaces between components and which contains matching braces or
brackets. Because the outcome may be determined by voting based
on heuristic estimators, the result is not strictly predictable.
Fortunately, it's usually correct for ambiguous cases.

=item The replacement of C<s///>

Processing of C<\Q>, C<\U>, C<\u>, C<\L>, C<\l>, C<\F> and interpolation
happens as with C<qq//> constructs.

It is at this step that C<\1> is begrudgingly converted to C<$1> in
the replacement text of C<s///>, in order to correct the incorrigible
I<sed> hackers who haven't picked up the saner idiom yet. A warning
is emitted if the S<C<use warnings>> pragma or the B<-w> command-line flag
(that is, the C<$^W> variable) was set.
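
As a brief illustration of the preferred idiom, here is a minimal sketch
(the sample strings and variable names are invented for this example):
captures are written as C<$1>, C<$2> and so on in the replacement part,
while C<\1> is kept for backreferences inside the pattern itself.

    use strict;
    use warnings;

    # Hypothetical sample data, chosen only for illustration.
    my $text = "hello world";

    # Swap the two words, referring to the captures as $1 and $2.
    (my $swapped = $text) =~ s/(\w+) (\w+)/$2 $1/;
    print "$swapped\n";    # prints "world hello"

    # Inside the pattern, \1 refers back to the first capture group.
    my $doubled = "the the cat";
    $doubled =~ s/\b(\w+) \1\b/$1/;
    print "$doubled\n";    # prints "the cat"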

=item C<RE> in C<m?RE?>, C</RE/>, C<m/RE/>, C<s/RE/foo/>

Processing of C<\Q>, C<\U>, C<\u>, C<\L>, C<\l>, C<\F>, C<\E>,
and interpolation happens (almost) as with C<qq//> constructs.

Processing of C<\N{...}> is also done here, and compiled into an intermediate
form for the regex compiler. (This is because, as mentioned below, the regex
compilation may be done at execution time, and C<\N{...}> is a compile-time
construct.)

However any other combinations of C<\> followed by a character
are not substituted but only skipped, in order to parse them
as regular expressions at the following step.
As C<\c> is skipped at this step, C<@> of C<\c@> in RE is possibly
treated as an array symbol (for example C<@foo>),
even though the same text in C<qq//> gives interpolation of C<\c@>.

Code blocks such as C<(?{BLOCK})> are handled by temporarily passing control
back to the perl parser, in a similar way that an interpolated array
subscript expression such as C<"foo$array[1+f("[xyz")]bar"> would be.

Moreover, inside C<(?{BLOCK})>, S<C<(?# comment )>>, and
a C<#>-comment in a C</x>-regular expression, no processing is
performed whatsoever. This is the first step at which the presence
of the C</x> modifier is relevant.

Interpolation in patterns has several quirks: C<$|>, C<$(>, C<$)>, C<@+>
and C<@-> are not interpolated, and constructs C<$var[SOMETHING]> are
voted (by several different estimators) to be either an array element
or C<$var> followed by an RE alternative. This is where the notation
C<${arr[$bar]}> comes in handy: C</${arr[0-9]}/> is interpreted as
array element C<-9>, not as a regular expression from the variable
C<$arr> followed by a digit, which would be the interpretation of
C</$arr[0-9]/>. Since voting among different estimators may occur,
the result is not predictable.

The lack of processing of C<\\> creates specific restrictions on
the post-processed text. If the delimiter is C</>, one cannot get
the combination C<\/> into the result of this step. C</> will
finish the regular expression, C<\/> will be stripped to C</> on
the previous step, and C<\\/> will be left as is. Because C</> is
equivalent to C<\/> inside a regular expression, this does not
matter unless the delimiter happens to be a character special to the
RE engine, such as in C<s*foo*bar*>, C<m[foo]>, or C<m?foo?>; or an
alphanumeric char, as in:

    m m ^ a \s* b mmx;

In the RE above, which is intentionally obfuscated for illustration, the
delimiter is C<m>, the modifier is C<mx>, and after delimiter-removal the
RE is the same as for S<C<m/ ^ a \s* b /mx>>. There's more than one
reason you're encouraged to restrict your delimiters to non-alphanumeric,
non-whitespace choices.

=back

This step is the last one for all constructs except regular expressions,
which are processed further.

=item Parsing regular expressions
X<regexp, parse>

Previous steps were performed during the compilation of Perl code,
but this one happens at run time, although it may be optimized to
be calculated at compile time if appropriate. After preprocessing
described above, and possibly after evaluation if concatenation,
joining, casing translation, or metaquoting are involved, the
resulting I<string> is passed to the RE engine for compilation.

Whatever happens in the RE engine might be better discussed in L<perlre>,
but for the sake of continuity, we shall do so here.

This is another step where the presence of the C</x> modifier is
relevant. The RE engine scans the string from left to right and
converts it into a finite automaton.

Backslashed characters are either replaced with corresponding
literal strings (as with C<\{>), or else they generate special nodes
in the finite automaton (as with C<\b>). Characters special to the
RE engine (such as C<|>) generate corresponding nodes or groups of
nodes. C<(?#...)> comments are ignored. All the rest is either
converted to literal strings to match, or else is ignored (as is
whitespace and C<#>-style comments if C</x> is present).

Parsing of the bracketed character class construct, C<[...]>, is
rather different from the rule used for the rest of the pattern.
The terminator of this construct is found using the same rules as
for finding the terminator of a C<{}>-delimited construct, the only
exception being that C<]> immediately following C<[> is treated as
though preceded by a backslash.

The terminator of runtime C<(?{...})> is found by temporarily switching
control to the perl parser, which should stop at the point where the
logically balancing terminating C<}> is found.

It is possible to inspect both the string given to the RE engine and the
resulting finite automaton. See the arguments C<debug>/C<debugcolor>
in the S<C<use L<re>>> pragma, as well as Perl's B<-Dr> command-line
switch documented in L<perlrun/"Command Switches">.

=item Optimization of regular expressions
X<regexp, optimization>

This step is listed for completeness only. Since it does not change
semantics, details of this step are not documented and are subject
to change without notice. This step is performed over the finite
automaton that was generated during the previous pass.

It is at this stage that C<split()> silently optimizes C</^/> to
mean C</^/m>.
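
The C<split> special case can be observed directly. The following is a
minimal sketch (the sample string and variable names are invented for this
example) suggesting that C</^/> in C<split> breaks a multi-line string at
every line start, just as C</^/m> does:

    use strict;
    use warnings;

    # Hypothetical sample data, chosen only for illustration.
    my $text = "one\ntwo\nthree\n";

    my @implicit = split /^/,  $text;    # treated by split as /^/m
    my @explicit = split /^/m, $text;

    print scalar(@implicit), "\n";                   # prints 3
    print "same\n" if "@implicit" eq "@explicit";    # prints "same"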

=back

=head2 I/O Operators
X<operator, i/o> X<operator, io> X<io> X<while> X<filehandle>
X<< <> >> X<< <<>> >> X<@ARGV>

There are several I/O operators you should know about.

A string enclosed by backticks (grave accents) first undergoes
double-quote interpolation. It is then interpreted as an external
command, and the output of that command is the value of the
backtick string, as in a shell. In scalar context, a single string
consisting of all output is returned. In list context, a list of
values is returned, one per line of output. (You can set C<$/> to use
a different line terminator.) The command is executed each time the
pseudo-literal is evaluated. The status value of the command is
returned in C<$?> (see L<perlvar> for the interpretation of C<$?>).
Unlike in B<csh>, no translation is done on the return data--newlines
remain newlines. Unlike in any of the shells, single quotes do not
hide variable names in the command from interpretation. To pass a
literal dollar-sign through to the shell you need to hide it with a
backslash. The generalized form of backticks is C<qx//>, or you can
call the L<perlfunc/readpipe> function. (Because
backticks always undergo shell expansion as well, see L<perlsec> for
security concerns.)
X<qx> X<`> X<``> X<backtick> X<glob>

In scalar context, evaluating a filehandle in angle brackets yields
the next line from that file (the newline, if any, included), or
C<undef> at end-of-file or on error. When C<$/> is set to C<undef>
(sometimes known as file-slurp mode) and the file is empty, it
returns C<''> the first time, followed by C<undef> subsequently.

Ordinarily you must assign the returned value to a variable, but
there is one situation where an automatic assignment happens.
If and only if the input symbol is the only thing inside the conditional
of a C<while> statement (even if disguised as a C<for(;;)> loop),
the value is automatically assigned to the global variable C<$_>,
destroying whatever was there previously. (This may seem like an
odd thing to you, but you'll use the construct in almost every Perl
script you write.) The C<$_> variable is not implicitly localized.
You'll have to put a S<C<local $_;>> before the loop if you want that
to happen. Furthermore, if the input symbol or an explicit assignment
of the input symbol to a scalar is used as a C<while>/C<for> condition,
then the condition actually tests for definedness of the expression's
value, not for its regular truth value.

Thus the following lines are equivalent:

    while (defined($_ = <STDIN>)) { print; }
    while ($_ = <STDIN>) { print; }
    while (<STDIN>) { print; }
    for (;<STDIN>;) { print; }
    print while defined($_ = <STDIN>);
    print while ($_ = <STDIN>);
    print while <STDIN>;

This also behaves similarly, but assigns to a lexical variable
instead of to C<$_>:

    while (my $line = <STDIN>) { print $line }

In these loop constructs, the assigned value (whether assignment
is automatic or explicit) is then tested to see whether it is
defined. The defined test avoids problems where the line has a string
value that would be treated as false by Perl; for example a "" or
a C<"0"> with no trailing newline. If you really mean for such values
to terminate the loop, they should be tested for explicitly:

    while (($_ = <STDIN>) ne '0') { ... }
    while (<STDIN>) { last unless $_; ... }

In other boolean contexts, C<< <I<FILEHANDLE>> >> without an
explicit C<defined> test or comparison elicits a warning if the
S<C<use warnings>> pragma or the B<-w>
command-line switch (the C<$^W> variable) is in effect.
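
The effect of the implicit C<defined> test can be seen with a small,
self-contained sketch. It assumes an in-memory filehandle opened on a
scalar (the data and the variable names are invented for this example)
whose last line is a bare C<"0"> with no trailing newline:

    use strict;
    use warnings;

    # Hypothetical input: the final "line" is a lone "0" without a newline.
    my $data = "first\nsecond\n0";
    open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";

    my @seen;
    # The condition is implicitly defined(my $line = <$fh>), so the false
    # value "0" does not end the loop early.
    while (my $line = <$fh>) {
        chomp $line;
        push @seen, $line;
    }
    close $fh;

    print scalar(@seen), " lines read; last one is '$seen[-1]'\n";
    # prints "3 lines read; last one is '0'"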

The filehandles STDIN, STDOUT, and STDERR are predefined. (The
filehandles C<stdin>, C<stdout>, and C<stderr> will also work except
in packages, where they would be interpreted as local identifiers
rather than global.) Additional filehandles may be created with
the C<open()> function, amongst others. See L<perlopentut> and
L<perlfunc/open> for details on this.
X<stdin> X<stdout> X<stderr>

If a C<< <I<FILEHANDLE>> >> is used in a context that is looking for
a list, a list comprising all input lines is returned, one line per
list element. It's easy to grow to a rather large data space this
way, so use with care.

C<< <I<FILEHANDLE>> >> may also be spelled C<readline(*I<FILEHANDLE>)>.
See L<perlfunc/readline>.

The null filehandle C<< <> >> (sometimes called the diamond operator) is
special: it can be used to emulate the
behavior of B<sed> and B<awk>, and any other Unix filter program
that takes a list of filenames, doing the same to each line
of input from all of them. Input from C<< <> >> comes either from
standard input, or from each file listed on the command line. Here's
how it works: the first time C<< <> >> is evaluated, the C<@ARGV> array is
checked, and if it is empty, C<$ARGV[0]> is set to C<"-">, which when opened
gives you standard input. The C<@ARGV> array is then processed as a list
of filenames. The loop

    while (<>) {
        ...                     # code for each line
    }

is equivalent to the following Perl-like pseudo code:

    unshift(@ARGV, '-') unless @ARGV;
    while ($ARGV = shift) {
        open(ARGV, $ARGV);
        while (<ARGV>) {
            ...         # code for each line
        }
    }

except that it isn't so cumbersome to say, and will actually work.
It really does shift the C<@ARGV> array and put the current filename
into the C<$ARGV> variable. It also uses filehandle I<ARGV>
internally.
C<< <> >> is just a synonym for C<< <ARGV> >>, which
is magical. (The pseudo code above doesn't work because it treats
C<< <ARGV> >> as non-magical.)

Since the null filehandle uses the two argument form of L<perlfunc/open>
it interprets special characters, so if you have a script like this:

    while (<>) {
        print;
    }

and call it with S<C<perl dangerous.pl 'rm -rfv *|'>>, it actually opens a
pipe, executes the C<rm> command and reads C<rm>'s output from that pipe.
If you want all items in C<@ARGV> to be interpreted as file names, you
can use the module C<ARGV::readonly> from CPAN, or use the double
diamond bracket:

    while (<<>>) {
        print;
    }

Using double angle brackets inside of a while causes the open to use the
three argument form (with the second argument being C<< < >>), so all
arguments in C<@ARGV> are treated as literal filenames (including C<"-">).
(Note that for convenience, if you use C<< <<>> >> and if C<@ARGV> is
empty, it will still read from the standard input.)

You can modify C<@ARGV> before the first C<< <> >> as long as the array ends up
containing the list of filenames you really want. Line numbers (C<$.>)
continue as though the input were one big happy file. See the example
in L<perlfunc/eof> for how to reset line numbers on each file.

If you want to set C<@ARGV> to your own list of files, go right ahead.
This sets C<@ARGV> to all plain text files if no C<@ARGV> was given:

    @ARGV = grep { -f && -T } glob('*') unless @ARGV;

You can even set them to pipe commands. For example, this automatically
filters compressed arguments through B<gzip>:

    @ARGV = map { /\.(gz|Z)$/ ? "gzip -dc < $_ |" : $_ } @ARGV;

If you want to pass switches into your script, you can use one of the
C<Getopt> modules or put a loop on the front like this:

    while ($_ = $ARGV[0], /^-/) {
        shift;
        last if /^--$/;
        if (/^-D(.*)/) { $debug = $1 }
        if (/^-v/)     { $verbose++  }
        # ...           # other switches
    }

    while (<>) {
        # ...           # code for each line
    }

The C<< <> >> symbol will return C<undef> for end-of-file only once.
If you call it again after this, it will assume you are processing another
C<@ARGV> list, and if you haven't set C<@ARGV>, will read input from STDIN.

If what the angle brackets contain is a simple scalar variable (for example,
C<$foo>), then that variable contains the name of the
filehandle to input from, or its typeglob, or a reference to the
same. For example:

    $fh = \*STDIN;
    $line = <$fh>;

If what's within the angle brackets is neither a filehandle nor a simple
scalar variable containing a filehandle name, typeglob, or typeglob
reference, it is interpreted as a filename pattern to be globbed, and
either a list of filenames or the next filename in the list is returned,
depending on context. This distinction is determined on syntactic
grounds alone. That means C<< <$x> >> is always a C<readline()> from
an indirect handle, but C<< <$hash{key}> >> is always a C<glob()>.
That's because C<$x> is a simple scalar variable, but C<$hash{key}> is
not--it's a hash element. Even C<< <$x > >> (note the extra space)
is treated as C<glob("$x ")>, not C<readline($x)>.

One level of double-quote interpretation is done first, but you can't
say C<< <$foo> >> because that's an indirect filehandle as explained
in the previous paragraph. (In older versions of Perl, programmers
would insert curly brackets to force interpretation as a filename glob:
C<< <${foo}> >>.
These days, it's considered cleaner to call the
internal function directly as C<glob($foo)>, which is probably the right
way to have done it in the first place.) For example:

    while (<*.c>) {
        chmod 0644, $_;
    }

is roughly equivalent to:

    open(FOO, "echo *.c | tr -s ' \t\r\f' '\\012\\012\\012\\012'|");
    while (<FOO>) {
        chomp;
        chmod 0644, $_;
    }

except that the globbing is actually done internally using the standard
C<L<File::Glob>> extension. Of course, the shortest way to do the above is:

    chmod 0644, <*.c>;

A (file)glob evaluates its (embedded) argument only when it is
starting a new list. All values must be read before it will start
over. In list context, this isn't important because you automatically
get them all anyway. However, in scalar context the operator returns
the next value each time it's called, or C<undef> when the list has
run out. As with filehandle reads, an automatic C<defined> is
generated when the glob occurs in the test part of a C<while>,
because legal glob returns (for example,
a file called F<0>) would otherwise
terminate the loop. Again, C<undef> is returned only once. So if
you're expecting a single value from a glob, it is much better to
say

    ($file) = <blurch*>;

than

    $file = <blurch*>;

because the latter will alternate between returning a filename and
returning false.

If you're trying to do variable interpolation, it's definitely better
to use the C<glob()> function, because the older notation can cause people
to become confused with the indirect filehandle notation.

    @files = glob("$dir/*.[ch]");
    @files = glob($files[$i]);

If an angle-bracket-based globbing expression is used as the condition of
a C<while> or C<for> loop, then it will be implicitly assigned to C<$_>.
If either a globbing expression or an explicit assignment of a globbing
expression to a scalar is used as a C<while>/C<for> condition, then
the condition actually tests for definedness of the expression's value,
not for its regular truth value.

=head2 Constant Folding
X<constant folding> X<folding>

Like C, Perl does a certain amount of expression evaluation at
compile time whenever it determines that all arguments to an
operator are static and have no side effects. In particular, string
concatenation happens at compile time between literals that don't do
variable substitution. Backslash interpolation also happens at
compile time. You can say

    'Now is the time for all'
    . "\n"
    . 'good men to come to.'

and this all reduces to one string internally. Likewise, if
you say

    foreach $file (@filenames) {
        if (-s $file > 5 + 100 * 2**16) {  }
    }

the compiler precomputes the number which that expression
represents so that the interpreter won't have to.

=head2 No-ops
X<no-op> X<nop>

Perl doesn't officially have a no-op operator, but the bare constants
C<0> and C<1> are special-cased not to produce a warning in void
context, so you can for example safely do

    1 while foo();

=head2 Bitwise String Operators
X<operator, bitwise, string> X<&.> X<|.> X<^.> X<~.>

Bitstrings of any size may be manipulated by the bitwise operators
(C<~ | & ^>).

If the operands to a binary bitwise op are strings of different
sizes, B<|> and B<^> ops act as though the shorter operand had
additional zero bits on the right, while the B<&> op acts as though
the longer operand were truncated to the length of the shorter.
The granularity for such extension or truncation is one or more
bytes.

    # ASCII-based examples
    print "j p \n" ^ " a h";            # prints "JAPH\n"
    print "JA" | "  ph\n";              # prints "japh\n"
    print "japh\nJunk" & '_____';       # prints "JAPH\n";
    print 'p N$' ^ " E<H\n";            # prints "Perl\n";

If you are intending to manipulate bitstrings, be certain that
you're supplying bitstrings: If an operand is a number, that will imply
a B<numeric> bitwise operation. You may explicitly show which type of
operation you intend by using C<""> or C<0+>, as in the examples below.

    $foo =  150  |  105;        # yields 255  (0x96 | 0x69 is 0xFF)
    $foo = '150' |  105;        # yields 255
    $foo =  150  | '105';       # yields 255
    $foo = '150' | '105';       # yields string '155' (under ASCII)

    $baz = 0+$foo & 0+$bar;     # both ops explicitly numeric
    $biz = "$foo" ^ "$bar";     # both ops explicitly stringy

This somewhat unpredictable behavior can be avoided with the "bitwise"
feature, new in Perl 5.22. You can enable it via S<C<use feature
'bitwise'>> or C<use v5.28>. Before Perl 5.28, it used to emit a warning
in the C<"experimental::bitwise"> category. Under this feature, the four
standard bitwise operators (C<~ | & ^>) are always numeric. Adding a dot
after each operator (C<~. |. &. ^.>) forces it to treat its operands as
strings:

    use feature "bitwise";
    $foo =  150  |  105;        # yields 255  (0x96 | 0x69 is 0xFF)
    $foo = '150' |  105;        # yields 255
    $foo =  150  | '105';       # yields 255
    $foo = '150' | '105';       # yields 255
    $foo =  150  |. 105;        # yields string '155'
    $foo = '150' |. 105;        # yields string '155'
    $foo =  150  |.'105';       # yields string '155'
    $foo = '150' |.'105';       # yields string '155'

    $baz = $foo &  $bar;        # both operands numeric
    $biz = $foo ^. $bar;        # both operands stringy

The assignment variants of these operators (C<&= |= ^= &.= |.= ^.=>)
behave likewise under the feature.
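
The padding and truncation rules for operands of different lengths can be
checked with a short sketch. It assumes Perl 5.28 or later (so that
C<use v5.28> enables the feature without a warning) and ASCII string
values; the variable names are invented for this example:

    use v5.28;      # enables strict and the "bitwise" feature
    use warnings;

    my $short = "AB";
    my $long  = "abcd";

    # |. behaves as if the shorter operand were padded with zero bytes.
    my $or  = $short |. $long;
    # &. behaves as if the longer operand were truncated.
    my $and = $short &. $long;

    printf "or:  '%s' (length %d)\n", $or,  length $or;   # 'abcd', 4
    printf "and: '%s' (length %d)\n", $and, length $and;  # 'AB', 2

    say "150" |  "105";     # numeric OR, prints 255
    say "150" |. "105";     # string OR, prints 155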

It is a fatal error if an operand contains a character whose ordinal
value is above 0xFF, and hence not expressible except in UTF-8. The
operation is performed on a non-UTF-8 copy for other operands encoded in
UTF-8. See L<perlunicode/Byte and Character Semantics>.

See L<perlfunc/vec> for information on how to manipulate individual bits
in a bit vector.

=head2 Integer Arithmetic
X<integer>

By default, Perl assumes that it must do most of its arithmetic in
floating point. But by saying

    use integer;

you may tell the compiler to use integer operations
(see L<integer> for a detailed explanation) from here to the end of
the enclosing BLOCK. An inner BLOCK may countermand this by saying

    no integer;

which lasts until the end of that BLOCK. Note that this doesn't
mean everything is an integer, merely that Perl will use integer
operations for arithmetic, comparison, and bitwise operators. For
example, even under S<C<use integer>>, if you take the C<sqrt(2)>, you'll
still get C<1.4142135623731> or so.

Used on numbers, the bitwise operators (C<&> C<|> C<^> C<~> C<< << >>
C<< >> >>) always produce integral results. (But see also
L</Bitwise String Operators>.) However, S<C<use integer>> still has meaning for
them. By default, their results are interpreted as unsigned integers, but
if S<C<use integer>> is in effect, their results are interpreted
as signed integers. For example, C<~0> usually evaluates to a large
integral value. However, S<C<use integer; ~0>> is C<-1> on two's-complement
machines.

=head2 Floating-point Arithmetic
X<floating-point> X<floating point> X<float> X<real>

While S<C<use integer>> provides integer-only arithmetic, there is no
analogous mechanism to provide automatic rounding or truncation to a
certain number of decimal places.
For rounding to a certain number
of digits, C<sprintf()> or C<printf()> is usually the easiest route.
See L<perlfaq4>.

Floating-point numbers are only approximations to what a mathematician
would call real numbers. There are infinitely more reals than floats,
so some corners must be cut. For example:

    printf "%.20g\n", 123456789123456789;
    #        produces 123456789123456784

Testing for exact floating-point equality or inequality is not a
good idea. Here's a (relatively expensive) work-around to compare
whether two floating-point numbers are equal to a particular number of
decimal places. See Knuth, volume II, for a more robust treatment of
this topic.

    sub fp_equal {
        my ($X, $Y, $POINTS) = @_;
        my ($tX, $tY);
        $tX = sprintf("%.${POINTS}g", $X);
        $tY = sprintf("%.${POINTS}g", $Y);
        return $tX eq $tY;
    }

The POSIX module (part of the standard perl distribution) implements
C<ceil()>, C<floor()>, and other mathematical and trigonometric functions.
The C<L<Math::Complex>> module (part of the standard perl distribution)
defines mathematical functions that work on both the reals and the
imaginary numbers. C<Math::Complex> is not as efficient as POSIX, but
POSIX can't work with complex numbers.

Rounding in financial applications can have serious implications, and
the rounding method used should be specified precisely. In these
cases, it probably pays not to trust whichever system rounding is
being used by Perl, but to instead implement the rounding function you
need yourself.

=head2 Bigger Numbers
X<number, arbitrary precision>

The standard C<L<Math::BigInt>>, C<L<Math::BigRat>>, and
C<L<Math::BigFloat>> modules,
along with the C<bignum>, C<bigint>, and C<bigrat> pragmas, provide
variable-precision arithmetic and overloaded operators, although
they're currently pretty slow.
At the cost of some space and
considerable speed, they avoid the normal pitfalls associated with
limited-precision representations.

    use 5.010;
    use bigint;  # easy interface to Math::BigInt
    $x = 123456789123456789;
    say $x * $x;
    +15241578780673678515622620750190521

Or with rationals:

    use 5.010;
    use bigrat;
    $x = 3/22;
    $y = 4/6;
    say "x/y is ", $x/$y;
    say "x*y is ", $x*$y;
    x/y is 9/44
    x*y is 1/11

Several modules let you calculate with unlimited or fixed precision
(bound only by memory and CPU time). There
are also some non-standard modules that
provide faster implementations via external C libraries.

Here is a short, but incomplete summary:

    Math::String           treat string sequences like numbers
    Math::FixedPrecision   calculate with a fixed precision
    Math::Currency         for currency calculations
    Bit::Vector            manipulate bit vectors fast (uses C)
    Math::BigIntFast       Bit::Vector wrapper for big numbers
    Math::Pari             provides access to the Pari C library
    Math::Cephes           uses the external Cephes C library (no
                           big numbers)
    Math::Cephes::Fraction fractions via the Cephes library
    Math::GMP              another one using an external C library
    Math::GMPz             an alternative interface to libgmp's big ints
    Math::GMPq             an interface to libgmp's fraction numbers
    Math::GMPf             an interface to libgmp's floating point numbers

Choose wisely.
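
If a lexically scoped pragma is not wanted, the standard C<Math::BigInt>
module can also be used directly through its object interface. The
following is a minimal sketch under that assumption (the variable names
are invented for this example); it repeats the multiplication shown
earlier in this section:

    use strict;
    use warnings;
    use Math::BigInt;

    # Build the value from a string so that no precision is lost
    # before the object exists.
    my $x      = Math::BigInt->new("123456789123456789");
    my $square = $x * $x;    # overloaded "*" returns a new object

    print "$square\n";       # prints 15241578780673678515622620750190521

    # For comparison, the native product does not fit in a 64-bit
    # integer, so Perl computes it in floating point and the low
    # digits are only approximate.
    printf "%.0f\n", 123456789123456789 * 123456789123456789;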

=head1 APPENDIX

=head2 List of Extra Paired Delimiters

The complete list of accepted paired delimiters as of Unicode 14.0 is:

 (  )    U+0028, U+0029   LEFT/RIGHT PARENTHESIS
 <  >    U+003C, U+003E   LESS-THAN/GREATER-THAN SIGN
 [  ]    U+005B, U+005D   LEFT/RIGHT SQUARE BRACKET
 {  }    U+007B, U+007D   LEFT/RIGHT CURLY BRACKET
 «  »    U+00AB, U+00BB   LEFT/RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
 »  «    U+00BB, U+00AB   RIGHT/LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
 ༺  ༻    U+0F3A, U+0F3B   TIBETAN MARK GUG RTAGS GYON,  TIBETAN MARK GUG
                          RTAGS GYAS
 ༼  ༽    U+0F3C, U+0F3D   TIBETAN MARK ANG KHANG GYON,  TIBETAN MARK ANG
                          KHANG GYAS
 ᚛  ᚜    U+169B, U+169C   OGHAM FEATHER MARK,  OGHAM REVERSED FEATHER MARK
 ‘  ’    U+2018, U+2019   LEFT/RIGHT SINGLE QUOTATION MARK
 ’  ‘    U+2019, U+2018   RIGHT/LEFT SINGLE QUOTATION MARK
 “  ”    U+201C, U+201D   LEFT/RIGHT DOUBLE QUOTATION MARK
 ”  “    U+201D, U+201C   RIGHT/LEFT DOUBLE QUOTATION MARK
 ‵  ′    U+2035, U+2032   REVERSED PRIME,  PRIME
 ‶  ″    U+2036, U+2033   REVERSED DOUBLE PRIME,  DOUBLE PRIME
 ‷  ‴    U+2037, U+2034   REVERSED TRIPLE PRIME,  TRIPLE PRIME
 ‹  ›    U+2039, U+203A   SINGLE LEFT/RIGHT-POINTING ANGLE QUOTATION MARK
 ›  ‹    U+203A, U+2039   SINGLE RIGHT/LEFT-POINTING ANGLE QUOTATION MARK
 ⁅  ⁆    U+2045, U+2046   LEFT/RIGHT SQUARE BRACKET WITH QUILL
 ⁍  ⁌    U+204D, U+204C   BLACK RIGHT/LEFTWARDS BULLET
 ⁽  ⁾    U+207D, U+207E   SUPERSCRIPT LEFT/RIGHT PARENTHESIS
 ₍  ₎    U+208D, U+208E   SUBSCRIPT LEFT/RIGHT PARENTHESIS
 →  ←    U+2192, U+2190   RIGHT/LEFTWARDS ARROW
 ↛  ↚    U+219B, U+219A   RIGHT/LEFTWARDS ARROW WITH STROKE
 ↝  ↜    U+219D, U+219C   RIGHT/LEFTWARDS WAVE ARROW
 ↠  ↞    U+21A0, U+219E   RIGHT/LEFTWARDS TWO HEADED ARROW
 ↣  ↢    U+21A3, U+21A2   RIGHT/LEFTWARDS ARROW WITH TAIL
 ↦  ↤    U+21A6, U+21A4   RIGHT/LEFTWARDS ARROW FROM BAR
 ↪  ↩    U+21AA, U+21A9   RIGHT/LEFTWARDS ARROW WITH HOOK
 ↬  ↫    U+21AC, U+21AB   RIGHT/LEFTWARDS ARROW WITH LOOP
 ↱  ↰    U+21B1, U+21B0   UPWARDS ARROW WITH TIP RIGHT/LEFTWARDS
 ↳  ↲    U+21B3, U+21B2   DOWNWARDS ARROW WITH TIP RIGHT/LEFTWARDS
 ⇀  ↼    U+21C0, U+21BC   RIGHT/LEFTWARDS HARPOON WITH BARB UPWARDS
 ⇁  ↽    U+21C1, U+21BD   RIGHT/LEFTWARDS HARPOON WITH BARB DOWNWARDS
 ⇉  ⇇    U+21C9, U+21C7   RIGHT/LEFTWARDS PAIRED ARROWS
 ⇏  ⇍    U+21CF, U+21CD   RIGHT/LEFTWARDS DOUBLE ARROW WITH STROKE
 ⇒  ⇐    U+21D2, U+21D0   RIGHT/LEFTWARDS DOUBLE ARROW
 ⇛  ⇚    U+21DB, U+21DA   RIGHT/LEFTWARDS TRIPLE ARROW
 ⇝  ⇜    U+21DD, U+21DC   RIGHT/LEFTWARDS SQUIGGLE ARROW
 ⇢  ⇠    U+21E2, U+21E0   RIGHT/LEFTWARDS DASHED ARROW
 ⇥  ⇤    U+21E5, U+21E4   RIGHT/LEFTWARDS ARROW TO BAR
 ⇨  ⇦    U+21E8, U+21E6   RIGHT/LEFTWARDS WHITE ARROW
 ⇴  ⬰    U+21F4, U+2B30   RIGHT/LEFT ARROW WITH SMALL CIRCLE
 ⇶  ⬱    U+21F6, U+2B31   THREE RIGHT/LEFTWARDS ARROWS
 ⇸  ⇷    U+21F8, U+21F7   RIGHT/LEFTWARDS ARROW WITH VERTICAL STROKE
 ⇻  ⇺    U+21FB, U+21FA   RIGHT/LEFTWARDS ARROW WITH DOUBLE VERTICAL
                          STROKE
 ⇾  ⇽    U+21FE, U+21FD   RIGHT/LEFTWARDS OPEN-HEADED ARROW
 ∈  ∋    U+2208, U+220B   ELEMENT OF,  CONTAINS AS MEMBER
 ∉  ∌    U+2209, U+220C   NOT AN ELEMENT OF,  DOES NOT CONTAIN AS MEMBER
 ∊  ∍    U+220A, U+220D   SMALL ELEMENT OF,  SMALL CONTAINS AS MEMBER
 ≤  ≥    U+2264, U+2265   LESS-THAN/GREATER-THAN OR EQUAL TO
 ≦  ≧    U+2266, U+2267   LESS-THAN/GREATER-THAN OVER EQUAL TO
 ≨  ≩    U+2268, U+2269   LESS-THAN/GREATER-THAN BUT NOT EQUAL TO
 ≪  ≫    U+226A, U+226B   MUCH LESS-THAN/GREATER-THAN
 ≮  ≯    U+226E, U+226F   NOT LESS-THAN/GREATER-THAN
 ≰  ≱    U+2270, U+2271   NEITHER LESS-THAN/GREATER-THAN NOR EQUAL TO
 ≲  ≳    U+2272, U+2273   LESS-THAN/GREATER-THAN OR EQUIVALENT TO
 ≴  ≵    U+2274, U+2275   NEITHER LESS-THAN/GREATER-THAN NOR EQUIVALENT TO
 ≺  ≻    U+227A, U+227B   PRECEDES/SUCCEEDS
 ≼  ≽    U+227C, U+227D   PRECEDES/SUCCEEDS OR EQUAL TO
 ≾  ≿    U+227E, U+227F   PRECEDES/SUCCEEDS OR EQUIVALENT TO
 ⊀  ⊁    U+2280, U+2281   DOES NOT PRECEDE/SUCCEED
 ⊂  ⊃    U+2282, U+2283   SUBSET/SUPERSET OF
 ⊄  ⊅    U+2284, U+2285   NOT A SUBSET/SUPERSET OF
 ⊆  ⊇    U+2286, U+2287   SUBSET/SUPERSET OF OR EQUAL TO
 ⊈  ⊉    U+2288, U+2289   NEITHER A SUBSET/SUPERSET OF NOR EQUAL TO
 ⊊  ⊋    U+228A, U+228B   SUBSET/SUPERSET OF WITH NOT EQUAL TO
 ⊣  ⊢    U+22A3, U+22A2   LEFT/RIGHT TACK
 ⊦  ⫞    U+22A6, U+2ADE   ASSERTION,  SHORT LEFT TACK
 ⊨  ⫤    U+22A8, U+2AE4   TRUE,  VERTICAL BAR DOUBLE LEFT TURNSTILE
 ⊩  ⫣    U+22A9, U+2AE3   FORCES,  DOUBLE VERTICAL BAR LEFT TURNSTILE
 ⊰  ⊱    U+22B0, U+22B1   PRECEDES/SUCCEEDS UNDER RELATION
 ⋐  ⋑    U+22D0, U+22D1   DOUBLE SUBSET/SUPERSET
 ⋖  ⋗    U+22D6, U+22D7   LESS-THAN/GREATER-THAN WITH DOT
 ⋘  ⋙    U+22D8, U+22D9   VERY MUCH LESS-THAN/GREATER-THAN
 ⋜  ⋝    U+22DC, U+22DD   EQUAL TO OR LESS-THAN/GREATER-THAN
 ⋞  ⋟    U+22DE, U+22DF   EQUAL TO OR PRECEDES/SUCCEEDS
 ⋠  ⋡    U+22E0, U+22E1   DOES NOT PRECEDE/SUCCEED OR EQUAL
 ⋦  ⋧    U+22E6, U+22E7   LESS-THAN/GREATER-THAN BUT NOT EQUIVALENT TO
 ⋨  ⋩    U+22E8, U+22E9   PRECEDES/SUCCEEDS BUT NOT EQUIVALENT TO
 ⋲  ⋺    U+22F2, U+22FA   ELEMENT OF/CONTAINS WITH LONG HORIZONTAL STROKE
 ⋳  ⋻    U+22F3, U+22FB   ELEMENT OF/CONTAINS WITH VERTICAL BAR AT END OF
                          HORIZONTAL STROKE
 ⋴  ⋼    U+22F4, U+22FC   SMALL ELEMENT OF/CONTAINS WITH VERTICAL BAR AT
                          END OF HORIZONTAL STROKE
 ⋶  ⋽    U+22F6, U+22FD   ELEMENT OF/CONTAINS WITH OVERBAR
 ⋷  ⋾    U+22F7, U+22FE   SMALL ELEMENT OF/CONTAINS WITH OVERBAR
 ⌈  ⌉    U+2308, U+2309   LEFT/RIGHT CEILING
 ⌊  ⌋    U+230A, U+230B   LEFT/RIGHT FLOOR
 ⌦  ⌫    U+2326, U+232B   ERASE TO THE RIGHT/LEFT
 〈  〉    U+2329, U+232A   LEFT/RIGHT-POINTING ANGLE BRACKET
 ⍈  ⍇    U+2348, U+2347   APL FUNCTIONAL SYMBOL QUAD RIGHT/LEFTWARDS ARROW
 ⏩  ⏪    U+23E9, U+23EA   BLACK RIGHT/LEFT-POINTING DOUBLE TRIANGLE
 ⏭  ⏮    U+23ED, U+23EE   BLACK RIGHT/LEFT-POINTING DOUBLE TRIANGLE WITH
                          VERTICAL BAR
 ☛  ☚    U+261B, U+261A   BLACK RIGHT/LEFT POINTING INDEX
 ☞  ☜    U+261E, U+261C   WHITE RIGHT/LEFT POINTING INDEX
 ⚞  ⚟    U+269E, U+269F   THREE LINES CONVERGING RIGHT/LEFT
 ❨  ❩    U+2768, U+2769   MEDIUM LEFT/RIGHT PARENTHESIS ORNAMENT
 ❪  ❫    U+276A, U+276B   MEDIUM FLATTENED LEFT/RIGHT PARENTHESIS ORNAMENT
 ❬
❭ U+276C, U+276D MEDIUM LEFT/RIGHT-POINTING ANGLE BRACKET 3951 ORNAMENT 3952 ❮ ❯ U+276E, U+276F HEAVY LEFT/RIGHT-POINTING ANGLE QUOTATION MARK 3953 ORNAMENT 3954 ❰ ❱ U+2770, U+2771 HEAVY LEFT/RIGHT-POINTING ANGLE BRACKET ORNAMENT 3955 ❲ ❳ U+2772, U+2773 LIGHT LEFT/RIGHT TORTOISE SHELL BRACKET ORNAMENT 3956 ❴ ❵ U+2774, U+2775 MEDIUM LEFT/RIGHT CURLY BRACKET ORNAMENT 3957 ⟃ ⟄ U+27C3, U+27C4 OPEN SUBSET/SUPERSET 3958 ⟅ ⟆ U+27C5, U+27C6 LEFT/RIGHT S-SHAPED BAG DELIMITER 3959 ⟈ ⟉ U+27C8, U+27C9 REVERSE SOLIDUS PRECEDING SUBSET, SUPERSET 3960 PRECEDING SOLIDUS 3961 ⟞ ⟝ U+27DE, U+27DD LONG LEFT/RIGHT TACK 3962 ⟦ ⟧ U+27E6, U+27E7 MATHEMATICAL LEFT/RIGHT WHITE SQUARE BRACKET 3963 ⟨ ⟩ U+27E8, U+27E9 MATHEMATICAL LEFT/RIGHT ANGLE BRACKET 3964 ⟪ ⟫ U+27EA, U+27EB MATHEMATICAL LEFT/RIGHT DOUBLE ANGLE BRACKET 3965 ⟬ ⟭ U+27EC, U+27ED MATHEMATICAL LEFT/RIGHT WHITE TORTOISE SHELL 3966 BRACKET 3967 ⟮ ⟯ U+27EE, U+27EF MATHEMATICAL LEFT/RIGHT FLATTENED PARENTHESIS 3968 ⟴ ⬲ U+27F4, U+2B32 RIGHT/LEFT ARROW WITH CIRCLED PLUS 3969 ⟶ ⟵ U+27F6, U+27F5 LONG RIGHT/LEFTWARDS ARROW 3970 ⟹ ⟸ U+27F9, U+27F8 LONG RIGHT/LEFTWARDS DOUBLE ARROW 3971 ⟼ ⟻ U+27FC, U+27FB LONG RIGHT/LEFTWARDS ARROW FROM BAR 3972 ⟾ ⟽ U+27FE, U+27FD LONG RIGHT/LEFTWARDS DOUBLE ARROW FROM BAR 3973 ⟿ ⬳ U+27FF, U+2B33 LONG RIGHT/LEFTWARDS SQUIGGLE ARROW 3974 ⤀ ⬴ U+2900, U+2B34 RIGHT/LEFTWARDS TWO-HEADED ARROW WITH VERTICAL 3975 STROKE 3976 ⤁ ⬵ U+2901, U+2B35 RIGHT/LEFTWARDS TWO-HEADED ARROW WITH DOUBLE 3977 VERTICAL STROKE 3978 ⤃ ⤂ U+2903, U+2902 RIGHT/LEFTWARDS DOUBLE ARROW WITH VERTICAL 3979 STROKE 3980 ⤅ ⬶ U+2905, U+2B36 RIGHT/LEFTWARDS TWO-HEADED ARROW FROM BAR 3981 ⤇ ⤆ U+2907, U+2906 RIGHT/LEFTWARDS DOUBLE ARROW FROM BAR 3982 ⤍ ⤌ U+290D, U+290C RIGHT/LEFTWARDS DOUBLE DASH ARROW 3983 ⤏ ⤎ U+290F, U+290E RIGHT/LEFTWARDS TRIPLE DASH ARROW 3984 ⤐ ⬷ U+2910, U+2B37 RIGHT/LEFTWARDS TWO-HEADED TRIPLE DASH ARROW 3985 ⤑ ⬸ U+2911, U+2B38 RIGHT/LEFTWARDS ARROW WITH DOTTED STEM 3986 ⤔ ⬹ U+2914, U+2B39 RIGHT/LEFTWARDS ARROW WITH TAIL 
WITH VERTICAL 3987 STROKE 3988 ⤕ ⬺ U+2915, U+2B3A RIGHT/LEFTWARDS ARROW WITH TAIL WITH DOUBLE 3989 VERTICAL STROKE 3990 ⤖ ⬻ U+2916, U+2B3B RIGHT/LEFTWARDS TWO-HEADED ARROW WITH TAIL 3991 ⤗ ⬼ U+2917, U+2B3C RIGHT/LEFTWARDS TWO-HEADED ARROW WITH TAIL WITH 3992 VERTICAL STROKE 3993 ⤘ ⬽ U+2918, U+2B3D RIGHT/LEFTWARDS TWO-HEADED ARROW WITH TAIL WITH 3994 DOUBLE VERTICAL STROKE 3995 ⤚ ⤙ U+291A, U+2919 RIGHT/LEFTWARDS ARROW-TAIL 3996 ⤜ ⤛ U+291C, U+291B RIGHT/LEFTWARDS DOUBLE ARROW-TAIL 3997 ⤞ ⤝ U+291E, U+291D RIGHT/LEFTWARDS ARROW TO BLACK DIAMOND 3998 ⤠ ⤟ U+2920, U+291F RIGHT/LEFTWARDS ARROW FROM BAR TO BLACK DIAMOND 3999 ⤳ ⬿ U+2933, U+2B3F WAVE ARROW POINTING DIRECTLY RIGHT/LEFT 4000 ⤷ ⤶ U+2937, U+2936 ARROW POINTING DOWNWARDS THEN CURVING RIGHT/ 4001 LEFTWARDS 4002 ⥅ ⥆ U+2945, U+2946 RIGHT/LEFTWARDS ARROW WITH PLUS BELOW 4003 ⥇ ⬾ U+2947, U+2B3E RIGHT/LEFTWARDS ARROW THROUGH X 4004 ⥓ ⥒ U+2953, U+2952 RIGHT/LEFTWARDS HARPOON WITH BARB UP TO BAR 4005 ⥗ ⥖ U+2957, U+2956 RIGHT/LEFTWARDS HARPOON WITH BARB DOWN TO BAR 4006 ⥛ ⥚ U+295B, U+295A RIGHT/LEFTWARDS HARPOON WITH BARB UP FROM BAR 4007 ⥟ ⥞ U+295F, U+295E RIGHT/LEFTWARDS HARPOON WITH BARB DOWN FROM BAR 4008 ⥤ ⥢ U+2964, U+2962 RIGHT/LEFTWARDS HARPOON WITH BARB UP ABOVE 4009 RIGHT/LEFTWARDS HARPOON WITH BARB DOWN 4010 ⥬ ⥪ U+296C, U+296A RIGHT/LEFTWARDS HARPOON WITH BARB UP ABOVE LONG 4011 DASH 4012 ⥭ ⥫ U+296D, U+296B RIGHT/LEFTWARDS HARPOON WITH BARB DOWN BELOW 4013 LONG DASH 4014 ⥱ ⭀ U+2971, U+2B40 EQUALS SIGN ABOVE RIGHT/LEFTWARDS ARROW 4015 ⥲ ⭁ U+2972, U+2B41 TILDE OPERATOR ABOVE RIGHTWARDS ARROW, REVERSE 4016 TILDE OPERATOR ABOVE LEFTWARDS ARROW 4017 ⥴ ⭋ U+2974, U+2B4B RIGHTWARDS ARROW ABOVE TILDE OPERATOR, 4018 LEFTWARDS ARROW ABOVE REVERSE TILDE OPERATOR 4019 ⥵ ⭂ U+2975, U+2B42 RIGHTWARDS ARROW ABOVE ALMOST EQUAL TO, 4020 LEFTWARDS ARROW ABOVE REVERSE ALMOST EQUAL TO 4021 ⥹ ⥻ U+2979, U+297B SUBSET/SUPERSET ABOVE RIGHT/LEFTWARDS ARROW 4022 ⦃ ⦄ U+2983, U+2984 LEFT/RIGHT WHITE CURLY BRACKET 4023 ⦅ ⦆ U+2985, U+2986 
LEFT/RIGHT WHITE PARENTHESIS 4024 ⦇ ⦈ U+2987, U+2988 Z NOTATION LEFT/RIGHT IMAGE BRACKET 4025 ⦉ ⦊ U+2989, U+298A Z NOTATION LEFT/RIGHT BINDING BRACKET 4026 ⦋ ⦌ U+298B, U+298C LEFT/RIGHT SQUARE BRACKET WITH UNDERBAR 4027 ⦍ ⦐ U+298D, U+2990 LEFT/RIGHT SQUARE BRACKET WITH TICK IN TOP 4028 CORNER 4029 ⦏ ⦎ U+298F, U+298E LEFT/RIGHT SQUARE BRACKET WITH TICK IN BOTTOM 4030 CORNER 4031 ⦑ ⦒ U+2991, U+2992 LEFT/RIGHT ANGLE BRACKET WITH DOT 4032 ⦓ ⦔ U+2993, U+2994 LEFT/RIGHT ARC LESS-THAN/GREATER-THAN BRACKET 4033 ⦕ ⦖ U+2995, U+2996 DOUBLE LEFT/RIGHT ARC GREATER-THAN/LESS-THAN 4034 BRACKET 4035 ⦗ ⦘ U+2997, U+2998 LEFT/RIGHT BLACK TORTOISE SHELL BRACKET 4036 ⦨ ⦩ U+29A8, U+29A9 MEASURED ANGLE WITH OPEN ARM ENDING IN ARROW 4037 POINTING UP AND RIGHT/LEFT 4038 ⦪ ⦫ U+29AA, U+29AB MEASURED ANGLE WITH OPEN ARM ENDING IN ARROW 4039 POINTING DOWN AND RIGHT/LEFT 4040 ⦳ ⦴ U+29B3, U+29B4 EMPTY SET WITH RIGHT/LEFT ARROW ABOVE 4041 ⧀ ⧁ U+29C0, U+29C1 CIRCLED LESS-THAN/GREATER-THAN 4042 ⧘ ⧙ U+29D8, U+29D9 LEFT/RIGHT WIGGLY FENCE 4043 ⧚ ⧛ U+29DA, U+29DB LEFT/RIGHT DOUBLE WIGGLY FENCE 4044 ⧼ ⧽ U+29FC, U+29FD LEFT/RIGHT-POINTING CURVED ANGLE BRACKET 4045 ⩹ ⩺ U+2A79, U+2A7A LESS-THAN/GREATER-THAN WITH CIRCLE INSIDE 4046 ⩻ ⩼ U+2A7B, U+2A7C LESS-THAN/GREATER-THAN WITH QUESTION MARK ABOVE 4047 ⩽ ⩾ U+2A7D, U+2A7E LESS-THAN/GREATER-THAN OR SLANTED EQUAL TO 4048 ⩿ ⪀ U+2A7F, U+2A80 LESS-THAN/GREATER-THAN OR SLANTED EQUAL TO WITH 4049 DOT INSIDE 4050 ⪁ ⪂ U+2A81, U+2A82 LESS-THAN/GREATER-THAN OR SLANTED EQUAL TO WITH 4051 DOT ABOVE 4052 ⪃ ⪄ U+2A83, U+2A84 LESS-THAN/GREATER-THAN OR SLANTED EQUAL TO WITH 4053 DOT ABOVE RIGHT/LEFT 4054 ⪅ ⪆ U+2A85, U+2A86 LESS-THAN/GREATER-THAN OR APPROXIMATE 4055 ⪇ ⪈ U+2A87, U+2A88 LESS-THAN/GREATER-THAN AND SINGLE-LINE NOT 4056 EQUAL TO 4057 ⪉ ⪊ U+2A89, U+2A8A LESS-THAN/GREATER-THAN AND NOT APPROXIMATE 4058 ⪍ ⪎ U+2A8D, U+2A8E LESS-THAN/GREATER-THAN ABOVE SIMILAR OR EQUAL 4059 ⪕ ⪖ U+2A95, U+2A96 SLANTED EQUAL TO OR LESS-THAN/GREATER-THAN 4060 ⪗ ⪘ U+2A97, U+2A98 SLANTED 
EQUAL TO OR LESS-THAN/GREATER-THAN WITH 4061 DOT INSIDE 4062 ⪙ ⪚ U+2A99, U+2A9A DOUBLE-LINE EQUAL TO OR LESS-THAN/GREATER-THAN 4063 ⪛ ⪜ U+2A9B, U+2A9C DOUBLE-LINE SLANTED EQUAL TO OR LESS-THAN/ 4064 GREATER-THAN 4065 ⪝ ⪞ U+2A9D, U+2A9E SIMILAR OR LESS-THAN/GREATER-THAN 4066 ⪟ ⪠ U+2A9F, U+2AA0 SIMILAR ABOVE LESS-THAN/GREATER-THAN ABOVE 4067 EQUALS SIGN 4068 ⪡ ⪢ U+2AA1, U+2AA2 DOUBLE NESTED LESS-THAN/GREATER-THAN 4069 ⪦ ⪧ U+2AA6, U+2AA7 LESS-THAN/GREATER-THAN CLOSED BY CURVE 4070 ⪨ ⪩ U+2AA8, U+2AA9 LESS-THAN/GREATER-THAN CLOSED BY CURVE ABOVE 4071 SLANTED EQUAL 4072 ⪪ ⪫ U+2AAA, U+2AAB SMALLER THAN/LARGER THAN 4073 ⪬ ⪭ U+2AAC, U+2AAD SMALLER THAN/LARGER THAN OR EQUAL TO 4074 ⪯ ⪰ U+2AAF, U+2AB0 PRECEDES/SUCCEEDS ABOVE SINGLE-LINE EQUALS SIGN 4075 ⪱ ⪲ U+2AB1, U+2AB2 PRECEDES/SUCCEEDS ABOVE SINGLE-LINE NOT EQUAL TO 4076 ⪳ ⪴ U+2AB3, U+2AB4 PRECEDES/SUCCEEDS ABOVE EQUALS SIGN 4077 ⪵ ⪶ U+2AB5, U+2AB6 PRECEDES/SUCCEEDS ABOVE NOT EQUAL TO 4078 ⪷ ⪸ U+2AB7, U+2AB8 PRECEDES/SUCCEEDS ABOVE ALMOST EQUAL TO 4079 ⪹ ⪺ U+2AB9, U+2ABA PRECEDES/SUCCEEDS ABOVE NOT ALMOST EQUAL TO 4080 ⪻ ⪼ U+2ABB, U+2ABC DOUBLE PRECEDES/SUCCEEDS 4081 ⪽ ⪾ U+2ABD, U+2ABE SUBSET/SUPERSET WITH DOT 4082 ⪿ ⫀ U+2ABF, U+2AC0 SUBSET/SUPERSET WITH PLUS SIGN BELOW 4083 ⫁ ⫂ U+2AC1, U+2AC2 SUBSET/SUPERSET WITH MULTIPLICATION SIGN BELOW 4084 ⫃ ⫄ U+2AC3, U+2AC4 SUBSET/SUPERSET OF OR EQUAL TO WITH DOT ABOVE 4085 ⫅ ⫆ U+2AC5, U+2AC6 SUBSET/SUPERSET OF ABOVE EQUALS SIGN 4086 ⫇ ⫈ U+2AC7, U+2AC8 SUBSET/SUPERSET OF ABOVE TILDE OPERATOR 4087 ⫉ ⫊ U+2AC9, U+2ACA SUBSET/SUPERSET OF ABOVE ALMOST EQUAL TO 4088 ⫋ ⫌ U+2ACB, U+2ACC SUBSET/SUPERSET OF ABOVE NOT EQUAL TO 4089 ⫏ ⫐ U+2ACF, U+2AD0 CLOSED SUBSET/SUPERSET 4090 ⫑ ⫒ U+2AD1, U+2AD2 CLOSED SUBSET/SUPERSET OR EQUAL TO 4091 ⫕ ⫖ U+2AD5, U+2AD6 SUBSET/SUPERSET ABOVE SUBSET/SUPERSET 4092 ⫥ ⊫ U+2AE5, U+22AB DOUBLE VERTICAL BAR DOUBLE LEFT/RIGHT TURNSTILE 4093 ⫷ ⫸ U+2AF7, U+2AF8 TRIPLE NESTED LESS-THAN/GREATER-THAN 4094 ⫹ ⫺ U+2AF9, U+2AFA DOUBLE-LINE SLANTED LESS-THAN/GREATER-THAN OR 
4095 EQUAL TO 4096 ⭆ ⭅ U+2B46, U+2B45 RIGHT/LEFTWARDS QUADRUPLE ARROW 4097 ⭇ ⭉ U+2B47, U+2B49 REVERSE TILDE OPERATOR ABOVE RIGHTWARDS ARROW, 4098 TILDE OPERATOR ABOVE LEFTWARDS ARROW 4099 ⭈ ⭊ U+2B48, U+2B4A RIGHTWARDS ARROW ABOVE REVERSE ALMOST EQUAL 4100 TO, LEFTWARDS ARROW ABOVE ALMOST EQUAL TO 4101 ⭌ ⥳ U+2B4C, U+2973 RIGHTWARDS ARROW ABOVE REVERSE TILDE OPERATOR, 4102 LEFTWARDS ARROW ABOVE TILDE OPERATOR 4103 ⭢ ⭠ U+2B62, U+2B60 RIGHT/LEFTWARDS TRIANGLE-HEADED ARROW 4104 ⭬ ⭪ U+2B6C, U+2B6A RIGHT/LEFTWARDS TRIANGLE-HEADED DASHED ARROW 4105 ⭲ ⭰ U+2B72, U+2B70 RIGHT/LEFTWARDS TRIANGLE-HEADED ARROW TO BAR 4106 ⭼ ⭺ U+2B7C, U+2B7A RIGHT/LEFTWARDS TRIANGLE-HEADED ARROW WITH 4107 DOUBLE VERTICAL STROKE 4108 ⮆ ⮄ U+2B86, U+2B84 RIGHT/LEFTWARDS TRIANGLE-HEADED PAIRED ARROWS 4109 ⮊ ⮈ U+2B8A, U+2B88 RIGHT/LEFTWARDS BLACK CIRCLED WHITE ARROW 4110 ⮕ ⬅ U+2B95, U+2B05 RIGHT/LEFTWARDS BLACK ARROW 4111 ⮚ ⮘ U+2B9A, U+2B98 THREE-D TOP-LIGHTED RIGHT/LEFTWARDS EQUILATERAL 4112 ARROWHEAD 4113 ⮞ ⮜ U+2B9E, U+2B9C BLACK RIGHT/LEFTWARDS EQUILATERAL ARROWHEAD 4114 ⮡ ⮠ U+2BA1, U+2BA0 DOWNWARDS TRIANGLE-HEADED ARROW WITH LONG TIP 4115 RIGHT/LEFTWARDS 4116 ⮣ ⮢ U+2BA3, U+2BA2 UPWARDS TRIANGLE-HEADED ARROW WITH LONG TIP 4117 RIGHT/LEFTWARDS 4118 ⮩ ⮨ U+2BA9, U+2BA8 BLACK CURVED DOWNWARDS AND RIGHT/LEFTWARDS ARROW 4119 ⮫ ⮪ U+2BAB, U+2BAA BLACK CURVED UPWARDS AND RIGHT/LEFTWARDS ARROW 4120 ⮱ ⮰ U+2BB1, U+2BB0 RIBBON ARROW DOWN RIGHT/LEFT 4121 ⮳ ⮲ U+2BB3, U+2BB2 RIBBON ARROW UP RIGHT/LEFT 4122 ⯮ ⯬ U+2BEE, U+2BEC RIGHT/LEFTWARDS TWO-HEADED ARROW WITH TRIANGLE 4123 ARROWHEADS 4124 ⸂ ⸃ U+2E02, U+2E03 LEFT/RIGHT SUBSTITUTION BRACKET 4125 ⸃ ⸂ U+2E03, U+2E02 RIGHT/LEFT SUBSTITUTION BRACKET 4126 ⸄ ⸅ U+2E04, U+2E05 LEFT/RIGHT DOTTED SUBSTITUTION BRACKET 4127 ⸅ ⸄ U+2E05, U+2E04 RIGHT/LEFT DOTTED SUBSTITUTION BRACKET 4128 ⸉ ⸊ U+2E09, U+2E0A LEFT/RIGHT TRANSPOSITION BRACKET 4129 ⸊ ⸉ U+2E0A, U+2E09 RIGHT/LEFT TRANSPOSITION BRACKET 4130 ⸌ ⸍ U+2E0C, U+2E0D LEFT/RIGHT RAISED OMISSION BRACKET 4131 ⸍ ⸌ U+2E0D, U+2E0C 
RIGHT/LEFT RAISED OMISSION BRACKET 4132 ⸑ ⸐ U+2E11, U+2E10 REVERSED FORKED PARAGRAPHOS, FORKED PARAGRAPHOS 4133 ⸜ ⸝ U+2E1C, U+2E1D LEFT/RIGHT LOW PARAPHRASE BRACKET 4134 ⸝ ⸜ U+2E1D, U+2E1C RIGHT/LEFT LOW PARAPHRASE BRACKET 4135 ⸠ ⸡ U+2E20, U+2E21 LEFT/RIGHT VERTICAL BAR WITH QUILL 4136 ⸡ ⸠ U+2E21, U+2E20 RIGHT/LEFT VERTICAL BAR WITH QUILL 4137 ⸢ ⸣ U+2E22, U+2E23 TOP LEFT/RIGHT HALF BRACKET 4138 ⸤ ⸥ U+2E24, U+2E25 BOTTOM LEFT/RIGHT HALF BRACKET 4139 ⸦ ⸧ U+2E26, U+2E27 LEFT/RIGHT SIDEWAYS U BRACKET 4140 ⸨ ⸩ U+2E28, U+2E29 LEFT/RIGHT DOUBLE PARENTHESIS 4141 ⸶ ⸷ U+2E36, U+2E37 DAGGER WITH LEFT/RIGHT GUARD 4142 ⹂ „ U+2E42, U+201E DOUBLE LOW-REVERSED-9 QUOTATION MARK, DOUBLE 4143 LOW-9 QUOTATION MARK 4144 ⹕ ⹖ U+2E55, U+2E56 LEFT/RIGHT SQUARE BRACKET WITH STROKE 4145 ⹗ ⹘ U+2E57, U+2E58 LEFT/RIGHT SQUARE BRACKET WITH DOUBLE STROKE 4146 ⹙ ⹚ U+2E59, U+2E5A TOP HALF LEFT/RIGHT PARENTHESIS 4147 ⹛ ⹜ U+2E5B, U+2E5C BOTTOM HALF LEFT/RIGHT PARENTHESIS 4148 〈 〉 U+3008, U+3009 LEFT/RIGHT ANGLE BRACKET 4149 《 》 U+300A, U+300B LEFT/RIGHT DOUBLE ANGLE BRACKET 4150 「 」 U+300C, U+300D LEFT/RIGHT CORNER BRACKET 4151 『 』 U+300E, U+300F LEFT/RIGHT WHITE CORNER BRACKET 4152 【 】 U+3010, U+3011 LEFT/RIGHT BLACK LENTICULAR BRACKET 4153 〔 〕 U+3014, U+3015 LEFT/RIGHT TORTOISE SHELL BRACKET 4154 〖 〗 U+3016, U+3017 LEFT/RIGHT WHITE LENTICULAR BRACKET 4155 〘 〙 U+3018, U+3019 LEFT/RIGHT WHITE TORTOISE SHELL BRACKET 4156 〚 〛 U+301A, U+301B LEFT/RIGHT WHITE SQUARE BRACKET 4157 〝 〞 U+301D, U+301E REVERSED DOUBLE PRIME QUOTATION MARK, DOUBLE 4158 PRIME QUOTATION MARK 4159 ꧁ ꧂ U+A9C1, U+A9C2 JAVANESE LEFT/RIGHT RERENGGAN 4160 ﴾ ﴿ U+FD3E, U+FD3F ORNATE LEFT/RIGHT PARENTHESIS 4161 ﹙ ﹚ U+FE59, U+FE5A SMALL LEFT/RIGHT PARENTHESIS 4162 ﹛ ﹜ U+FE5B, U+FE5C SMALL LEFT/RIGHT CURLY BRACKET 4163 ﹝ ﹞ U+FE5D, U+FE5E SMALL LEFT/RIGHT TORTOISE SHELL BRACKET 4164 ﹤ ﹥ U+FE64, U+FE65 SMALL LESS-THAN/GREATER-THAN SIGN 4165 ( ) U+FF08, U+FF09 FULLWIDTH LEFT/RIGHT PARENTHESIS 4166 < > U+FF1C, U+FF1E FULLWIDTH 
LESS-THAN/GREATER-THAN SIGN 4167 [ ] U+FF3B, U+FF3D FULLWIDTH LEFT/RIGHT SQUARE BRACKET 4168 { } U+FF5B, U+FF5D FULLWIDTH LEFT/RIGHT CURLY BRACKET 4169 ⦅ ⦆ U+FF5F, U+FF60 FULLWIDTH LEFT/RIGHT WHITE PARENTHESIS 4170 「 」 U+FF62, U+FF63 HALFWIDTH LEFT/RIGHT CORNER BRACKET 4171 → ← U+FFEB, U+FFE9 HALFWIDTH RIGHT/LEFTWARDS ARROW 4172 U+1D103, U+1D102 MUSICAL SYMBOL REVERSE FINAL BARLINE, MUSICAL 4173 SYMBOL FINAL BARLINE 4174 U+1D106, U+1D107 MUSICAL SYMBOL LEFT/RIGHT REPEAT SIGN 4175 U+1F449, U+1F448 WHITE RIGHT/LEFT POINTING BACKHAND INDEX 4176 U+1F508, U+1F568 SPEAKER, RIGHT SPEAKER 4177 U+1F509, U+1F569 SPEAKER WITH ONE SOUND WAVE, RIGHT SPEAKER WITH 4178 ONE SOUND WAVE 4179 U+1F50A, U+1F56A SPEAKER WITH THREE SOUND WAVES, RIGHT SPEAKER 4180 WITH THREE SOUND WAVES 4181 U+1F57B, U+1F57D LEFT/RIGHT HAND TELEPHONE RECEIVER 4182 U+1F599, U+1F598 SIDEWAYS WHITE RIGHT/LEFT POINTING INDEX 4183 U+1F59B, U+1F59A SIDEWAYS BLACK RIGHT/LEFT POINTING INDEX 4184 U+1F59D, U+1F59C BLACK RIGHT/LEFT POINTING BACKHAND INDEX 4185 U+1F5E6, U+1F5E7 THREE RAYS LEFT/RIGHT 4186 U+1F802, U+1F800 RIGHT/LEFTWARDS ARROW WITH SMALL TRIANGLE 4187 ARROWHEAD 4188 U+1F806, U+1F804 RIGHT/LEFTWARDS ARROW WITH MEDIUM TRIANGLE 4189 ARROWHEAD 4190 U+1F80A, U+1F808 RIGHT/LEFTWARDS ARROW WITH LARGE TRIANGLE 4191 ARROWHEAD 4192 U+1F812, U+1F810 RIGHT/LEFTWARDS ARROW WITH SMALL EQUILATERAL 4193 ARROWHEAD 4194 U+1F816, U+1F814 RIGHT/LEFTWARDS ARROW WITH EQUILATERAL ARROWHEAD 4195 U+1F81A, U+1F818 HEAVY RIGHT/LEFTWARDS ARROW WITH EQUILATERAL 4196 ARROWHEAD 4197 U+1F81E, U+1F81C HEAVY RIGHT/LEFTWARDS ARROW WITH LARGE 4198 EQUILATERAL ARROWHEAD 4199 U+1F822, U+1F820 RIGHT/LEFTWARDS TRIANGLE-HEADED ARROW WITH 4200 NARROW SHAFT 4201 U+1F826, U+1F824 RIGHT/LEFTWARDS TRIANGLE-HEADED ARROW WITH 4202 MEDIUM SHAFT 4203 U+1F82A, U+1F828 RIGHT/LEFTWARDS TRIANGLE-HEADED ARROW WITH BOLD 4204 SHAFT 4205 U+1F82E, U+1F82C RIGHT/LEFTWARDS TRIANGLE-HEADED ARROW WITH 4206 HEAVY SHAFT 4207 U+1F832, U+1F830 RIGHT/LEFTWARDS 
TRIANGLE-HEADED ARROW WITH VERY 4208 HEAVY SHAFT 4209 U+1F836, U+1F834 RIGHT/LEFTWARDS FINGER-POST ARROW 4210 U+1F83A, U+1F838 RIGHT/LEFTWARDS SQUARED ARROW 4211 U+1F83E, U+1F83C RIGHT/LEFTWARDS COMPRESSED ARROW 4212 U+1F842, U+1F840 RIGHT/LEFTWARDS HEAVY COMPRESSED ARROW 4213 U+1F846, U+1F844 RIGHT/LEFTWARDS HEAVY ARROW 4214 U+1F852, U+1F850 RIGHT/LEFTWARDS SANS-SERIF ARROW 4215 U+1F862, U+1F860 WIDE-HEADED RIGHT/LEFTWARDS LIGHT BARB ARROW 4216 U+1F86A, U+1F868 WIDE-HEADED RIGHT/LEFTWARDS BARB ARROW 4217 U+1F872, U+1F870 WIDE-HEADED RIGHT/LEFTWARDS MEDIUM BARB ARROW 4218 U+1F87A, U+1F878 WIDE-HEADED RIGHT/LEFTWARDS HEAVY BARB ARROW 4219 U+1F882, U+1F880 WIDE-HEADED RIGHT/LEFTWARDS VERY HEAVY BARB 4220 ARROW 4221 U+1F892, U+1F890 RIGHT/LEFTWARDS TRIANGLE ARROWHEAD 4222 U+1F896, U+1F894 RIGHT/LEFTWARDS WHITE ARROW WITHIN TRIANGLE 4223 ARROWHEAD 4224 U+1F89A, U+1F898 RIGHT/LEFTWARDS ARROW WITH NOTCHED TAIL 4225 U+1F8A1, U+1F8A0 RIGHTWARDS BOTTOM SHADED WHITE ARROW, 4226 LEFTWARDS BOTTOM-SHADED WHITE ARROW 4227 U+1F8A3, U+1F8A2 RIGHT/LEFTWARDS TOP SHADED WHITE ARROW 4228 U+1F8A5, U+1F8A6 RIGHT/LEFTWARDS RIGHT-SHADED WHITE ARROW 4229 U+1F8A7, U+1F8A4 RIGHT/LEFTWARDS LEFT-SHADED WHITE ARROW 4230 U+1F8A9, U+1F8A8 RIGHT/LEFTWARDS BACK-TILTED SHADOWED WHITE ARROW 4231 U+1F8AB, U+1F8AA RIGHT/LEFTWARDS FRONT-TILTED SHADOWED WHITE 4232 ARROW 4233=cut 4234