=head1 NAME

perlreftut - Mark's very short tutorial about references

=head1 DESCRIPTION

One of the most important new features in Perl 5 was the capability to
manage complicated data structures like multidimensional arrays and
nested hashes. To enable these, Perl 5 introduced a feature called
`references', and using references is the key to managing complicated,
structured data in Perl. Unfortunately, there's a lot of funny syntax
to learn, and the main manual page can be hard to follow. The manual
is quite complete, and sometimes people find that a problem, because
it can be hard to tell what is important and what isn't.

Fortunately, you only need to know 10% of what's in the main page to get
90% of the benefit. This page will show you that 10%.

=head1 Who Needs Complicated Data Structures?

One problem that came up all the time in Perl 4 was how to represent a
hash whose values were lists. Perl 4 had hashes, of course, but the
values had to be scalars; they couldn't be lists.

Why would you want a hash of lists?
Let's take a simple example: You
have a file of city and country names, like this:

    Chicago, USA
    Frankfurt, Germany
    Berlin, Germany
    Washington, USA
    Helsinki, Finland
    New York, USA

and you want to produce an output like this, with each country mentioned
once, and then an alphabetical list of the cities in that country:

    Finland: Helsinki.
    Germany: Berlin, Frankfurt.
    USA:  Chicago, New York, Washington.

The natural way to do this is to have a hash whose keys are country
names. Associated with each country name key is a list of the cities in
that country. Each time you read a line of input, split it into a country
and a city, look up the list of cities already known to be in that
country, and append the new city to the list. When you're done reading
the input, iterate over the hash as usual, sorting each list of cities
before you print it out.

If hash values can't be lists, you lose. In Perl 4, hash values can't
be lists; they can only be strings. You lose. You'd probably have to
combine all the cities into a single string somehow, and then when
time came to write the output, you'd have to break the string into a
list, sort the list, and turn it back into a string. This is messy
and error-prone.
And it's frustrating, because Perl already has
perfectly good lists that would solve the problem if only you could
use them.

=head1 The Solution

By the time Perl 5 rolled around, we were already stuck with this
design: Hash values must be scalars. The solution to this is
references.

A reference is a scalar value that I<refers to> an entire array or an
entire hash (or to just about anything else). Names are one kind of
reference that you're already familiar with. Think of the President
of the United States: a messy, inconvenient bag of blood and bones.
But to talk about him, or to represent him in a computer program, all
you need is the easy, convenient scalar string "George Bush".

References in Perl are like names for arrays and hashes. They're
Perl's private, internal names, so you can be sure they're
unambiguous. Unlike "George Bush", a reference only refers to one
thing, and you always know what it refers to. If you have a reference
to an array, you can recover the entire array from it. If you have a
reference to a hash, you can recover the entire hash. But the
reference is still an easy, compact scalar value.

You can't have a hash whose values are arrays; hash values can only be
scalars. We're stuck with that.
But a single reference can refer to
an entire array, and references are scalars, so you can have a hash of
references to arrays, and it'll act a lot like a hash of arrays, and
it'll be just as useful as a hash of arrays.

We'll come back to this city-country problem later, after we've seen
some syntax for managing references.

=head1 Syntax

There are just two ways to make a reference, and just two ways to use
it once you have it.

=head2 Making References

=head3 B<Make Rule 1>

If you put a C<\> in front of a variable, you get a
reference to that variable.

    $aref = \@array;         # $aref now holds a reference to @array
    $href = \%hash;          # $href now holds a reference to %hash

Once the reference is stored in a variable like $aref or $href, you
can copy it or store it just the same as any other scalar value:

    $xy = $aref;             # $xy now holds a reference to @array
    $p[3] = $href;           # $p[3] now holds a reference to %hash
    $z = $p[3];              # $z now holds a reference to %hash

These examples show how to make references to variables with names.
Sometimes you want to make an array or a hash that doesn't have a
name.
This is analogous to the way you like to be able to use the
string C<"\n"> or the number 80 without having to store it in a named
variable first.

=head3 B<Make Rule 2>

C<[ ITEMS ]> makes a new, anonymous array, and returns a reference to
that array. C<{ ITEMS }> makes a new, anonymous hash, and returns a
reference to that hash.

    $aref = [ 1, "foo", undef, 13 ];
    # $aref now holds a reference to an array

    $href = { APR => 4, AUG => 8 };
    # $href now holds a reference to a hash

The references you get from rule 2 are the same kind of
references that you get from rule 1:

    # This:
    $aref = [ 1, 2, 3 ];

    # Does the same as this:
    @array = (1, 2, 3);
    $aref = \@array;

The first line is an abbreviation for the following two lines, except
that it doesn't create the superfluous array variable C<@array>.

If you write just C<[]>, you get a new, empty anonymous array.
If you write just C<{}>, you get a new, empty anonymous hash.

=head2 Using References

What can you do with a reference once you have it?
It's a scalar
value, and we've seen that you can store it as a scalar and get it back
again just like any scalar. There are just two more ways to use it:

=head3 B<Use Rule 1>

You can always use an array reference, in curly braces, in place of
the name of an array. For example, C<@{$aref}> instead of C<@array>.

Here are some examples of that:

Arrays:

    @a              @{$aref}                An array
    reverse @a      reverse @{$aref}        Reverse the array
    $a[3]           ${$aref}[3]             An element of the array
    $a[3] = 17      ${$aref}[3] = 17        Assigning an element

On each line are two expressions that do the same thing. The
left-hand versions operate on the array C<@a>. The right-hand
versions operate on the array that is referred to by C<$aref>. Once
they find the array they're operating on, both versions do the same
things to the arrays.

Using a hash reference is I<exactly> the same:

    %h              %{$href}                A hash
    keys %h         keys %{$href}           Get the keys from the hash
    $h{'red'}       ${$href}{'red'}         An element of the hash
    $h{'red'} = 17  ${$href}{'red'} = 17    Assigning an element

Whatever you want to do with a reference, B<Use Rule 1> tells you how
to do it.
You just write the Perl code that you would have written
for doing the same thing to a regular array or hash, and then replace
the array or hash name with C<{$reference}>. "How do I loop over an
array when all I have is a reference?" Well, to loop over an array, you
would write

    for my $element (@array) {
        ...
    }

so replace the array name, C<@array>, with the reference:

    for my $element (@{$aref}) {
        ...
    }

"How do I print out the contents of a hash when all I have is a
reference?" First write the code for printing out a hash:

    for my $key (keys %hash) {
        print "$key => $hash{$key}\n";
    }

And then replace the hash name with the reference:

    for my $key (keys %{$href}) {
        print "$key => ${$href}{$key}\n";
    }

=head3 B<Use Rule 2>

B<Use Rule 1> is all you really need, because it tells you how to do
absolutely everything you ever need to do with references. But the
most common thing to do with an array or a hash is to extract a single
element, and the B<Use Rule 1> notation is cumbersome. So there is an
abbreviation.

C<${$aref}[3]> is too hard to read, so you can write C<< $aref->[3] >>
instead.

C<${$href}{red}> is too hard to read, so you can write
C<< $href->{red} >> instead.

If C<$aref> holds a reference to an array, then C<< $aref->[3] >> is
the fourth element of the array. Don't confuse this with C<$aref[3]>,
which is the fourth element of a totally different array, one
deceptively named C<@aref>. C<$aref> and C<@aref> are unrelated the
same way that C<$item> and C<@item> are.

Similarly, C<< $href->{'red'} >> is part of the hash referred to by
the scalar variable C<$href>, perhaps even one with no name.
C<$href{'red'}> is part of the deceptively named C<%href> hash. It's
easy to leave out the C<< -> >> by mistake, and if you do, you'll get
bizarre results when your program gets array and hash elements out of
totally unexpected hashes and arrays that weren't the ones you wanted
to use.

=head2 An Example

Let's see a quick example of how all this is useful.

First, remember that C<[1, 2, 3]> makes an anonymous array containing
C<(1, 2, 3)>, and gives you a reference to that array.

Now think about

    @a = ( [1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]
         );

@a is an array with three elements, and each one is a reference to
another array.
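
As a minimal sketch of what this structure lets you do with only the
rules seen so far, the loop below applies B<Use Rule 1> to each row
reference in turn. The names C<@row_sums>, C<$row>, and C<$second_row>
are invented for this illustration and are not part of the original
example.

```perl
use strict;
use warnings;

# The same array of array references as in the text.
my @a = ( [1, 2, 3],
          [4, 5, 6],
          [7, 8, 9],
        );

# Each element of @a is a reference to one row.  Use Rule 1 says that
# @{$row} can stand wherever the name of an array could.
my @row_sums;                     # hypothetical name, illustration only
for my $row (@a) {
    my $sum = 0;
    $sum += $_ for @{$row};
    push @row_sums, $sum;
}

print "Row sums: @row_sums\n";    # prints "Row sums: 6 15 24"

# Use Rule 2 picks a single element out of one row.
my $second_row = $a[1];           # a reference to the array (4, 5, 6)
print "Third element: ", $second_row->[2], "\n";   # prints "Third element: 6"
```

Copying C<$a[1]> into C<$second_row> copies only the reference, so both
names lead to the same row.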

C<$a[1]> is one of these references. It refers to an array, the array
containing C<(4, 5, 6)>, and because it is a reference to an array,
B<Use Rule 2> says that we can write C<< $a[1]->[2] >> to get the
third element from that array. C<< $a[1]->[2] >> is the 6.
Similarly, C<< $a[0]->[1] >> is the 2. What we have here is like a
two-dimensional array; you can write C<< $a[ROW]->[COLUMN] >> to get
or set the element in any row and any column of the array.

The notation still looks a little cumbersome, so there's one more
abbreviation:

=head2 Arrow Rule

In between two B<subscripts>, the arrow is optional.

Instead of C<< $a[1]->[2] >>, we can write C<$a[1][2]>; it means the
same thing. Instead of C<< $a[0]->[1] = 23 >>, we can write
C<$a[0][1] = 23>; it means the same thing.

Now it really looks like two-dimensional arrays!

You can see why the arrows are important. Without them, we would have
had to write C<${$a[1]}[2]> instead of C<$a[1][2]>. For
three-dimensional arrays, they let us write C<$x[2][3][5]> instead of
the unreadable C<${${$x[2]}[3]}[5]>.

=head1 Solution

Here's the answer to the problem I posed earlier, of reformatting a
file of city and country names.

    1   my %table;

    2   while (<>) {
    3     chomp;
    4     my ($city, $country) = split /, /;
    5     $table{$country} = [] unless exists $table{$country};
    6     push @{$table{$country}}, $city;
    7   }

    8   foreach $country (sort keys %table) {
    9     print "$country: ";
   10     my @cities = @{$table{$country}};
   11     print join ', ', sort @cities;
   12     print ".\n";
   13   }

The program has two pieces: Lines 2-7 read the input and build a data
structure, and lines 8-13 analyze the data and print out the report.
We're going to have a hash, C<%table>, whose keys are country names,
and whose values are references to arrays of city names. The data
structure will look like this:

           %table
        +-------+---+
        |       |   |   +-----------+--------+
        |Germany| *---->| Frankfurt | Berlin |
        |       |   |   +-----------+--------+
        +-------+---+
        |       |   |   +----------+
        |Finland| *---->| Helsinki |
        |       |   |   +----------+
        +-------+---+
        |       |   |   +---------+------------+----------+
        |  USA  | *---->| Chicago | Washington | New York |
        |       |   |   +---------+------------+----------+
        +-------+---+

We'll look at output first.
Supposing we already have this structure,
how do we print it out?

    8   foreach $country (sort keys %table) {
    9     print "$country: ";
   10     my @cities = @{$table{$country}};
   11     print join ', ', sort @cities;
   12     print ".\n";
   13   }

C<%table> is an ordinary hash, and we get a list of keys from it, sort
the keys, and loop over the keys as usual. The only use of references
is in line 10. C<$table{$country}> looks up the key C<$country> in the
hash and gets the value, which is a reference to an array of cities in
that country. B<Use Rule 1> says that we can recover the array by
saying C<@{$table{$country}}>. Line 10 is just like

    @cities = @array;

except that the name C<array> has been replaced by the reference
C<{$table{$country}}>. The C<@> tells Perl to get the entire array.
Having gotten the list of cities, we sort it, join it, and print it
out as usual.

Lines 2-7 are responsible for building the structure in the first
place. Here they are again:

    2   while (<>) {
    3     chomp;
    4     my ($city, $country) = split /, /;
    5     $table{$country} = [] unless exists $table{$country};
    6     push @{$table{$country}}, $city;
    7   }

Lines 2-4 acquire a city and country name.
Line 5 looks to see if the
country is already present as a key in the hash. If it's not, the
program uses the C<[]> notation (B<Make Rule 2>) to manufacture a new,
empty anonymous array of cities, and installs a reference to it into
the hash under the appropriate key.

Line 6 installs the city name into the appropriate array.
C<$table{$country}> now holds a reference to the array of cities seen
in that country so far. Line 6 is exactly like

    push @array, $city;

except that the name C<array> has been replaced by the reference
C<{$table{$country}}>. The C<push> adds a city name to the end of the
referred-to array.

There's one fine point I skipped. Line 5 is unnecessary, and we can
get rid of it.

    2   while (<>) {
    3     chomp;
    4     my ($city, $country) = split /, /;
    5   ####  $table{$country} = [] unless exists $table{$country};
    6     push @{$table{$country}}, $city;
    7   }

If there's already an entry in C<%table> for the current C<$country>,
then nothing is different. Line 6 will locate the value in
C<$table{$country}>, which is a reference to an array, and push
C<$city> into the array. But what does it do when C<$country> holds a
key, say C<Greece>, that is not yet in C<%table>?

This is Perl, so it does the exact right thing.
It sees that you want
to push C<Athens> onto an array that doesn't exist, so it helpfully
makes a new, empty, anonymous array for you, installs it into
C<%table>, and then pushes C<Athens> onto it. This is called
`autovivification'--bringing things to life automatically. Perl saw
that the key wasn't in the hash, so it created a new hash entry
automatically. Perl saw that you wanted to use the hash value as an
array, so it created a new empty array and installed a reference to it
in the hash automatically. And as usual, Perl made the array one
element longer to hold the new city name.

=head1 The Rest

I promised to give you 90% of the benefit with 10% of the details, and
that means I left out 90% of the details. Now that you have an
overview of the important parts, it should be easier to read the
L<perlref> manual page, which discusses 100% of the details.

Some of the highlights of L<perlref>:

=over 4

=item *

You can make references to anything, including scalars, functions, and
other references.

=item *

In B<Use Rule 1>, you can omit the curly brackets whenever the thing
inside them is an atomic scalar variable like C<$aref>. For example,
C<@$aref> is the same as C<@{$aref}>, and C<$$aref[1]> is the same as
C<${$aref}[1]>.
If you're just starting out, you may want to adopt
the habit of always including the curly brackets.

=item *

This doesn't copy the underlying array:

    $aref2 = $aref1;

You get two references to the same array. If you modify
C<< $aref1->[23] >> and then look at
C<< $aref2->[23] >> you'll see the change.

To copy the array, use

    $aref2 = [@{$aref1}];

This uses C<[...]> notation to create a new anonymous array, and
C<$aref2> is assigned a reference to the new array. The new array is
initialized with the contents of the array referred to by C<$aref1>.

Similarly, to copy an anonymous hash, you can use

    $href2 = {%{$href1}};

=item *

To see if a variable contains a reference, use the C<ref> function. It
returns true if its argument is a reference. Actually it's a little
better than that: It returns C<HASH> for hash references and C<ARRAY>
for array references.

=item *

If you try to use a reference like a string, you get strings like

    ARRAY(0x80f5dec)   or   HASH(0x826afc0)

If you ever see a string that looks like this, you'll know you
printed out a reference by mistake.

A side effect of this representation is that you can use C<eq> to see
if two references refer to the same thing. (But you should usually use
C<==> instead because it's much faster.)

=item *

You can use a string as if it were a reference. If you use the string
C<"foo"> as an array reference, it's taken to be a reference to the
array C<@foo>. This is called a I<soft reference> or I<symbolic
reference>. The declaration C<use strict 'refs'> disables this
feature, which can cause all sorts of trouble if you use it by accident.

=back

You might prefer to go on to L<perllol> instead of L<perlref>; it
discusses lists of lists and multidimensional arrays in detail. After
that, you should move on to L<perldsc>; it's a Data Structure Cookbook
that shows recipes for using and printing out arrays of hashes, hashes
of arrays, and other kinds of data.

=head1 Summary

Everyone needs compound data structures, and in Perl the way you get
them is with references. There are four important rules for managing
references: Two for making references and two for using them. Once
you know these rules you can do most of the important things you need
to do with references.
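
The four rules, together with the arrow abbreviation, can be sketched
in one short program. This is only an illustration under assumed data:
the cities and the variable names (C<@cities>, C<$aref>, C<$href>,
C<$count>, and so on) are invented for the example.

```perl
use strict;
use warnings;

my @cities = ('Berlin', 'Frankfurt');

# Make Rule 1: a backslash in front of a variable gives a reference to it.
my $aref = \@cities;

# Make Rule 2: [ ITEMS ] and { ITEMS } build anonymous arrays and hashes.
my $href = { Germany => $aref, Finland => ['Helsinki'] };

# Use Rule 1: {$reference} stands in for the name of an array or hash.
my @countries = sort keys %{$href};
my $count     = scalar @{$aref};

# Use Rule 2: the arrow extracts a single element.
my $first = $aref->[0];

# Arrow Rule: between two subscripts the arrow is optional.
my $same = $href->{Germany}[1];   # same as $href->{Germany}->[1]

print "@countries\n";             # prints "Finland Germany"
print "$count $first $same\n";    # prints "2 Berlin Frankfurt"
```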

=head1 Credits

Author: Mark Jason Dominus, Plover Systems (C<mjd-perl-ref+@plover.com>)

This article originally appeared in I<The Perl Journal>
( http://www.tpj.com/ ) volume 3, #2. Reprinted with permission.

The original title was I<Understand References Today>.

=head2 Distribution Conditions

Copyright 1998 The Perl Journal.

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

Irrespective of its distribution, all code examples in these files are
hereby placed into the public domain. You are permitted and
encouraged to use this code in your own programs for fun or for profit
as you see fit. A simple comment in the code giving credit would be
courteous but is not required.

=cut