=head1 NAME

perlreftut - Mark's very short tutorial about references

=head1 DESCRIPTION

One of the most important new features in Perl 5 was the capability to
manage complicated data structures like multidimensional arrays and
nested hashes.  To enable these, Perl 5 introduced a feature called
`references', and using references is the key to managing complicated,
structured data in Perl.  Unfortunately, there's a lot of funny syntax
to learn, and the main manual page can be hard to follow.  The manual
is quite complete, and sometimes people find that a problem, because
it can be hard to tell what is important and what isn't.

Fortunately, you only need to know 10% of what's in the main page to get
90% of the benefit.  This page will show you that 10%.

=head1 Who Needs Complicated Data Structures?

One problem that came up all the time in Perl 4 was how to represent a
hash whose values were lists.  Perl 4 had hashes, of course, but the
values had to be scalars; they couldn't be lists.

Why would you want a hash of lists?  Let's take a simple example: You
have a file of city and country names, like this:

        Chicago, USA
        Frankfurt, Germany
        Berlin, Germany
        Washington, USA
        Helsinki, Finland
        New York, USA

and you want to produce an output like this, with each country mentioned
once, and then an alphabetical list of the cities in that country:

        Finland: Helsinki.
        Germany: Berlin, Frankfurt.
        USA: Chicago, New York, Washington.

The natural way to do this is to have a hash whose keys are country
names.  Associated with each country name key is a list of the cities in
that country.  Each time you read a line of input, split it into a country
and a city, look up the list of cities already known to be in that
country, and append the new city to the list.  When you're done reading
the input, iterate over the hash as usual, sorting each list of cities
before you print it out.

If hash values can't be lists, you lose.  In Perl 4, hash values can't
be lists; they can only be strings.  You lose.  You'd probably have to
combine all the cities into a single string somehow, and then when
time came to write the output, you'd have to break the string into a
list, sort the list, and turn it back into a string.  This is messy
and error-prone.  And it's frustrating, because Perl already has
perfectly good lists that would solve the problem if only you could
use them.

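To make the mess concrete, here is a rough sketch of the string-packing
workaround described above.  The C<"|"> separator, the sample C<@lines>
array standing in for the input file, and the variable names are all
assumptions made for illustration; the trick breaks as soon as a city
name contains the separator.

```perl
use strict;
use warnings;

# Sample input lines, standing in for the file of city and country names.
my @lines = ("Chicago, USA", "Frankfurt, Germany", "Berlin, Germany",
             "Washington, USA", "Helsinki, Finland", "New York, USA");

# Each hash value is ONE string holding all the cities, glued with "|".
my %packed;
for my $line (@lines) {
    my ($city, $country) = split /, /, $line;
    if (exists $packed{$country}) {
        $packed{$country} .= "|$city";    # append to the packed string
    } else {
        $packed{$country} = $city;        # first city for this country
    }
}

# To print, every string must be unpacked, sorted, and glued back together.
my @report;
for my $country (sort keys %packed) {
    my @cities = split /\|/, $packed{$country};
    @cities = sort @cities;
    push @report, "$country: " . join(', ', @cities) . ".";
}
print "$_\n" for @report;
```

It works for this input, but every access to the cities pays for a
C<split> and a C<join>, and the data format is a private convention that
nothing checks.
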

=head1 The Solution

By the time Perl 5 rolled around, we were already stuck with this
design: Hash values must be scalars.  The solution to this is
references.

A reference is a scalar value that I<refers to> an entire array or an
entire hash (or to just about anything else).  Names are one kind of
reference that you're already familiar with.  Think of the President
of the United States: a messy, inconvenient bag of blood and bones.
But to talk about him, or to represent him in a computer program, all
you need is the easy, convenient scalar string "George Bush".

References in Perl are like names for arrays and hashes.  They're
Perl's private, internal names, so you can be sure they're
unambiguous.  Unlike "George Bush", a reference only refers to one
thing, and you always know what it refers to.  If you have a reference
to an array, you can recover the entire array from it.  If you have a
reference to a hash, you can recover the entire hash.  But the
reference is still an easy, compact scalar value.

You can't have a hash whose values are arrays; hash values can only be
scalars.  We're stuck with that.  But a single reference can refer to
an entire array, and references are scalars, so you can have a hash of
references to arrays, and it'll act a lot like a hash of arrays, and
it'll be just as useful as a hash of arrays.

We'll come back to this city-country problem later, after we've seen
some syntax for managing references.

=head1 Syntax

There are just two ways to make a reference, and just two ways to use
it once you have it.

=head2 Making References

=head3 B<Make Rule 1>

If you put a C<\> in front of a variable, you get a
reference to that variable.

    $aref = \@array;         # $aref now holds a reference to @array
    $href = \%hash;          # $href now holds a reference to %hash

Once the reference is stored in a variable like $aref or $href, you
can copy it or store it just the same as any other scalar value:

    $xy = $aref;             # $xy now holds a reference to @array
    $p[3] = $href;           # $p[3] now holds a reference to %hash
    $z = $p[3];              # $z now holds a reference to %hash

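A small runnable sketch of the point above: copying the scalar copies
the reference, not the array or hash behind it.  The sample data and
the C<$count_via_xy> and C<$keys_via_z> names are made up for the
example, and it peeks ahead at the C<@{...}> and C<%{...}> notation
that is explained under B<Use Rule 1>.

```perl
use strict;
use warnings;

my @array = (10, 20, 30);
my %hash  = (red => 1);

my $aref = \@array;      # Make Rule 1: backslash in front of the variable
my $href = \%hash;

my $xy = $aref;          # copies the reference, not the array
my @p;
$p[3] = $href;           # a reference can live in an array element
my $z = $p[3];

push @array, 40;         # change the original array by name ...
my $count_via_xy = scalar @{$xy};       # ... and $xy sees the change

$hash{blue} = 2;         # same story for the hash
my $keys_via_z = scalar keys %{$z};

print "$count_via_xy elements, $keys_via_z keys\n";
```

This should print C<4 elements, 2 keys>: C<$xy> and C<$z> were assigned
before the changes, yet they see them, because they refer to the
original variables.
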

These examples show how to make references to variables with names.
Sometimes you want to make an array or a hash that doesn't have a
name.  This is analogous to the way you like to be able to use the
string C<"\n"> or the number 80 without having to store it in a named
variable first.

=head3 B<Make Rule 2>

C<[ ITEMS ]> makes a new, anonymous array, and returns a reference to
that array.  C<{ ITEMS }> makes a new, anonymous hash, and returns a
reference to that hash.

    $aref = [ 1, "foo", undef, 13 ];
    # $aref now holds a reference to an array

    $href = { APR => 4, AUG => 8 };
    # $href now holds a reference to a hash


The references you get from rule 2 are the same kind of
references that you get from rule 1:

        # This:
        $aref = [ 1, 2, 3 ];

        # Does the same as this:
        @array = (1, 2, 3);
        $aref = \@array;


The first line is an abbreviation for the following two lines, except
that it doesn't create the superfluous array variable C<@array>.

If you write just C<[]>, you get a new, empty anonymous array.
If you write just C<{}>, you get a new, empty anonymous hash.

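Here is a short sketch that puts the two make rules side by side.  The
variable names are invented for the example, and it borrows the
C<@{...}> notation from the next section to look inside the arrays.

```perl
use strict;
use warnings;

my @array = (1, 2, 3);
my $named = \@array;        # Make Rule 1: reference to a named array
my $anon  = [ 1, 2, 3 ];    # Make Rule 2: reference to an anonymous array

# Same contents, but two separate arrays behind the two references.
my $same_contents = ("@{$named}" eq "@{$anon}") ? "yes" : "no";
my $same_array    = ($named == $anon)           ? "yes" : "no";

# Bare [] and {} give references to brand-new empty containers.
my $empty_aref  = [];
my $empty_href  = {};
my $empty_sizes = scalar(@{$empty_aref}) + scalar(keys %{$empty_href});

print "same contents: $same_contents, same array: $same_array, ",
      "items in the empty ones: $empty_sizes\n";
```

The output should read C<same contents: yes, same array: no, items in
the empty ones: 0>.
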

=head2 Using References

What can you do with a reference once you have it?  It's a scalar
value, and we've seen that you can store it as a scalar and get it back
again just like any scalar.  There are just two more ways to use it:

=head3 B<Use Rule 1>

You can always use an array reference, in curly braces, in place of
the name of an array.  For example, C<@{$aref}> instead of C<@array>.

Here are some examples of that:

Arrays:


        @a              @{$aref}                An array
        reverse @a      reverse @{$aref}        Reverse the array
        $a[3]           ${$aref}[3]             An element of the array
        $a[3] = 17;     ${$aref}[3] = 17        Assigning an element


On each line are two expressions that do the same thing.  The
left-hand versions operate on the array C<@a>.  The right-hand
versions operate on the array that is referred to by C<$aref>.  Once
they find the array they're operating on, both versions do the same
things to the arrays.

Using a hash reference is I<exactly> the same:

        %h              %{$href}              A hash
        keys %h         keys %{$href}         Get the keys from the hash
        $h{'red'}       ${$href}{'red'}       An element of the hash
        $h{'red'} = 17  ${$href}{'red'} = 17  Assigning an element

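The two tables can be checked with a short script.  This is only a
sketch with made-up sample data: it runs the left-hand and right-hand
forms on equal contents and confirms that they end up in the same
state.

```perl
use strict;
use warnings;

my @a    = (1, 2, 3, 4);        # a named array ...
my $aref = [ 1, 2, 3, 4 ];      # ... and a reference to an equal one

my @r1 = reverse @a;            # left-hand column
my @r2 = reverse @{$aref};      # right-hand column

$a[3]        = 17;
${$aref}[3]  = 17;

my %h    = (red => 1, green => 2);
my $href = { red => 1, green => 2 };

$h{'red'}        = 17;
${$href}{'red'}  = 17;

my @k1 = sort keys %h;
my @k2 = sort keys %{$href};

my $arrays_agree = ("@a" eq "@{$aref}" && "@r1" eq "@r2") ? 1 : 0;
my $hashes_agree = ("@k1" eq "@k2" && $h{'red'} == ${$href}{'red'}) ? 1 : 0;

print "arrays agree\n" if $arrays_agree;
print "hashes agree\n" if $hashes_agree;
```
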

Whatever you want to do with a reference, B<Use Rule 1> tells you how
to do it.  You just write the Perl code that you would have written
for doing the same thing to a regular array or hash, and then replace
the array or hash name with C<{$reference}>.  "How do I loop over an
array when all I have is a reference?"  Well, to loop over an array, you
would write

        for my $element (@array) {
           ...
        }

so replace the array name, C<@array>, with the reference:

        for my $element (@{$aref}) {
           ...
        }

"How do I print out the contents of a hash when all I have is a
reference?"  First write the code for printing out a hash:

        for my $key (keys %hash) {
          print "$key => $hash{$key}\n";
        }

And then replace the hash name with the reference:

        for my $key (keys %{$href}) {
          print "$key => ${$href}{$key}\n";
        }

=head3 B<Use Rule 2>

B<Use Rule 1> is all you really need, because it tells you how to do
absolutely everything you ever need to do with references.  But the
most common thing to do with an array or a hash is to extract a single
element, and the B<Use Rule 1> notation is cumbersome.  So there is an
abbreviation.

C<${$aref}[3]> is too hard to read, so you can write C<< $aref->[3] >>
instead.

C<${$href}{red}> is too hard to read, so you can write
C<< $href->{red} >> instead.

If C<$aref> holds a reference to an array, then C<< $aref->[3] >> is
the fourth element of the array.  Don't confuse this with C<$aref[3]>,
which is the fourth element of a totally different array, one
deceptively named C<@aref>.  C<$aref> and C<@aref> are unrelated the
same way that C<$item> and C<@item> are.

Similarly, C<< $href->{'red'} >> is part of the hash referred to by
the scalar variable C<$href>, perhaps even one with no name.
C<$href{'red'}> is part of the deceptively named C<%href> hash.  It's
easy to leave out the C<< -> >> by accident, and if you do, you'll get
bizarre results when your program gets array and hash elements out of
totally unexpected hashes and arrays that weren't the ones you wanted
to use.

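The following sketch sets up exactly that trap on purpose, so you can
see what a missing arrow does.  The sample values are invented; the
point is only that C<$aref> and C<@aref>, and likewise C<$href> and
C<%href>, are unrelated variables.

```perl
use strict;
use warnings;

my $aref = [ 'a', 'b', 'c', 'd' ];     # a scalar holding an array reference
my @aref = ( 'w', 'x', 'y', 'z' );     # an unrelated array sharing the name

my $with_arrow = $aref->[3];           # Use Rule 2: element of the referred-to array
my $long_form  = ${$aref}[3];          # Use Rule 1: the same element
my $no_arrow   = $aref[3];             # fourth element of @aref instead

my $href = { red => 'from the reference' };
my %href = ( red => 'from %href' );

print "$with_arrow $long_form $no_arrow\n";
print $href->{red}, " / ", $href{red}, "\n";
```

The first line of output should be C<d d z>.  One practical guard: if
the program says C<use strict> and no C<@aref> has been declared, then
writing C<$aref[3]> by mistake is a compile-time error instead of a
silent surprise.
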

=head2 An Example

Let's see a quick example of how all this is useful.

First, remember that C<[1, 2, 3]> makes an anonymous array containing
C<(1, 2, 3)>, and gives you a reference to that array.

Now think about

        @a = ( [1, 2, 3],
               [4, 5, 6],
               [7, 8, 9]
             );

@a is an array with three elements, and each one is a reference to
another array.

C<$a[1]> is one of these references.  It refers to an array, the array
containing C<(4, 5, 6)>, and because it is a reference to an array,
B<Use Rule 2> says that we can write C<< $a[1]->[2] >> to get the
third element from that array.  C<< $a[1]->[2] >> is the 6.
Similarly, C<< $a[0]->[1] >> is the 2.  What we have here is like a
two-dimensional array; you can write C<< $a[ROW]->[COLUMN] >> to get
or set the element in any row and any column of the array.

The notation still looks a little cumbersome, so there's one more
abbreviation:

=head2 Arrow Rule

In between two B<subscripts>, the arrow is optional.

Instead of C<< $a[1]->[2] >>, we can write C<$a[1][2]>; it means the
same thing.  Instead of C<< $a[0]->[1] = 23 >>, we can write
C<$a[0][1] = 23>; it means the same thing.

Now it really looks like two-dimensional arrays!

You can see why the arrows are important.  Without them, we would have
had to write C<${$a[1]}[2]> instead of C<$a[1][2]>.  For
three-dimensional arrays, they let us write C<$x[2][3][5]> instead of
the unreadable C<${${$x[2]}[3]}[5]>.

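Here is the same grid as a runnable sketch, with all three notations
applied to one element.  The C<$grid> variable is an addition for the
example: it shows that the arrow is only optional I<between>
subscripts, so a reference held in a scalar still needs its first
arrow.

```perl
use strict;
use warnings;

my @a = ( [1, 2, 3],
          [4, 5, 6],
          [7, 8, 9] );

my $long  = ${$a[1]}[2];     # Use Rule 1
my $arrow = $a[1]->[2];      # Use Rule 2
my $short = $a[1][2];        # Arrow Rule: arrow dropped between subscripts

$a[0][1] = 23;               # set row 0, column 1

my $grid   = \@a;            # a reference to the whole grid
my $corner = $grid->[2][0];  # first arrow required, second one optional

my @rows = map { join ' ', @{$_} } @a;
print "$long $arrow $short $corner\n";
print "$_\n" for @rows;
```

The first line printed should be C<6 6 6 7>, followed by the three
rows, the first of which now reads C<1 23 3>.  Writing C<$grid[2][0]>
would instead look for an array named C<@grid>.
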

=head1 Solution

Here's the answer to the problem I posed earlier, of reformatting a
file of city and country names.

    1   my %table;

    2   while (<>) {
    3     chomp;
    4     my ($city, $country) = split /, /;
    5     $table{$country} = [] unless exists $table{$country};
    6     push @{$table{$country}}, $city;
    7   }

    8   foreach $country (sort keys %table) {
    9     print "$country: ";
   10     my @cities = @{$table{$country}};
   11     print join ', ', sort @cities;
   12     print ".\n";
   13   }


The program has two pieces: Lines 2-7 read the input and build a data
structure, and lines 8-13 analyze the data and print out the report.
We're going to have a hash, C<%table>, whose keys are country names,
and whose values are references to arrays of city names.  The data
structure will look like this:


           %table
        +-------+---+
        |       |   |   +-----------+--------+
        |Germany| *---->| Frankfurt | Berlin |
        |       |   |   +-----------+--------+
        +-------+---+
        |       |   |   +----------+
        |Finland| *---->| Helsinki |
        |       |   |   +----------+
        +-------+---+
        |       |   |   +---------+------------+----------+
        |  USA  | *---->| Chicago | Washington | New York |
        |       |   |   +---------+------------+----------+
        +-------+---+

We'll look at output first.  Supposing we already have this structure,
how do we print it out?

    8   foreach $country (sort keys %table) {
    9     print "$country: ";
   10     my @cities = @{$table{$country}};
   11     print join ', ', sort @cities;
   12     print ".\n";
   13   }

C<%table> is an ordinary hash, and we get a list of keys from it, sort
the keys, and loop over the keys as usual.  The only use of references
is in line 10.  C<$table{$country}> looks up the key C<$country> in the
hash and gets the value, which is a reference to an array of cities in
that country.  B<Use Rule 1> says that we can recover the array by
saying C<@{$table{$country}}>.  Line 10 is just like

        @cities = @array;

except that the name C<array> has been replaced by the reference
C<{$table{$country}}>.  The C<@> tells Perl to get the entire array.
Having gotten the list of cities, we sort it, join it, and print it
out as usual.

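If you'd like to try the output half on its own, one way is to write
the structure from the diagram directly with B<Make Rule 2>.  This is a
sketch, not part of the original program: the literal C<%table> and the
C<@report> array (used here so the lines can be inspected before they
are printed) are assumptions for the example.

```perl
use strict;
use warnings;

# The data structure from the diagram, written out by hand.
my %table = (
    Germany => [ 'Frankfurt', 'Berlin' ],
    Finland => [ 'Helsinki' ],
    USA     => [ 'Chicago', 'Washington', 'New York' ],
);

my @report;
foreach my $country (sort keys %table) {
    my @cities = @{$table{$country}};      # Use Rule 1, as in line 10
    push @report, "$country: " . join(', ', sort @cities) . ".";
}
print "$_\n" for @report;
```

Because C<@cities> is a copy, and C<sort> returns a new list anyway,
the arrays stored in C<%table> keep their original order.  The copy
isn't strictly needed; C<sort @{$table{$country}}> would work as well.
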

Lines 2-7 are responsible for building the structure in the first
place.  Here they are again:

    2   while (<>) {
    3     chomp;
    4     my ($city, $country) = split /, /;
    5     $table{$country} = [] unless exists $table{$country};
    6     push @{$table{$country}}, $city;
    7   }

Lines 2-4 acquire a city and country name.  Line 5 looks to see if the
country is already present as a key in the hash.  If it's not, the
program uses the C<[]> notation (B<Make Rule 2>) to manufacture a new,
empty anonymous array of cities, and installs a reference to it into
the hash under the appropriate key.

Line 6 installs the city name into the appropriate array.
C<$table{$country}> now holds a reference to the array of cities seen
in that country so far.  Line 6 is exactly like

        push @array, $city;

except that the name C<array> has been replaced by the reference
C<{$table{$country}}>.  The C<push> adds a city name to the end of the
referred-to array.

There's one fine point I skipped.  Line 5 is unnecessary, and we can
get rid of it.

    2   while (<>) {
    3     chomp;
    4     my ($city, $country) = split /, /;
    5   ####  $table{$country} = [] unless exists $table{$country};
    6     push @{$table{$country}}, $city;
    7   }

If there's already an entry in C<%table> for the current C<$country>,
then nothing is different.  Line 6 will locate the value in
C<$table{$country}>, which is a reference to an array, and push
C<$city> into the array.  But what does it do when C<$country> holds a
key, say C<Greece>, that is not yet in C<%table>?

This is Perl, so it does the exact right thing.  It sees that you want
to push C<Athens> onto an array that doesn't exist, so it helpfully
makes a new, empty, anonymous array for you, installs it into
C<%table>, and then pushes C<Athens> onto it.  This is called
`autovivification'--bringing things to life automatically.  Perl saw
that the key wasn't in the hash, so it created a new hash entry
automatically.  Perl saw that you wanted to use the hash value as an
array, so it created a new empty array and installed a reference to it
in the hash automatically.  And as usual, Perl made the array one
element longer to hold the new city name.

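You can watch autovivification happen with a few lines.  This sketch
uses the C<Greece> and C<Athens> example from the text; C<Sparta> and
the C<$before>, C<$after>, and C<$kind> variables are additions for
illustration, and C<ref> is described under "The Rest" below.

```perl
use strict;
use warnings;

my %table;
my $before = exists $table{Greece} ? "yes" : "no";   # no entry yet

# No line 5 here: the push alone creates the entry and the array.
push @{$table{Greece}}, 'Athens';

my $after = exists $table{Greece} ? "yes" : "no";
my $kind  = ref $table{Greece};                       # kind of reference stored

push @{$table{Greece}}, 'Sparta';                     # entry exists now; plain push
my $cities = join ', ', @{$table{Greece}};

print "before: $before, after: $after, kind: $kind, cities: $cities\n";
```

This should print C<before: no, after: yes, kind: ARRAY, cities: Athens,
Sparta>, even under C<use strict>.
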

=head1 The Rest

I promised to give you 90% of the benefit with 10% of the details, and
that means I left out 90% of the details.  Now that you have an
overview of the important parts, it should be easier to read the
L<perlref> manual page, which discusses 100% of the details.

Some of the highlights of L<perlref>:

=over 4

=item *

You can make references to anything, including scalars, functions, and
other references.

=item *

In B<Use Rule 1>, you can omit the curly brackets whenever the thing
inside them is an atomic scalar variable like C<$aref>.  For example,
C<@$aref> is the same as C<@{$aref}>, and C<$$aref[1]> is the same as
C<${$aref}[1]>.  If you're just starting out, you may want to adopt
the habit of always including the curly brackets.

=item *

This doesn't copy the underlying array:

        $aref2 = $aref1;

You get two references to the same array.  If you modify
C<< $aref1->[23] >> and then look at
C<< $aref2->[23] >> you'll see the change.

To copy the array, use

        $aref2 = [@{$aref1}];

This uses C<[...]> notation to create a new anonymous array, and
C<$aref2> is assigned a reference to the new array.  The new array is
initialized with the contents of the array referred to by C<$aref1>.

Similarly, to copy an anonymous hash, you can use

        $href2 = {%{$href1}};

=item *

To see if a variable contains a reference, use the C<ref> function.  It
returns true if its argument is a reference.  Actually it's a little
better than that: It returns C<HASH> for hash references and C<ARRAY>
for array references.

=item *

If you try to use a reference like a string, you get strings like

        ARRAY(0x80f5dec)   or    HASH(0x826afc0)

If you ever see a string that looks like this, you'll know you
printed out a reference by mistake.

A side effect of this representation is that you can use C<eq> to see
if two references refer to the same thing.  (But you should usually use
C<==> instead because it's much faster.)

=item *

You can use a string as if it were a reference.  If you use the string
C<"foo"> as an array reference, it's taken to be a reference to the
array C<@foo>.  This is called a I<soft reference> or I<symbolic
reference>.  The declaration C<use strict 'refs'> disables this
feature, which can cause all sorts of trouble if you use it by accident.

=back

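Several of the highlights above fit into one short sketch.  The
variable names and sample data are invented for the example; it shows
sharing versus copying, the C<ref> function, the string form of a
reference, and C<==> on references.

```perl
use strict;
use warnings;

my $aref1 = [ 1, 2, 3 ];
my $alias = $aref1;             # second reference to the SAME array
my $copy  = [ @{$aref1} ];      # reference to a NEW array with the same items

$aref1->[0] = 99;               # visible through $alias, not through $copy

my $href1 = { APR => 4, AUG => 8 };
my $href2 = { %{$href1} };      # copy of the hash
$href2->{AUG} = 0;              # leaves the original alone

my @facts = (
    "alias sees "      . $alias->[0],
    "copy keeps "      . $copy->[0],
    "alias == aref1: " . ($alias == $aref1 ? "yes" : "no"),
    "copy == aref1: "  . ($copy  == $aref1 ? "yes" : "no"),
    "href1 AUG is "    . $href1->{AUG},
    "ref kinds: "      . join(' ', ref($aref1), ref($href1), ref(\$aref1)),
    "plain string: '"  . ref("foo") . "'",
);
print "$_\n" for @facts;

# Used as a string, a reference looks like ARRAY(0x...).
my $looks_like_ref = ("$aref1" =~ /^ARRAY\(0x[0-9a-f]+\)$/) ? 1 : 0;
print "stringified form recognized\n" if $looks_like_ref;
```

Note that C<[ @{$aref1} ]> and C<{ %{$href1} }> copy only the top
level.  If the elements are themselves references, the copy and the
original still share whatever those inner references point to.
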

You might prefer to go on to L<perllol> instead of L<perlref>; it
discusses lists of lists and multidimensional arrays in detail.  After
that, you should move on to L<perldsc>; it's a Data Structure Cookbook
that shows recipes for using and printing out arrays of hashes, hashes
of arrays, and other kinds of data.

=head1 Summary

Everyone needs compound data structures, and in Perl the way you get
them is with references.  There are four important rules for managing
references: Two for making references and two for using them.  Once
you know these rules you can do most of the important things you need
to do with references.

=head1 Credits

Author: Mark Jason Dominus, Plover Systems (C<mjd-perl-ref+@plover.com>)

This article originally appeared in I<The Perl Journal>
( http://www.tpj.com/ ) volume 3, #2.  Reprinted with permission.

The original title was I<Understand References Today>.

=head2 Distribution Conditions

Copyright 1998 The Perl Journal.

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

Irrespective of its distribution, all code examples in these files are
hereby placed into the public domain.  You are permitted and
encouraged to use this code in your own programs for fun or for profit
as you see fit.  A simple comment in the code giving credit would be
courteous but is not required.

=cut