=head1 NAME

perlhack - How to hack at the Perl internals

=head1 DESCRIPTION

This document attempts to explain how Perl development takes place,
and ends with some suggestions for people wanting to become bona fide
porters.

The perl5-porters mailing list is where the Perl standard distribution
is maintained and developed.  The list can get anywhere from 10 to 150
messages a day, depending on the heatedness of the debate.  Most days
there are two or three patches, extensions, features, or bugs being
discussed at a time.

A searchable archive of the list is at either:

    http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/

or

    http://archive.develooper.com/perl5-porters@perl.org/

List subscribers (the porters themselves) come in several flavours.
Some are quiet curious lurkers, who rarely pitch in and instead watch
the ongoing development to ensure they're forewarned of new changes or
features in Perl.  Some are representatives of vendors, who are there
to make sure that Perl continues to compile and work on their
platforms.  Some patch any reported bug that they know how to fix,
some are actively patching their pet area (threads, Win32, the regexp
engine), while others seem to do nothing but complain.  In other
words, it's your usual mix of technical people.

Over this group of porters presides Larry Wall.  He has the final word
in what does and does not change in the Perl language.  Various
releases of Perl are shepherded by a ``pumpking'', a porter
responsible for gathering patches, deciding on a patch-by-patch,
feature-by-feature basis what will and will not go into the release.
For instance, Gurusamy Sarathy was the pumpking for the 5.6 release of
Perl, Jarkko Hietaniemi is the pumpking for the 5.8 release, and
Hugo van der Sanden will be the pumpking for the 5.10 release.

In addition, various people are pumpkings for different things.  For
instance, Andy Dougherty and Jarkko Hietaniemi share the I<Configure>
pumpkin.

Larry sees Perl development along the lines of the US government:
there's the Legislature (the porters), the Executive branch (the
pumpkings), and the Supreme Court (Larry).  The legislature can
discuss and submit patches to the executive branch all they like, but
the executive branch is free to veto them.  Rarely, the Supreme Court
will side with the executive branch over the legislature, or the
legislature over the executive branch.  Mostly, however, the
legislature and the executive branch are supposed to get along and
work out their differences without impeachment or court cases.

You might sometimes see reference to Rule 1 and Rule 2.  Larry's power
as Supreme Court is expressed in The Rules:

=over 4

=item 1

Larry is always by definition right about how Perl should behave.
This means he has final veto power on the core functionality.

=item 2

Larry is allowed to change his mind about any matter at a later date,
regardless of whether he previously invoked Rule 1.

=back

Got that?  Larry is always right, even when he was wrong.  It's rare
to see either Rule exercised, but they are often alluded to.

New features and extensions to the language are contentious, because
the criteria used by the pumpkings, Larry, and other porters to decide
which features should be implemented and incorporated are not codified
in a few small design goals as with some other languages.  Instead,
the heuristics are flexible and often difficult to fathom.  Here is
one person's list, roughly in decreasing order of importance, of
heuristics that new features have to be weighed against:

=over 4

=item Does the concept match the general goals of Perl?

These haven't been written anywhere in stone, but one approximation
is:

 1. Keep it fast, simple, and useful.
 2. Keep features/concepts as orthogonal as possible.
 3. No arbitrary limits (platforms, data sizes, cultures).
 4. Keep it open and exciting to use/patch/advocate Perl everywhere.
 5. Either assimilate new technologies, or build bridges to them.

=item Where is the implementation?

All the talk in the world is useless without an implementation.  In
almost every case, the person or people who argue for a new feature
will be expected to be the ones who implement it.  Porters capable
of coding new features have their own agendas, and are not available
to implement your (possibly good) idea.

=item Backwards compatibility

It's a cardinal sin to break existing Perl programs.  New warnings are
contentious--some say that a program that emits warnings is not
broken, while others say it is.  Adding keywords has the potential to
break programs; changing the meaning of existing token sequences or
functions might break programs too.

=item Could it be a module instead?

Perl 5 has extension mechanisms, modules and XS, specifically to avoid
the need to keep changing the Perl interpreter.  You can write modules
that export functions, you can give those functions prototypes so they
can be called like built-in functions, you can even write XS code to
mess with the runtime data structures of the Perl interpreter if you
want to implement really complicated things.  If it can be done in a
module instead of in the core, it's highly unlikely to be added.

=item Is the feature generic enough?

Is this something that only the submitter wants added to the language,
or would it be broadly useful?  Sometimes, instead of adding a feature
with a tight focus, the porters might decide to wait until someone
implements the more generalized feature.  For instance, instead of
implementing a ``delayed evaluation'' feature, the porters are waiting
for a macro system that would permit delayed evaluation and much more.

=item Does it potentially introduce new bugs?

Radical rewrites of large chunks of the Perl interpreter have the
potential to introduce new bugs.  The smaller and more localized the
change, the better.

=item Does it preclude other desirable features?

A patch is likely to be rejected if it closes off future avenues of
development.  For instance, a patch that placed a true and final
interpretation on prototypes is likely to be rejected because there
are still options for the future of prototypes that haven't been
addressed.

=item Is the implementation robust?

Good patches (tight code, complete, correct) stand more chance of
going in.  Sloppy or incorrect patches might be placed on the back
burner until the pumpking has time to fix them, or might be discarded
altogether without further notice.

=item Is the implementation generic enough to be portable?

The worst patches make use of system-specific features.  It's highly
unlikely that nonportable additions to the Perl language will be
accepted.

=item Is the implementation tested?

Patches which change behaviour (fixing bugs or introducing new features)
must include regression tests to verify that everything works as expected.
Without tests provided by the original author, how can anyone else changing
perl in the future be sure that they haven't unwittingly broken the behaviour
the patch implements? And without tests, how can the patch's author be
confident that his/her hard work put into the patch won't be accidentally
thrown away by someone in the future?

=item Is there enough documentation?

Patches without documentation are probably ill-thought out or
incomplete.  Nothing can be added without documentation, so submitting
a patch for the appropriate manpages as well as the source code is
always a good idea.

=item Is there another way to do it?

Larry said ``Although the Perl Slogan is I<There's More Than One Way
to Do It>, I hesitate to make 10 ways to do something''.  This is a
tricky heuristic to navigate, though--one man's essential addition is
another man's pointless cruft.

=item Does it create too much work?

Work for the pumpking, work for Perl programmers, work for module
authors, ...  Perl is supposed to be easy.

=item Patches speak louder than words

Working code is always preferred to pie-in-the-sky ideas.  A patch to
add a feature stands a much higher chance of making it to the language
than does a random feature request, no matter how fervently argued the
request might be.  This ties into ``Will it be useful?'', as the fact
that someone took the time to make the patch demonstrates a strong
desire for the feature.

=back

If you're on the list, you might hear the word ``core'' bandied
around.  It refers to the standard distribution.  ``Hacking on the
core'' means you're changing the C source code to the Perl
interpreter.  ``A core module'' is one that ships with Perl.

=head2 Keeping in sync

The source code to the Perl interpreter, in its different versions, is
kept in a repository managed by a revision control system (currently
the Perforce program, see http://perforce.com/ ).  The
pumpkings and a few others have access to the repository to check in
changes.  Periodically the pumpking for the development version of Perl
will release a new version, so the rest of the porters can see what's
changed.  The current state of the main trunk of the repository, and
patches that describe the individual changes that have happened since
the last public release, are available at this location:

    http://public.activestate.com/gsar/APC/
    ftp://ftp.linux.activestate.com/pub/staff/gsar/APC/

If you're looking for a particular change, or a change that affected
a particular set of files, you may find the B<Perl Repository Browser>
useful:

    http://public.activestate.com/cgi-bin/perlbrowse

You may also want to subscribe to the perl5-changes mailing list to
receive a copy of each patch that gets submitted to the maintenance
and development "branches" of the perl repository.  See
http://lists.perl.org/ for subscription information.

If you are a member of the perl5-porters mailing list, it is a good
thing to keep in touch with the most recent changes, if only to
verify that what you would have posted as a bug report isn't already
solved in the most recent available perl development branch, also
known as perl-current, bleading edge perl, bleedperl or bleadperl.

Needless to say, the source code in perl-current is usually in a perpetual
state of evolution.  You should expect it to be very buggy.  Do B<not> use
it for any purpose other than testing and development.

Keeping in sync with the most recent branch can be done in several ways,
but the most convenient and reliable way is using B<rsync>, available at
ftp://rsync.samba.org/pub/rsync/ .  (You can also get the most recent
branch by FTP.)

If you choose to keep in sync using rsync, there are two approaches
to doing so:

=over 4

=item rsync'ing the source tree

Presuming you are in the directory where your perl source resides
and you have rsync installed and available, you can `upgrade' to
the bleadperl using:

 # rsync -avz rsync://ftp.linux.activestate.com/perl-current/ .

This takes care of updating every single item in the source tree to
the latest applied patch level, creating files that are new (to your
distribution) and setting date/time stamps of existing files to
reflect the bleadperl status.

Note that this will not delete any files that were in '.' before
the rsync. Once you are sure that the rsync is running correctly,
run it with the --delete and the --dry-run options like this:

 # rsync -avz --delete --dry-run rsync://ftp.linux.activestate.com/perl-current/ .

This will I<simulate> an rsync run that also deletes files not
present in the bleadperl master copy. Observe the results from
this run closely. If you are sure that the actual run would delete
no files precious to you, you could remove the '--dry-run' option.
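
One way to make the "observe closely" step less error-prone is to
reduce the dry-run output to the files that would be removed.  The
sketch below assumes that rsync, when run with -v and --delete, reports
each such file on a line beginning with "deleting "; the function name
list_deletions is made up for this example.

```shell
# list_deletions: read rsync dry-run output on stdin and print only
# the paths that rsync says it would delete.
# Assumption: rsync -v --delete prints "deleting <path>" for each one.
list_deletions() {
    sed -n 's/^deleting //p'
}

# Illustrative use against the server named above (not run here):
#   rsync -avz --delete --dry-run \
#       rsync://ftp.linux.activestate.com/perl-current/ . | list_deletions
```

If the list is empty, or contains nothing you care about, the same
command without --dry-run should be safe to run.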

You can then check what patch was the latest that was applied by
looking in the file B<.patch>, which will show the number of the
latest patch.

If you have more than one machine to keep in sync, and not all of
them have access to the WAN (so you are not able to rsync all the
source trees to the real source), there are some ways to get around
this problem.

=over 4

=item Using rsync over the LAN

Set up a local rsync server which makes the rsynced source tree
available to the LAN and sync the other machines against this
directory.

From http://rsync.samba.org/README.html :

   "Rsync uses rsh or ssh for communication. It does not need to be
    setuid and requires no special privileges for installation.  It
    does not require an inetd entry or a daemon.  You must, however,
    have a working rsh or ssh system.  Using ssh is recommended for
    its security features."

=item Pushing over NFS

With the other systems mounted over NFS, you can take an
active pushing approach by checking the just updated tree against
the other not-yet synced trees. An example would be

  #!/usr/bin/perl -w

  use strict;
  use File::Copy;

  # Record mode, size and mtime for every file listed in MANIFEST.
  my %MF = map {
      m/(\S+)/;
      $1 => [ (stat $1)[2, 7, 9] ];     # mode, size, mtime
      } `cat MANIFEST`;

  # NFS-mounted source trees, one per remote host.
  my %remote = map { $_ => "/$_/pro/3gl/CPAN/perl-5.7.1" } qw(host1 host2);

  foreach my $host (keys %remote) {
      unless (-d $remote{$host}) {
          print STDERR "Cannot Xsync for host $host\n";
          next;
          }
      foreach my $file (keys %MF) {
          my $rfile = "$remote{$host}/$file";
          my ($mode, $size, $mtime) = (stat $rfile)[2, 7, 9];
          defined $size or ($mode, $size, $mtime) = (0, 0, 0);
          # Skip files whose size and mtime already match.
          $size == $MF{$file}[1] && $mtime == $MF{$file}[2] and next;
          printf "%4s %-34s %8d %9d  %8d %9d\n",
              $host, $file, $MF{$file}[1], $MF{$file}[2], $size, $mtime;
          unlink $rfile;
          copy ($file, $rfile)
              or warn "Cannot copy $file to $rfile: $!\n";
          utime time, $MF{$file}[2], $rfile;
          # Keep only the permission bits of the saved mode.
          chmod $MF{$file}[0] & 07777, $rfile;
          }
      }

This is not perfect, though. It could be improved by checking
file checksums before updating, and not all NFS implementations
handle utime reliably.
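
The checksum idea could look roughly like the following sketch, which
compares two files with the POSIX cksum utility instead of relying on
size and mtime.  The function names same_content and push_if_changed
are invented here, and the paths in the usage comment are only the
example paths from the script above.

```shell
# same_content A B: succeed when both files exist and cksum reports
# the same CRC and byte count for them.  Reading from stdin keeps the
# file name out of cksum's output, so the two results compare directly.
same_content() {
    [ -f "$1" ] && [ -f "$2" ] || return 1
    [ "$(cksum < "$1")" = "$(cksum < "$2")" ]
}

# push_if_changed SRC DST: copy SRC over DST only when the contents
# differ; cp -p carries the mode and modification time across.
push_if_changed() {
    same_content "$1" "$2" && return 0
    cp -p "$1" "$2"
}

# Illustrative use (not run here):
#   push_if_changed perl.c /host1/pro/3gl/CPAN/perl-5.7.1/perl.c
```

A CRC is not a cryptographic guarantee, but it does catch files that
changed while keeping the same size and timestamp.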

=back

=item rsync'ing the patches

The source tree is maintained by the pumpking who applies patches to
the files in the tree. These patches are either created by the
pumpking himself using C<diff -c> after updating the file manually or
by applying patches sent in by posters on the perl5-porters list.
These patches are also saved and rsync'able, so you can apply them
yourself to the source files.

Presuming you are in a directory where your patches reside, you can
get them in sync with

 # rsync -avz rsync://ftp.linux.activestate.com/perl-current-diffs/ .

This makes sure the latest available patch is downloaded to your
patch directory.

It's then up to you to apply these patches, using something like

 # last=`ls -t *.gz | sed q`
 # rsync -avz rsync://ftp.linux.activestate.com/perl-current-diffs/ .
 # find . -name '*.gz' -newer $last -exec gzcat {} \; >blead.patch
 # cd ../perl-current
 # patch -p1 -N <../perl-current-diffs/blead.patch

or, since this is only a hint towards how it works, use CPAN-patchaperl
from Andreas König to have better control over the patching process.
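
The find command above does not promise any particular order, while
patches generally need to be applied in sequence.  A minimal sketch of
an ordered selection follows.  It assumes each diff is stored as
<change number>.gz, which is a guess based on the patch numbering
described in this document, and pending_patches is a made-up helper
name.

```shell
# pending_patches DIR LAST: print, in ascending numeric order, the
# change numbers of the NNNN.gz files in DIR that are greater than LAST.
# Assumption: each diff is stored as <change number>.gz.
pending_patches() {
    for f in "$1"/*.gz; do
        n=$(basename "$f" .gz)
        case $n in
            ''|*[!0-9]*) continue ;;    # skip names that are not numbers
        esac
        if [ "$n" -gt "$2" ]; then
            echo "$n"
        fi
    done | sort -n
}

# Illustrative use from inside the source tree (not run here):
#   for n in $(pending_patches ../perl-current-diffs "$(cat .patch)"); do
#       gzip -dc "../perl-current-diffs/$n.gz" | patch -p1 -N
#   done
```

Sorting numerically rather than by name or by timestamp keeps 10001
after 7584 even if the files arrived out of order.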

=back

=head2 Why rsync the source tree

=over 4

=item It's easier to rsync the source tree

Since you don't have to apply the patches yourself, you are sure all
files in the source tree are in the right state.

=item It's more reliable

While both the rsync-able source and patch areas are automatically
updated every few minutes, keep in mind that applying patches may
sometimes mean careful hand-holding, especially if your version of
the C<patch> program does not understand how to deal with new files,
files with 8-bit characters, or files without trailing newlines.

=back

=head2 Why rsync the patches

=over 4

=item It's easier to rsync the patches

If you have more than one machine that you want to keep in step with
bleadperl, it's easier to rsync the patches only once and then apply
them to all the source trees on the different machines.

If you try to keep pace on 5 different machines, of which only one
has access to the WAN, rsync'ing all the source trees would then have
to be done 5 times over NFS.  Having rsync'ed the patches only once,
I can apply them to all the source trees automatically.  Need I say
more ;-)

=item It's a good reference

If you want not only to have the most recent development branch,
but also to B<fix> bugs or extend features, you will want to dive
into the sources. If you are a seasoned perl core diver, you don't
need no manuals, tips, roadmaps, perlguts.pod or other aids to find
your way around. But if you are a starter, the patches may help you
in finding where you should start and how to change the bits that
bug you.

The file B<Changes> is updated on occasions the pumpking sees as his
own little sync points. On those occasions, he releases a tar-ball of
the current source tree (e.g. perl@7582.tar.gz), which will be an
excellent point to start with when choosing to use the 'rsync the
patches' scheme. Starting with perl@7582, which means a set of source
files on which the latest applied patch is number 7582, you apply all
succeeding patches available from then on (7583, 7584, ...).

You can use the patches later as a kind of search archive.

=over 4

=item Finding a start point

If you want to fix/change the behaviour of function/feature Foo, just
scan the patches for patches that mention Foo either in the subject,
the comments, or the body of the fix.  There is a good chance the patch
shows you the files that are affected by that patch, which are very
likely to be the starting point of your journey into the guts of perl.

=item Finding how to fix a bug

If you've found I<where> the function/feature Foo misbehaves, but you
don't know how to fix it (but you do know the change you want to
make), you can, again, peruse the patches for similar changes and
see how others applied the fix.

=item Finding the source of misbehaviour

When you keep in sync with bleadperl, the pumpking would love to
I<see> that the community efforts really work. So after each of his
sync points, you should run 'make test' to check if everything is
still in working order. If it is, you do 'make ok', which will send an
OK report to perlbug@perl.org. (If you do not have access to a mailer
on the system where 'make test' just succeeded, you can do
'make okfile', which creates the file C<perl.ok>, which you can then
take to your favourite mailer and mail yourself.)

But of course, things will not always go smoothly, and one or more
tests may fail under 'make test'. Before sending in a bug report
(using 'make nok' or 'make nokfile'), check the mailing list to see
whether someone else has reported the bug already and if so, confirm
it by replying to that message. If not, you might want to trace the
source of that misbehaviour B<before> sending in the bug, which will
help all the other porters in finding the solution.

Here the saved patches come in very handy. You can check the list of
patches to see which patch changed what file and what change caused
the misbehaviour. If you note that in the bug report, it saves the
person trying to solve it from having to look for that point.

=back
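
The scan described under ``Finding a start point'' can be sketched as
a small shell function.  This is only an illustration: the name
patches_mentioning is invented, and it assumes the patches sit in one
directory as gzip-compressed files, as in the rsync example earlier.

```shell
# patches_mentioning DIR TERM: print the name of every gzipped patch
# in DIR whose text contains TERM, matched as a fixed string.
patches_mentioning() {
    for f in "$1"/*.gz; do
        [ -f "$f" ] || continue
        if gzip -dc "$f" | grep -F -q -- "$2"; then
            basename "$f"
        fi
    done
}

# Illustrative use (not run here):
#   patches_mentioning ../perl-current-diffs Foo
```

The files named in the matching patches are then the natural places to
start reading.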

If searching the patches is too bothersome, you might consider using
perl's bugtron to find more information about discussions and
ramblings on posted bugs.

If you want to get the best of both worlds, rsync both the source
tree for convenience, reliability and ease and rsync the patches
for reference.

=back


=head2 Perlbug administration

There is a single remote administrative interface for modifying bug status,
category, open issues etc. using the B<RT> I<bugtracker> system, maintained
by I<Robert Spier>.  Become an administrator, and close any bugs you can get
your sticky mitts on:

    http://rt.perl.org

The bugtracker mechanism for B<perl5> bugs in particular is at:

    http://bugs6.perl.org/perlbug

To email the bug system administrators:

    "perlbug-admin" <perlbug-admin@perl.org>


=head2 Submitting patches

Always submit patches to I<perl5-porters@perl.org>.  If you're
patching a core module and there's an author listed, send the author a
copy (see L<Patching a core module>).  This lets other porters review
your patch, which catches a surprising number of errors in patches.
Either use the diff program (available in source code form from
ftp://ftp.gnu.org/pub/gnu/ ), or use Johan Vromans' I<makepatch>
(available from I<CPAN/authors/id/JV/>).  Unified diffs are preferred,
but context diffs are accepted.  Do not send RCS-style diffs or diffs
without context lines.  More information is given in the
I<Porting/patching.pod> file in the Perl source distribution.  Please
patch against the latest B<development> version (e.g., if you're
fixing a bug in the 5.005 track, patch against the latest 5.005_5x
version).  Only patches that survive the heat of the development
branch get applied to maintenance versions.
517*0Sstevel@tonic-gate
518*0Sstevel@tonic-gateYour patch should update the documentation and test suite.  See
519*0Sstevel@tonic-gateL<Writing a test>.
520*0Sstevel@tonic-gate
521*0Sstevel@tonic-gateTo report a bug in Perl, use the program I<perlbug> which comes with
522*0Sstevel@tonic-gatePerl (if you can't get Perl to work, send mail to the address
523*0Sstevel@tonic-gateI<perlbug@perl.org> or I<perlbug@perl.com>).  Reporting bugs through
524*0Sstevel@tonic-gateI<perlbug> feeds into the automated bug-tracking system, access to
525*0Sstevel@tonic-gatewhich is provided through the web at http://bugs.perl.org/ .  It
526*0Sstevel@tonic-gateoften pays to check the archives of the perl5-porters mailing list to
527*0Sstevel@tonic-gatesee whether the bug you're reporting has been reported before, and if
528*0Sstevel@tonic-gateso whether it was considered a bug.  See above for the location of
529*0Sstevel@tonic-gatethe searchable archives.
530*0Sstevel@tonic-gate
The CPAN testers ( http://testers.cpan.org/ ) are a group of
volunteers who test CPAN modules on a variety of platforms.  Perl
Smokers ( http://archives.develooper.com/daily-build@perl.org/ )
automatically test Perl source releases on platforms with various
configurations.  Both efforts welcome volunteers.
536*0Sstevel@tonic-gate
It's a good idea to read and lurk for a while before chipping in.
That way you'll get to see the dynamic of the conversations, learn the
personalities of the players, and hopefully be better prepared to make
a useful contribution when you do speak up.
541*0Sstevel@tonic-gate
542*0Sstevel@tonic-gateIf after all this you still think you want to join the perl5-porters
543*0Sstevel@tonic-gatemailing list, send mail to I<perl5-porters-subscribe@perl.org>.  To
544*0Sstevel@tonic-gateunsubscribe, send mail to I<perl5-porters-unsubscribe@perl.org>.
545*0Sstevel@tonic-gate
546*0Sstevel@tonic-gateTo hack on the Perl guts, you'll need to read the following things:
547*0Sstevel@tonic-gate
548*0Sstevel@tonic-gate=over 3
549*0Sstevel@tonic-gate
550*0Sstevel@tonic-gate=item L<perlguts>
551*0Sstevel@tonic-gate
552*0Sstevel@tonic-gateThis is of paramount importance, since it's the documentation of what
553*0Sstevel@tonic-gategoes where in the Perl source. Read it over a couple of times and it
554*0Sstevel@tonic-gatemight start to make sense - don't worry if it doesn't yet, because the
555*0Sstevel@tonic-gatebest way to study it is to read it in conjunction with poking at Perl
556*0Sstevel@tonic-gatesource, and we'll do that later on.
557*0Sstevel@tonic-gate
558*0Sstevel@tonic-gateYou might also want to look at Gisle Aas's illustrated perlguts -
559*0Sstevel@tonic-gatethere's no guarantee that this will be absolutely up-to-date with the
560*0Sstevel@tonic-gatelatest documentation in the Perl core, but the fundamentals will be
561*0Sstevel@tonic-gateright. ( http://gisle.aas.no/perl/illguts/ )
562*0Sstevel@tonic-gate
563*0Sstevel@tonic-gate=item L<perlxstut> and L<perlxs>
564*0Sstevel@tonic-gate
565*0Sstevel@tonic-gateA working knowledge of XSUB programming is incredibly useful for core
566*0Sstevel@tonic-gatehacking; XSUBs use techniques drawn from the PP code, the portion of the
567*0Sstevel@tonic-gateguts that actually executes a Perl program. It's a lot gentler to learn
568*0Sstevel@tonic-gatethose techniques from simple examples and explanation than from the core
569*0Sstevel@tonic-gateitself.
570*0Sstevel@tonic-gate
571*0Sstevel@tonic-gate=item L<perlapi>
572*0Sstevel@tonic-gate
573*0Sstevel@tonic-gateThe documentation for the Perl API explains what some of the internal
574*0Sstevel@tonic-gatefunctions do, as well as the many macros used in the source.
575*0Sstevel@tonic-gate
576*0Sstevel@tonic-gate=item F<Porting/pumpkin.pod>
577*0Sstevel@tonic-gate
578*0Sstevel@tonic-gateThis is a collection of words of wisdom for a Perl porter; some of it is
579*0Sstevel@tonic-gateonly useful to the pumpkin holder, but most of it applies to anyone
580*0Sstevel@tonic-gatewanting to go about Perl development.
581*0Sstevel@tonic-gate
582*0Sstevel@tonic-gate=item The perl5-porters FAQ
583*0Sstevel@tonic-gate
584*0Sstevel@tonic-gateThis should be available from http://simon-cozens.org/writings/p5p-faq ;
585*0Sstevel@tonic-gatealternatively, you can get the FAQ emailed to you by sending mail to
586*0Sstevel@tonic-gateC<perl5-porters-faq@perl.org>. It contains hints on reading perl5-porters,
587*0Sstevel@tonic-gateinformation on how perl5-porters works and how Perl development in general
588*0Sstevel@tonic-gateworks.
589*0Sstevel@tonic-gate
590*0Sstevel@tonic-gate=back
591*0Sstevel@tonic-gate
592*0Sstevel@tonic-gate=head2 Finding Your Way Around
593*0Sstevel@tonic-gate
594*0Sstevel@tonic-gatePerl maintenance can be split into a number of areas, and certain people
595*0Sstevel@tonic-gate(pumpkins) will have responsibility for each area. These areas sometimes
596*0Sstevel@tonic-gatecorrespond to files or directories in the source kit. Among the areas are:
597*0Sstevel@tonic-gate
598*0Sstevel@tonic-gate=over 3
599*0Sstevel@tonic-gate
600*0Sstevel@tonic-gate=item Core modules
601*0Sstevel@tonic-gate
602*0Sstevel@tonic-gateModules shipped as part of the Perl core live in the F<lib/> and F<ext/>
603*0Sstevel@tonic-gatesubdirectories: F<lib/> is for the pure-Perl modules, and F<ext/>
604*0Sstevel@tonic-gatecontains the core XS modules.
605*0Sstevel@tonic-gate
606*0Sstevel@tonic-gate=item Tests
607*0Sstevel@tonic-gate
There are tests for nearly all the modules, built-ins and major bits
of functionality.  Test files all have a .t suffix.  Module tests live
in the F<lib/> and F<ext/> directories next to the module being
tested.  Others live in F<t/>.  See L<Writing a test>.
612*0Sstevel@tonic-gate
613*0Sstevel@tonic-gate=item Documentation
614*0Sstevel@tonic-gate
Documentation maintenance includes looking after everything in the
F<pod/> directory and the documentation to the modules in core, as
well as contributing new documentation.
618*0Sstevel@tonic-gate
619*0Sstevel@tonic-gate=item Configure
620*0Sstevel@tonic-gate
621*0Sstevel@tonic-gateThe configure process is the way we make Perl portable across the
622*0Sstevel@tonic-gatemyriad of operating systems it supports. Responsibility for the
623*0Sstevel@tonic-gateconfigure, build and installation process, as well as the overall
624*0Sstevel@tonic-gateportability of the core code rests with the configure pumpkin - others
625*0Sstevel@tonic-gatehelp out with individual operating systems.
626*0Sstevel@tonic-gate
The files involved are the operating system directories (F<win32/>,
F<os2/>, F<vms/> and so on), the shell scripts which generate F<config.h>
and F<Makefile>, as well as the metaconfig files which generate
F<Configure>. (metaconfig isn't included in the core distribution.)
631*0Sstevel@tonic-gate
632*0Sstevel@tonic-gate=item Interpreter
633*0Sstevel@tonic-gate
634*0Sstevel@tonic-gateAnd of course, there's the core of the Perl interpreter itself. Let's
635*0Sstevel@tonic-gatehave a look at that in a little more detail.
636*0Sstevel@tonic-gate
637*0Sstevel@tonic-gate=back
638*0Sstevel@tonic-gate
639*0Sstevel@tonic-gateBefore we leave looking at the layout, though, don't forget that
640*0Sstevel@tonic-gateF<MANIFEST> contains not only the file names in the Perl distribution,
641*0Sstevel@tonic-gatebut short descriptions of what's in them, too. For an overview of the
642*0Sstevel@tonic-gateimportant files, try this:
643*0Sstevel@tonic-gate
644*0Sstevel@tonic-gate    perl -lne 'print if /^[^\/]+\.[ch]\s+/' MANIFEST
645*0Sstevel@tonic-gate
646*0Sstevel@tonic-gate=head2 Elements of the interpreter
647*0Sstevel@tonic-gate
648*0Sstevel@tonic-gateThe work of the interpreter has two main stages: compiling the code
649*0Sstevel@tonic-gateinto the internal representation, or bytecode, and then executing it.
650*0Sstevel@tonic-gateL<perlguts/Compiled code> explains exactly how the compilation stage
651*0Sstevel@tonic-gatehappens.
652*0Sstevel@tonic-gate
653*0Sstevel@tonic-gateHere is a short breakdown of perl's operation:
654*0Sstevel@tonic-gate
655*0Sstevel@tonic-gate=over 3
656*0Sstevel@tonic-gate
657*0Sstevel@tonic-gate=item Startup
658*0Sstevel@tonic-gate
The action begins in F<perlmain.c> (or F<miniperlmain.c> for miniperl).
This is very high-level code, enough to fit on a single screen, and it
resembles the code found in L<perlembed>; most of the real action takes
place in F<perl.c>.
663*0Sstevel@tonic-gate
664*0Sstevel@tonic-gateFirst, F<perlmain.c> allocates some memory and constructs a Perl
665*0Sstevel@tonic-gateinterpreter:
666*0Sstevel@tonic-gate
667*0Sstevel@tonic-gate    1 PERL_SYS_INIT3(&argc,&argv,&env);
668*0Sstevel@tonic-gate    2
669*0Sstevel@tonic-gate    3 if (!PL_do_undump) {
670*0Sstevel@tonic-gate    4     my_perl = perl_alloc();
671*0Sstevel@tonic-gate    5     if (!my_perl)
672*0Sstevel@tonic-gate    6         exit(1);
673*0Sstevel@tonic-gate    7     perl_construct(my_perl);
674*0Sstevel@tonic-gate    8     PL_perl_destruct_level = 0;
675*0Sstevel@tonic-gate    9 }
676*0Sstevel@tonic-gate
677*0Sstevel@tonic-gateLine 1 is a macro, and its definition is dependent on your operating
678*0Sstevel@tonic-gatesystem. Line 3 references C<PL_do_undump>, a global variable - all
679*0Sstevel@tonic-gateglobal variables in Perl start with C<PL_>. This tells you whether the
680*0Sstevel@tonic-gatecurrent running program was created with the C<-u> flag to perl and then
681*0Sstevel@tonic-gateF<undump>, which means it's going to be false in any sane context.
682*0Sstevel@tonic-gate
683*0Sstevel@tonic-gateLine 4 calls a function in F<perl.c> to allocate memory for a Perl
684*0Sstevel@tonic-gateinterpreter. It's quite a simple function, and the guts of it looks like
685*0Sstevel@tonic-gatethis:
686*0Sstevel@tonic-gate
687*0Sstevel@tonic-gate    my_perl = (PerlInterpreter*)PerlMem_malloc(sizeof(PerlInterpreter));
688*0Sstevel@tonic-gate
689*0Sstevel@tonic-gateHere you see an example of Perl's system abstraction, which we'll see
690*0Sstevel@tonic-gatelater: C<PerlMem_malloc> is either your system's C<malloc>, or Perl's
691*0Sstevel@tonic-gateown C<malloc> as defined in F<malloc.c> if you selected that option at
692*0Sstevel@tonic-gateconfigure time.
693*0Sstevel@tonic-gate
694*0Sstevel@tonic-gateNext, in line 7, we construct the interpreter; this sets up all the
695*0Sstevel@tonic-gatespecial variables that Perl needs, the stacks, and so on.
696*0Sstevel@tonic-gate
697*0Sstevel@tonic-gateNow we pass Perl the command line options, and tell it to go:
698*0Sstevel@tonic-gate
699*0Sstevel@tonic-gate    exitstatus = perl_parse(my_perl, xs_init, argc, argv, (char **)NULL);
700*0Sstevel@tonic-gate    if (!exitstatus) {
701*0Sstevel@tonic-gate        exitstatus = perl_run(my_perl);
702*0Sstevel@tonic-gate    }
703*0Sstevel@tonic-gate
704*0Sstevel@tonic-gate
705*0Sstevel@tonic-gateC<perl_parse> is actually a wrapper around C<S_parse_body>, as defined
706*0Sstevel@tonic-gatein F<perl.c>, which processes the command line options, sets up any
707*0Sstevel@tonic-gatestatically linked XS modules, opens the program and calls C<yyparse> to
708*0Sstevel@tonic-gateparse it.
709*0Sstevel@tonic-gate
710*0Sstevel@tonic-gate=item Parsing
711*0Sstevel@tonic-gate
The aim of this stage is to take the Perl source, and turn it into an op
tree. We'll see what one of those looks like later. Strictly speaking,
there are three things going on here.
715*0Sstevel@tonic-gate
716*0Sstevel@tonic-gateC<yyparse>, the parser, lives in F<perly.c>, although you're better off
717*0Sstevel@tonic-gatereading the original YACC input in F<perly.y>. (Yes, Virginia, there
718*0Sstevel@tonic-gateB<is> a YACC grammar for Perl!) The job of the parser is to take your
719*0Sstevel@tonic-gatecode and `understand' it, splitting it into sentences, deciding which
720*0Sstevel@tonic-gateoperands go with which operators and so on.
721*0Sstevel@tonic-gate
The parser is nobly assisted by the lexer, which chunks up your input
into tokens, and decides what type of thing each token is: a variable
name, an operator, a bareword, a subroutine, a core function, and so on.
The main point of entry to the lexer is C<yylex>, and that and its
associated routines can be found in F<toke.c>. Perl isn't much like
other computer languages; it's highly context sensitive at times, and it
can be tricky to work out what sort of token something is, or where a
token ends. As such, there's a lot of interplay between the tokeniser
and the parser, which can get pretty frightening if you're not used to
it.
731*0Sstevel@tonic-gate
732*0Sstevel@tonic-gateAs the parser understands a Perl program, it builds up a tree of
733*0Sstevel@tonic-gateoperations for the interpreter to perform during execution. The routines
734*0Sstevel@tonic-gatewhich construct and link together the various operations are to be found
735*0Sstevel@tonic-gatein F<op.c>, and will be examined later.
736*0Sstevel@tonic-gate
737*0Sstevel@tonic-gate=item Optimization
738*0Sstevel@tonic-gate
739*0Sstevel@tonic-gateNow the parsing stage is complete, and the finished tree represents
740*0Sstevel@tonic-gatethe operations that the Perl interpreter needs to perform to execute our
741*0Sstevel@tonic-gateprogram. Next, Perl does a dry run over the tree looking for
742*0Sstevel@tonic-gateoptimisations: constant expressions such as C<3 + 4> will be computed
743*0Sstevel@tonic-gatenow, and the optimizer will also see if any multiple operations can be
744*0Sstevel@tonic-gatereplaced with a single one. For instance, to fetch the variable C<$foo>,
745*0Sstevel@tonic-gateinstead of grabbing the glob C<*foo> and looking at the scalar
746*0Sstevel@tonic-gatecomponent, the optimizer fiddles the op tree to use a function which
747*0Sstevel@tonic-gatedirectly looks up the scalar in question. The main optimizer is C<peep>
748*0Sstevel@tonic-gatein F<op.c>, and many ops have their own optimizing functions.
749*0Sstevel@tonic-gate
750*0Sstevel@tonic-gate=item Running
751*0Sstevel@tonic-gate
752*0Sstevel@tonic-gateNow we're finally ready to go: we have compiled Perl byte code, and all
753*0Sstevel@tonic-gatethat's left to do is run it. The actual execution is done by the
754*0Sstevel@tonic-gateC<runops_standard> function in F<run.c>; more specifically, it's done by
755*0Sstevel@tonic-gatethese three innocent looking lines:
756*0Sstevel@tonic-gate
757*0Sstevel@tonic-gate    while ((PL_op = CALL_FPTR(PL_op->op_ppaddr)(aTHX))) {
758*0Sstevel@tonic-gate        PERL_ASYNC_CHECK();
759*0Sstevel@tonic-gate    }
760*0Sstevel@tonic-gate
761*0Sstevel@tonic-gateYou may be more comfortable with the Perl version of that:
762*0Sstevel@tonic-gate
763*0Sstevel@tonic-gate    PERL_ASYNC_CHECK() while $Perl::op = &{$Perl::op->{function}};
764*0Sstevel@tonic-gate
765*0Sstevel@tonic-gateWell, maybe not. Anyway, each op contains a function pointer, which
766*0Sstevel@tonic-gatestipulates the function which will actually carry out the operation.
767*0Sstevel@tonic-gateThis function will return the next op in the sequence - this allows for
768*0Sstevel@tonic-gatethings like C<if> which choose the next op dynamically at run time.
769*0Sstevel@tonic-gateThe C<PERL_ASYNC_CHECK> makes sure that things like signals interrupt
770*0Sstevel@tonic-gateexecution if required.
771*0Sstevel@tonic-gate
772*0Sstevel@tonic-gateThe actual functions called are known as PP code, and they're spread
773*0Sstevel@tonic-gatebetween four files: F<pp_hot.c> contains the `hot' code, which is most
774*0Sstevel@tonic-gateoften used and highly optimized, F<pp_sys.c> contains all the
775*0Sstevel@tonic-gatesystem-specific functions, F<pp_ctl.c> contains the functions which
776*0Sstevel@tonic-gateimplement control structures (C<if>, C<while> and the like) and F<pp.c>
777*0Sstevel@tonic-gatecontains everything else. These are, if you like, the C code for Perl's
778*0Sstevel@tonic-gatebuilt-in functions and operators.
779*0Sstevel@tonic-gate
780*0Sstevel@tonic-gate=back
781*0Sstevel@tonic-gate
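To make the shape of the run loop concrete, here is a minimal,
self-contained sketch in plain C. It is not Perl source: C<toy_op>,
C<toy_interp> and the C<toy_pp_*> functions are names invented for
illustration, and the value stack is far simpler than Perl's. It
assumes only what is described above, namely that each op carries a
function pointer and that the function returns the next op to run.

```c
#include <stddef.h>

/* Hypothetical, simplified op; the real struct op has many more fields. */
typedef struct toy_op toy_op;

typedef struct {
    int stack[8];   /* tiny value stack */
    int sp;         /* number of values currently on the stack */
    toy_op *cur;    /* plays the role of PL_op */
} toy_interp;

struct toy_op {
    toy_op *(*ppaddr)(toy_interp *);  /* plays the role of op_ppaddr */
    toy_op *next;                     /* default successor */
    toy_op *other;                    /* alternative successor for a branch */
    int value;                        /* operand used by toy_pp_const */
};

/* Push a constant, then continue with the default successor. */
static toy_op *toy_pp_const(toy_interp *in)
{
    in->stack[in->sp++] = in->cur->value;
    return in->cur->next;
}

/* Pop two values and push their sum. */
static toy_op *toy_pp_add(toy_interp *in)
{
    int right = in->stack[--in->sp];
    int left = in->stack[--in->sp];
    in->stack[in->sp++] = left + right;
    return in->cur->next;
}

/* Pop a value and pick the successor at run time, as an "if" would. */
static toy_op *toy_pp_cond(toy_interp *in)
{
    int test = in->stack[--in->sp];
    return test ? in->cur->next : in->cur->other;
}

/* Same shape as the loop shown above: run until an op returns NULL. */
static void toy_runops(toy_interp *in)
{
    while ((in->cur = in->cur->ppaddr(in)) != NULL) {
        /* the real loop calls PERL_ASYNC_CHECK() here */
    }
}

/* Runs "push 3; push 4; add" and returns the value left on top. */
static int toy_demo_add(void)
{
    toy_op add = { toy_pp_add, NULL, NULL, 0 };
    toy_op four = { toy_pp_const, &add, NULL, 4 };
    toy_op three = { toy_pp_const, &four, NULL, 3 };
    toy_interp in = { {0}, 0, &three };

    toy_runops(&in);
    return in.stack[in.sp - 1];
}

/* Runs "push test; cond", which continues with one of two constants. */
static int toy_demo_branch(int test)
{
    toy_op yes = { toy_pp_const, NULL, NULL, 10 };
    toy_op no = { toy_pp_const, NULL, NULL, 20 };
    toy_op cond = { toy_pp_cond, &yes, &no, 0 };
    toy_op push = { toy_pp_const, &cond, NULL, test };
    toy_interp in = { {0}, 0, &push };

    toy_runops(&in);
    return in.stack[in.sp - 1];
}
```

In this sketch C<toy_demo_add()> returns 7, while C<toy_demo_branch(1)>
and C<toy_demo_branch(0)> return 10 and 20 respectively. The loop never
inspects what kind of op it is running; the branch is taken purely
because C<toy_pp_cond> returned a different successor.
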
782*0Sstevel@tonic-gate=head2 Internal Variable Types
783*0Sstevel@tonic-gate
784*0Sstevel@tonic-gateYou should by now have had a look at L<perlguts>, which tells you about
785*0Sstevel@tonic-gatePerl's internal variable types: SVs, HVs, AVs and the rest. If not, do
786*0Sstevel@tonic-gatethat now.
787*0Sstevel@tonic-gate
788*0Sstevel@tonic-gateThese variables are used not only to represent Perl-space variables, but
789*0Sstevel@tonic-gatealso any constants in the code, as well as some structures completely
790*0Sstevel@tonic-gateinternal to Perl. The symbol table, for instance, is an ordinary Perl
791*0Sstevel@tonic-gatehash. Your code is represented by an SV as it's read into the parser;
792*0Sstevel@tonic-gateany program files you call are opened via ordinary Perl filehandles, and
793*0Sstevel@tonic-gateso on.
794*0Sstevel@tonic-gate
795*0Sstevel@tonic-gateThe core L<Devel::Peek|Devel::Peek> module lets us examine SVs from a
796*0Sstevel@tonic-gatePerl program. Let's see, for instance, how Perl treats the constant
797*0Sstevel@tonic-gateC<"hello">.
798*0Sstevel@tonic-gate
799*0Sstevel@tonic-gate      % perl -MDevel::Peek -e 'Dump("hello")'
800*0Sstevel@tonic-gate    1 SV = PV(0xa041450) at 0xa04ecbc
801*0Sstevel@tonic-gate    2   REFCNT = 1
802*0Sstevel@tonic-gate    3   FLAGS = (POK,READONLY,pPOK)
803*0Sstevel@tonic-gate    4   PV = 0xa0484e0 "hello"\0
804*0Sstevel@tonic-gate    5   CUR = 5
805*0Sstevel@tonic-gate    6   LEN = 6
806*0Sstevel@tonic-gate
Reading C<Devel::Peek> output takes a bit of practice, so let's go
through it line by line.
809*0Sstevel@tonic-gate
810*0Sstevel@tonic-gateLine 1 tells us we're looking at an SV which lives at C<0xa04ecbc> in
811*0Sstevel@tonic-gatememory. SVs themselves are very simple structures, but they contain a
812*0Sstevel@tonic-gatepointer to a more complex structure. In this case, it's a PV, a
813*0Sstevel@tonic-gatestructure which holds a string value, at location C<0xa041450>.  Line 2
814*0Sstevel@tonic-gateis the reference count; there are no other references to this data, so
815*0Sstevel@tonic-gateit's 1.
816*0Sstevel@tonic-gate
Line 3 shows the flags for this SV - it's OK to use it as a PV, it's a
read-only SV (because it's a constant) and the data is a PV internally.
Line 4 gives the contents of the string, starting at location
C<0xa0484e0>.
821*0Sstevel@tonic-gate
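Flags of this kind are commonly kept as individual bits in a single
word. The sketch below illustrates that idea only: the C<TOY_*> names
and bit values are invented here and are not Perl's real flag values,
and C<toy_pok_only> is merely a guess at the general shape of an
operation that makes the string value the only valid one.

```c
/* Hypothetical flag bits; Perl's real values differ. */
#define TOY_IOK      0x01u   /* integer value is valid */
#define TOY_NOK      0x02u   /* floating-point value is valid */
#define TOY_POK      0x04u   /* string value is valid */
#define TOY_READONLY 0x08u   /* value must not be modified */

/* True if every bit in `mask` is set in `flags`. */
static int toy_flags_has(unsigned flags, unsigned mask)
{
    return (flags & mask) == mask;
}

/* Make the string value the only valid one: clear the numeric
 * flags, set the string flag, and leave unrelated bits alone. */
static unsigned toy_pok_only(unsigned flags)
{
    return (flags & ~(TOY_IOK | TOY_NOK)) | TOY_POK;
}
```

A value modelled on the constant above would carry
C<TOY_POK | TOY_READONLY>, so C<toy_flags_has> reports the string and
read-only bits as set and the integer bit as clear.
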
822*0Sstevel@tonic-gateLine 5 gives us the current length of the string - note that this does
823*0Sstevel@tonic-gateB<not> include the null terminator. Line 6 is not the length of the
824*0Sstevel@tonic-gatestring, but the length of the currently allocated buffer; as the string
825*0Sstevel@tonic-gategrows, Perl automatically extends the available storage via a routine
826*0Sstevel@tonic-gatecalled C<SvGROW>.
827*0Sstevel@tonic-gate
You can get at any of these quantities from C very easily; just add
C<Sv> to the name of the field shown in the snippet, and you've got a
macro which will return the value: C<SvCUR(sv)> returns the current
length of the string, C<SvREFCNT(sv)> returns the reference count,
C<SvPV(sv, len)> returns the string itself with its length, and so on.
More macros to manipulate these properties can be found in L<perlguts>.
834*0Sstevel@tonic-gate
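The relationship between the fields and their accessor macros can be
pictured with a small stand-alone model. Everything here is
hypothetical: C<toy_sv> flattens what is really a head plus a separate
body into one struct, and the C<ToySv*> macros only mimic the naming
pattern described above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout; the real SV is a small head pointing at a body. */
typedef struct {
    char *pv;          /* string buffer              (PV)     */
    size_t cur;        /* bytes in use, without NUL  (CUR)    */
    size_t len;        /* bytes allocated            (LEN)    */
    unsigned refcnt;   /* reference count            (REFCNT) */
} toy_sv;

/* Accessors named after the fields, mimicking the Sv<FIELD>(sv) pattern. */
#define ToySvPVX(sv)    ((sv)->pv)
#define ToySvCUR(sv)    ((sv)->cur)
#define ToySvLEN(sv)    ((sv)->len)
#define ToySvREFCNT(sv) ((sv)->refcnt)

/* Create a string value; returns NULL if allocation fails. */
static toy_sv *toy_newSVpv(const char *s)
{
    toy_sv *sv = malloc(sizeof *sv);

    if (!sv)
        return NULL;
    sv->cur = strlen(s);
    sv->len = sv->cur + 1;            /* room for the trailing NUL */
    sv->pv = malloc(sv->len);
    if (!sv->pv) {
        free(sv);
        return NULL;
    }
    memcpy(sv->pv, s, sv->len);       /* copies the NUL as well */
    sv->refcnt = 1;
    return sv;
}

/* Drop one reference and free the value when none remain. */
static void toy_SvREFCNT_dec(toy_sv *sv)
{
    if (--sv->refcnt == 0) {
        free(sv->pv);
        free(sv);
    }
}
```

For C<toy_newSVpv("hello")> the model reports a C<CUR> of 5 and a
C<LEN> of 6, matching lines 5 and 6 of the dump: the terminator
occupies buffer space but is not counted in the string length.
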
Let's take an example of manipulating a PV, from C<sv_catpvn>, in F<sv.c>:
836*0Sstevel@tonic-gate
837*0Sstevel@tonic-gate     1  void
838*0Sstevel@tonic-gate     2  Perl_sv_catpvn(pTHX_ register SV *sv, register const char *ptr, register STRLEN len)
839*0Sstevel@tonic-gate     3  {
840*0Sstevel@tonic-gate     4      STRLEN tlen;
841*0Sstevel@tonic-gate     5      char *junk;
842*0Sstevel@tonic-gate
843*0Sstevel@tonic-gate     6      junk = SvPV_force(sv, tlen);
844*0Sstevel@tonic-gate     7      SvGROW(sv, tlen + len + 1);
845*0Sstevel@tonic-gate     8      if (ptr == junk)
846*0Sstevel@tonic-gate     9          ptr = SvPVX(sv);
847*0Sstevel@tonic-gate    10      Move(ptr,SvPVX(sv)+tlen,len,char);
848*0Sstevel@tonic-gate    11      SvCUR(sv) += len;
849*0Sstevel@tonic-gate    12      *SvEND(sv) = '\0';
850*0Sstevel@tonic-gate    13      (void)SvPOK_only_UTF8(sv);          /* validate pointer */
851*0Sstevel@tonic-gate    14      SvTAINT(sv);
852*0Sstevel@tonic-gate    15  }
853*0Sstevel@tonic-gate
This is a function which adds a string, C<ptr>, of length C<len> onto
the end of the PV stored in C<sv>. The first thing we do in line 6 is
make sure that the SV B<has> a valid PV, by calling the C<SvPV_force>
macro to force a PV. As a side effect, C<tlen> gets set to the current
length of the PV, and the PV itself is returned to C<junk>.
859*0Sstevel@tonic-gate
860*0Sstevel@tonic-gateIn line 7, we make sure that the SV will have enough room to accommodate
861*0Sstevel@tonic-gatethe old string, the new string and the null terminator. If C<LEN> isn't
862*0Sstevel@tonic-gatebig enough, C<SvGROW> will reallocate space for us.
863*0Sstevel@tonic-gate
Now, if C<junk> is the same as the string we're trying to add, the
caller passed us the SV's own buffer. C<SvGROW> may have moved that
buffer, so we fetch the string again directly from the SV; C<SvPVX> is
the address of the PV in the SV.
867*0Sstevel@tonic-gate
Line 10 does the actual catenation: the C<Move> macro moves a chunk of
memory around, and here we move the string C<ptr> to the end of the PV -
that's the start of the PV plus its current length. We're moving C<len>
bytes of type C<char>. After doing so, we need to tell Perl we've
extended the string, by altering C<CUR> to reflect the new length
(line 11). C<SvEND> is a macro which gives us the end of the string, so
line 12 sets that to a C<"\0">.
875*0Sstevel@tonic-gate
Line 13 manipulates the flags; since we've changed the PV, any IV or NV
values will no longer be valid: if we have C<$a=10; $a.="6";> we don't
want to use the old IV of 10. C<SvPOK_only_UTF8> is a special UTF-8-aware
version of C<SvPOK_only>, a macro which turns off the IOK and NOK flags
and turns on POK. The final C<SvTAINT> is a macro which marks the SV as
tainted if taint mode is turned on and tainted data went into the
current expression.
882*0Sstevel@tonic-gate
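The steps walked through above can be replayed on a toy string type.
This is a sketch under stated assumptions, not the code from F<sv.c>:
C<toy_pv>, C<toy_grow> and C<toy_catpvn> are invented names, flags and
tainting are left out, and the "is the source our own buffer?" check
is made before growing rather than after, so that the sketch never
compares against a pointer that C<realloc> may already have released.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical string body: buffer, used length (CUR), allocated size (LEN). */
typedef struct {
    char *pv;
    size_t cur;
    size_t len;
} toy_pv;

/* Like SvGROW in spirit: make sure at least `want` bytes are allocated.
 * Returns 0 on success, -1 if realloc fails (the buffer is left intact). */
static int toy_grow(toy_pv *sv, size_t want)
{
    char *p;

    if (sv->len >= want)
        return 0;
    p = realloc(sv->pv, want);
    if (!p)
        return -1;
    sv->pv = p;                 /* the buffer may have moved */
    sv->len = want;
    return 0;
}

/* Append `len` bytes from `ptr`; comments refer to the listing above. */
static int toy_catpvn(toy_pv *sv, const char *ptr, size_t len)
{
    size_t tlen = sv->cur;              /* line 6: current length */
    int from_self = (ptr == sv->pv);    /* decided before the buffer moves */

    if (toy_grow(sv, tlen + len + 1) != 0)   /* line 7: old + new + NUL */
        return -1;
    if (from_self)                      /* lines 8-9: refresh a stale source */
        ptr = sv->pv;
    memmove(sv->pv + tlen, ptr, len);   /* line 10 */
    sv->cur += len;                     /* line 11 */
    sv->pv[sv->cur] = '\0';             /* line 12 */
    return 0;
}
```

Starting from an empty C<toy_pv>, appending C<"foo"> and then appending
the value's own buffer yields C<"foofoo">. The second call is the case
lines 8 and 9 guard against: without refreshing C<ptr>, the copy could
read from memory that the grow step had already given back.
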
883*0Sstevel@tonic-gateAVs and HVs are more complicated, but SVs are by far the most common
884*0Sstevel@tonic-gatevariable type being thrown around. Having seen something of how we
885*0Sstevel@tonic-gatemanipulate these, let's go on and look at how the op tree is
886*0Sstevel@tonic-gateconstructed.
887*0Sstevel@tonic-gate
888*0Sstevel@tonic-gate=head2 Op Trees
889*0Sstevel@tonic-gate
890*0Sstevel@tonic-gateFirst, what is the op tree, anyway? The op tree is the parsed
891*0Sstevel@tonic-gaterepresentation of your program, as we saw in our section on parsing, and
892*0Sstevel@tonic-gateit's the sequence of operations that Perl goes through to execute your
893*0Sstevel@tonic-gateprogram, as we saw in L</Running>.
894*0Sstevel@tonic-gate
An op is a fundamental operation that Perl can perform: all the built-in
functions and operators are ops, and there is a series of ops which
deal with concepts the interpreter needs internally - entering and
leaving a block, ending a statement, fetching a variable, and so on.
899*0Sstevel@tonic-gate
900*0Sstevel@tonic-gateThe op tree is connected in two ways: you can imagine that there are two
901*0Sstevel@tonic-gate"routes" through it, two orders in which you can traverse the tree.
902*0Sstevel@tonic-gateFirst, parse order reflects how the parser understood the code, and
903*0Sstevel@tonic-gatesecondly, execution order tells perl what order to perform the
904*0Sstevel@tonic-gateoperations in.
905*0Sstevel@tonic-gate
906*0Sstevel@tonic-gateThe easiest way to examine the op tree is to stop Perl after it has
907*0Sstevel@tonic-gatefinished parsing, and get it to dump out the tree. This is exactly what
908*0Sstevel@tonic-gatethe compiler backends L<B::Terse|B::Terse>, L<B::Concise|B::Concise>
909*0Sstevel@tonic-gateand L<B::Debug|B::Debug> do.
910*0Sstevel@tonic-gate
911*0Sstevel@tonic-gateLet's have a look at how Perl sees C<$a = $b + $c>:
912*0Sstevel@tonic-gate
913*0Sstevel@tonic-gate     % perl -MO=Terse -e '$a=$b+$c'
914*0Sstevel@tonic-gate     1  LISTOP (0x8179888) leave
915*0Sstevel@tonic-gate     2      OP (0x81798b0) enter
916*0Sstevel@tonic-gate     3      COP (0x8179850) nextstate
917*0Sstevel@tonic-gate     4      BINOP (0x8179828) sassign
918*0Sstevel@tonic-gate     5          BINOP (0x8179800) add [1]
919*0Sstevel@tonic-gate     6              UNOP (0x81796e0) null [15]
920*0Sstevel@tonic-gate     7                  SVOP (0x80fafe0) gvsv  GV (0x80fa4cc) *b
921*0Sstevel@tonic-gate     8              UNOP (0x81797e0) null [15]
922*0Sstevel@tonic-gate     9                  SVOP (0x8179700) gvsv  GV (0x80efeb0) *c
923*0Sstevel@tonic-gate    10          UNOP (0x816b4f0) null [15]
924*0Sstevel@tonic-gate    11              SVOP (0x816dcf0) gvsv  GV (0x80fa460) *a
925*0Sstevel@tonic-gate
926*0Sstevel@tonic-gateLet's start in the middle, at line 4. This is a BINOP, a binary
927*0Sstevel@tonic-gateoperator, which is at location C<0x8179828>. The specific operator in
928*0Sstevel@tonic-gatequestion is C<sassign> - scalar assignment - and you can find the code
929*0Sstevel@tonic-gatewhich implements it in the function C<pp_sassign> in F<pp_hot.c>. As a
930*0Sstevel@tonic-gatebinary operator, it has two children: the add operator, providing the
931*0Sstevel@tonic-gateresult of C<$b+$c>, is uppermost on line 5, and the left hand side is on
932*0Sstevel@tonic-gateline 10.
933*0Sstevel@tonic-gate
934*0Sstevel@tonic-gateLine 10 is the null op: this does exactly nothing. What is that doing
935*0Sstevel@tonic-gatethere? If you see the null op, it's a sign that something has been
936*0Sstevel@tonic-gateoptimized away after parsing. As we mentioned in L</Optimization>,
937*0Sstevel@tonic-gatethe optimization stage sometimes converts two operations into one, for
938*0Sstevel@tonic-gateexample when fetching a scalar variable. When this happens, instead of
939*0Sstevel@tonic-gaterewriting the op tree and cleaning up the dangling pointers, it's easier
940*0Sstevel@tonic-gatejust to replace the redundant operation with the null op. Originally,
941*0Sstevel@tonic-gatethe tree would have looked like this:
942*0Sstevel@tonic-gate
    10          UNOP (0x816b4f0) rv2sv [15]
    11              SVOP (0x816dcf0) gv  GV (0x80fa460) *a
945*0Sstevel@tonic-gate
That is, fetch the C<a> entry from the main symbol table, and then look
at the scalar component of it: C<gvsv> (C<pp_gvsv> in F<pp_hot.c>)
happens to do both these things.
949*0Sstevel@tonic-gate
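The "replace it with a null op" trick can be sketched in a few lines of
stand-alone C. The types and function names below are hypothetical and
much simpler than anything in F<op.c>; the point is only that the
rewrite changes op types in place and leaves every pointer in the tree
untouched.

```c
#include <stddef.h>

/* Hypothetical op types, named after the ops in the dump. */
enum toy_type { TOY_NULL, TOY_GV, TOY_RV2SV, TOY_GVSV };

typedef struct toy_op {
    enum toy_type type;
    enum toy_type was;        /* original type, kept for debugging */
    struct toy_op *first;     /* only child, or NULL */
} toy_op;

/* If `o` is rv2sv over a gv, fold the pair into a single gvsv.
 * Nothing is freed or re-linked: the parent simply stops doing work.
 * Returns 1 if the tree was changed, 0 otherwise. */
static int toy_fold_rv2sv(toy_op *o)
{
    if (o->type != TOY_RV2SV || !o->first || o->first->type != TOY_GV)
        return 0;
    o->was = o->type;
    o->type = TOY_NULL;           /* parent becomes a do-nothing op */
    o->first->was = o->first->type;
    o->first->type = TOY_GVSV;    /* child now does both jobs */
    return 1;
}

/* Count ops on the chain that still do something. */
static int toy_count_live(const toy_op *o)
{
    int n = 0;

    for (; o; o = o->first)
        if (o->type != TOY_NULL)
            n++;
    return n;
}
```

Before folding, a C<TOY_RV2SV> op over a C<TOY_GV> op counts as two
live ops; afterwards the chain still has two nodes, linked exactly as
before, but only one of them does any work.
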
The right hand side, starting at line 5, is similar to what we've just
seen: we have the C<add> op (C<pp_add>, also in F<pp_hot.c>) adding
together two C<gvsv>s.
953*0Sstevel@tonic-gate
954*0Sstevel@tonic-gateNow, what's this about?
955*0Sstevel@tonic-gate
956*0Sstevel@tonic-gate     1  LISTOP (0x8179888) leave
957*0Sstevel@tonic-gate     2      OP (0x81798b0) enter
958*0Sstevel@tonic-gate     3      COP (0x8179850) nextstate
959*0Sstevel@tonic-gate
C<enter> and C<leave> are scoping ops, and their job is to perform any
housekeeping every time you enter and leave a block: lexical variables
are tidied up, unreferenced variables are destroyed, and so on. Every
program will have those first three lines: C<leave> is a list, and its
children are all the statements in the block. Statements are delimited
by C<nextstate>, so a block is a sequence of C<nextstate> ops, each
followed by the ops to be performed for that statement; in the dump
above they appear as siblings of C<nextstate>, not as its children.
C<enter> is a single op which functions as a marker.
968*0Sstevel@tonic-gate
969*0Sstevel@tonic-gateThat's how Perl parsed the program, from top to bottom:
970*0Sstevel@tonic-gate
971*0Sstevel@tonic-gate                        Program
972*0Sstevel@tonic-gate                           |
973*0Sstevel@tonic-gate                       Statement
974*0Sstevel@tonic-gate                           |
975*0Sstevel@tonic-gate                           =
976*0Sstevel@tonic-gate                          / \
977*0Sstevel@tonic-gate                         /   \
978*0Sstevel@tonic-gate                        $a   +
979*0Sstevel@tonic-gate                            / \
980*0Sstevel@tonic-gate                          $b   $c
981*0Sstevel@tonic-gate
982*0Sstevel@tonic-gateHowever, it's impossible to B<perform> the operations in this order:
983*0Sstevel@tonic-gateyou have to find the values of C<$b> and C<$c> before you add them
984*0Sstevel@tonic-gatetogether, for instance. So, the other thread that runs through the op
985*0Sstevel@tonic-gatetree is the execution order: each op has a field C<op_next> which points
986*0Sstevel@tonic-gateto the next op to be run, so following these pointers tells us how perl
987*0Sstevel@tonic-gateexecutes the code. We can traverse the tree in this order using
988*0Sstevel@tonic-gatethe C<exec> option to C<B::Terse>:
989*0Sstevel@tonic-gate
990*0Sstevel@tonic-gate     % perl -MO=Terse,exec -e '$a=$b+$c'
991*0Sstevel@tonic-gate     1  OP (0x8179928) enter
992*0Sstevel@tonic-gate     2  COP (0x81798c8) nextstate
993*0Sstevel@tonic-gate     3  SVOP (0x81796c8) gvsv  GV (0x80fa4d4) *b
994*0Sstevel@tonic-gate     4  SVOP (0x8179798) gvsv  GV (0x80efeb0) *c
995*0Sstevel@tonic-gate     5  BINOP (0x8179878) add [1]
996*0Sstevel@tonic-gate     6  SVOP (0x816dd38) gvsv  GV (0x80fa468) *a
997*0Sstevel@tonic-gate     7  BINOP (0x81798a0) sassign
998*0Sstevel@tonic-gate     8  LISTOP (0x8179900) leave
999*0Sstevel@tonic-gate
1000*0Sstevel@tonic-gateThis probably makes more sense for a human: enter a block, start a
1001*0Sstevel@tonic-gatestatement. Get the values of C<$b> and C<$c>, and add them together.
1002*0Sstevel@tonic-gateFind C<$a>, and assign one to the other. Then leave.
1003*0Sstevel@tonic-gate
1004*0Sstevel@tonic-gateThe way Perl builds up these op trees in the parsing process can be
1005*0Sstevel@tonic-gateunravelled by examining F<perly.y>, the YACC grammar. Let's take the
1006*0Sstevel@tonic-gatepiece we need to construct the tree for C<$a = $b + $c>
1007*0Sstevel@tonic-gate
1008*0Sstevel@tonic-gate    1 term    :   term ASSIGNOP term
1009*0Sstevel@tonic-gate    2                { $$ = newASSIGNOP(OPf_STACKED, $1, $2, $3); }
1010*0Sstevel@tonic-gate    3         |   term ADDOP term
1011*0Sstevel@tonic-gate    4                { $$ = newBINOP($2, 0, scalar($1), scalar($3)); }
1012*0Sstevel@tonic-gate
1013*0Sstevel@tonic-gateIf you're not used to reading BNF grammars, this is how it works: You're
1014*0Sstevel@tonic-gatefed certain things by the tokeniser, which generally end up in upper
case. Here, C<ADDOP> is provided when the tokeniser sees C<+> in your
1016*0Sstevel@tonic-gatecode. C<ASSIGNOP> is provided when C<=> is used for assigning. These are
1017*0Sstevel@tonic-gate`terminal symbols', because you can't get any simpler than them.
1018*0Sstevel@tonic-gate
1019*0Sstevel@tonic-gateThe grammar, lines one and three of the snippet above, tells you how to
1020*0Sstevel@tonic-gatebuild up more complex forms. These complex forms, `non-terminal symbols'
1021*0Sstevel@tonic-gateare generally placed in lower case. C<term> here is a non-terminal
1022*0Sstevel@tonic-gatesymbol, representing a single expression.
1023*0Sstevel@tonic-gate
1024*0Sstevel@tonic-gateThe grammar gives you the following rule: you can make the thing on the
1025*0Sstevel@tonic-gateleft of the colon if you see all the things on the right in sequence.
1026*0Sstevel@tonic-gateThis is called a "reduction", and the aim of parsing is to completely
1027*0Sstevel@tonic-gatereduce the input. There are several different ways you can perform a
1028*0Sstevel@tonic-gatereduction, separated by vertical bars: so, C<term> followed by C<=>
1029*0Sstevel@tonic-gatefollowed by C<term> makes a C<term>, and C<term> followed by C<+>
1030*0Sstevel@tonic-gatefollowed by C<term> can also make a C<term>.
1031*0Sstevel@tonic-gate
So, if you see two terms with an C<=> or C<+> between them, you can
1033*0Sstevel@tonic-gateturn them into a single expression. When you do this, you execute the
1034*0Sstevel@tonic-gatecode in the block on the next line: if you see C<=>, you'll do the code
1035*0Sstevel@tonic-gatein line 2. If you see C<+>, you'll do the code in line 4. It's this code
1036*0Sstevel@tonic-gatewhich contributes to the op tree.
1037*0Sstevel@tonic-gate
1038*0Sstevel@tonic-gate            |   term ADDOP term
1039*0Sstevel@tonic-gate            { $$ = newBINOP($2, 0, scalar($1), scalar($3)); }
1040*0Sstevel@tonic-gate
This creates a new binary op, and feeds it a number of variables. The
variables refer to the symbols on the right-hand side of the rule:
C<$1> is the first, C<$2> the second, and so on - think regular expression
backreferences. C<$$> is the op returned from this reduction. So, we
1045*0Sstevel@tonic-gatecall C<newBINOP> to create a new binary operator. The first parameter to
1046*0Sstevel@tonic-gateC<newBINOP>, a function in F<op.c>, is the op type. It's an addition
1047*0Sstevel@tonic-gateoperator, so we want the type to be C<ADDOP>. We could specify this
1048*0Sstevel@tonic-gatedirectly, but it's right there as the second token in the input, so we
1049*0Sstevel@tonic-gateuse C<$2>. The second parameter is the op's flags: 0 means `nothing
1050*0Sstevel@tonic-gatespecial'. Then the things to add: the left and right hand side of our
1051*0Sstevel@tonic-gateexpression, in scalar context.
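
To make the reduction concrete, here is a minimal sketch of what an
action such as the one for C<term ADDOP term> does: it takes the values
of the three symbols and returns a new node with the operands as its
kids. Everything here is an assumption made for illustration:
C<ToyNode>, C<toy_new>, C<toy_reduce_binop>, C<toy_count> and
C<toy_free> are hypothetical, and the real C<newBINOP> in F<op.c> does
considerably more (context, flags, constant folding and so on).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical op types; the real ones live in opcode.h. */
enum toy_type { TOY_GVSV, TOY_ADD, TOY_SASSIGN };

typedef struct ToyNode {
    enum toy_type   type;
    int             flags;
    struct ToyNode *first;   /* left operand, NULL for a leaf */
    struct ToyNode *last;    /* right operand, NULL for a leaf */
} ToyNode;

/* Allocate one node; a stand-in for the op constructors in op.c. */
static ToyNode *toy_new(enum toy_type type, int flags,
                        ToyNode *first, ToyNode *last)
{
    ToyNode *n = malloc(sizeof *n);

    assert(n != NULL);
    n->type  = type;
    n->flags = flags;
    n->first = first;
    n->last  = last;
    return n;
}

/* Action for "term ADDOP term": d2 carries the op type (like $2),
 * d1 and d3 are the already-built operands (like $1 and $3).
 * The return value plays the part of $$. */
static ToyNode *toy_reduce_binop(ToyNode *d1, enum toy_type d2, ToyNode *d3)
{
    return toy_new(d2, 0, d1, d3);
}

/* Count the nodes in a tree. */
static int toy_count(const ToyNode *n)
{
    return n ? 1 + toy_count(n->first) + toy_count(n->last) : 0;
}

/* Free a whole tree. */
static void toy_free(ToyNode *n)
{
    if (!n)
        return;
    toy_free(n->first);
    toy_free(n->last);
    free(n);
}
```

Reducing two leaf nodes with C<TOY_ADD> gives a three-node tree whose
root has the leaves as C<first> and C<last>, which is the shape of the
C<+> subtree in the diagram earlier.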
1052*0Sstevel@tonic-gate
1053*0Sstevel@tonic-gate=head2 Stacks
1054*0Sstevel@tonic-gate
1055*0Sstevel@tonic-gateWhen perl executes something like C<addop>, how does it pass on its
1056*0Sstevel@tonic-gateresults to the next op? The answer is, through the use of stacks. Perl
1057*0Sstevel@tonic-gatehas a number of stacks to store things it's currently working on, and
1058*0Sstevel@tonic-gatewe'll look at the three most important ones here.
1059*0Sstevel@tonic-gate
1060*0Sstevel@tonic-gate=over 3
1061*0Sstevel@tonic-gate
1062*0Sstevel@tonic-gate=item Argument stack
1063*0Sstevel@tonic-gate
1064*0Sstevel@tonic-gateArguments are passed to PP code and returned from PP code using the
1065*0Sstevel@tonic-gateargument stack, C<ST>. The typical way to handle arguments is to pop
1066*0Sstevel@tonic-gatethem off the stack, deal with them how you wish, and then push the result
1067*0Sstevel@tonic-gateback onto the stack. This is how, for instance, the cosine operator
1068*0Sstevel@tonic-gateworks:
1069*0Sstevel@tonic-gate
1070*0Sstevel@tonic-gate      NV value;
1071*0Sstevel@tonic-gate      value = POPn;
1072*0Sstevel@tonic-gate      value = Perl_cos(value);
1073*0Sstevel@tonic-gate      XPUSHn(value);
1074*0Sstevel@tonic-gate
1075*0Sstevel@tonic-gateWe'll see a more tricky example of this when we consider Perl's macros
1076*0Sstevel@tonic-gatebelow. C<POPn> gives you the NV (floating point value) of the top SV on
1077*0Sstevel@tonic-gatethe stack: the C<$x> in C<cos($x)>. Then we compute the cosine, and push
1078*0Sstevel@tonic-gatethe result back as an NV. The C<X> in C<XPUSHn> means that the stack
1079*0Sstevel@tonic-gateshould be extended if necessary - it can't be necessary here, because we
1080*0Sstevel@tonic-gateknow there's room for one more item on the stack, since we've just
1081*0Sstevel@tonic-gateremoved one! The C<XPUSH*> macros at least guarantee safety.
1082*0Sstevel@tonic-gate
1083*0Sstevel@tonic-gateAlternatively, you can fiddle with the stack directly: C<SP> gives you
1084*0Sstevel@tonic-gatethe first element in your portion of the stack, and C<TOP*> gives you
1085*0Sstevel@tonic-gatethe top SV/IV/NV/etc. on the stack. So, for instance, to do unary
1086*0Sstevel@tonic-gatenegation of an integer:
1087*0Sstevel@tonic-gate
1088*0Sstevel@tonic-gate     SETi(-TOPi);
1089*0Sstevel@tonic-gate
1090*0Sstevel@tonic-gateJust set the integer value of the top stack entry to its negation.
1091*0Sstevel@tonic-gate
1092*0Sstevel@tonic-gateArgument stack manipulation in the core is exactly the same as it is in
1093*0Sstevel@tonic-gateXSUBs - see L<perlxstut>, L<perlxs> and L<perlguts> for a longer
1094*0Sstevel@tonic-gatedescription of the macros used in stack manipulation.
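
The two styles described above, pop-then-push and modifying the top
entry in place, can be compared with a toy stack. This is a sketch under
loud assumptions: the stack holds plain C<long> values rather than SVs,
and C<ToyStack>, C<toy_extend>, C<toy_xpush>, C<toy_pop>, C<toy_top> and
C<toy_set> are hypothetical helpers that only imitate the roles of
C<EXTEND>, C<XPUSH*>, C<POP*>, C<TOP*> and C<SET*>.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    long  *base;   /* storage, grown on demand */
    size_t top;    /* number of items currently on the stack */
    size_t cap;    /* number of allocated slots */
} ToyStack;

/* Make room for n more items (the role EXTEND plays). */
static void toy_extend(ToyStack *s, size_t n)
{
    if (s->top + n > s->cap) {
        size_t cap = (s->top + n) * 2;
        long *p = realloc(s->base, cap * sizeof *p);

        assert(p != NULL);
        s->base = p;
        s->cap  = cap;
    }
}

/* Push, extending first if necessary (the X in XPUSH*). */
static void toy_xpush(ToyStack *s, long v)
{
    toy_extend(s, 1);
    s->base[s->top++] = v;
}

/* Remove and return the top item. */
static long toy_pop(ToyStack *s)
{
    assert(s->top > 0);
    return s->base[--s->top];
}

/* Read the top item without removing it. */
static long toy_top(const ToyStack *s)
{
    assert(s->top > 0);
    return s->base[s->top - 1];
}

/* Overwrite the top item in place. */
static void toy_set(ToyStack *s, long v)
{
    assert(s->top > 0);
    s->base[s->top - 1] = v;
}

/* Style 1: pop the argument, compute, push the result back. */
static void toy_negate_poppush(ToyStack *s)
{
    long v = toy_pop(s);
    toy_xpush(s, -v);
}

/* Style 2: rewrite the top entry directly, as SETi(-TOPi) does. */
static void toy_negate_inplace(ToyStack *s)
{
    toy_set(s, -toy_top(s));
}
```

Both functions leave the stack depth unchanged; the in-place form simply
avoids the pop and the push.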
1095*0Sstevel@tonic-gate
1096*0Sstevel@tonic-gate=item Mark stack
1097*0Sstevel@tonic-gate
1098*0Sstevel@tonic-gateI say `your portion of the stack' above because PP code doesn't
1099*0Sstevel@tonic-gatenecessarily get the whole stack to itself: if your function calls
1100*0Sstevel@tonic-gateanother function, you'll only want to expose the arguments aimed for the
1101*0Sstevel@tonic-gatecalled function, and not (necessarily) let it get at your own data. The
1102*0Sstevel@tonic-gateway we do this is to have a `virtual' bottom-of-stack, exposed to each
1103*0Sstevel@tonic-gatefunction. The mark stack keeps bookmarks to locations in the argument
1104*0Sstevel@tonic-gatestack usable by each function. For instance, when dealing with a tied
variable (internally, something with `P' magic), Perl has to call
methods for accesses to the tied variables. However, we need to separate
the arguments exposed to the method from the arguments exposed to the
1108*0Sstevel@tonic-gateoriginal function - the store or fetch or whatever it may be. Here's how
1109*0Sstevel@tonic-gatethe tied C<push> is implemented; see C<av_push> in F<av.c>:
1110*0Sstevel@tonic-gate
1111*0Sstevel@tonic-gate     1	PUSHMARK(SP);
1112*0Sstevel@tonic-gate     2	EXTEND(SP,2);
1113*0Sstevel@tonic-gate     3	PUSHs(SvTIED_obj((SV*)av, mg));
1114*0Sstevel@tonic-gate     4	PUSHs(val);
1115*0Sstevel@tonic-gate     5	PUTBACK;
1116*0Sstevel@tonic-gate     6	ENTER;
1117*0Sstevel@tonic-gate     7	call_method("PUSH", G_SCALAR|G_DISCARD);
1118*0Sstevel@tonic-gate     8	LEAVE;
1119*0Sstevel@tonic-gate     9	POPSTACK;
1120*0Sstevel@tonic-gate
1121*0Sstevel@tonic-gateThe lines which concern the mark stack are the first, fifth and last
1122*0Sstevel@tonic-gatelines: they save away, restore and remove the current position of the
1123*0Sstevel@tonic-gateargument stack.
1124*0Sstevel@tonic-gate
1125*0Sstevel@tonic-gateLet's examine the whole implementation, for practice:
1126*0Sstevel@tonic-gate
1127*0Sstevel@tonic-gate     1	PUSHMARK(SP);
1128*0Sstevel@tonic-gate
1129*0Sstevel@tonic-gatePush the current state of the stack pointer onto the mark stack. This is
1130*0Sstevel@tonic-gateso that when we've finished adding items to the argument stack, Perl
1131*0Sstevel@tonic-gateknows how many things we've added recently.
1132*0Sstevel@tonic-gate
1133*0Sstevel@tonic-gate     2	EXTEND(SP,2);
1134*0Sstevel@tonic-gate     3	PUSHs(SvTIED_obj((SV*)av, mg));
1135*0Sstevel@tonic-gate     4	PUSHs(val);
1136*0Sstevel@tonic-gate
1137*0Sstevel@tonic-gateWe're going to add two more items onto the argument stack: when you have
1138*0Sstevel@tonic-gatea tied array, the C<PUSH> subroutine receives the object and the value
1139*0Sstevel@tonic-gateto be pushed, and that's exactly what we have here - the tied object,
1140*0Sstevel@tonic-gateretrieved with C<SvTIED_obj>, and the value, the SV C<val>.
1141*0Sstevel@tonic-gate
1142*0Sstevel@tonic-gate     5	PUTBACK;
1143*0Sstevel@tonic-gate
1144*0Sstevel@tonic-gateNext we tell Perl to make the change to the global stack pointer: C<dSP>
1145*0Sstevel@tonic-gateonly gave us a local copy, not a reference to the global.
1146*0Sstevel@tonic-gate
1147*0Sstevel@tonic-gate     6	ENTER;
1148*0Sstevel@tonic-gate     7	call_method("PUSH", G_SCALAR|G_DISCARD);
1149*0Sstevel@tonic-gate     8	LEAVE;
1150*0Sstevel@tonic-gate
1151*0Sstevel@tonic-gateC<ENTER> and C<LEAVE> localise a block of code - they make sure that all
1152*0Sstevel@tonic-gatevariables are tidied up, everything that has been localised gets
1153*0Sstevel@tonic-gateits previous value returned, and so on. Think of them as the C<{> and
1154*0Sstevel@tonic-gateC<}> of a Perl block.
1155*0Sstevel@tonic-gate
1156*0Sstevel@tonic-gateTo actually do the magic method call, we have to call a subroutine in
1157*0Sstevel@tonic-gatePerl space: C<call_method> takes care of that, and it's described in
1158*0Sstevel@tonic-gateL<perlcall>. We call the C<PUSH> method in scalar context, and we're
1159*0Sstevel@tonic-gategoing to discard its return value.
1160*0Sstevel@tonic-gate
1161*0Sstevel@tonic-gate     9	POPSTACK;
1162*0Sstevel@tonic-gate
1163*0Sstevel@tonic-gateFinally, we remove the value we placed on the mark stack, since we
1164*0Sstevel@tonic-gatedon't need it any more.
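
The bookkeeping can be modelled in isolation. In this sketch (an
assumption-laden toy, not the code in F<av.c> or F<pp.h>) the argument
stack holds C<int> values, the mark stack holds indices into it, and the
callee C<toy_call_sum> sees only the arguments above the most recent
mark. C<ToyInterp>, C<toy_push>, C<toy_pushmark>, C<toy_popmark> and
C<toy_call_sum> are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_MAX = 32 };

typedef struct {
    int    args[TOY_MAX];    /* argument stack */
    size_t sp;               /* number of items on the argument stack */
    size_t marks[TOY_MAX];   /* mark stack: saved values of sp */
    size_t nmarks;
} ToyInterp;

/* Push one value onto the argument stack. */
static void toy_push(ToyInterp *in, int v)
{
    assert(in->sp < TOY_MAX);
    in->args[in->sp++] = v;
}

/* Remember where the callee's arguments start (the role of PUSHMARK). */
static void toy_pushmark(ToyInterp *in)
{
    assert(in->nmarks < TOY_MAX);
    in->marks[in->nmarks++] = in->sp;
}

/* Retrieve and drop the most recent mark. */
static size_t toy_popmark(ToyInterp *in)
{
    assert(in->nmarks > 0);
    return in->marks[--in->nmarks];
}

/* A callee: sums only the items above the mark, removes them,
 * and leaves a single result in their place. */
static void toy_call_sum(ToyInterp *in)
{
    size_t mark = toy_popmark(in);
    int total = 0;

    for (size_t i = mark; i < in->sp; i++)
        total += in->args[i];
    in->sp = mark;           /* discard the arguments */
    toy_push(in, total);     /* return value */
}
```

If the caller already had a value on the stack before calling
C<toy_pushmark>, that value is untouched by the call: the callee never
looks below its mark.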
1165*0Sstevel@tonic-gate
1166*0Sstevel@tonic-gate=item Save stack
1167*0Sstevel@tonic-gate
C has no equivalent of Perl's C<local>, so perl provides one. We've
1169*0Sstevel@tonic-gateseen that C<ENTER> and C<LEAVE> are used as scoping braces; the save
1170*0Sstevel@tonic-gatestack implements the C equivalent of, for example:
1171*0Sstevel@tonic-gate
1172*0Sstevel@tonic-gate    {
1173*0Sstevel@tonic-gate        local $foo = 42;
1174*0Sstevel@tonic-gate        ...
1175*0Sstevel@tonic-gate    }
1176*0Sstevel@tonic-gate
1177*0Sstevel@tonic-gateSee L<perlguts/Localising Changes> for how to use the save stack.
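
As a rough illustration of the idea only, a save stack can be modelled
as a list of (address, old value) pairs plus a record of where each
scope began. The names C<ToySaveStack>, C<toy_enter>, C<toy_save_int>
and C<toy_leave> are hypothetical; the real interface is the C<SAVE*>
family described in L<perlguts>, and it handles many more types than
the C<int> assumed here.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_SAVE_MAX = 32 };

typedef struct {
    int *addr;   /* what was localised */
    int  old;    /* value to restore on scope exit */
} ToySave;

typedef struct {
    ToySave saves[TOY_SAVE_MAX];
    size_t  nsaves;
    size_t  scopes[TOY_SAVE_MAX];   /* value of nsaves at each scope entry */
    size_t  nscopes;
} ToySaveStack;

/* Open a scope: remember how deep the save stack is right now. */
static void toy_enter(ToySaveStack *ss)
{
    assert(ss->nscopes < TOY_SAVE_MAX);
    ss->scopes[ss->nscopes++] = ss->nsaves;
}

/* Record the current value of *p so that it can be restored later. */
static void toy_save_int(ToySaveStack *ss, int *p)
{
    assert(ss->nsaves < TOY_SAVE_MAX);
    ss->saves[ss->nsaves].addr = p;
    ss->saves[ss->nsaves].old  = *p;
    ss->nsaves++;
}

/* Close a scope: undo every save made since the matching toy_enter,
 * most recent first. */
static void toy_leave(ToySaveStack *ss)
{
    size_t floor;

    assert(ss->nscopes > 0);
    floor = ss->scopes[--ss->nscopes];
    while (ss->nsaves > floor) {
        ToySave *s = &ss->saves[--ss->nsaves];
        *s->addr = s->old;
    }
}
```

Saving a variable, assigning 42 to it, and then calling C<toy_leave>
puts the old value back, which is the behaviour of the
C<local $foo = 42> block above.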
1178*0Sstevel@tonic-gate
1179*0Sstevel@tonic-gate=back
1180*0Sstevel@tonic-gate
1181*0Sstevel@tonic-gate=head2 Millions of Macros
1182*0Sstevel@tonic-gate
1183*0Sstevel@tonic-gateOne thing you'll notice about the Perl source is that it's full of
macros. Some have called the pervasive use of macros the hardest thing
to understand; others find it adds to clarity. Let's take an example,
1186*0Sstevel@tonic-gatethe code which implements the addition operator:
1187*0Sstevel@tonic-gate
1188*0Sstevel@tonic-gate   1  PP(pp_add)
1189*0Sstevel@tonic-gate   2  {
1190*0Sstevel@tonic-gate   3      dSP; dATARGET; tryAMAGICbin(add,opASSIGN);
1191*0Sstevel@tonic-gate   4      {
1192*0Sstevel@tonic-gate   5        dPOPTOPnnrl_ul;
1193*0Sstevel@tonic-gate   6        SETn( left + right );
1194*0Sstevel@tonic-gate   7        RETURN;
1195*0Sstevel@tonic-gate   8      }
1196*0Sstevel@tonic-gate   9  }
1197*0Sstevel@tonic-gate
1198*0Sstevel@tonic-gateEvery line here (apart from the braces, of course) contains a macro. The
1199*0Sstevel@tonic-gatefirst line sets up the function declaration as Perl expects for PP code;
1200*0Sstevel@tonic-gateline 3 sets up variable declarations for the argument stack and the
1201*0Sstevel@tonic-gatetarget, the return value of the operation. Finally, it tries to see if
1202*0Sstevel@tonic-gatethe addition operation is overloaded; if so, the appropriate subroutine
1203*0Sstevel@tonic-gateis called.
1204*0Sstevel@tonic-gate
1205*0Sstevel@tonic-gateLine 5 is another variable declaration - all variable declarations start
1206*0Sstevel@tonic-gatewith C<d> - which pops from the top of the argument stack two NVs (hence
1207*0Sstevel@tonic-gateC<nn>) and puts them into the variables C<right> and C<left>, hence the
1208*0Sstevel@tonic-gateC<rl>. These are the two operands to the addition operator. Next, we
1209*0Sstevel@tonic-gatecall C<SETn> to set the NV of the return value to the result of adding
1210*0Sstevel@tonic-gatethe two values. This done, we return - the C<RETURN> macro makes sure
1211*0Sstevel@tonic-gatethat our return value is properly handled, and we pass the next operator
1212*0Sstevel@tonic-gateto run back to the main run loop.
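
To see how such macros fit together without the rest of perl, here is a
deliberately tiny imitation. All the C<TOY_*> macros and C<toy_*>
functions are hypothetical and only mimic the shape of the real ones:
the stack holds C<double> values instead of SVs, there is no target or
overloading, and C<TOY_RETURN> does not hand back the next op as the
real C<RETURN> does.

```c
#include <assert.h>

/* Slot 0 is an unused bottom-of-stack sentinel, so an empty stack
 * has toy_sp == toy_stack. */
static double  toy_stack[16];
static double *toy_sp = toy_stack;

#define TOY_dSP          double *sp = toy_sp     /* local copy of the global */
#define TOY_PUTBACK      toy_sp = sp             /* publish the local copy */
#define TOY_POPn         (*sp--)
#define TOY_TOPn         (*sp)
#define TOY_SETn(v)      (*sp = (v))
#define TOY_PUSHn(v)     (*++sp = (v))
/* Declares 'right' (popped) and 'left' (left in place on the stack). */
#define TOY_dPOPTOPnnrl  double right = TOY_POPn; double left = TOY_TOPn
#define TOY_RETURN       do { TOY_PUTBACK; return; } while (0)

/* Push one value; used here to set up operands. */
static void toy_push(double v)
{
    TOY_dSP;
    assert(sp < toy_stack + 15);
    TOY_PUSHn(v);
    TOY_PUTBACK;
}

/* Same outline as pp_add: declare, pop/peek the operands, set, return. */
static void toy_pp_add(void)
{
    TOY_dSP;
    assert(sp - toy_stack >= 2);
    {
        TOY_dPOPTOPnnrl;
        TOY_SETn(left + right);
        TOY_RETURN;
    }
}

/* Read the current top of the stack. */
static double toy_top(void)
{
    assert(toy_sp > toy_stack);
    return *toy_sp;
}
```

Pushing two numbers and calling C<toy_pp_add> leaves their sum as the
only entry on the toy stack. Running this file through the preprocessor
shows how little of the function body is visible before expansion.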
1213*0Sstevel@tonic-gate
1214*0Sstevel@tonic-gateMost of these macros are explained in L<perlapi>, and some of the more
1215*0Sstevel@tonic-gateimportant ones are explained in L<perlxs> as well. Pay special attention
1216*0Sstevel@tonic-gateto L<perlguts/Background and PERL_IMPLICIT_CONTEXT> for information on
1217*0Sstevel@tonic-gatethe C<[pad]THX_?> macros.
1218*0Sstevel@tonic-gate
1219*0Sstevel@tonic-gate=head2 The .i Targets
1220*0Sstevel@tonic-gate
1221*0Sstevel@tonic-gateYou can expand the macros in a F<foo.c> file by saying
1222*0Sstevel@tonic-gate
1223*0Sstevel@tonic-gate    make foo.i
1224*0Sstevel@tonic-gate
1225*0Sstevel@tonic-gatewhich will expand the macros using cpp.  Don't be scared by the results.
1226*0Sstevel@tonic-gate
1227*0Sstevel@tonic-gate=head2 Poking at Perl
1228*0Sstevel@tonic-gate
1229*0Sstevel@tonic-gateTo really poke around with Perl, you'll probably want to build Perl for
1230*0Sstevel@tonic-gatedebugging, like this:
1231*0Sstevel@tonic-gate
1232*0Sstevel@tonic-gate    ./Configure -d -D optimize=-g
1233*0Sstevel@tonic-gate    make
1234*0Sstevel@tonic-gate
1235*0Sstevel@tonic-gateC<-g> is a flag to the C compiler to have it produce debugging
1236*0Sstevel@tonic-gateinformation which will allow us to step through a running program.
1237*0Sstevel@tonic-gateF<Configure> will also turn on the C<DEBUGGING> compilation symbol which
1238*0Sstevel@tonic-gateenables all the internal debugging code in Perl. There are a whole bunch
1239*0Sstevel@tonic-gateof things you can debug with this: L<perlrun> lists them all, and the
1240*0Sstevel@tonic-gatebest way to find out about them is to play about with them. The most
1241*0Sstevel@tonic-gateuseful options are probably
1242*0Sstevel@tonic-gate
1243*0Sstevel@tonic-gate    l  Context (loop) stack processing
1244*0Sstevel@tonic-gate    t  Trace execution
1245*0Sstevel@tonic-gate    o  Method and overloading resolution
1246*0Sstevel@tonic-gate    c  String/numeric conversions
1247*0Sstevel@tonic-gate
1248*0Sstevel@tonic-gateSome of the functionality of the debugging code can be achieved using XS
1249*0Sstevel@tonic-gatemodules.
1250*0Sstevel@tonic-gate
1251*0Sstevel@tonic-gate    -Dr => use re 'debug'
1252*0Sstevel@tonic-gate    -Dx => use O 'Debug'
1253*0Sstevel@tonic-gate
1254*0Sstevel@tonic-gate=head2 Using a source-level debugger
1255*0Sstevel@tonic-gate
1256*0Sstevel@tonic-gateIf the debugging output of C<-D> doesn't help you, it's time to step
1257*0Sstevel@tonic-gatethrough perl's execution with a source-level debugger.
1258*0Sstevel@tonic-gate
1259*0Sstevel@tonic-gate=over 3
1260*0Sstevel@tonic-gate
1261*0Sstevel@tonic-gate=item *
1262*0Sstevel@tonic-gate
1263*0Sstevel@tonic-gateWe'll use C<gdb> for our examples here; the principles will apply to any
1264*0Sstevel@tonic-gatedebugger, but check the manual of the one you're using.
1265*0Sstevel@tonic-gate
1266*0Sstevel@tonic-gate=back
1267*0Sstevel@tonic-gate
1268*0Sstevel@tonic-gateTo fire up the debugger, type
1269*0Sstevel@tonic-gate
1270*0Sstevel@tonic-gate    gdb ./perl
1271*0Sstevel@tonic-gate
1272*0Sstevel@tonic-gateYou'll want to do that in your Perl source tree so the debugger can read
1273*0Sstevel@tonic-gatethe source code. You should see the copyright message, followed by the
1274*0Sstevel@tonic-gateprompt.
1275*0Sstevel@tonic-gate
1276*0Sstevel@tonic-gate    (gdb)
1277*0Sstevel@tonic-gate
1278*0Sstevel@tonic-gateC<help> will get you into the documentation, but here are the most
1279*0Sstevel@tonic-gateuseful commands:
1280*0Sstevel@tonic-gate
1281*0Sstevel@tonic-gate=over 3
1282*0Sstevel@tonic-gate
1283*0Sstevel@tonic-gate=item run [args]
1284*0Sstevel@tonic-gate
1285*0Sstevel@tonic-gateRun the program with the given arguments.
1286*0Sstevel@tonic-gate
1287*0Sstevel@tonic-gate=item break function_name
1288*0Sstevel@tonic-gate
1289*0Sstevel@tonic-gate=item break source.c:xxx
1290*0Sstevel@tonic-gate
1291*0Sstevel@tonic-gateTells the debugger that we'll want to pause execution when we reach
1292*0Sstevel@tonic-gateeither the named function (but see L<perlguts/Internal Functions>!) or the given
1293*0Sstevel@tonic-gateline in the named source file.
1294*0Sstevel@tonic-gate
1295*0Sstevel@tonic-gate=item step
1296*0Sstevel@tonic-gate
1297*0Sstevel@tonic-gateSteps through the program a line at a time.
1298*0Sstevel@tonic-gate
1299*0Sstevel@tonic-gate=item next
1300*0Sstevel@tonic-gate
1301*0Sstevel@tonic-gateSteps through the program a line at a time, without descending into
1302*0Sstevel@tonic-gatefunctions.
1303*0Sstevel@tonic-gate
1304*0Sstevel@tonic-gate=item continue
1305*0Sstevel@tonic-gate
1306*0Sstevel@tonic-gateRun until the next breakpoint.
1307*0Sstevel@tonic-gate
1308*0Sstevel@tonic-gate=item finish
1309*0Sstevel@tonic-gate
1310*0Sstevel@tonic-gateRun until the end of the current function, then stop again.
1311*0Sstevel@tonic-gate
1312*0Sstevel@tonic-gate=item 'enter'
1313*0Sstevel@tonic-gate
1314*0Sstevel@tonic-gateJust pressing Enter will do the most recent operation again - it's a
1315*0Sstevel@tonic-gateblessing when stepping through miles of source code.
1316*0Sstevel@tonic-gate
1317*0Sstevel@tonic-gate=item print
1318*0Sstevel@tonic-gate
1319*0Sstevel@tonic-gateExecute the given C code and print its results. B<WARNING>: Perl makes
1320*0Sstevel@tonic-gateheavy use of macros, and F<gdb> does not necessarily support macros
1321*0Sstevel@tonic-gate(see later L</"gdb macro support">).  You'll have to substitute them
yourself, or invoke cpp on the source code files
(see L</"The .i Targets">).
1324*0Sstevel@tonic-gateSo, for instance, you can't say
1325*0Sstevel@tonic-gate
1326*0Sstevel@tonic-gate    print SvPV_nolen(sv)
1327*0Sstevel@tonic-gate
1328*0Sstevel@tonic-gatebut you have to say
1329*0Sstevel@tonic-gate
1330*0Sstevel@tonic-gate    print Perl_sv_2pv_nolen(sv)
1331*0Sstevel@tonic-gate
1332*0Sstevel@tonic-gate=back
1333*0Sstevel@tonic-gate
1334*0Sstevel@tonic-gateYou may find it helpful to have a "macro dictionary", which you can
1335*0Sstevel@tonic-gateproduce by saying C<cpp -dM perl.c | sort>. Even then, F<cpp> won't
1336*0Sstevel@tonic-gaterecursively apply those macros for you.
1337*0Sstevel@tonic-gate
1338*0Sstevel@tonic-gate=head2 gdb macro support
1339*0Sstevel@tonic-gate
1340*0Sstevel@tonic-gateRecent versions of F<gdb> have fairly good macro support, but
1341*0Sstevel@tonic-gatein order to use it you'll need to compile perl with macro definitions
1342*0Sstevel@tonic-gateincluded in the debugging information.  Using F<gcc> version 3.1, this
1343*0Sstevel@tonic-gatemeans configuring with C<-Doptimize=-g3>.  Other compilers might use a
1344*0Sstevel@tonic-gatedifferent switch (if they support debugging macros at all).
1345*0Sstevel@tonic-gate
1346*0Sstevel@tonic-gate=head2 Dumping Perl Data Structures
1347*0Sstevel@tonic-gate
1348*0Sstevel@tonic-gateOne way to get around this macro hell is to use the dumping functions in
1349*0Sstevel@tonic-gateF<dump.c>; these work a little like an internal
1350*0Sstevel@tonic-gateL<Devel::Peek|Devel::Peek>, but they also cover OPs and other structures
1351*0Sstevel@tonic-gatethat you can't get at from Perl. Let's take an example. We'll use the
1352*0Sstevel@tonic-gateC<$a = $b + $c> we used before, but give it a bit of context:
1353*0Sstevel@tonic-gateC<$b = "6XXXX"; $c = 2.3;>. Where's a good place to stop and poke around?
1354*0Sstevel@tonic-gate
1355*0Sstevel@tonic-gateWhat about C<pp_add>, the function we examined earlier to implement the
1356*0Sstevel@tonic-gateC<+> operator:
1357*0Sstevel@tonic-gate
1358*0Sstevel@tonic-gate    (gdb) break Perl_pp_add
1359*0Sstevel@tonic-gate    Breakpoint 1 at 0x46249f: file pp_hot.c, line 309.
1360*0Sstevel@tonic-gate
1361*0Sstevel@tonic-gateNotice we use C<Perl_pp_add> and not C<pp_add> - see L<perlguts/Internal Functions>.
1362*0Sstevel@tonic-gateWith the breakpoint in place, we can run our program:
1363*0Sstevel@tonic-gate
1364*0Sstevel@tonic-gate    (gdb) run -e '$b = "6XXXX"; $c = 2.3; $a = $b + $c'
1365*0Sstevel@tonic-gate
1366*0Sstevel@tonic-gateLots of junk will go past as gdb reads in the relevant source files and
1367*0Sstevel@tonic-gatelibraries, and then:
1368*0Sstevel@tonic-gate
1369*0Sstevel@tonic-gate    Breakpoint 1, Perl_pp_add () at pp_hot.c:309
1370*0Sstevel@tonic-gate    309         dSP; dATARGET; tryAMAGICbin(add,opASSIGN);
1371*0Sstevel@tonic-gate    (gdb) step
1372*0Sstevel@tonic-gate    311           dPOPTOPnnrl_ul;
1373*0Sstevel@tonic-gate    (gdb)
1374*0Sstevel@tonic-gate
1375*0Sstevel@tonic-gateWe looked at this bit of code before, and we said that C<dPOPTOPnnrl_ul>
1376*0Sstevel@tonic-gatearranges for two C<NV>s to be placed into C<left> and C<right> - let's
1377*0Sstevel@tonic-gateslightly expand it:
1378*0Sstevel@tonic-gate
1379*0Sstevel@tonic-gate    #define dPOPTOPnnrl_ul  NV right = POPn; \
1380*0Sstevel@tonic-gate                            SV *leftsv = TOPs; \
1381*0Sstevel@tonic-gate                            NV left = USE_LEFT(leftsv) ? SvNV(leftsv) : 0.0
1382*0Sstevel@tonic-gate
1383*0Sstevel@tonic-gateC<POPn> takes the SV from the top of the stack and obtains its NV either
1384*0Sstevel@tonic-gatedirectly (if C<SvNOK> is set) or by calling the C<sv_2nv> function.
1385*0Sstevel@tonic-gateC<TOPs> takes the next SV from the top of the stack - yes, C<POPn> uses
1386*0Sstevel@tonic-gateC<TOPs> - but doesn't remove it. We then use C<SvNV> to get the NV from
1387*0Sstevel@tonic-gateC<leftsv> in the same way as before - yes, C<POPn> uses C<SvNV>.
1388*0Sstevel@tonic-gate
1389*0Sstevel@tonic-gateSince we don't have an NV for C<$b>, we'll have to use C<sv_2nv> to
1390*0Sstevel@tonic-gateconvert it. If we step again, we'll find ourselves there:
1391*0Sstevel@tonic-gate
1392*0Sstevel@tonic-gate    Perl_sv_2nv (sv=0xa0675d0) at sv.c:1669
1393*0Sstevel@tonic-gate    1669        if (!sv)
1394*0Sstevel@tonic-gate    (gdb)
1395*0Sstevel@tonic-gate
1396*0Sstevel@tonic-gateWe can now use C<Perl_sv_dump> to investigate the SV:
1397*0Sstevel@tonic-gate
1398*0Sstevel@tonic-gate    SV = PV(0xa057cc0) at 0xa0675d0
1399*0Sstevel@tonic-gate    REFCNT = 1
1400*0Sstevel@tonic-gate    FLAGS = (POK,pPOK)
1401*0Sstevel@tonic-gate    PV = 0xa06a510 "6XXXX"\0
1402*0Sstevel@tonic-gate    CUR = 5
1403*0Sstevel@tonic-gate    LEN = 6
1404*0Sstevel@tonic-gate    $1 = void
1405*0Sstevel@tonic-gate
1406*0Sstevel@tonic-gateWe know we're going to get C<6> from this, so let's finish the
1407*0Sstevel@tonic-gatesubroutine:
1408*0Sstevel@tonic-gate
1409*0Sstevel@tonic-gate    (gdb) finish
1410*0Sstevel@tonic-gate    Run till exit from #0  Perl_sv_2nv (sv=0xa0675d0) at sv.c:1671
1411*0Sstevel@tonic-gate    0x462669 in Perl_pp_add () at pp_hot.c:311
1412*0Sstevel@tonic-gate    311           dPOPTOPnnrl_ul;
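
The expectation that C<"6XXXX"> yields C<6> comes from taking the
longest numeric prefix of the string. The following sketch imitates that
with the C library's C<strtod>; it is not how C<sv_2nv> is implemented,
and C<strtod> differs from Perl in details such as locale handling and
hexadecimal input. C<ToySV> and C<toy_sv_2nv> are hypothetical names,
and the C<nok> field only loosely imitates the way an SV can cache its
numeric value.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy scalar: a string plus a cached numeric value. */
typedef struct {
    const char *pv;    /* string value */
    double      nv;    /* numeric value, valid only when nok is set */
    int         nok;   /* non-zero once nv has been computed */
} ToySV;

/* Return the numeric value, converting from the string on first use. */
static double toy_sv_2nv(ToySV *sv)
{
    if (!sv)
        return 0.0;
    if (!sv->nok) {
        /* strtod stops at the first character that cannot be
         * part of a number, so trailing junk is ignored. */
        sv->nv  = strtod(sv->pv, NULL);
        sv->nok = 1;
    }
    return sv->nv;
}
```

With C<pv> set to C<"6XXXX"> the function returns C<6.0> and caches it,
so a second call does no parsing.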
1413*0Sstevel@tonic-gate
1414*0Sstevel@tonic-gateWe can also dump out this op: the current op is always stored in
1415*0Sstevel@tonic-gateC<PL_op>, and we can dump it with C<Perl_op_dump>. This'll give us
1416*0Sstevel@tonic-gatesimilar output to L<B::Debug|B::Debug>.
1417*0Sstevel@tonic-gate
1418*0Sstevel@tonic-gate    {
1419*0Sstevel@tonic-gate    13  TYPE = add  ===> 14
1420*0Sstevel@tonic-gate        TARG = 1
1421*0Sstevel@tonic-gate        FLAGS = (SCALAR,KIDS)
1422*0Sstevel@tonic-gate        {
1423*0Sstevel@tonic-gate            TYPE = null  ===> (12)
1424*0Sstevel@tonic-gate              (was rv2sv)
1425*0Sstevel@tonic-gate            FLAGS = (SCALAR,KIDS)
1426*0Sstevel@tonic-gate            {
1427*0Sstevel@tonic-gate    11          TYPE = gvsv  ===> 12
1428*0Sstevel@tonic-gate                FLAGS = (SCALAR)
1429*0Sstevel@tonic-gate                GV = main::b
1430*0Sstevel@tonic-gate            }
1431*0Sstevel@tonic-gate        }
1432*0Sstevel@tonic-gate
1433*0Sstevel@tonic-gate# finish this later #
1434*0Sstevel@tonic-gate
1435*0Sstevel@tonic-gate=head2 Patching
1436*0Sstevel@tonic-gate
1437*0Sstevel@tonic-gateAll right, we've now had a look at how to navigate the Perl sources and
1438*0Sstevel@tonic-gatesome things you'll need to know when fiddling with them. Let's now get
1439*0Sstevel@tonic-gateon and create a simple patch. Here's something Larry suggested: if a
C<U> is the first active format during a C<pack> (for example,
1441*0Sstevel@tonic-gateC<pack "U3C8", @stuff>) then the resulting string should be treated as
1442*0Sstevel@tonic-gateUTF-8 encoded.
1443*0Sstevel@tonic-gate
1444*0Sstevel@tonic-gateHow do we prepare to fix this up? First we locate the code in question -
1445*0Sstevel@tonic-gatethe C<pack> happens at runtime, so it's going to be in one of the F<pp>
1446*0Sstevel@tonic-gatefiles. Sure enough, C<pp_pack> is in F<pp.c>. Since we're going to be
1447*0Sstevel@tonic-gatealtering this file, let's copy it to F<pp.c~>.
1448*0Sstevel@tonic-gate
1449*0Sstevel@tonic-gate[Well, it was in F<pp.c> when this tutorial was written. It has now been
1450*0Sstevel@tonic-gatesplit off with C<pp_unpack> to its own file, F<pp_pack.c>]
1451*0Sstevel@tonic-gate
1452*0Sstevel@tonic-gateNow let's look over C<pp_pack>: we take a pattern into C<pat>, and then
1453*0Sstevel@tonic-gateloop over the pattern, taking each format character in turn into
C<datumtype>. Then for each possible format character, we swallow up
the other arguments in the pattern (a field width, an asterisk, and so
on) and convert the next chunk of input into the specified format, adding
1457*0Sstevel@tonic-gateit onto the output SV C<cat>.
1458*0Sstevel@tonic-gate
1459*0Sstevel@tonic-gateHow do we know if the C<U> is the first format in the C<pat>? Well, if
1460*0Sstevel@tonic-gatewe have a pointer to the start of C<pat> then, if we see a C<U> we can
1461*0Sstevel@tonic-gatetest whether we're still at the start of the string. So, here's where
1462*0Sstevel@tonic-gateC<pat> is set up:
1463*0Sstevel@tonic-gate
1464*0Sstevel@tonic-gate    STRLEN fromlen;
1465*0Sstevel@tonic-gate    register char *pat = SvPVx(*++MARK, fromlen);
1466*0Sstevel@tonic-gate    register char *patend = pat + fromlen;
1467*0Sstevel@tonic-gate    register I32 len;
1468*0Sstevel@tonic-gate    I32 datumtype;
1469*0Sstevel@tonic-gate    SV *fromstr;
1470*0Sstevel@tonic-gate
1471*0Sstevel@tonic-gateWe'll have another string pointer in there:
1472*0Sstevel@tonic-gate
1473*0Sstevel@tonic-gate    STRLEN fromlen;
1474*0Sstevel@tonic-gate    register char *pat = SvPVx(*++MARK, fromlen);
1475*0Sstevel@tonic-gate    register char *patend = pat + fromlen;
1476*0Sstevel@tonic-gate +  char *patcopy;
1477*0Sstevel@tonic-gate    register I32 len;
1478*0Sstevel@tonic-gate    I32 datumtype;
1479*0Sstevel@tonic-gate    SV *fromstr;
1480*0Sstevel@tonic-gate
1481*0Sstevel@tonic-gateAnd just before we start the loop, we'll set C<patcopy> to be the start
1482*0Sstevel@tonic-gateof C<pat>:
1483*0Sstevel@tonic-gate
1484*0Sstevel@tonic-gate    items = SP - MARK;
1485*0Sstevel@tonic-gate    MARK++;
1486*0Sstevel@tonic-gate    sv_setpvn(cat, "", 0);
1487*0Sstevel@tonic-gate +  patcopy = pat;
1488*0Sstevel@tonic-gate    while (pat < patend) {
1489*0Sstevel@tonic-gate
1490*0Sstevel@tonic-gateNow if we see a C<U> which was at the start of the string, we turn on
1491*0Sstevel@tonic-gatethe C<UTF8> flag for the output SV, C<cat>:
1492*0Sstevel@tonic-gate
1493*0Sstevel@tonic-gate +  if (datumtype == 'U' && pat==patcopy+1)
1494*0Sstevel@tonic-gate +      SvUTF8_on(cat);
1495*0Sstevel@tonic-gate    if (datumtype == '#') {
1496*0Sstevel@tonic-gate        while (pat < patend && *pat != '\n')
1497*0Sstevel@tonic-gate            pat++;
1498*0Sstevel@tonic-gate
1499*0Sstevel@tonic-gateRemember that it has to be C<patcopy+1> because the first character of
the string is the C<U> which has been swallowed into C<datumtype>!
1501*0Sstevel@tonic-gate
1502*0Sstevel@tonic-gateOops, we forgot one thing: what if there are spaces at the start of the
1503*0Sstevel@tonic-gatepattern? C<pack("  U*", @stuff)> will have C<U> as the first active
1504*0Sstevel@tonic-gatecharacter, even though it's not the first thing in the pattern. In this
1505*0Sstevel@tonic-gatecase, we have to advance C<patcopy> along with C<pat> when we see spaces:
1506*0Sstevel@tonic-gate
1507*0Sstevel@tonic-gate    if (isSPACE(datumtype))
1508*0Sstevel@tonic-gate        continue;
1509*0Sstevel@tonic-gate
1510*0Sstevel@tonic-gateneeds to become
1511*0Sstevel@tonic-gate
1512*0Sstevel@tonic-gate    if (isSPACE(datumtype)) {
1513*0Sstevel@tonic-gate        patcopy++;
1514*0Sstevel@tonic-gate        continue;
1515*0Sstevel@tonic-gate    }
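
The pointer logic of the patch can be tried out on its own. The function
below is a sketch, not the code in C<pp_pack>: it keeps the C<pat>,
C<patend>, C<patcopy> and C<datumtype> names from the patch, but it only
decides whether C<U> is the first active format and ignores counts,
comments and the arguments being packed. The name
C<toy_first_active_is_U> is hypothetical, and C<isspace> stands in for
perl's C<isSPACE>.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Return 1 if 'U' is the first non-space format character in pat. */
static int toy_first_active_is_U(const char *pat)
{
    const char *patend  = pat + strlen(pat);
    const char *patcopy = pat;   /* start of the "active" part of the pattern */
    int utf8 = 0;

    assert(pat != NULL);
    while (pat < patend) {
        int datumtype = (unsigned char)*pat++;

        if (isspace(datumtype)) {
            patcopy++;           /* leading spaces move the start along */
            continue;
        }
        /* pat has already stepped past datumtype, hence the +1 */
        if (datumtype == 'U' && pat == patcopy + 1)
            utf8 = 1;
        /* a real pack would now consume any count and the arguments */
    }
    return utf8;
}
```

The function returns 1 for C<"U3C8"> and C<"  U*">, and 0 for
C<"C0U*">. Once any non-space character has been consumed, C<pat> is at
least two ahead of C<patcopy>, so a later C<U> can never match.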
1516*0Sstevel@tonic-gate
1517*0Sstevel@tonic-gateOK. That's the C part done. Now we must do two additional things before
1518*0Sstevel@tonic-gatethis patch is ready to go: we've changed the behaviour of Perl, and so
1519*0Sstevel@tonic-gatewe must document that change. We must also provide some more regression
1520*0Sstevel@tonic-gatetests to make sure our patch works and doesn't create a bug somewhere
1521*0Sstevel@tonic-gateelse along the line.
1522*0Sstevel@tonic-gate
1523*0Sstevel@tonic-gateThe regression tests for each operator live in F<t/op/>, and so we
1524*0Sstevel@tonic-gatemake a copy of F<t/op/pack.t> to F<t/op/pack.t~>. Now we can add our
1525*0Sstevel@tonic-gatetests to the end. First, we'll test that the C<U> does indeed create
1526*0Sstevel@tonic-gateUnicode strings.
1527*0Sstevel@tonic-gate
F<t/op/pack.t> has a sensible C<ok()> function, but if it didn't we could
use the one from F<t/test.pl>.
1530*0Sstevel@tonic-gate
1531*0Sstevel@tonic-gate require './test.pl';
1532*0Sstevel@tonic-gate plan( tests => 159 );
1533*0Sstevel@tonic-gate
1534*0Sstevel@tonic-gateso instead of this:
1535*0Sstevel@tonic-gate
1536*0Sstevel@tonic-gate print 'not ' unless "1.20.300.4000" eq sprintf "%vd", pack("U*",1,20,300,4000);
1537*0Sstevel@tonic-gate print "ok $test\n"; $test++;
1538*0Sstevel@tonic-gate
1539*0Sstevel@tonic-gatewe can write the more sensible (see L<Test::More> for a full
1540*0Sstevel@tonic-gateexplanation of is() and other testing functions).
1541*0Sstevel@tonic-gate
 is( "1.20.300.4000", sprintf("%vd", pack("U*",1,20,300,4000)),
                                       "U* produces unicode" );
1544*0Sstevel@tonic-gate
1545*0Sstevel@tonic-gateNow we'll test that we got that space-at-the-beginning business right:
1546*0Sstevel@tonic-gate
1547*0Sstevel@tonic-gate is( "1.20.300.4000", sprintf "%vd", pack("  U*",1,20,300,4000),
1548*0Sstevel@tonic-gate                                       "  with spaces at the beginning" );
1549*0Sstevel@tonic-gate
1550*0Sstevel@tonic-gateAnd finally we'll test that we don't make Unicode strings if C<U> is B<not>
1551*0Sstevel@tonic-gatethe first active format:
1552*0Sstevel@tonic-gate
1553*0Sstevel@tonic-gate isnt( v1.20.300.4000, sprintf "%vd", pack("C0U*",1,20,300,4000),
1554*0Sstevel@tonic-gate                                       "U* not first isn't unicode" );
1555*0Sstevel@tonic-gate
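When a test computes its value with an inline sprintf, the call is best
parenthesized: sprintf is a list operator, so without parentheses it takes
every remaining argument, including the test description.  The following
is a minimal sketch of that effect; C<count_args> is a hypothetical helper
standing in for is(), not part of F<t/test.pl> or Test::More.

```perl
use strict;
use warnings;

# Hypothetical stand-in for is(): reports how many arguments arrived.
sub count_args { return scalar @_ }

my $packed = pack("U*", 1, 20, 300, 4000);

# Unparenthesized sprintf swallows the trailing description, so only
# two arguments reach the caller.  (Newer perls warn about the
# redundant sprintf argument, hence the local "no warnings".)
my $without = do {
    no warnings;
    count_args( "1.20.300.4000", sprintf "%vd", $packed, "description" );
};

# Parenthesized sprintf leaves the description as a third argument.
my $with = count_args( "1.20.300.4000", sprintf("%vd", $packed), "description" );

print "without parentheses: $without argument(s)\n";
print "with parentheses:    $with argument(s)\n";
```

The first call receives two arguments and the second three, so only the
parenthesized form passes the test name through to the testing function.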
1556*0Sstevel@tonic-gateMustn't forget to change the number of tests which appears at the top,
1557*0Sstevel@tonic-gateor else the automated tester will get confused.  This will either look
1558*0Sstevel@tonic-gatelike this:
1559*0Sstevel@tonic-gate
1560*0Sstevel@tonic-gate print "1..156\n";
1561*0Sstevel@tonic-gate
1562*0Sstevel@tonic-gateor this:
1563*0Sstevel@tonic-gate
1564*0Sstevel@tonic-gate plan( tests => 156 );
1565*0Sstevel@tonic-gate
1566*0Sstevel@tonic-gateWe now compile up Perl, and run it through the test suite. Our new
1567*0Sstevel@tonic-gatetests pass, hooray!
1568*0Sstevel@tonic-gate
1569*0Sstevel@tonic-gateFinally, the documentation. The job is never done until the paperwork is
1570*0Sstevel@tonic-gateover, so let's describe the change we've just made. The relevant place
1571*0Sstevel@tonic-gateis F<pod/perlfunc.pod>; again, we make a copy, and then we'll insert
1572*0Sstevel@tonic-gatethis text in the description of C<pack>:
1573*0Sstevel@tonic-gate
1574*0Sstevel@tonic-gate =item *
1575*0Sstevel@tonic-gate
1576*0Sstevel@tonic-gate If the pattern begins with a C<U>, the resulting string will be treated
1577*0Sstevel@tonic-gate as UTF-8-encoded Unicode. You can force UTF-8 encoding on in a string
1578*0Sstevel@tonic-gate with an initial C<U0>, and the bytes that follow will be interpreted as
1579*0Sstevel@tonic-gate Unicode characters. If you don't want this to happen, you can begin your
1580*0Sstevel@tonic-gate pattern with C<C0> (or anything else) to force Perl not to UTF-8 encode your
1581*0Sstevel@tonic-gate string, and then follow this with a C<U*> somewhere in your pattern.
1582*0Sstevel@tonic-gate
1583*0Sstevel@tonic-gateAll done. Now let's create the patch. F<Porting/patching.pod> tells us
1584*0Sstevel@tonic-gatethat if we're making major changes, we should copy the entire directory
1585*0Sstevel@tonic-gateto somewhere safe before we begin fiddling, and then do
1586*0Sstevel@tonic-gate
1587*0Sstevel@tonic-gate    diff -ruN old new > patch
1588*0Sstevel@tonic-gate
1589*0Sstevel@tonic-gateHowever, we know which files we've changed, and we can simply do this:
1590*0Sstevel@tonic-gate
1591*0Sstevel@tonic-gate    diff -u pp.c~             pp.c             >  patch
1592*0Sstevel@tonic-gate    diff -u t/op/pack.t~      t/op/pack.t      >> patch
1593*0Sstevel@tonic-gate    diff -u pod/perlfunc.pod~ pod/perlfunc.pod >> patch
1594*0Sstevel@tonic-gate
1595*0Sstevel@tonic-gateWe end up with a patch looking a little like this:
1596*0Sstevel@tonic-gate
1597*0Sstevel@tonic-gate    --- pp.c~       Fri Jun 02 04:34:10 2000
1598*0Sstevel@tonic-gate    +++ pp.c        Fri Jun 16 11:37:25 2000
1599*0Sstevel@tonic-gate    @@ -4375,6 +4375,7 @@
1600*0Sstevel@tonic-gate         register I32 items;
1601*0Sstevel@tonic-gate         STRLEN fromlen;
1602*0Sstevel@tonic-gate         register char *pat = SvPVx(*++MARK, fromlen);
1603*0Sstevel@tonic-gate    +    char *patcopy;
1604*0Sstevel@tonic-gate         register char *patend = pat + fromlen;
1605*0Sstevel@tonic-gate         register I32 len;
1606*0Sstevel@tonic-gate         I32 datumtype;
1607*0Sstevel@tonic-gate    @@ -4405,6 +4406,7 @@
1608*0Sstevel@tonic-gate    ...
1609*0Sstevel@tonic-gate
1610*0Sstevel@tonic-gateAnd finally, we submit it, with our rationale, to perl5-porters. Job
1611*0Sstevel@tonic-gatedone!
1612*0Sstevel@tonic-gate
1613*0Sstevel@tonic-gate=head2 Patching a core module
1614*0Sstevel@tonic-gate
1615*0Sstevel@tonic-gateThis works just like patching anything else, with an extra
1616*0Sstevel@tonic-gateconsideration.  Many core modules also live on CPAN.  If this is so,
1617*0Sstevel@tonic-gatepatch the CPAN version instead of the core and send the patch off to
1618*0Sstevel@tonic-gatethe module maintainer (with a copy to p5p).  This will help the module
1619*0Sstevel@tonic-gatemaintainer keep the CPAN version in sync with the core version without
1620*0Sstevel@tonic-gateconstantly scanning p5p.
1621*0Sstevel@tonic-gate
1622*0Sstevel@tonic-gate=head2 Adding a new function to the core
1623*0Sstevel@tonic-gate
1624*0Sstevel@tonic-gateIf, as part of a patch to fix a bug, or just because you have an
1625*0Sstevel@tonic-gateespecially good idea, you decide to add a new function to the core,
1626*0Sstevel@tonic-gatediscuss your ideas on p5p well before you start work.  It may be that
1627*0Sstevel@tonic-gatesomeone else has already attempted to do what you are considering and
1628*0Sstevel@tonic-gatecan give lots of good advice or even provide you with bits of code
1629*0Sstevel@tonic-gatethat they already started (but never finished).
1630*0Sstevel@tonic-gate
You have to follow all of the advice given above for patching.  It is
extremely important to test any addition thoroughly and add new tests
to explore all boundary conditions that your new function is expected
to handle.  If your new function is used only within one source file
(e.g. F<toke.c>), then it should probably be named C<S_your_function>
(for static); on the other hand, if you expect it to be accessible from
other functions in Perl, you should name it C<Perl_your_function>.  See
L<perlguts/Internal Functions> for more details.
1639*0Sstevel@tonic-gate
1640*0Sstevel@tonic-gateThe location of any new code is also an important consideration.  Don't
1641*0Sstevel@tonic-gatejust create a new top level .c file and put your code there; you would
1642*0Sstevel@tonic-gatehave to make changes to Configure (so the Makefile is created properly),
1643*0Sstevel@tonic-gateas well as possibly lots of include files.  This is strictly pumpking
1644*0Sstevel@tonic-gatebusiness.
1645*0Sstevel@tonic-gate
It is better to add your function to one of the existing top level
source code files, but your choice is complicated by the nature of
the Perl distribution.  Only the files that are marked as compiled
static are located in the perl executable.  Everything else is located
in the shared library (or DLL if you are running under Win32).  So,
for example, if your function is only used by functions located in
F<toke.c>, then your code can go in F<toke.c>.  If, however, you want to
call the function from F<universal.c>, then you should put your code in
another location, for example F<util.c>.
1655*0Sstevel@tonic-gate
In addition to writing your C code, you will need to create an
appropriate entry in F<embed.pl> describing your function, then run
'make regen_headers' to create the entries in the numerous header
files that perl needs to compile correctly.  See L<perlguts/Internal Functions>
for information on the various options that you can set in F<embed.pl>.
You will forget to do this a few (or many) times and you will get
warnings during the compilation phase.  Make sure that you mention
this when you post your patch to P5P; the pumpking needs to know this.
1664*0Sstevel@tonic-gate
When you write your new code, please be conscious of existing code
conventions used in the perl source files.  See L<perlstyle> for
details.  Although most of the guidelines discussed seem to focus on
Perl code, rather than C, they all apply (except when they don't ;).
See also the F<Porting/patching.pod> file in the Perl source distribution
for lots of details about both formatting and submitting patches of
your changes.
1672*0Sstevel@tonic-gate
1673*0Sstevel@tonic-gateLastly, TEST TEST TEST TEST TEST any code before posting to p5p.
1674*0Sstevel@tonic-gateTest on as many platforms as you can find.  Test as many perl
1675*0Sstevel@tonic-gateConfigure options as you can (e.g. MULTIPLICITY).  If you have
1676*0Sstevel@tonic-gateprofiling or memory tools, see L<EXTERNAL TOOLS FOR DEBUGGING PERL>
1677*0Sstevel@tonic-gatebelow for how to use them to further test your code.  Remember that
1678*0Sstevel@tonic-gatemost of the people on P5P are doing this on their own time and
1679*0Sstevel@tonic-gatedon't have the time to debug your code.
1680*0Sstevel@tonic-gate
1681*0Sstevel@tonic-gate=head2 Writing a test
1682*0Sstevel@tonic-gate
1683*0Sstevel@tonic-gateEvery module and built-in function has an associated test file (or
1684*0Sstevel@tonic-gateshould...).  If you add or change functionality, you have to write a
1685*0Sstevel@tonic-gatetest.  If you fix a bug, you have to write a test so that bug never
1686*0Sstevel@tonic-gatecomes back.  If you alter the docs, it would be nice to test what the
1687*0Sstevel@tonic-gatenew documentation says.
1688*0Sstevel@tonic-gate
1689*0Sstevel@tonic-gateIn short, if you submit a patch you probably also have to patch the
1690*0Sstevel@tonic-gatetests.
1691*0Sstevel@tonic-gate
1692*0Sstevel@tonic-gateFor modules, the test file is right next to the module itself.
1693*0Sstevel@tonic-gateF<lib/strict.t> tests F<lib/strict.pm>.  This is a recent innovation,
1694*0Sstevel@tonic-gateso there are some snags (and it would be wonderful for you to brush
1695*0Sstevel@tonic-gatethem out), but it basically works that way.  Everything else lives in
1696*0Sstevel@tonic-gateF<t/>.
1697*0Sstevel@tonic-gate
1698*0Sstevel@tonic-gate=over 3
1699*0Sstevel@tonic-gate
1700*0Sstevel@tonic-gate=item F<t/base/>
1701*0Sstevel@tonic-gate
1702*0Sstevel@tonic-gateTesting of the absolute basic functionality of Perl.  Things like
1703*0Sstevel@tonic-gateC<if>, basic file reads and writes, simple regexes, etc.  These are
1704*0Sstevel@tonic-gaterun first in the test suite and if any of them fail, something is
1705*0Sstevel@tonic-gateI<really> broken.
1706*0Sstevel@tonic-gate
1707*0Sstevel@tonic-gate=item F<t/cmd/>
1708*0Sstevel@tonic-gate
1709*0Sstevel@tonic-gateThese test the basic control structures, C<if/else>, C<while>,
1710*0Sstevel@tonic-gatesubroutines, etc.
1711*0Sstevel@tonic-gate
1712*0Sstevel@tonic-gate=item F<t/comp/>
1713*0Sstevel@tonic-gate
1714*0Sstevel@tonic-gateTests basic issues of how Perl parses and compiles itself.
1715*0Sstevel@tonic-gate
1716*0Sstevel@tonic-gate=item F<t/io/>
1717*0Sstevel@tonic-gate
1718*0Sstevel@tonic-gateTests for built-in IO functions, including command line arguments.
1719*0Sstevel@tonic-gate
1720*0Sstevel@tonic-gate=item F<t/lib/>
1721*0Sstevel@tonic-gate
The old home for the module tests; you shouldn't put anything new in
here.  There are still some bits and pieces hanging around in here
that need to be moved.  Perhaps you could move them?  Thanks!
1725*0Sstevel@tonic-gate
1726*0Sstevel@tonic-gate=item F<t/op/>
1727*0Sstevel@tonic-gate
Tests for perl's built-in functions that don't fit into any of the
other directories.
1730*0Sstevel@tonic-gate
1731*0Sstevel@tonic-gate=item F<t/pod/>
1732*0Sstevel@tonic-gate
1733*0Sstevel@tonic-gateTests for POD directives.  There are still some tests for the Pod
1734*0Sstevel@tonic-gatemodules hanging around in here that need to be moved out into F<lib/>.
1735*0Sstevel@tonic-gate
1736*0Sstevel@tonic-gate=item F<t/run/>
1737*0Sstevel@tonic-gate
1738*0Sstevel@tonic-gateTesting features of how perl actually runs, including exit codes and
1739*0Sstevel@tonic-gatehandling of PERL* environment variables.
1740*0Sstevel@tonic-gate
1741*0Sstevel@tonic-gate=item F<t/uni/>
1742*0Sstevel@tonic-gate
1743*0Sstevel@tonic-gateTests for the core support of Unicode.
1744*0Sstevel@tonic-gate
1745*0Sstevel@tonic-gate=item F<t/win32/>
1746*0Sstevel@tonic-gate
1747*0Sstevel@tonic-gateWindows-specific tests.
1748*0Sstevel@tonic-gate
=item F<t/x2p/>
1750*0Sstevel@tonic-gate
1751*0Sstevel@tonic-gateA test suite for the s2p converter.
1752*0Sstevel@tonic-gate
1753*0Sstevel@tonic-gate=back
1754*0Sstevel@tonic-gate
1755*0Sstevel@tonic-gateThe core uses the same testing style as the rest of Perl, a simple
1756*0Sstevel@tonic-gate"ok/not ok" run through Test::Harness, but there are a few special
1757*0Sstevel@tonic-gateconsiderations.
1758*0Sstevel@tonic-gate
There are three ways to write a test in the core: Test::More,
F<t/test.pl> and ad hoc C<print $test ? "ok 42\n" : "not ok 42\n">.  The
decision of which to use depends on what part of the test suite you're
working on.  This is a measure to prevent a high-level failure (such
as Config.pm breaking) from causing basic functionality tests to fail.
1764*0Sstevel@tonic-gate
1765*0Sstevel@tonic-gate=over 4
1766*0Sstevel@tonic-gate
1767*0Sstevel@tonic-gate=item t/base t/comp
1768*0Sstevel@tonic-gate
1769*0Sstevel@tonic-gateSince we don't know if require works, or even subroutines, use ad hoc
1770*0Sstevel@tonic-gatetests for these two.  Step carefully to avoid using the feature being
1771*0Sstevel@tonic-gatetested.
1772*0Sstevel@tonic-gate
1773*0Sstevel@tonic-gate=item t/cmd t/run t/io t/op
1774*0Sstevel@tonic-gate
1775*0Sstevel@tonic-gateNow that basic require() and subroutines are tested, you can use the
1776*0Sstevel@tonic-gatet/test.pl library which emulates the important features of Test::More
1777*0Sstevel@tonic-gatewhile using a minimum of core features.
1778*0Sstevel@tonic-gate
1779*0Sstevel@tonic-gateYou can also conditionally use certain libraries like Config, but be
1780*0Sstevel@tonic-gatesure to skip the test gracefully if it's not there.
1781*0Sstevel@tonic-gate
1782*0Sstevel@tonic-gate=item t/lib ext lib
1783*0Sstevel@tonic-gate
1784*0Sstevel@tonic-gateNow that the core of Perl is tested, Test::More can be used.  You can
1785*0Sstevel@tonic-gatealso use the full suite of core modules in the tests.
1786*0Sstevel@tonic-gate
1787*0Sstevel@tonic-gate=back
1788*0Sstevel@tonic-gate
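For F<t/base> and F<t/comp>, the ad hoc style can be sketched as follows.
The three checks are made-up placeholders rather than tests from the real
suite; the point is that the script relies on nothing beyond C<print> and
simple expressions.

```perl
# No "use" or "require" here and no subroutines: at this stage of the
# suite we cannot assume that either works.
print "1..3\n";

my $test = 1;

# Placeholder check 1: integer addition.
print "not " unless 1 + 1 == 2;
print "ok $test\n"; $test++;

# Placeholder check 2: string concatenation.
print "not " unless "a" . "b" eq "ab";
print "ok $test\n"; $test++;

# Placeholder check 3: a plain if/else.
if (2 > 1) { print "ok $test\n" } else { print "not ok $test\n" }
$test++;
```

The plan line (C<1..3>) has to match the number of C<ok> lines printed,
which is why the counter is bumped after every check.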
1789*0Sstevel@tonic-gateWhen you say "make test" Perl uses the F<t/TEST> program to run the
1790*0Sstevel@tonic-gatetest suite.  All tests are run from the F<t/> directory, B<not> the
1791*0Sstevel@tonic-gatedirectory which contains the test.  This causes some problems with the
1792*0Sstevel@tonic-gatetests in F<lib/>, so here's some opportunity for some patching.
1793*0Sstevel@tonic-gate
1794*0Sstevel@tonic-gateYou must be triply conscious of cross-platform concerns.  This usually
1795*0Sstevel@tonic-gateboils down to using File::Spec and avoiding things like C<fork()> and
1796*0Sstevel@tonic-gateC<system()> unless absolutely necessary.
1797*0Sstevel@tonic-gate
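For tests in F<lib/> or F<ext/>, where Test::More is available, these
points might look like the sketch below.  The path components and test
names are illustrative only; C<d_fork> is the Config key that records
whether fork() exists on the platform.

```perl
use strict;
use warnings;
use Config;
use File::Spec;
use Test::More tests => 3;

# Build the path with File::Spec instead of hard-coding "/" separators.
my $path = File::Spec->catfile('lib', 'strict.pm');
my ($volume, $dirs, $file) = File::Spec->splitpath($path);
is( $file, 'strict.pm', 'splitpath recovers the file name' );
like( $dirs, qr/lib/, 'directory part mentions lib' );

SKIP: {
    # Skip gracefully where fork() is not available.
    skip 'no fork() on this platform', 1 unless $Config{d_fork};

    my $pid = fork();
    skip 'fork() failed', 1 unless defined $pid;
    exit 0 if $pid == 0;          # child: leave at once
    waitpid $pid, 0;
    is( $?, 0, 'child exited cleanly' );
}
```

The SKIP block keeps the plan at three tests on every platform: where
fork() is missing, the third test is reported as skipped rather than
failed.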
1798*0Sstevel@tonic-gate=head2 Special Make Test Targets
1799*0Sstevel@tonic-gate
There are various special make targets that can be used to test Perl
slightly differently than the standard "test" target.  Not all of them
are expected to give a 100% success rate.  Many of them have several
aliases.
1804*0Sstevel@tonic-gate
1805*0Sstevel@tonic-gate=over 4
1806*0Sstevel@tonic-gate
1807*0Sstevel@tonic-gate=item coretest
1808*0Sstevel@tonic-gate
1809*0Sstevel@tonic-gateRun F<perl> on all core tests (F<t/*> and F<lib/[a-z]*> pragma tests).
1810*0Sstevel@tonic-gate
1811*0Sstevel@tonic-gate=item test.deparse
1812*0Sstevel@tonic-gate
1813*0Sstevel@tonic-gateRun all the tests through B::Deparse.  Not all tests will succeed.
1814*0Sstevel@tonic-gate
1815*0Sstevel@tonic-gate=item test.taintwarn
1816*0Sstevel@tonic-gate
1817*0Sstevel@tonic-gateRun all tests with the B<-t> command-line switch.  Not all tests
1818*0Sstevel@tonic-gateare expected to succeed (until they're specifically fixed, of course).
1819*0Sstevel@tonic-gate
1820*0Sstevel@tonic-gate=item minitest
1821*0Sstevel@tonic-gate
1822*0Sstevel@tonic-gateRun F<miniperl> on F<t/base>, F<t/comp>, F<t/cmd>, F<t/run>, F<t/io>,
1823*0Sstevel@tonic-gateF<t/op>, and F<t/uni> tests.
1824*0Sstevel@tonic-gate
1825*0Sstevel@tonic-gate=item test.valgrind check.valgrind utest.valgrind ucheck.valgrind
1826*0Sstevel@tonic-gate
1827*0Sstevel@tonic-gate(Only in Linux) Run all the tests using the memory leak + naughty
1828*0Sstevel@tonic-gatememory access tool "valgrind".  The log files will be named
1829*0Sstevel@tonic-gateF<testname.valgrind>.
1830*0Sstevel@tonic-gate
1831*0Sstevel@tonic-gate=item test.third check.third utest.third ucheck.third
1832*0Sstevel@tonic-gate
1833*0Sstevel@tonic-gate(Only in Tru64)  Run all the tests using the memory leak + naughty
1834*0Sstevel@tonic-gatememory access tool "Third Degree".  The log files will be named
1835*0Sstevel@tonic-gateF<perl3.log.testname>.
1836*0Sstevel@tonic-gate
1837*0Sstevel@tonic-gate=item test.torture torturetest
1838*0Sstevel@tonic-gate
1839*0Sstevel@tonic-gateRun all the usual tests and some extra tests.  As of Perl 5.8.0 the
1840*0Sstevel@tonic-gateonly extra tests are Abigail's JAPHs, F<t/japh/abigail.t>.
1841*0Sstevel@tonic-gate
You can also run the torture test with F<t/harness> by giving the
C<-torture> argument to F<t/harness>.
1844*0Sstevel@tonic-gate
1845*0Sstevel@tonic-gate=item utest ucheck test.utf8 check.utf8
1846*0Sstevel@tonic-gate
1847*0Sstevel@tonic-gateRun all the tests with -Mutf8.  Not all tests will succeed.
1848*0Sstevel@tonic-gate
1849*0Sstevel@tonic-gate=item test_harness
1850*0Sstevel@tonic-gate
1851*0Sstevel@tonic-gateRun the test suite with the F<t/harness> controlling program, instead of
1852*0Sstevel@tonic-gateF<t/TEST>. F<t/harness> is more sophisticated, and uses the
1853*0Sstevel@tonic-gateL<Test::Harness> module, thus using this test target supposes that perl
1854*0Sstevel@tonic-gatemostly works. The main advantage for our purposes is that it prints a
1855*0Sstevel@tonic-gatedetailed summary of failed tests at the end. Also, unlike F<t/TEST>, it
1856*0Sstevel@tonic-gatedoesn't redirect stderr to stdout.
1857*0Sstevel@tonic-gate
1858*0Sstevel@tonic-gate=back
1859*0Sstevel@tonic-gate
1860*0Sstevel@tonic-gate=head2 Running tests by hand
1861*0Sstevel@tonic-gate
You can run part of the test suite by hand by using one of the following
commands from the F<t/> directory:
1864*0Sstevel@tonic-gate
1865*0Sstevel@tonic-gate    ./perl -I../lib TEST list-of-.t-files
1866*0Sstevel@tonic-gate
1867*0Sstevel@tonic-gateor
1868*0Sstevel@tonic-gate
1869*0Sstevel@tonic-gate    ./perl -I../lib harness list-of-.t-files
1870*0Sstevel@tonic-gate
(If you don't specify test scripts, the whole test suite will be run.)
1872*0Sstevel@tonic-gate
1873*0Sstevel@tonic-gateYou can run an individual test by a command similar to
1874*0Sstevel@tonic-gate
    ./perl -I../lib path/to/foo.t
1876*0Sstevel@tonic-gate
except that the harnesses set up some environment variables that may
affect the execution of the test:
1879*0Sstevel@tonic-gate
1880*0Sstevel@tonic-gate=over 4
1881*0Sstevel@tonic-gate
1882*0Sstevel@tonic-gate=item PERL_CORE=1
1883*0Sstevel@tonic-gate
indicates that we're running this test as part of the perl core test
suite.  This is useful for modules that have a dual life on CPAN.
1886*0Sstevel@tonic-gate
1887*0Sstevel@tonic-gate=item PERL_DESTRUCT_LEVEL=2
1888*0Sstevel@tonic-gate
is set to 2 if it isn't set already (see L</PERL_DESTRUCT_LEVEL>).
1890*0Sstevel@tonic-gate
1891*0Sstevel@tonic-gate=item PERL
1892*0Sstevel@tonic-gate
1893*0Sstevel@tonic-gate(used only by F<t/TEST>) if set, overrides the path to the perl executable
1894*0Sstevel@tonic-gatethat should be used to run the tests (the default being F<./perl>).
1895*0Sstevel@tonic-gate
1896*0Sstevel@tonic-gate=item PERL_SKIP_TTY_TEST
1897*0Sstevel@tonic-gate
if set, tests that need a terminal are skipped.  It's actually set
automatically by the Makefile, but can also be forced artificially by
running 'make test_notty'.
1901*0Sstevel@tonic-gate
1902*0Sstevel@tonic-gate=back
1903*0Sstevel@tonic-gate
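A test for a dual-life module typically checks PERL_CORE to decide where
to find its libraries.  The sketch below shows one widely used form of
that check; the C<$mode> variable and the message are illustrative
additions, and the F<../lib> path assumes the usual core layout with the
test run from F<t/>.

```perl
# The BEGIN block comes first so that @INC is right before anything loads.
BEGIN {
    if ($ENV{PERL_CORE}) {
        # Inside the core: run from t/ and use the uninstalled library.
        chdir 't' if -d 't';
        @INC = '../lib';
    }
}

use strict;
use warnings;

# Outside the core (a CPAN build), @INC is left as the toolchain set it.
my $mode = $ENV{PERL_CORE} ? 'core' : 'standalone (CPAN)';
print "# running as part of the $mode test suite\n";
```

Because the same file ships in both places, the check lets one test
script serve the core tree and the CPAN distribution without edits.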
1904*0Sstevel@tonic-gate=head1 EXTERNAL TOOLS FOR DEBUGGING PERL
1905*0Sstevel@tonic-gate
1906*0Sstevel@tonic-gateSometimes it helps to use external tools while debugging and
1907*0Sstevel@tonic-gatetesting Perl.  This section tries to guide you through using
1908*0Sstevel@tonic-gatesome common testing and debugging tools with Perl.  This is
1909*0Sstevel@tonic-gatemeant as a guide to interfacing these tools with Perl, not
1910*0Sstevel@tonic-gateas any kind of guide to the use of the tools themselves.
1911*0Sstevel@tonic-gate
B<NOTE 1>: Running under memory debuggers such as Purify, valgrind, or
Third Degree greatly slows down the execution: seconds become minutes,
minutes become hours.  For example, as of Perl 5.8.1,
F<ext/Encode/t/Unicode.t> takes extraordinarily long to complete under
e.g. Purify, Third Degree, and valgrind.  Under valgrind it takes more
than six hours, even on a snappy computer; the test must be doing
something that is quite unfriendly to memory debuggers.  If you don't
feel like waiting, you can simply kill the perl process.
1921*0Sstevel@tonic-gate
B<NOTE 2>: To minimize the number of memory leak false alarms (see
L</PERL_DESTRUCT_LEVEL> for more information), you have to have the
environment variable PERL_DESTRUCT_LEVEL set to 2.  The F<TEST>
and harness scripts do that automatically.  But if you are running
some of the tests manually, you have to set it yourself.  For csh-like
shells:
1927*0Sstevel@tonic-gate
1928*0Sstevel@tonic-gate    setenv PERL_DESTRUCT_LEVEL 2
1929*0Sstevel@tonic-gate
1930*0Sstevel@tonic-gateand for Bourne-type shells:
1931*0Sstevel@tonic-gate
1932*0Sstevel@tonic-gate    PERL_DESTRUCT_LEVEL=2
1933*0Sstevel@tonic-gate    export PERL_DESTRUCT_LEVEL
1934*0Sstevel@tonic-gate
1935*0Sstevel@tonic-gateor in UNIXy environments you can also use the C<env> command:
1936*0Sstevel@tonic-gate
1937*0Sstevel@tonic-gate    env PERL_DESTRUCT_LEVEL=2 valgrind ./perl -Ilib ...
1938*0Sstevel@tonic-gate
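If you would rather set this up from Perl than from the shell, a small
wrapper can apply the same default before handing over to the test.
This is only a sketch: the wrapper itself is hypothetical, and it merely
prints the command it would run, defaulting to F<./perl> unless the PERL
environment variable overrides it.

```perl
use strict;
use warnings;

# Set the destruct level only if the caller has not chosen one already.
$ENV{PERL_DESTRUCT_LEVEL} = 2 unless exists $ENV{PERL_DESTRUCT_LEVEL};

# Honour $ENV{PERL} if set, otherwise use the freshly built ./perl.
my $perl = $ENV{PERL} || './perl';
my @cmd  = ($perl, '-I../lib', @ARGV);

print "PERL_DESTRUCT_LEVEL=$ENV{PERL_DESTRUCT_LEVEL}\n";
print "would run: @cmd\n";

# To actually run the test, replace the print above with:
#   exec @cmd or die "cannot exec $perl: $!";
```

Test files given on the wrapper's command line are passed through
unchanged, so it can stand in for the plain F<./perl -I../lib> invocation.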
B<NOTE 3>: There are known memory leaks when there are compile-time
errors within eval or require; seeing C<S_doeval> in the call stack
is a good sign of these.  Fixing these leaks is non-trivial,
unfortunately, but they must be fixed eventually.
1943*0Sstevel@tonic-gate
1944*0Sstevel@tonic-gate=head2 Rational Software's Purify
1945*0Sstevel@tonic-gate
1946*0Sstevel@tonic-gatePurify is a commercial tool that is helpful in identifying
1947*0Sstevel@tonic-gatememory overruns, wild pointers, memory leaks and other such
1948*0Sstevel@tonic-gatebadness.  Perl must be compiled in a specific way for
1949*0Sstevel@tonic-gateoptimal testing with Purify.  Purify is available under
1950*0Sstevel@tonic-gateWindows NT, Solaris, HP-UX, SGI, and Siemens Unix.
1951*0Sstevel@tonic-gate
1952*0Sstevel@tonic-gate=head2 Purify on Unix
1953*0Sstevel@tonic-gate
1954*0Sstevel@tonic-gateOn Unix, Purify creates a new Perl binary.  To get the most
1955*0Sstevel@tonic-gatebenefit out of Purify, you should create the perl to Purify
1956*0Sstevel@tonic-gateusing:
1957*0Sstevel@tonic-gate
1958*0Sstevel@tonic-gate    sh Configure -Accflags=-DPURIFY -Doptimize='-g' \
1959*0Sstevel@tonic-gate     -Uusemymalloc -Dusemultiplicity
1960*0Sstevel@tonic-gate
1961*0Sstevel@tonic-gatewhere these arguments mean:
1962*0Sstevel@tonic-gate
1963*0Sstevel@tonic-gate=over 4
1964*0Sstevel@tonic-gate
1965*0Sstevel@tonic-gate=item -Accflags=-DPURIFY
1966*0Sstevel@tonic-gate
1967*0Sstevel@tonic-gateDisables Perl's arena memory allocation functions, as well as
1968*0Sstevel@tonic-gateforcing use of memory allocation functions derived from the
1969*0Sstevel@tonic-gatesystem malloc.
1970*0Sstevel@tonic-gate
1971*0Sstevel@tonic-gate=item -Doptimize='-g'
1972*0Sstevel@tonic-gate
1973*0Sstevel@tonic-gateAdds debugging information so that you see the exact source
1974*0Sstevel@tonic-gatestatements where the problem occurs.  Without this flag, all
1975*0Sstevel@tonic-gateyou will see is the source filename of where the error occurred.
1976*0Sstevel@tonic-gate
1977*0Sstevel@tonic-gate=item -Uusemymalloc
1978*0Sstevel@tonic-gate
1979*0Sstevel@tonic-gateDisable Perl's malloc so that Purify can more closely monitor
1980*0Sstevel@tonic-gateallocations and leaks.  Using Perl's malloc will make Purify
1981*0Sstevel@tonic-gatereport most leaks in the "potential" leaks category.
1982*0Sstevel@tonic-gate
1983*0Sstevel@tonic-gate=item -Dusemultiplicity
1984*0Sstevel@tonic-gate
1985*0Sstevel@tonic-gateEnabling the multiplicity option allows perl to clean up
1986*0Sstevel@tonic-gatethoroughly when the interpreter shuts down, which reduces the
1987*0Sstevel@tonic-gatenumber of bogus leak reports from Purify.
1988*0Sstevel@tonic-gate
1989*0Sstevel@tonic-gate=back
1990*0Sstevel@tonic-gate
1991*0Sstevel@tonic-gateOnce you've compiled a perl suitable for Purify'ing, then you
1992*0Sstevel@tonic-gatecan just:
1993*0Sstevel@tonic-gate
1994*0Sstevel@tonic-gate    make pureperl
1995*0Sstevel@tonic-gate
1996*0Sstevel@tonic-gatewhich creates a binary named 'pureperl' that has been Purify'ed.
1997*0Sstevel@tonic-gateThis binary is used in place of the standard 'perl' binary
1998*0Sstevel@tonic-gatewhen you want to debug Perl memory problems.
1999*0Sstevel@tonic-gate
2000*0Sstevel@tonic-gateAs an example, to show any memory leaks produced during the
2001*0Sstevel@tonic-gatestandard Perl testset you would create and run the Purify'ed
2002*0Sstevel@tonic-gateperl as:
2003*0Sstevel@tonic-gate
2004*0Sstevel@tonic-gate    make pureperl
2005*0Sstevel@tonic-gate    cd t
2006*0Sstevel@tonic-gate    ../pureperl -I../lib harness
2007*0Sstevel@tonic-gate
which would run the test suite under the Purify'ed perl and report any
memory problems.
2009*0Sstevel@tonic-gate
Purify outputs messages in "Viewer" windows by default.  If
you don't have a windowing environment or if you simply
want the Purify output to unobtrusively go to a log file
instead of to the interactive window, use the following
options to output to the log file "perl.log":
2015*0Sstevel@tonic-gate
2016*0Sstevel@tonic-gate    setenv PURIFYOPTIONS "-chain-length=25 -windows=no \
2017*0Sstevel@tonic-gate     -log-file=perl.log -append-logfile=yes"
2018*0Sstevel@tonic-gate
2019*0Sstevel@tonic-gateIf you plan to use the "Viewer" windows, then you only need this option:
2020*0Sstevel@tonic-gate
2021*0Sstevel@tonic-gate    setenv PURIFYOPTIONS "-chain-length=25"
2022*0Sstevel@tonic-gate
2023*0Sstevel@tonic-gateIn Bourne-type shells:
2024*0Sstevel@tonic-gate
2025*0Sstevel@tonic-gate    PURIFYOPTIONS="..."
2026*0Sstevel@tonic-gate    export PURIFYOPTIONS
2027*0Sstevel@tonic-gate
2028*0Sstevel@tonic-gateor if you have the "env" utility:
2029*0Sstevel@tonic-gate
2030*0Sstevel@tonic-gate    env PURIFYOPTIONS="..." ../pureperl ...
2031*0Sstevel@tonic-gate
2032*0Sstevel@tonic-gate=head2 Purify on NT
2033*0Sstevel@tonic-gate
2034*0Sstevel@tonic-gatePurify on Windows NT instruments the Perl binary 'perl.exe'
2035*0Sstevel@tonic-gateon the fly.  There are several options in the makefile you
2036*0Sstevel@tonic-gateshould change to get the most use out of Purify:
2037*0Sstevel@tonic-gate
2038*0Sstevel@tonic-gate=over 4
2039*0Sstevel@tonic-gate
2040*0Sstevel@tonic-gate=item DEFINES
2041*0Sstevel@tonic-gate
2042*0Sstevel@tonic-gateYou should add -DPURIFY to the DEFINES line so the DEFINES
2043*0Sstevel@tonic-gateline looks something like:
2044*0Sstevel@tonic-gate
2045*0Sstevel@tonic-gate    DEFINES = -DWIN32 -D_CONSOLE -DNO_STRICT $(CRYPT_FLAG) -DPURIFY=1
2046*0Sstevel@tonic-gate
2047*0Sstevel@tonic-gateto disable Perl's arena memory allocation functions, as
2048*0Sstevel@tonic-gatewell as to force use of memory allocation functions derived
2049*0Sstevel@tonic-gatefrom the system malloc.
2050*0Sstevel@tonic-gate
2051*0Sstevel@tonic-gate=item USE_MULTI = define
2052*0Sstevel@tonic-gate
2053*0Sstevel@tonic-gateEnabling the multiplicity option allows perl to clean up
2054*0Sstevel@tonic-gatethoroughly when the interpreter shuts down, which reduces the
2055*0Sstevel@tonic-gatenumber of bogus leak reports from Purify.
2056*0Sstevel@tonic-gate
2057*0Sstevel@tonic-gate=item #PERL_MALLOC = define
2058*0Sstevel@tonic-gate
2059*0Sstevel@tonic-gateDisable Perl's malloc so that Purify can more closely monitor
2060*0Sstevel@tonic-gateallocations and leaks.  Using Perl's malloc will make Purify
2061*0Sstevel@tonic-gatereport most leaks in the "potential" leaks category.
2062*0Sstevel@tonic-gate
2063*0Sstevel@tonic-gate=item CFG = Debug
2064*0Sstevel@tonic-gate
2065*0Sstevel@tonic-gateAdds debugging information so that you see the exact source
2066*0Sstevel@tonic-gatestatements where the problem occurs.  Without this flag, all
2067*0Sstevel@tonic-gateyou will see is the source filename of where the error occurred.
2068*0Sstevel@tonic-gate
2069*0Sstevel@tonic-gate=back
2070*0Sstevel@tonic-gate
2071*0Sstevel@tonic-gateAs an example, to show any memory leaks produced during the
2072*0Sstevel@tonic-gatestandard Perl testset you would create and run Purify as:
2073*0Sstevel@tonic-gate
2074*0Sstevel@tonic-gate    cd win32
2075*0Sstevel@tonic-gate    make
2076*0Sstevel@tonic-gate    cd ../t
2077*0Sstevel@tonic-gate    purify ../perl -I../lib harness
2078*0Sstevel@tonic-gate
which would instrument Perl in memory, run the test suite with it,
then finally report any memory problems.
2081*0Sstevel@tonic-gate
2082*0Sstevel@tonic-gate=head2 valgrind
2083*0Sstevel@tonic-gate
The excellent valgrind tool can be used to find both memory leaks
and illegal memory accesses.  As of August 2003 it unfortunately works
only on x86 (ELF) Linux.  The special "test.valgrind" target can be used
to run the tests under valgrind.  Errors and memory leaks found are
logged in files named F<testname.valgrind>.
2089*0Sstevel@tonic-gate
As system libraries (most notably glibc) also trigger errors,
valgrind allows such errors to be suppressed using suppression files. The
default suppression file that comes with valgrind already catches a lot
of them. Some additional suppressions are defined in F<t/perl.supp>.
2094*0Sstevel@tonic-gate
2095*0Sstevel@tonic-gateTo get valgrind and for more information see
2096*0Sstevel@tonic-gate
2097*0Sstevel@tonic-gate    http://developer.kde.org/~sewardj/
2098*0Sstevel@tonic-gate
=head2 Compaq's/Digital's/HP's Third Degree

Third Degree is a tool for memory leak detection and memory access checks.
It is one of the many tools in the ATOM toolkit.  The toolkit is only
available on Tru64 (formerly known as Digital UNIX, formerly known as
DEC OSF/1).

When building Perl, you must first run Configure with the -Doptimize=-g
and -Uusemymalloc flags; after that you can use the make targets
"perl.third" and "test.third".  (What is required is that Perl must be
compiled using the C<-g> flag; you may need to re-Configure.)

The short story is that with "atom" you can instrument the Perl
executable to create a new executable called F<perl.third>.  When the
instrumented executable is run, it creates a log of dubious memory
traffic in a file called F<perl.3log>.  See the manual pages of atom and
third for more information.  The most extensive Third Degree
documentation is available in the Compaq "Tru64 UNIX Programmer's
Guide", chapter "Debugging Programs with Third Degree".

The "test.third" target leaves a lot of files named F<foo_bar.3log> in
the t/ subdirectory.  There is a problem with these files: Third Degree
is so effective that it also finds problems in the system libraries.
Therefore you should use the Porting/thirdclean script to clean up
the F<*.3log> files.

There are also leaks that, for a certain definition of a leak,
aren't.  See L</PERL_DESTRUCT_LEVEL> for more information.

=head2 PERL_DESTRUCT_LEVEL

If you want to run any of the tests yourself manually using e.g.
valgrind, or the pureperl or perl.third executables, please note that
by default perl B<does not> explicitly clean up all the memory it has
allocated (such as global memory arenas) but instead lets the exit()
of the whole program "take care" of such allocations, also known as
"global destruction of objects".

There is a way to tell perl to do complete cleanup: set the
environment variable PERL_DESTRUCT_LEVEL to a non-zero value.
The t/TEST wrapper sets this to 2, and this is what you
need to do too if you don't want to see the "global leaks".
For example, for "third-degreed" Perl:

    env PERL_DESTRUCT_LEVEL=2 ./perl.third -Ilib t/foo/bar.t

(Note: the mod_perl apache module also uses this environment variable
for its own purposes and has extended its semantics.  Refer to the
mod_perl documentation for more information.  Also, spawned threads do
the equivalent of setting this variable to the value 1.)
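
As a rough illustration only, the way an environment variable can raise
a cleanup level might be sketched in C as follows.  The function names
C<destruct_level_from> and C<effective_destruct_level> are hypothetical,
and the rule that the environment can only raise (never lower) the level
already in force is an assumption made for this sketch, not a
description of perl's actual code.

```c
#include <stdlib.h>

/* Hypothetical helper: combine the level requested through the
 * environment (NULL if the variable is unset) with the level already
 * in force.  Assumption for this sketch: the environment can only
 * raise the level, never lower it. */
int destruct_level_from(const char *env_value, int current)
{
    if (env_value != NULL) {
        int requested = atoi(env_value);
        if (requested > current)
            current = requested;
    }
    return current;
}

/* Read PERL_DESTRUCT_LEVEL from the real environment. */
int effective_destruct_level(int current)
{
    return destruct_level_from(getenv("PERL_DESTRUCT_LEVEL"), current);
}
```

Under these assumptions a process started with PERL_DESTRUCT_LEVEL=2
would end up at level 2 whatever its default was, while an unset
variable leaves the default untouched.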

If, at the end of a run you get the message I<N scalars leaked>, you can
recompile with C<-DDEBUG_LEAKING_SCALARS>, which will cause
the addresses of all those leaked SVs to be dumped; it also converts
C<new_SV()> from a macro into a real function, so you can use your
favourite debugger to discover where those pesky SVs were allocated.

=head2 Profiling

Depending on your platform there are various ways of profiling Perl.

There are two commonly used techniques of profiling executables:
I<statistical time-sampling> and I<basic-block counting>.

The first method periodically samples the CPU program
counter, and since the program counter can be correlated with the code
generated for functions, we get a statistical view of which
functions the program is spending its time in.  The caveats are that very
small/fast functions have a lower probability of showing up in the
profile, and that periodically interrupting the program (this is
usually done rather frequently, on the scale of milliseconds) imposes
an additional overhead that may skew the results.  The first problem
can be alleviated by running the code for longer (in general this is a
good idea for profiling); the second problem is usually kept in check
by the profiling tools themselves.
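
The first caveat can be illustrated with a small simulation.  The
sketch below does not sample a real program counter; it assumes a
pre-recorded array (C<trace>, a hypothetical name) saying which function
was running at each clock tick, and tallies every C<period>-th entry.

```c
#include <stddef.h>

/* Hypothetical function ids standing in for program-counter ranges. */
enum { FN_BIG = 0, FN_SMALL = 1, FN_COUNT = 2 };

/* Tally which function was running at every 'period'-th tick.
 * 'trace' must hold 'ticks' entries, each a valid function id. */
void sample_trace(const int *trace, size_t ticks, size_t period,
                  unsigned long hits[FN_COUNT])
{
    size_t t;

    for (t = 0; t < FN_COUNT; t++)
        hits[t] = 0;
    if (period == 0)
        return;                 /* nothing sensible to sample */
    for (t = 0; t < ticks; t += period)
        hits[trace[t]]++;
}
```

If a short function runs only on ticks 5 and 15 of a 20-tick trace,
sampling with a period of 10 inspects ticks 0 and 10 and never sees it,
while a period of 5 catches both occurrences.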

The second method divides up the generated code into I<basic blocks>.
Basic blocks are sections of code that are entered only at the
beginning and exited only at the end.  For example, a conditional jump
ends one basic block and starts another.  Basic block profiling usually
works by I<instrumenting> the code by adding I<enter basic block #nnnn>
book-keeping code to the generated code.  During the execution of the
code the basic block counters are then updated appropriately.  The
caveat is that the added extra code can skew the results: again, the
profiling tools usually try to factor their own effects out of the
results.
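
To make the book-keeping concrete, here is a hand-instrumented sketch.
A real tool inserts the counters automatically and divides the code more
finely (the loop test, for instance, is a block of its own); the block
names, the C<ENTER_BB> macro and the C<count_even> function are all
hypothetical and exist only for this illustration.

```c
/* One counter per (coarsely chosen) basic block. */
enum { BB_ENTRY, BB_LOOP_BODY, BB_EVEN, BB_EXIT, BB_COUNT };
static unsigned long bb_count[BB_COUNT];

/* The "enter basic block #nnnn" book-keeping, added by hand here. */
#define ENTER_BB(n) (bb_count[(n)]++)

/* Count the even numbers in [0, limit). */
unsigned count_even(unsigned limit)
{
    unsigned i, evens = 0;

    ENTER_BB(BB_ENTRY);
    for (i = 0; i < limit; i++) {
        ENTER_BB(BB_LOOP_BODY);
        if (i % 2 == 0) {
            ENTER_BB(BB_EVEN);
            evens++;
        }
    }
    ENTER_BB(BB_EXIT);
    return evens;
}
```

Starting from zeroed counters, a single call to C<count_even(5)> leaves
the entry, loop body, even-branch and exit counters at 1, 5, 3 and 1
respectively, which is exactly the kind of per-block execution count a
basic-block profiler reports.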

=head2 Gprof Profiling

gprof is a profiling tool available on many UNIX platforms;
it uses I<statistical time-sampling>.

You can build a profiled version of perl called "perl.gprof" by
invoking the make target "perl.gprof".  (What is required is that Perl
must be compiled using the C<-pg> flag; you may need to re-Configure.)
Running the profiled version of Perl will create an output file called
F<gmon.out>, which contains the profiling data collected
during the execution.

The gprof tool can then display the collected data in various ways.
Usually gprof understands the following options:

=over 4

=item -a

Suppress statically defined functions from the profile.

=item -b

Suppress the verbose descriptions in the profile.

=item -e routine

Exclude the given routine and its descendants from the profile.

=item -f routine

Display only the given routine and its descendants in the profile.

=item -s

Generate a summary file called F<gmon.sum> which then may be given
to subsequent gprof runs to accumulate data over several runs.

=item -z

Display routines that have zero usage.

=back

For a more detailed explanation of the available commands and output
formats, see your own local documentation of gprof.

=head2 GCC gcov Profiling

Starting from GCC 3.0, I<basic block profiling> is officially available
for GNU CC.

You can build a profiled version of perl called F<perl.gcov> by
invoking the make target "perl.gcov" (what is required is that Perl must
be compiled using gcc with the flags C<-fprofile-arcs
-ftest-coverage>; you may need to re-Configure).

Running the profiled version of Perl will cause profile output to be
generated.  For each source file an accompanying ".da" file will be
created.

To display the results you use the "gcov" utility (which should
be installed if you have gcc 3.0 or newer installed).  F<gcov> is
run on source code files, like this:

    gcov sv.c

which will cause F<sv.c.gcov> to be created.  The F<.gcov> files
contain the source code annotated with relative frequencies of
execution indicated by "#" markers.

Useful options of F<gcov> include C<-b>, which will summarise the
basic block, branch, and function call coverage, and C<-c>, which
will use the actual counts instead of relative frequencies.  For
more information on the use of F<gcov> and basic block profiling
with gcc, see the latest GNU CC manual.  As of GCC 3.0, see

    http://gcc.gnu.org/onlinedocs/gcc-3.0/gcc.html

and its section titled "8. gcov: a Test Coverage Program"

    http://gcc.gnu.org/onlinedocs/gcc-3.0/gcc_8.html#SEC132

=head2 Pixie Profiling

Pixie is a profiling tool available on IRIX and Tru64 (aka Digital
UNIX aka DEC OSF/1) platforms.  Pixie does its profiling using
I<basic-block counting>.

You can build a profiled version of perl called F<perl.pixie> by
invoking the make target "perl.pixie" (what is required is that Perl
must be compiled using the C<-g> flag; you may need to re-Configure).

In Tru64 a file called F<perl.Addrs> will also be silently created;
this file contains the addresses of the basic blocks.  Running the
profiled version of Perl will create a new file called "perl.Counts"
which contains the counts for the basic blocks for that particular
program execution.

To display the results you use the F<prof> utility.  The exact
incantation depends on your operating system: "prof perl.Counts" in
IRIX, and "prof -pixie -all -L. perl" in Tru64.

In IRIX the following prof options are available:

=over 4

=item -h

Reports the most heavily used lines in descending order of use.
Useful for finding the hotspot lines.

=item -l

Groups lines by procedure, with procedures sorted in descending order of use.
Within a procedure, lines are listed in source order.
Useful for finding the hotspots of procedures.

=back

In Tru64 the following options are available:

=over 4

=item -p[rocedures]

Procedures sorted in descending order by the number of cycles executed
in each procedure.  Useful for finding the hotspot procedures.
(This is the default option.)

=item -h[eavy]

Lines sorted in descending order by the number of cycles executed in
each line.  Useful for finding the hotspot lines.

=item -i[nvocations]

The called procedures are sorted in descending order by number of calls
made to the procedures.  Useful for finding the most used procedures.

=item -l[ines]

Grouped by procedure, sorted by cycles executed per procedure.
Useful for finding the hotspots of procedures.

=item -testcoverage

The compiler emitted code for these lines, but the code was unexecuted.

=item -z[ero]

Unexecuted procedures.

=back

For further information, see your system's manual pages for pixie and prof.

=head2 Miscellaneous tricks

=over 4

=item *

Those debugging perl with the DDD frontend over gdb may find the
following useful:

You can extend the data conversion shortcuts menu, so for example you
can display an SV's IV value with one click, without doing any typing.
To do that simply edit the F<~/.ddd/init> file and add after:

  ! Display shortcuts.
  Ddd*gdbDisplayShortcuts: \
  /t ()   // Convert to Bin\n\
  /d ()   // Convert to Dec\n\
  /x ()   // Convert to Hex\n\
  /o ()   // Convert to Oct\n\

the following two lines:

  ((XPV*) (())->sv_any )->xpv_pv  // 2pvx\n\
  ((XPVIV*) (())->sv_any )->xiv_iv // 2ivx

so now you can do ivx and pvx lookups, or you can plug in the
sv_peek "conversion" there:

  Perl_sv_peek(my_perl, (SV*)()) // sv_peek

(The my_perl is for threaded builds.)
Just remember that every line but the last one should end with \n\

Alternatively, edit the init file interactively via:
3rd mouse button -> New Display -> Edit Menu

Note: you can define up to 20 conversion shortcuts in the gdb
section.

=item *

If you see in a debugger a memory area mysteriously full of 0xabababab,
you may be seeing the effect of the Poison() macro; see L<perlclib>.

=back
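
The 0xabababab pattern mentioned in the last item is what you get when a
region is overwritten byte by byte with 0xAB.  The following is a
minimal sketch of that idea, not perl's own definition: C<POISON_FILL>
and C<first_word> are hypothetical names, and the real Poison() macro
is the one documented in L<perlclib>.

```c
#include <stdint.h>
#include <string.h>

#define POISON_BYTE 0xAB   /* yields the 0xabababab pattern in a dump */

/* Hypothetical stand-in for a Poison()-style macro: overwrite 'count'
 * objects of the given type with the poison byte. */
#define POISON_FILL(dest, count, type) \
    ((void)memset((dest), POISON_BYTE, (count) * sizeof(type)))

/* Return the first 32-bit word of a buffer, much as a debugger's
 * memory dump would display it. */
uint32_t first_word(const void *buf)
{
    uint32_t w;

    memcpy(&w, buf, sizeof w);
    return w;
}
```

After C<POISON_FILL(buf, n, uint32_t)> every word of C<buf> reads back
as 0xabababab, so any later use of the stale contents stands out
immediately in the debugger.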

=head2 CONCLUSION

We've had a brief look around the Perl source, an overview of the stages
F<perl> goes through when it's running your code, and how to use a
debugger to poke at the Perl guts. We took a very simple problem and
demonstrated how to solve it fully - with documentation, regression
tests, and finally a patch for submission to p5p.  Finally, we talked
about how to use external tools to debug and test Perl.

I'd now suggest you read over those references again, and then, as soon
as possible, get your hands dirty. The best way to learn is by doing,
so:

=over 3

=item *

Subscribe to perl5-porters, follow the patches and try and understand
them; don't be afraid to ask if there's a portion you're not clear on -
who knows, you may unearth a bug in the patch...

=item *

Keep up to date with the bleeding edge Perl distributions and get
familiar with the changes. Try and get an idea of what areas people are
working on and the changes they're making.

=item *

Do read the README associated with your operating system, e.g. README.aix
on the IBM AIX OS. Don't hesitate to supply patches to that README if
you find anything missing or changed over a new OS release.

=item *

Find an area of Perl that seems interesting to you, and see if you can
work out how it works. Scan through the source, and step over it in the
debugger. Play, poke, investigate, fiddle! You'll probably get to
understand not just your chosen area but a much wider range of F<perl>'s
activity as well, and probably sooner than you'd think.

=back

=over 3

=item I<The Road goes ever on and on, down from the door where it began.>

=back

If you can do these things, you've started on the long road to Perl porting.
Thanks for wanting to help make Perl better - and happy hacking!

=head1 AUTHOR

This document was written by Nathan Torkington, and is maintained by
the perl5-porters mailing list.
