=head1 NAME

perlhack - How to hack at the Perl internals

=head1 DESCRIPTION

This document attempts to explain how Perl development takes place,
and ends with some suggestions for people wanting to become bona fide
porters.

The perl5-porters mailing list is where the Perl standard distribution
is maintained and developed. The list can get anywhere from 10 to 150
messages a day, depending on the heatedness of the debate. Most days
there are two or three patches, extensions, features, or bugs being
discussed at a time.

A searchable archive of the list is at either:

    http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/

or

    http://archive.develooper.com/perl5-porters@perl.org/

List subscribers (the porters themselves) come in several flavours.
Some are quiet curious lurkers, who rarely pitch in and instead watch
the ongoing development to ensure they're forewarned of new changes or
features in Perl. Some are representatives of vendors, who are there
to make sure that Perl continues to compile and work on their
platforms. Some patch any reported bug that they know how to fix,
some are actively patching their pet area (threads, Win32, the regexp
engine), while others seem to do nothing but complain. In other
words, it's your usual mix of technical people.

Over this group of porters presides Larry Wall. He has the final word
in what does and does not change in the Perl language. Various
releases of Perl are shepherded by a ``pumpking'', a porter
responsible for gathering patches and deciding on a patch-by-patch,
feature-by-feature basis what will and will not go into the release.
For instance, Gurusamy Sarathy was the pumpking for the 5.6 release of
Perl, Jarkko Hietaniemi is the pumpking for the 5.8 release, and
Hugo van der Sanden will be the pumpking for the 5.10 release.

In addition, various people are pumpkings for different things. For
instance, Andy Dougherty and Jarkko Hietaniemi share the I<Configure>
pumpkin.

Larry sees Perl development along the lines of the US government:
there's the Legislature (the porters), the Executive branch (the
pumpkings), and the Supreme Court (Larry). The legislature can
discuss and submit patches to the executive branch all they like, but
the executive branch is free to veto them. Rarely, the Supreme Court
will side with the executive branch over the legislature, or the
legislature over the executive branch. Mostly, however, the
legislature and the executive branch are supposed to get along and
work out their differences without impeachment or court cases.

You might sometimes see reference to Rule 1 and Rule 2.
Larry's power
as Supreme Court is expressed in The Rules:

=over 4

=item 1

Larry is always by definition right about how Perl should behave.
This means he has final veto power on the core functionality.

=item 2

Larry is allowed to change his mind about any matter at a later date,
regardless of whether he previously invoked Rule 1.

=back

Got that? Larry is always right, even when he was wrong. It's rare
to see either Rule exercised, but they are often alluded to.

New features and extensions to the language are contentious, because
the criteria used by the pumpkings, Larry, and other porters to decide
which features should be implemented and incorporated are not codified
in a few small design goals as with some other languages. Instead,
the heuristics are flexible and often difficult to fathom. Here is
one person's list, roughly in decreasing order of importance, of
heuristics that new features have to be weighed against:

=over 4

=item Does concept match the general goals of Perl?

These haven't been written anywhere in stone, but one approximation
is:

    1. Keep it fast, simple, and useful.
    2. Keep features/concepts as orthogonal as possible.
    3. No arbitrary limits (platforms, data sizes, cultures).
    4. Keep it open and exciting to use/patch/advocate Perl everywhere.
    5. Either assimilate new technologies, or build bridges to them.

=item Where is the implementation?

All the talk in the world is useless without an implementation. In
almost every case, the person or people who argue for a new feature
will be expected to be the ones who implement it. Porters capable
of coding new features have their own agendas, and are not available
to implement your (possibly good) idea.

=item Backwards compatibility

It's a cardinal sin to break existing Perl programs. New warnings are
contentious--some say that a program that emits warnings is not
broken, while others say it is. Adding keywords has the potential to
break programs; changing the meaning of existing token sequences or
functions might break programs too.

=item Could it be a module instead?

Perl 5 has extension mechanisms, modules and XS, specifically to avoid
the need to keep changing the Perl interpreter. You can write modules
that export functions, you can give those functions prototypes so they
can be called like built-in functions, you can even write XS code to
mess with the runtime data structures of the Perl interpreter if you
want to implement really complicated things.
If it can be done in a
module instead of in the core, it's highly unlikely to be added.

=item Is the feature generic enough?

Is this something that only the submitter wants added to the language,
or would it be broadly useful? Sometimes, instead of adding a feature
with a tight focus, the porters might decide to wait until someone
implements the more generalized feature. For instance, instead of
implementing a ``delayed evaluation'' feature, the porters are waiting
for a macro system that would permit delayed evaluation and much more.

=item Does it potentially introduce new bugs?

Radical rewrites of large chunks of the Perl interpreter have the
potential to introduce new bugs. The smaller and more localized the
change, the better.

=item Does it preclude other desirable features?

A patch is likely to be rejected if it closes off future avenues of
development. For instance, a patch that placed a true and final
interpretation on prototypes is likely to be rejected because there
are still options for the future of prototypes that haven't been
addressed.

=item Is the implementation robust?

Good patches (tight code, complete, correct) stand more chance of
going in.
Sloppy or incorrect patches might be placed on the back
burner until the pumpking has time to fix them, or might be discarded
altogether without further notice.

=item Is the implementation generic enough to be portable?

The worst patches make use of system-specific features. It's highly
unlikely that nonportable additions to the Perl language will be
accepted.

=item Is the implementation tested?

Patches which change behaviour (fixing bugs or introducing new features)
must include regression tests to verify that everything works as expected.
Without tests provided by the original author, how can anyone else changing
perl in the future be sure that they haven't unwittingly broken the behaviour
the patch implements? And without tests, how can the patch's author be
confident that his/her hard work put into the patch won't be accidentally
thrown away by someone in the future?

=item Is there enough documentation?

Patches without documentation are probably ill-thought out or
incomplete. Nothing can be added without documentation, so submitting
a patch for the appropriate manpages as well as the source code is
always a good idea.

=item Is there another way to do it?

Larry said ``Although the Perl Slogan is I<There's More Than One Way
to Do It>, I hesitate to make 10 ways to do something''.
This is a
tricky heuristic to navigate, though--one man's essential addition is
another man's pointless cruft.

=item Does it create too much work?

Work for the pumpking, work for Perl programmers, work for module
authors, ... Perl is supposed to be easy.

=item Patches speak louder than words

Working code is always preferred to pie-in-the-sky ideas. A patch to
add a feature stands a much higher chance of making it to the language
than does a random feature request, no matter how fervently argued the
request might be. This ties into ``Will it be useful?'', as the fact
that someone took the time to make the patch demonstrates a strong
desire for the feature.

=back

If you're on the list, you might hear the word ``core'' bandied
around. It refers to the standard distribution. ``Hacking on the
core'' means you're changing the C source code to the Perl
interpreter. ``A core module'' is one that ships with Perl.

=head2 Keeping in sync

The source code to the Perl interpreter, in its different versions, is
kept in a repository managed by a revision control system (which is
currently the Perforce program, see http://perforce.com/ ). The
pumpkings and a few others have access to the repository to check in
changes.
Periodically the pumpking for the development version of Perl
will release a new version, so the rest of the porters can see what's
changed. The current state of the main trunk of the repository, and patches
that describe the individual changes that have happened since the last
public release, are available at this location:

    http://public.activestate.com/gsar/APC/
    ftp://ftp.linux.activestate.com/pub/staff/gsar/APC/

If you're looking for a particular change, or a change that affected
a particular set of files, you may find the B<Perl Repository Browser>
useful:

    http://public.activestate.com/cgi-bin/perlbrowse

You may also want to subscribe to the perl5-changes mailing list to
receive a copy of each patch that gets submitted to the maintenance
and development "branches" of the perl repository. See
http://lists.perl.org/ for subscription information.

If you are a member of the perl5-porters mailing list, it is a good
thing to keep in touch with the most recent changes, if only to
verify that what you would have posted as a bug report isn't already
solved in the most recent available perl development branch, also
known as perl-current, bleading edge perl, bleedperl or bleadperl.

Needless to say, the source code in perl-current is usually in a perpetual
state of evolution. You should expect it to be very buggy.
Do B<not> use
it for any purpose other than testing and development.

Keeping in sync with the most recent branch can be done in several ways,
but the most convenient and reliable way is using B<rsync>, available at
ftp://rsync.samba.org/pub/rsync/ . (You can also get the most recent
branch by FTP.)

If you choose to keep in sync using rsync, there are two approaches
to doing so:

=over 4

=item rsync'ing the source tree

Presuming you are in the directory where your perl source resides
and you have rsync installed and available, you can `upgrade' to
the bleadperl using:

    # rsync -avz rsync://ftp.linux.activestate.com/perl-current/ .

This takes care of updating every single item in the source tree to
the latest applied patch level, creating files that are new (to your
distribution) and setting date/time stamps of existing files to
reflect the bleadperl status.

Note that this will not delete any files that were in '.' before
the rsync. Once you are sure that the rsync is running correctly,
run it with the --delete and the --dry-run options like this:

    # rsync -avz --delete --dry-run rsync://ftp.linux.activestate.com/perl-current/ .

This will I<simulate> an rsync run that also deletes files not
present in the bleadperl master copy. Observe the results from
this run closely. If you are sure that the actual run would delete
no files precious to you, you could remove the '--dry-run' option.

You can then check what patch was the latest that was applied by
looking in the file B<.patch>, which will show the number of the
latest patch.

If you have more than one machine to keep in sync, and not all of
them have access to the WAN (so you are not able to rsync all the
source trees to the real source), there are some ways to get around
this problem.

=over 4

=item Using rsync over the LAN

Set up a local rsync server which makes the rsynced source tree
available to the LAN and sync the other machines against this
directory.

From http://rsync.samba.org/README.html :

    "Rsync uses rsh or ssh for communication. It does not need to be
    setuid and requires no special privileges for installation. It
    does not require an inetd entry or a daemon. You must, however,
    have a working rsh or ssh system. Using ssh is recommended for
    its security features."

=item Using pushing over the NFS

Having the other systems mounted over the NFS, you can take an
active pushing approach by checking the just updated tree against
the other not-yet synced trees. An example would be

    #!/usr/bin/perl -w

    use strict;
    use File::Copy;

    # Map every file named in MANIFEST to its local mode, size and mtime.
    my %MF = map {
        m/(\S+)/;
        $1 => [ (stat $1)[2, 7, 9] ];    # mode, size, mtime
    } `cat MANIFEST`;

    # Each remote tree is reached through an NFS mount named after its host.
    my %remote = map { $_ => "/$_/pro/3gl/CPAN/perl-5.7.1" } qw(host1 host2);

    foreach my $host (keys %remote) {
        unless (-d $remote{$host}) {
            print STDERR "Cannot Xsync for host $host\n";
            next;
        }
        foreach my $file (keys %MF) {
            my $rfile = "$remote{$host}/$file";
            my ($mode, $size, $mtime) = (stat $rfile)[2, 7, 9];
            defined $size or ($mode, $size, $mtime) = (0, 0, 0);
            # Skip files whose size and mtime already match the local copy.
            $size == $MF{$file}[1] && $mtime == $MF{$file}[2] and next;
            printf "%4s %-34s %8d %9d %8d %9d\n",
                $host, $file, $MF{$file}[1], $MF{$file}[2], $size, $mtime;
            unlink $rfile;
            unless (copy ($file, $rfile)) {
                warn "Cannot copy $file to $rfile: $!\n";
                next;
            }
            # Give the copy the local file's mtime and mode.
            utime time, $MF{$file}[2], $rfile;
            chmod $MF{$file}[0], $rfile;
        }
    }

though this is not perfect.
It could be improved by checking
file checksums before updating. Not all NFS systems support
utime reliably (when used over the NFS).

=back

=item rsync'ing the patches

The source tree is maintained by the pumpking who applies patches to
the files in the tree. These patches are either created by the
pumpking himself using C<diff -c> after updating the file manually or
by applying patches sent in by posters on the perl5-porters list.
These patches are also saved and rsync'able, so you can apply them
yourself to the source files.

Presuming you are in a directory where your patches reside, you can
get them in sync with

    # rsync -avz rsync://ftp.linux.activestate.com/perl-current-diffs/ .

This makes sure the latest available patch is downloaded to your
patch directory.

It's then up to you to apply these patches, using something like

    # last=`ls -t *.gz | sed q`
    # rsync -avz rsync://ftp.linux.activestate.com/perl-current-diffs/ .
    # find . -name '*.gz' -newer $last -exec gzcat {} \; >blead.patch
    # cd ../perl-current
    # patch -p1 -N <../perl-current-diffs/blead.patch

or, since this is only a hint towards how it works, use CPAN-patchaperl
from Andreas König to have better control over the patching process.

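
As a minimal sketch of the selection step above, the following shell
function lists the patch numbers that are newer than a given patch
level, in numeric order. It assumes that each patch in the diffs
directory is stored as F<number.gz> (for example F<7583.gz>); that
naming, and the function name C<pending_patches>, are illustrative
assumptions rather than anything this document guarantees.

```shell
    # List patch numbers newer than a given patch level, oldest first.
    # Assumption: each patch is stored as <number>.gz in the directory.
    # usage: pending_patches LEVEL DIR
    pending_patches() {
        level=$1
        dir=$2
        for f in "$dir"/*.gz; do
            [ -e "$f" ] || continue            # no .gz files at all
            n=$(basename "$f" .gz)
            case $n in
                ''|*[!0-9]*) continue ;;       # skip names that are not plain numbers
            esac
            if [ "$n" -gt "$level" ]; then
                echo "$n"
            fi
        done | sort -n
    }
```

If B<.patch> holds only the patch number, the numbers printed by
C<pending_patches "`cat .patch`" ../perl-current-diffs> could be fed
one at a time to C<gzip -dc> and C<patch -p1 -N>. That would apply
the patches in numeric order rather than in whatever order C<find>
happens to return them.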

=back

=head2 Why rsync the source tree

=over 4

=item It's easier to rsync the source tree

Since you don't have to apply the patches yourself, you are sure all
files in the source tree are in the right state.

=item It's more reliable

While both the rsync-able source and patch areas are automatically
updated every few minutes, keep in mind that applying patches may
sometimes mean careful hand-holding, especially if your version of
the C<patch> program does not understand how to deal with new files,
files with 8-bit characters, or files without trailing newlines.

=back

=head2 Why rsync the patches

=over 4

=item It's easier to rsync the patches

If you have more than one machine that you want to keep in sync with
bleadperl, it's easier to rsync the patches only once and then apply
them to all the source trees on the different machines.

If you try to keep pace on 5 different machines, of which
only one has access to the WAN, rsync'ing all the source
trees would then have to be done 5 times over the NFS. Having
rsync'ed the patches only once, you can apply them to all the source
trees automatically.
Need I say more ;-)

=item It's a good reference

If you do not only like to have the most recent development branch,
but also like to B<fix> bugs, or extend features, you want to dive
into the sources. If you are a seasoned perl core diver, you don't
need no manuals, tips, roadmaps, perlguts.pod or other aids to find
your way around. But if you are a starter, the patches may help you
in finding where you should start and how to change the bits that
bug you.

The file B<Changes> is updated on occasions the pumpking sees as his
own little sync points. On those occasions, he releases a tar-ball of
the current source tree (e.g. perl@7582.tar.gz), which will be an
excellent point to start with when choosing to use the 'rsync the
patches' scheme. Starting with perl@7582, which means a set of source
files on which the latest applied patch is number 7582, you apply all
succeeding patches available from then on (7583, 7584, ...).

You can use the patches later as a kind of search archive.

=over 4

=item Finding a start point

If you want to fix/change the behaviour of function/feature Foo, just
scan the patches for patches that mention Foo either in the subject,
the comments, or the body of the fix.
There is a good chance the patch shows
you the files that are affected by that patch, which are very likely
to be the starting point of your journey into the guts of perl.

=item Finding how to fix a bug

If you've found I<where> the function/feature Foo misbehaves, but you
don't know how to fix it (but you do know the change you want to
make), you can, again, peruse the patches for similar changes and
see how others apply the fix.

=item Finding the source of misbehaviour

When you keep in sync with bleadperl, the pumpking would love to
I<see> that the community efforts really work. So after each of his
sync points, you are to 'make test' to check if everything is still
in working order. If it is, you do 'make ok', which will send an OK
report to perlbug@perl.org. (If you do not have access to a mailer
from the system on which you just successfully ran 'make test', you can
do 'make okfile', which creates the file C<perl.ok>, which you can
then take to your favourite mailer and mail yourself).

But of course, as always, things will not always lead to a success
path, and one or more tests do not pass the 'make test'. Before
sending in a bug report (using 'make nok' or 'make nokfile'), check
the mailing list to see if someone else has reported the bug already and if
so, confirm it by replying to that message.
If not, you might want to
trace the source of that misbehaviour B<before> sending in the bug,
which will help all the other porters in finding the solution.

Here the saved patches come in very handy. You can check the list of
patches to see which patch changed what file and what change caused
the misbehaviour. If you note that in the bug report, it saves
whoever tries to solve it from having to look for that point.

=back

If searching the patches is too bothersome, you might consider using
perl's bugtron to find more information about discussions and
ramblings on posted bugs.

If you want to get the best of both worlds, rsync both the source
tree for convenience, reliability and ease and rsync the patches
for reference.

=back


=head2 Perlbug administration

There is a single remote administrative interface for modifying bug status,
category, open issues etc. using the B<RT> I<bugtracker> system, maintained
by I<Robert Spier>.
Become an administrator, and close any bugs you can get
your sticky mitts on:

    http://rt.perl.org

The bugtracker mechanism for B<perl5> bugs in particular is at:

    http://bugs6.perl.org/perlbug

To email the bug system administrators:

    "perlbug-admin" <perlbug-admin@perl.org>


=head2 Submitting patches

Always submit patches to I<perl5-porters@perl.org>. If you're
patching a core module and there's an author listed, send the author a
copy (see L<Patching a core module>). This lets other porters review
your patch, which catches a surprising number of errors in patches.
Either use the diff program (available in source code form from
ftp://ftp.gnu.org/pub/gnu/ ), or use Johan Vromans' I<makepatch>
(available from I<CPAN/authors/id/JV/>). Unified diffs are preferred,
but context diffs are accepted. Do not send RCS-style diffs or diffs
without context lines. More information is given in the
I<Porting/patching.pod> file in the Perl source distribution. Please
patch against the latest B<development> version (e.g., if you're
fixing a bug in the 5.005 track, patch against the latest 5.005_5x
version). Only patches that survive the heat of the development
branch get applied to maintenance versions.

Your patch should update the documentation and test suite.
See L<Writing a test>.

To report a bug in Perl, use the program I<perlbug> which comes with
Perl (if you can't get Perl to work, send mail to the address
I<perlbug@perl.org> or I<perlbug@perl.com>). Reporting bugs through
I<perlbug> feeds into the automated bug-tracking system, access to
which is provided through the web at http://bugs.perl.org/ . It
often pays to check the archives of the perl5-porters mailing list to
see whether the bug you're reporting has been reported before, and if
so whether it was considered a bug. See above for the location of
the searchable archives.

The CPAN testers ( http://testers.cpan.org/ ) are a group of
volunteers who test CPAN modules on a variety of platforms. Perl
Smokers ( http://archives.develooper.com/daily-build@perl.org/ )
automatically test Perl source releases on platforms with various
configurations. Both efforts welcome volunteers.

It's a good idea to read and lurk for a while before chipping in.
That way you'll get to see the dynamic of the conversations, learn the
personalities of the players, and hopefully be better prepared to make
a useful contribution when you do speak up.

If after all this you still think you want to join the perl5-porters
mailing list, send mail to I<perl5-porters-subscribe@perl.org>. To
unsubscribe, send mail to I<perl5-porters-unsubscribe@perl.org>.

To hack on the Perl guts, you'll need to read the following things:

=over 3

=item L<perlguts>

This is of paramount importance, since it's the documentation of what
goes where in the Perl source. Read it over a couple of times and it
might start to make sense - don't worry if it doesn't yet, because the
best way to study it is to read it in conjunction with poking at Perl
source, and we'll do that later on.

You might also want to look at Gisle Aas's illustrated perlguts -
there's no guarantee that this will be absolutely up-to-date with the
latest documentation in the Perl core, but the fundamentals will be
right. ( http://gisle.aas.no/perl/illguts/ )

=item L<perlxstut> and L<perlxs>

A working knowledge of XSUB programming is incredibly useful for core
hacking; XSUBs use techniques drawn from the PP code, the portion of the
guts that actually executes a Perl program. It's a lot gentler to learn
those techniques from simple examples and explanation than from the core
itself.

=item L<perlapi>

The documentation for the Perl API explains what some of the internal
functions do, as well as the many macros used in the source.

=item F<Porting/pumpkin.pod>

This is a collection of words of wisdom for a Perl porter; some of it is
only useful to the pumpkin holder, but most of it applies to anyone
wanting to go about Perl development.

=item The perl5-porters FAQ

This should be available from http://simon-cozens.org/writings/p5p-faq ;
alternatively, you can get the FAQ emailed to you by sending mail to
C<perl5-porters-faq@perl.org>. It contains hints on reading perl5-porters,
information on how perl5-porters works and how Perl development in general
works.

=back

=head2 Finding Your Way Around

Perl maintenance can be split into a number of areas, and certain people
(pumpkins) will have responsibility for each area. These areas sometimes
correspond to files or directories in the source kit. Among the areas are:

=over 3

=item Core modules

Modules shipped as part of the Perl core live in the F<lib/> and F<ext/>
subdirectories: F<lib/> is for the pure-Perl modules, and F<ext/>
contains the core XS modules.

=item Tests

There are tests for nearly all the modules, built-ins and major bits
of functionality. Test files all have a .t suffix.
Module tests live
in the F<lib/> and F<ext/> directories next to the module being
tested. Others live in F<t/>. See L<Writing a test>.

=item Documentation

Documentation maintenance includes looking after everything in the
F<pod/> directory (as well as contributing new documentation), and
the documentation for the modules in core.

=item Configure

The configure process is the way we make Perl portable across the
myriad of operating systems it supports. Responsibility for the
configure, build and installation process, as well as the overall
portability of the core code rests with the configure pumpkin - others
help out with individual operating systems.

The files involved are the operating system directories (F<win32/>,
F<os2/>, F<vms/> and so on), the shell scripts which generate F<config.h>
and F<Makefile>, as well as the metaconfig files which generate
F<Configure>. (metaconfig isn't included in the core distribution.)

=item Interpreter

And of course, there's the core of the Perl interpreter itself. Let's
have a look at that in a little more detail.

=back

Before we leave looking at the layout, though, don't forget that
F<MANIFEST> contains not only the file names in the Perl distribution,
but short descriptions of what's in them, too.
For an overview of the
important files, try this:

    perl -lne 'print if /^[^\/]+\.[ch]\s+/' MANIFEST

=head2 Elements of the interpreter

The work of the interpreter has two main stages: compiling the code
into the internal representation, or bytecode, and then executing it.
L<perlguts/Compiled code> explains exactly how the compilation stage
happens.

Here is a short breakdown of perl's operation:

=over 3

=item Startup

The action begins in F<perlmain.c> (or F<miniperlmain.c> for miniperl).
This is very high-level code, enough to fit on a single screen, and it
resembles the code found in L<perlembed>; most of the real action takes
place in F<perl.c>.

First, F<perlmain.c> allocates some memory and constructs a Perl
interpreter:

    1 PERL_SYS_INIT3(&argc,&argv,&env);
    2
    3 if (!PL_do_undump) {
    4     my_perl = perl_alloc();
    5     if (!my_perl)
    6         exit(1);
    7     perl_construct(my_perl);
    8     PL_perl_destruct_level = 0;
    9 }

Line 1 is a macro, and its definition is dependent on your operating
system. Line 3 references C<PL_do_undump>, a global variable - all
global variables in Perl start with C<PL_>.
This tells you whether the
current running program was created with the C<-u> flag to perl and then
F<undump>, which means it's going to be false in any sane context.

Line 4 calls a function in F<perl.c> to allocate memory for a Perl
interpreter. It's quite a simple function, and the guts of it look like
this:

    my_perl = (PerlInterpreter*)PerlMem_malloc(sizeof(PerlInterpreter));

Here you see an example of Perl's system abstraction, which we'll see
later: C<PerlMem_malloc> is either your system's C<malloc>, or Perl's
own C<malloc> as defined in F<malloc.c> if you selected that option at
configure time.

Next, in line 7, we construct the interpreter; this sets up all the
special variables that Perl needs, the stacks, and so on.

Now we pass Perl the command line options, and tell it to go:

    exitstatus = perl_parse(my_perl, xs_init, argc, argv, (char **)NULL);
    if (!exitstatus) {
        exitstatus = perl_run(my_perl);
    }

C<perl_parse> is actually a wrapper around C<S_parse_body>, as defined
in F<perl.c>, which processes the command line options, sets up any
statically linked XS modules, opens the program and calls C<yyparse> to
parse it.

=item Parsing

The aim of this stage is to take the Perl source, and turn it into an op
tree. We'll see what one of those looks like later. Strictly speaking,
there are three things going on here.

C<yyparse>, the parser, lives in F<perly.c>, although you're better off
reading the original YACC input in F<perly.y>. (Yes, Virginia, there
B<is> a YACC grammar for Perl!) The job of the parser is to take your
code and `understand' it, splitting it into sentences, deciding which
operands go with which operators and so on.

The parser is nobly assisted by the lexer, which chunks up your input
into tokens, and decides what type of thing each token is: a variable
name, an operator, a bareword, a subroutine, a core function, and so on.
The main point of entry to the lexer is C<yylex>, and that and its
associated routines can be found in F<toke.c>. Perl isn't much like
other computer languages; it's highly context sensitive at times, and it
can be tricky to work out what sort of token something is, or where a
token ends. As such, there's a lot of interplay between the tokeniser and
the parser, which can get pretty frightening if you're not used to it.

As the parser understands a Perl program, it builds up a tree of
operations for the interpreter to perform during execution.
The routines
which construct and link together the various operations are to be found
in F<op.c>, and will be examined later.

=item Optimization

Now the parsing stage is complete, and the finished tree represents
the operations that the Perl interpreter needs to perform to execute our
program. Next, Perl does a dry run over the tree looking for
optimizations: constant expressions such as C<3 + 4> will be computed
now, and the optimizer will also see if any multiple operations can be
replaced with a single one. For instance, to fetch the variable C<$foo>,
instead of grabbing the glob C<*foo> and looking at the scalar
component, the optimizer fiddles the op tree to use a function which
directly looks up the scalar in question. The main optimizer is C<peep>
in F<op.c>, and many ops have their own optimizing functions.

=item Running

Now we're finally ready to go: we have compiled Perl bytecode, and all
that's left to do is run it.
The actual execution is done by the
C<runops_standard> function in F<run.c>; more specifically, it's done by
these three innocent looking lines:

    while ((PL_op = CALL_FPTR(PL_op->op_ppaddr)(aTHX))) {
        PERL_ASYNC_CHECK();
    }

You may be more comfortable with the Perl version of that:

    PERL_ASYNC_CHECK() while $Perl::op = &{$Perl::op->{function}};

Well, maybe not. Anyway, each op contains a function pointer, which
stipulates the function which will actually carry out the operation.
This function will return the next op in the sequence - this allows for
things like C<if> which choose the next op dynamically at run time.
The C<PERL_ASYNC_CHECK> makes sure that things like signals interrupt
execution if required.

The actual functions called are known as PP code, and they're spread
between four files: F<pp_hot.c> contains the `hot' code, which is most
often used and highly optimized, F<pp_sys.c> contains all the
system-specific functions, F<pp_ctl.c> contains the functions which
implement control structures (C<if>, C<while> and the like) and F<pp.c>
contains everything else. These are, if you like, the C code for Perl's
built-in functions and operators.
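
The dispatch loop described above can be modelled in a few lines of
standalone C. This is only a sketch under stated assumptions: C<toy_op>,
C<toy_runops> and the two C<pp_*> functions are hypothetical names made up
for illustration, and the real loop works on C<PL_op> and the interpreter
context rather than on a local pointer. What it does show is the key idea
that each op's function returns the op to run next, so a branch is just a
function that picks between two successors.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not Perl's real OP structures. */
typedef struct toy_op toy_op;
struct toy_op {
    toy_op *(*ppaddr)(toy_op *self); /* function implementing this op */
    toy_op *next;                    /* default successor */
    toy_op *other;                   /* alternative successor, for branches */
};

static int toy_acc = 0;              /* stand-in for interpreter state */

/* Each "PP function" does its work and returns the op to run next. */
static toy_op *pp_incr(toy_op *self)
{
    toy_acc++;
    return self->next;
}

/* A branching op: jump back to 'other' until the accumulator reaches 3. */
static toy_op *pp_loop_until_3(toy_op *self)
{
    return toy_acc < 3 ? self->other : self->next;
}

/* The run loop: keep calling the current op until one returns NULL.
 * Returns the number of ops executed. */
static int toy_runops(toy_op *op)
{
    int executed = 0;
    while (op) {
        assert(op->ppaddr != NULL);
        op = op->ppaddr(op);
        executed++;
    }
    return executed;
}

/* Build "incr; go back to incr while acc < 3" and run it. */
static int toy_demo(void)
{
    toy_op incr, cond;

    incr.ppaddr = pp_incr;
    incr.next   = &cond;
    incr.other  = NULL;

    cond.ppaddr = pp_loop_until_3;
    cond.next   = NULL;              /* falling out of the loop ends the run */
    cond.other  = &incr;

    toy_acc = 0;
    return toy_runops(&incr);
}
```

Calling C<toy_demo()> runs the increment op three times and the branch op
three times before the branch finally returns C<NULL> and the loop stops.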

=back

=head2 Internal Variable Types

You should by now have had a look at L<perlguts>, which tells you about
Perl's internal variable types: SVs, HVs, AVs and the rest. If not, do
that now.

These variables are used not only to represent Perl-space variables, but
also any constants in the code, as well as some structures completely
internal to Perl. The symbol table, for instance, is an ordinary Perl
hash. Your code is represented by an SV as it's read into the parser;
any program files you call are opened via ordinary Perl filehandles, and
so on.

The core L<Devel::Peek|Devel::Peek> module lets us examine SVs from a
Perl program. Let's see, for instance, how Perl treats the constant
C<"hello">.

      % perl -MDevel::Peek -e 'Dump("hello")'
    1 SV = PV(0xa041450) at 0xa04ecbc
    2   REFCNT = 1
    3   FLAGS = (POK,READONLY,pPOK)
    4   PV = 0xa0484e0 "hello"\0
    5   CUR = 5
    6   LEN = 6

Reading C<Devel::Peek> output takes a bit of practice, so let's go
through it line by line.

Line 1 tells us we're looking at an SV which lives at C<0xa04ecbc> in
memory. SVs themselves are very simple structures, but they contain a
pointer to a more complex structure.
In this case, it's a PV, a
structure which holds a string value, at location C<0xa041450>. Line 2
is the reference count; there are no other references to this data, so
it's 1.

Line 3 shows the flags for this SV - it's OK to use it as a PV, it's a
read-only SV (because it's a constant) and the data is a PV internally.
Next we've got the contents of the string, starting at location
C<0xa0484e0>.

Line 5 gives us the current length of the string - note that this does
B<not> include the null terminator. Line 6 is not the length of the
string, but the length of the currently allocated buffer; as the string
grows, Perl automatically extends the available storage via a routine
called C<SvGROW>.

You can get at any of these quantities from C very easily; just add
C<Sv> to the name of the field shown in the snippet, and you've got a
macro which will return the value: C<SvCUR(sv)> returns the current
length of the string, C<SvREFCNT(sv)> returns the reference count,
C<SvPV(sv, len)> returns the string itself with its length, and so on.
More macros to manipulate these properties can be found in L<perlguts>.
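
The relationship between C<CUR> and C<LEN> can be modelled in plain C.
This is a minimal sketch under stated assumptions: C<toy_pv>, C<toy_grow>
and C<toy_append> are hypothetical names invented for illustration, the
struct is not Perl's real SV layout, and the growth policy (allocate
exactly what is asked for) is chosen for simplicity rather than copied
from C<SvGROW>.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: a hypothetical struct, not Perl's real SV layout. */
typedef struct {
    char  *pv;   /* the string buffer */
    size_t cur;  /* bytes in use, excluding the trailing '\0' */
    size_t len;  /* bytes allocated */
} toy_pv;

/* Make sure at least 'want' bytes are allocated, in the spirit of SvGROW.
 * The assert is a sketch-level stand-in for real error handling. */
static void toy_grow(toy_pv *s, size_t want)
{
    if (want > s->len) {
        char *p = realloc(s->pv, want);
        assert(p != NULL);
        s->pv  = p;
        s->len = want;
    }
}

/* Append 'n' bytes from 'src', keeping the buffer NUL-terminated.
 * 'src' must not point into s->pv, since realloc may move the buffer. */
static void toy_append(toy_pv *s, const char *src, size_t n)
{
    toy_grow(s, s->cur + n + 1);     /* old string + new string + '\0' */
    memcpy(s->pv + s->cur, src, n);
    s->cur += n;
    s->pv[s->cur] = '\0';
}
```

Starting from an empty C<toy_pv> and appending the five bytes of C<"hello">
leaves C<cur> at 5 and C<len> at 6, the same pair of numbers shown in the
C<Devel::Peek> dump.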

Let's take an example of manipulating a PV, from C<sv_catpvn>, in F<sv.c>:

     1  void
     2  Perl_sv_catpvn(pTHX_ register SV *sv, register const char *ptr, register STRLEN len)
     3  {
     4      STRLEN tlen;
     5      char *junk;

     6      junk = SvPV_force(sv, tlen);
     7      SvGROW(sv, tlen + len + 1);
     8      if (ptr == junk)
     9          ptr = SvPVX(sv);
    10      Move(ptr,SvPVX(sv)+tlen,len,char);
    11      SvCUR(sv) += len;
    12      *SvEND(sv) = '\0';
    13      (void)SvPOK_only_UTF8(sv);          /* validate pointer */
    14      SvTAINT(sv);
    15  }

This is a function which adds a string, C<ptr>, of length C<len> onto
the end of the PV stored in C<sv>. The first thing we do in line 6 is
make sure that the SV B<has> a valid PV, by calling the C<SvPV_force>
macro to force a PV. As a side effect, C<tlen> gets set to the current
length of the PV, and the PV itself is returned to C<junk>.

In line 7, we make sure that the SV will have enough room to accommodate
the old string, the new string and the null terminator. If C<LEN> isn't
big enough, C<SvGROW> will reallocate space for us.

Now, if C<junk> is the same as the string we're trying to add, we can
grab the string directly from the SV; C<SvPVX> is the address of the PV
in the SV.

Line 10 does the actual catenation: the C<Move> macro moves a chunk of
memory around: we move the string C<ptr> to the end of the PV - that's
the start of the PV plus its current length. We're moving C<len> bytes
of type C<char>. After doing so, we need to tell Perl we've extended the
string, by altering C<CUR> to reflect the new length. C<SvEND> is a
macro which gives us the end of the string, so that needs to be a
C<"\0">.

Line 13 manipulates the flags; since we've changed the PV, any IV or NV
values will no longer be valid: if we have C<$a=10; $a.="6";> we don't
want to use the old IV of 10. C<SvPOK_only_UTF8> is a special UTF-8-aware
version of C<SvPOK_only>, a macro which turns off the IOK and NOK flags
and turns on POK. The final C<SvTAINT> is a macro which launders tainted
data if taint mode is turned on.

AVs and HVs are more complicated, but SVs are by far the most common
variable type being thrown around. Having seen something of how we
manipulate these, let's go on and look at how the op tree is
constructed.

=head2 Op Trees

First, what is the op tree, anyway? The op tree is the parsed
representation of your program, as we saw in our section on parsing, and
it's the sequence of operations that Perl goes through to execute your
program, as we saw in L</Running>.

An op is a fundamental operation that Perl can perform: all the built-in
functions and operators are ops, and there are a series of ops which
deal with concepts the interpreter needs internally - entering and
leaving a block, ending a statement, fetching a variable, and so on.

The op tree is connected in two ways: you can imagine that there are two
"routes" through it, two orders in which you can traverse the tree.
First, parse order reflects how the parser understood the code, and
secondly, execution order tells perl what order to perform the
operations in.

The easiest way to examine the op tree is to stop Perl after it has
finished parsing, and get it to dump out the tree. This is exactly what
the compiler backends L<B::Terse|B::Terse>, L<B::Concise|B::Concise>
and L<B::Debug|B::Debug> do.

Let's have a look at how Perl sees C<$a = $b + $c>:

     % perl -MO=Terse -e '$a=$b+$c'
     1  LISTOP (0x8179888) leave
     2      OP (0x81798b0) enter
     3      COP (0x8179850) nextstate
     4      BINOP (0x8179828) sassign
     5          BINOP (0x8179800) add [1]
     6              UNOP (0x81796e0) null [15]
     7                  SVOP (0x80fafe0) gvsv  GV (0x80fa4cc) *b
     8              UNOP (0x81797e0) null [15]
     9                  SVOP (0x8179700) gvsv  GV (0x80efeb0) *c
    10          UNOP (0x816b4f0) null [15]
    11              SVOP (0x816dcf0) gvsv  GV (0x80fa460) *a

Let's start in the middle, at line 4. This is a BINOP, a binary
operator, which is at location C<0x8179828>. The specific operator in
question is C<sassign> - scalar assignment - and you can find the code
which implements it in the function C<pp_sassign> in F<pp_hot.c>. As a
binary operator, it has two children: the add operator, providing the
result of C<$b+$c>, is uppermost on line 5, and the left hand side is on
line 10.

Line 10 is the null op: this does exactly nothing. What is that doing
there? If you see the null op, it's a sign that something has been
optimized away after parsing. As we mentioned in L</Optimization>,
the optimization stage sometimes converts two operations into one, for
example when fetching a scalar variable.
When this happens, instead of
rewriting the op tree and cleaning up the dangling pointers, it's easier
just to replace the redundant operation with the null op. Originally,
the tree would have looked like this:

    10          SVOP (0x816b4f0) rv2sv [15]
    11              SVOP (0x816dcf0) gv  GV (0x80fa460) *a

That is, fetch the C<a> entry from the main symbol table, and then look
at the scalar component of it: C<gvsv> (C<pp_gvsv> in F<pp_hot.c>)
happens to do both these things.

The right hand side, starting at line 5 is similar to what we've just
seen: we have the C<add> op (C<pp_add> also in F<pp_hot.c>) add together
two C<gvsv>s.

Now, what's this about?

     1  LISTOP (0x8179888) leave
     2      OP (0x81798b0) enter
     3      COP (0x8179850) nextstate

C<enter> and C<leave> are scoping ops, and their job is to perform any
housekeeping every time you enter and leave a block: lexical variables
are tidied up, unreferenced variables are destroyed, and so on. Every
program will have those first three lines: C<leave> is a list, and its
children are all the statements in the block. Statements are delimited
by C<nextstate>, so a block is a collection of C<nextstate> ops, with
the ops to be performed for each statement being the children of
C<nextstate>. C<enter> is a single op which functions as a marker.

That's how Perl parsed the program, from top to bottom:

                        Program
                           |
                       Statement
                           |
                           =
                          / \
                         /   \
                        $a   +
                            / \
                          $b   $c

However, it's impossible to B<perform> the operations in this order:
you have to find the values of C<$b> and C<$c> before you add them
together, for instance. So, the other thread that runs through the op
tree is the execution order: each op has a field C<op_next> which points
to the next op to be run, so following these pointers tells us how perl
executes the code. We can traverse the tree in this order using
the C<exec> option to C<B::Terse>:

     % perl -MO=Terse,exec -e '$a=$b+$c'
     1  OP (0x8179928) enter
     2  COP (0x81798c8) nextstate
     3  SVOP (0x81796c8) gvsv  GV (0x80fa4d4) *b
     4  SVOP (0x8179798) gvsv  GV (0x80efeb0) *c
     5  BINOP (0x8179878) add [1]
     6  SVOP (0x816dd38) gvsv  GV (0x80fa468) *a
     7  BINOP (0x81798a0) sassign
     8  LISTOP (0x8179900) leave

This probably makes more sense for a human: enter a block, start a
statement. Get the values of C<$b> and C<$c>, and add them together.
Find C<$a>, and assign one to the other. Then leave.
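
The idea of one tree with two routes through it can be sketched in
standalone C. The names here (C<toy_op>, C<toy_thread>,
C<toy_exec_order>) are hypothetical and the structure is far simpler than
Perl's real ops; the linking code in F<op.c> is more involved than this.
The sketch assumes the simplest possible rule, that every op runs after
all of its children, which is enough to reproduce the order of the
C<gvsv>, C<add> and C<sassign> lines in the C<exec> listing above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: hypothetical names, much simpler than Perl's OP structs. */
typedef struct toy_op toy_op;
struct toy_op {
    const char *name;
    toy_op *first;    /* first child (parse order) */
    toy_op *sibling;  /* next child of the same parent (parse order) */
    toy_op *next;     /* next op to run (execution order) */
};

/* Thread the subtree rooted at 'o' in execution order: children first,
 * then the op itself. '*tail' is the last op threaded so far (NULL at the
 * start). Returns the first op to execute within this subtree. */
static toy_op *toy_thread(toy_op *o, toy_op **tail)
{
    toy_op *start = NULL;
    toy_op *kid;

    for (kid = o->first; kid; kid = kid->sibling) {
        toy_op *kid_start = toy_thread(kid, tail);
        if (!start)
            start = kid_start;
    }
    if (*tail)
        (*tail)->next = o;           /* previous op now leads to this one */
    *tail = o;
    o->next = NULL;
    return start ? start : o;
}

/* Follow the 'next' pointers and write the op names, separated by
 * spaces, into 'buf' (truncating if 'buf' is too small). */
static void toy_exec_order(toy_op *start, char *buf, size_t size)
{
    toy_op *o;

    assert(size > 0);
    buf[0] = '\0';
    for (o = start; o; o = o->next) {
        if (buf[0])
            strncat(buf, " ", size - strlen(buf) - 1);
        strncat(buf, o->name, size - strlen(buf) - 1);
    }
}
```

To try it, build the tree for C<$a = $b + $c> by hand: an C<sassign> node
whose children are C<add> and the fetch of C<$a>, with C<add> in turn
holding the fetches of C<$b> and C<$c>. Threading from the root and then
following C<next> visits the two fetches, the addition, the fetch of
C<$a> and finally the assignment.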

The way Perl builds up these op trees in the parsing process can be
unravelled by examining F<perly.y>, the YACC grammar. Let's take the
piece we need to construct the tree for C<$a = $b + $c>:

    1 term    :   term ASSIGNOP term
    2                { $$ = newASSIGNOP(OPf_STACKED, $1, $2, $3); }
    3         |   term ADDOP term
    4                { $$ = newBINOP($2, 0, scalar($1), scalar($3)); }

If you're not used to reading BNF grammars, this is how it works: You're
fed certain things by the tokeniser, which generally end up in upper
case. Here, C<ADDOP> is provided when the tokeniser sees C<+> in your
code. C<ASSIGNOP> is provided when C<=> is used for assigning. These are
`terminal symbols', because you can't get any simpler than them.

The grammar, lines one and three of the snippet above, tells you how to
build up more complex forms. These complex forms, `non-terminal symbols',
are generally placed in lower case. C<term> here is a non-terminal
symbol, representing a single expression.

The grammar gives you the following rule: you can make the thing on the
left of the colon if you see all the things on the right in sequence.
This is called a "reduction", and the aim of parsing is to completely
reduce the input.
There are several different ways you can perform a
reduction, separated by vertical bars: so, C<term> followed by C<=>
followed by C<term> makes a C<term>, and C<term> followed by C<+>
followed by C<term> can also make a C<term>.

So, if you see two terms with an C<=> or C<+> between them, you can
turn them into a single expression. When you do this, you execute the
code in the block on the next line: if you see C<=>, you'll do the code
in line 2. If you see C<+>, you'll do the code in line 4. It's this code
which contributes to the op tree.

            |   term ADDOP term
            { $$ = newBINOP($2, 0, scalar($1), scalar($3)); }

What this does is create a new binary op, and feed it a number of
variables. The variables refer to the symbols matched by the rule:
C<$1> is the first, C<$2> the second, and so on - think regular expression
backreferences. C<$$> is the op returned from this reduction. So, we
call C<newBINOP> to create a new binary operator. The first parameter to
C<newBINOP>, a function in F<op.c>, is the op type. It's an addition
operator, so we want the type to be C<ADDOP>. We could specify this
directly, but it's right there as the second token in the input, so we
use C<$2>. The second parameter is the op's flags: 0 means `nothing
special'. Then the things to add: the left and right hand side of our
expression, in scalar context.
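
If the C<$1>, C<$2>, C<$3> and C<$$> business still feels abstract, the
following sketch may help. It is a toy model in plain C, assuming a
simple array as the parser's value stack: C<ToyNode>, C<ToyVal> and the
C<toy_*> functions are made up for this example and are not what
F<perly.y> or F<op.c> really contain. A reduction of C<term ADDOP term>
takes the three values on top of the stack and replaces them with one
new node.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy node standing in for an op; NOT the real OP from op.h. */
typedef struct ToyNode {
    int             type;   /* operator type, or 0 for a leaf   */
    int             flags;  /* 0 means nothing special          */
    const char     *leaf;   /* variable name if this is a leaf  */
    struct ToyNode *left;
    struct ToyNode *right;
} ToyNode;

enum { TOY_ADD = '+' };

/* A slot on the value stack holds either a node or an operator type. */
typedef union {
    ToyNode *node;
    int      optype;
} ToyVal;

/* Make a leaf node; memory is never freed in this short sketch. */
ToyNode *toy_leaf(const char *name)
{
    ToyNode *n = calloc(1, sizeof *n);
    assert(n != NULL);
    n->leaf = name;
    return n;
}

/* Toy counterpart of newBINOP(type, flags, first, last). */
ToyNode *toy_new_binop(int type, int flags, ToyNode *l, ToyNode *r)
{
    ToyNode *n = calloc(1, sizeof *n);
    assert(n != NULL);
    n->type  = type;
    n->flags = flags;
    n->left  = l;
    n->right = r;
    return n;
}

/* Reduce "term ADDOP term": pop $1, $2, $3 and push the result $$. */
void toy_reduce_addop(ToyVal *stack, int *sp)
{
    ToyVal *v;
    ToyVal  result;

    assert(*sp >= 3);
    v = &stack[*sp - 3];                 /* v[0]=$1, v[1]=$2, v[2]=$3 */
    result.node = toy_new_binop(v[1].optype, 0, v[0].node, v[2].node);
    *sp -= 3;
    stack[(*sp)++] = result;             /* this is $$ */
}

/* Shift "$b", "+", "$c" onto the stack, then reduce them to one term. */
ToyNode *toy_parse_b_plus_c(void)
{
    ToyVal stack[8];
    int    sp = 0;

    stack[sp++].node   = toy_leaf("$b");  /* term  */
    stack[sp++].optype = TOY_ADD;         /* ADDOP */
    stack[sp++].node   = toy_leaf("$c");  /* term  */
    toy_reduce_addop(stack, &sp);
    assert(sp == 1);
    return stack[0].node;
}
```

The node returned by C<toy_parse_b_plus_c()> has type C<TOY_ADD> with
the two leaves as its children, which is the shape of the C<+> subtree
in the diagram earlier.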

=head2 Stacks

When perl executes something like C<addop>, how does it pass on its
results to the next op? The answer is, through the use of stacks. Perl
has a number of stacks to store things it's currently working on, and
we'll look at the three most important ones here.

=over 3

=item Argument stack

Arguments are passed to PP code and returned from PP code using the
argument stack, C<ST>. The typical way to handle arguments is to pop
them off the stack, deal with them how you wish, and then push the result
back onto the stack. This is how, for instance, the cosine operator
works:

      NV value;
      value = POPn;
      value = Perl_cos(value);
      XPUSHn(value);

We'll see a more tricky example of this when we consider Perl's macros
below. C<POPn> gives you the NV (floating point value) of the top SV on
the stack: the C<$x> in C<cos($x)>. Then we compute the cosine, and push
the result back as an NV. The C<X> in C<XPUSHn> means that the stack
should be extended if necessary - it can't be necessary here, because we
know there's room for one more item on the stack, since we've just
removed one! The C<XPUSH*> macros at least guarantee safety.

Alternatively, you can fiddle with the stack directly: C<SP> gives you
the first element in your portion of the stack, and C<TOP*> gives you
the top SV/IV/NV/etc. on the stack. So, for instance, to do unary
negation of an integer:

     SETi(-TOPi);

Just set the integer value of the top stack entry to its negation.

Argument stack manipulation in the core is exactly the same as it is in
XSUBs - see L<perlxstut>, L<perlxs> and L<perlguts> for a longer
description of the macros used in stack manipulation.

=item Mark stack

I say `your portion of the stack' above because PP code doesn't
necessarily get the whole stack to itself: if your function calls
another function, you'll only want to expose the arguments aimed for the
called function, and not (necessarily) let it get at your own data. The
way we do this is to have a `virtual' bottom-of-stack, exposed to each
function. The mark stack keeps bookmarks to locations in the argument
stack usable by each function. For instance, when dealing with a tied
variable (internally, something with `P' magic), Perl has to call
methods for accesses to the tied variables. However, we need to separate
the arguments exposed to the method from the arguments exposed to the
original function - the store or fetch or whatever it may be.
Here's how
the tied C<push> is implemented; see C<av_push> in F<av.c>:

     1	PUSHMARK(SP);
     2	EXTEND(SP,2);
     3	PUSHs(SvTIED_obj((SV*)av, mg));
     4	PUSHs(val);
     5	PUTBACK;
     6	ENTER;
     7	call_method("PUSH", G_SCALAR|G_DISCARD);
     8	LEAVE;
     9	POPSTACK;

The lines which concern the mark stack are the first, fifth and last
lines: they save away, restore and remove the current position of the
argument stack.

Let's examine the whole implementation, for practice:

     1	PUSHMARK(SP);

Push the current state of the stack pointer onto the mark stack. This is
so that when we've finished adding items to the argument stack, Perl
knows how many things we've added recently.

     2	EXTEND(SP,2);
     3	PUSHs(SvTIED_obj((SV*)av, mg));
     4	PUSHs(val);

We're going to add two more items onto the argument stack: when you have
a tied array, the C<PUSH> subroutine receives the object and the value
to be pushed, and that's exactly what we have here - the tied object,
retrieved with C<SvTIED_obj>, and the value, the SV C<val>.

     5	PUTBACK;

Next we tell Perl to make the change to the global stack pointer: C<dSP>
only gave us a local copy, not a reference to the global.

     6	ENTER;
     7	call_method("PUSH", G_SCALAR|G_DISCARD);
     8	LEAVE;

C<ENTER> and C<LEAVE> localise a block of code - they make sure that all
variables are tidied up, everything that has been localised gets
its previous value returned, and so on. Think of them as the C<{> and
C<}> of a Perl block.

To actually do the magic method call, we have to call a subroutine in
Perl space: C<call_method> takes care of that, and it's described in
L<perlcall>. We call the C<PUSH> method in scalar context, and we're
going to discard its return value.

     9	POPSTACK;

Finally, we remove the value we placed on the mark stack, since we
don't need it any more.

=item Save stack

C doesn't have a concept of local scope, so perl provides one. We've
seen that C<ENTER> and C<LEAVE> are used as scoping braces; the save
stack implements the C equivalent of, for example:

    {
        local $foo = 42;
        ...
    }

See L<perlguts/Localising Changes> for how to use the save stack.
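
As a rough picture of the idea, here is a toy save stack in plain C.
This is only a sketch under simplifying assumptions: it saves nothing
but C<int> values, and every name in it (C<toy_enter>, C<toy_save_int>,
C<toy_leave>, C<toy_foo>) is invented for the example rather than taken
from the perl source. Entering a scope records how tall the save stack
is, saving a variable pushes its address and old value, and leaving the
scope pops entries back down to the recorded height, restoring each old
value on the way.

```c
#include <assert.h>

#define TOY_SAVE_MAX 64

/* One saved value: where it lives and what it held before. */
typedef struct {
    int *addr;
    int  old;
} ToySave;

ToySave toy_savestack[TOY_SAVE_MAX];
int     toy_save_ix = 0;              /* current height of the save stack */
int     toy_scope[TOY_SAVE_MAX];      /* save stack height at each enter  */
int     toy_scope_ix = 0;

/* Like an opening brace: remember the current save stack height. */
void toy_enter(void)
{
    assert(toy_scope_ix < TOY_SAVE_MAX);
    toy_scope[toy_scope_ix++] = toy_save_ix;
}

/* Record a variable's current value so that toy_leave can restore it. */
void toy_save_int(int *p)
{
    assert(toy_save_ix < TOY_SAVE_MAX);
    toy_savestack[toy_save_ix].addr = p;
    toy_savestack[toy_save_ix].old  = *p;
    toy_save_ix++;
}

/* Like a closing brace: undo everything saved since the matching enter. */
void toy_leave(void)
{
    int base;

    assert(toy_scope_ix > 0);
    base = toy_scope[--toy_scope_ix];
    while (toy_save_ix > base) {
        toy_save_ix--;
        *toy_savestack[toy_save_ix].addr = toy_savestack[toy_save_ix].old;
    }
}

int toy_foo = 1;

/* In spirit: { local $foo = 42; ... }  Returns the value seen inside. */
int toy_demo_local(void)
{
    int seen_inside;

    toy_enter();
    toy_save_int(&toy_foo);
    toy_foo = 42;
    seen_inside = toy_foo;
    toy_leave();               /* toy_foo goes back to its old value */
    return seen_inside;
}
```

Inside the scope the variable reads as 42, and once C<toy_leave()> has
run it is back to whatever it held before. The real save stack handles
many more kinds of saved things than this.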

=back

=head2 Millions of Macros

One thing you'll notice about the Perl source is that it's full of
macros. Some have called the pervasive use of macros the hardest thing
to understand, others find it adds to clarity. Let's take an example,
the code which implements the addition operator:

   1  PP(pp_add)
   2  {
   3      dSP; dATARGET; tryAMAGICbin(add,opASSIGN);
   4      {
   5        dPOPTOPnnrl_ul;
   6        SETn( left + right );
   7        RETURN;
   8      }
   9  }

Every line here (apart from the braces, of course) contains a macro. The
first line sets up the function declaration as Perl expects for PP code;
line 3 sets up variable declarations for the argument stack and the
target, the return value of the operation. Finally, it tries to see if
the addition operation is overloaded; if so, the appropriate subroutine
is called.

Line 5 is another variable declaration - all variable declarations start
with C<d> - which pops from the top of the argument stack two NVs (hence
C<nn>) and puts them into the variables C<right> and C<left>, hence the
C<rl>. These are the two operands to the addition operator. Next, we
call C<SETn> to set the NV of the return value to the result of adding
the two values.
This done, we return - the C<RETURN> macro makes sure
that our return value is properly handled, and we pass the next operator
to run back to the main run loop.

Most of these macros are explained in L<perlapi>, and some of the more
important ones are explained in L<perlxs> as well. Pay special attention
to L<perlguts/Background and PERL_IMPLICIT_CONTEXT> for information on
the C<[pad]THX_?> macros.

=head2 The .i Targets

You can expand the macros in a F<foo.c> file by saying

    make foo.i

which will expand the macros using cpp. Don't be scared by the results.

=head2 Poking at Perl

To really poke around with Perl, you'll probably want to build Perl for
debugging, like this:

    ./Configure -d -D optimize=-g
    make

C<-g> is a flag to the C compiler to have it produce debugging
information which will allow us to step through a running program.
F<Configure> will also turn on the C<DEBUGGING> compilation symbol which
enables all the internal debugging code in Perl. There are a whole bunch
of things you can debug with this: L<perlrun> lists them all, and the
best way to find out about them is to play about with them.
The most
useful options are probably

    l  Context (loop) stack processing
    t  Trace execution
    o  Method and overloading resolution
    c  String/numeric conversions

Some of the functionality of the debugging code can be achieved using XS
modules.

    -Dr => use re 'debug'
    -Dx => use O 'Debug'

=head2 Using a source-level debugger

If the debugging output of C<-D> doesn't help you, it's time to step
through perl's execution with a source-level debugger.

=over 3

=item *

We'll use C<gdb> for our examples here; the principles will apply to any
debugger, but check the manual of the one you're using.

=back

To fire up the debugger, type

    gdb ./perl

You'll want to do that in your Perl source tree so the debugger can read
the source code. You should see the copyright message, followed by the
prompt.

    (gdb)

C<help> will get you into the documentation, but here are the most
useful commands:

=over 3

=item run [args]

Run the program with the given arguments.

=item break function_name

=item break source.c:xxx

Tells the debugger that we'll want to pause execution when we reach
either the named function (but see L<perlguts/Internal Functions>!) or the given
line in the named source file.

=item step

Steps through the program a line at a time.

=item next

Steps through the program a line at a time, without descending into
functions.

=item continue

Run until the next breakpoint.

=item finish

Run until the end of the current function, then stop again.

=item 'enter'

Just pressing Enter will do the most recent operation again - it's a
blessing when stepping through miles of source code.

=item print

Execute the given C code and print its results.
B<WARNING>: Perl makes
heavy use of macros, and F<gdb> does not necessarily support macros
(see later L</"gdb macro support">). You'll have to substitute them
yourself, or invoke cpp on the source code files
(see L</"The .i Targets">).
So, for instance, you can't say

    print SvPV_nolen(sv)

but you have to say

    print Perl_sv_2pv_nolen(sv)

=back

You may find it helpful to have a "macro dictionary", which you can
produce by saying C<cpp -dM perl.c | sort>. Even then, F<cpp> won't
recursively apply those macros for you.

=head2 gdb macro support

Recent versions of F<gdb> have fairly good macro support, but
in order to use it you'll need to compile perl with macro definitions
included in the debugging information. Using F<gcc> version 3.1, this
means configuring with C<-Doptimize=-g3>. Other compilers might use a
different switch (if they support debugging macros at all).

=head2 Dumping Perl Data Structures

One way to get around this macro hell is to use the dumping functions in
F<dump.c>; these work a little like an internal
L<Devel::Peek|Devel::Peek>, but they also cover OPs and other structures
that you can't get at from Perl. Let's take an example.
We'll use the
C<$a = $b + $c> we used before, but give it a bit of context:
C<$b = "6XXXX"; $c = 2.3;>. Where's a good place to stop and poke around?

What about C<pp_add>, the function we examined earlier to implement the
C<+> operator:

    (gdb) break Perl_pp_add
    Breakpoint 1 at 0x46249f: file pp_hot.c, line 309.

Notice we use C<Perl_pp_add> and not C<pp_add> - see L<perlguts/Internal Functions>.
With the breakpoint in place, we can run our program:

    (gdb) run -e '$b = "6XXXX"; $c = 2.3; $a = $b + $c'

Lots of junk will go past as gdb reads in the relevant source files and
libraries, and then:

    Breakpoint 1, Perl_pp_add () at pp_hot.c:309
    309         dSP; dATARGET; tryAMAGICbin(add,opASSIGN);
    (gdb) step
    311           dPOPTOPnnrl_ul;
    (gdb)

We looked at this bit of code before, and we said that C<dPOPTOPnnrl_ul>
arranges for two C<NV>s to be placed into C<left> and C<right> - let's
slightly expand it:

    #define dPOPTOPnnrl_ul  NV right = POPn; \
                            SV *leftsv = TOPs; \
                            NV left = USE_LEFT(leftsv) ? SvNV(leftsv) : 0.0

C<POPn> takes the SV from the top of the stack and obtains its NV either
directly (if C<SvNOK> is set) or by calling the C<sv_2nv> function.
C<TOPs> takes the next SV from the top of the stack - yes, C<POPn> uses
C<TOPs> - but doesn't remove it. We then use C<SvNV> to get the NV from
C<leftsv> in the same way as before - yes, C<POPn> uses C<SvNV>.

Since we don't have an NV for C<$b>, we'll have to use C<sv_2nv> to
convert it. If we step again, we'll find ourselves there:

    Perl_sv_2nv (sv=0xa0675d0) at sv.c:1669
    1669        if (!sv)
    (gdb)

We can now use C<Perl_sv_dump> to investigate the SV:

    SV = PV(0xa057cc0) at 0xa0675d0
    REFCNT = 1
    FLAGS = (POK,pPOK)
    PV = 0xa06a510 "6XXXX"\0
    CUR = 5
    LEN = 6
    $1 = void

We know we're going to get C<6> from this, so let's finish the
subroutine:

    (gdb) finish
    Run till exit from #0  Perl_sv_2nv (sv=0xa0675d0) at sv.c:1671
    0x462669 in Perl_pp_add () at pp_hot.c:311
    311           dPOPTOPnnrl_ul;

We can also dump out this op: the current op is always stored in
C<PL_op>, and we can dump it with C<Perl_op_dump>. This'll give us
similar output to L<B::Debug|B::Debug>.

    {
    13  TYPE = add  ===> 14
        TARG = 1
        FLAGS = (SCALAR,KIDS)
        {
            TYPE = null  ===> (12)
              (was rv2sv)
            FLAGS = (SCALAR,KIDS)
            {
    11          TYPE = gvsv  ===> 12
                FLAGS = (SCALAR)
                GV = main::b
            }
        }

# finish this later #

=head2 Patching

All right, we've now had a look at how to navigate the Perl sources and
some things you'll need to know when fiddling with them. Let's now get
on and create a simple patch. Here's something Larry suggested: if a
C<U> is the first active format during a C<pack>, (for example,
C<pack "U3C8", @stuff>) then the resulting string should be treated as
UTF-8 encoded.

How do we prepare to fix this up? First we locate the code in question -
the C<pack> happens at runtime, so it's going to be in one of the F<pp>
files. Sure enough, C<pp_pack> is in F<pp.c>. Since we're going to be
altering this file, let's copy it to F<pp.c~>.

[Well, it was in F<pp.c> when this tutorial was written.
It has now been
split off with C<pp_unpack> to its own file, F<pp_pack.c>]

Now let's look over C<pp_pack>: we take a pattern into C<pat>, and then
loop over the pattern, taking each format character in turn into
C<datumtype>. Then for each possible format character, we swallow up
the other arguments in the pattern (a field width, an asterisk, and so
on) and convert the next chunk of input into the specified format, adding
it onto the output SV C<cat>.

How do we know if the C<U> is the first format in the C<pat>? Well, if
we have a pointer to the start of C<pat> then, if we see a C<U> we can
test whether we're still at the start of the string. So, here's where
C<pat> is set up:

    STRLEN fromlen;
    register char *pat = SvPVx(*++MARK, fromlen);
    register char *patend = pat + fromlen;
    register I32 len;
    I32 datumtype;
    SV *fromstr;

We'll have another string pointer in there:

    STRLEN fromlen;
    register char *pat = SvPVx(*++MARK, fromlen);
    register char *patend = pat + fromlen;
 +  char *patcopy;
    register I32 len;
    I32 datumtype;
    SV *fromstr;

And just before we start the loop, we'll set C<patcopy> to be the start
of C<pat>:

    items = SP - MARK;
    MARK++;
    sv_setpvn(cat, "", 0);
 +  patcopy = pat;
    while (pat < patend) {

Now if we see a C<U> which was at the start of the string, we turn on
the C<UTF8> flag for the output SV, C<cat>:

 +  if (datumtype == 'U' && pat==patcopy+1)
 +      SvUTF8_on(cat);
    if (datumtype == '#') {
        while (pat < patend && *pat != '\n')
            pat++;

Remember that it has to be C<patcopy+1> because the first character of
the string is the C<U> which has been swallowed into C<datumtype>!

Oops, we forgot one thing: what if there are spaces at the start of the
pattern? C<pack("  U*", @stuff)> will have C<U> as the first active
character, even though it's not the first thing in the pattern. In this
case, we have to advance C<patcopy> along with C<pat> when we see spaces:

    if (isSPACE(datumtype))
        continue;

needs to become

    if (isSPACE(datumtype)) {
        patcopy++;
        continue;
    }

OK. That's the C part done. Now we must do two additional things before
this patch is ready to go: we've changed the behaviour of Perl, and so
we must document that change.
We must also provide some more regression
tests to make sure our patch works and doesn't create a bug somewhere
else along the line.

The regression tests for each operator live in F<t/op/>, and so we
make a copy of F<t/op/pack.t> to F<t/op/pack.t~>. Now we can add our
tests to the end. First, we'll test that the C<U> does indeed create
Unicode strings.

t/op/pack.t has a sensible ok() function, but if it didn't we could
use the one from t/test.pl.

 require './test.pl';
 plan( tests => 159 );

so instead of this:

 print 'not ' unless "1.20.300.4000" eq sprintf "%vd", pack("U*",1,20,300,4000);
 print "ok $test\n"; $test++;

we can write the more sensible (see L<Test::More> for a full
explanation of is() and other testing functions).

 is( "1.20.300.4000", sprintf "%vd", pack("U*",1,20,300,4000),
                                       "U* produces unicode" );

Now we'll test that we got that space-at-the-beginning business right:

 is( "1.20.300.4000", sprintf "%vd", pack("  U*",1,20,300,4000),
                                       "  with spaces at the beginning" );

And finally we'll test that we don't make Unicode strings if C<U> is B<not>
the first active format:

 isnt( v1.20.300.4000, sprintf "%vd", pack("C0U*",1,20,300,4000),
                                       "U* not first isn't unicode" );

Mustn't forget to change the number of tests which appears at the top,
or else the automated tester will get confused. This will either look
like this:

 print "1..156\n";

or this:

 plan( tests => 156 );

We now compile up Perl, and run it through the test suite. Our new
tests pass, hooray!

Finally, the documentation. The job is never done until the paperwork is
over, so let's describe the change we've just made.
The relevant place
is F<pod/perlfunc.pod>; again, we make a copy, and then we'll insert
this text in the description of C<pack>:

 =item *

 If the pattern begins with a C<U>, the resulting string will be treated
 as UTF-8-encoded Unicode. You can force UTF-8 encoding on in a string
 with an initial C<U0>, and the bytes that follow will be interpreted as
 Unicode characters. If you don't want this to happen, you can begin your
 pattern with C<C0> (or anything else) to force Perl not to UTF-8 encode your
 string, and then follow this with a C<U*> somewhere in your pattern.

All done. Now let's create the patch. F<Porting/patching.pod> tells us
that if we're making major changes, we should copy the entire directory
to somewhere safe before we begin fiddling, and then do

    diff -ruN old new > patch

However, we know which files we've changed, and we can simply do this:

    diff -u pp.c~             pp.c             >  patch
    diff -u t/op/pack.t~      t/op/pack.t      >> patch
    diff -u pod/perlfunc.pod~ pod/perlfunc.pod >> patch

We end up with a patch looking a little like this:

    --- pp.c~       Fri Jun 02 04:34:10 2000
    +++ pp.c        Fri Jun 16 11:37:25 2000
    @@ -4375,6 +4375,7 @@
         register I32 items;
         STRLEN fromlen;
         register char *pat = SvPVx(*++MARK, fromlen);
    +    char *patcopy;
         register char *patend = pat + fromlen;
         register I32 len;
         I32 datumtype;
    @@ -4405,6 +4406,7 @@
    ...

And finally, we submit it, with our rationale, to perl5-porters. Job
done!

=head2 Patching a core module

This works just like patching anything else, with an extra
consideration. Many core modules also live on CPAN. If this is so,
patch the CPAN version instead of the core and send the patch off to
the module maintainer (with a copy to p5p). This will help the module
maintainer keep the CPAN version in sync with the core version without
constantly scanning p5p.

=head2 Adding a new function to the core

If, as part of a patch to fix a bug, or just because you have an
especially good idea, you decide to add a new function to the core,
discuss your ideas on p5p well before you start work. It may be that
someone else has already attempted to do what you are considering and
can give lots of good advice or even provide you with bits of code
that they already started (but never finished).

You have to follow all of the advice given above for patching.
It is
extremely important to test any addition thoroughly and add new tests
to explore all boundary conditions that your new function is expected
to handle. If your new function is used only by one module (e.g. toke),
then it should probably be named S_your_function (for static); on the
other hand, if you expect it to be accessible from other functions in
Perl, you should name it Perl_your_function. See L<perlguts/Internal Functions>
for more details.

The location of any new code is also an important consideration. Don't
just create a new top level .c file and put your code there; you would
have to make changes to Configure (so the Makefile is created properly),
as well as possibly lots of include files. This is strictly pumpking
business.

It is better to add your function to one of the existing top level
source code files, but your choice is complicated by the nature of
the Perl distribution. Only the files that are marked as compiled
static are located in the perl executable. Everything else is located
in the shared library (or DLL if you are running under WIN32). So,
for example, if a function is only used by functions located in
toke.c, then your code can go in toke.c. If, however, you want to call
the function from universal.c, then you should put your code in another
location, for example util.c.

In addition to writing your C code, you will need to create an
appropriate entry in embed.pl describing your function, then run
'make regen_headers' to create the entries in the numerous header
files that perl needs to compile correctly. See L<perlguts/Internal Functions>
for information on the various options that you can set in embed.pl.
You will forget to do this a few (or many) times and you will get
warnings during the compilation phase. Make sure that you mention
this when you post your patch to P5P; the pumpking needs to know this.

When you write your new code, please be conscious of existing code
conventions used in the perl source files. See L<perlstyle> for
details. Although most of the guidelines discussed seem to focus on
Perl code, rather than C, they all apply (except when they don't ;).
See also the F<Porting/patching.pod> file in the Perl source distribution
for lots of details about both formatting and submitting patches of
your changes.

Lastly, TEST TEST TEST TEST TEST any code before posting to p5p.
Test on as many platforms as you can find. Test as many perl
Configure options as you can (e.g. MULTIPLICITY). If you have
profiling or memory tools, see L<EXTERNAL TOOLS FOR DEBUGGING PERL>
below for how to use them to further test your code. Remember that
most of the people on P5P are doing this on their own time and
don't have the time to debug your code.
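
Trying several Configure options by hand gets tedious, so it can help
to write the combinations down once. The following is only a sketch and
is not part of the Perl distribution: it is a dry run that prints one
command line per configuration and builds nothing. The C<plan_builds>
name and the choice of configurations are assumptions made for the
illustration; C<-Dusemultiplicity> is the MULTIPLICITY option mentioned
above, C<-Dusethreads> is another Configure switch, and C<-des> accepts
the defaults without prompting.

```shell
# Dry run: print one Configure/build/test command line per configuration.
# The list of configurations is an illustrative assumption, not a
# recommendation from the Perl sources.
plan_builds() {
    for opts in "" "-Dusemultiplicity" "-Dusethreads"; do
        # tr -s squeezes the double space left behind when $opts is empty.
        echo "sh Configure -des -Doptimize=-g $opts && make && make test" | tr -s ' '
    done
}

plan=$(plan_builds)
echo "$plan"
```

Each printed line would be run from a clean copy of the source tree, so
that one configuration cannot leak object files into the next.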

=head2 Writing a test

Every module and built-in function has an associated test file (or
should...). If you add or change functionality, you have to write a
test. If you fix a bug, you have to write a test so that bug never
comes back. If you alter the docs, it would be nice to test what the
new documentation says.

In short, if you submit a patch you probably also have to patch the
tests.

For modules, the test file is right next to the module itself.
F<lib/strict.t> tests F<lib/strict.pm>. This is a recent innovation,
so there are some snags (and it would be wonderful for you to iron
them out), but it basically works that way. Everything else lives in
F<t/>.

=over 3

=item F<t/base/>

Testing of the absolute basic functionality of Perl. Things like
C<if>, basic file reads and writes, simple regexes, etc. These are
run first in the test suite and if any of them fail, something is
I<really> broken.

=item F<t/cmd/>

These test the basic control structures, C<if/else>, C<while>,
subroutines, etc.

=item F<t/comp/>

Tests basic issues of how Perl parses and compiles itself.

=item F<t/io/>

Tests for built-in IO functions, including command line arguments.

=item F<t/lib/>

The old home for the module tests; you shouldn't put anything new in
here. There are still some bits and pieces hanging around in here
that need to be moved. Perhaps you could move them? Thanks!

=item F<t/op/>

Tests for perl's built-in functions that don't fit into any of the
other directories.

=item F<t/pod/>

Tests for POD directives. There are still some tests for the Pod
modules hanging around in here that need to be moved out into F<lib/>.

=item F<t/run/>

Testing features of how perl actually runs, including exit codes and
handling of PERL* environment variables.

=item F<t/uni/>

Tests for the core support of Unicode.

=item F<t/win32/>

Windows-specific tests.

=item F<t/x2p>

A test suite for the s2p converter.

=back

The core uses the same testing style as the rest of Perl, a simple
"ok/not ok" run through Test::Harness, but there are a few special
considerations.

There are three ways to write a test in the core: Test::More,
t/test.pl and ad hoc C<print $test ? "ok 42\n" : "not ok 42\n">. The
decision of which to use depends on what part of the test suite you're
working on. This is a measure to prevent a high-level failure (such
as Config.pm breaking) from causing basic functionality tests to fail.

=over 4

=item t/base t/comp

Since we don't know if require works, or even subroutines, use ad hoc
tests for these two. Step carefully to avoid using the feature being
tested.

=item t/cmd t/run t/io t/op

Now that basic require() and subroutines are tested, you can use the
t/test.pl library which emulates the important features of Test::More
while using a minimum of core features.

You can also conditionally use certain libraries like Config, but be
sure to skip the test gracefully if it's not there.

=item t/lib ext lib

Now that the core of Perl is tested, Test::More can be used. You can
also use the full suite of core modules in the tests.

=back

When you say "make test" Perl uses the F<t/TEST> program to run the
test suite. All tests are run from the F<t/> directory, B<not> the
directory which contains the test. This causes some problems with the
tests in F<lib/>, so here's some opportunity for some patching.

You must be triply conscious of cross-platform concerns. This usually
boils down to using File::Spec and avoiding things like C<fork()> and
C<system()> unless absolutely necessary.

=head2 Special Make Test Targets

There are various special make targets that can be used to test Perl
slightly differently than the standard "test" target. Not all of them
are expected to give a 100% success rate. Many of them have several
aliases.

=over 4

=item coretest

Run F<perl> on all core tests (F<t/*> and F<lib/[a-z]*> pragma tests).

=item test.deparse

Run all the tests through B::Deparse. Not all tests will succeed.

=item test.taintwarn

Run all tests with the B<-t> command-line switch. Not all tests
are expected to succeed (until they're specifically fixed, of course).

=item minitest

Run F<miniperl> on F<t/base>, F<t/comp>, F<t/cmd>, F<t/run>, F<t/io>,
F<t/op>, and F<t/uni> tests.

=item test.valgrind check.valgrind utest.valgrind ucheck.valgrind

(Only in Linux) Run all the tests using the memory leak + naughty
memory access tool "valgrind". The log files will be named
F<testname.valgrind>.

=item test.third check.third utest.third ucheck.third

(Only in Tru64) Run all the tests using the memory leak + naughty
memory access tool "Third Degree". The log files will be named
F<perl3.log.testname>.

=item test.torture torturetest

Run all the usual tests and some extra tests. As of Perl 5.8.0 the
only extra tests are Abigail's JAPHs, F<t/japh/abigail.t>.

You can also run the torture test with F<t/harness> by giving
the C<-torture> argument to F<t/harness>.

=item utest ucheck test.utf8 check.utf8

Run all the tests with -Mutf8. Not all tests will succeed.

=item test_harness

Run the test suite with the F<t/harness> controlling program, instead of
F<t/TEST>.
F<t/harness> is more sophisticated, and uses the
L<Test::Harness> module, thus using this test target supposes that perl
mostly works. The main advantage for our purposes is that it prints a
detailed summary of failed tests at the end. Also, unlike F<t/TEST>, it
doesn't redirect stderr to stdout.

=back

=head2 Running tests by hand

You can run part of the test suite by hand by using one of the following
commands from the F<t/> directory:

    ./perl -I../lib TEST list-of-.t-files

or

    ./perl -I../lib harness list-of-.t-files

(If you don't specify test scripts, the whole test suite will be run.)

You can run an individual test by a command similar to

    ./perl -I../lib path/to/foo.t

except that the harnesses set up some environment variables that may
affect the execution of the test:

=over 4

=item PERL_CORE=1

indicates that we're running this test as part of the perl core test suite.
This is useful for modules that have a dual life on CPAN.

=item PERL_DESTRUCT_LEVEL=2

is set to 2 if it isn't set already (see L</PERL_DESTRUCT_LEVEL>)

=item PERL

(used only by F<t/TEST>) if set, overrides the path to the perl executable
that should be used to run the tests (the default being F<./perl>).

=item PERL_SKIP_TTY_TEST

if set, tells the test suite to skip the tests that need a terminal. It's
actually set automatically by the Makefile, but can also be forced
artificially by running 'make test_notty'.

=back

=head1 EXTERNAL TOOLS FOR DEBUGGING PERL

Sometimes it helps to use external tools while debugging and
testing Perl. This section tries to guide you through using
some common testing and debugging tools with Perl. This is
meant as a guide to interfacing these tools with Perl, not
as any kind of guide to the use of the tools themselves.

B<NOTE 1>: Running under memory debuggers such as Purify, valgrind, or
Third Degree greatly slows down the execution: seconds become minutes,
minutes become hours. For example as of Perl 5.8.1, the
ext/Encode/t/Unicode.t takes extraordinarily long to complete under
e.g. Purify, Third Degree, and valgrind.
Under valgrind it takes more
than six hours, even on a snappy computer; the test must be
doing something that is quite unfriendly for memory debuggers. If you
don't feel like waiting, you can simply kill the perl
process.

B<NOTE 2>: To minimize the number of memory leak false alarms (see
L</PERL_DESTRUCT_LEVEL> for more information), the environment
variable PERL_DESTRUCT_LEVEL must be set to 2. The F<TEST>
and harness scripts do that automatically. But if you are running
some of the tests manually, you have to set it yourself. For csh-like
shells:

    setenv PERL_DESTRUCT_LEVEL 2

and for Bourne-type shells:

    PERL_DESTRUCT_LEVEL=2
    export PERL_DESTRUCT_LEVEL

or in UNIXy environments you can also use the C<env> command:

    env PERL_DESTRUCT_LEVEL=2 valgrind ./perl -Ilib ...

B<NOTE 3>: There are known memory leaks when there are compile-time
errors within eval or require; seeing C<S_doeval> in the call stack
is a good sign of these. Fixing these leaks is non-trivial,
unfortunately, but they must be fixed eventually.

=head2 Rational Software's Purify

Purify is a commercial tool that is helpful in identifying
memory overruns, wild pointers, memory leaks and other such
badness.
Perl must be compiled in a specific way for
optimal testing with Purify. Purify is available under
Windows NT, Solaris, HP-UX, SGI, and Siemens Unix.

=head2 Purify on Unix

On Unix, Purify creates a new Perl binary. To get the most
benefit out of Purify, you should create the perl to Purify
using:

    sh Configure -Accflags=-DPURIFY -Doptimize='-g' \
     -Uusemymalloc -Dusemultiplicity

where these arguments mean:

=over 4

=item -Accflags=-DPURIFY

Disables Perl's arena memory allocation functions, as well as
forcing use of memory allocation functions derived from the
system malloc.

=item -Doptimize='-g'

Adds debugging information so that you see the exact source
statements where the problem occurs. Without this flag, all
you will see is the source filename of where the error occurred.

=item -Uusemymalloc

Disables Perl's malloc so that Purify can more closely monitor
allocations and leaks. Using Perl's malloc will make Purify
report most leaks in the "potential" leaks category.

=item -Dusemultiplicity

Enabling the multiplicity option allows perl to clean up
thoroughly when the interpreter shuts down, which reduces the
number of bogus leak reports from Purify.

=back

Once you've compiled a perl suitable for Purify'ing, then you
can just run:

    make pureperl

which creates a binary named 'pureperl' that has been Purify'ed.
This binary is used in place of the standard 'perl' binary
when you want to debug Perl memory problems.

As an example, to show any memory leaks produced during the
standard Perl testset you would create and run the Purify'ed
perl as:

    make pureperl
    cd t
    ../pureperl -I../lib harness

which would run the test suite under the Purify'ed perl and report
any memory problems.

Purify outputs messages in "Viewer" windows by default.
If
you don't have a windowing environment or if you simply
want the Purify output to unobtrusively go to a log file
instead of to the interactive window, use the following
options to output to the log file "perl.log":

    setenv PURIFYOPTIONS "-chain-length=25 -windows=no \
     -log-file=perl.log -append-logfile=yes"

If you plan to use the "Viewer" windows, then you only need this option:

    setenv PURIFYOPTIONS "-chain-length=25"

In Bourne-type shells:

    PURIFYOPTIONS="..."
    export PURIFYOPTIONS

or if you have the "env" utility:

    env PURIFYOPTIONS="..." ../pureperl ...

=head2 Purify on NT

Purify on Windows NT instruments the Perl binary 'perl.exe'
on the fly.
There are several options in the makefile you
should change to get the most use out of Purify:

=over 4

=item DEFINES

You should add -DPURIFY to the DEFINES line so the DEFINES
line looks something like:

    DEFINES = -DWIN32 -D_CONSOLE -DNO_STRICT $(CRYPT_FLAG) -DPURIFY=1

to disable Perl's arena memory allocation functions, as
well as to force use of memory allocation functions derived
from the system malloc.

=item USE_MULTI = define

Enabling the multiplicity option allows perl to clean up
thoroughly when the interpreter shuts down, which reduces the
number of bogus leak reports from Purify.

=item #PERL_MALLOC = define

Disables Perl's malloc so that Purify can more closely monitor
allocations and leaks. Using Perl's malloc will make Purify
report most leaks in the "potential" leaks category.

=item CFG = Debug

Adds debugging information so that you see the exact source
statements where the problem occurs. Without this flag, all
you will see is the source filename of where the error occurred.

=back

As an example, to show any memory leaks produced during the
standard Perl testset you would create and run Purify as:

    cd win32
    make
    cd ../t
    purify ../perl -I../lib harness

which would instrument Perl in memory, run the test suite,
then finally report any memory problems.

=head2 valgrind

The excellent valgrind tool can be used to find both memory leaks
and illegal memory accesses. As of August 2003 it unfortunately works
only on x86 (ELF) Linux. The special "test.valgrind" target can be used
to run the tests under valgrind. Found errors and memory leaks are
logged in files named F<testname.valgrind>.

As system libraries (most notably glibc) also trigger errors,
valgrind lets you suppress such errors using suppression files. The
default suppression file that comes with valgrind already catches a lot
of them. Some additional suppressions are defined in F<t/perl.supp>.

To get valgrind and for more information see

    http://developer.kde.org/~sewardj/

=head2 Compaq's/Digital's/HP's Third Degree

Third Degree is a tool for memory leak detection and memory access checks.
It is one of the many tools in the ATOM toolkit.
The toolkit is only
available on Tru64 (formerly known as Digital UNIX, formerly known as
DEC OSF/1).

When building Perl, you must first run Configure with the -Doptimize=-g
and -Uusemymalloc flags; after that you can use the make targets
"perl.third" and "test.third". (What is required is that Perl must be
compiled using the C<-g> flag; you may need to re-Configure.)

The short story is that with "atom" you can instrument the Perl
executable to create a new executable called F<perl.third>. When the
instrumented executable is run, it creates a log of dubious memory
traffic in a file called F<perl.3log>. See the manual pages of atom and
third for more information. The most extensive Third Degree
documentation is available in the Compaq "Tru64 UNIX Programmer's
Guide", chapter "Debugging Programs with Third Degree".

The "test.third" target leaves a lot of files named F<foo_bar.3log> in
the t/ subdirectory. There is a problem with these files: Third Degree is
so effective that it also finds problems in the system libraries.
Therefore you should use the Porting/thirdclean script to clean up
the F<*.3log> files.

There are also leaks that, for a certain definition of a leak,
aren't. See L</PERL_DESTRUCT_LEVEL> for more information.

=head2 PERL_DESTRUCT_LEVEL

If you want to run any of the tests yourself manually using e.g.
valgrind, or the pureperl or perl.third executables, please note that
by default perl B<does not> explicitly clean up all the memory it has
allocated (such as global memory arenas) but instead lets the exit()
of the whole program "take care" of such allocations, also known as
"global destruction of objects".

There is a way to tell perl to do complete cleanup: set the
environment variable PERL_DESTRUCT_LEVEL to a non-zero value.
The t/TEST wrapper does set this to 2, and this is what you
need to do too, if you don't want to see the "global leaks".
For example, for "third-degreed" Perl:

    env PERL_DESTRUCT_LEVEL=2 ./perl.third -Ilib t/foo/bar.t

(Note: the mod_perl apache module also uses this environment variable
for its own purposes and has extended its semantics. Refer to the mod_perl
documentation for more information. Also, spawned threads do the
equivalent of setting this variable to the value 1.)
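
The C<env> form above sets the variable for that one command only; the
invoking shell's own environment is left alone. The following is a
minimal sketch of that scoping and is not taken from the Perl
distribution: C<sh -c> stands in for the instrumented perl binary so
that the snippet runs anywhere, and the C<inner> and C<outer> names are
made up for the illustration.

```shell
# Start from a known state so the comparison below is meaningful.
unset PERL_DESTRUCT_LEVEL

# The child process started by env sees the variable ...
inner=$(env PERL_DESTRUCT_LEVEL=2 sh -c 'printf %s "${PERL_DESTRUCT_LEVEL-unset}"')

# ... but the calling shell itself does not.
outer=${PERL_DESTRUCT_LEVEL-unset}

echo "inner=$inner outer=$outer"    # prints: inner=2 outer=unset
```

Use the C<export> (or C<setenv>) form shown earlier instead when every
command in the session should run with full cleanup.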

If, at the end of a run, you get the message I<N scalars leaked>, you
can recompile with C<-DDEBUG_LEAKING_SCALARS>, which will cause
the addresses of all those leaked SVs to be dumped; it also converts
C<new_SV()> from a macro into a real function, so you can use your
favourite debugger to discover where those pesky SVs were allocated.

=head2 Profiling

Depending on your platform there are various ways of profiling Perl.

There are two commonly used techniques of profiling executables:
I<statistical time-sampling> and I<basic-block counting>.

The first method periodically takes samples of the CPU program
counter, and since the program counter can be correlated with the code
generated for functions, we get a statistical view of which functions
the program is spending its time in.  The caveats are that very
small/fast functions have a lower probability of showing up in the
profile, and that periodically interrupting the program (this is
usually done rather frequently, on the scale of milliseconds) imposes
an additional overhead that may skew the results.  The first problem
can be alleviated by running the code for longer (in general this is a
good idea for profiling); the second problem is usually kept in check
by the profiling tools themselves.

The second method divides up the generated code into I<basic blocks>.
Basic blocks are sections of code that are entered only at the
beginning and exited only at the end.  For example, the target of a
conditional jump starts a new basic block.  Basic block profiling
usually works by I<instrumenting> the code, adding I<enter basic block
#nnnn> book-keeping code to the generated code.  During the execution
of the code the basic block counters are then updated appropriately.
The caveat is that the added extra code can skew the results: again,
the profiling tools usually try to factor their own effects out of the
results.

=head2 Gprof Profiling

gprof is a profiling tool available on many UNIX platforms;
it uses I<statistical time-sampling>.

You can build a profiled version of perl called "perl.gprof" by
invoking the make target "perl.gprof".  (What is required is that Perl
must be compiled using the C<-pg> flag; you may need to re-Configure.)
Running the profiled version of Perl will create an output file called
F<gmon.out>, which contains the profiling data collected
during the execution.

The gprof tool can then display the collected data in various ways.
Usually gprof understands the following options:

=over 4

=item -a

Suppress statically defined functions from the profile.

=item -b

Suppress the verbose descriptions in the profile.

=item -e routine

Exclude the given routine and its descendants from the profile.

=item -f routine

Display only the given routine and its descendants in the profile.

=item -s

Generate a summary file called F<gmon.sum> which then may be given
to subsequent gprof runs to accumulate data over several runs.

=item -z

Display routines that have zero usage.

=back

For a more detailed explanation of the available commands and output
formats, see your own local documentation of gprof.

=head2 GCC gcov Profiling

Starting from GCC 3.0 I<basic block profiling> is officially available
for the GNU CC.

You can build a profiled version of perl called F<perl.gcov> by
invoking the make target "perl.gcov".  (What is required is that Perl
must be compiled using gcc with the flags C<-fprofile-arcs
-ftest-coverage>; you may need to re-Configure.)

Running the profiled version of Perl will cause profile output to be
generated.
For each source file an accompanying ".da" file will be
created.

To display the results you use the "gcov" utility (which should
be installed if you have gcc 3.0 or newer installed).  F<gcov> is
run on source code files, like this:

    gcov sv.c

which will cause F<sv.c.gcov> to be created.  The F<.gcov> files
contain the source code annotated with relative frequencies of
execution indicated by "#" markers.

Useful options of F<gcov> include C<-b>, which will summarise the
basic block, branch, and function call coverage, and C<-c>, which
will use the actual counts instead of relative frequencies.  For
more information on the use of F<gcov> and basic block profiling
with gcc, see the latest GNU CC manual; as of GCC 3.0, see

    http://gcc.gnu.org/onlinedocs/gcc-3.0/gcc.html

and its section titled "8. gcov: a Test Coverage Program"

    http://gcc.gnu.org/onlinedocs/gcc-3.0/gcc_8.html#SEC132

=head2 Pixie Profiling

Pixie is a profiling tool available on IRIX and Tru64 (aka Digital
UNIX aka DEC OSF/1) platforms.  Pixie does its profiling using
I<basic-block counting>.

You can build a profiled version of perl called F<perl.pixie> by
invoking the make target "perl.pixie".  (What is required is that Perl
must be compiled using the C<-g> flag; you may need to re-Configure.)

In Tru64 a file called F<perl.Addrs> will also be silently created;
this file contains the addresses of the basic blocks.  Running the
profiled version of Perl will create a new file called "perl.Counts"
which contains the counts for the basic blocks for that particular
program execution.

To display the results you use the F<prof> utility.  The exact
incantation depends on your operating system: "prof perl.Counts" in
IRIX, and "prof -pixie -all -L. perl" in Tru64.

In IRIX the following prof options are available:

=over 4

=item -h

Reports the most heavily used lines in descending order of use.
Useful for finding the hotspot lines.

=item -l

Groups lines by procedure, with procedures sorted in descending order
of use.  Within a procedure, lines are listed in source order.
Useful for finding the hotspots of procedures.

=back

In Tru64 the following options are available:

=over 4

=item -p[rocedures]

Procedures sorted in descending order by the number of cycles executed
in each procedure.  Useful for finding the hotspot procedures.
(This is the default option.)

=item -h[eavy]

Lines sorted in descending order by the number of cycles executed in
each line.  Useful for finding the hotspot lines.

=item -i[nvocations]

The called procedures are sorted in descending order by number of calls
made to the procedures.  Useful for finding the most used procedures.

=item -l[ines]

Grouped by procedure, sorted by cycles executed per procedure.
Useful for finding the hotspots of procedures.

=item -testcoverage

The compiler emitted code for these lines, but the code was unexecuted.

=item -z[ero]

Unexecuted procedures.

=back

For further information, see your system's manual pages for pixie and prof.
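
Putting the pieces of this section together, a session might look
roughly like the following.  This is only a sketch: the F<t/foo/bar.t>
test path is a placeholder, and the two F<prof> invocations are the
ones quoted above for each operating system.

    # build the instrumented executable
    make perl.pixie

    # run some code with it; this writes perl.Counts
    ./perl.pixie -Ilib t/foo/bar.t

    # display the results on IRIX
    prof perl.Counts

    # display the results on Tru64
    prof -pixie -all -L. perl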

=head2 Miscellaneous tricks

=over 4

=item *

Those debugging perl with the DDD frontend over gdb may find the
following useful:

You can extend the data conversion shortcuts menu, so for example you
can display an SV's IV value with one click, without doing any typing.
To do that, simply edit the ~/.ddd/init file and add after:

    ! Display shortcuts.
    Ddd*gdbDisplayShortcuts: \
    /t ()   // Convert to Bin\n\
    /d ()   // Convert to Dec\n\
    /x ()   // Convert to Hex\n\
    /o ()   // Convert to Oct\n\

the following two lines:

    ((XPV*) (())->sv_any )->xpv_pv  // 2pvx\n\
    ((XPVIV*) (())->sv_any )->xiv_iv // 2ivx

so now you can do ivx and pvx lookups, or you can plug in the
sv_peek "conversion" there:

    Perl_sv_peek(my_perl, (SV*)()) // sv_peek

(The my_perl is for threaded builds.)
Just remember that every line except the last one should end with
C<\n\>.

Alternatively, edit the init file interactively via:
3rd mouse button -> New Display -> Edit Menu

Note: you can define up to 20 conversion shortcuts in the gdb
section.

=item *

If you see in a debugger a memory area mysteriously full of 0xabababab,
you may be seeing the effect of the Poison() macro; see L<perlclib>.

=back

=head2 CONCLUSION

We've had a brief look around the Perl source, an overview of the stages
F<perl> goes through when it's running your code, and how to use a
debugger to poke at the Perl guts.  We took a very simple problem and
demonstrated how to solve it fully - with documentation, regression
tests, and finally a patch for submission to p5p.  Finally, we talked
about how to use external tools to debug and test Perl.

I'd now suggest you read over those references again, and then, as soon
as possible, get your hands dirty.  The best way to learn is by doing,
so:

=over 3

=item *

Subscribe to perl5-porters, follow the patches and try to understand
them; don't be afraid to ask if there's a portion you're not clear on -
who knows, you may unearth a bug in the patch...

=item *

Keep up to date with the bleeding edge Perl distributions and get
familiar with the changes.  Try to get an idea of what areas people are
working on and the changes they're making.

=item *

Do read the README associated with your operating system, e.g. README.aix
on the IBM AIX OS.  Don't hesitate to supply patches to that README if
you find anything missing or changed over a new OS release.

=item *

Find an area of Perl that seems interesting to you, and see if you can
work out how it works.  Scan through the source, and step over it in the
debugger.  Play, poke, investigate, fiddle!  You'll probably get to
understand not just your chosen area but a much wider range of F<perl>'s
activity as well, and probably sooner than you'd think.

=back

=over 3

=item I<The Road goes ever on and on, down from the door where it began.>

=back

If you can do these things, you've started on the long road to Perl porting.
Thanks for wanting to help make Perl better - and happy hacking!

=head1 AUTHOR

This document was written by Nathan Torkington, and is maintained by
the perl5-porters mailing list.