=head1 NAME

perlmod - Perl modules (packages and symbol tables)

=head1 DESCRIPTION

=head2 Packages
X<package> X<namespace> X<variable, global> X<global variable> X<global>

Perl provides a mechanism for alternative namespaces to protect
packages from stomping on each other's variables. In fact, there's
really no such thing as a global variable in Perl. The package
statement declares the compilation unit as being in the given
namespace. The scope of the package declaration is from the
declaration itself through the end of the enclosing block, C<eval>,
or file, whichever comes first (the same scope as the my() and
local() operators). Unqualified dynamic identifiers will be in
this namespace, except for those few identifiers that, if unqualified,
default to the main package instead of the current one as described
below. A package statement affects only dynamic variables--including
those you've used local() on--but I<not> lexical variables created
with my(). Typically it would be the first declaration in a file
included by the C<do>, C<require>, or C<use> operators. You can
switch into a package in more than one place; it merely influences
which symbol table is used by the compiler for the rest of that
block. You can refer to variables and filehandles in other packages
by prefixing the identifier with the package name and a double
colon: C<$Package::Variable>. If the package name is null, the
C<main> package is assumed. That is, C<$::sail> is equivalent to
C<$main::sail>.

The old package delimiter was a single quote, but double colon is now the
preferred delimiter, in part because it's more readable to humans, and
in part because it's more readable to B<emacs> macros. It also makes C++
programmers feel like they know what's going on--as opposed to using the
single quote as separator, which was there to make Ada programmers feel
like they knew what was going on.
Because the old-fashioned syntax is still
supported for backwards compatibility, if you try to use a string like
C<"This is $owner's house">, you'll be accessing C<$owner::s>; that is,
the $s variable in package C<owner>, which is probably not what you meant.
Use braces to disambiguate, as in C<"This is ${owner}'s house">.
X<::> X<'>

Packages may themselves contain package separators, as in
C<$OUTER::INNER::var>. This implies nothing about the order of
name lookups, however. There are no relative packages: all symbols
are either local to the current package, or must be fully qualified
from the outer package name down. For instance, there is nowhere
within package C<OUTER> that C<$INNER::var> refers to
C<$OUTER::INNER::var>. C<INNER> refers to a totally
separate global package.

Only identifiers starting with letters (or underscore) are stored
in a package's symbol table. All other symbols are kept in package
C<main>, including all punctuation variables, like $_. In addition,
when unqualified, the identifiers STDIN, STDOUT, STDERR, ARGV,
ARGVOUT, ENV, INC, and SIG are forced to be in package C<main>,
even when used for other purposes than their built-in ones. If you
have a package called C<m>, C<s>, or C<y>, then you can't use the
qualified form of an identifier because it would instead be interpreted
as a pattern match, a substitution, or a transliteration.
X<variable, punctuation>

Variables beginning with underscore used to be forced into package
main, but we decided it was more useful for package writers to be able
to use leading underscore to indicate private variables and method names.
However, variables and functions named with a single C<_>, such as
$_ and C<sub _>, are still forced into the package C<main>. See also
L<perlvar/"Technical Note on the Syntax of Variable Names">.

C<eval>ed strings are compiled in the package in which the eval() was
compiled.
(Assignments to C<$SIG{}>, however, assume the signal
handler specified is in the C<main> package. Qualify the signal handler
name if you wish to have a signal handler in a package.) For an
example, examine F<perl5db.pl> in the Perl library. It initially switches
to the C<DB> package so that the debugger doesn't interfere with variables
in the program you are trying to debug. At various points, however, it
temporarily switches back to the C<main> package to evaluate various
expressions in the context of the C<main> package (or wherever you came
from). See L<perldebug>.

The special symbol C<__PACKAGE__> contains the current package, but cannot
(easily) be used to construct variable names.

See L<perlsub> for other scoping issues related to my() and local(),
and L<perlref> regarding closures.

=head2 Symbol Tables
X<symbol table> X<stash> X<%::> X<%main::> X<typeglob> X<glob> X<alias>

The symbol table for a package happens to be stored in the hash of that
name with two colons appended. The main symbol table's name is thus
C<%main::>, or C<%::> for short. Likewise the symbol table for the nested
package mentioned earlier is named C<%OUTER::INNER::>.

The value in each entry of the hash is what you are referring to when you
use the C<*name> typeglob notation. In fact, the following have the same
effect, though the first is more efficient because it does the symbol
table lookups at compile time:

    local *main::foo   = *main::bar;
    local $main::{foo} = $main::{bar};

(Be sure to note the B<vast> difference between the second line above
and C<local $main::foo = $main::bar>. The former is accessing the hash
C<%main::>, which is the symbol table of package C<main>. The latter is
simply assigning scalar C<$bar> in package C<main> to scalar C<$foo> of
the same package.)

You can use this to print out all the variables in a package, for
instance.
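
As a rough sketch of that idea, the loop below walks the symbol table
hash of a made-up package C<Demo> and records which slots of each
typeglob are populated. The package, its variables, and the C<@report>
array are invented for illustration; only the C<%Demo::> hash and the
C<*glob{THING}> notation come from Perl itself.

    use strict;
    use warnings;

    {
        package Demo;                 # hypothetical package for illustration
        our $count = 3;
        our @list  = (1, 2);
        our %map   = (a => 1);
        sub hello { return "hi" }
    }

    my @report;
    for my $name (sort keys %Demo::) {
        no strict 'refs';             # needed to look up a glob by name
        my $glob = \*{"Demo::$name"}; # reference to the typeglob
        my @kinds;
        push @kinds, 'scalar' if defined ${ *{$glob}{SCALAR} };
        push @kinds, 'array'  if *{$glob}{ARRAY};
        push @kinds, 'hash'   if *{$glob}{HASH};
        push @kinds, 'code'   if *{$glob}{CODE};
        push @report, "$name: @kinds" if @kinds;
    }
    print "$_\n" for @report;

The glob is looked up by name rather than taken straight from the hash
value, on the assumption that some perls may store entries in a stash
that are not full typeglobs; looking up by name sidesteps that.
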
The standard but antiquated F<dumpvar.pl> library and
the CPAN module Devel::Symdump make use of this.

Assignment to a typeglob performs an aliasing operation, i.e.,

    *dick = *richard;

causes variables, subroutines, formats, and file and directory handles
accessible via the identifier C<richard> also to be accessible via the
identifier C<dick>. If you want to alias only a particular variable or
subroutine, assign a reference instead:

    *dick = \$richard;

Which makes $richard and $dick the same variable, but leaves
@richard and @dick as separate arrays. Tricky, eh?

There is one subtle difference between the following statements:

    *foo = *bar;
    *foo = \$bar;

C<*foo = *bar> makes the typeglobs themselves synonymous while
C<*foo = \$bar> makes the SCALAR portions of two distinct typeglobs
refer to the same scalar value. This means that the following code:

    $bar = 1;
    *foo = \$bar;       # Make $foo an alias for $bar

    {
        local $bar = 2; # Restrict changes to block
        print $foo;     # Prints '1'!
    }

Would print '1', because C<$foo> holds a reference to the I<original>
C<$bar> -- the one that was stuffed away by C<local()> and which will be
restored when the block ends. Because variables are accessed through the
typeglob, you can use C<*foo = *bar> to create an alias which can be
localized. (But be aware that this means you can't have a separate
C<@foo> and C<@bar>, etc.)

What makes all of this important is that the Exporter module uses glob
aliasing as the import/export mechanism.
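
A minimal sketch of what such an import amounts to, assuming a
hypothetical package C<Some::Lib> with an invented C<greet> function and
C<$FOO> variable. The real Exporter does considerably more, so treat this
only as an illustration of the aliasing step and of its effect on
C<local>.

    use strict;
    use warnings;

    {
        package Some::Lib;            # hypothetical library package
        our $FOO = 'original';
        sub greet     { return "hello from " . __PACKAGE__ }
        sub lib_value { return $FOO }
    }

    # Hand-rolled "import": alias one code slot and one scalar slot.
    *greet = \&Some::Lib::greet;
    *FOO   = \$Some::Lib::FOO;
    our $FOO;                         # let strict accept the short name

    $FOO = 'changed';                 # writes through the alias
    my $seen_inside;
    {
        local $FOO = 'temporary';     # replaces only main's scalar slot
        $seen_inside = Some::Lib::lib_value();
    }
    print greet(), "\n";
    print "$seen_inside\n";

Inside the block the library still sees C<'changed'>, not
C<'temporary'>, because C<local> swapped out the scalar held by
C<*main::FOO> while C<*Some::Lib::FOO> kept pointing at the original.
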
Whether or not you can properly
localize a variable that has been exported from a module depends on how
it was exported:

    @EXPORT = qw($FOO); # Usual form, can't be localized
    @EXPORT = qw(*FOO); # Can be localized

You can work around the first case by using the fully qualified name
(C<$Package::FOO>) where you need a local value, or by overriding it
by saying C<*FOO = *Package::FOO> in your script.

The C<*x = \$y> mechanism may be used to pass and return cheap references
into or from subroutines if you don't want to copy the whole
thing. It only works when assigning to dynamic variables, not
lexicals.

    %some_hash = ();                    # can't be my()
    *some_hash = fn( \%another_hash );
    sub fn {
        local *hashsym = shift;
        # now use %hashsym normally, and you
        # will affect the caller's %another_hash
        my %nhash = (); # do what you want
        return \%nhash;
    }

On return, the reference will overwrite the hash slot in the
symbol table specified by the *some_hash typeglob. This
is a somewhat tricky way of passing around references cheaply
when you don't want to have to remember to dereference variables
explicitly.

Another use of symbol tables is for making "constant" scalars.
X<constant> X<scalar, constant>

    *PI = \3.14159265358979;

Now you cannot alter C<$PI>, which is probably a good thing all in all.
This isn't the same as a constant subroutine, which is subject to
optimization at compile-time. A constant subroutine is one prototyped
to take no arguments and to return a constant expression. See
L<perlsub> for details on these. The C<use constant> pragma is a
convenient shorthand for these.

You can say C<*foo{PACKAGE}> and C<*foo{NAME}> to find out what name and
package the *foo symbol table entry comes from.
This may be useful
in a subroutine that gets passed typeglobs as arguments:

    sub identify_typeglob {
        my $glob = shift;
        print 'You gave me ', *{$glob}{PACKAGE}, '::', *{$glob}{NAME}, "\n";
    }
    identify_typeglob *foo;
    identify_typeglob *bar::baz;

This prints

    You gave me main::foo
    You gave me bar::baz

The C<*foo{THING}> notation can also be used to obtain references to the
individual elements of *foo. See L<perlref>.

Subroutine definitions (and declarations, for that matter) need
not necessarily be situated in the package whose symbol table they
occupy. You can define a subroutine outside its package by
explicitly qualifying the name of the subroutine:

    package main;
    sub Some_package::foo { ... }   # &foo defined in Some_package

This is just a shorthand for a typeglob assignment at compile time:

    BEGIN { *Some_package::foo = sub { ... } }

and is I<not> the same as writing:

    {
        package Some_package;
        sub foo { ... }
    }

In the first two versions, the body of the subroutine is
lexically in the main package, I<not> in Some_package. So
something like this:

    package main;

    $Some_package::name = "fred";
    $main::name = "barney";

    sub Some_package::foo {
        print "in ", __PACKAGE__, ": \$name is '$name'\n";
    }

    Some_package::foo();

prints:

    in main: $name is 'barney'

rather than:

    in Some_package: $name is 'fred'

This also has implications for the use of the SUPER:: qualifier
(see L<perlobj>).

=head2 BEGIN, UNITCHECK, CHECK, INIT and END
X<BEGIN> X<UNITCHECK> X<CHECK> X<INIT> X<END>

Five specially named code blocks are executed at the beginning and at
the end of a running Perl program. These are the C<BEGIN>,
C<UNITCHECK>, C<CHECK>, C<INIT>, and C<END> blocks.

These code blocks can be prefixed with C<sub> to give the appearance of a
subroutine (although this is not considered good style). One should note
that these code blocks don't really exist as named subroutines (despite
their appearance). The thing that gives this away is the fact that you can
have B<more than one> of these code blocks in a program, and they will get
B<all> executed at the appropriate moment. So you can't execute any of
these code blocks by name.

A C<BEGIN> code block is executed as soon as possible, that is, the moment
it is completely defined, even before the rest of the containing file (or
string) is parsed. You may have multiple C<BEGIN> blocks within a file (or
eval'ed string) -- they will execute in order of definition. Because a C<BEGIN>
code block executes immediately, it can pull in definitions of subroutines
and such from other files in time to be visible to the rest of the compile
and run time. Once a C<BEGIN> has run, it is immediately undefined and any
code it used is returned to Perl's memory pool.

It should be noted that C<BEGIN> and C<UNITCHECK> code blocks B<are>
executed inside string C<eval()>'s. The C<CHECK> and C<INIT> code
blocks are B<not> executed inside a string eval, which e.g. can be a
problem in a mod_perl environment.

An C<END> code block is executed as late as possible, that is, after
perl has finished running the program and just before the interpreter
is being exited, even if it is exiting as a result of a die() function.
(But not if it's morphing into another program via C<exec>, or
being blown out of the water by a signal--you have to trap that yourself
(if you can).) You may have multiple C<END> blocks within a file--they
will execute in reverse order of definition; that is: last in, first
out (LIFO). C<END> blocks are not executed when you run perl with the
C<-c> switch, or if compilation fails.
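
The ordering rules above can be seen in a short sketch. The C<@order>
array and the messages are invented for illustration; the script only
relies on C<BEGIN> running at compile time in order of definition and
C<END> running at exit in reverse order.

    use strict;
    use warnings;

    our @order;                       # illustrative log of what ran when

    push @order, 'runtime';           # ordinary statement, runs after both BEGINs

    BEGIN { push @order, 'first BEGIN'  }
    BEGIN { push @order, 'second BEGIN' }

    END { print "END defined first, runs last\n"  }
    END { print "END defined second, runs first\n" }

    print join(', ', @order), "\n";

Run as a script, this prints the C<@order> line (both C<BEGIN> entries
ahead of C<runtime>, even though the C<push> appears earlier in the
file) and then the two C<END> messages, the second-defined one first.
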

Note that C<END> code blocks are B<not> executed at the end of a string
C<eval()>: if any C<END> code blocks are created in a string C<eval()>,
they will be executed just as any other C<END> code block of that package
in LIFO order just before the interpreter is being exited.

Inside an C<END> code block, C<$?> contains the value that the program is
going to pass to C<exit()>. You can modify C<$?> to change the exit
value of the program. Beware of changing C<$?> by accident (e.g. by
running something via C<system>).
X<$?>

C<UNITCHECK>, C<CHECK> and C<INIT> code blocks are useful to catch the
transition between the compilation phase and the execution phase of
the main program.

C<UNITCHECK> blocks are run just after the unit which defined them has
been compiled. The main program file and each module it loads are
compilation units, as are string C<eval>s, code compiled using the
C<(?{ })> construct in a regex, calls to C<do FILE>, C<require FILE>,
and code after the C<-e> switch on the command line.

C<CHECK> code blocks are run just after the B<initial> Perl compile phase ends
and before the run time begins, in LIFO order. C<CHECK> code blocks are used
in the Perl compiler suite to save the compiled state of the program.

C<INIT> blocks are run just before the Perl runtime begins execution, in
"first in, first out" (FIFO) order.

When you use the B<-n> and B<-p> switches to Perl, C<BEGIN> and
C<END> work just as they do in B<awk>, as a degenerate case.
Both C<BEGIN> and C<CHECK> blocks are run when you use the B<-c>
switch for a compile-only syntax check, although your main code
is not.

The B<begincheck> program makes it all clear, eventually:

    #!/usr/bin/perl

    # begincheck

    print "10. Ordinary code runs at runtime.\n";

    END { print "16. So this is the end of the tale.\n" }
    INIT { print " 7. INIT blocks run FIFO just before runtime.\n" }
    UNITCHECK {
        print " 4. And therefore before any CHECK blocks.\n"
    }
    CHECK { print " 6. So this is the sixth line.\n" }

    print "11. It runs in order, of course.\n";

    BEGIN { print " 1. BEGIN blocks run FIFO during compilation.\n" }
    END { print "15. Read perlmod for the rest of the story.\n" }
    CHECK { print " 5. CHECK blocks run LIFO after all compilation.\n" }
    INIT { print " 8. Run this again, using Perl's -c switch.\n" }

    print "12. This is anti-obfuscated code.\n";

    END { print "14. END blocks run LIFO at quitting time.\n" }
    BEGIN { print " 2. So this line comes out second.\n" }
    UNITCHECK {
        print " 3. UNITCHECK blocks run LIFO after each file is compiled.\n"
    }
    INIT { print " 9. You'll see the difference right away.\n" }

    print "13. It merely _looks_ like it should be confusing.\n";

    __END__

=head2 Perl Classes
X<class> X<@ISA>

There is no special class syntax in Perl, but a package may act
as a class if it provides subroutines to act as methods. Such a
package may also derive some of its methods from another class (package)
by listing the other package name(s) in its global @ISA array (which
must be a package global, not a lexical).

For more on this, see L<perltoot> and L<perlobj>.

=head2 Perl Modules
X<module>

A module is just a set of related functions in a library file, i.e.,
a Perl package with the same name as the file. It is specifically
designed to be reusable by other modules or programs. It may do this
by providing a mechanism for exporting some of its symbols into the
symbol table of any package using it, or it may function as a class
definition and make its semantics available implicitly through
method calls on the class and its objects, without explicitly
exporting anything. Or it can do a little of both.
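
To make the class-style option concrete, here is a small sketch in
which two invented packages, C<Animal> and C<Dog>, act as classes
within a single file. C<Dog> inherits C<new> and C<name> through its
package global C<@ISA>; nothing is exported, and all behavior is
reached through method calls. In a real module each package would
normally live in its own F<.pm> file.

    use strict;
    use warnings;

    {
        package Animal;               # hypothetical base class
        sub new {
            my ($class, %args) = @_;
            return bless { name => $args{name} }, $class;
        }
        sub name  { return $_[0]{name} }
        sub speak { return $_[0]->name . " makes a sound" }
    }

    {
        package Dog;                  # hypothetical derived class
        our @ISA = ('Animal');        # must be a package global, not my()
        sub speak { return $_[0]->name . " barks" }
    }

    my $pet = Dog->new(name => 'Rex');
    print $pet->speak, "\n";          # method found in Dog
    print ref($pet), "\n";            # object is blessed into Dog

The call to C<new> is resolved in C<Animal> by way of C<@Dog::ISA>,
while C<speak> is found directly in C<Dog>.
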

For example, to start a traditional, non-OO module called Some::Module,
create a file called F<Some/Module.pm> and start with this template:

    package Some::Module;  # assumes Some/Module.pm

    use strict;
    use warnings;

    BEGIN {
        use Exporter   ();
        our ($VERSION, @ISA, @EXPORT, @EXPORT_OK, %EXPORT_TAGS);

        # set the version for version checking
        $VERSION     = 1.00;
        # if using RCS/CVS, this may be preferred
        $VERSION = sprintf "%d.%03d", q$Revision: 1.1 $ =~ /(\d+)/g;

        @ISA         = qw(Exporter);
        @EXPORT      = qw(&func1 &func2);
        %EXPORT_TAGS = ( );     # eg: TAG => [ qw!name1 name2! ],

        # your exported package globals go here,
        # as well as any optionally exported functions
        @EXPORT_OK   = qw($Var1 %Hashit &func3);
    }
    our @EXPORT_OK;

    # exported package globals go here
    our $Var1;
    our %Hashit;

    # non-exported package globals go here
    our @more;
    our $stuff;

    # initialize package globals, first exported ones
    $Var1   = '';
    %Hashit = ();

    # then the others (which are still accessible as $Some::Module::stuff)
    $stuff  = '';
    @more   = ();

    # all file-scoped lexicals must be created before
    # the functions below that use them.

    # file-private lexicals go here
    my $priv_var    = '';
    my %secret_hash = ();

    # here's a file-private function as a closure,
    # callable as &$priv_func;  it cannot be prototyped.
    my $priv_func = sub {
        # stuff goes here.
    };

    # make all your functions, whether exported or not;
    # remember to put something interesting in the {} stubs
    sub func1      {}    # no prototype
    sub func2()    {}    # proto'd void
    sub func3($$)  {}    # proto'd to 2 scalars

    # this one isn't exported, but could be called!
    sub func4(\%)  {}    # proto'd to 1 hash ref

    END { }       # module clean-up code here (global destructor)

    ## YOUR CODE GOES HERE

    1;  # don't forget to return a true value from the file

Then go on to declare and use your variables in functions without
any qualifications. See L<Exporter> and the L<perlmodlib> for
details on mechanics and style issues in module creation.

Perl modules are included into your program by saying

    use Module;

or

    use Module LIST;

This is exactly equivalent to

    BEGIN { require Module; import Module; }

or

    BEGIN { require Module; import Module LIST; }

As a special case

    use Module ();

is exactly equivalent to

    BEGIN { require Module; }

All Perl module files have the extension F<.pm>. The C<use> operator
assumes this so you don't have to spell out "F<Module.pm>" in quotes.
This also helps to differentiate new modules from old F<.pl> and
F<.ph> files. Module names are also capitalized unless they're
functioning as pragmas; pragmas are in effect compiler directives,
and are sometimes called "pragmatic modules" (or even "pragmata"
if you're a classicist).

The two statements:

    require SomeModule;
    require "SomeModule.pm";

differ from each other in two ways. In the first case, any double
colons in the module name, such as C<Some::Module>, are translated
into your system's directory separator, usually "/". The second
case does not, and would have to be specified literally. The other
difference is that seeing the first C<require> clues in the compiler
that uses of indirect object notation involving "SomeModule", as
in C<$ob = purge SomeModule>, are method calls, not function calls.
(Yes, this really can make a difference.)
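
The first of those differences can be checked with two core modules.
This is only a sketch: C<List::Util> and C<File::Basename> are chosen
arbitrarily as modules that ship with Perl, and the file path passed
to C<basename> is made up.

    use strict;
    use warnings;

    # Bareword form: "::" becomes the directory separator and
    # ".pm" is appended before @INC is searched.
    require List::Util;

    # String form: the relative path must be spelled out literally.
    require "File/Basename.pm";

    # Either way the loaded file is recorded in %INC under its path.
    for my $file ('List/Util.pm', 'File/Basename.pm') {
        print "$file loaded\n" if exists $INC{$file};
    }

    # require imports nothing, so fully qualified names are needed.
    my $largest = List::Util::max(3, 9, 4);
    my $leaf    = File::Basename::basename('/tmp/example.txt');
    print "$largest $leaf\n";

Because no C<import> was called, an unqualified C<max> or C<basename>
would not be available here.
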

Because the C<use> statement implies a C<BEGIN> block, the importing
of semantics happens as soon as the C<use> statement is compiled,
before the rest of the file is compiled. This is how it is able
to function as a pragma mechanism, and also how modules are able to
declare subroutines that are then visible as list or unary operators for
the rest of the current file. This will not work if you use C<require>
instead of C<use>. With C<require> you can get into this problem:

    require Cwd;                # make Cwd:: accessible
    $here = Cwd::getcwd();

    use Cwd;                    # import names from Cwd::
    $here = getcwd();

    require Cwd;                # make Cwd:: accessible
    $here = getcwd();           # oops! no main::getcwd()

In general, C<use Module ()> is recommended over C<require Module>,
because it determines module availability at compile time, not in the
middle of your program's execution. An exception would be if two modules
each tried to C<use> each other, and each also called a function from
that other module. In that case, it's easy to use C<require> instead.

Perl packages may be nested inside other package names, so we can have
package names containing C<::>. But if we used that package name
directly as a filename it would make for unwieldy or impossible
filenames on some systems. Therefore, if a module's name is, say,
C<Text::Soundex>, then its definition is actually found in the library
file F<Text/Soundex.pm>.

Perl modules always have a F<.pm> file, but there may also be
dynamically linked executables (often ending in F<.so>) or autoloaded
subroutine definitions (often ending in F<.al>) associated with the
module. If so, these will be entirely transparent to the user of
the module. It is the responsibility of the F<.pm> file to load
(or arrange to autoload) any additional functionality.
For example,
although the POSIX module happens to do both dynamic loading and
autoloading, the user can say just C<use POSIX> to get it all.

=head2 Making your module threadsafe
X<threadsafe> X<thread safe>
X<module, threadsafe> X<module, thread safe>
X<CLONE> X<CLONE_SKIP> X<thread> X<threads> X<ithread>

Since 5.6.0, Perl has had support for a new type of threads called
interpreter threads (ithreads). These threads can be used explicitly
and implicitly.

Ithreads work by cloning the data tree so that no data is shared
between different threads. These threads can be used by using the C<threads>
module or by doing fork() on win32 (fake fork() support). When a
thread is cloned all Perl data is cloned; however, non-Perl data cannot
be cloned automatically. Perl after 5.7.2 has support for the C<CLONE>
special subroutine. In C<CLONE> you can do whatever you need to do,
such as handling the cloning of non-Perl data, if necessary.
C<CLONE> will be called once as a class method for every package that has it
defined (or inherits it). It will be called in the context of the new thread,
so all modifications are made in the new area. Currently CLONE is called with
no parameters other than the invocant package name, but code should not assume
that this will remain unchanged, as it is likely that in future extra parameters
will be passed in to give more information about the state of cloning.

If you want to CLONE all objects you will need to keep track of them per
package. This is simply done using a hash and Scalar::Util::weaken().

Perl after 5.8.7 has support for the C<CLONE_SKIP> special subroutine.
Like C<CLONE>, C<CLONE_SKIP> is called once per package; however, it is
called just before cloning starts, and in the context of the parent
thread.
If it returns a true value, then no objects of that class will
be cloned; or rather, they will be copied as unblessed, undef values.
For example: if in the parent there are two references to a single blessed
hash, then in the child there will be two references to a single undefined
scalar value instead.
This provides a simple mechanism for making a module threadsafe; just add
C<sub CLONE_SKIP { 1 }> at the top of the class, and C<DESTROY()> will
now only be called once per object. Of course, if the child thread needs
to make use of the objects, then a more sophisticated approach is
needed.

Like C<CLONE>, C<CLONE_SKIP> is currently called with no parameters other
than the invocant package name, although that may change. Similarly, to
allow for future expansion, the return value should be a single C<0> or
C<1> value.

=head1 SEE ALSO

See L<perlmodlib> for general style issues related to building Perl
modules and classes, as well as descriptions of the standard library
and CPAN, L<Exporter> for how Perl's standard import/export mechanism
works, L<perltoot> and L<perltooc> for an in-depth tutorial on
creating classes, L<perlobj> for a hard-core reference document on
objects, L<perlsub> for an explanation of functions and scoping,
and L<perlxstut> and L<perlguts> for more information on writing
extension modules.