=head1 NAME

perlmod - Perl modules (packages and symbol tables)

=head1 DESCRIPTION

=head2 Is this the document you were after?

There are other documents which might contain the information that you're
looking for:

=over 2

=item This doc

Perl's packages, namespaces, and some info on classes.

=item L<perlnewmod>

Tutorial on making a new module.

=item L<perlmodstyle>

Best practices for making a new module.

=back

=head2 Packages
X<package> X<namespace> X<variable, global> X<global variable> X<global>

Unlike Perl 4, in which all the variables were dynamic and shared one
global name space, causing maintainability problems, Perl 5 provides two
mechanisms for protecting code from having its variables stomped on by
other code: lexically scoped variables created with C<my> or C<state> and
namespaced global variables, which are exposed via the C<vars> pragma,
or the C<our> keyword. Any global variable is considered to
be part of a namespace and can be accessed via a "fully qualified form".
Conversely, any lexically scoped variable is considered to be part of
that lexical scope, and does not have a "fully qualified form".

In Perl, namespaces are called "packages" and
the C<package> declaration tells the compiler which
namespace to prefix to C<our> variables and unqualified dynamic names.
This both protects
against accidental stomping and provides an interface for deliberately
clobbering global dynamic variables declared and used in other scopes or
packages, when that is what you want to do.

The scope of the C<package> declaration is from the
declaration itself through the end of the enclosing block, C<eval>,
or file, whichever comes first (the same scope as the my(), our(), state(), and
local() operators, and also the effect
of the experimental "reference aliasing," which may change), or until
the next C<package> declaration.
Unqualified dynamic identifiers will be in
this namespace, except for those few identifiers that, if unqualified,
default to the main package instead of the current one as described
below. A C<package> statement affects only dynamic global
symbols, including subroutine names, and variables you've used local()
on, but I<not> lexical variables created with my(), our() or state().

Typically, a C<package> statement is the first declaration in a file
included in a program by one of the C<do>, C<require>, or C<use> operators. You can
switch into a package in more than one place: C<package> has no
effect beyond specifying which symbol table the compiler will use for
dynamic symbols for the rest of that block or until the next C<package> statement.
You can refer to variables and filehandles in other packages
by prefixing the identifier with the package name and a double
colon: C<$Package::Variable>. If the package name is null, the
C<main> package is assumed. That is, C<$::sail> is equivalent to
C<$main::sail>.

The old package delimiter was a single quote, but double colon is now the
preferred delimiter, in part because it's more readable to humans, and
in part because it's more readable to B<emacs> macros. It also makes C++
programmers feel like they know what's going on--as opposed to using the
single quote as separator, which was there to make Ada programmers feel
like they knew what was going on. Because the old-fashioned syntax is still
supported for backwards compatibility, if you try to use a string like
C<"This is $owner's house">, you'll be accessing C<$owner::s>; that is,
the $s variable in package C<owner>, which is probably not what you meant.
Use braces to disambiguate, as in C<"This is ${owner}'s house">.
X<::> X<'>

Using C<'> as a package separator is deprecated and will be removed in
Perl 5.40.

Packages may themselves contain package separators, as in
C<$OUTER::INNER::var>.
This implies nothing about the order of
name lookups, however. There are no relative packages: all symbols
are either local to the current package, or must be fully qualified
from the outer package name down. For instance, there is nowhere
within package C<OUTER> that C<$INNER::var> refers to
C<$OUTER::INNER::var>. C<INNER> refers to a totally
separate global package. The custom of treating package names as a
hierarchy is very strong, but the language in no way enforces it.

Only identifiers starting with letters (or underscore) are stored
in a package's symbol table. All other symbols are kept in package
C<main>, including all punctuation variables, like $_. In addition,
when unqualified, the identifiers STDIN, STDOUT, STDERR, ARGV,
ARGVOUT, ENV, INC, and SIG are forced to be in package C<main>,
even when used for other purposes than their built-in ones. If you
have a package called C<m>, C<s>, or C<y>, then you can't use the
qualified form of an identifier because it would be instead interpreted
as a pattern match, a substitution, or a transliteration.
X<variable, punctuation>

Variables beginning with underscore used to be forced into package
main, but we decided it was more useful for package writers to be able
to use leading underscore to indicate private variables and method names.
However, variables and functions named with a single C<_>, such as
$_ and C<sub _>, are still forced into the package C<main>. See also
L<perlvar/"The Syntax of Variable Names">.

C<eval>ed strings are compiled in the package in which the eval() was
compiled. (Assignments to C<$SIG{}>, however, assume the signal
handler specified is in the C<main> package. Qualify the signal handler
name if you wish to have a signal handler in a package.) For an
example, examine F<perldb.pl> in the Perl library.
It initially switches
to the C<DB> package so that the debugger doesn't interfere with variables
in the program you are trying to debug. At various points, however, it
temporarily switches back to the C<main> package to evaluate various
expressions in the context of the C<main> package (or wherever you came
from). See L<perldebug>.

The special symbol C<__PACKAGE__> contains the current package, but cannot
(easily) be used to construct variable names. After C<my($foo)> has hidden
package variable C<$foo>, it can still be accessed, without knowing what
package you are in, as C<${__PACKAGE__.'::foo'}>.

See L<perlsub> for other scoping issues related to my() and local(),
and L<perlref> regarding closures.

=head2 Symbol Tables
X<symbol table> X<stash> X<%::> X<%main::> X<typeglob> X<glob> X<alias>

The symbol table for a package happens to be stored in the hash of that
name with two colons appended. The main symbol table's name is thus
C<%main::>, or C<%::> for short. Likewise the symbol table for the nested
package mentioned earlier is named C<%OUTER::INNER::>.

The value in each entry of the hash is what you are referring to when you
use the C<*name> typeglob notation.

    local *main::foo = *main::bar;

You can use this to print out all the variables in a package, for
instance. The standard but antiquated F<dumpvar.pl> library and
the CPAN module Devel::Symdump make use of this.

The results of creating new symbol table entries directly or modifying any
entries that are not already typeglobs are undefined and subject to change
between releases of perl.

Assignment to a typeglob performs an aliasing operation, i.e.,

    *dick = *richard;

causes variables, subroutines, formats, and file and directory handles
accessible via the identifier C<richard> also to be accessible via the
identifier C<dick>.
If you want to alias only a particular variable or
subroutine, assign a reference instead:

    *dick = \$richard;

Which makes $richard and $dick the same variable, but leaves
@richard and @dick as separate arrays. Tricky, eh?

There is one subtle difference between the following statements:

    *foo = *bar;
    *foo = \$bar;

C<*foo = *bar> makes the typeglobs themselves synonymous while
C<*foo = \$bar> makes the SCALAR portions of two distinct typeglobs
refer to the same scalar value. This means that the following code:

    $bar = 1;
    *foo = \$bar;       # Make $foo an alias for $bar

    {
        local $bar = 2; # Restrict changes to block
        print $foo;     # Prints '1'!
    }

Would print '1', because C<$foo> holds a reference to the I<original>
C<$bar>, the one that was stuffed away by C<local()> and which will be
restored when the block ends. Because variables are accessed through the
typeglob, you can use C<*foo = *bar> to create an alias which can be
localized. (But be aware that this means you can't have a separate
C<@foo> and C<@bar>, etc.)

What makes all of this important is that the Exporter module uses glob
aliasing as the import/export mechanism. Whether or not you can properly
localize a variable that has been exported from a module depends on how
it was exported:

    @EXPORT = qw($FOO); # Usual form, can't be localized
    @EXPORT = qw(*FOO); # Can be localized

You can work around the first case by using the fully qualified name
(C<$Package::FOO>) where you need a local value, or by overriding it
by saying C<*FOO = *Package::FOO> in your script.

The C<*x = \$y> mechanism may be used to pass and return cheap references
into or from subroutines if you don't want to copy the whole
thing. It only works when assigning to dynamic variables, not
lexicals.

    %some_hash = ();                    # can't be my()
    *some_hash = fn( \%another_hash );
    sub fn {
        local *hashsym = shift;
        # now use %hashsym normally, and you
        # will affect the caller's %another_hash
        my %nhash = (); # do what you want
        return \%nhash;
    }

On return, the reference will overwrite the hash slot in the
symbol table specified by the *some_hash typeglob. This
is a somewhat tricky way of passing around references cheaply
when you don't want to have to remember to dereference variables
explicitly.

Another use of symbol tables is for making "constant" scalars.
X<constant> X<scalar, constant>

    *PI = \3.14159265358979;

Now you cannot alter C<$PI>, which is probably a good thing all in all.
This isn't the same as a constant subroutine, which is subject to
optimization at compile-time. A constant subroutine is one prototyped
to take no arguments and to return a constant expression. See
L<perlsub> for details on these. The C<use constant> pragma is a
convenient shorthand for these.

You can say C<*foo{PACKAGE}> and C<*foo{NAME}> to find out what name and
package the *foo symbol table entry comes from. This may be useful
in a subroutine that gets passed typeglobs as arguments:

    sub identify_typeglob {
        my $glob = shift;
        print 'You gave me ', *{$glob}{PACKAGE},
            '::', *{$glob}{NAME}, "\n";
    }
    identify_typeglob *foo;
    identify_typeglob *bar::baz;

This prints

    You gave me main::foo
    You gave me bar::baz

The C<*foo{THING}> notation can also be used to obtain references to the
individual elements of *foo. See L<perlref>.

Subroutine definitions (and declarations, for that matter) need
not necessarily be situated in the package whose symbol table they
occupy. You can define a subroutine outside its package by
explicitly qualifying the name of the subroutine:

    package main;
    sub Some_package::foo { ... }   # &foo defined in Some_package

This is just a shorthand for a typeglob assignment at compile time:

    BEGIN { *Some_package::foo = sub { ... } }

and is I<not> the same as writing:

    {
        package Some_package;
        sub foo { ... }
    }

In the first two versions, the body of the subroutine is
lexically in the main package, I<not> in Some_package. So
something like this:

    package main;

    $Some_package::name = "fred";
    $main::name = "barney";

    sub Some_package::foo {
        print "in ", __PACKAGE__, ": \$name is '$name'\n";
    }

    Some_package::foo();

prints:

    in main: $name is 'barney'

rather than:

    in Some_package: $name is 'fred'

This also has implications for the use of the SUPER:: qualifier
(see L<perlobj>).

=head2 BEGIN, UNITCHECK, CHECK, INIT and END
X<BEGIN> X<UNITCHECK> X<CHECK> X<INIT> X<END>

Five specially named code blocks are executed at the beginning and at
the end of a running Perl program. These are the C<BEGIN>,
C<UNITCHECK>, C<CHECK>, C<INIT>, and C<END> blocks.

These code blocks can be prefixed with C<sub> to give the appearance of a
subroutine (although this is not considered good style). One should note
that these code blocks don't really exist as named subroutines (despite
their appearance). The thing that gives this away is the fact that you can
have B<more than one> of these code blocks in a program, and they will get
B<all> executed at the appropriate moment. So you can't execute any of
these code blocks by name.

A C<BEGIN> code block is executed as soon as possible, that is, the moment
it is completely defined, even before the rest of the containing file (or
string) is parsed. You may have multiple C<BEGIN> blocks within a file (or
eval'ed string); they will execute in order of definition.
Because a C<BEGIN>
code block executes immediately, it can pull in definitions of subroutines
and such from other files in time to be visible to the rest of the compile
and run time. Once a C<BEGIN> has run, it is immediately undefined and any
code it used is returned to Perl's memory pool.

An C<END> code block is executed as late as possible, that is, after
perl has finished running the program and just before the interpreter
exits, even if it is exiting as a result of a die() function.
(But not if it's morphing into another program via C<exec>, or
being blown out of the water by a signal--you have to trap that yourself
(if you can).) You may have multiple C<END> blocks within a file--they
will execute in reverse order of definition; that is: last in, first
out (LIFO). C<END> blocks are not executed when you run perl with the
C<-c> switch, or if compilation fails.

Note that C<END> code blocks are B<not> executed at the end of a string
C<eval()>: if any C<END> code blocks are created in a string C<eval()>,
they will be executed just as any other C<END> code block of that package
in LIFO order just before the interpreter exits.

Inside an C<END> code block, C<$?> contains the value that the program is
going to pass to C<exit()>. You can modify C<$?> to change the exit
value of the program. Beware of changing C<$?> by accident (e.g. by
running something via C<system>).
X<$?>

Inside an C<END> block, the value of C<${^GLOBAL_PHASE}> will be
C<"END">.

Similar to an C<END> block are C<defer> blocks, though they operate on the
lifetime of individual block scopes, rather than the program as a whole. They
are documented in L<perlsyn/defer>.

C<UNITCHECK>, C<CHECK> and C<INIT> code blocks are useful to catch the
transition between the compilation phase and the execution phase of
the main program.
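
The C<END> behaviour described above (LIFO order, C<$?> as the pending
exit value, and C<${^GLOBAL_PHASE}>) can be observed with a small
experiment. The following is only a sketch: it writes a throwaway child
program to a temporary file and runs it with the same perl binary, so
that the C<END> blocks fire when I<that> interpreter exits. The child's
contents, the labels C<A> and C<B>, and the exit value C<3> are all
made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small child program to a temporary file. Its contents are
# hypothetical and exist only to demonstrate END-block behaviour.
my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} <<'CHILD';
END { print "A: defined first, runs last (phase ${^GLOBAL_PHASE})\n" }
END { print "B: defined last, runs first\n"; $? = 3 }
print "main code, phase ${^GLOBAL_PHASE}\n";
CHILD
close $fh or die "close: $!";

# Run the child with the current perl and capture what it prints.
open my $out, '-|', $^X, $path or die "cannot run child: $!";
my @lines = <$out>;
close $out;                     # sets $? to the child's wait status
my $exit_status = $? >> 8;      # the value the child left in its $?

print @lines;
print "child exit status: $exit_status\n";
```

The child's ordinary code prints first, then block C<B>, then block
C<A>, and the parent should report the exit status that block C<B>
stored in C<$?>.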

C<UNITCHECK> blocks are run just after the unit which defined them has
been compiled. The main program file and each module it loads are
compilation units, as are string C<eval>s, run-time code compiled using the
C<(?{ })> construct in a regex, calls to C<do FILE>, C<require FILE>,
and code after the C<-e> switch on the command line.

C<BEGIN> and C<UNITCHECK> blocks are not directly related to the phase of
the interpreter. They can be created and executed during any phase.

C<CHECK> code blocks are run just after the B<initial> Perl compile phase ends
and before the run time begins, in LIFO order. C<CHECK> code blocks are used
in the Perl compiler suite to save the compiled state of the program.

Inside of a C<CHECK> block, the value of C<${^GLOBAL_PHASE}> will be
C<"CHECK">.

C<INIT> blocks are run just before the Perl runtime begins execution, in
"first in, first out" (FIFO) order.

Inside of an C<INIT> block, the value of C<${^GLOBAL_PHASE}> will be C<"INIT">.

The C<CHECK> and C<INIT> blocks in code compiled by C<require>, string C<do>,
or string C<eval> will not be executed if they occur after the end of the
main compilation phase; that can be a problem in mod_perl and other persistent
environments which use those functions to load code at runtime.

When you use the B<-n> and B<-p> switches to Perl, C<BEGIN> and
C<END> work just as they do in B<awk>, as a degenerate case.
Both C<BEGIN> and C<CHECK> blocks are run when you use the B<-c>
switch for a compile-only syntax check, although your main code
is not.

The B<begincheck> program makes it all clear, eventually:

    #!/usr/bin/perl

    # begincheck

    print "10. Ordinary code runs at runtime.\n";

    END { print "16. So this is the end of the tale.\n" }
    INIT { print " 7. INIT blocks run FIFO just before runtime.\n" }
    UNITCHECK {
        print " 4. And therefore before any CHECK blocks.\n"
    }
    CHECK { print " 6. So this is the sixth line.\n" }

    print "11. It runs in order, of course.\n";

    BEGIN { print " 1. BEGIN blocks run FIFO during compilation.\n" }
    END { print "15. Read perlmod for the rest of the story.\n" }
    CHECK { print " 5. CHECK blocks run LIFO after all compilation.\n" }
    INIT { print " 8. Run this again, using Perl's -c switch.\n" }

    print "12. This is anti-obfuscated code.\n";

    END { print "14. END blocks run LIFO at quitting time.\n" }
    BEGIN { print " 2. So this line comes out second.\n" }
    UNITCHECK {
        print " 3. UNITCHECK blocks run LIFO after each file is compiled.\n"
    }
    INIT { print " 9. You'll see the difference right away.\n" }

    print "13. It only _looks_ like it should be confusing.\n";

    __END__

=head2 Perl Classes
X<class> X<@ISA>

There is no stable class syntax in Perl, but a package may act
as a class if it provides subroutines to act as methods. Such a
package may also derive some of its methods from another class (package)
by listing the other package name(s) in its global @ISA array (which
must be a package global, not a lexical).

For more on packages acting as classes, see L<perlootut> and L<perlobj>.
For more on the not-yet-stable class syntax, see L<perlclass>.

=head2 Perl Modules
X<module>

A module is just a set of related functions in a library file, i.e.,
a Perl package with the same name as the file. It is specifically
designed to be reusable by other modules or programs. It may do this
by providing a mechanism for exporting some of its symbols into the
symbol table of any package using it, or it may function as a class
definition and make its semantics available implicitly through
method calls on the class and its objects, without explicitly
exporting anything. Or it can do a little of both.
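
As a minimal sketch of the "class definition" style, where nothing is
exported and everything is reached through method calls, consider two
packages defined inline. The names C<Animal>, C<Dog>, C<new>, C<name>
and C<speak> are hypothetical and chosen only for illustration; the one
mechanism taken from this document is that C<Dog> inherits by listing
C<Animal> in its package-global C<@ISA>.

```perl
use strict;
use warnings;

# Hypothetical base class: a package whose subroutines act as methods.
package Animal {
    sub new   { my ($class, %args) = @_; return bless {%args}, $class }
    sub name  { my $self = shift; return $self->{name} }
    sub speak { my $self = shift; return $self->name . " makes a sound" }
}

# Hypothetical derived class: inherits new() and name() through @ISA.
package Dog {
    our @ISA = ('Animal');      # must be a package global, not a lexical
    sub speak { my $self = shift; return $self->name . " barks" }
}

my $dog = Dog->new(name => 'Rex');
print $dog->speak, "\n";                            # Dog's own method
print "Dog inherits name()\n" if Dog->can('name');  # found via @ISA
```

Nothing is imported into the calling package here; the caller only
needs the class name and the objects it hands back.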

For example, to start a traditional, non-OO module called Some::Module,
create a file called F<Some/Module.pm> and start with this template:

    package Some::Module;  # assumes Some/Module.pm

    use v5.36;

    # Get the import method from Exporter to export functions and
    # variables
    use Exporter 5.57 'import';

    # set the version for version checking
    our $VERSION     = '1.00';

    # Functions and variables which are exported by default
    our @EXPORT      = qw(func1 func2);

    # Functions and variables which can be optionally exported
    our @EXPORT_OK   = qw($Var1 %Hashit func3);

    # exported package globals go here
    our $Var1    = '';
    our %Hashit  = ();

    # non-exported package globals go here
    # (they are still accessible as $Some::Module::stuff)
    our @more    = ();
    our $stuff   = '';

    # file-private lexicals go here, before any functions which use them
    my $priv_var    = '';
    my %secret_hash = ();

    # here's a file-private function as a closure,
    # callable as $priv_func->();
    my $priv_func = sub {
        ...
    };

    # make all your functions, whether exported or not;
    # remember to put something interesting in the {} stubs
    sub func1      { ... }
    sub func2      { ... }

    # this one isn't always exported, but could be called directly
    # as Some::Module::func3()
    sub func3      { ... }

    END { ... }       # module clean-up code here (global destructor)

    1;  # don't forget to return a true value from the file

Then go on to declare and use your variables in functions without
any qualifications. See L<Exporter> and L<perlmodlib> for
details on mechanics and style issues in module creation.
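
To see the exporting half of such a template at work without creating
a file on disk, the package can be defined inline, as in the sketch
below. This is an illustration under stated assumptions, not the
recommended layout: C<My::Greeter>, C<hello>, C<shout> and
C<$Greeting> are hypothetical names, and because there is no
F<My/Greeter.pm> to load, a C<BEGIN> block calling C<import> stands in
for the usual C<use> statement.

```perl
use strict;
use warnings;

# Hypothetical module defined in this file; normally it would live in
# My/Greeter.pm and be loaded with "use My::Greeter ...".
BEGIN {
    package My::Greeter;
    use Exporter 'import';

    our @EXPORT    = qw(hello);     # exported by default
    our @EXPORT_OK = qw(shout);     # exported only on request
    our $Greeting  = 'Hello';       # package global, not exported

    sub hello { my $who = shift; return "$Greeting, $who" }
    sub shout { return uc hello(@_) }
}

# Stand-in for "use My::Greeter qw(hello shout);" with an inline package.
BEGIN { My::Greeter->import(qw(hello shout)) }

print hello('world'), "\n";         # imported into main::
print shout('world'), "\n";         # imported because it was requested
print "$My::Greeter::Greeting\n";   # reachable by its qualified name
```

Both definitions and the import happen inside C<BEGIN> blocks so that
the exported names are already known when the rest of the file is
compiled.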

Perl modules are included into your program by saying

    use Module;

or

    use Module LIST;

This is exactly equivalent to

    BEGIN { require 'Module.pm'; 'Module'->import; }

or

    BEGIN { require 'Module.pm'; 'Module'->import( LIST ); }

As a special case

    use Module ();

is exactly equivalent to

    BEGIN { require 'Module.pm'; }

All Perl module files have the extension F<.pm>. The C<use> operator
assumes this so you don't have to spell out "F<Module.pm>" in quotes.
This also helps to differentiate new modules from old F<.pl> and
F<.ph> files. Module names are also capitalized unless they're
functioning as pragmas; pragmas are in effect compiler directives,
and are sometimes called "pragmatic modules" (or even "pragmata"
if you're a classicist).

The two statements:

    require SomeModule;
    require "SomeModule.pm";

differ from each other in two ways. In the first case, any double
colons in the module name, such as C<Some::Module>, are translated
into your system's directory separator, usually "/". The second
case does not, and would have to be specified literally. The other
difference is that seeing the first C<require> clues in the compiler
that uses of indirect object notation involving "SomeModule", as
in C<$ob = purge SomeModule>, are method calls, not function calls.
(Yes, this really can make a difference.)

Because the C<use> statement implies a C<BEGIN> block, the importing
of semantics happens as soon as the C<use> statement is compiled,
before the rest of the file is compiled. This is how it is able
to function as a pragma mechanism, and also how modules are able to
declare subroutines that are then visible as list or unary operators for
the rest of the current file. This will not work if you use C<require>
instead of C<use>.
With C<require> you can get into this problem:

    require Cwd;                # make Cwd:: accessible
    $here = Cwd::getcwd();

    use Cwd;                    # import names from Cwd::
    $here = getcwd();

    require Cwd;                # make Cwd:: accessible
    $here = getcwd();           # oops! no main::getcwd()

In general, C<use Module ()> is recommended over C<require Module>,
because it determines module availability at compile time, not in the
middle of your program's execution. An exception would be if two modules
each tried to C<use> each other, and each also called a function from
that other module. In that case, it's easy to use C<require> instead.

Perl packages may be nested inside other package names, so we can have
package names containing C<::>. But if we used that package name
directly as a filename it would make for unwieldy or impossible
filenames on some systems. Therefore, if a module's name is, say,
C<Text::Soundex>, then its definition is actually found in the library
file F<Text/Soundex.pm>.

Perl modules always have a F<.pm> file, but there may also be
dynamically linked executables (often ending in F<.so>) or autoloaded
subroutine definitions (often ending in F<.al>) associated with the
module. If so, these will be entirely transparent to the user of
the module. It is the responsibility of the F<.pm> file to load
(or arrange to autoload) any additional functionality. For example,
although the POSIX module happens to do both dynamic loading and
autoloading, the user can say just C<use POSIX> to get it all.

=head2 Making your module threadsafe
X<threadsafe> X<thread safe>
X<module, threadsafe> X<module, thread safe>
X<CLONE> X<CLONE_SKIP> X<thread> X<threads> X<ithread>

Perl supports a type of threads called interpreter threads (ithreads).
These threads can be used explicitly and implicitly.

Ithreads work by cloning the data tree so that no data is shared
between different threads. These threads can be used by using the C<threads>
module or by doing fork() on win32 (fake fork() support). When a
thread is cloned all Perl data is cloned; however, non-Perl data cannot
be cloned automatically. Perl after 5.8.0 has support for the C<CLONE>
special subroutine. In C<CLONE> you can do whatever
you need to do, for example
handle the cloning of non-Perl data, if necessary.
C<CLONE> will be called once as a class method for every package that has it
defined (or inherits it). It will be called in the context of the new thread,
so all modifications are made in the new area. Currently CLONE is called with
no parameters other than the invocant package name, but code should not assume
that this will remain unchanged, as it is likely that in future extra parameters
will be passed in to give more information about the state of cloning.

If you want to CLONE all objects you will need to keep track of them per
package. This is simply done using a hash and Scalar::Util::weaken().

Perl after 5.8.7 has support for the C<CLONE_SKIP> special subroutine.
Like C<CLONE>, C<CLONE_SKIP> is called once per package; however, it is
called just before cloning starts, and in the context of the parent
thread. If it returns a true value, then no objects of that class will
be cloned; or rather, they will be copied as unblessed, undef values.
For example: if in the parent there are two references to a single blessed
hash, then in the child there will be two references to a single undefined
scalar value instead.
This provides a simple mechanism for making a module threadsafe; just add
C<sub CLONE_SKIP { 1 }> at the top of the class, and C<DESTROY()> will
now only be called once per object.
Of course, if the child thread needs
to make use of the objects, then a more sophisticated approach is
needed.

Like C<CLONE>, C<CLONE_SKIP> is currently called with no parameters other
than the invocant package name, although that may change. Similarly, to
allow for future expansion, the return value should be a single C<0> or
C<1> value.

=head1 SEE ALSO

See L<perlmodlib> for general style issues related to building Perl
modules and classes, as well as descriptions of the standard library
and CPAN, L<Exporter> for how Perl's standard import/export mechanism
works, L<perlootut> and L<perlobj> for in-depth information on
creating classes, L<perlobj> for a hard-core reference document on
objects, L<perlsub> for an explanation of functions and scoping,
and L<perlxstut> and L<perlguts> for more information on writing
extension modules.