=head1 NAME

perlmod - Perl modules (packages and symbol tables)

=head1 DESCRIPTION

=head2 Packages

Perl provides a mechanism for alternative namespaces to protect
packages from stomping on each other's variables. In fact, there's
really no such thing as a global variable in Perl. The package
statement declares the compilation unit as being in the given
namespace. The scope of the package declaration is from the
declaration itself through the end of the enclosing block, C<eval>,
or file, whichever comes first (the same scope as the my() and
local() operators). Unqualified dynamic identifiers will be in
this namespace, except for those few identifiers that if unqualified,
default to the main package instead of the current one as described
below. A package statement affects only dynamic variables--including
those you've used local() on--but I<not> lexical variables created
with my(). Typically it would be the first declaration in a file
included by the C<do>, C<require>, or C<use> operators. You can
switch into a package in more than one place; it merely influences
which symbol table is used by the compiler for the rest of that
block. You can refer to variables and filehandles in other packages
by prefixing the identifier with the package name and a double
colon: C<$Package::Variable>. If the package name is null, the
C<main> package is assumed. That is, C<$::sail> is equivalent to
C<$main::sail>.

The old package delimiter was a single quote, but double colon is now the
preferred delimiter, in part because it's more readable to humans, and
in part because it's more readable to B<emacs> macros. It also makes C++
programmers feel like they know what's going on--as opposed to using the
single quote as separator, which was there to make Ada programmers feel
like they knew what's going on.
Because the old-fashioned syntax is still
supported for backwards compatibility, if you try to use a string like
C<"This is $owner's house">, you'll be accessing C<$owner::s>; that is,
the $s variable in package C<owner>, which is probably not what you meant.
Use braces to disambiguate, as in C<"This is ${owner}'s house">.

Packages may themselves contain package separators, as in
C<$OUTER::INNER::var>. This implies nothing about the order of
name lookups, however. There are no relative packages: all symbols
are either local to the current package, or must be fully qualified
from the outer package name down. For instance, there is nowhere
within package C<OUTER> that C<$INNER::var> refers to
C<$OUTER::INNER::var>. It would treat package C<INNER> as a totally
separate global package.

Only identifiers starting with letters (or underscore) are stored
in a package's symbol table. All other symbols are kept in package
C<main>, including all punctuation variables, like $_. In addition,
when unqualified, the identifiers STDIN, STDOUT, STDERR, ARGV,
ARGVOUT, ENV, INC, and SIG are forced to be in package C<main>,
even when used for other purposes than their built-in one. If you
have a package called C<m>, C<s>, or C<y>, then you can't use the
qualified form of an identifier because it would be instead interpreted
as a pattern match, a substitution, or a transliteration.

Variables beginning with underscore used to be forced into package
main, but we decided it was more useful for package writers to be able
to use leading underscore to indicate private variables and method names.
$_ is still global though. See also
L<perlvar/"Technical Note on the Syntax of Variable Names">.

C<eval>ed strings are compiled in the package in which the eval() was
compiled. (Assignments to C<$SIG{}>, however, assume the signal
handler specified is in the C<main> package.
Qualify the signal handler
name if you wish to have a signal handler in a package.) For an
example, examine F<perl5db.pl> in the Perl library. It initially switches
to the C<DB> package so that the debugger doesn't interfere with variables
in the program you are trying to debug. At various points, however, it
temporarily switches back to the C<main> package to evaluate various
expressions in the context of the C<main> package (or wherever you came
from). See L<perldebug>.

The special symbol C<__PACKAGE__> contains the current package, but cannot
(easily) be used to construct variables.

See L<perlsub> for other scoping issues related to my() and local(),
and L<perlref> regarding closures.

=head2 Symbol Tables

The symbol table for a package happens to be stored in the hash of that
name with two colons appended. The main symbol table's name is thus
C<%main::>, or C<%::> for short. Likewise the symbol table for the nested
package mentioned earlier is named C<%OUTER::INNER::>.

The value in each entry of the hash is what you are referring to when you
use the C<*name> typeglob notation. In fact, the following have the same
effect, though the first is more efficient because it does the symbol
table lookups at compile time:

    local *main::foo    = *main::bar;
    local $main::{foo}  = $main::{bar};

(Be sure to note the B<vast> difference between the second line above
and C<local $main::foo = $main::bar>. The former is accessing the hash
C<%main::>, which is the symbol table of package C<main>. The latter is
simply assigning scalar C<$bar> in package C<main> to scalar C<$foo> of
the same package.)

You can use this to print out all the variables in a package, for
instance. The standard but antiquated F<dumpvar.pl> library and
the CPAN module Devel::Symdump make use of this.
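
As a rough sketch of that idea, the loop below walks a package's
symbol-table hash and reports which scalar, array, and hash slots are
populated. The package C<Demo>, its variables, and the helper
C<list_variables> are invented for illustration; real tools such as
Devel::Symdump handle many more cases (subroutines, filehandles,
nested packages).

```perl
use strict;
use warnings;

package Demo;                 # hypothetical package, for illustration only
our $count = 3;
our @items = (1, 2);
our %opts  = (verbose => 1);

package main;

# Return names such as '$count' or '@items' for every populated
# scalar, array, or hash slot in the named package's symbol table.
sub list_variables {
    my ($pkg) = @_;
    my @found;
    no strict 'refs';                        # symbolic lookups below
    for my $name (sort keys %{"${pkg}::"}) {
        next if $name =~ /::\z/;             # nested package, not a variable
        my $glob = \*{"${pkg}::$name"};      # reference to the typeglob
        push @found, "\$$name" if defined ${ *{$glob}{SCALAR} };
        push @found, "\@$name" if *{$glob}{ARRAY};
        push @found, "%$name"  if *{$glob}{HASH};
    }
    return @found;
}

print join(' ', list_variables('Demo')), "\n";
```

Only package variables show up this way; anything declared with my()
never enters a symbol table and so cannot be found by such a walk.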

Assignment to a typeglob performs an aliasing operation, i.e.,

    *dick = *richard;

causes variables, subroutines, formats, and file and directory handles
accessible via the identifier C<richard> also to be accessible via the
identifier C<dick>. If you want to alias only a particular variable or
subroutine, assign a reference instead:

    *dick = \$richard;

Which makes $richard and $dick the same variable, but leaves
@richard and @dick as separate arrays. Tricky, eh?

This mechanism may be used to pass and return cheap references
into or from subroutines if you don't want to copy the whole
thing. It only works when assigning to dynamic variables, not
lexicals.

    %some_hash = ();                    # can't be my()
    *some_hash = fn( \%another_hash );
    sub fn {
        local *hashsym = shift;
        # now use %hashsym normally, and you
        # will affect the caller's %another_hash
        my %nhash = (); # do what you want
        return \%nhash;
    }

On return, the reference will overwrite the hash slot in the
symbol table specified by the *some_hash typeglob. This
is a somewhat tricky way of passing around references cheaply
when you don't want to have to remember to dereference variables
explicitly.

Another use of symbol tables is for making "constant" scalars.

    *PI = \3.14159265358979;

Now you cannot alter C<$PI>, which is probably a good thing all in all.
This isn't the same as a constant subroutine, which is subject to
optimization at compile-time. A constant subroutine is one prototyped
to take no arguments and to return a constant expression. See
L<perlsub> for details on these. The C<use constant> pragma is a
convenient shorthand for these.

You can say C<*foo{PACKAGE}> and C<*foo{NAME}> to find out what name and
package the *foo symbol table entry comes from.
This may be useful
in a subroutine that gets passed typeglobs as arguments:

    sub identify_typeglob {
        my $glob = shift;
        print 'You gave me ', *{$glob}{PACKAGE}, '::', *{$glob}{NAME}, "\n";
    }
    identify_typeglob *foo;
    identify_typeglob *bar::baz;

This prints

    You gave me main::foo
    You gave me bar::baz

The C<*foo{THING}> notation can also be used to obtain references to the
individual elements of *foo. See L<perlref>.

Subroutine definitions (and declarations, for that matter) need
not necessarily be situated in the package whose symbol table they
occupy. You can define a subroutine outside its package by
explicitly qualifying the name of the subroutine:

    package main;
    sub Some_package::foo { ... }   # &foo defined in Some_package

This is just a shorthand for a typeglob assignment at compile time:

    BEGIN { *Some_package::foo = sub { ... } }

and is I<not> the same as writing:

    {
        package Some_package;
        sub foo { ... }
    }

In the first two versions, the body of the subroutine is
lexically in the main package, I<not> in Some_package. So
something like this:

    package main;

    $Some_package::name = "fred";
    $main::name = "barney";

    sub Some_package::foo {
        print "in ", __PACKAGE__, ": \$name is '$name'\n";
    }

    Some_package::foo();

prints:

    in main: $name is 'barney'

rather than:

    in Some_package: $name is 'fred'

This also has implications for the use of the SUPER:: qualifier
(see L<perlobj>).

=head2 Package Constructors and Destructors

Four special subroutines act as package constructors and destructors.
These are the C<BEGIN>, C<CHECK>, C<INIT>, and C<END> routines. The
C<sub> is optional for these routines.
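
Before each kind of block is described in detail, here is a minimal
sketch of how the four are ordered relative to ordinary run-time code.
The array C<@order> is a name made up for this example; the blocks are
deliberately written out of order to show that placement in the file
does not decide when they run.

```perl
use strict;
use warnings;

our @order;    # records when each piece of code runs (hypothetical name)

END   { push @order, 'END';   print "order: @order\n" }
INIT  { push @order, 'INIT'  }
CHECK { push @order, 'CHECK' }
BEGIN { push @order, 'BEGIN' }

push @order, 'main';    # ordinary run-time statement
```

Run as a normal script, this should report C<BEGIN>, C<CHECK>, C<INIT>,
the main code, and finally C<END>. Note that C<@order> is declared
without an initializer: a run-time assignment such as C<our @order = ()>
would wipe out what the compile-time blocks had already recorded.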

A C<BEGIN> subroutine is executed as soon as possible, that is, the moment
it is completely defined, even before the rest of the containing file
is parsed. You may have multiple C<BEGIN> blocks within a file--they
will execute in order of definition. Because a C<BEGIN> block executes
immediately, it can pull in definitions of subroutines and such from other
files in time to be visible to the rest of the file. Once a C<BEGIN>
has run, it is immediately undefined and any code it used is returned to
Perl's memory pool. This means you can't ever explicitly call a C<BEGIN>.

An C<END> subroutine is executed as late as possible, that is, after
perl has finished running the program and just before the interpreter
exits, even if it is exiting as a result of a die() function.
(But not if it's polymorphing into another program via C<exec>, or
being blown out of the water by a signal--you have to trap that yourself
(if you can).) You may have multiple C<END> blocks within a file--they
will execute in reverse order of definition; that is: last in, first
out (LIFO). C<END> blocks are not executed when you run perl with the
C<-c> switch, or if compilation fails.

Inside an C<END> subroutine, C<$?> contains the value that the program is
going to pass to C<exit()>. You can modify C<$?> to change the exit
value of the program. Beware of changing C<$?> by accident (e.g. by
running something via C<system>).

Similar to C<BEGIN> blocks, C<INIT> blocks are run just before the
Perl runtime begins execution, in "first in, first out" (FIFO) order.
For example, the code generators documented in L<perlcc> make use of
C<INIT> blocks to initialize and resolve pointers to XSUBs.

Similar to C<END> blocks, C<CHECK> blocks are run just after the
Perl compile phase ends and before the run time begins, in
LIFO order.
C<CHECK> blocks are again useful in the Perl compiler
suite to save the compiled state of the program.

When you use the B<-n> and B<-p> switches to Perl, C<BEGIN> and
C<END> work just as they do in B<awk>, as a degenerate case.
Both C<BEGIN> and C<CHECK> blocks are run when you use the B<-c>
switch for a compile-only syntax check, although your main code
is not.

=head2 Perl Classes

There is no special class syntax in Perl, but a package may act
as a class if it provides subroutines to act as methods. Such a
package may also derive some of its methods from another class (package)
by listing the other package name(s) in its global @ISA array (which
must be a package global, not a lexical).

For more on this, see L<perltoot> and L<perlobj>.

=head2 Perl Modules

A module is just a set of related functions in a library file, i.e.,
a Perl package with the same name as the file. It is specifically
designed to be reusable by other modules or programs. It may do this
by providing a mechanism for exporting some of its symbols into the
symbol table of any package using it. Or it may function as a class
definition and make its semantics available implicitly through
method calls on the class and its objects, without explicitly
exporting anything. Or it can do a little of both.
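
The class-style approach, where nothing is exported and everything is
reached through method calls, might look like the following sketch. It
also shows inheritance through C<@ISA> as described under L</"Perl Classes">.
The packages C<Animal> and C<Dog> and their methods are hypothetical,
and both are kept in one file here purely to keep the example short; in
a real module each would normally live in its own F<.pm> file.

```perl
use strict;
use warnings;

package Animal;                     # hypothetical base class
sub new {
    my ($class, %args) = @_;
    return bless { name => $args{name} }, $class;
}
sub name  { $_[0]{name} }
sub speak { my $self = shift; return $self->name . " makes a sound" }

package Dog;                        # hypothetical derived class
our @ISA = ('Animal');              # must be a package variable, not my()
sub speak { my $self = shift; return $self->name . " barks" }

package main;
my $pet = Dog->new(name => 'Rex');  # new() is inherited from Animal
print $pet->speak, "\n";            # speak() is overridden in Dog
```

Nothing is imported into C<main> here: the caller only needs the class
name, and method lookup through C<@ISA> does the rest.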

For example, to start a traditional, non-OO module called Some::Module,
create a file called F<Some/Module.pm> and start with this template:

    package Some::Module;  # assumes Some/Module.pm

    use strict;
    use warnings;

    BEGIN {
        use Exporter   ();
        our ($VERSION, @ISA, @EXPORT, @EXPORT_OK, %EXPORT_TAGS);

        # set the version for version checking
        $VERSION     = 1.00;
        # if using RCS/CVS, this may be preferred
        $VERSION = do { my @r = (q$Revision: 2.21 $ =~ /\d+/g); sprintf "%d."."%02d" x $#r, @r }; # must be all one line, for MakeMaker

        @ISA         = qw(Exporter);
        @EXPORT      = qw(&func1 &func2);
        %EXPORT_TAGS = ( );     # eg: TAG => [ qw!name1 name2! ],

        # your exported package globals go here,
        # as well as any optionally exported functions
        @EXPORT_OK   = qw($Var1 %Hashit &func3);
    }
    our @EXPORT_OK;

    # exported package globals go here
    our $Var1;
    our %Hashit;

    # non-exported package globals go here
    our @more;
    our $stuff;

    # initialize package globals, first exported ones
    $Var1   = '';
    %Hashit = ();

    # then the others (which are still accessible as $Some::Module::stuff)
    $stuff  = '';
    @more   = ();

    # all file-scoped lexicals must be created before
    # the functions below that use them.

    # file-private lexicals go here
    my $priv_var    = '';
    my %secret_hash = ();

    # here's a file-private function as a closure,
    # callable as &$priv_func;  it cannot be prototyped.
    my $priv_func = sub {
        # stuff goes here.
    };

    # make all your functions, whether exported or not;
    # remember to put something interesting in the {} stubs
    sub func1      {}    # no prototype
    sub func2()    {}    # proto'd void
    sub func3($$)  {}    # proto'd to 2 scalars

    # this one isn't exported, but could be called!
    sub func4(\%)  {}    # proto'd to 1 hash ref

    END { }       # module clean-up code here (global destructor)

    ## YOUR CODE GOES HERE

    1;  # don't forget to return a true value from the file

Then go on to declare and use your variables in functions without
any qualifications. See L<Exporter> and L<perlmodlib> for
details on mechanics and style issues in module creation.

Perl modules are included into your program by saying

    use Module;

or

    use Module LIST;

This is exactly equivalent to

    BEGIN { require Module; import Module; }

or

    BEGIN { require Module; import Module LIST; }

As a special case

    use Module ();

is exactly equivalent to

    BEGIN { require Module; }

All Perl module files have the extension F<.pm>. The C<use> operator
assumes this so you don't have to spell out "F<Module.pm>" in quotes.
This also helps to differentiate new modules from old F<.pl> and
F<.ph> files. Module names are also capitalized unless they're
functioning as pragmas; pragmas are in effect compiler directives,
and are sometimes called "pragmatic modules" (or even "pragmata"
if you're a classicist).

The two statements:

    require SomeModule;
    require "SomeModule.pm";

differ from each other in two ways. In the first case, any double
colons in the module name, such as C<Some::Module>, are translated
into your system's directory separator, usually "/". The second
case does not, and would have to be specified literally. The other
difference is that seeing the first C<require> clues in the compiler
that uses of indirect object notation involving "SomeModule", as
in C<$ob = purge SomeModule>, are method calls, not function calls.
(Yes, this really can make a difference.)
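
The first of those differences can be observed with two core modules.
This is only a small sketch: File::Basename and File::Spec are chosen
because they ship with Perl, and the path passed to C<basename> is an
arbitrary example.

```perl
use strict;
use warnings;

# Bareword form: '::' becomes the directory separator and '.pm'
# is appended before @INC is searched.
require File::Basename;

# String form: the path is taken literally, so it must be spelled out.
require "File/Spec.pm";

# Either way, the loaded file is recorded in %INC under its path-style key.
for my $key ('File/Basename.pm', 'File/Spec.pm') {
    print "$key loaded from $INC{$key}\n" if exists $INC{$key};
}

# require does not call import, so nothing was exported into main;
# the function has to be called by its fully qualified name.
print File::Basename::basename("/tmp/some/file.txt"), "\n";
```

Because neither C<require> imported anything, a bare call to
C<basename> would fail here under C<use strict>, which is the same
situation the text describes for C<use Module ()>.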

Because the C<use> statement implies a C<BEGIN> block, the importing
of semantics happens as soon as the C<use> statement is compiled,
before the rest of the file is compiled. This is how it is able
to function as a pragma mechanism, and also how modules are able to
declare subroutines that are then visible as list or unary operators for
the rest of the current file. This will not work if you use C<require>
instead of C<use>. With C<require> you can get into this problem:

    require Cwd;                # make Cwd:: accessible
    $here = Cwd::getcwd();

    use Cwd;                    # import names from Cwd::
    $here = getcwd();

    require Cwd;                # make Cwd:: accessible
    $here = getcwd();           # oops! no main::getcwd()

In general, C<use Module ()> is recommended over C<require Module>,
because it determines module availability at compile time, not in the
middle of your program's execution. An exception would be if two modules
each tried to C<use> each other, and each also called a function from
that other module. In that case, it's easy to use C<require>s instead.

Perl packages may be nested inside other package names, so we can have
package names containing C<::>. But if we used that package name
directly as a filename it would make for unwieldy or impossible
filenames on some systems. Therefore, if a module's name is, say,
C<Text::Soundex>, then its definition is actually found in the library
file F<Text/Soundex.pm>.

Perl modules always have a F<.pm> file, but there may also be
dynamically linked executables (often ending in F<.so>) or autoloaded
subroutine definitions (often ending in F<.al>) associated with the
module. If so, these will be entirely transparent to the user of
the module. It is the responsibility of the F<.pm> file to load
(or arrange to autoload) any additional functionality.
For example,
although the POSIX module happens to do both dynamic loading and
autoloading, the user can say just C<use POSIX> to get it all.

=head1 SEE ALSO

See L<perlmodlib> for general style issues related to building Perl
modules and classes, as well as descriptions of the standard library
and CPAN, L<Exporter> for how Perl's standard import/export mechanism
works, L<perltoot> and L<perltootc> for an in-depth tutorial on
creating classes, L<perlobj> for a hard-core reference document on
objects, L<perlsub> for an explanation of functions and scoping,
and L<perlxstut> and L<perlguts> for more information on writing
extension modules.