=head1 NAME

perlobj - Perl objects

=head1 DESCRIPTION

First you need to understand what references are in Perl.
See L<perlref> for that. Second, if you still find the following
reference work too complicated, a tutorial on object-oriented programming
in Perl can be found in L<perltoot> and L<perltootc>.

If you're still with us, then
here are three very simple definitions that you should find reassuring.

=over 4

=item 1.

An object is simply a reference that happens to know which class it
belongs to.

=item 2.

A class is simply a package that happens to provide methods to deal
with object references.

=item 3.

A method is simply a subroutine that expects an object reference (or
a package name, for class methods) as the first argument.

=back

We'll cover these points now in more depth.

=head2 An Object is Simply a Reference

Unlike say C++, Perl doesn't provide any special syntax for
constructors. A constructor is merely a subroutine that returns a
reference to something "blessed" into a class, generally the
class that the subroutine is defined in. Here is a typical
constructor:

    package Critter;
    sub new { bless {} }

That word C<new> isn't special. You could have written
a constructor this way, too:

    package Critter;
    sub spawn { bless {} }

This might even be preferable, because the C++ programmers won't
be tricked into thinking that C<new> works in Perl as it does in C++.
It doesn't. We recommend that you name your constructors whatever
makes sense in the context of the problem you're solving. For example,
constructors in the Tk extension to Perl are named after the widgets
they create.

One thing that's different about Perl constructors compared with those in
C++ is that in Perl, they have to allocate their own memory. (The other
thing is that they don't automatically call overridden base-class
constructors.)
The C<{}> allocates an anonymous hash containing no
key/value pairs, and returns it. The bless() takes that reference and
tells the object it references that it's now a Critter, and returns
the reference. This is for convenience, because the referenced object
itself knows that it has been blessed, and the reference to it could
have been returned directly, like this:

    sub new {
        my $self = {};
        bless $self;
        return $self;
    }

You often see such a thing in more complicated constructors
that wish to call methods in the class as part of the construction:

    sub new {
        my $self = {};
        bless $self;
        $self->initialize();
        return $self;
    }

If you care about inheritance (and you should; see
L<perlmodlib/"Modules: Creation, Use, and Abuse">),
then you want to use the two-arg form of bless
so that your constructors may be inherited:

    sub new {
        my $class = shift;
        my $self = {};
        bless $self, $class;
        $self->initialize();
        return $self;
    }

Or if you expect people to call not just C<< CLASS->new() >> but also
C<< $obj->new() >>, then use something like this. The initialize()
method used will be of whatever $class we blessed the
object into:

    sub new {
        my $this = shift;
        my $class = ref($this) || $this;
        my $self = {};
        bless $self, $class;
        $self->initialize();
        return $self;
    }

Within the class package, the methods will typically deal with the
reference as an ordinary reference. Outside the class package,
the reference is generally treated as an opaque value that may
be accessed only through the class's methods.

Although a constructor can in theory re-bless a referenced object
currently belonging to another class, this is almost certainly going
to get you into trouble. The new class is responsible for all
cleanup later. The previous blessing is forgotten, as an object
may belong to only one class at a time.
(Although of course it's
free to inherit methods from many classes.) If you find yourself
having to do this, the parent class is probably misbehaving, though.

A clarification: Perl objects are blessed. References are not. Objects
know which package they belong to. References do not. The bless()
function uses the reference to find the object. Consider
the following example:

    $a = {};
    $b = $a;
    bless $a, BLAH;
    print "\$b is a ", ref($b), "\n";

This reports $b as being a BLAH, so obviously bless()
operated on the object and not on the reference.

=head2 A Class is Simply a Package

Unlike say C++, Perl doesn't provide any special syntax for class
definitions. You use a package as a class by putting method
definitions into the class.

There is a special array within each package called @ISA, which says
where else to look for a method if you can't find it in the current
package. This is how Perl implements inheritance. Each element of the
@ISA array is just the name of another package that happens to be a
class package. The classes are searched (depth first) for missing
methods in the order that they occur in @ISA. The classes accessible
through @ISA are known as base classes of the current class.

All classes implicitly inherit from class C<UNIVERSAL> as their
last base class. Several commonly used methods are automatically
supplied in the UNIVERSAL class; see L<"Default UNIVERSAL methods"> for
more details.

If a missing method is found in a base class, it is cached
in the current class for efficiency. Changing @ISA or defining new
subroutines invalidates the cache and causes Perl to do the lookup again.

If neither the current class, its named base classes, nor the UNIVERSAL
class contains the requested method, these three places are searched
all over again, this time looking for a method named AUTOLOAD().
If an
AUTOLOAD is found, this method is called on behalf of the missing method,
setting the package global $AUTOLOAD to be the fully qualified name of
the method that was intended to be called.

If none of that works, Perl finally gives up and complains.

If you want to stop the AUTOLOAD inheritance, say simply

    sub AUTOLOAD;

and the call will die using the name of the sub being called.

Perl classes do method inheritance only. Data inheritance is left up
to the class itself. By and large, this is not a problem in Perl,
because most classes model the attributes of their object using an
anonymous hash, which serves as its own little namespace to be carved up
by the various classes that might want to do something with the object.
The only problem with this is that you can't be sure that you aren't using
a piece of the hash that is already used. A reasonable workaround
is to prepend your fieldname in the hash with the package name.

    sub bump {
        my $self = shift;
        $self->{ __PACKAGE__ . ".count"}++;
    }

=head2 A Method is Simply a Subroutine

Unlike say C++, Perl doesn't provide any special syntax for method
definition. (It does provide a little syntax for method invocation
though. More on that later.) A method expects its first argument
to be the object (reference) or package (string) it is being invoked
on. There are two ways of calling methods, which we'll call class
methods and instance methods.

A class method expects a class name as the first argument. It
provides functionality for the class as a whole, not for any
individual object belonging to the class. Constructors are often
class methods, but see L<perltoot> and L<perltootc> for alternatives.
Many class methods simply ignore their first argument, because they
already know what package they're in and don't care what package
they were invoked via.
(These aren't necessarily the same, because
class methods follow the inheritance tree just like ordinary instance
methods.) Another typical use for class methods is to look up an
object by name:

    sub find {
        my ($class, $name) = @_;
        $objtable{$name};
    }

An instance method expects an object reference as its first argument.
Typically it shifts the first argument into a "self" or "this" variable,
and then uses that as an ordinary reference.

    sub display {
        my $self = shift;
        my @keys = @_ ? @_ : sort keys %$self;
        foreach $key (@keys) {
            print "\t$key => $self->{$key}\n";
        }
    }

=head2 Method Invocation

There are two ways to invoke a method, one of which you're already
familiar with, and the other of which will look familiar. Perl 4
already had an "indirect object" syntax that you use when you say

    print STDERR "help!!!\n";

This same syntax can be used to call either class or instance methods.
We'll use the two methods defined above, the class method to look up
an object reference and the instance method to print out its attributes.

    $fred = find Critter "Fred";
    display $fred 'Height', 'Weight';

These could be combined into one statement by using a BLOCK in the
indirect object slot:

    display {find Critter "Fred"} 'Height', 'Weight';

For C++ fans, there's also a syntax using -> notation that does exactly
the same thing. The parentheses are required if there are any arguments.

    $fred = Critter->find("Fred");
    $fred->display('Height', 'Weight');

or in one statement,

    Critter->find("Fred")->display('Height', 'Weight');

There are times when one syntax is more readable, and times when the
other syntax is more readable. The indirect object syntax is less
cluttered, but it has the same ambiguity as ordinary list operators.
Indirect object method calls are usually parsed using the same rule as list
operators: "If it looks like a function, it is a function". (Presuming
for the moment that you think two words in a row can look like a
function name. C++ programmers seem to think so with some regularity,
especially when the first word is "new".) Thus, the parentheses of

    new Critter ('Barney', 1.5, 70)

are assumed to surround ALL the arguments of the method call, regardless
of what comes after. Saying

    new Critter ('Bam' x 2), 1.4, 45

would be equivalent to

    Critter->new('Bam' x 2), 1.4, 45

which is unlikely to do what you want. Confusingly, however, this
rule applies only when the indirect object is a bareword package name,
not when it's a scalar, a BLOCK, or a C<Package::> qualified package name.
In those cases, the arguments are parsed in the same way as an
indirect object list operator like print, so

    new Critter:: ('Bam' x 2), 1.4, 45

is the same as

    Critter::->new(('Bam' x 2), 1.4, 45)

For more reasons why the indirect object syntax is ambiguous, see
L<"WARNING"> below.

There are times when you wish to specify which class's method to use.
Here you can call your method as an ordinary subroutine
call, being sure to pass the requisite first argument explicitly:

    $fred = MyCritter::find("Critter", "Fred");
    MyCritter::display($fred, 'Height', 'Weight');

Unlike method calls, function calls don't consider inheritance.
If you wish
merely to specify that Perl should I<START> looking for a method in a
particular package, use an ordinary method call, but qualify the method
name with the package like this:

    $fred = Critter->MyCritter::find("Fred");
    $fred->MyCritter::display('Height', 'Weight');

If you're trying to control where the method search begins I<and> you're
executing in the class itself, then you may use the SUPER pseudo class,
which says to start looking in your base class's @ISA list without having
to name it explicitly:

    $self->SUPER::display('Height', 'Weight');

Please note that the C<SUPER::> construct is meaningful I<only> within the
class.

Sometimes you want to call a method when you don't know the method name
ahead of time. You can use the arrow form, replacing the method name
with a simple scalar variable containing the method name or a
reference to the function.

    $method = $fast ? "findfirst" : "findbest";
    $fred->$method(@args);          # call by name

    if ($coderef = $fred->can($parent . "::findbest")) {
        $self->$coderef(@args);     # call by coderef
    }

=head2 WARNING

While indirect object syntax may well be appealing to English speakers and
to C++ programmers, be not seduced! It suffers from two grave problems.

The first problem is that an indirect object is limited to a name,
a scalar variable, or a block, because it would have to do too much
lookahead otherwise, just like any other postfix dereference in the
language. (These are the same quirky rules as are used for the filehandle
slot in functions like C<print> and C<printf>.) This can lead to horribly
confusing precedence problems, as in these next two lines:

    move $obj->{FIELD};                 # probably wrong!
    move $ary[$i];                      # probably wrong!

Those actually parse as the very surprising:

    $obj->move->{FIELD};                # Well, lookee here
    $ary->move([$i]);                   # Didn't expect this one, eh?

Rather than what you might have expected:

    $obj->{FIELD}->move();              # You should be so lucky.
    $ary[$i]->move;                     # Yeah, sure.

The left side of ``->'' is not so limited, because it's an infix operator,
not a postfix operator.

As if that weren't bad enough, think about this: Perl must guess I<at
compile time> whether C<name> and C<move> above are functions or methods.
Usually Perl gets it right, but when it doesn't, you get a function
call compiled as a method, or vice versa. This can introduce subtle
bugs that are hard to unravel. For example, calling a method C<new>
in indirect notation--as C++ programmers are so wont to do--can
be miscompiled into a subroutine call if there's already a C<new>
function in scope. You'd end up calling the current package's C<new>
as a subroutine, rather than the desired class's method. The compiler
tries to cheat by remembering bareword C<require>s, but the grief if it
messes up just isn't worth the years of debugging it would likely take
you to track such subtle bugs down.

The infix arrow notation using ``C<< -> >>'' doesn't suffer from either
of these disturbing ambiguities, so we recommend you use it exclusively.

=head2 Default UNIVERSAL methods

The C<UNIVERSAL> package automatically contains the following methods that
are inherited by all other classes:

=over 4

=item isa(CLASS)

C<isa> returns I<true> if its object is blessed into package C<CLASS>
or into a subclass of C<CLASS>.

C<isa> is also exportable and can be called as a sub with two arguments. This
makes it possible to check what a reference points to. Example:

    use UNIVERSAL qw(isa);

    if(isa($ref, 'ARRAY')) {
        #...
    }

=item can(METHOD)

C<can> checks to see if its object has a method called C<METHOD>.
If it does, then a reference to the sub is returned; if it does not, then
I<undef> is returned.

=item VERSION( [NEED] )

C<VERSION> returns the version number of the class (package). If the
NEED argument is given then it will check that the current version (as
defined by the $VERSION variable in the given package) is not less than
NEED; it will die if this is not the case. This method is normally
called as a class method. This method is called automatically by the
C<VERSION> form of C<use>.

    use A 1.2 qw(some imported subs);
    # implies:
    A->VERSION(1.2);

=back

B<NOTE:> C<can> directly uses Perl's internal code for method lookup, and
C<isa> uses a very similar method and caching strategy. This may cause
strange effects if the Perl code dynamically changes @ISA in any package.

You may add other methods to the UNIVERSAL class via Perl or XS code.
You do not need to C<use UNIVERSAL> to make these methods
available to your program. This is necessary only if you wish to
have C<isa> available as a plain subroutine in the current package.

=head2 Destructors

When the last reference to an object goes away, the object is
automatically destroyed. (This may even be after you exit, if you've
stored references in global variables.) If you want to capture control
just before the object is freed, you may define a DESTROY method in
your class. It will automatically be called at the appropriate moment,
and you can do any extra cleanup you need to do. Perl passes a reference
to the object under destruction as the first (and only) argument. Beware
that the reference is a read-only value, and cannot be modified by
manipulating C<$_[0]> within the destructor. The object itself (i.e.
the thingy the reference points to, namely C<${$_[0]}>, C<@{$_[0]}>,
C<%{$_[0]}> etc.) is not similarly constrained.
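
As a minimal sketch of the behavior just described, the following
defines a DESTROY method and records the order in which things happen.
The C<Tempfile> class, its C<NAME> field, and the C<@log> array are
hypothetical names invented for this illustration; they are not part
of Perl or of any module.

```perl
package Tempfile;   # hypothetical class, for illustration only

our @log;           # records events in the order they occur

sub new {
    my ($class, $name) = @_;
    push @log, "open $name";
    return bless { NAME => $name }, $class;
}

sub DESTROY {
    my $self = shift;               # the object under destruction
    push @log, "close $self->{NAME}";
}

package main;

{
    my $t = Tempfile->new("scratch");
    push @Tempfile::log, "in block";
}                                   # last reference goes away here
push @Tempfile::log, "after block";

print "$_\n" for @Tempfile::log;
```

Because C<$t> holds the only reference, the destructor runs as the
block is left, so the "close" entry is logged before "after block".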

If you arrange to re-bless the reference before the destructor returns,
perl will again call the DESTROY method for the re-blessed object after
the current one returns. This can be used for clean delegation of
object destruction, or for ensuring that destructors in the base classes
of your choosing get called. Explicitly calling DESTROY is also possible,
but is rarely needed.

Do not confuse the previous discussion with how objects I<CONTAINED> in the current
one are destroyed. Such objects will be freed and destroyed automatically
when the current object is freed, provided no other references to them exist
elsewhere.

=head2 Summary

That's about all there is to it. Now you need just to go off and buy a
book about object-oriented design methodology, and bang your forehead
with it for the next six months or so.

=head2 Two-Phased Garbage Collection

For most purposes, Perl uses a fast and simple, reference-based
garbage collection system. That means there's an extra
dereference going on at some level, so if you haven't built
your Perl executable using your C compiler's C<-O> flag, performance
will suffer. If you I<have> built Perl with C<cc -O>, then this
probably won't matter.

A more serious concern is that unreachable memory with a non-zero
reference count will not normally get freed. Therefore, this is a bad
idea:

    {
        my $a;
        $a = \$a;
    }

Even though $a I<should> go away, it can't. When building recursive data
structures, you'll have to break the self-reference yourself explicitly
if you don't care to leak.
For example, here's a self-referential
node such as one might use in a sophisticated tree structure:

    sub new_node {
        my $self = shift;
        my $class = ref($self) || $self;
        my $node = {};
        $node->{LEFT} = $node->{RIGHT} = $node;
        $node->{DATA} = [ @_ ];
        return bless $node => $class;
    }

If you create nodes like that, they (currently) won't go away unless you
break their self reference yourself. (In other words, this is not to be
construed as a feature, and you shouldn't depend on it.)

Almost.

When an interpreter thread finally shuts down (usually when your program
exits), then a rather costly but complete mark-and-sweep style of garbage
collection is performed, and everything allocated by that thread gets
destroyed. This is essential to support Perl as an embedded or a
multithreadable language. For example, this program demonstrates Perl's
two-phased garbage collection:

    #!/usr/bin/perl
    package Subtle;

    sub new {
        my $test;
        $test = \$test;
        warn "CREATING " . \$test;
        return bless \$test;
    }

    sub DESTROY {
        my $self = shift;
        warn "DESTROYING $self";
    }

    package main;

    warn "starting program";
    {
        my $a = Subtle->new;
        my $b = Subtle->new;
        $$a = 0;  # break selfref
        warn "leaving block";
    }

    warn "just exited block";
    warn "time to die...";
    exit;

When run as F</tmp/test>, the following output is produced:

    starting program at /tmp/test line 18.
    CREATING SCALAR(0x8e5b8) at /tmp/test line 7.
    CREATING SCALAR(0x8e57c) at /tmp/test line 7.
    leaving block at /tmp/test line 23.
    DESTROYING Subtle=SCALAR(0x8e5b8) at /tmp/test line 13.
    just exited block at /tmp/test line 26.
    time to die... at /tmp/test line 27.
    DESTROYING Subtle=SCALAR(0x8e57c) during global destruction.

Notice that "global destruction" bit there?
That's the thread
garbage collector reaching the unreachable.

Objects are always destructed, even when regular refs aren't. Objects
are destructed in a separate pass before ordinary refs just to
prevent object destructors from using refs that have been themselves
destructed. Plain refs are only garbage-collected if the destruct level
is greater than 0. You can test the higher levels of global destruction
by setting the PERL_DESTRUCT_LEVEL environment variable, presuming
C<-DDEBUGGING> was enabled during perl build time.

A more complete garbage collection strategy will be implemented
at a future date.

In the meantime, the best solution is to create a non-recursive container
class that holds a pointer to the self-referential data structure.
Define a DESTROY method for the containing object's class that manually
breaks the circularities in the self-referential structure.

=head1 SEE ALSO

A kinder, gentler tutorial on object-oriented programming in Perl can
be found in L<perltoot>, L<perlboot> and L<perltootc>. You should
also check out L<perlbot> for other object tricks, traps, and tips, as
well as L<perlmodlib> for some style guides on constructing both
modules and classes.