In the previous two chapters, we looked at basic object creation and manipulation. In this chapter, we’ll look at an equally important topic: what happens when objects go away.
As you saw in Chapter 4, when the last reference to a Perl data structure goes away, Perl automatically reclaims the memory of that data structure, including destroying any links to other data. Of course, that in turn may cause other (“contained”) structures to be destroyed as well.
By default, objects work in this manner because objects use the same reference structure to make more complex objects. An object built of a hash reference is destroyed when the last reference to that hash goes away. If the values of the hash elements are also references, they’re similarly removed, possibly causing further destruction.
Suppose an object uses a temporary file to hold data that won’t fit entirely in memory. The filehandle for this temporary file can be included as one of the object’s instance variables. While the normal object destruction sequence will properly close the handle, you still have the temporary file on disk unless you take further action.
To perform
the proper cleanup operations when an object is destroyed, you need
to be notified when that happens. Thankfully, Perl provides such
notification upon request. You can request this notification by
giving the object a DESTROY
method.
When the last reference to an object, say $bessie
,
is destroyed, Perl invokes:
$bessie->DESTROY
This method call is like most other method calls: Perl starts at the class of the object and works its way up the inheritance hierarchy until it finds a suitable method. However, unlike other method calls, there’s no error if no suitable method is found.[43]
For
example, going back to the Animal
class defined in
Chapter 9, you can add a
DESTROY
method to know when objects go away,
purely for debugging purposes:
## in Animal sub DESTROY { my $self = shift; print "[", $self->name, " has died.]\n"; }
Now when you create any Animal
s in the program,
you get notification as they leave. For example:
## include animal classes from previous chapter... sub feed_a_cow_named { my $name = shift; my $cow = Cow->named($name); $cow->eat("grass"); print "Returning from the subroutine.\n"; # $cow is destroyed here } print "Start of program.\n"; my $outer_cow = Cow->named("Bessie"); print "Now have a cow named ", $outer_cow->name, ".\n"; feed_a_cow_named("Gwen"); print "Returned from subroutine.\n";
This prints:
Start of program. Now have a cow named Bessie. Gwen eats grass. Returning from the subroutine. [Gwen has died.] Returned from subroutine. [Bessie has died.]
Note that Gwen is active inside the subroutine. However, as the
subroutine exits, Perl notices there are no references to Gwen;
Gwen’s DESTROY
method is then
automatically invoked, printing the Gwen
has
died
message.
What happens at the end of the program? Since objects don’t live beyond the end of the program, Perl makes one final pass over all remaining data and destroys it. This is true whether the data is held in lexical variables or package global variables. Because Bessie was still alive at the end of the program, she needed to be recycled, and so you get the message for Bessie after all other steps in the program are complete.[44]
If an object holds another object
(say, as an element of an array or the value of a hash element), the
containing object is DESTROY
ed before any of the
contained objects begin their discarding process. This is reasonable
because the containing object may need to reference its contents in
order to be cleanly discarded. To illustrate this,
let’s build a
“barn” and tear it down. And just
to be interesting, we’ll make the barn a blessed
array reference, not a hash reference.
{ package Barn; sub new { bless [ ], shift } sub add { push @{+shift}, shift } sub contents { @{+shift} } sub DESTROY { my $self = shift; print "$self is being destroyed...\n"; for ($self->contents) { print " ", $_->name, " goes homeless.\n"; } } }
Here, we’re really being minimalistic in the object definition. To create a new barn, simply bless an empty array reference into the class name passed as the first parameter. Adding an animal just pushes it to the back of the barn. Asking for the barn contents merely dereferences the object array reference to return the contents.[45]
The fun part is the destructor. Let’s take the reference to ourselves, display a debugging message about the particular barn being destroyed, and then ask for the name of each inhabitant in turn. In action, this would be:
my $barn = Barn->new; $barn->add(Cow->named("Bessie")); $barn->add(Cow->named("Gwen")); print "Burn the barn:\n"; $barn = undef; print "End of program.\n";
This prints:
Burn the barn: Barn=ARRAY(0x541c) is being destroyed... Bessie goes homeless. Gwen goes homeless. [Gwen has died.] [Bessie has died.] End of program.
Note that the barn is destroyed first, letting you get the name of the inhabitants cleanly. However, once the barn is gone, the inhabitants have no additional references, so they also go away, and thus their destructors are also invoked. Compare that with the cows having a life outside the barn:
my $barn = Barn->new; my @cows = (Cow->named("Bessie"), Cow->named("Gwen")); $barn->add($_) for @cows; print "Burn the barn:\n"; $barn = undef; print "Lose the cows:\n"; @cows = ( ); print "End of program.\n";
This produces:
Burn the barn: Barn=ARRAY(0x541c) is being destroyed... Bessie goes homeless. Gwen goes homeless. Lose the cows: [Gwen has died.] [Bessie has died.] End of program.
The cows will now continue to live until the only other reference to
the cows (from the @cows
array) goes away.
The references to the cows are removed only when the barn destructor is completely finished. In some cases, you may wish instead to shoo the cows out of the barn as you notice them. In this case, it’s as simple as destructively altering the barn array, rather than iterating over it.[46]
Let’s alter the
Barn
to Barn2
to illustrate
this:
{ package Barn2; sub new { bless [ ], shift } sub add { push @{+shift}, shift } sub contents { @{+shift} } sub DESTROY { my $self = shift; print "$self is being destroyed...\n"; while (@$self) { my $homeless = shift @$self; print " ", $homeless->name, " goes homeless.\n"; } } }
Now use it in the previous scenarios:
my $barn = Barn2->new; $barn->add(Cow->named("Bessie")); $barn->add(Cow->named("Gwen")); print "Burn the barn:\n"; $barn = undef; print "End of program.\n";
This produces:
Burn the barn: Barn2=ARRAY(0x541c) is being destroyed... Bessie goes homeless. [Bessie has died.] Gwen goes homeless. [Gwen has died.] End of program.
As you can see, Bessie had no home by being booted out of the barn immediately, so she also died. (Poor Gwen suffers the same fate.) There were no references to her at that moment, even before the destructor for the barn was complete.
Thus, back to the temporary file problem. If you have an associated temporary file for an animal, you merely need to close it and delete the file during the destructor:
## in Animal use File::Temp qw(tempfile); sub named { my $class = shift; my $name = shift; my $self = { Name => $name, Color => $class->default_color }; ## new code here... my ($fh, $filename) = tempfile( ); $self->{temp_fh} = $fh; $self->{temp_filename} = $filename; ## .. to here bless $self, $class; }
You now have a filehandle and its
filename stored as instance variables of the
Animal
(or any class derived from
Animal
). In the destructor, close it down, and
delete the file:[47]
sub DESTROY { my $self = shift; my $fh = $self->{temp_fh}; close $fh; unlink $self->{temp_filename}; print "[", $self->name, " has died.]\n"; }
When the last reference to the
Animal
-ish object is destroyed (even at the end of
the program), also automatically remove the temporary file to avoid a
mess.
Because the destructor method is inherited, you can also override and extend superclass methods. For example, we’ll decide the dead horses need a further use:
## in Horse sub DESTROY { my $self = shift; $self->SUPER::DESTROY; print "[", $self->name, " has gone off to the glue factory.]\n"; } my @tv_horses = map Horse->named($_), ("Trigger", "Mr. Ed"); $_->eat("an apple") for @tv_horses; # their last meal print "End of program.\n";
This prints:
Trigger eats an apple. Mr. Ed eats an apple. End of program. [Mr. Ed has died.] [Mr. Ed has gone off to the glue factory.] [Trigger has died.] [Trigger has gone off to the glue factory.]
We’ll feed each horse a last meal; at the end of the program, each horse’s destructor is called.
The first step of this destructor is to call the parent destructor. Why is this important? Without calling the parent destructor, the steps taken by superclasses of this class will not properly execute. That’s not much if it’s simply a debugging statement as we’ve shown, but if it was the “delete the temporary file” cleanup method, you wouldn’t have deleted that file!
So, the rule is:
Always include a call to
$self->SUPER::DESTROY
in your destructors (even if you don’t yet have any base/parent classes). |
Whether you call it at the beginning or the end of your own destructor is a matter of hotly contested debate. If your derived class needs some superclass instance variables, you should probably call the superclass destructor after you complete your operations because the superclass destructor will likely alter them in annoying ways. On the other hand, in the example, we called the superclass destructor before the added behavior, because we wanted the superclass behavior first. There’s no rule of thumb, even. Sorry.
The arrow syntax used to invoke a method is sometimes called the direct object syntax because there’s also the indirect object syntax, also known as the “only works sometimes” syntax, for reasons explained in a moment. When you write:
Class->class_method(@args); $instance->instance_method(@other);
you can generally replace it with:
classmethod Class @args; instancemethod $instance @other;
A typical use of this is with the
new
method, replacing:
my $obj = Some::Class->new(@constructor_params);
with:
my $obj = new Some::Class @constructor_params;
making the C++ people feel right at home. Of course, in Perl,
there’s nothing special about the name
new
, but at least the syntax is hauntingly
familiar.
Why the previous “generally” caveat? Well, if the instance is something more complicated than a simple scalar variable:
$somehash->{$somekey}->[42]->instance_method(@parms);
then you can’t just swap it around like:
instance_method $somehash->{$somekey}->[42] @parms;
because the only things acceptable to indirect object syntax are a bareword (e.g., a class name), a simple scalar variable, or braces denoting a block returning either a blessed reference or a classname.[48]
This means you have to write it like so:
instance_method { $somehash->{$somekey}->[42] } @parms;
And that goes from simple to uglier in one step. There’s another downside: ambiguous parsing. When we developed the classroom materials concerning indirect object references, we wrote:
my $cow = Cow->named("Bessie"); print name $cow, " eats.\n";
because we were thinking about the indirect object equivalents for:
my $cow = Cow->named("Bessie"); print $cow->name, " eats.\n";
However, the latter works; the former
doesn’t. We were getting no output. Finally, we
enabled warnings (via -w
on the command
line)[49] and got this interesting series
of messages:
Unquoted string "name" may clash with future reserved word at ./foo line 92. Name "main::name" used only once: possible typo at ./foo line 92. print( ) on unopened filehandle name at ./foo line 92.
Ahh, so that line was being parsed as:
print name ($cow, " eats.\n");
In other words, print the list of items to the filehandle named
name
. That’s clearly not what we
wanted, so we had to add additional syntax to disambiguate the call.
This leads us to our next strong suggestion:
Use direct object syntax at all times, except perhaps for the constructor call. |
That exception acknowledges that most people write
new
Class
..
. rather than
Class->new(...)
and that most of us are fine
with that. However, there are circumstances in which even that can
lead to ambiguity (e.g., when a subroutine named
new
has been seen, and the class name itself has
not been seen as a package). When in doubt, ignore indirect object
syntax. Your maintenance programmer will thank you.
One of the nice things about
using a hash for a data structure is that derived classes can add
additional instance variables without the superclass needing to know
of their addition. For example, let’s derive a
RaceHorse
class that is everything a
Horse
is but also tracks its win/place/show/lose
standings. The first part of this is trivial:
{ package RaceHorse; our @ISA = qw(Horse); ... }
You’ll
also want to initialize “no wins of no
races” when you create the
RaceHorse
. You do this by extending the
named
subroutine and adding four additional fields
(wins
, places
,
shows
, losses
, for first-,
second-, and third-place finishes, and none of the above):
{ package RaceHorse; our @ISA = qw(Horse); ## extend parent constructor: sub named { my $self = shift->SUPER::named(@_); $self->{$_} = 0 for qw(wins places shows losses); $self; } }
Here, you pass all parameters
to the superclass, which should return a fully formed
Horse
. However, because you pass
RaceHorse
as the class, it’d be
already blessed into the RaceHorse
class.[50]
Next, add the four instance variables that go beyond those defined in
the superclass, setting their initial values to 0. Finally, return
the modified RaceHorse
to the caller.
It’s important to note
here that we’ve actually “opened
the box” a bit while writing this derived class. You
know that the superclass uses a hash reference and that the
superclass hierarchy doesn’t use the four names
chosen for a derived class. This is because
RaceHorse
will be a
“friend” class (in C++ or Java
terminology), accessing the instance variables directly. If the
maintainer of Horse
or Animal
ever changes representation or names of variables, there could be a
collision, which might go undetected except for that important day
when you’re showing off your code to the investors.
Things get even more interesting if the hashref is changed to an
arrayref as well.
One way to decouple this
dependency is to use composition rather than inheritance as a way to
create a derived class. In this example, you need to make a
Horse
object an instance variable of a
RaceHorse
and put the rest of the data in separate
instance variables. You also need to pass any inherited method calls
on the RaceHorse
down to the
Horse
instance, through delegation. However, even
though Perl can certainly support the needed operations, that
approach is usually slower and more cumbersome. Enough on that for
this discussion,
however.
Next, let’s provide some access methods:
{ package RaceHorse; our @ISA = qw(Horse); ## extend parent constructor: sub named { my $self = shift->SUPER::named(@_); $self->{$_} = 0 for qw(wins places shows losses); $self; } sub won { shift->{wins}++; } sub placed { shift->{places}++; } sub showed { shift->{shows}++; } sub lost { shift->{losses}++; } sub standings { my $self = shift; join ", ", map "$self->{$_} $_", qw(wins places shows losses); } } my $racer = RaceHorse->named("Billy Boy"); # record the outcomes: 3 wins, 1 show, 1 loss $racer->won; $racer->won; $racer->won; $racer->showed; $racer->lost; print $racer->name, " has standings of: ", $racer->standings, ".\n";
This prints:
Billy Boy has standings of: 3 wins, 0 places, 1 shows, 1 losses. [Billy Boy has died.] [Billy Boy has gone off to the glue factory.]
Note that we’re
still getting the Animal
and
Horse
destructor. The superclasses are unaware
that we’ve added four additional elements to the
hash and so, still function as they always have.
What if you want to iterate over all
the animals we’ve made so far? Animals may exist all
over the program namespace and are lost once they’re
handed back from the named
constructor method.
However, you can record the created animal in a hash and iterate over that hash. The key to the hash can be the stringified form of the animal reference,[51] while the value can be the actual reference, allowing you to access its name or class.
For example,
let’s extend named
as
follows:
## in Animal our %REGISTRY; sub named { my $class = shift; my $name = shift; my $self = { Name => $name, Color => $class->default_color }; bless $self, $class; $REGISTRY{$self} = $self; # also returns $self }
The
uppercase name for %REGISTRY
is a reminder that
this variable is more global than most variables. In this case,
it’s a metavariable that contains information about
many instances.
Note that when used as a key, $self
stringifies, which means it turns into a
string unique to the object.
We also need to add a new method:
sub registered { return map { "a ".ref($_)." named ".$_->name } values %REGISTRY; }
Now you can see all the animals we’ve made:
my @cows = map Cow->named($_), qw(Bessie Gwen); my @horses = map Horse->named($_), ("Trigger", "Mr. Ed"); my @racehorses = RaceHorse->named("Billy Boy"); print "We've seen:\n", map(" $_\n", Animal->registered); print "End of program.\n";
This prints:
We've seen: a RaceHorse named Billy Boy a Horse named Mr. Ed a Horse named Trigger a Cow named Gwen a Cow named Bessie End of program. [Billy Boy has died.] [Billy Boy has gone off to the glue factory.] [Bessie has died.] [Gwen has died.] [Trigger has died.] [Trigger has gone off to the glue factory.] [Mr. Ed has died.] [Mr. Ed has gone off to the glue factory.]
Note that the animals die at their proper time because the variables holding the animals are all being destroyed at the final step. Or are they?
The %REGISTRY
variable also holds a reference to each animal. So even if you toss
away the containing variables:
{ my @cows = map Cow->named($_), qw(Bessie Gwen); my @horses = map Horse->named($_), ("Trigger", "Mr. Ed"); my @racehorses = RaceHorse->named("Billy Boy"); } print "We've seen:\n", map(" $_\n", Animal->registered); print "End of program.\n";
you’ll still see the same result. The animals aren’t destroyed even though none of the code is holding the animals. At first glance, it looks like you can fix this by altering the destructor:
## in Animal sub DESTROY { my $self = shift; print "[", $self->name, " has died.]\n"; delete $REGISTRY{$self}; }
But this still results in the same output. Why? Because the destructor isn’t called until the last reference is gone, but the last reference won’t be destroyed until the destructor is called.[52]
One solution for fairly recent Perl versions[53] is to use weak references. A weak reference is a reference that doesn’t count as far as the reference counting, uh, counts. It’s best illustrated by example.
The weak
reference mechanism is already built into the core of recent Perl
versions, but as of this writing, the user interface is still
accessed by a CPAN module called WeakRef
. After
installing this module,[54] you can update the constructor as follows:
## in Animal use WeakRef qw(weaken); ## new sub named { ref(my $class = shift) and croak "class only"; my $name = shift; my $self = { Name => $name, Color => $class->default_color }; bless $self, $class; $REGISTRY{$self} = $self; weaken($REGISTRY{$self}); $self; }
When Perl counts the number of active references to a
thingy,[55] it
won’t count any that have been converted to weak
references by weaken
. If all ordinary references
are gone, Perl deletes the thingy and turns any weak references to
undef
.
Now you’ll get the right behavior for:
my @horses = map Horse->named($_), ("Trigger", "Mr. Ed"); print "alive before block:\n", map(" $_\n", Animal->registered); { my @cows = map Cow->named($_), qw(Bessie Gwen); my @racehorses = RaceHorse->named("Billy Boy"); print "alive inside block:\n", map(" $_\n", Animal->registered); } print "alive after block:\n", map(" $_\n", Animal->registered); print "End of program.\n";
This prints:
alive before block: a Horse named Trigger a Horse named Mr. Ed alive inside block: a RaceHorse named Billy Boy a Cow named Gwen a Horse named Trigger a Horse named Mr. Ed a Cow named Bessie [Billy Boy has died.] [Billy Boy has gone off to the glue factory.] [Gwen has died.] [Bessie has died.] alive after block: a Horse named Trigger a Horse named Mr. Ed End of program. [Mr. Ed has died.] [Mr. Ed has gone off to the glue factory.] [Trigger has died.] [Trigger has gone off to the glue factory.]
Notice that the racehorses and cows die at the end of the block, but the ordinary horses die at the end of the program. Success!
Weak references can also solve some memory leak issues. For example, suppose an animal wanted to record its pedigree. The parents might want to hold references to all their children while each child might want to hold references to each parent.
One or the other (or even both) of these
links can be weakened. If the link to the child is weakened, the
child can be destroyed when all other references are lost, and the
parent’s link simply becomes
undef
(or you can set a destructor to completely
remove it). However, a parent won’t disappear as
long as it still has offspring. Similarly, if the link to the parent
is weakened, you’ll simply get it as
undef
when the parent is no longer referenced by
other data structures. It’s really quite
flexible.[56]
Without weakening, as soon as any parent-child relationship is created, both the parent and the child remain in memory until the final global destruction phase, regardless of the destruction of the other structures holding either the parent or the child.
Be aware though: weak references should be used carefully, not just thrown at a problem of circular references. If you destroy data that is held by a weak reference before its time, you may have some very confusing programming problems to solve and debug.
The answers for all exercises can be found in Section A.9.
Modify the RaceHorse
class to get the previous
standings from a DBM hash (keyed by the horse’s
name) when the horse is created, and update the standings when the
horse is destroyed. For example, running this program four times:
my $runner = RaceHorse->named("Billy Boy"); $runner->won; print $runner->name, " has standings ", $runner->standings, ".\n";
should show four additional wins. Make sure that a
RaceHorse
still does everything a normal
Horse
does otherwise.
For simplicity, use four space-separated integers for the value in the DBM hash.
[43] Normally, your own method calls will cause an error if the method isn’t found. If you want to prevent that, just put a do-nothing method into the base class.
[44] This is just after the
END
blocks are executed and follows the same rules
as END
blocks: there must be a nice exit of the
program rather than an abrupt end. If Perl runs out of memory, all
bets are off.
[45] Did you
wonder why there’s a plus sign (+) before
shift
in two of those subroutines?
That’s due to one of the quirks in
Perl’s syntax. If the code were simply
@{shift}
, because the curly braces contain nothing
but a bareword, it would be interpreted as a soft reference:
@{"shift"}
. In Perl, the unary plus (a plus sign
at the beginning of a term) is defined to do nothing (not even
turning what follows into a number), just so it can distinguish cases
such as this.
[46] If
you’re using a hash instead, use
delete
on the elements you wish to process
immediately.
[47] As it turns out, you can tell
File::Temp
to do this automatically, but then we
wouldn’t be able to illustrate doing it manually.
Doing it manually allows you to store a summary of the information
from the temporary file into a database. However,
that’s too complex to show here.
[48] Astute readers will note that these are the same rules as for an indirect filehandle syntax, from which indirect object syntax directly mirrors, as well as the rules for specifying a reference to be dereferenced.
[49] Using -w
should be the first
step when Perl does something you don’t understand.
Or maybe it should be the zeroth because you should normally have
-w
in effect whenever you’re
developing code.
[50] Similar to the way the
Animal
constructor creates a
Horse
, not an Animal
, when
passed Horse
as the class.
[51] Or any other convenient and unique string.
[52] We’d make a reference to chickens and eggs,
but that would introduce yet another derived class to
Animal
.
[53] 5.6 and later.
[54] See Chapter 15 for information on installing modules.
[55] A thingy as defined in Perl’s own documentation, is anything a reference points to, such as an object. If you are an especially boring person, you could call it a referent instead.
[56] When using weak references, always make
sure you don’t dereference a weakened reference that
has turned to undef
.