Chapter 10. Object Destruction

In the previous two chapters, we looked at basic object creation and manipulation. In this chapter, we’ll look at an equally important topic: what happens when objects go away.

As you saw in Chapter 4, when the last reference to a Perl data structure goes away, Perl automatically reclaims the memory of that data structure, including destroying any links to other data. Of course, that in turn may cause other (“contained”) structures to be destroyed as well.

By default, objects work in this manner because objects use the same reference structure to make more complex objects. An object built of a hash reference is destroyed when the last reference to that hash goes away. If the values of the hash elements are also references, they’re similarly removed, possibly causing further destruction.

Suppose an object uses a temporary file to hold data that won’t fit entirely in memory. The filehandle for this temporary file can be included as one of the object’s instance variables. While the normal object destruction sequence will properly close the handle, you still have the temporary file on disk unless you take further action.

To perform the proper cleanup operations when an object is destroyed, you need to be notified when that happens. Thankfully, Perl provides such notification upon request. You can request this notification by giving the object a DESTROY method.

When the last reference to an object, say $bessie, is destroyed, Perl invokes:

$bessie->DESTROY

This method call is like most other method calls: Perl starts at the class of the object and works its way up the inheritance hierarchy until it finds a suitable method. However, unlike other method calls, there’s no error if no suitable method is found.^[43]
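To make the lookup rule concrete, here is a minimal sketch. The Demo:: classes and the @Demo::LOG array are hypothetical names invented for this illustration, not part of the Animal hierarchy. It shows a derived class picking up its parent's DESTROY, and a class with no DESTROY at all being reclaimed without complaint:

```perl
use strict;
use warnings;

# Hypothetical classes for illustration; none of these appear in the chapter.
{ package Demo::Base;
  sub new     { bless {}, shift }
  # Record which class the dying object belongs to.
  sub DESTROY { push @Demo::LOG, ref($_[0]) . " handled by Demo::Base::DESTROY" }
}
{ package Demo::Derived;
  our @ISA = ('Demo::Base');            # inherits new and DESTROY
}
{ package Demo::Plain;
  sub new { bless {}, shift }           # no DESTROY anywhere in its hierarchy
}

{ my $derived = Demo::Derived->new; }   # DESTROY is found in the parent class
{ my $plain   = Demo::Plain->new;   }   # nothing to call, and no error

print "$_\n" for @Demo::LOG;
```

Only the Demo::Derived object leaves a log entry; the Demo::Plain object simply disappears.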

For example, going back to the Animal class defined in Chapter 9, you can add a DESTROY method to know when objects go away, purely for debugging purposes:

## in Animal
sub DESTROY {
  my $self = shift;
  print "[", $self->name, " has died.]\n";
}

Now when you create any Animals in the program, you get notification as they leave. For example:

## include animal classes from previous chapter...

sub feed_a_cow_named {
  my $name = shift;
  my $cow = Cow->named($name);
  $cow->eat("grass");
  print "Returning from the subroutine.\n";    # $cow is destroyed here
}
print "Start of program.\n";
my $outer_cow = Cow->named("Bessie");
print "Now have a cow named ", $outer_cow->name, ".\n";
feed_a_cow_named("Gwen");
print "Returned from subroutine.\n";

This prints:

Start of program.
Now have a cow named Bessie.
Gwen eats grass.
Returning from the subroutine.
[Gwen has died.]
Returned from subroutine.
[Bessie has died.]

Note that Gwen is active inside the subroutine. However, as the subroutine exits, Perl notices there are no references to Gwen; Gwen’s DESTROY method is then automatically invoked, printing the Gwen has died message.

What happens at the end of the program? Since objects don’t live beyond the end of the program, Perl makes one final pass over all remaining data and destroys it. This is true whether the data is held in lexical variables or package global variables. Because Bessie was still alive at the end of the program, she needed to be recycled, and so you get the message for Bessie after all other steps in the program are complete.^[44]
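If you don't have the Animal classes at hand, the same timing can be observed with a self-contained sketch. Demo::Critter and @Demo::EVENTS are assumed names used only here; the destructor records events in an array rather than printing, so the order is easy to inspect:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the chapter's Animal classes.
{ package Demo::Critter;
  sub named   { my ($class, $name) = @_; bless { Name => $name }, $class }
  sub name    { $_[0]{Name} }
  sub DESTROY { push @Demo::EVENTS, $_[0]->name . " destroyed" }
}

my $outer = Demo::Critter->named("Bessie");   # lives until the program ends
{
  my $inner = Demo::Critter->named("Gwen");
  push @Demo::EVENTS, "leaving block";
}                                             # last reference to Gwen is gone
push @Demo::EVENTS, "after block";

print "$_\n" for @Demo::EVENTS;   # Bessie's destructor runs only after this
```

Gwen's entry lands between the two markers, while Bessie's destructor has not yet run when the list is printed.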

Nested Object Destruction

If an object holds another object (say, as an element of an array or the value of a hash element), the containing object is DESTROYed before any of the contained objects begin their discarding process. This is reasonable because the containing object may need to reference its contents in order to be cleanly discarded. To illustrate this, let’s build a “barn” and tear it down. And just to be interesting, we’ll make the barn a blessed array reference, not a hash reference.

{ package Barn;
  sub new { bless [  ], shift }
  sub add { push @{+shift}, shift }
  sub contents { @{+shift} }
  sub DESTROY {
    my $self = shift;
    print "$self is being destroyed...\n";
    for ($self->contents) {
      print "  ", $_->name, " goes homeless.\n";
    }
  }
}

Here, we’re really being minimalistic in the object definition. To create a new barn, simply bless an empty array reference into the class name passed as the first parameter. Adding an animal just pushes it to the back of the barn. Asking for the barn contents merely dereferences the object array reference to return the contents.^[45]

The fun part is the destructor. Let’s take the reference to ourselves, display a debugging message about the particular barn being destroyed, and then ask for the name of each inhabitant in turn. In action, this would be:

my $barn = Barn->new;
$barn->add(Cow->named("Bessie"));
$barn->add(Cow->named("Gwen"));
print "Burn the barn:\n";
$barn = undef;
print "End of program.\n";

This prints:

Burn the barn:
Barn=ARRAY(0x541c) is being destroyed...
  Bessie goes homeless.
  Gwen goes homeless.
[Gwen has died.]
[Bessie has died.]
End of program.

Note that the barn is destroyed first, letting you get the name of the inhabitants cleanly. However, once the barn is gone, the inhabitants have no additional references, so they also go away, and thus their destructors are also invoked. Compare that with the cows having a life outside the barn:

my $barn = Barn->new;
my @cows = (Cow->named("Bessie"), Cow->named("Gwen"));
$barn->add($_) for @cows;
print "Burn the barn:\n";
$barn = undef;
print "Lose the cows:\n";
@cows = (  );
print "End of program.\n";

This produces:

Burn the barn:
Barn=ARRAY(0x541c) is being destroyed...
  Bessie goes homeless.
  Gwen goes homeless.
Lose the cows:
[Gwen has died.]
[Bessie has died.]
End of program.

The cows will now continue to live until the only other reference to the cows (from the @cows array) goes away.

The references to the cows are removed only when the barn destructor is completely finished. In some cases, you may wish instead to shoo the cows out of the barn as you notice them. In this case, it’s as simple as destructively altering the barn array, rather than iterating over it.^[46]

Let’s alter the Barn to Barn2 to illustrate this:

{ package Barn2;
  sub new { bless [  ], shift }
  sub add { push @{+shift}, shift }
  sub contents { @{+shift} }
  sub DESTROY {
    my $self = shift;
    print "$self is being destroyed...\n";
    while (@$self) {
      my $homeless = shift @$self;
      print "  ", $homeless->name, " goes homeless.\n";
    }
  }
}

Now use it in the previous scenarios:

my $barn = Barn2->new;
$barn->add(Cow->named("Bessie"));
$barn->add(Cow->named("Gwen"));
print "Burn the barn:\n";
$barn = undef;
print "End of program.\n";

This produces:

Burn the barn:
Barn2=ARRAY(0x541c) is being destroyed...
  Bessie goes homeless.
[Bessie has died.]
  Gwen goes homeless.
[Gwen has died.]
End of program.

As you can see, Bessie lost her home the moment she was booted out of the barn, so she died immediately: there were no other references to her at that point, even though the destructor for the barn was not yet complete. (Poor Gwen suffers the same fate.)
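The difference between the two barns can be checked side by side with a small sketch. The Demo:: classes below are hypothetical look-alikes of Barn and Barn2 that log to @Demo::LOG instead of printing; they are an illustration under those assumptions, not the chapter's code:

```perl
use strict;
use warnings;

# Hypothetical classes that mirror Barn and Barn2 but log instead of printing.
{ package Demo::Tenant;
  sub new     { my ($class, $name) = @_; bless { Name => $name }, $class }
  sub name    { $_[0]{Name} }
  sub DESTROY { push @Demo::LOG, $_[0]->name . " destroyed" }
}
{ package Demo::IteratingBarn;          # like Barn: looks, but keeps references
  sub new     { my $class = shift; bless [ @_ ], $class }
  sub DESTROY { my $self = shift; push @Demo::LOG, "saw " . $_->name for @$self }
}
{ package Demo::DrainingBarn;           # like Barn2: removes each element
  sub new     { my $class = shift; bless [ @_ ], $class }
  sub DESTROY {
    my $self = shift;
    while (@$self) {
      my $tenant = shift @$self;        # now the only reference to the tenant
      push @Demo::LOG, "saw " . $tenant->name;
    }                                   # $tenant is released on each pass
  }
}

{ my $barn = Demo::IteratingBarn->new(map Demo::Tenant->new($_), qw(A B)); }
my @iterating = splice @Demo::LOG;      # move the log aside and empty it

{ my $barn = Demo::DrainingBarn->new(map Demo::Tenant->new($_), qw(C D)); }
my @draining = splice @Demo::LOG;

print "iterating: @iterating\n";
print "draining:  @draining\n";
```

With the iterating barn, both "saw" entries come before either tenant is destroyed. With the draining barn, each tenant is destroyed right after it is seen.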

Thus, back to the temporary file problem. If you have an associated temporary file for an animal, you merely need to close it and delete the file during the destructor:

## in Animal
use File::Temp qw(tempfile);

sub named {
  my $class = shift;
  my $name = shift;
  my $self = { Name => $name, Color => $class->default_color };
  ## new code here...
  my ($fh, $filename) = tempfile(  );
  $self->{temp_fh} = $fh;
  $self->{temp_filename} = $filename;
  ## .. to here
  bless $self, $class;
}

You now have a filehandle and its filename stored as instance variables of the Animal (or any class derived from Animal). In the destructor, close it down, and delete the file:^[47]

sub DESTROY {
  my $self = shift;
  my $fh = $self->{temp_fh};
  close $fh;
  unlink $self->{temp_filename};
  print "[", $self->name, " has died.]\n";
}

When the last reference to the Animal-ish object is destroyed (even at the end of the program), the temporary file is also removed automatically, avoiding a mess.
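To convince yourself that the file really disappears, you can try a stripped-down sketch such as the following. Demo::Scratch is a hypothetical class that keeps only the two temporary-file fields, and the error checks on close and unlink are an addition that the listing above omits:

```perl
use strict;
use warnings;
use File::Temp ();

# Hypothetical class holding only the temporary-file fields.
{ package Demo::Scratch;
  sub new {
    my $class = shift;
    my ($fh, $filename) = File::Temp::tempfile();
    bless { temp_fh => $fh, temp_filename => $filename }, $class;
  }
  sub filename { $_[0]{temp_filename} }
  sub DESTROY {
    my $self = shift;
    close($self->{temp_fh})        or warn "close failed: $!";
    unlink($self->{temp_filename}) or warn "unlink failed: $!";
  }
}

my ($path, $existed_during);
{
  my $scratch = Demo::Scratch->new;
  $path           = $scratch->filename;
  $existed_during = -e $path ? 1 : 0;   # file is on disk while the object lives
}
my $exists_after = -e $path ? 1 : 0;    # destructor has removed it by now

print "during: $existed_during, after: $exists_after\n";
```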

Beating a Dead Horse

Because the destructor method is inherited, you can also override and extend superclass methods. For example, we’ll decide the dead horses need a further use:

## in Horse
sub DESTROY {
  my $self = shift;
  $self->SUPER::DESTROY;
  print "[", $self->name, " has gone off to the glue factory.]\n";
}

my @tv_horses = map Horse->named($_), ("Trigger", "Mr. Ed");
$_->eat("an apple") for @tv_horses;     # their last meal
print "End of program.\n";

This prints:

Trigger eats an apple.
Mr. Ed eats an apple.
End of program.
[Mr. Ed has died.]
[Mr. Ed has gone off to the glue factory.]
[Trigger has died.]
[Trigger has gone off to the glue factory.]

We feed each horse a last meal; at the end of the program, each horse’s destructor is called.

The first step of this destructor is to call the parent destructor. Why is this important? Without calling the parent destructor, the steps taken by superclasses of this class will not properly execute. That matters little if it’s simply a debugging statement as we’ve shown, but if it were the “delete the temporary file” cleanup method, you wouldn’t have deleted that file!

So, the rule is:

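Always include a call to $self->SUPER::DESTROY in your destructors, so that cleanup added to a base/parent class later is not skipped. Unlike the automatic call that Perl makes, an explicit call is an ordinary method call and fails if no parent class defines DESTROY, so guard it (for example, with $self->can('SUPER::DESTROY')) until one does.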

Whether you call it at the beginning or the end of your own destructor is a matter of hotly contested debate. If your derived class needs some superclass instance variables, you should probably call the superclass destructor after you complete your operations because the superclass destructor will likely alter them in annoying ways. On the other hand, in the example, we called the superclass destructor before the added behavior, because we wanted the superclass behavior first. There’s no rule of thumb, even. Sorry.
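The cost of forgetting the call is easy to demonstrate. In this sketch (all Demo:: names are hypothetical), two subclasses share a parent whose destructor stands in for important cleanup; only the subclass that calls SUPER::DESTROY gets it:

```perl
use strict;
use warnings;

# Hypothetical hierarchy; the parent's destructor stands in for real cleanup.
{ package Demo::Parent;
  sub new     { bless {}, shift }
  sub DESTROY { push @Demo::LOG, "parent cleanup for " . ref($_[0]) }
}
{ package Demo::Polite;
  our @ISA = ('Demo::Parent');
  sub DESTROY {
    my $self = shift;
    push @Demo::LOG, "polite cleanup";
    $self->SUPER::DESTROY;              # hand off to Demo::Parent
  }
}
{ package Demo::Forgetful;
  our @ISA = ('Demo::Parent');
  sub DESTROY { push @Demo::LOG, "forgetful cleanup" }   # parent never runs
}

{ my $polite    = Demo::Polite->new;    }
{ my $forgetful = Demo::Forgetful->new; }

print "$_\n" for @Demo::LOG;
```

Here the subclass does its own work first and then calls the parent, which is one of the two orderings discussed above.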

Indirect Object Notation

The arrow syntax used to invoke a method is sometimes called the direct object syntax because there’s also the indirect object syntax, also known as the “only works sometimes” syntax, for reasons explained in a moment. When you write:

Class->class_method(@args);
$instance->instance_method(@other);

you can generally replace it with:

class_method Class @args;
instance_method $instance @other;

A typical use of this is with the new method, replacing:

my $obj = Some::Class->new(@constructor_params);

with:

my $obj = new Some::Class @constructor_params;

making the C++ people feel right at home. Of course, in Perl, there’s nothing special about the name new, but at least the syntax is hauntingly familiar.
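As a quick check that the two spellings really make the same call, consider this sketch. Demo::Counter is a hypothetical class, and the example assumes indirect object syntax has not been switched off (recent Perls disable it when you ask for a modern feature bundle such as use v5.36):

```perl
use strict;
use warnings;

# Hypothetical class used only to compare the two call styles.
{ package Demo::Counter;
  sub new   { my ($class, %args) = @_; bless { count => $args{start} || 0 }, $class }
  sub count { $_[0]{count} }
}

my $direct   = Demo::Counter->new(start => 3);   # arrow (direct) syntax
my $indirect = new Demo::Counter(start => 3);    # indirect object syntax

print ref($direct), " ", $direct->count, "\n";
print ref($indirect), " ", $indirect->count, "\n";
```

Both objects end up blessed into the same class with the same data.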

Why the previous “generally” caveat? Well, if the instance is something more complicated than a simple scalar variable:

$somehash->{$somekey}->[42]->instance_method(@parms);

then you can’t just swap it around like:

instance_method $somehash->{$somekey}->[42] @parms;

because the only things acceptable to indirect object syntax are a bareword (e.g., a class name), a simple scalar variable, or braces denoting a block returning either a blessed reference or a classname.^[48]

This means you have to write it like so:

instance_method { $somehash->{$somekey}->[42] } @parms;

And that goes from simple to uglier in one step. There’s another downside: ambiguous parsing. When we developed the classroom materials concerning indirect object syntax, we wrote:

my $cow = Cow->named("Bessie");
print name $cow, " eats.\n";

because we were thinking about the indirect object equivalents for:

my $cow = Cow->named("Bessie");
print $cow->name, " eats.\n";

However, the latter works; the former doesn’t. We were getting no output. Finally, we enabled warnings (via -w on the command line)^[49] and got this interesting series of messages:

Unquoted string "name" may clash with future reserved word at ./foo line 92.
Name "main::name" used only once: possible typo at ./foo line 92.
print(  ) on unopened filehandle name at ./foo line 92.

Ahh, so that line was being parsed as:

print name ($cow, " eats.\n");

In other words, print the list of items to the filehandle named name. That’s clearly not what we wanted, so we had to add additional syntax to disambiguate the call.

This leads us to our next strong suggestion:

Use direct object syntax at all times, except perhaps for the constructor call.

That exception acknowledges that most people write new Class ... rather than Class->new(...) and that most of us are fine with that. However, there are circumstances in which even that can lead to ambiguity (e.g., when a subroutine named new has been seen, and the class name itself has not been seen as a package). When in doubt, ignore indirect object syntax. Your maintenance programmer will thank you.

Additional Instance Variables in Subclasses

One of the nice things about using a hash for a data structure is that derived classes can add additional instance variables without the superclass needing to know of their addition. For example, let’s derive a RaceHorse class that is everything a Horse is but also tracks its win/place/show/lose standings. The first part of this is trivial:

{ package RaceHorse;
  our @ISA = qw(Horse);
  ...
}

You’ll also want to initialize “no wins of no races” when you create the RaceHorse. You do this by extending the named subroutine and adding four additional fields (wins, places, shows, losses, for first-, second-, and third-place finishes, and none of the above):

{ package RaceHorse;
  our @ISA = qw(Horse);
  ## extend parent constructor:
  sub named {
    my $self = shift->SUPER::named(@_);
    $self->{$_} = 0 for qw(wins places shows losses);
    $self;
  }
}

Here, you pass all parameters to the superclass, which should return a fully formed Horse. However, because you pass RaceHorse as the class, the object is already blessed into the RaceHorse class.^[50] Next, add the four instance variables that go beyond those defined in the superclass, setting their initial values to 0. Finally, return the modified RaceHorse to the caller.

It’s important to note here that we’ve actually “opened the box” a bit while writing this derived class. You know that the superclass uses a hash reference and that the superclass hierarchy doesn’t use the four names chosen for a derived class. This is because RaceHorse will be a “friend” class (in C++ or Java terminology), accessing the instance variables directly. If the maintainer of Horse or Animal ever changes representation or names of variables, there could be a collision, which might go undetected except for that important day when you’re showing off your code to the investors. Things get even more interesting if the hashref is changed to an arrayref as well.

One way to decouple this dependency is to use composition rather than inheritance as a way to create a derived class. In this example, you need to make a Horse object an instance variable of a RaceHorse and put the rest of the data in separate instance variables. You also need to pass any inherited method calls on the RaceHorse down to the Horse instance, through delegation. However, even though Perl can certainly support the needed operations, that approach is usually slower and more cumbersome. Enough on that for this discussion, however.
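For readers who want to see the shape of that alternative, here is a minimal sketch of composition with explicit delegation. Demo::Horse is a hypothetical stand-in for the real Horse class, and only one race counter is shown. The outer class never touches the inner object's hash, so it keeps working if the inner representation changes:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the chapter's Horse class.
{ package Demo::Horse;
  sub named { my ($class, $name) = @_; bless { Name => $name }, $class }
  sub name  { $_[0]{Name} }
  sub sound { "neigh" }
  sub speak { my $self = shift; $self->name . " goes " . $self->sound }
}

# Composition: holds a horse rather than inheriting from one.
{ package Demo::ComposedRaceHorse;
  sub named {
    my ($class, $name) = @_;
    bless { horse => Demo::Horse->named($name), wins => 0 }, $class;
  }
  sub won   { $_[0]{wins}++ }
  sub wins  { $_[0]{wins} }
  # Delegation: forward horse-like calls to the contained object.
  sub name  { $_[0]{horse}->name }
  sub speak { $_[0]{horse}->speak }
}

my $racer = Demo::ComposedRaceHorse->named("Billy Boy");
$racer->won for 1 .. 2;
print $racer->speak, " and has ", $racer->wins, " wins\n";
```

Every forwarded method has to be written out (or generated), which is part of why the text calls this approach more cumbersome.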

Next, let’s provide some access methods:

{ package RaceHorse;
  our @ISA = qw(Horse);
  ## extend parent constructor:
  sub named {
    my $self = shift->SUPER::named(@_);
    $self->{$_} = 0 for qw(wins places shows losses);
    $self;
  }
  sub won { shift->{wins}++; }
  sub placed { shift->{places}++; }
  sub showed { shift->{shows}++; }
  sub lost { shift->{losses}++; }
  sub standings {
    my $self = shift;
    join ", ", map "$self->{$_} $_", qw(wins places shows losses);
  }
}

my $racer = RaceHorse->named("Billy Boy");
# record the outcomes: 3 wins, 1 show, 1 loss
$racer->won;
$racer->won;
$racer->won;
$racer->showed;
$racer->lost;
print $racer->name, " has standings of: ", $racer->standings, ".\n";

This prints:

Billy Boy has standings of: 3 wins, 0 places, 1 shows, 1 losses.
[Billy Boy has died.]
[Billy Boy has gone off to the glue factory.]

Note that we’re still getting the Animal and Horse destructors. The superclasses are unaware that we’ve added four additional elements to the hash, and so still function as they always have.

Using Class Variables

What if you want to iterate over all the animals we’ve made so far? Animals may exist all over the program namespace and are lost once they’re handed back from the named constructor method.

However, you can record the created animal in a hash and iterate over that hash. The key to the hash can be the stringified form of the animal reference,^[51] while the value can be the actual reference, allowing you to access its name or class.

For example, let’s extend named as follows:

## in Animal
our %REGISTRY;
sub named {
  my $class = shift;
  my $name = shift;
  my $self = { Name => $name, Color => $class->default_color };
  bless $self, $class;
  $REGISTRY{$self} = $self;  # also returns $self
}

The uppercase name for %REGISTRY is a reminder that this variable is more global than most variables. In this case, it’s a metavariable that contains information about many instances.

Note that when used as a key, $self stringifies, which means it turns into a string unique to the object.
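A short sketch shows what those keys look like. The %registry hash here is a plain lexical used for illustration, and the two objects are blessed directly into the name Cow without loading the real class:

```perl
use strict;
use warnings;

# Two throwaway objects; blessing does not require the Cow class to be loaded.
my $bessie = bless { Name => "Bessie" }, 'Cow';
my $gwen   = bless { Name => "Gwen" },   'Cow';

my %registry;
$registry{$_} = $_ for $bessie, $gwen;   # key: string form, value: reference

my $key = "$bessie";                     # something like Cow=HASH(0x...)
print "key looks like: $key\n";
print "found: $registry{$key}{Name}\n";

# The keys are ordinary strings; only the values are usable references.
my @reference_keys = grep { ref } keys %registry;
print "keys that are references: ", scalar(@reference_keys), "\n";
```

The string embeds the object's class and memory address, so two objects that are alive at the same time never share a key.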

We also need to add a new method:

sub registered {
  return map { "a ".ref($_)." named ".$_->name } values %REGISTRY;
}

Now you can see all the animals we’ve made:

my @cows = map Cow->named($_), qw(Bessie Gwen);
my @horses = map Horse->named($_), ("Trigger", "Mr. Ed");
my @racehorses = RaceHorse->named("Billy Boy");
print "We've seen:\n", map("  $_\n", Animal->registered);
print "End of program.\n";

This prints:

We've seen:
  a RaceHorse named Billy Boy
  a Horse named Mr. Ed
  a Horse named Trigger
  a Cow named Gwen
  a Cow named Bessie
End of program.
[Billy Boy has died.]
[Billy Boy has gone off to the glue factory.]
[Bessie has died.]
[Gwen has died.]
[Trigger has died.]
[Trigger has gone off to the glue factory.]
[Mr. Ed has died.]
[Mr. Ed has gone off to the glue factory.]

Note that the animals die at their proper time because the variables holding the animals are all being destroyed at the final step. Or are they?

Weakening the Argument

The %REGISTRY variable also holds a reference to each animal. So even if you toss away the containing variables:

{
  my @cows = map Cow->named($_), qw(Bessie Gwen);
  my @horses = map Horse->named($_), ("Trigger", "Mr. Ed");
  my @racehorses = RaceHorse->named("Billy Boy");
}
print "We've seen:\n", map("  $_\n", Animal->registered);
print "End of program.\n";

you’ll still see the same result. The animals aren’t destroyed even though none of the code is holding the animals. At first glance, it looks like you can fix this by altering the destructor:

## in Animal
sub DESTROY {
  my $self = shift;
  print "[", $self->name, " has died.]\n";
  delete $REGISTRY{$self};
}

But this still results in the same output. Why? Because the destructor isn’t called until the last reference is gone, but the last reference won’t be destroyed until the destructor is called.^[52]
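The stalemate can be reproduced in a few lines. Demo::Tracked and @Demo::LOG are hypothetical names; the class keeps a registry with ordinary references and a destructor that tries to deregister itself, as in the attempt above:

```perl
use strict;
use warnings;

# Hypothetical class with a registry of ordinary (strong) references.
{ package Demo::Tracked;
  our %REGISTRY;
  sub named {
    my ($class, $name) = @_;
    my $self = bless { Name => $name }, $class;
    $REGISTRY{$self} = $self;           # the registry now keeps $self alive
  }
  sub DESTROY {
    my $self = shift;
    push @Demo::LOG, "$self->{Name} destroyed";
    delete $REGISTRY{$self};            # never reached while the entry exists
  }
}

{ my $cow = Demo::Tracked->named("Bessie"); }   # the lexical goes away...
my $destroyed_after_block = @Demo::LOG;          # ...but nothing was destroyed
my $still_registered      = keys %Demo::Tracked::REGISTRY;

# Only removing the registry entry by hand releases the object.
delete $Demo::Tracked::REGISTRY{$_} for keys %Demo::Tracked::REGISTRY;
my $destroyed_after_delete = @Demo::LOG;

print "after block:  $destroyed_after_block destroyed, $still_registered registered\n";
print "after delete: $destroyed_after_delete destroyed\n";
```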

One solution for fairly recent Perl versions^[53] is to use weak references. A weak reference is a reference that doesn’t count as far as the reference counting, uh, counts. It’s best illustrated by example.

The weak reference mechanism is already built into the core of recent Perl versions, but as of this writing, the user interface is still accessed by a CPAN module called WeakRef. (On Perl 5.8 and later, the core Scalar::Util module also exports weaken, so nothing needs to be installed there.) After installing WeakRef,^[54] you can update the constructor as follows:

## in Animal
use Carp qw(croak);     ## needed for croak below (harmless if already loaded)
use WeakRef qw(weaken); ## new

sub named {
  ref(my $class = shift) and croak "class only";
  my $name = shift;
  my $self = { Name => $name, Color => $class->default_color };
  bless $self, $class;
  $REGISTRY{$self} = $self;
  weaken($REGISTRY{$self});
  $self;
}

When Perl counts the number of active references to a thingy,^[55] it won’t count any that have been converted to weak references by weaken. If all ordinary references are gone, Perl deletes the thingy and turns any weak references to undef.
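Here is a self-contained sketch of a weakened registry. It departs from the text in one labeled respect: it takes weaken from the core Scalar::Util module rather than from WeakRef, on the assumption that you are running Perl 5.8 or later. Demo::Tracked is a hypothetical class:

```perl
use strict;
use warnings;

# Hypothetical class whose registry holds only weak references.
{ package Demo::Tracked;
  use Scalar::Util qw(weaken);          # assumption: core module, not WeakRef
  our %REGISTRY;
  sub named {
    my ($class, $name) = @_;
    my $self = bless { Name => $name }, $class;
    $REGISTRY{$self} = $self;
    weaken($REGISTRY{$self});           # this entry no longer counts
    $self;
  }
  sub name    { $_[0]{Name} }
  sub DESTROY { my $self = shift; delete $REGISTRY{$self} }
  sub registered {
    my @names = map { $_->name } values %REGISTRY;
    return sort @names;
  }
}

my $keeper = Demo::Tracked->named("Trigger");
my @inside;
{
  my $temp = Demo::Tracked->named("Gwen");
  @inside = Demo::Tracked->registered;  # both are alive here
}
my @after = Demo::Tracked->registered;  # Gwen died with her lexical

print "inside: @inside\n";
print "after:  @after\n";
```

The destructor still deletes the registry entry, so no stale undef values are left behind for registered to trip over.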

Now you’ll get the right behavior for:

my @horses = map Horse->named($_), ("Trigger", "Mr. Ed");
print "alive before block:\n", map("  $_\n", Animal->registered);
{
  my @cows = map Cow->named($_), qw(Bessie Gwen);
  my @racehorses = RaceHorse->named("Billy Boy");
  print "alive inside block:\n", map("  $_\n", Animal->registered);
}
print "alive after block:\n", map("  $_\n", Animal->registered);
print "End of program.\n";

This prints:

alive before block:
  a Horse named Trigger
  a Horse named Mr. Ed
alive inside block:
  a RaceHorse named Billy Boy
  a Cow named Gwen
  a Horse named Trigger
  a Horse named Mr. Ed
  a Cow named Bessie
[Billy Boy has died.]
[Billy Boy has gone off to the glue factory.]
[Gwen has died.]
[Bessie has died.]
alive after block:
  a Horse named Trigger
  a Horse named Mr. Ed
End of program.
[Mr. Ed has died.]
[Mr. Ed has gone off to the glue factory.]
[Trigger has died.]
[Trigger has gone off to the glue factory.]

Notice that the racehorses and cows die at the end of the block, but the ordinary horses die at the end of the program. Success!

Weak references can also solve some memory leak issues. For example, suppose an animal wanted to record its pedigree. The parents might want to hold references to all their children while each child might want to hold references to each parent.

One or the other (or even both) of these links can be weakened. If the link to the child is weakened, the child can be destroyed when all other references are lost, and the parent’s link simply becomes undef (or you can set a destructor to completely remove it). However, a parent won’t disappear as long as it still has offspring. Similarly, if the link to the parent is weakened, you’ll simply get it as undef when the parent is no longer referenced by other data structures. It’s really quite flexible.^[56]

Without weakening, as soon as any parent-child relationship is created, both the parent and the child remain in memory until the final global destruction phase, regardless of the destruction of the other structures holding either the parent or the child.
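The leak, and its cure, can be seen in a sketch like this one. Demo::Animal, make_family, and the Children and Parent fields are hypothetical names, and weaken again comes from the core Scalar::Util module rather than WeakRef:

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);            # assumption: core module, not WeakRef

# Hypothetical class with links in both directions.
{ package Demo::Animal;
  sub named {
    my ($class, $name) = @_;
    bless { Name => $name, Children => [], Parent => undef }, $class;
  }
  sub DESTROY { push @Demo::LOG, "$_[0]{Name} destroyed" }
}

# Link a parent and a child both ways, optionally weakening the upward link.
sub make_family {
  my ($weak) = @_;
  my $parent = Demo::Animal->named("Mom");
  my $child  = Demo::Animal->named("Kid");
  push @{ $parent->{Children} }, $child;
  $child->{Parent} = $parent;
  weaken($child->{Parent}) if $weak;
  return;                               # both lexicals go out of scope here
}

make_family(0);
my $after_strong = @Demo::LOG;          # the cycle keeps both alive

make_family(1);
my @after_weak = sort @Demo::LOG;       # the weakened pair has been destroyed

print "destroyed with strong links: $after_strong\n";
print "destroyed with a weak link:  @after_weak\n";
```

The first pair is reclaimed only during global destruction at the end of the program.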

Be aware though: weak references should be used carefully, not just thrown at a problem of circular references. If you destroy data that is held by a weak reference before its time, you may have some very confusing programming problems to solve and debug.

Exercise

The answers for all exercises can be found in Section A.9.

Exercise [45 min]

Modify the RaceHorse class to get the previous standings from a DBM hash (keyed by the horse’s name) when the horse is created, and update the standings when the horse is destroyed. For example, running this program four times:

my $runner = RaceHorse->named("Billy Boy");
$runner->won;
print $runner->name, " has standings ", $runner->standings, ".\n";

should show four additional wins. Make sure that a RaceHorse still does everything a normal Horse does otherwise.

For simplicity, use four space-separated integers for the value in the DBM hash.

^[43]Normally, your own method calls will cause an error if the method isn’t found. If you want to prevent that, just put a do-nothing method into the base class.

^[44]This is just after the END blocks are executed and follows the same rules as END blocks: there must be a nice exit of the program rather than an abrupt end. If Perl runs out of memory, all bets are off.

^[45]Did you wonder why there’s a plus sign (+) before shift in two of those subroutines? That’s due to one of the quirks in Perl’s syntax. If the code were simply @{shift}, because the curly braces contain nothing but a bareword, it would be interpreted as a symbolic (soft) reference to the array named shift: @{"shift"}. In Perl, the unary plus (a plus sign at the beginning of a term) is defined to do nothing (not even turning what follows into a number), just so it can distinguish cases such as this.

^[46]If you’re using a hash instead, use delete on the elements you wish to process immediately.

^[47]As it turns out, you can tell File::Temp to do this automatically, but then we wouldn’t be able to illustrate doing it manually. Doing it manually allows you to store a summary of the information from the temporary file into a database. However, that’s too complex to show here.

^[48]Astute readers will note that these are the same rules as for indirect filehandle syntax, which indirect object syntax directly mirrors, as well as the rules for specifying a reference to be dereferenced.

^[49]Using -w should be the first step when Perl does something you don’t understand. Or maybe it should be the zeroth because you should normally have -w in effect whenever you’re developing code.

^[50]Similar to the way the Animal constructor creates a Horse, not an Animal, when passed Horse as the class.

^[51]Or any other convenient and unique string.

^[52]We’d make a reference to chickens and eggs, but that would introduce yet another derived class to Animal.

^[53]5.6 and later.

^[54]See Chapter 15 for information on installing modules.

^[55]A thingy, as defined in Perl’s own documentation, is anything a reference points to, such as an object. If you are an especially boring person, you could call it a referent instead.

^[56]When using weak references, always make sure you don’t dereference a weakened reference that has turned to undef.