Using the simple syntax introduced in Chapter 8, you have class methods, (multiple) inheritance, overriding, and extending. You’ve been able to factor out common code and provide a way to reuse implementations with variations. This is at the core of what objects provide, but objects also provide instance data, which we haven’t even begun to cover.
Let’s look at the code used in Chapter 8 for the Animal and Horse classes:
{ package Animal;
  sub speak {
    my $class = shift;
    print "a $class goes ", $class->sound, "!\n"
  }
}
{ package Horse;
  @ISA = qw(Animal);
  sub sound { "neigh" }
}
This lets you invoke Horse->speak, which ripples upward to Animal::speak, which in turn calls back to Horse::sound to get the specific sound, producing the output:
a Horse goes neigh!
But all Horse objects would have to be absolutely identical. If you add a subroutine, all horses automatically share it. That’s great for making horses identical, but how do you capture the properties of an individual horse? For example, suppose you want to give your horse a name. There’s got to be a way to keep its name separate from those of other horses.
You can do so by establishing an instance. An instance is generally created by a class, much like a car is created by a car factory. An instance will have associated properties, called instance variables (or member variables, if you come from a C++ or Java background). An instance has a unique identity (like the serial number of a registered horse), shared properties (the color and talents of the horse), and common behavior (e.g., pulling the reins back tells the horse to stop).
In Perl, an instance must be a reference to one of the built-in types. Start with the simplest reference that can hold a horse’s name: a scalar reference:[37]
my $name = "Mr. Ed";
my $tv_horse = \$name;
Now $tv_horse is a reference to what will be the instance-specific data (the name). The final step in turning this into a real instance involves a special operator called bless:
bless $tv_horse, "Horse";
The bless operator follows the reference to find what variable it points to, in this case the scalar $name. Then it “blesses” that variable, turning $tv_horse into an object: a Horse object, in fact. (Imagine that a little sticky-note that says Horse is now attached to $name.)
At this point, $tv_horse is an instance of Horse.[38] That is, it’s a specific horse. The reference is otherwise unchanged and can still be used with traditional dereferencing operators.[39]
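To see the effect of bless concretely, here is a small sketch. It assumes the core Scalar::Util module (its blessed and reftype functions are not used elsewhere in this chapter) and adds a second reference, $other, purely for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

my $name     = "Mr. Ed";
my $tv_horse = \$name;
my $before   = ref $tv_horse;        # "SCALAR": an ordinary scalar reference

bless $tv_horse, "Horse";

my $after = ref $tv_horse;           # "Horse": ref now reports the class
my $class = blessed $tv_horse;       # "Horse" (undef for an unblessed reference)
my $type  = reftype $tv_horse;       # "SCALAR": the underlying type is unchanged

# The "sticky note" is on $name itself, so a second reference sees it too.
my $other       = \$name;
my $other_class = ref $other;        # "Horse"

print "before: $before, after: $after, underlying type: $type\n";
print "second reference is a: $other_class\n";
print "still dereferences to: $$tv_horse\n";
```

The last line shows that ordinary dereferencing still works on the blessed reference.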
The method arrow can be used on instances, as well as names of packages (classes). Let’s get the sound that $tv_horse makes:
my $noise = $tv_horse->sound;
To invoke sound, Perl first notes that $tv_horse is a blessed reference, and thus an instance. Perl then constructs an argument list, similar to the way an argument list was constructed when you used the method arrow with a class name. In this case, it’ll be just ($tv_horse). (Later you’ll see that arguments will take their place following the instance variable, just as with classes.)
Now for the fun part: Perl takes the class in which the instance was blessed, in this case Horse, and uses it to locate the subroutine to invoke the method, as if you had said Horse->sound instead of $tv_horse->sound. The purpose of the original blessing is to associate a class with that reference to allow the proper method (subroutine) to be found.
In this case, Horse::sound is found directly (without using inheritance), yielding the final subroutine invocation:
Horse::sound($tv_horse)
Note that the first parameter here is still the instance, not the name of the class as before. The return value, neigh, ends up in the earlier $noise variable.
If Horse::sound had not been found, you’d wander up the @Horse::ISA list to try to find the method in one of the superclasses, just as for a class method. The only difference between a class method and an instance method is whether the first parameter is an instance (a blessed reference) or a class name (a string).[40]
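The lookup described above can be observed with the universal can method, which performs the same search as a method call and returns a code reference. This is only a sketch: it declares @ISA with our so that it runs under use strict, which differs slightly from the listings in the text.

```perl
use strict;
use warnings;

{ package Animal;
  sub speak {
    my $class = shift;
    print "a $class goes ", $class->sound, "!\n";
  }
}
{ package Horse;
  our @ISA = qw(Animal);
  sub sound { "neigh" }
}

my $name     = "Mr. Ed";
my $tv_horse = bless \$name, "Horse";

# sound is found directly in Horse; speak is found by walking @Horse::ISA.
my $found_directly  = $tv_horse->can("sound") == \&Horse::sound;
my $found_inherited = $tv_horse->can("speak") == \&Animal::speak;

# The same subroutine is found whether you start from the class or an instance.
my $same_for_class  = Horse->can("speak") == $tv_horse->can("speak");

print "sound found in Horse: ",   ($found_directly  ? "yes" : "no"), "\n";
print "speak found in Animal: ",  ($found_inherited ? "yes" : "no"), "\n";
print "class and instance agree: ", ($same_for_class ? "yes" : "no"), "\n";
```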
Because you get the instance as the first parameter, you can now access the instance-specific data. In this case, let’s add a way to get at the name:
{ package Horse;
  @ISA = qw(Animal);
  sub sound { "neigh" }
  sub name {
    my $self = shift;
    $$self;
  }
}
Now you call for the name:
print $tv_horse->name, " says ", $tv_horse->sound, "\n";
Inside Horse::name, the @_ array contains just $tv_horse, which the shift stores into $self. It’s traditional to shift the first parameter into a variable named $self for instance methods, so stay with that unless you have strong reasons otherwise. Perl places no significance on the name $self, however.[41] Then $self is dereferenced as a scalar reference, yielding Mr. Ed. The result is:
Mr. Ed says neigh
If you constructed all your horses by hand, you’d most likely make mistakes from time to time. Making the “inside guts” of a Horse visible also violates one of the principles of OOP. That’s good if you’re a veterinarian but not if you just like to own horses. Let the Horse class build a new horse:
{ package Horse;
  @ISA = qw(Animal);
  sub sound { "neigh" }
  sub name {
    my $self = shift;
    $$self;
  }
  sub named {
    my $class = shift;
    my $name = shift;
    bless \$name, $class;
  }
}
Now with the new named method, build a Horse:
my $tv_horse = Horse->named("Mr. Ed");
You’re back to a class method, so the two arguments to Horse::named are "Horse" and "Mr. Ed". The bless operator not only blesses $name, it also returns the reference to $name, so that’s fine as a return value. And that’s how to build a horse.
The constructor is called named here so that it quickly denotes the constructor’s argument as the name for this particular Horse. You can use different constructors with different names for different ways of “giving birth” to the object (such as recording its pedigree or date of birth). However, you’ll find that most people coming to Perl from less-flexible languages (such as Java or C++) use a single constructor named new, with various ways of interpreting the arguments to new. Either style is fine, as long as you document your particular way of giving birth to an object. Most core and CPAN modules use new, with notable exceptions, such as DBI’s DBI->connect(). It’s really up to the author.
Was there anything specific to Horse in that method? No. Therefore, it’s also the same recipe for building anything else inherited from Animal, so let’s put it there:
{ package Animal;
  sub speak {
    my $class = shift;
    print "a $class goes ", $class->sound, "!\n"
  }
  sub name {
    my $self = shift;
    $$self;
  }
  sub named {
    my $class = shift;
    my $name = shift;
    bless \$name, $class;
  }
}
{ package Horse;
  @ISA = qw(Animal);
  sub sound { "neigh" }
}
Ahh, but what happens if you invoke speak on an instance?
my $tv_horse = Horse->named("Mr. Ed");
$tv_horse->speak;
You get a debugging value:
a Horse=SCALAR(0xaca42ac) goes neigh!
Why? Because the Animal::speak routine expects a classname as its first parameter, not an instance. When the instance is passed in, you’ll use a blessed scalar reference as a string, which shows up as you saw it just now: similar to a stringified reference, but with the class name in front.
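A short sketch of the difference between the two string forms follows. The second horse name ("Trigger") and the variable names are illustrative only, and the hexadecimal addresses will differ from run to run.

```perl
use strict;
use warnings;

my $plain_name = "Trigger";
my $plain      = \$plain_name;             # an unblessed scalar reference

my $horse_name = "Mr. Ed";
my $horse      = bless \$horse_name, "Horse";

my $plain_text = "$plain";                 # looks like SCALAR(0x...)
my $horse_text = "$horse";                 # looks like Horse=SCALAR(0x...)

print "$plain_text\n$horse_text\n";
```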
All you need to fix this is a way to detect whether the method is called on a class or an instance. The most straightforward way to find out is with the ref operator. This operator returns a string (the classname) when used on a blessed reference, and an empty string (which is false) when used on a plain string (like a classname). Modify the name method first to notice the change:
sub name {
  my $either = shift;
  ref $either
    ? $$either                # it's an instance, return name
    : "an unnamed $either";   # it's a class, return generic
}
Here the ?: operator selects either the dereference or a derived string. Now you can use it with either an instance or a class. Note that you changed the first parameter holder to $either to show that it is intentional:
print Horse->name, "\n";      # prints "an unnamed Horse\n"
my $tv_horse = Horse->named("Mr. Ed");
print $tv_horse->name, "\n";  # prints "Mr. Ed\n"
Now fix speak to use this:
sub speak {
  my $either = shift;
  print $either->name, " goes ", $either->sound, "\n";
}
Since sound already worked with either a class or an instance, you’re done!
Let’s train your animals to eat:
{ package Animal;
  sub named {
    my $class = shift;
    my $name = shift;
    bless \$name, $class;
  }
  sub name {
    my $either = shift;
    ref $either
      ? $$either                # it's an instance, return name
      : "an unnamed $either";   # it's a class, return generic
  }
  sub speak {
    my $either = shift;
    print $either->name, " goes ", $either->sound, "\n";
  }
  sub eat {
    my $either = shift;
    my $food = shift;
    print $either->name, " eats $food.\n";
  }
}
{ package Horse;
  @ISA = qw(Animal);
  sub sound { "neigh" }
}
{ package Sheep;
  @ISA = qw(Animal);
  sub sound { "baaaah" }
}
Now try it out:
my $tv_horse = Horse->named("Mr. Ed");
$tv_horse->eat("hay");
Sheep->eat("grass");
It prints:
Mr. Ed eats hay.
an unnamed Sheep eats grass.
An instance method with parameters gets invoked with the instance, and then the list of parameters. That first invocation is like:
Animal::eat($tv_horse, "hay");
The instance methods form the Application Programming Interface (API) for an object. Most of the effort involved in designing a good object class goes into the API design because the API defines how reusable and maintainable the object and its subclasses will be. Do not rush to freeze an API design before you’ve considered how the object will be used.
What if an instance needs more data? Most interesting instances are made of many items, each of which can in turn be a reference or another object. The easiest way to store these items is often in a hash. The keys of the hash serve as the names of parts of the object (also called instance or member variables), and the corresponding values are, well, the values.
How do you turn the horse into a hash?[42] Recall that an object is any blessed reference. You can just as easily make it a blessed hash reference as a blessed scalar reference, as long as everything that looks at the reference is changed accordingly.
Let’s make a sheep that has a name and a color:
my $lost = bless { Name => "Bo", Color => "white" }, "Sheep";
$lost->{Name} has Bo, and $lost->{Color} has white.
But you want to make $lost->name access the name, and that’s now messed up because it’s expecting a scalar reference. Not to worry, because it’s pretty easy to fix up:
## in Animal
sub name {
  my $either = shift;
  ref $either
    ? $either->{Name}
    : "an unnamed $either";
}
named still builds a scalar sheep, so let’s fix that as well:
## in Animal
sub named {
  my $class = shift;
  my $name = shift;
  my $self = { Name => $name, Color => $class->default_color };
  bless $self, $class;
}
What’s this default_color? If named has only the name, you still need to set a color, so you’ll have a class-specific initial color. For a sheep, you might define it as white:
## in Sheep
sub default_color { "white" }
Then to keep from having to define one for each additional class, define a backstop method, which serves as the “default default,” directly in Animal:
## in Animal
sub default_color { "brown" }
Thus, all animals are brown (muddy, perhaps), unless a specific animal class gives a specific override to this method.
Now, because name and named were the only methods that referenced the structure of the object, the remaining methods can stay the same, so speak still works as before. This supports another basic rule of OOP: if the structure of the object is accessed only by the object’s own methods or inherited methods, there’s less code to change when it’s time to modify that structure.
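Putting the pieces from this section together gives the following sketch of the hash-based classes. It collects the listings above unchanged except that @ISA is declared with our, an adjustment made here so the sketch runs under use strict.

```perl
use strict;
use warnings;

{ package Animal;
  sub named {
    my $class = shift;
    my $name  = shift;
    my $self  = { Name => $name, Color => $class->default_color };
    bless $self, $class;
  }
  sub name {
    my $either = shift;
    ref $either ? $either->{Name} : "an unnamed $either";
  }
  sub default_color { "brown" }     # the backstop "default default"
  sub speak {                       # unchanged from the scalar-based version
    my $either = shift;
    print $either->name, " goes ", $either->sound, "\n";
  }
}
{ package Horse;
  our @ISA = qw(Animal);
  sub sound { "neigh" }
}
{ package Sheep;
  our @ISA = qw(Animal);
  sub sound { "baaaah" }
  sub default_color { "white" }     # class-specific override
}

my $tv_horse = Horse->named("Mr. Ed");
my $lost     = Sheep->named("Bo");

$tv_horse->speak;    # Mr. Ed goes neigh
$lost->speak;        # Bo goes baaaah
Sheep->speak;        # an unnamed Sheep goes baaaah

print "Horse default: ", Horse->default_color, "\n";   # brown, from Animal
print "Sheep default: ", Sheep->default_color, "\n";   # white, overridden
```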
Having all horses be brown would be boring. Let’s add a method or two to get and set the color:
## in Animal
sub color {
  my $self = shift;
  $self->{Color};
}
sub set_color {
  my $self = shift;
  $self->{Color} = shift;
}
Now you can fix that color for Mr. Ed:
my $tv_horse = Horse->named("Mr. Ed");
$tv_horse->set_color("black-and-white");
print $tv_horse->name, " is colored ", $tv_horse->color, "\n";
which results in:
Mr. Ed is colored black-and-white
Because of the way the code is written, the setter also returns the updated value. Think about this (and document it) when you write a setter. What does the setter return? Here are some common variations:
The updated parameter (same as what was passed in)
The previous value (similar to the way umask or the single-argument form of select works)
The object itself
A success/fail code
Each has advantages and disadvantages. For example, if you return the updated parameter, you can use it again for another object:
$tv_horse->set_color( $eating->set_color( color_from_user( ) ));
The implementation given earlier returns the newly updated value. Frequently, this is the easiest code to write, and often the fastest to execute.
If you return the previous value, you can easily create “set this value temporarily to that” functions:
{
  my $old_color = $tv_horse->set_color("orange");
  # ... do things with $tv_horse ...
  $tv_horse->set_color($old_color);
}
This is implemented as:
sub set_color {
  my $self = shift;
  my $old = $self->{Color};
  $self->{Color} = shift;
  $old;
}
For more efficiency, you can avoid stashing the previous value when in a void context using the wantarray function:
sub set_color {
  my $self = shift;
  if (defined wantarray) {
    # this method call is not in void context, so
    # the return value matters
    my $old = $self->{Color};
    $self->{Color} = shift;
    $old;
  } else {
    # this method call is in void context
    $self->{Color} = shift;
  }
}
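The three values that wantarray can return are easier to see in isolation. In this sketch, note_context and @seen are hypothetical names introduced only to record which context each call was made in.

```perl
use strict;
use warnings;

my @seen;

sub note_context {
  my $want = wantarray;               # undef, false-but-defined, or true
  push @seen, !defined $want ? "void"
            : $want          ? "list"
            :                  "scalar";
  return;
}

note_context();                       # result ignored: void context
my $scalar = note_context();          # scalar context
my @list   = note_context();          # list context

print "@seen\n";                      # void scalar list
```

Only the first call sees an undefined wantarray, which is the case the setter above uses to skip saving the old value.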
If you return the object itself, you can chain settings:
my $tv_horse =
  Horse->named("Mr. Ed")
       ->set_color("grey")
       ->set_age(4)
       ->set_height("17 hands");
This works because the output of each setter is the original object, becoming the object for the next method call. Implementing this is again relatively easy:
sub set_color {
  my $self = shift;
  $self->{Color} = shift;
  $self;
}
The void context trick can be used here too, although with questionable value because you’ve already established $self.
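Here is a runnable sketch of chained setters. The text mentions set_age but never defines it, so the Age key, the set_age implementation, and the age getter below are assumptions made for illustration.

```perl
use strict;
use warnings;

{ package Animal;
  sub named {
    my $class = shift;
    my $name  = shift;
    bless { Name => $name }, $class;
  }
  sub name  { $_[0]->{Name} }
  sub color { $_[0]->{Color} }
  sub age   { $_[0]->{Age} }          # hypothetical attribute for this sketch

  # Each setter returns the object so that calls can be chained.
  sub set_color { my $self = shift; $self->{Color} = shift; $self }
  sub set_age   { my $self = shift; $self->{Age}   = shift; $self }
}
{ package Horse;
  our @ISA = qw(Animal);
}

my $tv_horse = Horse->named("Mr. Ed")
                    ->set_color("grey")
                    ->set_age(4);

print $tv_horse->name, " is ", $tv_horse->color,
      " and ", $tv_horse->age, " years old\n";
```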
Finally, returning a success status is useful if it’s fairly common for an update to fail, rather than an exceptional event. The other variations would have to indicate failure by throwing an exception with die.
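A setter that reports success or failure might look like the following sketch. The list of acceptable colors (%known_color) is invented for the example; the text does not say what would make an update fail.

```perl
use strict;
use warnings;

{ package Animal;
  # Hypothetical validation table: only these colors are accepted.
  my %known_color = map { $_ => 1 } qw(brown white black grey);

  sub named {
    my $class = shift;
    my $name  = shift;
    bless { Name => $name, Color => "brown" }, $class;
  }
  sub color { $_[0]->{Color} }
  sub set_color {
    my $self = shift;
    my $new  = shift;
    return 0 unless defined $new && $known_color{$new};   # failure: no change
    $self->{Color} = $new;
    return 1;                                             # success
  }
}

my $horse = Animal->named("Mr. Ed");

my $first  = $horse->set_color("grey");    # accepted
my $second = $horse->set_color("plaid");   # rejected, color stays grey

print "first: $first, second: $second, color: ", $horse->color, "\n";
```

The caller can then write something like `$horse->set_color($c) or warn "bad color"` without wrapping the call in eval.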
In summary: use what you want, be consistent if you can, but document it nonetheless (and don’t change it after you’ve already released one version).
You might have obtained or set the color outside the class simply by following the hash reference: $tv_horse->{Color}. However, this violates the encapsulation of the object by exposing its internal structure. The object is supposed to be a black box, but you’ve pried off the hinges and looked inside.
One purpose of OOP is to enable the maintainer of Animal or Horse to make reasonably independent changes to the implementation of the methods and still have the exported interface work properly. To see why accessing the hash directly violates this, let’s say that Animal no longer uses a simple color name for the color, but instead changes to use a computed RGB triple to store the color (holding it as an arrayref), as in:
use Color::Conversions qw(color_name_to_rgb rgb_to_color_name);
...
sub set_color {
  my $self = shift;
  my $new_color = shift;
  $self->{Color} = color_name_to_rgb($new_color);  # arrayref
}
sub color {
  my $self = shift;
  rgb_to_color_name($self->{Color});               # takes arrayref
}
The old interface can be maintained if you use a setter and getter because they can perform the translations. You can also add new interfaces now to enable the direct setting and getting of the RGB triple:
sub set_color_rgb {
  my $self = shift;
  $self->{Color} = [@_];    # set colors to remaining parameters
}
sub get_color_rgb {
  my $self = shift;
  @{ $self->{Color} };      # return RGB list
}
If code outside the class looks at $tv_horse->{Color} directly, this change is no longer possible: that code would end up storing a string ('blue') where an arrayref is needed ([0,0,255]), or using an arrayref as a string.
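The following sketch shows the problem in a self-contained form. Because Color::Conversions is only an illustrative module name, the sketch substitutes two small lookup tables (%rgb_for and %name_for, both invented here) for the conversion functions; an unknown color name would simply store undef.

```perl
use strict;
use warnings;

{ package Animal;
  # Stand-ins for the conversion functions: a tiny name <-> RGB table.
  my %rgb_for  = (black => [0, 0, 0], white => [255, 255, 255], blue => [0, 0, 255]);
  my %name_for = map { join(",", @{ $rgb_for{$_} }) => $_ } keys %rgb_for;

  sub named {
    my $class = shift;
    my $name  = shift;
    my $self  = bless { Name => $name }, $class;
    $self->set_color("black");
    $self;
  }
  sub set_color {
    my $self = shift;
    my $new  = shift;
    $self->{Color} = $rgb_for{$new};                  # stored as an arrayref
  }
  sub color {
    my $self = shift;
    $name_for{ join ",", @{ $self->{Color} } };       # translated back to a name
  }
}

my $horse = Animal->named("Mr. Ed");
$horse->set_color("blue");

my $via_method = $horse->color;      # "blue": the old interface still works
my $via_hash   = $horse->{Color};    # an arrayref: code expecting a name breaks

print "method says: $via_method\n";
print "direct access says: $via_hash\n";   # prints something like ARRAY(0x...)
```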
Because you’re going to play nice and always call the getters and setters instead of reaching into the data structure, getters and setters are called frequently. To save a teeny-tiny bit of time, you might see these getters and setters written as:
## in Animal
sub color { $_[0]->{Color} }
sub set_color { $_[0]->{Color} = $_[1]; }
Here’s an alternate way to access the arguments: $_[0] is used in place, rather than with a shift. Functionally, this example is identical to the previous implementation, but it’s slightly faster, at the expense of some ugliness.
Another alternative to the pattern of creating two different methods for getting and setting a parameter is to create one method that notes whether or not it gets any additional arguments. If the arguments are absent, it’s a get operation; if the arguments are present, it’s a set operation. A simple version looks like:
sub color {
  my $self = shift;
  if (@_) {                   # are there any more parameters?
    # yes, it's a setter:
    $self->{Color} = shift;
  } else {
    # no, it's a getter:
    $self->{Color};
  }
}
Now you can say:
my $tv_horse = Horse->named("Mr. Ed");
$tv_horse->color("black-and-white");
print $tv_horse->name, " is colored ", $tv_horse->color, "\n";
The presence of the parameter in the second line denotes that you are setting the color, while its absence in the third line indicates a getter.
While this strategy might at first seem attractive because of its apparent simplicity, it complicates the actions of the getter (which will be called frequently). This strategy also makes it difficult to search through your listings to find only the setters of a particular parameter, which are often more important than the getters. In fact, we’ve been burned by this in the past when a setter became a getter because another function returned more parameters than expected after an upgrade.
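One way a call to a combined method can silently switch roles is sketched below. The body of color_from_user is hypothetical: it stands for any function that may return an empty list, and it is not necessarily the situation the authors ran into.

```perl
use strict;
use warnings;

{ package Animal;
  sub named {
    my $class = shift;
    my $name  = shift;
    bless { Name => $name, Color => "brown" }, $class;
  }
  sub color {
    my $self = shift;
    if (@_) { $self->{Color} = shift }   # arguments present: setter
    else    { $self->{Color} }           # no arguments: getter
  }
}

# Hypothetical helper: returns an empty list when the user picks nothing.
sub color_from_user { return }

my $horse = Animal->named("Mr. Ed");
$horse->color("black-and-white");        # clearly a set

# Intended as a set, but the empty list leaves @_ empty inside color,
# so this call quietly behaves as a get and changes nothing.
$horse->color( color_from_user() );

my $result = $horse->color;
print "color is still: $result\n";       # black-and-white
```

With a separate set_color method, the same call would at least store undef visibly instead of doing nothing.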
Setting the name of an unnameable generic Horse is probably not a good idea; neither is calling named on an instance. Nothing in the Perl subroutine definition says “this is a class method” or “this is an instance method.” Fortunately, the ref operator lets you throw an exception when called incorrectly. As an example of instance- or class-only methods, consider the following:
use Carp qw(croak);
sub instance_only {
  ref(my $self = shift) or croak "instance variable needed";
  # ... use $self as the instance ...
}
sub class_only {
  ref(my $class = shift) and croak "class name needed";
  # ... use $class as the class ...
}
Here, the ref function returns true for an instance or false for a class. If the undesired value is returned, you’ll croak, which has the added advantage of placing the blame on the caller, not on you. The caller will get an error message like this, giving the line number in their code where the wrong method was called:
instance variable needed at their_code line 1234
While this seems like a good thing to do all the time, practically no CPAN or core modules add this extra checking. Maybe it’s only for the ultra-paranoid.
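For readers who do want the check, here is a sketch of it applied to the hash-based Animal, with eval used to catch the exception. The set_name method follows the name used in the exercise below; its body here is only one possible implementation.

```perl
use strict;
use warnings;

{ package Animal;
  use Carp qw(croak);

  sub named {                                    # class-only constructor
    ref(my $class = shift) and croak "class name needed";
    my $name = shift;
    bless { Name => $name }, $class;
  }
  sub set_name {                                 # instance-only setter
    ref(my $self = shift) or croak "instance variable needed";
    $self->{Name} = shift;
  }
  sub name { $_[0]->{Name} }
}

my $horse = Animal->named("Mr. Ed");
$horse->set_name("Mister Ed");                   # fine: called on an instance

# Wrong: called on the class. croak dies, and eval catches the exception.
my $ok    = eval { Animal->set_name("Nobody"); 1 };
my $error = $@;

print "caught: $error" unless $ok;
```

The caught message names the file and line of the offending call rather than a line inside Animal.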
The answers for all exercises can be found in Section A.8.
Give the Animal class the ability to get and set the name and color. Be sure that your result works under use strict. Also make sure your get methods work with both a generic animal and a specific animal instance. Test your work with:
my $tv_horse = Horse->named("Mr. Ed");
$tv_horse->set_name("Mister Ed");
$tv_horse->set_color("grey");
print $tv_horse->name, " is ", $tv_horse->color, "\n";
print Sheep->name, " colored ", Sheep->color, " goes ", Sheep->sound, "\n";
What should you do if you’re asked to set the name or color of a generic animal?
[37] The simplest, but rarely used in real code for reasons you’ll see shortly
[38] Actually, $tv_horse points to the object, but in common terms, you nearly always deal with objects by references to those objects. Hence, it’s simpler to say that $tv_horse is the horse, not “the thing that $tv_horse is referencing.”
[39] Although doing so outside the class is a bad idea, as you’ll see later.
[40] This is perhaps different from other OOP languages with which you may be familiar.
[41] If you come from another OO language background, you might choose $this or $me for the variable name, but you’ll probably confuse most other Perl OO-hackers.
[42] Other than calling on a butcher, that is.