You might wonder, “do all objects inherit from a common class?” “What if a method is missing?” “What about multiple inheritance?” “How come we haven’t seen a reference to a filehandle yet?” Well, wonder no more. This chapter covers these subjects and more.
As you define classes, you create
inheritance hierarchies through the global @ISA
variables in each package. To search for a method, Perl wanders
through the @ISA
tree until it finds a match or
fails.
After the
search fails however, Perl always looks in one special class called
UNIVERSAL
and invokes a method from there, if
found, just as if it had been located in any other class or
superclass.
One way to look at this is that UNIVERSAL
is the
base class from which all objects are derived. Any method you place
here, such as:
sub UNIVERSAL::fandango { warn "object ", shift, " can do the fandango!\n"; }
enables all objects of your program to be called as
$some_object->fandango
.
Generally, you should provide a
fandango
method for specific classes of interest,
and then provide a definition in
UNIVERSAL::fandango
as a backstop, in case a more
specific method can’t be found. A practical example
might be a data-dumping routine for debugging or maybe a marshalling
strategy to dump all application objects to a file. Simply provide
the general method in UNIVERSAL
and override it in
the specific classes for unusual objects.
Obviously, UNIVERSAL
should be used sparingly
because there’s only one universe of objects, and
your fandango
might collide with some other
included module’s fandango
. For
this reason, UNIVERSAL
is hardly used for anything
except methods which must be completely, well, universal. Like during
debugging.
Besides providing a
place for you to put universally available methods, the
UNIVERSAL
package comes preloaded with two very
useful utility methods: isa
and
can
. Because these methods are defined in
UNIVERSAL
, they are automatically available to all
objects.
The isa
method tests to see whether a given class or instance is a member of
a given class or a member of a class that inherits from the given
class. For example, continuing on with the
Animal
-family from the past chapters:
if (Horse->isa("Animal")) { # does Horse inherit from Animal? print "A Horse is an Animal.\n"; } my $tv_horse = Horse->named("Mr. Ed"); if ($tv_horse->isa("Animal")) { # is it an Animal? print $tv_horse->name, " is an Animal.\n"; if ($tv_horse->isa("Horse")) { # is it a Horse? print "In fact, ", $tv_horse->name, " is a Horse.\n"; } else { print "...but it's not a Horse.\n"; } }
This is handy when you have a heterogeneous mix of objects in a data structure and want to distinguish particular categories of objects:
my @horses = grep $_->isa("Horse"), @all_animals;
The result will be only the horses (or race horses) from the array. Compare that with:
my @horses_only = ref $_ eq "Horse", @all_animals;
which picks out just the horses because a
RaceHorse
won’t return
Horse
for ref
.
In general, you shouldn’t use:
ref($some_object) eq "SomeClass"
in your programs because it prevents future users from subclassing
that class. Use the isa
construct as given
earlier.
One downside of the isa
call here is that it works only on blessed references or scalars that
look like class names. If you happen to pass it an unblessed
reference, you get a fatal (but trappable) error of:
Can't call method "isa" on unblessed reference at ...
To call isa
more
robustly, don’t call it as a method. Instead, call
it as a subroutine:
if (UNIVERSAL::isa($unknown_thing, "Animal")) { ... it's an Animal! ... }
This works regardless of what $unknown_thing
contains.
As in the case of isa
, you can test for acceptable
behaviors with the can
method. For example:
if ($tv_horse->can("eat")) { $tv_horse->eat("hay"); }
If the result of can
is true, then somewhere in
the inheritance hierarchy, a class has defined an
eat
method. Again, the caveats about
$tv_horse
being only either a blessed reference or
a class name as a scalar still apply, so the robust solution when you
might deal with nearly anything looks like:
if (UNIVERSAL::can($tv_horse, "eat")) { ... }
Note that if you
defined UNIVERSAL::fandango
earlier, then:
$object->can("fandango")
always returns true because all objects can do the fandango.
After
Perl searches the inheritance tree and UNIVERSAL
for a method, it doesn’t just stop there if the
search is unsuccessful. Perl repeats the search through the very same
hierarchy (including UNIVERSAL
), looking for a
method named AUTOLOAD
.
If an AUTOLOAD
exists, the subroutine is called in place of the original method,
passing it the normal predetermined argument list: the class name or
instance reference, followed by any arguments provided to the method
call. The original method name is passed in the package variable
called $AUTOLOAD
(in the package where the
subroutine was compiled) and contains the fully qualified method
name, so you should generally strip everything up to the final double
colon if you want a simple method name.
The AUTOLOAD subroutine can execute the desired operation itself,
install a subroutine and then jump into it, or perhaps just
die
if asked to perform an unknown method.
One use of AUTOLOAD
defers the compilation of a large subroutine until it is actually
needed. For example, suppose the eat
method for an
animal is complex but unused in nearly every invocation of the
program. You can defer its compilation as follows:
## in Animal sub AUTOLOAD { our $AUTOLOAD; (my $method = $AUTOLOAD) =~ s/.*:://s; # remove package name if ($method eq "eat") { ## define eat: eval q{ sub eat { ... long definition goes here ... } }; # End of eval's q{ } string die $@ if $@; # if typo snuck in goto &eat; # jump into it } else { # unknown method croak "$_[0] does not know how to $method\n"; } }
If the method name is eat
, you’ll
define eat
(which had previously been held in a
string but not compiled), and then jump into it with a special
construct that replaces the current subroutine invocation with an
invocation to eat
.[57] After the first
AUTOLOAD
hit, the eat
subroutine is now defined, so won’t be coming back
here. This is great for compile-as-you-go programs because it
minimizes startup overhead.
For a more automated way of creating code
to do this, which makes it easy to turn the autoloading off during
development and debugging, see the AutoLoader
and
SelfLoader
core module documentation.
Chapter 9 showed how to create color
and set_color
to get and set the color of an
animal. If you had 20 attributes instead of one or two, the code
would be painfully repetitive. However, using an
AUTOLOAD
method, you can construct the nearly
identical accessors as needed, saving both compilation time and
wear-and-tear on the developer’s keyboard.
Use a code reference as a closure
to do the job. First, set up an AUTOLOAD
for the
object and define a list of hash keys for which you want trivial
accessors:
sub AUTOLOAD { my @elements = qw(color age weight height);
Next, you’ll see if the method is a getter for one of these keys, and if so, install a getter and jump to it:
our $AUTOLOAD; if ($AUTOLOAD =~ /::(\w+)$/ and grep $1 eq $_, @elements) { my $field = ucfirst $1; { no strict 'refs'; *{$AUTOLOAD} = sub { $_[0]->{$field} }; } goto &{$AUTOLOAD}; }
You need to
use ucfirst
because you named the method
color
to fetch the hash element called
Color
. The glob notation here installs a wanted
subroutine as defined by the coderef closure, which fetches the
corresponding key from the object hash. Consider this part to be
magic that you just cut and paste into your program. Finally, the
goto
construct jumps into the newly defined
subroutine.
Otherwise, perhaps it’s a setter:
if ($AUTOLOAD =~ /::set_(\w+)$/ and grep $1 eq $_, @elements) { my $field = ucfirst $1; { no strict 'refs'; *{$AUTOLOAD} = sub { $_[0]->{$field} = $_[1] }; } goto &{$AUTOLOAD}; }
If it is neither, death awaits:
die "$_[0] does not understand $method\n"; }
Again, you pay the price for the
AUTOLOAD
only on the first hit of a particular
getter or setter. After that, a subroutine is now already defined,
and you can just invoke it directly.
If all that coding for creating
accessors using AUTOLOAD
looks messy, rest assured
that you really don’t need to tackle it, because
there’s a CPAN module that does it a bit more
directly: Class::MethodMaker
.
For example, a simplified version of the Animal
class might be defined as follows:
package Animal; use Class::MethodMaker new_with_init => 'new', get_set => [-eiffel => [qw(color height name age)]], abstract => [qw(sound)], ; sub init { my $self = shift; $self->set_color($self->default_color); } sub named { my $self = shift->new; $self->set_name(shift); $self; } sub speak { my $self = shift; print $self->name, " goes ", $self->sound, "\n"; } sub eat { my $self = shift; my $food = shift; print $self->name, " eats $food\n"; } sub default_color { "brown"; }
The getters and setters for the four instance attributes (name,
height, color, and age) are defined automatically, using the method
color
to get the color and
set_color
to set the color. (The
eiffel
flag says “do it the way
the Eiffel language does it,” which is the way it
should be done here.) The messy blessing step is now hidden behind a
simple new
method. The initial color is defined as
the default color, as before, because the init
method is automatically called from new
.
However, you can still call Horse->named('Mr
.
Ed')
because it immediately calls the
new
routine as well.
The sound
method is
automatically generated as an abstract method.
Abstract methods are placeholders, meant to be
defined in a subclass. If a subclass fails to define the method, the
method generated for Animal
’s
sound
dies.
You lose the ability to call the getters
(such as name
) on the class itself, rather than an
instance. In turn, this breaks your prior usage of calling
speak
and eat
on generic
animals, since they call the accessors. One way around this is to
define a more general version of name
to handle
either a class or instance and then change the other routines to call
it:
sub generic_name { my $either = shift; ref $either ? $either->name : "an unnamed $either"; } sub speak { my $either = shift; print $either->generic_name, " goes ", $either->sound, "\n"; } sub eat { my $either = shift; my $food = shift; print $either->generic_name, " eats $food\n"; }
There. Now
it’s looking nearly drop-in compatible with the
previous definition, except for those friend classes that referenced
the attribute names directly in the hash as the initial-cap-keyed
versions (such as Color
) rather than through the
accessors ($self->color
).
That brings up the maintenance issue again. The more you can decouple your implementation (hash versus array, names of hash keys, or types of elements) from the interface (method names, parameter lists, or types of return values), the more flexible and maintainable your system becomes.
That flexibility is not free, however. The cost of a method call is higher than the cost of a hash lookup, so it may be acceptable (or even necessary) for a friend class to peer inside. You may have to pay the programmer-time price of development and maintenance so you don’t pay the runtime price of an overly flexible system.
On the other hand, don’t go overboard in the other direction. Many anecdotes float around about systems where everything was so indirected (to be flexible) that the system ran too slowly to be used.
How does Perl wander through the
@ISA
tree? The answer may be simple or complex. If
you don’t have multiple inheritance (that is, if no
@ISA
has more than one element), it is simple:
Perl simply goes from one @ISA
to the next until
it finds the ultimate base class whose @ISA
is
empty.
Multiple inheritance is more complex. It occurs when a
class’s @ISA
has more than one
element. For example, suppose someone had given an existing class,
called Racer
, which has the basic abilities for
anything that can race, so that it’s ready to be the
base class for a runner, a fast car, or a racing turtle. With that,
you can make the RaceHorse
class as simply as
this, maybe:[58]
{ package RaceHorse; our @ISA = qw{ Horse Racer }; }
Now a RaceHorse
can do anything a
Horse
can do, and anything a
Racer
can do as well. When Perl searches for a
method that’s not provided directly by
RaceHorse
, it first searches through all the
capabilities of the Horse
(including all its
parent classes, such as Animal
). When the
Horse
possibilities are exhausted, Perl turns to
see whether Racer
(or one of its subclasses)
supplies the needed method. On the other hand, if you want Perl to
search Racer
and its subclasses before searching
Horse
, put them into @ISA
in
that order (see Figure 11-1).
So far, you’ve seen references to scalars, arrays, hashes, and subroutines. Another important value type in Perl is the filehandle.
However, a filehandle
isn’t stored in a variable. The filehandle is the
handle itself. You can’t take a reference directly
to a filehandle.[59] However, using the
IO::File
built-in class, you can create objects
that act like filehandles within Perl. Here’s a
typical use:
use IO::File; my $fh = IO::File->open("/etc/passwd") or die "constructor failed: $!"; while (<$fh>) { # $fh acts like any filehandle print "a password line is $_"; } close $fh; # nearly all built-ins can use IO::File
Here, $fh
is constructed using the
open
class method of IO::File
,
and then used in places where ordinarily you’d use a
traditional (bareword) filehandle. Furthermore, you also get some
additional methods:
if ($fh->opened) { ... } # file is open $fh->blocking(0); # make I/O be "non-blocking" if supported
The core built-in operations that use
filehandles can all use an IO::File
objects
instead. If the IO::File
object is within a simple
scalar variable, you can always replace the filehandle with the
scalar:
use IO::File; my $fh = IO::File->new; # create unopened "filehandle" object open $fh, ">my_new_file" or die "Cannot create: $!"; print $fh "$_\n" for 1..10; close $fh;
An IO::File
object automatically gets closed cleanly when destroyed, so you can
simplify the previous code as:
use IO::File; { my $fh = IO::File->open(">my_new_file") or die "Cannot create my_new_file: $!"; print $fh, "$_\n" for 1..10; }
As $fh
goes out of scope, the filehandle is
automatically closed. Nice.
If the IO::File
object
is not named by a simple scalar variable, some operations require a
slightly modified syntax to work. For example, copy every file
matched by the glob pattern of *.input
to a
corresponding file whose suffix is .output
, but do
it in parallel. First, open all the files, both inputs and outputs:
my @handlepairs; foreach my $file (<*.input>) { (my $out = $file) =~ s/\.input$/.output/; push @handlepairs, [ (IO::File->new("<$file") || die), (IO::File->new(">$out") || die), ]; }
Now you have an array of
references to arrays, each element of which is an
IO::File
object. Let’s pump the
data:
while (@handlepairs) { @handlepairs = grep { if (defined(my $line = $_->[0]->getline)) { print { $_->[1] } $line; } else { 0; } } @handlepairs; }
As long as you have pairs, keep passing
the list through the grep
structure:
@handlepairs = grep { CONDITION } @handlepairs;
On each pass, only the handle pairs that
evaluate as true in the grep
CONDITION
survive. Inside, you take the first
element of each pair and try to read from it. If
that’s successful, write that line to the second
element of the pair (the corresponding output handle). If the
print
is successful, it returns true, which lets
grep
know that you want to keep that pair. If
either the print
fails or the
getline
returned undef
, the
grep
sees the false value as an indication to
discard that pair. Discarding the pair automatically closes both
filehandles. Cool!
Note that you can’t use the more traditional filehandle read or filehandle print operations because the reading and writing filehandles weren’t in a simple scalar variable. Rewrite that loop to see if copying the handles is easier:
while (@handlepairs) { @handlepairs = grep { my ($IN, $OUT) = @$_; if (defined(my $line = <$IN>)) { print $OUT $line; } else { 0; } } @handlepairs; }
This scenario is arguably better. Most of the time, simply copying
the complexly referenced value into a simple scalar is easier on the
eyes. In fact, another way to write that loop is to get rid of the
ugly if
structure:
while (@handlepairs) { @handlepairs = grep { my ($IN, $OUT) = @$_; my $line; defined($line = <IN>) and print $OUT $line; } @handlepairs; }
As long as someone understands that and
is a
partial evaluator and that print
returns true when
everything is OK, this is a fine replacement. Remember the Perl
motto: “There’s more than one way
to do it” (although not all of them are equally nice
or legitimate).
The answers for all exercises can be found in Section A.10.
The Professor has to read a log file that looks like:
Gilligan: 1 coconut Skipper: 3 coconuts Gilligan: 1 banana Ginger: 2 papayas Professor: 3 coconuts MaryAnn: 2 papayas ...
He wants to write a series of files, called
gilligan.info
, maryann.info
,
and so on. Each file should contain all the lines that begin with
that name. (Names are always delimited by the trailing colon.) At the
end, gilligan.info
should start with:
Gilligan: 1 coconut Gilligan: 1 banana
Now the log file is large, and the coconut-powered computer is not very fast, so he wants to process the input file in one pass and write all output files in parallel. How does he do it?
Hint: Use a hash, keyed by the castaway name, holding
IO::File
objects for each output file. Create them
as necessary.
[57] Although
goto
is generally (and rightfully) considered
evil, this form of goto
, which gives a subroutine
name as a target, is not really the evil goto; it’s
the good goto.
[58] If there is a conflict among the methods
of Horse
and Racer
, or if their
implementations aren’t able to work together, the
situation can become much more difficult.
[59] You can use the glob, take a reference to the glob, or take a reference to the I/O structure within a glob, but that’s still not a reference to the filehandle.