A module is a building block for your program: a set of related subroutines and variables packaged so it can be reused. This chapter looks at the basics of modules: how to bring in modules that others have written, and how to write modules of your own.
To understand what happens with use, look at one of the many modules included with a normal Perl distribution: File::Basename. This module parses file specifications into useful pieces in a mostly portable manner. The default usage:
use File::Basename;
introduces three subroutines, fileparse, basename, and dirname,[60] into the current package: typically, main in the main part of your program. From this point forward, within this package, you can say:[61]
my $basename = basename($some_full_path);
my $dirname = dirname($some_full_path);
as if you had written the basename and dirname subroutines yourself, or (nearly) as if they were built-in Perl functions.[62]
However, suppose you already had a dirname subroutine? You’ve now overwritten it with the definition provided by File::Basename! If you had turned on warnings, you’d see a message stating that, but otherwise, Perl really doesn’t care.
Fortunately, you can tell the use operation to limit its actions. Do this by specifying a list of subroutine names following the module name, called the import list:
use File::Basename ("fileparse", "basename");
Now the module defines only the two given subroutines in your package, leaving your own dirname alone. Of course, this is awkward to type, so more often you’ll see this written as:
use File::Basename qw( fileparse basename );
In fact, even if there’s only one item, you tend to write it with a qw( ) list for consistency and maintenance; often you’ll go back to say “give me another one from here,” and it’s simpler if it’s already a qw( ) list.
You’ve protected the local dirname routine, but what if you still want the functionality provided by File::Basename’s dirname? No problem. Just spell it out in full:
my $dirname = File::Basename::dirname($some_path);
The list of names following use doesn’t change which subroutines are defined in the module’s package (in this case, File::Basename). You can always use the full name regardless of the import list, as in:[63]
my $basename = File::Basename::basename($some_path);
In an extreme (but extremely useful) case, you can specify an empty list for the import list, as in:
use File::Basename ( ); # no import
my $base = File::Basename::basename($some_path);
An empty list is different from an absent list. An empty list says “don’t give me anything in my current package,” while an absent list says “give me the defaults.”[64] If the module’s author has done her job well, the default will probably be exactly what you want.
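A small sketch may make the difference between an absent list and an empty list concrete. It relies only on the core File::Basename module; the package names WithDefaults and WithNothing are made up for this illustration.

```perl
use strict;
use warnings;

package WithDefaults;          # hypothetical package name
use File::Basename;            # absent list: take the default imports

package WithNothing;           # hypothetical package name
use File::Basename ( );        # empty list: import nothing

package main;

for my $pkg (qw(WithDefaults WithNothing)) {
    # can() reports whether a subroutine of that name is visible in the package
    my $has = $pkg->can('basename') ? 'yes' : 'no';
    print "$pkg has basename: $has\n";
}

# The fully qualified name works no matter what was imported
print File::Basename::basename('/home/gilligan/map.txt'), "\n";
```

The first package reports yes and the second reports no, while the fully qualified call at the end succeeds either way.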
Contrast the subroutines imported by File::Basename with what another core (non-CPAN) module has by looking at File::Spec. The File::Spec module is designed to support operations commonly performed on file specifications. (A file specification is usually a file or directory name, but it may be a name of a file that doesn’t exist, in which case it’s not really a filename, is it?)
Unlike the File::Basename module, the File::Spec module has a primarily object-oriented interface. Saying:
use File::Spec;
in your program imports no subroutines into the current package. Instead, you’re expected to access the functionality of the module using class methods:
my $filespec = File::Spec->catfile( $homedir{gilligan}, 'web_docs', 'photos', 'USS_Minnow.gif' );
This calls the class method catfile of the File::Spec class, building a path appropriate for the local operating system, and returns a single string.[65] This is similar in syntax to the nearly two dozen other operations provided by File::Spec: they’re all called as class methods. No instances are ever created.
While it is never stated in the documentation, perhaps the purpose of creating these as class methods rather than as imported subroutines is to free the user of the module from having to worry about namespace collisions (as you saw for dirname in the previous section). The idea of an object class without objects, however, seems a bit off kilter. Perhaps that’s why the module’s author also provides a more traditional interface with:
use File::Spec::Functions qw(catfile curdir);
in which two of File::Spec’s many functions are imported as ordinary callable subroutines:
my $filespec = catfile( $homedir{gilligan}, 'web_docs', 'photos', 'USS_Minnow.gif' );
So as not to get dismayed about how “un-OO” the File::Spec module might be, let’s look at yet another core module, Math::BigInt:
use Math::BigInt;
my $value = Math::BigInt->new(2); # start with 2
$value->bpow(1000); # take 2**1000
print $value->bstr( ), "\n"; # print it out
Here, nothing is imported. The entire interface calls class methods such as new against the class name to create instances, and then calls instance methods against those instances.
A primarily OO module is distinguished from a primarily non-OO module in two ways:
Because methods of the OO module are meant to be called as class methods, they should all set aside their first argument, which is the class name. This class name blesses a new instance but is otherwise ignored. Thus, you should not call OO modules as if they were functional modules, and vice versa. Stick with the design of the module.
So, just what is that use doing? How does the import list come into play? Perl interprets the use list as a particular form of BEGIN block wrapped around a require and a method call. For example, the following two operations are equivalent:
use Island::Plotting::Maps qw( load_map scale_map draw_map );

BEGIN {
  require Island::Plotting::Maps;
  Island::Plotting::Maps->import( qw( load_map scale_map draw_map ) );
}
Break this code down piece by piece. First, the require. This require is a package-name require, rather than the string-expression require from earlier chapters. The double colons are turned into the native directory separator (such as / for Unix-like systems), and the name is suffixed with .pm (for “perl module”). For this example on a Unix-like system, you end up with:
require "Island/Plotting/Maps.pm";
Recalling the operation of require from earlier, this means you look in the current value of @INC, checking through each directory for a subdirectory named Island that contains a further subdirectory named Plotting that contains the file named Maps.pm.[66]
If an appropriate file isn’t found after looking at all of @INC, the program dies.[67] Otherwise, the first file found is read and evaluated. As always with require, the last expression evaluated must be true (or the program dies),[68] and once a file has been read, it will not be reread if requested again.[69]
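The two steps can be tried by hand with a core module. In this sketch, module_to_path is a hypothetical helper that only imitates the name-to-path translation (Perl does the real work internally); the BEGIN block spells out what use File::Basename qw(basename) would do.

```perl
use strict;
use warnings;

# Hypothetical helper: imitate how a package-name require derives a path.
# Keys in %INC always use "/" as the separator, whatever the platform.
sub module_to_path {
    my ($module) = @_;
    (my $path = $module) =~ s{::}{/}g;
    return "$path.pm";
}

# The long-hand form of: use File::Basename qw(basename);
BEGIN {
    require File::Basename;
    File::Basename->import( qw(basename) );
}

my $example = module_to_path('Island::Plotting::Maps');
print "$example\n";                           # Island/Plotting/Maps.pm

my $key = module_to_path('File::Basename');
my $state = exists $INC{$key} ? 'loaded' : 'not loaded';
print "$key is $state\n";                     # File/Basename.pm is loaded

print basename('/home/gilligan/maps/island.pdf'), "\n";
```

Once the file has been loaded, its path key appears in %INC, which is how a second require of the same module knows to do nothing.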
In the module interface, the require’d file is expected to define subroutines in the same-named package, not the caller’s package. So, for example, a portion of the File::Basename file might look something like this, if you took out all the good stuff:
package File::Basename;
sub dirname { ... }
sub basename { ... }
sub fileparse { ... }
1;
These three subroutines are then defined in the File::Basename package, not the package in which the use occurs. A require’d file must return a true value, so it’s traditional to use 1; as the last line of a module’s code.
How are these subroutines imported from the module’s package to the user’s package? That’s the second step inside the BEGIN block. A routine called import in the module’s package is called, passing along the entire import list. The module author is responsible for providing an appropriate import routine. It’s easier than it sounds, as discussed later in this chapter.
Finally, the whole thing is wrapped in a BEGIN block. This implies that the use operation happens at compile time, rather than runtime, and indeed it does. Thus, subroutines are associated with those defined in the module, prototypes are properly defined, and so on.
The downside of use being executed at compile time is that it also looks at @INC at compile time, which can break your program in hard-to-understand ways unless you take @INC into consideration.
For example, suppose you have your own directory under /home/gilligan/lib, and you place your own Navigation::SeatOfPants module in /home/gilligan/lib/Navigation/SeatOfPants.pm. Simply saying:
use Navigation::SeatOfPants;
is unlikely to do anything useful because only the system directories (and, in Perl versions before 5.26, the current directory) are considered for @INC. However, even adding:
push @INC, "/home/gilligan/lib"; # broken
use Navigation::SeatOfPants;
doesn’t work. Why? Because the push happens at runtime, long after the use was attempted at compile time. One way to fix this is to add a BEGIN block around the push:
BEGIN { push @INC, "/home/gilligan/lib"; }
use Navigation::SeatOfPants;
Now the BEGIN block compiles and executes at compile time, setting up the proper path for the following use.
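If the ordering still feels slippery, a tiny experiment shows it. This sketch uses nothing but core Perl; the array name @order is chosen only for the demonstration.

```perl
use strict;
use warnings;

our @order;                              # package variable, visible inside BEGIN

push @order, 'ordinary statement';       # executes at runtime
BEGIN { push @order, 'BEGIN block' }     # executes as soon as it has been compiled

print join(' -> ', @order), "\n";        # BEGIN block -> ordinary statement
```

Even though the BEGIN block appears second in the file, its push lands first, which is exactly why the plain push onto @INC comes too late for use.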
However, this is noisy and prone to require far more explanation than you might be comfortable with, especially for the maintenance programmer who has to edit your code later. Let’s replace all that clutter with a simple pragma:
use lib "/home/gilligan/lib";
use Navigation::SeatOfPants;
Here, the lib pragma takes one or more arguments and adds them at the beginning of the @INC array (think “unshift”).[70] It works because it is processed at compile time, not runtime. Hence, it’s ready in time for the use immediately following.
Because a use lib pragma will pretty much always have a site-dependent pathname, it is traditional and encouraged to put it near the top of the file. This makes it easier to find and update when the file needs to move to a new system or when the lib directory’s name changes. (Of course, you can eliminate use lib entirely if you can install your modules in one of the standard @INC locations, but that’s not always practical.)
Think of use lib as not “use this library,” but rather “use this path to find my libraries (and modules).” Too often, you see code written like:
use lib "/home/gilligan/lib/Navigation/SeatOfPants.pm"; # WRONG
and then the programmer wonders why it didn’t pull in the definitions. Be aware that use lib indeed runs at compile time, so this also doesn’t work:
my $LIB_DIR = "/home/gilligan/lib";
...
use lib $LIB_DIR; # BROKEN
use Navigation::SeatOfPants;
Certainly the declaration of $LIB_DIR is established at compile time (so you won’t get an error with use strict, although the actual use lib should complain), but the actual initialization to the /home/gilligan/lib/ path happens at runtime. Oops, too late again!
At this point, you need to put something inside a BEGIN block or perhaps rely on yet another compile-time operation: setting a constant with use constant:
use constant LIB_DIR => "/home/gilligan/lib";
...
use lib LIB_DIR;
use Navigation::SeatOfPants;
There. Fixed again. That is, until you need the library to depend on the result of a calculation. (Where will it all end? Somebody stop the madness!) This should handle about 99 percent of your needs.
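For the remaining cases, where the path really is calculated, one commonly used approach is the core FindBin module, which works out the directory of the running script while it loads. The sketch below assumes a hypothetical layout in which your modules live in a lib directory next to the script; that layout is not something this chapter prescribes.

```perl
use strict;
use warnings;

use FindBin qw($Bin);    # $Bin is filled in while FindBin loads, at compile time
use lib "$Bin/lib";      # so the interpolated path is already known here

# Hypothetical layout: modules sit in a "lib" directory beside this script
my $found = grep { $_ eq "$Bin/lib" } @INC;
print "library path ", ( $found ? "is" : "is not" ), " in \@INC\n";
```

Because both lines are use statements, they run in order at compile time, and any later use of your own modules sees the extra directory.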
Earlier we skipped over that “and now magic happens” part where the import routine (defined by the module author) is supposed to take File::Basename::fileparse and somehow alias it into the caller’s package so it’s callable as fileparse.
Perl provides a lot of introspection capabilities. Specifically, you can look at the symbol table (where all subroutines and most variables are named), see what is defined, and alter those definitions. You saw a bit of that back in the AUTOLOAD mechanism earlier. In fact, as the author of File::Basename, if you simply want to force dirname, basename, and fileparse from the current package into the main package, you can write import like this:
sub import {
  no strict 'refs';
  for (qw(dirname basename fileparse)) {
    *{"main::$_"} = \&$_;
  }
}
Boy, is that cryptic! And limited. What if you didn’t want fileparse? What if you invoked use in a package other than main?
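To see what answering both questions by hand might involve, here is a rough sketch of a more flexible import. Everything in it is hypothetical (the Island::Paths package, its two subroutines, and the Crew package); it is only meant to show the idea of asking caller for the target package and honoring an import list.

```perl
use strict;
use warnings;

BEGIN {
    package Island::Paths;               # hypothetical module, defined inline

    sub first_part { ( split m{/}, $_[0] )[0] }
    sub last_part  { ( split m{/}, $_[0] )[-1] }

    my %exportable = map { $_ => 1 } qw(first_part last_part);

    sub import {
        my ( $class, @names ) = @_;
        my $target = caller;             # the package that asked for the import
        @names = sort keys %exportable unless @names;   # no list: give everything
        for my $name (@names) {
            die "$class does not export $name\n" unless $exportable{$name};
            no strict 'refs';
            *{"${target}::$name"} = \&{"${class}::$name"};
        }
    }
}

package Crew;                            # hypothetical caller, deliberately not main
BEGIN { Island::Paths->import('last_part') }   # what "use" would do for a real file

print last_part('maps/island.pdf'), "\n";      # island.pdf
```

That is already a fair amount of code to get right in every module you write.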
Thankfully, there’s a standard import that’s available in the Exporter module. As the module author, all you do is add:
use Exporter;
our @ISA = qw(Exporter);
Now the import call to the package will inherit upward to the Exporter class, providing an import routine that knows how to take a list of subroutines[71] and export them to the caller’s package.
The import provided by Exporter examines the @EXPORT variable in the module’s package to determine which symbols are exported by default. For example, File::Basename might do something like:
package File::Basename;
our @EXPORT = qw( basename dirname fileparse );
use Exporter;
our @ISA = qw(Exporter);
The @EXPORT list both defines a list of available subroutines for export (the public interface) and provides a default list to be used when no import list is specified. For example, these two calls are equivalent:
use File::Basename;

BEGIN { require File::Basename; File::Basename->import }
No list is passed to import. In that case, the Exporter->import routine looks at @EXPORT and provides everything in the list.[72]
What if you had subroutines you didn’t want as part of the default import but that would still be available if requested? You can add those subroutines to the @EXPORT_OK list in the module’s package. For example, suppose that Gilligan’s module provides the guess_direction_toward routine by default but could also provide the ask_the_skipper_about and get_north_from_professor routines, if requested. You can start it like this:
package Navigate::SeatOfPants;
our @EXPORT = qw(guess_direction_toward);
our @EXPORT_OK = qw(ask_the_skipper_about get_north_from_professor);
use Exporter;
our @ISA = qw(Exporter);
The following invocations would then be valid:
use Navigate::SeatOfPants; # gets guess_direction_toward
use Navigate::SeatOfPants qw(guess_direction_toward); # same
use Navigate::SeatOfPants qw(guess_direction_toward ask_the_skipper_about);
use Navigate::SeatOfPants qw(ask_the_skipper_about get_north_from_professor); ## does NOT import guess_direction_toward!
If any names are specified, they must come from either @EXPORT or @EXPORT_OK, so this request is rejected by Exporter->import:
use Navigate::SeatOfPants qw(according_to_GPS);
because according_to_GPS is in neither @EXPORT nor @EXPORT_OK.[73] Thus, with those two arrays, you have control over your public interface. This does not stop someone from saying Navigate::SeatOfPants::according_to_GPS (if it existed), but at least now it’s obvious that they’re using something the module author didn’t intend to offer them.
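The whole arrangement can be tried in a single file. The sketch below defines the module inline rather than in its own .pm file, so it calls import directly inside BEGIN blocks where a real program would say use; the subroutine bodies and the Crew package are placeholders invented for the demonstration.

```perl
use strict;
use warnings;

BEGIN {
    package Navigate::SeatOfPants;       # defined inline for the demonstration
    use Exporter;
    our @ISA       = qw(Exporter);
    our @EXPORT    = qw(guess_direction_toward);
    our @EXPORT_OK = qw(ask_the_skipper_about get_north_from_professor);

    # Placeholder bodies, just enough to have something to call
    sub guess_direction_toward   { "somewhere toward $_[0]" }
    sub ask_the_skipper_about    { "the Skipper shrugs about $_[0]" }
    sub get_north_from_professor { 'north' }
}

package Crew;                            # hypothetical second caller
BEGIN { Navigate::SeatOfPants->import( qw(ask_the_skipper_about) ) }

package main;
BEGIN { Navigate::SeatOfPants->import }  # no list: take the defaults

print guess_direction_toward('home'), "\n";
print "Crew has guess_direction_toward: ",
    ( Crew->can('guess_direction_toward') ? 'yes' : 'no' ), "\n";

# A name in neither array is refused; Exporter complains and then dies
my $ok = eval { Navigate::SeatOfPants->import('according_to_GPS'); 1 };
print "request rejected: $@" unless $ok;
```

Crew asked for one optional routine by name, so it did not receive the default one, and the unknown name is turned away with an exception you can trap.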
As described in the Exporter manpage, a few shortcuts are available automatically. You can provide a list that is the same as asking for the default:
use Navigate::SeatOfPants qw(:DEFAULT);
or the default plus some others:
use Navigate::SeatOfPants qw(:DEFAULT get_north_from_professor);
These are rarely seen in practice. Why? The purpose of explicitly providing an import list generally means you want to control the subroutine names you use in your program. Those last examples do not insulate you from future changes to the module, which may import additional subroutines that could collide with your code.[74]
In a few cases, a module may supply dozens or hundreds of possible symbols. These modules can use advanced techniques (described in the Exporter documentation) to make it easy to import batches of related symbols. For example, the core Fcntl module makes the flock constants available as a group with the :flock tag:
use Fcntl qw( :flock ); # import all flock constants
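One of those techniques is the %EXPORT_TAGS hash, which Exporter consults when an import list contains a name beginning with a colon. The following sketch uses a hypothetical inline module, Island::Signals, with made-up constants; as before, import is called from a BEGIN block in place of use because the module isn’t in its own file.

```perl
use strict;
use warnings;

BEGIN {
    package Island::Signals;             # hypothetical module name
    use Exporter;
    our @ISA = qw(Exporter);

    use constant { SMOKE => 1, MIRROR => 2, FLARE => 4 };   # made-up values

    # Every name used in a tag must also appear in @EXPORT or @EXPORT_OK
    our @EXPORT_OK   = qw(SMOKE MIRROR FLARE);
    our %EXPORT_TAGS = ( signals => [ qw(SMOKE MIRROR FLARE) ] );
}

BEGIN { Island::Signals->import(':signals') }   # like: use Island::Signals qw(:signals);

my $both = SMOKE | FLARE;
print "combined signal value: $both\n";         # combined signal value: 5
```

The caller names one tag and receives the whole group, which keeps long import lists out of the calling code.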
As seen earlier, the normal means of using an object-oriented module is to call class methods and then methods against instances resulting from constructors of that class. This means that an OO module typically exports nothing, so you’ll have:
package My::OOModule::Base;
our @EXPORT = ( ); # you may even omit this line
use Exporter;
our @ISA = qw(Exporter);
As stated in Chapter 8, you can even shorten this down:
package My::OOModule::Base;
use base qw(Exporter);
What if you then derive a class from this base class? The most important thing to remember is that the import method must be inherited from the Exporter class, so you add it like so:
package My::OOModule::Derived;
use base qw(Exporter My::OOModule::Base);
However, wouldn’t the call to My::OOModule::Derived->import eventually find its way up to Exporter via My::OOModule::Base? Sure it would. So you can leave that out:
package My::OOModule::Derived;
use base qw(My::OOModule::Base);
Only the base classes at the top of the tree need specify Exporter and only when they derive from no other classes.
Please be aware of all the other reserved method names that can’t be used by your OO module (as described in the Exporter manpage). At the time of this writing, the list is export_to_level, require_version, and export_fail. Also, you may wish to reserve unimport because that routine will be called by replacing use with no. That use is rare for user-written modules, however.
Even though an OO module typically exports nothing, you might choose to export a named constructor or management routine. This routine typically acts a bit like a class method but is meant to be called as a normal routine.
One example can be found in the LWP library (on the CPAN). The URI::URL module (now deprecated and replaced by the URI module) deals with universal resource identifiers, most commonly seen as URLs such as http://www.gilligan.crew.hut/maps/island.pdf. You can construct a URI::URL object as a traditional object constructor with:
use URI::URL;
my $u = URI::URL->new("http://www.gilligan.crew.hut/maps/island.pdf");
The default import list for URI::URL also imports a url subroutine, which can be used as a constructor as well:
use URI::URL;
my $u = url("http://www.gilligan.crew.hut/maps/island.pdf");
Because this imported routine isn’t a class method, you don’t use the arrow method call to invoke it. Also, the routine is unlike anything else in the module: no initial class parameter is passed. Even though normal subroutines and method calls are both defined as subroutines in the package, the caller and the author must agree as to which is which.
The url convenience routine was nice, initially. However, it also clashed with the same-name routine in CGI.pm, leading to interesting errors (especially in a mod_perl setting). (The modern interface in the URI module doesn’t export such a constructor.) Prior to that, in order to prevent a crash, you had to remember to bring it in as:
use URI::URL ( ); # don't import "url"
my $u = URI::URL->new(...);
Let’s use CGI.pm as an example of a custom import routine. Not satisfied with the incredible flexibility of the Exporter’s import routine, author Lincoln Stein created a special import for the CGI module.[75] If you’ve ever gawked at the dizzying array of options that can appear after use CGI, it’s all a simple matter of programming.
As part of the extension provided by this custom import, you can use the CGI module as an object-oriented module:
use CGI;
my $q = CGI->new; # create a query object
my $f = $q->param("foo"); # get the foo field
or a function-oriented module:
use CGI qw(param); # import the param function
my $f = param("foo"); # get the foo field
If you don’t want to spell out every possible subfunction, bring them all in:
use CGI qw(:all); # define "param" and 800-gazillion others
my $f = param("foo");
And then there are pragmata available. For example, if you want to disable the normal sticky field handling, simply add -nosticky into the import list:
use CGI qw(-nosticky :all);
If you want to create the start_table and end_table routines, in addition to the others, it’s simply:
use CGI qw(-nosticky :all *table);
The answers for all exercises can be found in Section A.11.
Take the library you created in Chapter 2 and turn it into a module you can bring in with use. Alter the invoking code so that it uses the imported routines (rather than the full path), and test it.
[60] As well as a utility routine, fileparse_set_fstype.
[61] The new symbols are available for all code compiled in the current package from this point on, whether it’s in this same file or not. However, these symbols won’t be available in a different package.
[62] These routines pick out the filename and the directory parts of a pathname. For example, if $some_full_path were D:\Projects\Island Rescue\plan 7.rtf (presumably, the program is running on a Windows machine), the basename would be plan 7.rtf and the dirname would be D:\Projects\Island Rescue.
[63] You don’t need the ampersand in front of any of these subroutine invocations because the subroutine name is already known to the compiler following use.
[64] As you’ll see later in this chapter, the default list comes from the module’s @EXPORT array.
[65] That string might be something like /home/gilligan/web_docs/photos/USS_Minnow.gif on a Unix system. On a Windows system, it would typically use backslashes as directory separators. As you can see, this module lets you write portable code easily, at least where file specs are concerned.
[66] The .pm portion is defined by the interface and can’t be changed. Thus, all module filenames must end in dot-p-m.
[67] Trappable with an eval, of course.
[68] Again trappable with eval.
[69] Thanks to the %INC hash.
[70] use lib also unshifts an architecture-dependent library below the requested library, making it more valuable than the explicit counterpart presented earlier.
[71] And variables, although far less common, and arguably the wrong thing to do.
[72] Remember, having no list is not the same as having an empty list. If the list is empty, the module’s import method is simply not called at all.
[73] This check also catches misspellings and mistaken subroutine names, keeping you from wondering why the get_direction_from_professor routine isn’t working.
[74] For this reason, it is generally considered a bad idea for an update to a released module to introduce new default imports. If you know that your first release is still missing a function, though, there’s no reason why you can’t put in a placeholder: sub according_to_GPS { die "not implemented yet" }.
[75] Some have dubbed this the “Lincoln Loader” out of simultaneous deep respect for Lincoln and the sheer terror of having to deal with something that just doesn’t work like anything else they’ve encountered.