Perl has excellent tools for creating, testing, and distributing modules. Perl’s also good for writing standalone programs that don’t need anything else to be useful, but the tools for standalone programs aren’t as good as those for modules, when they exist at all. I want my programs to use the module development tools and be testable in the same way as modules. To do this, I restructure my programs to turn them into modulinos.
Other languages aren’t as DWIM (Do What I Mean) as Perl, and they make us create a top-level subroutine that serves as the starting point for the application. In C or Java, I have to name this subroutine main:

    /* hello_world.c */
    #include <stdio.h>

    int main ( void ) {
        printf( "Hello C World!\n" );
        return 0;
    }
Perl, in its desire to be helpful, already knows this and does it for me. My entire program is the main routine, which is how Perl ends up with the default package main. When I run my Perl program, Perl starts to compile the code it contains as if I had wrapped my main subroutine around the entire file.
In a module, most of the code is in methods or subroutines, so, unlike a program, most of it doesn’t immediately execute. I have to call a subroutine to make something happen. Try that with your favorite module; run it from the command line. In most cases, you won’t see anything happen. I can use perldoc’s -l switch to locate the actual module file:
% perldoc -l Astro::Sunrise
/perls/perl-5.18.0/lib/site_perl/5.18.0/Astro/Sunrise.pm
When I run it, nothing happens:
% perl /perls/perl-5.18.0/lib/site_perl/5.18.0/Astro/Sunrise.pm
%
I can write my program as a module, then decide at runtime how to treat the code. If I run my file as a program it will act just like a program, but if I include it as a module, perhaps in a test suite, then it won’t run the code and it will wait for me to do something. This way I get the benefit of a standalone program while using the development tools for modules.
My first step takes me backwards in Perl evolution. I need to get that explicit main routine back, and then run it only when I decide I want to run it. For simplicity, I’ll do this with a “Just another Perl hacker” (JAPH) program, but develop something more complex later.
Normally, Perl’s version of “Hello World” is simple, but I’ve thrown in package main just for fun, and I use the string “Just another Perl hacker,” instead. I don’t need the package statement for anything other than reminding the next maintainer what the default package is. I’ll use this idea later:

    #!/usr/bin/perl
    package main;

    print "Just another Perl hacker, \n";
Obviously, when I run that program, I get the string as output. I don’t want that in this case, though. I want it to behave more like a module so when I run the file, nothing appears to happen. Perl compiles the code but doesn’t have anything to execute. I wrap the entire program in its own subroutine:
    #!/usr/bin/perl
    package main;

    sub run {
        print "Just another Perl hacker, \n";
    }
The print statement won’t run until I execute the subroutine, and now I have to figure out when to do that. I have to know how to tell the difference between a program and a module.
The caller built-in tells me about the call stack, which lets me know where I am in Perl’s descent into my program. Programs and modules can use caller too; I don’t have to use it in a subroutine. If I use caller in the top level of a file I run as a program, it returns nothing because I’m already at the top level. That’s the root of the entire program. Since I know that for a file I use as a module caller returns something, and that when I call the same file as a program caller returns nothing, I have what I need to decide how to act depending on how I employ the module:

    #!/usr/bin/perl
    package main;

    run() unless caller();

    sub run {
        print "Just another Perl hacker, \n";
    }
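To watch caller report both situations, here is a minimal sketch. The CallerDemo file name and the run_perl helper are invented for this illustration and are not part of the Japh code. The sketch writes a throwaway file, then runs it once as a program and once as a module with the same perl that runs the sketch:

```perl
#!/usr/bin/perl
use strict;
use warnings;

use File::Spec::Functions qw(catfile);
use File::Temp qw(tempdir);

# CallerDemo is a hypothetical file name used only for this sketch
my $dir  = tempdir( CLEANUP => 1 );
my $file = catfile( $dir, 'CallerDemo.pm' );

open my $out, '>', $file or die "Could not write $file: $!";
print {$out} <<'PERL';
package CallerDemo;

# in scalar context, caller() is the loading package, or undef at the top level
my $loader = caller();
print defined $loader ? "loaded by $loader\n" : "run as a program\n";

1;
PERL
close $out or die "Could not close $file: $!";

# run the current perl with some arguments and return what it prints
sub run_perl {
    my @args = @_;
    open my $pipe, '-|', $^X, @args or die "Could not run $^X: $!";
    my $output = do { local $/; <$pipe> };
    close $pipe;
    return $output;
}

my $as_program = run_perl( $file );
my $as_module  = run_perl( "-I$dir", '-MCallerDemo', '-e', '0' );

print "As a program: $as_program";
print "As a module:  $as_module";
```

Run directly, the file should report that it is a program; loaded with -M, it should name main as the package that loaded it. The list form of the piped open avoids shell quoting, which I assume is available on the platform running the sketch.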
I’m going to save this program in a file, but now I have to decide how to name it. Its dual nature doesn’t suggest a file extension, but I want to use this file as a module later, so I could go along with the module file-naming convention, which adds a .pm to the name. That way, I can use it, and Perl can find it just as it finds other modules. Still, the terms program and module get in the way because it’s really both. It’s not a module in the usual sense, though, and I think of it as a tiny module, so I call it a modulino.
Now that I have my terms straight, I save my modulino as Japh.pm. It’s in my current directory, so I also want to ensure that Perl will look for modules there (i.e., it has “.” in the search path). I check the behavior of my modulino. First, I use it as a module. From the command line, I can load a module with the -M switch. I use a “null program,” which I specify with the -e switch.
When I load it as a module nothing appears to happen:
% perl -MJaph -e 0
%
Perl compiles the module, then goes through the statements it can execute immediately. It executes caller, which returns the package name that loaded my modulino or undef if I ran it directly. Since the package name is true, the unless catches it and doesn’t call run(). I’ll do more with this in a moment.
Now I want to run Japh.pm as a program. This time, caller returns nothing because it is at the top level. Since that value is false, the unless condition lets Perl invoke run() and I see the output. The only difference is how I called the file. As a module it does module things, and as a program it does program things. Here I run it as a script and get output:
% perl Japh.pm
Just another Perl hacker, 
%
Now that I have the basic framework of a modulino, I can take advantage of its benefits. Since my program doesn’t execute if I include it as a module, I can load it into a test program without it doing anything immediately. I can use all of the Perl testing framework to test programs, too.
If I write my code well—separating things into small subroutines that only do one thing—I can test each subroutine on its own. Since the run subroutine does its work by printing, I use Test::Output to capture standard output and compare the result:

    use Test::More tests => 2;
    use Test::Output;

    use_ok( 'Japh' );

    stdout_is( sub{ main::run() }, "Just another Perl hacker, \n" );
This way, I can test each part of my program until I finally put everything together in my run() subroutine, which now looks more like what I would expect from a program in C, where the main loop calls everything in the right order.
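If Test::Output isn’t installed, I can get a similar effect with core Perl only. This is a sketch of a different technique, not what Test::Output does internally: it points the default output handle at an in-memory string with select. The Japh::Sketch package and the capture_print helper are names I made up so the example stands on its own:

```perl
use strict;
use warnings;

use Test::More tests => 1;

# a stand-in for the modulino's run(), defined inline to keep this self-contained
{
    package Japh::Sketch;
    sub run { print "Just another Perl hacker, \n" }
}

# run a code reference with the default output handle pointed at a string
sub capture_print {
    my( $code ) = @_;

    my $buffer = '';
    open my $fh, '>', \$buffer
        or die "Could not open in-memory handle: $!";

    my $previous = select $fh;   # print() without a handle now goes to $fh
    $code->();
    select $previous;            # put the default handle back

    close $fh;
    return $buffer;
}

is(
    capture_print( \&Japh::Sketch::run ),
    "Just another Perl hacker, \n",
    'run() prints the message'
);
```

This only catches print calls that use the default handle. Code that prints to STDOUT explicitly, or output from an external command, needs something like Test::Output instead.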
So far my modulino concept is simple. It checks caller to see if it’s the top-level program or if it was loaded by something else. I can choose any condition and any action, though, to make my single file do something else.
Once installed, the tests for Perl modules don’t stick around. The CPAN client cleans up the test files along with the rest of the distribution files. What if I want to embed my tests in the code and have them execute under certain conditions? I can embed the tests in the module file. The Test::Inline module does this by embedding testing statements in code; I’d rather put everything in methods instead. I’ve wanted this for Perl since I first saw it in Python.
Here’s a small demonstration of the idea. I define some subroutines that know how to tell if they’re running in a certain fashion. To detect a test run, the code checks the CPANTEST environment variable. I use those subroutines to figure out which method I’ll execute. I still have the run method as before, but now I also have a test method. Since I have moved the caller checks into subroutines, I’ve introduced another level in the call stack, so I use caller(1) to look back one level:

    package Modulino::Test;
    use utf8;
    use strict;
    use warnings;

    use v5.10;

    our $VERSION = '0.10_01';

    sub _running_under_tester { !! $ENV{CPANTEST} }

    sub _running_as_app { ! defined scalar caller(1) }

    sub _loaded_as_module { defined scalar caller(1); }

    my $method = do {
           if( _running_under_tester() ) { 'test' }
        elsif( _loaded_as_module()     ) { say "Loaded as module"; undef }
        elsif( _running_as_app()       ) { 'run' }
        else                             { undef }
    };

    __PACKAGE__->$method(@ARGV) if defined $method;

    sub run {
        say "Running as program";
    }
In the test method, I get a list of other methods that I want to run. In this case, those are methods that start with _test_. Once I have all of those method names, I run each one inside Test::More’s subtest, where I call the _test_ method and expect it to output proper TAP:

    sub test {
        say "Running as test";
        my( $class ) = @_;
        my @tests = $class->_get_tests;

        require Test::More;

        foreach my $test ( @tests ) {
            Test::More::subtest( $test => sub {
                my $rc = eval { $class->$test(); 1 };
                Test::More::diag( $@ ) unless defined $rc;
            } );
        }

        Test::More::done_testing();
    }

    sub _get_tests {
        my( $class ) = @_;
        no strict 'refs';
        my $stub = $class . '::';
        my @tests =
            grep { defined &{"$stub$_"}    }
            grep { 0 == index $_, '_test_' }
            keys %{ "$stub" };

        say "Tests are @tests";
        @tests;
    }
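The symbol table lookup is the least obvious part of that code, so here is a small sketch of just that step. Demo::Tests and find_subs_by_prefix are hypothetical names for this illustration. The package holds two matching subroutines, one subroutine with the wrong name, and one variable with the right prefix that should not count:

```perl
use strict;
use warnings;

# Demo::Tests is a hypothetical package invented for this sketch
{
    package Demo::Tests;
    our $_test_variable = 1;    # right prefix, but not a subroutine

    sub _test_alpha { 1 }
    sub _test_beta  { 1 }
    sub helper      { 1 }       # a subroutine, but the wrong prefix
}

# return the sorted names of subroutines in $class that start with $prefix
sub find_subs_by_prefix {
    my( $class, $prefix ) = @_;

    no strict 'refs';           # I look up the symbol table by name
    my $stash = $class . '::';

    my @names =
        grep { defined &{"$stash$_"}  }   # keep only names with a defined sub
        grep { 0 == index $_, $prefix }   # keep only names with the prefix
        keys %{$stash};

    return sort @names;
}

print join( ', ', find_subs_by_prefix( 'Demo::Tests', '_test_' ) ), "\n";
```

The keys of the symbol table hash are every name in the package, whatever kind of thing it is, which is why the second grep has to check that a subroutine is actually defined under that name. Sorting the result is my addition; hash order is not stable, and I want the tests to run in a predictable order.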
I have one test in this file, and it’s nothing fancy. I use some Test::More subroutines that don’t really test anything. This is just a demonstration that I can make these tests run:

    sub _test_run {
        require Test::More;

        Test::More::pass();
        Test::More::pass();

        SKIP: {
            Test::More::skip( "These tests don't work", 2 );
            Test::More::fail();
            Test::More::fail();
        }
    }

    1;
Putting this all together means I can run this module as a program with CPANTEST set to a true value:
% CPANTEST=1 perl -Ilib lib/Modulino/Test.pm
Running as test
Tests are _test_run
ok 1
ok 2
ok 3 # skip These tests don't work
ok 4 # skip These tests don't work
1..4
ok 1 - _test_run
1..1
Since the tests exist in the module (just as the documentation does), I can run them any time I like, including after dependency upgrades to see if my module still works. For some people, the extra cost of compilation might be worth that; if I had many tests I could store the code in a string and compile it on demand, so I wouldn’t have to compile it for normal runs.
If I want embedded tests, I’m not likely to want to copy the test runner code in every module. I can move most of this into another module that other modules can include.
Because the UNITCHECK block isn’t going to work from an included module, I have to adjust my technique. It’s not as easy to inspect caller while compiling; I’ll have to wait until everything is compiled. Not only that, all of the methods in the module have to be defined by the time the base module wants to test, since I want to get the test names by looking at the symbol list. I can use the common module at the end of the file so it does its work after everything else is compiled, or I can require it so it compiles during the run phase. Here’s what that looks like; it’s the same code but in a different file and with adjustments to get the right level of caller:

    package Modulino::Base;
    use utf8;
    use strict;
    no warnings;

    use vars qw($VERSION);
    use Carp;

    our $VERSION = '0.10_01';

    sub _running_under_tester { !! $ENV{CPANTEST} }

    sub _running_as_app {
        my $caller = scalar caller(1);
        (defined $caller) && $caller ne 'main';
    }

    # run directly
    if( ! defined caller(0) ) {
        carp sprintf "You cannot run %s directly!", __PACKAGE__;
    }
    # loaded from a module that was run directly
    elsif( ! defined caller(1) ) {
        my @caller = caller(0);
        my $method = do {
               if( _running_under_tester() ) { 'test' }
            elsif( _running_as_app()       ) { 'run'  }
            else                             { undef  }
        };

        if( $caller[0]->can( $method ) ) {
            $caller[0]->$method( @ARGV );
        }
        elsif( __PACKAGE__->can( $method ) ) { # faking inheritance
            __PACKAGE__->$method( $caller[0], @ARGV )
        }
        else {
            carp "There is no $method() method defined in $caller[0]\n";
        }
    }

    sub test {
        my( $class, $caller ) = @_;

        my @tests = do {
            if( $caller->can( '_get_tests' ) ) {
                $caller->_get_tests;
            }
            else {
                $class->_get_tests( $caller );
            }
        };

        require Test::More;
        Test::More::note( "Running $caller as a test" );
        foreach my $test ( @tests ) {
            Test::More::subtest( $test => sub {
                my $rc = eval { $caller->$test(); 1 };
                Test::More::diag( $@ ) unless defined $rc;
            } );
        }

        Test::More::done_testing();
    }

    sub _get_tests {
        my( $class, $caller ) = @_;
        print "_get_tests class is [$class]\n";
        no strict 'refs';
        my $stub = $caller . '::';
        my @tests =
            grep { defined &{"$stub$_"}    }
            grep { 0 == index $_, '_test_' }
            keys %{ "$stub" };

        @tests;
    }

    1;
I employ Modulino::Base with require so I can put it at the top of the file near the rest of the setup:

    package Modulino::TestWithBase;
    use utf8;
    use strict;
    use warnings;

    use v5.10;

    our $VERSION = '0.10_01';

    require Modulino::Base;

    ...
I check that it still works:
% CPANTEST=1 perl -Ilib lib/Modulino/TestWithBase.pm
_get_tests class is [Modulino::Base]
# Running Modulino::TestWithBase as a test
ok 1
ok 2
ok 3 # skip These tests don't work
ok 4 # skip These tests don't work
1..4
ok 1 - _test_run
1..1
Now that I’ve shown this, I will warn you about it. Many technical books show things the authors invented for the book, and this is no different. Most authors, on publishing the book, abandon the invention. Modulino::Demo, although on CPAN, is probably no different. It’s a simple concept that you can reinvent locally to get exactly what you need.
There are a variety of ways to make a Perl distribution, and we covered these in Chapter 12 of Intermediate Perl. If I start with a program that I already have, I like to use my scriptdist program, which is available on CPAN (and beware, because everyone seems to write this program for themselves at some point). It builds a distribution around the program based on templates I created in ~/.scriptdist, so I can make the distro any way that I like, which also means that you can make yours any way you like, not just my way.
At this point, I need the basic tests and a Makefile.PL to control the whole thing, just as I do with normal modules. Everything ends up in a directory named after the program but with .d appended to it. I typically don’t use that directory name for anything other than a temporary placeholder, since I immediately import everything into source control:
% scriptdist Japh.pm
Quiet is 0
Home directory is /Users/Amelia
RC directory is /Users/Amelia/.scriptdist
Processing Japh.pm...
Install Module::Extract::Use to detect prerequisites
Install Module::Extract::DeclaredMinimumPerl to detect minimum versions
Making directory Japh.pm.d...
Making directory Japh.pm.d/t...
RC directory is /Users/Amelia/.scriptdist
cwd is /Users/Amelia/Desktop
Checking for file [.gitignore]... Adding file [.gitignore]...
Checking for file [.releaserc]... Adding file [.releaserc]...
Checking for file [Changes]... Adding file [Changes]...
Checking for file [MANIFEST.SKIP]... Adding file [MANIFEST.SKIP]...
Checking for file [Makefile.PL]... Adding file [Makefile.PL]...
Checking for file [t/compile.t]... Adding file [t/compile.t]...
Checking for file [t/pod.t]... Adding file [t/pod.t]...
Checking for file [t/test_manifest]... Adding file [t/test_manifest]...
Adding [Japh.pm]...
Copying script...
Opening input [Japh.pm] for output [Japh.pm.d/Japh.pm]
Copied [Japh.pm] with 0 replacements
Creating MANIFEST...
Initialized empty Git repository in /Users/Amelia/Desktop/Japh.pm.d/.git/
[master (root-commit) a799d24] Initial commit by /Users/Amelia/bin/perls/scriptdist 0.22
10 files changed, 77 insertions(+)
create mode 100644 .gitignore
create mode 100644 .releaserc
create mode 100644 Changes
create mode 100644 Japh.pm
create mode 100644 MANIFEST
create mode 100644 MANIFEST.SKIP
create mode 100644 Makefile.PL
create mode 100644 t/compile.t
create mode 100644 t/pod.t
create mode 100644 t/test_manifest
------------------------------------------------------------------
Remember to push this directory to your source control system.
In fact, why not do that right now?
------------------------------------------------------------------
Inside the Makefile.PL I have to make only a few minor adjustments to the usual module setup so it handles things as a program. I put the name of the program in the anonymous array for EXE_FILES, and ExtUtils::MakeMaker will do the rest. When I run make install, the program ends up in the right place (also based on the PREFIX setting):

    WriteMakefile(
        'NAME'      => $script_name,
        'VERSION'   => '0.10',

        'EXE_FILES' => [ $script_name ],

        'PREREQ_PM' => {},

        'MAN1PODS'  => {
            $script_name => "\$(INST_MAN1DIR)/$script_name.1",
        },

        clean => { FILES => "*.bak $script_name-*" },
    );
An advantage of EXE_FILES is that ExtUtils::MakeMaker modifies the shebang line to point to the path of the perl binary that I used to run Makefile.PL. I don’t have to worry about the location of perl.
Once I have the basic distribution set up, I start off with some basic tests. I’ll spare you the details since you can look in scriptdist to see what it creates. The compile.t test simply ensures that everything at least compiles. If the program doesn’t compile, there’s no sense going on. The pod.t file checks the program documentation for Pod errors (see Chapter 14 for more details on Pod). These are the tests that clear up my most common mistakes (or, at least the ones I made most frequently before I started using these test files with all of my distributions).
Before I get started, I’ll check to ensure everything works correctly. Now that I’m treating my program as a module, I’ll test it every step of the way. The program won’t actually do anything until I run it as a program, though:
% cd Japh.pm.d
% perl Makefile.PL; make test
Checking if your kit is complete...
Looks good
Writing Makefile for Japh.pm
Writing MYMETA.yml and MYMETA.json
cp Japh.pm blib/lib/Japh.pm
cp Japh.pm blib/script/Japh.pm
/usr/bin/perl -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/script/Japh.pm
PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/compile.t .. ok
t/pod.t ...... ok
All tests successful.
Files=2, Tests=3, 0 wallclock secs ( 0.04 usr 0.02 sys + 0.13 cusr 0.02 csys = 0.21 CPU)
Result: PASS
Now that I have all of the infrastructure in place, I want to further develop the program. Since I’m treating it as a module, I want to add subroutines that I can call when I want it to do the work. These subroutines should be small and easy to test. I might even be able to reuse these subroutines by simply including my modulino in another program. It’s just a module, after all, so why shouldn’t other programs use it?
First, I move away from a hardcoded message. I’ll do this in baby steps to illustrate the development of the modulino, and the first thing I’ll do is move the actual message to its own subroutine. That hides the message to print behind an interface, and later I’ll change how I get the message without having to change the run subroutine. I’ll also be able to test message separately. At the same time, I’ll put the entire program in its own package, which I’ll call Japh. That helps compartmentalize anything I do when I want to test the modulino or use it in another program:

    #!/usr/bin/perl
    package Japh;

    run() unless caller();

    sub run {
        print message(), "\n";
    }

    sub message {
        'Just another Perl hacker, ';
    }
I can add another test file to the t/ directory now. My first test is simple. I check that I can use the modulino and that my new subroutine is there. I won’t get into testing the actual message yet, since I’m about to change that:

    # message.t
    use Test::More tests => 2;   # the plan has to match the two tests below

    use_ok( 'Japh' );

    ok( defined &Japh::message );
Now I want to be able to configure the message. At the moment it’s in English, but maybe I don’t always want that. How am I going to get the message in other languages? I could do all sorts of fancy internationalization things, but for simplicity I’ll create a file that contains the language, the template string for that language, and the locales for that language. Here’s a configuration file that maps the locales to a template string for that language:
    en_US "Just another %s hacker, "
    eu_ES "apenas otro hacker del %s, "
    fr_FR "juste un autre hacker de %s, "
    de_DE "gerade ein anderer %s Hacker, "
    it_IT "appena un altro hacker del %s, "
I add some bits to read the language file. I need a subroutine, get_template, to read the file and return the template for the right locale, and run has to pass that template to message. Since message now receives a template string, it uses sprintf to fill it in. I also add another subroutine, get_topic, to return the type of hacker I am. I won’t branch out into the various ways I can get the topic, although you can see how I’m moving the program away from doing (or saying) one thing to making it much more flexible:

    sub run {
        my $template = get_template();

        print message( $template ), "\n";
    }

    sub message {
        my $template = shift;

        return sprintf $template, get_topic();
    }

    sub get_topic { 'Perl' }

    sub get_template { ... shown later ... }
I can add some tests to ensure that my new subroutines still work and also check that the previous tests still work.
Being quite pleased with myself that my modulino now works in many languages and that the message is configurable, I’m disappointed to find out that I’ve just introduced a possible problem. Since the user can decide the format string, they can do anything that sprintf allows, and that’s quite a bit. I’m using user-defined data to run the program, so I should really turn on taint checking (see Chapter 2), but even better than that, I should get away from the problem rather than trying to put a bandage on it.
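A mild version of the problem is easy to show. In this sketch, the template is a hypothetical one that asks for more arguments than my code supplies; I collect the warnings in an array so I can look at them rather than letting them go to standard error:

```perl
use strict;
use warnings;

# collect warnings instead of printing them, so I can inspect them
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# a hypothetical template that a careless (or hostile) user put in the file
my $user_template = 'Just another %s hacker, %s';

# my code supplies only one argument, no matter what the template asks for
my $message = sprintf $user_template, 'Perl';

print "Message:  [$message]\n";
print "Warnings: ", scalar @warnings, "\n";
print "First:    $warnings[0]" if @warnings;
```

With warnings enabled, I’d expect perl to complain about the missing argument and fill the extra placeholder with nothing. That’s the harmless end of the range; a format with a huge field width, for instance, can make the program build a very large string, and my code has no say in the matter because the template controls what sprintf does.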
Instead of sprintf, I’ll use the Template module. My format strings will turn into templates:

    en_US "Just another [% topic %] hacker, "
    eu_ES "apenas otro hacker del [% topic %], "
    fr_FR "juste un autre hacker de [% topic %], "
    de_DE "gerade ein anderer [% topic %] Hacker, "
    it_IT "Solo un altro hacker del [% topic %], "
Inside my modulino, I’ll include the Template module and configure the Template parser so it doesn’t evaluate Perl code. I only need to change message, because nothing else needs to know how message does its work:

    sub message {
        my $template = shift;

        require Template;

        my $tt = Template->new(
            INCLUDE_PATH => '',
            INTERPOLATE  => 0,
            EVAL_PERL    => 0,
        );

        $tt->process( \$template, { topic => get_topic() }, \ my $cooked );

        return $cooked;
    }
Now I have a bit of work to do on the distribution side. My modulino now depends on Template, so I need to add that to the list of prerequisites. This way, CPAN (or CPANPLUS) will automatically detect the dependency and install it as it installs my modulino. That’s just another benefit of wrapping the program in a distribution:

    WriteMakefile(
        ...
        'PREREQ_PM' => {
            Template => '0',
        },
        ...
    );
What happens if there is no configuration file, though? My message subroutine should still do something, so I give it a default message from get_template, but I also issue a warning if I can’t open the file:

    use File::Spec::Functions qw(catfile);
    use Carp qw(carp);

    sub get_template {
        my $default = "Just another [% topic %] hacker, ";

        # use the locale I was given, then the environment, then US English
        my $locale = shift || $ENV{LANG} || 'en_US';

        my $file = catfile( qw( t config.txt) );
        my $fh;
        unless( open $fh, '<', $file ) {
            carp "Could not open '$file'";
            return $default;
        }

        while( <$fh> ) {
            chomp;
            my( $this_locale, $template ) = m/(\S+)\s+"(.*?)"/;
            next unless defined $this_locale;   # skip lines that don't match
            return $template if $this_locale eq $locale;
        }

        return $default;
    }
You know the drill by now: the new additions to the program require more tests. Again, I’ll leave that up to you.
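As a starting point, here is one way such a test might look. It’s a sketch under my own assumptions: I split the work into two hypothetical subroutines, parse_templates and choose_template, that take their input as arguments instead of reading t/config.txt, so the test needs no files at all. That is a different shape from the get_template in the modulino, but it shows how small subroutines make the parsing and fallback rules easy to check:

```perl
use strict;
use warnings;

use Test::More tests => 3;

# turn the text of a configuration file into a hash of locale => template
sub parse_templates {
    my( $text ) = @_;

    my %templates;
    foreach my $line ( split /\n/, $text ) {
        # a locale, some whitespace, then a double-quoted template
        next unless $line =~ m/\A\s*(\S+)\s+"(.*)"\s*\z/;
        $templates{$1} = $2;
    }

    return \%templates;
}

# pick the template for a locale, falling back to a default
sub choose_template {
    my( $templates, $locale, $default ) = @_;

    return $default unless defined $locale;
    return exists $templates->{$locale} ? $templates->{$locale} : $default;
}

my $default   = 'Just another [% topic %] hacker, ';
my $templates = parse_templates( <<'CONFIG' );
en_US "Just another [% topic %] hacker, "
fr_FR "juste un autre hacker de [% topic %], "
this line is not valid
CONFIG

is( choose_template( $templates, 'fr_FR', $default ),
    'juste un autre hacker de [% topic %], ', 'known locale' );
is( choose_template( $templates, 'blah blah', $default ),
    $default, 'unknown locale falls back to the default' );
is( choose_template( $templates, undef, $default ),
    $default, 'missing locale falls back to the default' );
```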
Finally, I need to test the whole thing as a program. I’ve tested the bits and pieces individually, but do they all work together? To find out, I use the Test::Output module to run an external command and capture the output. I’ll compare that with what I expect. How I do this for programs depends on what the particular program is supposed to actually do. To run my program inside the test file, I wrap it in a subroutine and use the value of $^X for the perl binary I should use (that will be the same perl binary that’s running the tests):

    #!/usr/bin/perl

    use File::Spec::Functions qw(catfile);

    use Test::More 'no_plan';
    use Test::Output;

    my $script = catfile( qw(blib script Japh.pm ) );

    sub run_program {
        print `$^X $script`;
    }

    { # test for US English
        local %ENV;
        $ENV{LANG} = 'en_US';

        stdout_is( \&run_program, "Just another Perl hacker, \n" );
    }

    { # test for Spanish
        local %ENV;
        $ENV{LANG} = 'eu_ES';

        stdout_is( \&run_program, "apenas otro hacker del Perl, \n" );
    }

    { # test with no LANG setting
        local %ENV;
        delete $ENV{LANG};

        stdout_is( \&run_program, "Just another Perl hacker, \n" );
    }

    { # test with nonsense LANG setting
        local %ENV;
        $ENV{LANG} = 'blah blah';

        stdout_is( \&run_program, "Just another Perl hacker, \n" );
    }
Once I create the program distribution, I can upload it to CPAN (or anywhere else I like) so other people can download it. To create the archive, I do the same thing I do for modules. First, I run make disttest, which creates a distribution, unwraps it in a new directory, and runs the tests. That ensures that the archive I give out has the necessary files and everything runs properly (well, most of the time):
% make disttest
After that, I create the archive in whichever format I like:
% make tardist
or:
% make zipdist
Finally, I upload it to PAUSE and announce it to the world. In real life, however, I use my release utility that comes with Module::Release and this (and much more) all happens in one step.
As a module living on CPAN, my modulino is a candidate for review by CPAN Testers, the loosely connected group of volunteers and automated computers that test just about every module. They don’t test programs, but our modulino doesn’t look like a program.
There is a little-known area of CPAN called “scripts” where people have uploaded stand-alone programs without full distribution support. Kurt Starsinic did some work on it to automatically index programs by category, and his solution simply looks in the program’s Pod documentation for a section called “SCRIPT CATEGORIES”. If I wanted, I could add my own categories to that section, and the programs archive should automatically index those on its next pass:
    =pod SCRIPT CATEGORIES

    CPAN/Administrative

    =cut
I can create programs that look like modules. The entire program (outside of third-party modules) exists in a single file. Although it runs just like any other program, I can develop and test it just like a module. I get all the benefits of both forms, including testability, dependency handling, and installation. Since my program is a module, I can easily reuse parts of it in other programs, too.
“How a Script Becomes a Module” originally appeared on PerlMonks.
I also wrote about this idea for The Perl Journal in “Scripts as Modules”. Although it’s the same idea, I chose a completely different topic: turning the RSS feed from The Perl Journal into HTML.
I created scriptdist for “Automating Distributions with scriptdist”.
Denis Kosykh wrote “Test Driven Development” for The Perl Review 1.0 (Fall 2004).
Check out some selected modulinos on CPAN: diagnostics (and its program name, splain), Net::MAC::Vendor, CPAN::Reporter::PrereqCheck, and App::Smbxfer.