As briefly described in Chapter 13, a distribution contains a testing facility invoked from make test. This testing facility permits a module author to write and run tests during development and maintenance, and the ultimate module installer to verify that the module works in the new environment.
Why have tests during development? One emerging school of thought states that the tests should be written first, even before the module is created, as a reflection of the module’s specification. Of course, the initial test run against the unwritten module will show nearly complete failure. However, as functionality is added, proper functionality is verified immediately. (It’s also handy to invoke the tests frequently as you code to make sure you’re getting closer to the goal, not breaking more things.)
Certainly, errors may be found in the test suite. However, the defect rate for tests is usually far lower than the defect rate for complex module code; if a test fails, it’s usually a good indication that there’s more work to be done.
But even when Version 1.0 of the module is finally shipped, there’s no need to abandon the test suite. Unless you code the mythical “bug-free module,” there will be bug reports. Each bug report can (and should) be turned into a test.[92] While fixing the bug, the remaining tests prevent regression to a less functional version of the code—hence the name regression testing.
Then there’s always the 1.1 or 2.0 releases to think about. When you want to add functionality, start by adding tests.[93] Because the existing tests ensure your upward compatibility, you can be confident that your new release does everything the old release did, and then some.
Good tests also give small examples of what you meant in your documentation, in case your writing isn’t clear.[94] Good tests also give confidence to the installer that this code is portable enough to work on both your system and his system, including all stated and unstated dependencies.
Testing is an art. Dozens of how-to-test books have been written and read, and often ignored. Mostly, it’s important to remember everything you have ever done wrong while programming (or heard other people do), and then test that you didn’t do it again for this project.
Test things that should break (throw exceptions or return false values) as well as things that should work. Test the edges. Test the middle. Test one more or one less than the edge. Test things one at a time. Test many things at once. If something should throw an exception, make sure it didn’t also negatively affect the state of the world before it threw the exception. Pass extra parameters. Pass insufficient parameters. Mess up the capitalization on named parameters. Throw far too much data at it. Throw far too little. Test what happens for undef. And so on.
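As a quick sketch of that checklist, here is how a few of those probes might look against a hypothetical half() routine (both the routine and its documented rules are invented purely for illustration):

```perl
use strict;
use warnings;

# A hypothetical routine with a made-up contract: halve a defined
# number, die on anything else.
sub half {
    my $n = shift;
    die "half() needs a defined number\n" unless defined $n;
    return $n / 2;
}

# Test the middle, the edge, and the other side of the edge.
die "middle failed"   unless half(8)  == 4;
die "edge failed"     unless half(0)  == 0;
die "negative failed" unless half(-8) == -4;

# Test something that should break: undef must die, not quietly return.
my $result = eval { half(undef) };
die "undef should have been fatal" if defined $result;
```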
For example, suppose that you want to test Perl’s sqrt function, which calculates square roots. It’s obvious that you need to make sure it returns the right values when its parameter is 0, 1, 49, or 100. It’s nearly as obvious to see that sqrt(0.25) should come out to be 0.5. You should also ensure that multiplying the value for sqrt(7) by itself gives something between 6.99999 and 7.00001.[95] You should make sure that sqrt(-1) yields a fatal error and that sqrt(-100) does too. See what happens when you request sqrt(&test_sub( )), and &test_sub returns a string of "10000". What does sqrt(undef) do? How about sqrt( ) or sqrt(1,1)? Maybe you want to give your function a googol: sqrt('1' . '0' x 100). Because this function is documented to work on $_ by default, you should ensure that it does so. Even a simple function such as sqrt should get a couple of dozen tests; if your code does more complex tasks than sqrt does, expect it to need more tests, too. There are never too many tests.
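For instance, a few of those sqrt checks might be written as follows. This sketch uses the Test::More conventions covered later in this chapter, and the exact “sqrt of -1” text in the death message is an assumption about your Perl’s wording:

```perl
use strict;
use warnings;
use Test::More tests => 6;

is(sqrt(0),    0,   'sqrt(0) is 0');
is(sqrt(49),   7,   'sqrt(49) is 7');
is(sqrt(0.25), 0.5, 'sqrt(0.25) is 0.5');

# Multiplying the result by itself should land very close to the input.
my $root = sqrt(7);
cmp_ok(abs($root * $root - 7), '<', 0.00001, 'sqrt(7) squared is near 7');

# A negative argument should be a fatal error, caught here with eval.
my $neg = -1;
eval { sqrt($neg) };
like($@, qr/sqrt of -1/, 'sqrt(-1) dies');

# A numeric string should work just like the number it looks like.
is(sqrt("10000"), 100, 'sqrt("10000") is 100');
```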
If you write the code and not just
the tests, think about how to get every line of your code exercised
at least once for full code coverage. (Are you testing the
else
clause? Are you testing every
elsif
case?) If you aren’t
writing the code or aren’t sure, use the code
coverage facilities.[96]
Check out other test suites. The Perl distribution itself comes with thousands of tests, designed to verify that Perl compiles correctly on your machine in every possible way. Michael Schwern earned the title of “Perl Test Master” for getting the Perl core completely tested, and still constantly beats the drum for “test! test! test!” in the community.
In summary, please write tests. Let’s see how this is done.
Tests are usually invoked (either for the developer or the installer) using make test. The Makefile invokes the test harness, which eventually gets around to using the Test::Harness module to run the tests. Each test lives in a separate .t file in the t directory at the top level of the distribution. Each test is invoked separately, so an exit or die terminates only that test file, not the whole testing process.
The test file communicates with the test harness through simple messages on standard output. The three most important messages are the test count, a success message, and a failure message.
An individual test file consists of one or more tests. These tests are numbered as small integers starting with one. The first thing a test file must announce to the test harness (on STDOUT) is the expected test number range, as a string 1..N. For example, if there are 17 tests, the first line of output should be:
1..17
followed by a newline. The test harness uses the upper number here to verify that the test file hasn’t just terminated early. If the test file is testing optional things and has no testing to do for this particular invocation, the string 1..0 suffices.
After the header, individual successes and failures are indicated by messages of the form ok N and not ok N. For example, here’s a test of basic arithmetic. First, print the header:
print "1..4\n"; # the header
Now test that 1 plus 2 is 3:
if (1 + 2 == 3) {
  print "ok 1\n";       # first test is OK
} else {
  print "not ok 1\n";   # first test failed
}
You can also print the not prefix only if the test failed.[97] Don’t forget the space!
print "not " unless 2 * 4 == 8;
print "ok 2\n";
You could perhaps test that the results are close enough (important when dealing with floating-point values):
my $divide = 5 / 3;
print "not " if abs($divide - 1.666667) > 0.001; # too much error
print "ok 3\n";
Finally, you may want to deal with potential portability problems:
my $subtract = -3 + 3;
print +(($subtract eq "0" or $subtract eq "-0") ? "ok 4" : "not ok 4"), "\n";
As you can see, there are many styles for writing the tests. In ancient Perl development, you saw many examples of each style. Thanks to Michael Schwern and chromatic and the other Perl Testing Cabal members, you can now write these much more simply, using Test::Simple.
The Test::Simple module is included with the Perl distribution, starting in Perl 5.8.[98] Test::Simple automates the boring task of writing “ok 1”, “ok 2”, “ok 3”, and so on, in your program. Test::Simple exports one subroutine, called (appropriately) ok. It’s best illustrated by example. You can rewrite the earlier code as:
use Test::Simple tests => 4;
ok(1 + 2 == 3, '1 + 2 == 3');
ok(2 * 4 == 8, '2 * 4 == 8');
my $divide = 5 / 3;
ok(abs($divide - 1.666667) < 0.001, '5 / 3 == (approx) 1.666667');
my $subtract = -3 + 3;
ok(($subtract eq "0" or $subtract eq "-0"), '-3 + 3 == 0');
Ahh. So much simpler. The use not only pulls the module in but also defines the number of tests. This generates the 1..4 header. Each ok test evaluates its first argument. If the argument is true, it prints the proper ok message. If not, it prints the proper not ok message. For this particular example, the output looks like:[99]
1..4
ok 1 - 1 + 2 == 3
ok 2 - 2 * 4 == 8
ok 3 - 5 / 3 == (approx) 1.666667
ok 4 - -3 + 3 == 0
The ok N messages are followed by the labels given as the second parameters. This is great for identifying each test, especially because the numbers 1 through 4 don’t appear in the original test anymore. The test harness ignores this information, unless you invoke the tests with make test TEST_VERBOSE=1, in which case the information is displayed for each test.
What if a test fails? If you change the first test to 1 + 2 == 4, you get:
1..4
not ok 1 - 1 + 2 == 4
#     Failed test (1.t at line 4)
ok 2 - 2 * 4 == 8
ok 3 - 5 / 3 == (approx) 1.666667
ok 4 - -3 + 3 == 0
# Looks like you failed 1 tests of 4.
The ok 1 became not ok 1. But also notice the extra message indicating the failed test, including its file and line number. Messages preceded by a pound-sign comment marker are merely comments, and are (mostly) ignored by the test harness.
For many people, Test::Simple is simple enough to use for a wide range of tests. However, as your Perl hackery evolves, you’ll want to step up to the next level of Perl testing hackery as well, with Test::More.
Like Test::Simple, Test::More is included with the distribution starting with Perl 5.8. The Test::More module is upward-compatible with Test::Simple, so you can simply change the module name to start using it. For the example so far, you can use:
use Test::More tests => 4;
ok(1 + 2 == 3, '1 + 2 == 3');
ok(2 * 4 == 8, '2 * 4 == 8');
my $divide = 5 / 3;
ok(abs($divide - 1.666667) < 0.001, '5 / 3 == (approx) 1.666667');
my $subtract = -3 + 3;
ok(($subtract eq "0" or $subtract eq "-0"), '-3 + 3 == 0');
You get nearly the same output you got with Test::Simple, but there’s that nasty little 4 constant in the first line. That’s fine once you’re shipping the code, but if you’re testing, retesting, and adding more tests, it can be a bit painful to keep the number in sync with the data. You can change that to no_plan,[100] as in:
use Test::More "no_plan"; # during development
ok(1 + 2 == 3, '1 + 2 == 3');
ok(2 * 4 == 8, '2 * 4 == 8');
my $divide = 5 / 3;
ok(abs($divide - 1.666667) < 0.001, '5 / 3 == (approx) 1.666667');
my $subtract = -3 + 3;
ok(($subtract eq "0" or $subtract eq "-0"), '-3 + 3 == 0');
The output is now rearranged:
ok 1 - 1 + 2 == 3
ok 2 - 2 * 4 == 8
ok 3 - 5 / 3 == (approx) 1.666667
ok 4 - -3 + 3 == 0
1..4
Note that the number of tests is now at the end. The test harness knows that if it doesn’t see a header, it should expect a footer. If the number of tests disagrees, or there’s no footer (and no header), it’s a broken result. You can use this while developing, but be sure to put the final number of tests in the script before you ship it as real code.
But wait: there’s more (to Test::More).
Instead of a simple yes/no, you can ask if two values are the same:
use Test::More "no_plan";
is(1 + 2, 3, '1 + 2 is 3');
is(2 * 4, 8, '2 * 4 is 8');
Note that you’ve gotten rid of numeric equality and instead asked if “this is that.” On a successful test, this doesn’t give much advantage, but on a failed test, you get much more interesting output. The result of this:
use Test::More "no_plan";
is(1 + 2, 3, '1 + 2 is 3');
is(2 * 4, 6, '2 * 4 is 6');
is the interesting:
ok 1 - 1 + 2 is 3
not ok 2 - 2 * 4 is 6
#     Failed test (1.t at line 4)
#          got: '8'
#     expected: '6'
1..2
# Looks like you failed 1 tests of 2.
Of course, this is an error in the test, but note that the output told you what happened: you got an 8 but were expecting a 6.[101] This is far better than just “something went wrong” as before. There’s also a corresponding isnt( ) when you want to compare for inequality rather than equality.
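For example, isnt( ) passes when the two values differ:

```perl
use Test::More tests => 2;

# isnt() is the mirror image of is( ): the test passes when the
# two values are NOT string-equal.
my $subtract = 10 - 4;
isnt($subtract, 0, 'subtraction gave a nonzero result');
isnt(lc("Perl"), "Perl", 'lc() changed the string');
```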
What about that third test, where the value had to be less than a tolerance? Well, just use the cmp_ok routine instead:
use Test::More "no_plan";
my $divide = 5 / 3;
cmp_ok(abs($divide - 1.666667), '<', 0.001, '5 / 3 should be (approx) 1.666667');
If the comparison given in the second argument fails between the first and third arguments, you get a descriptive error message with both of the values and the comparison, rather than a simple pass/fail value as before.
How about that last test? You wanted to see if the result was a 0 or minus 0 (on the rare systems that give back a minus 0). You can do that with the like function:
use Test::More "no_plan";
my $subtract = -3 + 3;
like($subtract, qr/^-?0$/, '-3 + 3 == 0');
Here, you’ll take the string form of the first argument and attempt to match it against the second argument. The second argument is typically a regular expression object (created here with qr) but can also be a simple string, which is converted to a regular expression object. The string form can even be written as if it were (almost) a regular expression:
like($subtract, q/^-?0$/, '-3 + 3 == 0');
The advantage to using the string form is that it is portable back to older Perls.[102] If the match succeeds, it’s a good test. If not, the original string and the regex are reported along with the test failure. You can change like to unlike if you expect the match to fail instead.
For object-oriented modules, you might want to ensure that object creation has succeeded. For this, isa_ok and can_ok give good interface tests:
use Test::More "no_plan";
use Horse;
my $trigger = Horse->named("Trigger");
isa_ok($trigger, "Horse");
isa_ok($trigger, "Animal");
can_ok($trigger, $_) for qw(eat color);
This results in:
ok 1 - The object isa Horse
ok 2 - The object isa Animal
ok 3 - Horse->can('eat')
ok 4 - Horse->can('color')
1..4
Here you’re testing that it’s a horse, but also that it’s an animal, and that it can both eat and return a color.[103]
You could further test to ensure that each horse has a unique name:
use Test::More "no_plan";
use Horse;
my $trigger = Horse->named("Trigger");
isa_ok($trigger, "Horse");
my $tv_horse = Horse->named("Mr. Ed");
isa_ok($tv_horse, "Horse");
# Did making a second horse affect the name of the first horse?
is($trigger->name, "Trigger", "Trigger's name is correct");
is($tv_horse->name, "Mr. Ed", "Mr. Ed's name is correct");
is(Horse->name, "a generic Horse");
The output of this is:
ok 1 - The object isa Horse
ok 2 - The object isa Horse
ok 3 - Trigger's name is correct
ok 4 - Mr. Ed's name is correct
not ok 5
#     Failed test (1.t at line 13)
#          got: 'an unnamed Horse'
#     expected: 'a generic Horse'
1..5
# Looks like you failed 1 tests of 5.
Oops! Look at that. You wrote a generic Horse, but the string really is an unnamed Horse. That’s an error in the test, not in the module, so you should correct that test error and retry. Unless, of course, the module’s spec actually called for 'a generic Horse'.
Again, don’t be afraid to just write the tests and test the module. If you get either one wrong, the other will generally catch it.
Even the use can be tested by Test::More:
use Test::More "no_plan";
BEGIN { use_ok("Horse") }
my $trigger = Horse->named("Trigger");
isa_ok($trigger, "Horse");
# .. other tests as before ..
The difference between doing this as a test and doing it as a simple use is that the test won’t completely abort if the use fails, although many other tests are likely to fail as well. It’s also counted as one of the tests, so you get a “test succeeded” for free even if all it does is compile properly, to help pad your success numbers for the weekly status report.
The use is placed inside a BEGIN block so any exported subroutines are properly declared for the rest of the program, as recommended by the documentation. For most object-oriented modules, this won’t matter because they don’t export subroutines.
If you write tests directly from the specification before you’ve written the code, the tests are expected to fail. You can include some of your tests inside a TODO block to include them in the test count but denote them as unavailable at the same time. For example, suppose you haven’t taught your horses how to talk yet:
use Test::More 'no_plan';
use_ok("Horse");
my $tv_horse = Horse->named("Mr. Ed");
TODO: {
  local $TODO = "haven't taught Horses to talk yet";
  can_ok($tv_horse, "talk"); # he can talk!
}
is($tv_horse->name, "Mr. Ed", "I am Mr. Ed!");
Here, the test is inside a TODO block, setting a package $TODO variable with the reason why the items are unfinished:[104]
ok 1 - use Horse;
not ok 2 - Horse->can('talk') # TODO haven't taught Horses to talk yet
#     Failed (TODO) test (1.t at line 7)
#     Horse->can('talk') failed
ok 3 - I am Mr. Ed!
1..3
Note that the TODO test counts toward the total number of tests. Also note that the message about why the test is a TODO test is displayed as a comment. The comment has a special form, noted by the test harness, so you will see it during a make test run.
You can have multiple TODO tests in a given block, but only one reason per block, so it’s best to group things that are related but use different blocks for different reasons.
Initially, the h2xs program gives you a single testing file, t/1.t.[105] You can stick all your tests into this file, but it generally makes more sense to break the tests into logical groups.
The easiest way to add additional tests is to create t/2.t. That’s it—just bump the 1 to a 2. You don’t need to change anything in the Makefile.PL or in the test harness: the file is noticed and executed automatically.
You can keep adding files until you get to 9.t, but once you add 10.t, you might notice that it gets executed between 1.t and 2.t. Why? Because the tests are always executed in sorted order. This is a good thing because it lets you ensure that the most fundamental tests are executed before the more exotic tests, simply by controlling the names.
Many people choose to rename the files to reflect a specific ordering and purpose by using names like 01-core.t, 02-basic.t, 03-advanced.t, 04-saving.t, and so on. The first two digits control the testing order, while the rest of the name gives a hint about the general area of testing. Whatever plan you decide to use, stick with it, document it if necessary, and remember that the default order is controlled by the name.
One advantage to using the ok( ) functions (and friends) is that they don’t write to STDOUT directly, but to a filehandle secretly duplicated from STDOUT when your test script begins. If you don’t change STDOUT in your program, of course, this is a moot point. But let’s say you wanted to test a routine that writes something to STDOUT, such as making sure a horse eats properly:
use Test::More 'no_plan';
use_ok 'Horse';
isa_ok(my $trigger = Horse->named('Trigger'), 'Horse');

open STDOUT, ">test.out" or die;
$trigger->eat("hay");
close STDOUT;

open T, "test.out" or die;
my @contents = <T>;
close T;
is(join("", @contents), "Trigger eats hay.\n", "Trigger ate properly");
END { unlink "test.out" } # clean up after the horses
Note that just before you start testing the eat method, you (re-)open STDOUT to your temporary output file. The output from this method ends up in the test.out file. Bring the contents of that file in and give it to the is( ) function. Even though you’ve closed STDOUT, the is( ) function can still access the original STDOUT, and thus the test harness sees the proper ok or not ok messages.
If you create temporary files like this, please note that your current directory is the same as the test script’s directory (even if you’re running make test from the parent directory). Also pick fairly safe cross-platform names if you want people to be able to use and test your module portably.
The answers for all exercises can be found in Section A.12.
Write a module distribution, starting from the tests first.
Your goal is to create a module My::List::Util that exports two routines on request: sum and shuffle. The sum routine takes a list of values and returns the numeric sum. The shuffle routine takes a list of values and randomly shuffles the ordering, returning the list.
Start with sum. Write the tests, and then add the code. You’ll know you’re done when the tests pass. Now include tests for shuffle, and then add the implementation for shuffle.
Be sure to update the documentation and MANIFEST file as you go along.
If you can pair up with someone on this exercise, even better. One of you writes the test for sum and the implementation code for shuffle, and the other does the opposite. Swap the t/* files, and see if you can locate any errors!
[92] If you’re reporting a bug in someone else’s code, you can generally assume that sending them a test for the bug will be appreciated. A patch would be appreciated even more!
[93] And writing the documentation at the same time, made easier by Test::Inline, as you’ll see later.
[94] Many modules we’ve used from the CPAN were documented more by test examples than by the actual POD. Of course, any really good example should be repeated in your module’s POD documentation.
[95] Remember, floating-point numbers aren’t always exact; there’s usually a little roundoff. Feel free to write your tests to require more precision than this test implies but don’t require more precision than you can get on another machine!
[96] Basic code coverage tools such as Devel::Cover are found in the CPAN.
[97] On some platforms, this may fail unnecessarily. For maximum portability, print the entire string of ok N or not ok N in one print step.
[98] Older Perl versions back to 5.004_03 can install the same module from the CPAN.
[99] Don’t be misled when reading the mathematics of the output. The first number and the dash on each ok line are just labels; Perl isn’t telling you that 1 - 1 + 2 == 3!
[100] You can do this with Test::Simple as well.
[101] More precisely: you got an '8' but were expecting a '6'. Did you notice that these are strings? The is test checks for string equality. If you don’t want that, just build an ok test instead. Or try cmp_ok, coming up in a moment.
[102] The qr// form wasn’t introduced until Perl 5.005.
[103] Well, you’re testing to see that it can('eat') and can('color'). You haven’t checked whether it really can use those method calls to do what you want!
[104] TODO tests require Test::Harness Version 2.0 or later, which comes with Perl 5.8; in earlier releases, it has to be installed from the CPAN.
[105] As of Perl 5.8, that is. Earlier versions create a test.pl file, which is still run from a test harness during make test, but the output wasn’t captured in the same way.