Appendix A. Answers to Exercises

This appendix contains the answers to the exercises presented throughout the book.

They’re all referring to the same thing, except for the second one, ${$ginger[2]}[1]. That one is the same as $ginger[2][1], whose base is the array @ginger, rather than the scalar $ginger.

The curly braces of the anonymous hash constructor make a reference to a hash. That’s a scalar (as are all references), so it’s not suitable to use alone as the value of a hash. Perhaps this code’s author intended to assign to scalar variables (like $passenger_1 and $passenger_2) instead of to hashes. But you can fix the problem simply by changing the two pairs of curly braces to parentheses.

If you tried running this, Perl may have given you a helpful diagnostic message as a warning. If you didn’t get the warning, perhaps you didn’t have warnings turned on, either with the - w switch or with the use warnings pragma. Even if you don’t usually use Perl’s warnings, you should enable them during debugging. (How long would it take you to debug this without Perl’s warnings to help you? How long would it take to enable Perl’s warnings? ‘Nuff said.)

What if you got the warning message but couldn’t tell what it meant? That’s what the perldiag manpage is for. Warning texts need to be concise because they’re compiled into the perl binary (the program that runs your Perl code). But perldiag should list all the messages you should ever get from Perl, along with a longer explanation of what each one means, why it’s a problem, and how to fix it.

If you want to be ultimately lazy, you can add use diagnostics; at the beginning of your program, and any error message will look itself up in the documentation and display the entire detailed message. Don’t leave this in production code, however, unless you like burning a lot of CPU cycles every time your program starts, whether or not an error occurs.

You will be keeping count of how much data has been sent to all machines, so at the start, set the variable $all to a name that will stand in for all of them. It should be a name that will never be used for any real machine, of course. Storing it in a variable is convenient for writing the program and makes it easy to change later.

my $all = "**all machines**";

The input loop is nearly the same as given in the chapter, but it skips comment lines. Also, it keeps a second running total, filed under $all.

my %total_bytes;
while (<>) {
  next if /^#/;
  my ($source, $destination, $bytes) = split;
  $total_bytes{$source}{$destination} += $bytes;
  $total_bytes{$source}{$all} += $bytes;
}

Next, make a sorted list. This holds the names of the source machines in descending order of total transferred bytes. This list is used for the outer for loop. (Rather than using a temporary array, @sources, you might have put the sort directly into the parens of the for loop.)

my @sources =
  sort { $total_bytes{$b}{$all} <=> $total_bytes{$a}{$all} }
  keys %total_bytes;

for my $source (@sources) {
  my @destinations = 
    sort { $total_bytes{$source}{$b} <=> $total_bytes{$source}{$a} }
    keys %{ $total_bytes{$source} };
  print "$source: $total_bytes{$source}{$all} total bytes sent\n";
  for my $destination (@destinations) {
    next if $destination eq $all;
    print "  $source => $destination:",
      " $total_bytes{$source}{$destination} bytes\n";
  }
  print "\n";
}

Inside the loop, print out the total number of bytes sent from that source machine, then make sorted list of the destination files (similar to the list in @sources). As you step through that list, use next to skip over the dummy $all item. Because that item will be at the head of the sorted list, why wasn’t shift used to discard it, since that would avoid checking repeatedly for $all inside the inner loop? The answer is in this footnote.[107]

You can simplify this program, perhaps. The subexpression $total_bytes{$source} is used many times in the large output for loop (and twice in the input loop as well). That can be replaced by a simple scalar, initialized at the top of the loop:

for my $source (@sources) {
  my $tb = $total_bytes{$source};
  my @destinations = sort { $tb{$b} <=> $tb{$a} } keys %$tb;
  print "$source: $tb{$all} total bytes sent\n";
  for my $destination (@destinations) {
    next if $destination eq $all;
    print "  $source => $destination: $tb{$destination} bytes\n";
  }
  print "\n";
}

This makes the code shorter and (likely) a bit faster as well. Give yourself extra credit if you thought to do this. Also give yourself extra credit if you thought that it might be too confusing and decided not to make the change.

use Storable;

my $all = "**all machines**";
my $data_file = "total_bytes.data";

my %total_bytes;
if (-e $data_file) {
  my $data = retrieve $data_file;
  %total_bytes = %$data;
}

while (<>) {
  next if /^#/;
  my ($source, $destination, $bytes) = split;
  $total_bytes{$source}{$destination} += $bytes;
  $total_bytes{$source}{$all} += $bytes;
}

store \%total_bytes, $data_file;

### remainder of program is unchanged

This is similar to what you saw in Chapter 4, but now it uses Storable.

Near the top, put the filename into a variable. You can then retrieve the data but only if the data file already exists.

After reading the data, use Storable again to write it back out to the same disk file.

If you chose to write the hash’s data to a file the hard way, by writing your own code and your own file format, you’re working too hard. More to the point, unless you’re extraordinarily talented or spend way too long on this exercise, you almost certainly have bugs in your serialization routines, or at least flaws in your file format.

There should probably be some checks to ensure that Storable was successful. It will catch some errors (and die), but it will simply return undef for some. See the documentation for Storable. (Of course, if you checked the return values from store and retrieve, you should give yourself extra credit on the previous exercise.)

The program should save the old data file (if any) under a backup filename so that it’s easy to revert the latest additions. In fact, it could even keep several backups, such as the last week’s worth.

It might also be nice to be able to print the output without having any new input data. As it’s written, this can be done by giving an empty file (such as /dev/null) as the input. However there should be an easier way. The output functionality could be separated entirely from the updating, in fact.

sub data_for_path {
  my $path = shift;
  if (-f $path or -l $path) {
    return undef;
  }
  if (-d $path) {
    my %directory;
    opendir PATH, $path or die "Cannot opendir $path: $!";
    my @names = readdir PATH;
    closedir PATH;
    for my $name (@names) {
      next if $name eq "." or $name eq "..";
      $directory{$name} = data_for_path("$path/$name");
    }
    return \%directory;
  }
  warn "$path is neither a file nor a directory\n";
  return undef;
}

sub dump_data_for_path {
  my $path = shift;
  my $data = shift;
  my $prefix = shift || "";
  print "$prefix$path";
  if (not defined $data) { # plain file
    print "\n";
    return;
  }
  my %directory = %$data;
  if (%directory) {
    print ", with contents of:\n";
    for (sort keys %directory) {
      dump_data_for_path($_, $directory{$_}, "$prefix  ");
    }
  } else {
    print ", an empty directory\n";
  }
}
dump_data_for_path(".", data_for_path("."));

By adding a third (prefix) parameter to the dumping subroutine, you can ask it to indent its output. By default, the prefix is empty, of course.

When the subroutine calls itself, it adds two spaces to the end of the prefix. Why the end and not the beginning? Because it’s comprised of spaces, either end will work. By using trailing spaces, you can call the subroutine like this:

dump_data_for_path(".", data_for_path("."), ">  ");

This invocation quotes the entire output by prefixing each line with the given string. You can (in some hypothetical future version of this program) use such quoting to denote NFS-mounted directories, or other special items.

Here’s one way to do it. First, create the base class of LivingCreature with a single speak method:

use strict;
{ package LivingCreature;
  sub speak {
    my $class = shift;
    if (@_) {                # something to say
      print "a $class goes '@_'\n";
    } else {
      print "a $class goes ", $class->sound, "\n";
    }
  }
}

A person is a living creature, so define the derived class here:

{ package Person;
  our @ISA = qw(LivingCreature);
  sub sound { "hmmmm" }
}

The Animal class comes next, making appropriate sounds, but unable to talk (except to Dr. Doolittle):

{ package Animal;
  our @ISA = qw(LivingCreature);
  sub sound { die "all Animals should define a sound" }
  sub speak {
    my $class = shift;
    die "animals can't talk!" if @_;
    $class->SUPER::speak;
  }
}
{ package Cow;
  our @ISA = qw(Animal);
  sub sound { "moooo" }
}
{ package Horse;
  our @ISA = qw(Animal);
  sub sound { "neigh" }
}
{ package Sheep;
  our @ISA = qw(Animal);
  sub sound { "baaaah" }
}
{ package Mouse;
  our @ISA = qw(Animal);
  sub sound { "squeak" }
  sub speak {
    my $class = shift;
    $class->SUPER::speak;
    print "[but you can barely hear it!]\n";
  }
}

Finally, have the person speak:

Person->speak;                # just hmms
Person->speak("Hello, world!");

Notice that the main speak routine has now moved into the LivingCreature class, which means you don’t need to write it again to use it in Person. In Animal, though, you need to check that to ensure an Animal won’t try to speak before calling SUPER::speak.

Although it’s not the way the assignment was written, you can get a similar result if you choose to make Person a subclass of Animal. (In that case, LivingCreature would presumably be needed as a parent class for an eventual Plant class.) Of course, since an Animal can’t speak, how can a Person? The answer is that Person::speak would have to handle its parameters, if any, before or after (or instead of) calling SUPER::speak.

Which would be the better way to implement this? It all depends upon what classes you’ll need in the future and how you’ll use them. If you expect to add features to Animal that would be needed for Person, it makes sense for Person to inherit from Animal. If the two are nearly completely distinct, and nearly anything that a Person has in common with an Animal is common to all LivingCreatures, it’s probably better to avoid the extra inheritance step. The ability to design a suitable inheritance structure is a crucial talent for any OOP programmer.

In fact, you may find that after developing the code one way, you’ll want to “refactor” the code a different way. This is common with OOP. However, it’s very important to have enough testing in place to ensure that you don’t break things while you’re moving them around.

First, start the Animal package:

use strict;
{ package Animal;
  use Carp qw(croak);

And now for the constructor:

## constructors
sub named {
  ref(my $class = shift) and croak "class name needed";
  my $name = shift;
  my $self = { Name => $name, Color => $class->default_color };
  bless $self, $class;
}

Now, for virtual methods: the methods that should be overridden in a subclass. Perl doesn’t require virtual methods to be declared in the base class, but it’s nice as a documentation item.

## backstops (should be overridden)
sub default_color { "brown" }
sub sound { croak "subclass must define a sound" }

Next comes the methods that work with either a class or an instance:

## class/instance methods
sub speak {
  my $either = shift;
  print $either->name, " goes ", $either->sound, "\n";
}
sub name {
  my $either = shift;
  ref $either
    ? $either->{Name}
    : "an unnamed $either";
}
sub color {
  my $either = shift;
  ref $either
    ? $either->{Color}
    : $either->default_color;
}

Finally, the methods that work only for the particular instance:

## instance-only methods
sub set_name {
  ref(my $self = shift) or croak "instance variable needed";
  $self->{Name} = shift;
}
sub set_color {
  ref(my $self = shift) or croak "instance variable needed";
  $self->{Color} = shift;
}
}

Now that you have your abstract base class, define some concrete classes that can have instances:

{ package Horse;
  our @ISA = qw(Animal);
  sub sound { "neigh" }
}
{ package Sheep;
  our @ISA = qw(Animal);
  sub color { "white" }    # override the default color
  sub sound { "baaaah" }   # no Silence of the Lambs
}

Finally, a few lines of code to test your classes:

my $tv_horse = Horse->named("Mr. Ed");
$tv_horse->set_name("Mister Ed");
$tv_horse->set_color("grey");
print $tv_horse->name, " is ", $tv_horse->color, "\n";
print Sheep->name, " colored ", Sheep->color, " goes ", Sheep->sound, "\n";
use IO::File;
my %output_handles;
while (<>) {
  unless (/^(\S+):/) {
    warn "ignoring the line with missing name: $_";
    next;
  }
  my $name = lc $1;
  my $handle = $output_handles{$name} ||=
    IO::File->open(">$name.info") || die "Cannot create $name.info: $!";
  print $handle $_;
}

At the beginning of the while loop, use a pattern to extract the person’s name from the data line, issuing a warning if that’s not found.

Once you have the name, force it to lowercase so that an entry for “Maryann” will get filed in the same place as one for “MaryAnn.” This is also handy for naming the files, as the next statement shows.

The first time through the loop, the filehandle must be created. Let’s see how to do that. The || operator has a higher precedence than the assignment, so it is evaluated first; the program will die if the file can’t be created. The ||= operator assigns the filehandle to the hash, and the = operator passes it to $handle as well.

The next time you have the same name in $name, the ||= operator kicks in. Remember that $gilligan ||= $anything is effectively like $gilligan = $gilligan || $anything. If the variable on the left is a false value (such as undef), it’s replaced by the value on the right, but if it’s true (such as a filehandle), the value on the right won’t even be evaluated. Thus, since the hash already has a value for that person’s name, the hash’s value is used and assigned directly to $handle without having to (re-)create the file.

It wasn’t necessary to code the castaways’ names into this program, because they will be read in as data. This is good because any additional castaway won’t require having to rewrite the program. If someone’s name is accidentally misspelled, however, it puts some of their data into a new file under the wrong name.

These complete projects are far too complex to represent completely here. We hope readers of this book will share their solutions with each other on the Internet. Try a search on the Web for the phrase “Perl Alpaca book LPORM answers.” You can also use that phrase on the web page where you offer your solutions for downloading. To help anyone who finds your page, include links from your page to any other solutions that look interesting to you.

We search the Web from time to time, and so we may visit your site and look at your results. If we especially like your solution to the halting problem, we’ll be sure to let you know.



[107] Even though the dummy item will sort to the head of the sorted list, it won’t necessarily be the first item in the list. If a machine sent data to just one other, that destination machine’s total will be equal to the source machine’s total output, so that list could sort in either order.