The Storable Module

In addition to Data::Dumper, there are other data marshalling modules available that you might wish to investigate, including the fast and efficient Storable.

The following code takes the same approach as the example we listed for Data::Dumper to show the basic store and retrieve cycle:

#!/usr/bin/perl -w
#
# ch02/marshal/storabletest: Create a Perl hash and store it externally. Then, 
#                            we reset the hash and reload the saved one.

use Storable qw( freeze thaw );

### Create some values in a hash
my $megalith = {
    'name' => 'Stonehenge',
    'mapref' => 'SU 123 400',
    'location' => 'Wiltshire',
};

### Print them out
print "Initial Values:   megalith = $megalith->{name}\n" .
      "                  mapref   = $megalith->{mapref}\n" .
      "                  location = $megalith->{location}\n\n";

### Store the values to a string
my $storedValues = freeze( $megalith );

### Reset the variables to rubbish values
$megalith = {
    'name' => 'Flibble Flabble',
    'mapref' => 'XX 000 000',
    'location' => 'Saturn',
};

### Print out the rubbish values
print "Rubbish Values:   megalith = $megalith->{name}\n" .
      "                  mapref   = $megalith->{mapref}\n" .
      "                  location = $megalith->{location}\n\n";

### Retrieve the values from the string
$megalith = thaw( $storedValues );

### Display the re-loaded values
print "Re-loaded Values: megalith = $megalith->{name}\n" .
      "                  mapref   = $megalith->{mapref}\n" .
      "                  location = $megalith->{location}\n\n";

exit;

This program generates the following output, which shows the data being frozen into a string and then restored intact from that string:

Initial Values:   megalith = Stonehenge
                  mapref   = SU 123 400
                  location = Wiltshire

Rubbish Values:   megalith = Flibble Flabble
                  mapref   = XX 000 000
                  location = Saturn

Re-loaded Values: megalith = Stonehenge
                  mapref   = SU 123 400
                  location = Wiltshire

Storable also has functions to write and read your data structures directly to and from disk files. It can also be used to write to a file cumulatively instead of writing all records in one atomic operation.
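As a minimal sketch of those file-based functions, the following uses Storable's nstore() and retrieve() for a single structure, and nstore_fd() and fd_retrieve() to write and read several records through one filehandle. The temporary directory, the file names, and the sample site data are assumptions made up for illustration, and testing eof before each read is just one way to detect the end of the file.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Storable qw( nstore retrieve nstore_fd fd_retrieve );

### Hypothetical scratch directory, removed when the program exits
my $dir = tempdir( CLEANUP => 1 );

### 1) One data structure per file
my $megalith = {
    'name'     => 'Stonehenge',
    'mapref'   => 'SU 123 400',
    'location' => 'Wiltshire',
};
my $file = "$dir/megalith.sto";

nstore( $megalith, $file )
    or die "Can't store to $file\n";
my $reloaded = retrieve( $file )
    or die "Can't retrieve from $file\n";
print "Re-loaded from file: $reloaded->{name}\n";

### 2) Several records written one at a time to the same file
my @sites = (
    [ 'Castlerigg', 'Cumbria'   ],
    [ 'Avebury',    'Wiltshire' ],
);
my $log = "$dir/megaliths.sto";

open my $out, '>', $log or die "Can't open $log: $!\n";
binmode $out;
nstore_fd( $_, $out ) for @sites;
close $out or die "Error closing $log: $!\n";

### Read the records back until the file is exhausted
open my $in, '<', $log or die "Can't open $log: $!\n";
binmode $in;
my @loaded;
push @loaded, fd_retrieve( $in ) until eof $in;
close $in;

print "Read back ", scalar @loaded, " records\n";
```

The "n" variants are used here for the same portability reason that nfreeze() is preferred over freeze() when the data may move between machines.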

So far, all this sounds very similar to Data::Dumper, so what’s the difference? In a word, speed. Storable is fast, very fast—both for saving data and for getting it back again. It achieves its speed partly by being implemented in C and hooked directly into the Perl internals, and partly by writing the data in its own very compact binary format.
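If you would like to measure the difference on your own system, the standard Benchmark module can compare the two approaches. The sketch below is only a rough harness: the sample data set and the two helper subroutine names are invented for illustration, and the timings it prints will depend on your machine, your Perl build, and the shape of your data.

```perl
use strict;
use warnings;
use Benchmark qw( cmpthese );
use Data::Dumper;
use Storable qw( nfreeze thaw );

### Hypothetical sample data: 1,000 small hash records
my $data = [
    map { +{ name => "Site $_", mapref => "XX $_ $_" } } 1 .. 1000
];

### Dump to Perl source text, then eval it back into a data structure
sub dumper_round_trip {
    local $Data::Dumper::Terse  = 1;   # emit the bare value, no "$VAR1 ="
    local $Data::Dumper::Indent = 0;   # no pretty-printing
    my $copy = eval Dumper( $_[0] );
    die $@ if $@;
    return $copy;
}

### Freeze to a binary string, then thaw it back
sub storable_round_trip {
    return thaw( nfreeze( $_[0] ) );
}

### Run each round trip for at least one CPU second and compare rates
cmpthese( -1, {
    'Data::Dumper' => sub { dumper_round_trip( $data ) },
    'Storable'     => sub { storable_round_trip( $data ) },
} );
```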

Here’s our update program reimplemented yet again, this time to use Storable:

#!/usr/bin/perl -w
#
# ch02/marshal/update_storable: Updates the given megalith data file
#                               for a given site. Uses Storable data
#                               and updates the map reference field.

use Storable qw( nfreeze thaw );

### Check the user has supplied an argument to scan for
###     1) The name of the file containing the data
###     2) The name of the site to search for
###     3) The new map reference
die "Usage: update_storable <data file> <site name> <new map reference>\n"
    unless @ARGV == 3;

my $megalithFile = $ARGV[0];
my $siteName     = $ARGV[1];
my $siteMapRef   = $ARGV[2];
my $tempFile     = "tmp.$$";

### Open the data file for reading, and die upon failure
open MEGADATA, "<$megalithFile"
    or die "Can't open $megalithFile: $!\n";

### Open the temporary megalith data file for writing
open TMPMEGADATA, ">$tempFile"
    or die "Can't open temporary file $tempFile: $!\n";

### Scan through all the entries for the desired site
while ( <MEGADATA> ) {

    ### Convert the ASCII encoded string back to binary. We work on a
    ### chomped copy so that $_ keeps its newline record delimiter
    chomp( my $hex = $_ );
    my $frozen = pack "H*", $hex;
    
    ### Thaw the frozen data structure
    my $fields = thaw( $frozen );
    
    ### Break up the record data into separate fields
    my ( $name, $location, $mapref, $type, $description ) = @$fields;
    
    ### Skip the record if the extracted site name field doesn't match
    next unless $siteName eq $name;
    
    ### We've found the record to update
    ### Create a new fields array with new map ref value
    $fields = [ $name, $location, $siteMapRef, $type, $description ];
    
    ### Freeze the data structure into a binary string
    $frozen = nfreeze( $fields );
    
    ### Encode the binary string as readable ASCII and append a newline
    $_ = unpack( "H*", $frozen ) . "\n";
    
}
continue {

    ### Write the record out to the temporary file
    print TMPMEGADATA $_
        or die "Error writing $tempFile: $!\n";
}

### Close the megalith input data file
close MEGADATA;

### Close the temporary megalith output data file
close TMPMEGADATA
    or die "Error closing $tempFile: $!\n";

### We now "commit" the changes by deleting the old file...
unlink $megalithFile
    or die "Can't delete old $megalithFile: $!\n";

### and renaming the new file to replace the old one.
rename $tempFile, $megalithFile
    or die "Can't rename '$tempFile' to '$megalithFile': $!\n";

exit 0;

Since the Storable format is binary, we couldn’t simply write it directly to our flat file. It would be possible for our record-delimiter character ("\n") to appear within the binary data, thus corrupting the file. We get around this by encoding the binary data as a string of pairs of hexadecimal digits.
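The encoding step can be isolated into a pair of small helpers to see why it is safe. This is a sketch only: encode_record() and decode_record() are hypothetical names, not part of Storable, and the sample record deliberately includes a field with an embedded newline.

```perl
use strict;
use warnings;
use Storable qw( nfreeze thaw );

### Freeze a data structure and return it as one line of hex digits
sub encode_record {
    my ( $fields ) = @_;
    return unpack( "H*", nfreeze( $fields ) ) . "\n";
}

### Turn one line of hex digits back into a data structure
sub decode_record {
    my ( $line ) = @_;
    chomp $line;
    return thaw( pack( "H*", $line ) );
}

### Sample record with a newline embedded in the third field
my $fields = [ 'Castlerigg', 'Cumbria', "Stone\nCircle" ];

my $frozen = nfreeze( $fields );
my $line   = encode_record( $fields );
my $copy   = decode_record( $line );

print "Raw frozen data contains a newline: ",
      ( index( $frozen, "\n" ) >= 0 ? "yes" : "no" ), "\n";
print "Encoded line is pure hex:           ",
      ( $line =~ /\A[0-9a-f]+\n\z/ ? "yes" : "no" ), "\n";
print "Third field survived the round trip: ",
      ( $copy->[2] eq $fields->[2] ? "yes" : "no" ), "\n";
```

The cost of this approach is size: every byte of frozen data becomes two characters in the file.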

You may have noticed that we’ve used nfreeze() instead of freeze(). By default, Storable writes numeric data in the fastest, simplest native format. The problem is that some computer systems store numbers in a different way from others. Using nfreeze() instead of freeze() ensures that numbers are written in a form that’s portable to all systems.
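From the reading side, nothing changes: thaw() accepts the output of either function. The short sketch below, using made-up sample values, freezes the same structure both ways and thaws each result. Run on a single machine it can only demonstrate the calling convention; the portability benefit of nfreeze() shows up when the frozen data is moved to a system with a different native number format.

```perl
use strict;
use warnings;
use Storable qw( freeze nfreeze thaw );

### Hypothetical sample data containing numeric values
my $data = { 'count' => 42, 'ratio' => 0.5 };

my $native   = freeze( $data );    # native format of this machine
my $portable = nfreeze( $data );   # portable (network order) format

### thaw() works out which format it has been given
my $fromNative   = thaw( $native );
my $fromPortable = thaw( $portable );

print "Native:   count = $fromNative->{count}, ",
      "ratio = $fromNative->{ratio}\n";
print "Portable: count = $fromPortable->{count}, ",
      "ratio = $fromPortable->{ratio}\n";
print "Frozen sizes: ", length( $native ), " bytes native, ",
      length( $portable ), " bytes portable\n";
```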

You may also be wondering what one of these records looks like. Well, here’s the record for the Castlerigg megalithic site:

0302000000050a0a436173746c6572696767580a0743756d62726961580a0a4e59203239312032
3336580a0c53746f6e6520436972636c65580aa34f6e65206f6620746865206c6f76656c696573
742073746f6e6520636972636c65732072656d61696e696e6720746f6461792e20546869732073
69746520697320636f6d707269736564206f66206c6172676520726f756e64656420626f756c64
657273207365742077697468696e2061206e61747572616c20616d706869746865617472652066
6f726d656420627920737572726f756e64696e672068696c6c732e5858

That’s all on one line in the data file; we’ve just split it up here to fit on the page. It doesn’t make for thrilling reading. It also doesn’t let us do the kind of quick pre-check shortcut that we used with Data::Dumper and the previous flat-file update examples. We could apply the pre-check after converting the hex string back to binary, but there’s no guarantee that strings appear literally in the Storable output. They happen to now, but there’s always a risk that this will change.

Although we’ve been talking about Storable in the context of flat files, this technique is also very useful for storing arbitrary chunks of Perl data into a relational database, or any other kind of database for that matter. Storable and Data::Dumper are great tools to carry in your mental toolkit.