In addition to Data::Dumper, there are other data marshalling modules available that you might wish to investigate, including the fast and efficient Storable. The following code takes the same approach as the example we listed for Data::Dumper to show the basic store and retrieve cycle:
#!/usr/bin/perl -w
#
# ch02/marshal/storabletest: Create a Perl hash and store it externally. Then,
#                            we reset the hash and reload the saved one.

use Storable qw( freeze thaw );

### Create some values in a hash
my $megalith = {
    'name'     => 'Stonehenge',
    'mapref'   => 'SU 123 400',
    'location' => 'Wiltshire',
};

### Print them out
print "Initial Values:   megalith = $megalith->{name}\n" .
      "                  mapref   = $megalith->{mapref}\n" .
      "                  location = $megalith->{location}\n\n";

### Store the values to a string
my $storedValues = freeze( $megalith );

### Reset the variables to rubbish values
$megalith = {
    'name'     => 'Flibble Flabble',
    'mapref'   => 'XX 000 000',
    'location' => 'Saturn',
};

### Print out the rubbish values
print "Rubbish Values:   megalith = $megalith->{name}\n" .
      "                  mapref   = $megalith->{mapref}\n" .
      "                  location = $megalith->{location}\n\n";

### Retrieve the values from the string
$megalith = thaw( $storedValues );

### Display the re-loaded values
print "Re-loaded Values: megalith = $megalith->{name}\n" .
      "                  mapref   = $megalith->{mapref}\n" .
      "                  location = $megalith->{location}\n\n";

exit;
This program generates the following output, which illustrates that we can store the data away and then retrieve it again intact:
Initial Values:   megalith = Stonehenge
                  mapref   = SU 123 400
                  location = Wiltshire

Rubbish Values:   megalith = Flibble Flabble
                  mapref   = XX 000 000
                  location = Saturn

Re-loaded Values: megalith = Stonehenge
                  mapref   = SU 123 400
                  location = Wiltshire
Storable also has functions to write and read your data structures directly to and from disk files. It can also be used to write to a file cumulatively, instead of writing all the records in one atomic operation.
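Here is a minimal sketch, not part of the original listing, of that file-based interface using Storable's store() and retrieve() functions; the file name megalith.dat and the sample hash are invented for illustration. For cumulative writing, Storable's store_fd() and fd_retrieve() work against an already open filehandle instead.

#!/usr/bin/perl -w
#
# Sketch only: store a hash to a disk file and read it back with Storable.
# The file name "megalith.dat" is made up for this example.

use Storable qw( store retrieve );

my $megalith = {
    'name'     => 'Stonehenge',
    'mapref'   => 'SU 123 400',
    'location' => 'Wiltshire',
};

### Write the data structure straight to a disk file
store( $megalith, 'megalith.dat' )
    or die "Can't store data in megalith.dat\n";

### Read it back into a brand new reference
my $reloaded = retrieve( 'megalith.dat' )
    or die "Can't retrieve data from megalith.dat\n";

print "Re-loaded: $reloaded->{name} at $reloaded->{mapref}\n";

exit;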
So far, all this sounds very similar to Data::Dumper, so what’s the difference? In a word, speed. Storable is fast, very fast, both for saving data and for getting it back again. It achieves its speed partly by being implemented in C and hooked directly into the Perl internals, and partly by writing the data in its own very compact binary format.
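To get a feel for that difference on your own machine, here is a rough benchmark sketch, not part of the original text: it uses the standard Benchmark module to compare how quickly Storable's freeze() and Data::Dumper's Dumper() serialize the same small hash. The iteration count and sample data are arbitrary, and the exact numbers will vary from system to system.

#!/usr/bin/perl -w
#
# Rough speed comparison sketch: serialize the same hash many times with
# Storable and with Data::Dumper. Iteration count chosen arbitrarily.

use Benchmark qw( cmpthese );
use Storable qw( freeze );
use Data::Dumper;

my $megalith = {
    'name'     => 'Stonehenge',
    'mapref'   => 'SU 123 400',
    'location' => 'Wiltshire',
};

cmpthese( 10_000, {
    'Storable'     => sub { my $frozen = freeze( $megalith ) },
    'Data::Dumper' => sub { my $dumped = Dumper( $megalith ) },
} );

exit;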
Here’s our update program reimplemented yet again, this time to use Storable:
#!/usr/bin/perl -w
#
# ch02/marshal/update_storable: Updates the given megalith data file
#                               for a given site. Uses Storable data
#                               and updates the map reference field.

use Storable qw( nfreeze thaw );

### Check the user has supplied an argument to scan for
###     1) The name of the file containing the data
###     2) The name of the site to search for
###     3) The new map reference
die "Usage: updatemegadata <data file> <site name> <new map reference>\n"
    unless @ARGV == 3;

my $megalithFile = $ARGV[0];
my $siteName     = $ARGV[1];
my $siteMapRef   = $ARGV[2];
my $tempFile     = "tmp.$$";

### Open the data file for reading, and die upon failure
open MEGADATA, "<$megalithFile"
    or die "Can't open $megalithFile: $!\n";

### Open the temporary megalith data file for writing
open TMPMEGADATA, ">$tempFile"
    or die "Can't open temporary file $tempFile: $!\n";

### Scan through all the entries for the desired site
while ( <MEGADATA> ) {

    ### Convert the ASCII encoded string back to binary
    ### (pack ignores the trailing newline record delimiter)
    my $frozen = pack "H*", $_;

    ### Thaw the frozen data structure
    my $fields = thaw( $frozen );

    ### Break up the record data into separate fields
    my ( $name, $location, $mapref, $type, $description ) = @$fields;

    ### Skip the record if the extracted site name field doesn't match
    next unless $siteName eq $name;

    ### We've found the record to update

    ### Create a new fields array with new map ref value
    $fields = [ $name, $location, $siteMapRef, $type, $description ];

    ### Freeze the data structure into a binary string
    $frozen = nfreeze( $fields );

    ### Encode the binary string as readable ASCII and append a newline
    $_ = unpack( "H*", $frozen ) . "\n";
}
continue {

    ### Write the record out to the temporary file
    print TMPMEGADATA $_
        or die "Error writing $tempFile: $!\n";
}

### Close the megalith input data file
close MEGADATA;

### Close the temporary megalith output data file
close TMPMEGADATA
    or die "Error closing $tempFile: $!\n";

### We now "commit" the changes by deleting the old file...
unlink $megalithFile
    or die "Can't delete old $megalithFile: $!\n";

### and renaming the new file to replace the old one.
rename $tempFile, $megalithFile
    or die "Can't rename '$tempFile' to '$megalithFile': $!\n";

exit 0;
Since the Storable format is binary, we couldn’t simply write it directly to our flat file. It would be possible for our record-delimiter character ("\n") to appear within the binary data, thus corrupting the file. We get around this by encoding the binary data as a string of pairs of hexadecimal digits.
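The encoding and decoding steps are easy to miss inside the larger program, so here they are in isolation. This is just a sketch of the same pack/unpack "H*" trick, with made-up field values:

#!/usr/bin/perl -w
#
# Sketch of the hex encode/decode round trip used in the update program.

use Storable qw( nfreeze thaw );

my $fields = [ 'Castlerigg', 'Cumbria', 'NY 291 236' ];

### Freeze to a binary string, then encode as one line of hex digit pairs
my $frozen = nfreeze( $fields );
my $record = unpack( "H*", $frozen ) . "\n";    # safe for a newline-delimited flat file

### Decode the hex digits back to binary and thaw the structure
### (as in the update program, pack ignores the trailing newline)
my $restored = thaw( pack "H*", $record );

print "Restored site: $restored->[0]\n";

exit;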
You may have noticed that we’ve used nfreeze() instead of freeze(). By default, Storable writes numeric data in the fastest, simplest native format. The problem is that some computer systems store numbers in a different way from others. Using nfreeze() instead of freeze() ensures that numbers are written in a form that’s portable to all systems.
You may also be wondering what one of these records looks like. Well, here’s the record for the Castlerigg megalithic site:
0302000000050a0a436173746c6572696767580a0743756d62726961580a0a4e59203239312032 3336580a0c53746f6e6520436972636c65580aa34f6e65206f6620746865206c6f76656c696573 742073746f6e6520636972636c65732072656d61696e696e6720746f6461792e20546869732073 69746520697320636f6d707269736564206f66206c6172676520726f756e64656420626f756c64 657273207365742077697468696e2061206e61747572616c20616d706869746865617472652066 6f726d656420627920737572726f756e64696e672068696c6c732e5858
That’s all on one line in the data file; we’ve just split it up here to fit on the page. It doesn’t make for thrilling reading. It also doesn’t let us do the kind of quick precheck shortcut that we used with Data::Dumper and the previous flat-file update examples. We could apply the precheck after converting the hex string back to binary, but there’s no guarantee that strings appear literally in the Storable output. They happen to now, but there’s always a risk that this will change.
Although we’ve been talking about Storable in the context of flat files, this technique is also very useful for storing arbitrary chunks of Perl data into a relational database, or any other kind of database for that matter.
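For instance, a frozen structure can be dropped into a database column through DBI and thawed again on the way out. The following is only a sketch: the DSN, the megaliths table, and its columns are invented here, the table is assumed to exist already, and the column holding the frozen string would need a binary-safe (BLOB-style) type.

#!/usr/bin/perl -w
#
# Sketch: store a frozen Perl structure in a database column via DBI.
# The DSN, table, and column names are invented for illustration.

use DBI;
use Storable qw( nfreeze thaw );

my $dbh = DBI->connect( "dbi:SQLite:dbname=megaliths.db", "", "",
                        { RaiseError => 1 } );

my $megalith = {
    'name'     => 'Stonehenge',
    'mapref'   => 'SU 123 400',
    'location' => 'Wiltshire',
};

### Freeze the structure and insert the binary string into a column
my $sth = $dbh->prepare(
    "INSERT INTO megaliths ( name, frozen_data ) VALUES ( ?, ? )" );
$sth->execute( $megalith->{name}, nfreeze( $megalith ) );

### Later, fetch the binary string back out and thaw it
my ( $frozen ) = $dbh->selectrow_array(
    "SELECT frozen_data FROM megaliths WHERE name = ?", undef, 'Stonehenge' );
my $reloaded = thaw( $frozen );

print "Re-loaded: $reloaded->{name} is in $reloaded->{location}\n";

$dbh->disconnect;

exit;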
Storable and Data::Dumper are great tools to carry in your mental toolkit.