Data types are the kinds of values Perl supports. Common data types include arbitrarily long strings (e.g., "hi, bob"), integers (e.g., 42), and floating point numbers (e.g., 3.14). Perl is a loosely typed language, which means that Perl works hard to let you forget about what kind of data you're dealing with. For the most part, you will be dealing with strings, which plays to Perl's strengths. To manipulate data, you use variables. Table 41-1 lists the most common variable types in Perl. For the full story on Perl data types, read the perldata manpage.
When you want to store single values, like any of those given in the previous paragraph, you will use a scalar variable. Scalars are labeled with a $ followed by a letter and any sequence of letters, numbers, and underscores. Scalars defined at the top of scripts are often used as constants. You may need to tweak some of them, particularly those containing filesystem paths, to get third-party scripts to run on your system.
Of course, values can be compared to each other or added together. Perl has relational operators that treat values as numbers and other relational operators that treat values as strings. Although Perl has different operators for numbers and strings, Perl makes scalar values do the right thing most of the time. For example, suppose you want to create a series of filenames like mail_num. The following code does this.
foreach my $num (1..10) { print "mail_" . $num . "\n"; }
Even though $num is a number, the string concatenation operator is able to use it as a string. Table 41-2 shows string operators, and Table 41-3 shows the numerical ones. See the perlop manpage for the full story.
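The way a scalar shifts between string and number can be sketched as follows. This is a minimal illustration, not taken from the original text; $count and the other variable names are hypothetical.

```perl
use strict;
use warnings;

# $count is a hypothetical variable used only for illustration.
my $count = "10";                 # starts life as a string

my $sum   = $count + 5;           # numeric operator: treated as the number 10
my $label = "mail_" . $count;     # string operator: treated as the text "10"

# == compares as numbers, eq compares as strings.
my $same_number = ("10" == 10.0)   ? "yes" : "no";
my $same_string = ("10" eq "10.0") ? "yes" : "no";

print "$sum $label $same_number $same_string\n";   # 15 mail_10 yes no
```

The operator, not the variable, decides how the value is interpreted, which is why Perl keeps separate string and numeric comparison operators.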
Table 41-2. String operators
Operator | Description |
---|---|
. | String concatenation |
eq | String equality test |
ne | String inequality test |
gt | True if left string comes after right in ASCII |
lt | True if left string comes before right in ASCII |
cmp | Return -1 if left operand ASCII-sorts before the right; 0 if right and left are equal; 1 if right sorts before left |
lc | Return an all-lowercase copy of the given string |
uc | Return an all-uppercase copy of the given string |
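As a rough illustration of the string operators in Table 41-2, the following sketch applies each of them to two sample strings. The values "apple" and "banana" and all variable names are assumptions made up for this example.

```perl
use strict;
use warnings;

my ($first, $second) = ("apple", "banana");    # hypothetical sample values

my $joined  = $first . "_" . $second;          # concatenation: "apple_banana"
my $equal   = ($first eq $second) ? 1 : 0;     # 0: the strings differ
my $differ  = ($first ne $second) ? 1 : 0;     # 1
my $before  = ($first lt $second) ? 1 : 0;     # 1: "apple" sorts before "banana"
my $after   = ($first gt $second) ? 1 : 0;     # 0
my $order   = $first cmp $second;              # -1
my $shout   = uc $first;                       # "APPLE"
my $whisper = lc "BANANA";                     # "banana"

print join(", ", $joined, $equal, $differ, $before, $after, $order, $shout, $whisper), "\n";
```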
Table 41-3. Numerical operators
Operator | Description |
---|---|
+ | Numerical addition |
- | Numerical subtraction |
* | Numerical multiplication |
/ | Numerical division |
++ | Autoincrement; adds one to a number |
== | Numeric equality test |
!= | Numeric inequality test |
< | Numeric less-than test |
> | Numeric greater-than test |
<=> | Return -1 if left is numerically less than right; 0 if left equals right; 1 if right is less than left |
<= | True if left operand is numerically less than or equal to right |
>= | True if left is numerically greater than or equal to right |
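The numerical operators in Table 41-3 behave much as they do in other languages. The sketch below is only an illustration; the values 7 and 2 and the variable names are hypothetical.

```perl
use strict;
use warnings;

my ($x, $y) = (7, 2);                    # hypothetical sample values

my $sum        = $x + $y;                # 9
my $difference = $x - $y;                # 5
my $product    = $x * $y;                # 14
my $quotient   = $x / $y;                # 3.5 (division is not truncated)

my $counter = $x;
$counter++;                              # autoincrement: now 8

my $order    = $x <=> $y;                # 1, because 7 is greater than 2
my $equal    = ($x == $y) ? 1 : 0;       # 0
my $at_least = ($x >= $y) ? 1 : 0;       # 1

print "$sum $difference $product $quotient $counter $order $equal $at_least\n";
```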
You may have noticed that some of the operators in the previous tables were described as returning true or false values. A true value in Perl is any value that isn't false, and there are only four kinds of false values in Perl:
values that are numerically zero (the number 0 in any form, and the string "0")
values that are empty strings
values that are undef
empty lists
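The list above can be checked with a short script. This is a sketch under the assumption that a small helper, here called truth (a hypothetical name), reports how Perl treats a value in a Boolean test. Note that the string "0.0" is true: it is neither empty nor exactly "0".

```perl
use strict;
use warnings;

# Hypothetical helper: report how a scalar behaves in a Boolean test.
sub truth { return $_[0] ? "true" : "false" }

my %result = (
    'number 0'     => truth(0),
    'string "0"'   => truth("0"),
    'empty string' => truth(""),
    'undef'        => truth(undef),
    'string "0.0"' => truth("0.0"),   # a non-empty string other than "0"
    'single space' => truth(" "),
);

# An array in a Boolean test is true only if it has elements.
my @empty = ();
$result{'empty array'} = @empty ? "true" : "false";

print "$_: $result{$_}\n" for sort keys %result;
```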
Like many other languages, Perl supports Boolean operators (see Table 41-4) that return true or false values. Typically, you encounter these in if statements like the following:
if ($temp < 30 && $is_rainy) { print "I'm telecommuting today\n"; }
Another common use of Boolean operators is to short-circuit two expressions. This is a way to prevent the right operand from executing unless the left operand returns a desired truth value. Consider the very ordinary case of opening a filehandle for reading. A common idiom to do this is:
open (FH, "filename") || die "Can't open file";
This short-circuit operation depends on the open function returning a true value if it can open the requested file. Only if it cannot is the right side of the || operator executed (die prints whatever message you provide and halts the program).
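A variation on this idiom is sketched below. It is an illustration rather than the original example: it uses a lexical filehandle, the three-argument form of open, and the $! variable to report why the open failed. The path is hypothetical and assumed not to exist, and eval is used only so that the script can show the message instead of halting.

```perl
use strict;
use warnings;

# Hypothetical path that is assumed not to exist on this system.
my $path = "/no/such/dir/filename";

# eval traps the die, so the failure can be reported without ending the script.
my $ok = eval {
    open(my $fh, '<', $path) || die "Can't open $path: $!\n";
    close $fh;
    1;
};

my $error = $ok ? "" : $@;
print "open failed: $error" unless $ok;
```

Because open returns false here, the right side of || runs and die supplies the message that ends up in $@.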
Looking at Table 41-4, you will notice that there appear to be redundant operators. The operators that are English words have a lower precedence than the symbolic ones. Precedence is simply the order in which Perl evaluates the operators in an expression. You are probably familiar with precedence rules from mathematics:
1 + 2 * 3 + 4 = 11
(1 + 2) * (3 + 4) = 21
Similarly, Perl's operators have precedence as well, as shown in Example 41-2.
Because || has a lower precedence than the lc operator, the first line of Example 41-2 is a Boolean test between two expressions. In the second line, the Boolean || operator is used to create a default argument to lc should $a be a false value.
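The difference between the two groupings can be seen in a small sketch. This is not the original Example 41-2; $name and the default string are hypothetical values chosen for illustration.

```perl
use strict;
use warnings;

my $name = "";    # hypothetical false value

# lc binds more tightly than ||, so this means (lc $name) || "DeFaUlT".
my $first = lc $name || "DeFaUlT";

# Parentheses let || pick the default before lc is applied to it.
my $second = lc($name || "DeFaUlT");

print "$first\n";     # DeFaUlT
print "$second\n";    # default
```

In the first statement the default string is returned untouched, because lc has already finished with $name by the time || is evaluated.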
Because Perl doesn't require parentheses around built-in operators and functions, you will often see code like:
open FH, "> " . "filename" or die "Can't open file";
print FH "[info]: disk write error\n";
Precedence ambiguities can be resolved by using parentheses where doubt occurs.
Although Perl has many special variables, the one you'll encounter most is $_. Many operators and functions, such as lc and print, will operate on $_ in the absence of an explicit parameter, as in Example 41-3.
In this example, every line read from standard input with the <> operator is available inside the while (Section 41.7) loop through $_. The print function, in the absence of an explicit argument, echoes the value of $_. Note that $_ can be assigned to (e.g., $_ = "Hello, Perl") just like any other scalar.
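The implicit use of $_ can be sketched without reading standard input. In this illustration, which is not the original Example 41-3, a foreach loop over a hypothetical array @lines sets $_ to each element in turn.

```perl
use strict;
use warnings;

my @lines = ("First LINE\n", "Second LINE\n");   # hypothetical input lines
my @lowered;

for (@lines) {              # no loop variable given, so each element is in $_
    push @lowered, lc;      # lc with no argument works on $_
    print;                  # print with no argument prints $_
}

print @lowered;             # the lowercase copies collected above
```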
When you want to collect more than one value into a variable, you have two ways to go in Perl. If you need an ordered set of values, you will choose to use a Perl array. These variables start with @ and are followed by a label that follows the same convention as a scalar. Two global arrays have already been mentioned: @INC and @ARGV. Since arrays hold multiple values, getting and setting values is a little different from scalars. Here's an example of creating an array with values, looking at one, and assigning a new value to that array index.
@things = ('phone', 'cat', 'hard drive');
print "The second element is: ", $things[1], "\n";
$things[1] = 'dog';
print "The second element is now: ", $things[1], "\n";
In the first line, the array @things is initialized with a list of three scalar values. Array indexes begin with zero, so the second element is accessed through the index value of 1. Arrays will grow as needed, so you could have added a fourth element like this:
$things[3] = 'DVD player';
Why is a $ used here and not @? Use @ only when referring to the whole array variable. Each element is a scalar whose name is $things[index]. This rule comes up again when dealing with hashes.
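The sigil rule can be sketched as follows, reusing the @things array from above. The counting and last-element lines are additions for illustration and are not part of the original example.

```perl
use strict;
use warnings;

my @things = ('phone', 'cat', 'hard drive');

$things[3] = 'DVD player';        # $ because one element is a scalar

my $count      = @things;         # the whole array in scalar context: its length
my $last_index = $#things;        # index of the last element
my $last       = $things[-1];     # negative indexes count from the end

print "$count elements, last index $last_index, last element $last\n";
```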
Typically you will want to iterate through all the values in an array, which is done with loops (Section 41.7). Although there are several looping constructs, the most common idiom to examine all the values in an array sequentially is shown in Example 41-4.
Example 41-4. Using foreach to loop through an array
print "Paths Perl checks for modules\n";
foreach my $el (@INC) {
    print $el, "\n";
}
Lists are a data type that is closely related to arrays. Lists are sequences of scalar values enclosed in parentheses that are not associated with an array variable. They are used to initialize a new array variable. Common array operators are listed in Table 41-5.
my @primes = (2,3,5,7,11,13);
my @empty_list = ( );
Table 41-5. Common array operators
Name | Description |
---|---|
pop | Return last element of array; remove that element from array |
push | Add the contents of a list to the end of array |
shift | Return the first element of array; shift all elements one index lower (removing the first element) |
unshift | Add a list to the front of array |
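The four array operators work in pairs: push and pop act on the end of an array, while unshift and shift act on the front. A minimal sketch, using a hypothetical array @queue:

```perl
use strict;
use warnings;

my @queue = ('b', 'c');          # hypothetical starting contents

push @queue, 'd', 'e';           # add to the end:    b c d e
unshift @queue, 'a';             # add to the front:  a b c d e

my $last  = pop @queue;          # removes and returns 'e'
my $first = shift @queue;        # removes and returns 'a'

print "removed $first and $last, left with @queue\n";   # ... left with b c d
```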
Associative arrays, or hashes, are a collection of scalar values that are arranged in key-value pairs. Instead of using integers to retrieve values in a hash, strings are used. Hashes begin with %. Example 41-5 shows a hash variable in action.
Example 41-5. Using hashes
my %birthdays = (
    'mom' => 'JUN 14',
    'archie' => 'JUN 12',
    'jay' => 'JUL 11',
);
print "Archie's birthday is: ", $birthdays{'archie'}, "\n";
$birthdays{'joe'} = 'DEC 12';
print "My birthday is: ", $birthdays{'joe'}, "\n";
Hashes are a funny kind of list. When initializing a hash with values, it is common to arrange the list in key-value pairs. The strange-looking => operator is often called a "fat comma" because these two lines of Perl do the same thing:
%birthdays = ( 'jay' => 'JUL 11' );
%birthdays = ( 'jay', 'JUL 11');
Use the fat comma when initializing hashes since it conveys the association between the values better. As an added bonus, the fat comma makes unquoted barewords on its left into quoted strings.
Example 41-6 shows some quoting styles for hash keys.
Unlike arrays, hashes use strings to index into the list. So to retrieve the birthday of "jay", put the key inside curly braces, like this:
print "Jay's birthday is: ", $birthdays{'jay'}, "\n";
Because Perl assumes that barewords used as a key when retrieving a hash value are autoquoted, you may omit quotes between the curly braces (e.g., $birthdays{jay}). Like arrays, hashes will grow as you need them to. Whenever you need to model a set or record the number of event occurrences, hashes are the variable to use.
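Both uses mentioned above, counting occurrences and modeling a set, can be sketched in a few lines. The event names and variable names here are hypothetical.

```perl
use strict;
use warnings;

my @events = qw(login logout login error login);   # hypothetical event log

# Counting: each key springs into existence the first time it is incremented.
my %count;
$count{$_}++ for @events;

# Set: the keys record which events were seen at all; the values are unused.
my %seen = map { $_ => 1 } @events;
my @unique = sort keys %seen;

print "login happened $count{login} times\n";
print "distinct events: @unique\n";
```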
As with arrays, you will often need to iterate over the set of key-value pairs in a hash. Two common techniques for doing this are shown in Example 41-7. Table 41-6 lists common Perl hash functions.
Example 41-7. Iterating over a hash
my %example = (foo => 1, bar => 2, baz => 3);
# each returns the next key-value pair and an empty list when none remain
while (my ($key, $value) = each %example) {
    print "$key has a value of $value\n";
}
foreach my $key (keys %example) {
    print "$key has a value of $example{$key}\n";
}
Table 41-6. Common Perl hash functions
Name | Description |
---|---|
delete | Delete the key-value pair from hash that is indexed on the given key |
each | Return the next key-value pair in hash; the pairs aren't usefully ordered |
exists | Return true if hash has the given key |
keys | Return the list of keys in the hash; not ordered |
values | Return the list of values in the hash; values will be in the same order as keys fetched by keys |
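A short sketch of exists, delete, keys, and values, reusing the %example hash from Example 41-7. The sorting is added only to make the output predictable, since hash order is not.

```perl
use strict;
use warnings;

my %example = (foo => 1, bar => 2, baz => 3);

my $had_bar = exists $example{bar} ? 1 : 0;    # 1: the key is present
my $removed = delete $example{bar};            # delete returns the removed value
my $has_bar = exists $example{bar} ? 1 : 0;    # 0: the key is gone

my @names   = sort keys %example;                    # remaining keys, sorted
my @numbers = sort { $a <=> $b } values %example;    # remaining values, sorted

print "removed $removed; keys left: @names; values left: @numbers\n";
```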
As odd as it may first seem, it is sometimes necessary to have variables for variables. A funny kind of scalar, a reference is a sort of IOU that promises where the original variable's data can be found. References are primarily used in three cases. First, because hashes and arrays store only scalar values, the only way to store one multivalued data type in another is to store a reference instead (see the perldsc manpage for more details). Second, when the size of a data structure makes a variable inefficient to pass into subroutines, a reference is passed instead. Third, because subroutines normally work on copies of their arguments, changes made to those copies don't reach the calling context. If you give a subroutine a reference as an argument, it can change that value in the caller. Consult the perlref and perlreftut manpages for more details on references.
Taking a reference to a variable is straightforward. Simply use the reference operator, \, to create a reference. For example:
$scalar_ref = \$bob;
$array_ref = \@things;
$hash_ref = \%grades;
You can even create references without variables:
$anonymous_array = [ 'Mojo Jo-Jo', 'Fuzzy Lumpkins', 'Him' ];
$anonymous_hash = {
    'pink' => 'Blossom',
    'green' => 'Buttercup',
    'blue' => 'Bubbles',
};
The square brackets return a reference to the list that they surround. The curly braces create a reference to a hash. Arrays and hashes created in this way are called anonymous because there is no named variable to which these references refer.
There are two ways of dereferencing references (that is, getting back the original values). The first way is to use {}. For instance:
print "Your name is: ", ${$scalar_ref};
foreach my $el ( @{$anonymous_array} ) {
    print "Villain: $el\n";
}
while (my ($key, $value) = each %{$anonymous_hash}) {
    print "$key is associated with $value\n";
}
The second way, using ->, is useful only for references to collection types.
print "$anonymous_hash->{'pink'} likes the color pink\n"; # 'Blossom'
print "The scariest villain of all is $anonymous_array->[2]\n"; # 'Him'
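Two of the uses of references described earlier, nesting one collection inside another and letting a subroutine change the caller's data, can be sketched together. The sample data and the add_score subroutine are hypothetical names invented for this illustration.

```perl
use strict;
use warnings;

# Hypothetical data: each hash value is a reference to an anonymous array.
my %grades = (
    bob   => [90, 85],
    alice => [88],
);

# The subroutine receives a reference, so push changes the caller's array.
sub add_score {
    my ($scores_ref, $score) = @_;
    push @{$scores_ref}, $score;
    return;
}

add_score($grades{bob}, 77);

my $bob_count   = @{ $grades{bob} };          # number of scores bob has now
my $bob_latest  = $grades{bob}->[-1];         # the score that was just added
my $hash_ref    = \%grades;
my $alice_first = $hash_ref->{alice}->[0];    # -> works through both levels

print "bob has $bob_count scores, latest $bob_latest; alice starts with $alice_first\n";
```

Passing $grades{bob} hands the subroutine the same array the hash points to, so no copy of the scores is made.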