Unlike for and while, the loop method does not evaluate a test condition to determine whether to continue looping. To break out of the loop, you have to explicitly use the break keyword, as you can see in the following examples:
3loops.rb
i=0
loop do
  puts(arr[i])
  i+=1
  if (i == arr.length) then
    break
  end
end

i=0    # reset the counter before the second loop
loop {
  puts(arr[i])
  i+=1
  if (i == arr.length) then
    break
  end
}
These use the loop method repeatedly to execute the block of code that follows. These blocks are just like the iterator blocks you used earlier with the each method. Once again, you have a choice of block delimiters, either curly brackets or do and end.
In each case, the code iterates through the array, arr, by incrementing a counter variable, i, and breaking out of the loop when the (i == arr.length) condition evaluates to true. Note that without a break, these would loop forever.
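As a self-contained sketch of the same pattern (the contents of arr and the visited array are assumed here purely for illustration), the following collects each item as it is visited so that the effect of break is easy to check. It also writes the test as a statement modifier, which is a common shorthand for the if...then...end form:

```ruby
# Assumed sample data; any non-empty array would do.
arr = ['one', 'two', 'three']

visited = []
i = 0
loop do
  visited << arr[i]          # process the current item
  i += 1
  break if i == arr.length   # without this, the loop would never stop
end

p( visited )   #=> ["one", "two", "three"]
```

Because the test runs only after the body, an empty array would never satisfy (i == arr.length); a condition such as (i >= arr.length) is a more defensive choice.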
Digging Deeper
Ruby provides a number of ways of iterating over items in structures such as arrays and ranges. Here we discover the inner details of the enumerations and comparisons.
The Enumerable Module
Hashes, Arrays, Ranges, and Sets all include a Ruby module called Enumerable. It provides these data structures with a number of useful methods such as include?, which returns true if a specific value is found; min, which returns the smallest value; max, which returns the largest; and collect, which creates a new structure made up of values returned from a block. In the following code, you can see some of these methods being used on a range and an array:
enum.rb
x = (1..5).collect{ |i| i }
p( x )                      #=> [1, 2, 3, 4, 5]

arr = [1,2,3,4,5]
y = arr.collect{ |i| i }
p( y )                      #=> [1, 2, 3, 4, 5]
z = arr.collect{ |i| i * i }
p( z )                      #=> [1, 4, 9, 16, 25]

p( arr.include?( 3 ) )      #=> true
p( arr.include?( 6 ) )      #=> false
p( arr.min )                #=> 1
p( arr.max )                #=> 5
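The same calls can be tried on a Range and a Set. This is a minimal sketch with assumed sample values; note that some classes also supply their own versions of certain methods, but the calls look the same:

```ruby
require 'set'

# Both classes report Enumerable among their included modules.
p( Range.include?( Enumerable ) )   #=> true
p( Set.include?( Enumerable ) )     #=> true

r = (1..5)
s = Set.new( [30, 10, 20] )         # assumed sample values

p( r.include?( 3 ) )                #=> true
p( r.max )                          #=> 5
p( s.min )                          #=> 10

# collect always returns an Array, whatever the receiver is.
doubled = s.collect{ |i| i * 2 }
p( doubled.sort )                   #=> [20, 40, 60]
```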
These same methods are available to other collection classes too, as long as those classes include Enumerable. Here’s an example using the Hash class:
enum2.rb
h = {'one'=>'for sorrow',
  'two'=>'for joy',
  'three'=>'for a girl',
  'four'=>'for a boy'}
y = h.collect{ |i| i }
p( y )
This code outputs the following:
[["one", "for sorrow"], ["two", "for joy"], ["three", "for a girl"], ["four", "for a boy"]]
Note that because of changes in the way hashes are stored, the order of the items displayed when this code runs differs in Ruby 1.8 and Ruby 1.9. Remember too that the items in a Hash are not indexed in sequential order, so the min and max methods return the items that are lowest and highest according to how their keys compare. Here the keys are strings, so the comparison is determined by the ASCII codes of the characters in each key.
p( h.min )     #=> ["four", "for a boy"]   (for the four-item hash shown above)
p( h.max )     #=> ["two", "for joy"]
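The reason the keys decide the outcome is that each hash item reaches min and max as a two-element array, and arrays are compared element by element. The following sketch uses a hypothetical hash named stock, chosen so that its keys and values sort differently:

```ruby
# Hypothetical data; the keys and values deliberately sort differently.
stock = { 'pear' => 3, 'apple' => 9, 'fig' => 1 }

# Arrays compare element by element, so the keys settle the result
# before the values are ever looked at.
p( ['apple', 9] <=> ['fig', 1] )   #=> -1
p( stock.min )                     #=> ["apple", 9]
p( stock.max )                     #=> ["pear", 3]

# To compare by value instead, apply min and max to the values.
p( stock.values.min )              #=> 1
p( stock.values.max )              #=> 9
```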
Custom Comparisons
What if you want min and max to return items based on some other criterion (say the length of a string)? The easiest way to do this would be to define the nature of the comparison inside a block. This is done in a similar manner to the sorting blocks I defined in Chapter 4. You may recall that you sorted a hash (here the variable h) by passing a block to the sort method like this:
h.sort{ |a,b| a.to_s <=> b.to_s }
The two parameters, a and b, represent two items from the hash that are compared using the <=> comparison method. You can similarly pass blocks to the max and min methods:
h.min{ |a,b| a[0].length <=> b[0].length }
h.max{ |a,b| a[0].length <=> b[0].length }
When a hash passes items into a block, it does so in the form of arrays, each of which contains a key-value pair. So, if a hash contains items like this:
{'one'=>'for sorrow', 'two'=>'for joy'}
then the two block arguments, a and b, would be initialized to two arrays:
a = ['one', 'for sorrow']
b = ['two', 'for joy']
This explains why the two blocks in which I have defined custom comparisons for the max and min methods specifically compare the first elements, at index 0, of the two block parameters:
a[0].length <=> b[0].length
This ensures that the comparisons are based on the keys in the hash. There is a potential pitfall here, however. As explained in the previous chapter, the default ordering of hashes is different in Ruby 1.8 and Ruby 1.9. This means that if you sort by the length of the key, as I did with my custom comparator earlier, and more than one key has the same length, the first match returned will be different in different versions of Ruby. For example, in my hash, the first two keys (“one” and “two”) have the same length. So when I use min with a comparison based on the key length, the result will be different in Ruby versions 1.8 and 1.9:
p( h.min{|a,b| a[0].length <=> b[0].length } )
Ruby 1.8 displays the following:
["two", "for joy"]
Ruby 1.9 displays the following:
["one", "for sorrow"]
This is another illustration of why it is always safer to make no assumptions about the ordering of the elements in a hash. Now let’s assume you want to compare the values rather than the keys. In the previous example, you could do this quite simply by changing the array indexes from 0 to 1:
enum3.rb
p( h.min{|a,b| a[1].length <=> b[1].length } )
p( h.max{|a,b| a[1].length <=> b[1].length } )
In the full hash used by enum3.rb (which, as the output shows, runs on to the key “seven”), the shortest value is “for joy” and the longest is “for a secret never to be told,” so the previous code displays the following:
["two", "for joy"]
["seven", "for a secret never to be told"]
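Where the comparison depends on a single derived value such as a length, the min_by and max_by methods of Enumerable offer a shorter alternative to writing out a <=> comparison. This sketch is not taken from the sample files; it assumes the four-item hash listed earlier and a Ruby version that provides min_by and max_by:

```ruby
h = {'one'=>'for sorrow', 'two'=>'for joy',
     'three'=>'for a girl', 'four'=>'for a boy'}

# The block returns the value to compare on, so no explicit <=> is needed.
# A hash passes each pair to the block as a key and a value.
shortest_value = h.min_by{ |key, value| value.length }
longest_key    = h.max_by{ |key, value| key.length }

p( shortest_value )   #=> ["two", "for joy"]
p( longest_key )      #=> ["three", "for a girl"]
```

Ties are still resolved by whichever item is encountered first, so the earlier warning about hash ordering applies here too.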
You could, of course, define other types of custom comparisons in your blocks. Let’s suppose, for example, that you want the strings “one,” “two,” “three,” and so on, to be evaluated in the order in which you would speak them. One way of doing this would be to create an ordered array of strings:
str_arr=['one','two','three','four','five','six','seven']
Now, if a hash, h, contains these strings as keys, a block can use str_arr as a reference in order to determine the minimum and maximum values. This also ensures that you obtain the same results no matter which version of Ruby is used:
h.min{|a,b| str_arr.index(a[0]) <=> str_arr.index(b[0])}
h.max{|a,b| str_arr.index(a[0]) <=> str_arr.index(b[0])}
["one", "for sorrow"]
["seven", "for a secret never to be told"]
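A runnable version of this idea might look like the following. It is a sketch that assumes the shorter four-item hash listed earlier, so the maximum here is “four” rather than “seven”; the variable names lowest and highest are also assumptions:

```ruby
# Spoken order of the number words, as in the text.
str_arr = ['one','two','three','four','five','six','seven']

# Shortened hash (an assumption; only these four pairs are used here).
h = {'one'=>'for sorrow', 'two'=>'for joy',
     'three'=>'for a girl', 'four'=>'for a boy'}

# Compare each key by its position in str_arr, not by its characters.
lowest  = h.min{ |a,b| str_arr.index(a[0]) <=> str_arr.index(b[0]) }
highest = h.max{ |a,b| str_arr.index(a[0]) <=> str_arr.index(b[0]) }

p( lowest )    #=> ["one", "for sorrow"]
p( highest )   #=> ["four", "for a boy"]
```

A key that is missing from str_arr would make index return nil, and the comparison would then fail with an error, so the reference array needs to cover every key in the hash.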
All the previous examples use the min and max methods of the Array and Hash classes. Remember that these methods are provided to those classes by the Enumerable module, which is “included” in the Array and Hash classes.
There may be occasions when it would be useful to be able to apply Enumerable methods such as max, min, and collect to classes that do not descend from existing classes (such as Array) that implement those methods. You can do that by including the Enumerable module in your class and then writing an iterator method called each like this:
include_enum1.rb
class MyCollection
  include Enumerable

  def initialize( someItems )
    @items = someItems
  end

  def each
    @items.each{ |i|
      yield( i )
    }
  end
end
Here you initialize a MyCollection object with an array, which will be stored in the instance variable, @items. When you call one of the methods provided by the Enumerable module (such as min, max, or collect), this will call the each method to obtain each piece of data one at a time. So, here the each method passes each value from the @items array into the block where that item is assigned to the block parameter i. The keyword yield is a special bit of Ruby magic that runs a block of code that was passed to the each method. You’ll look at this in much more depth when I discuss Ruby blocks in Chapter 10.
Now you can use the Enumerable methods with your MyCollection objects:
include_enum2.rb
things = MyCollection.new(['x','yz','defgh','ij','klmno'])

p( things.min )                         #=> "defgh"
p( things.max )                         #=> "yz"
p( things.collect{ |i| i.upcase } )     #=> ["X", "YZ", "DEFGH", "IJ", "KLMNO"]
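Writing each makes the rest of the Enumerable methods available as well, not just min, max, and collect. The sketch below repeats the class so that it runs on its own and tries a few more of them. The line that returns an Enumerator when no block is given is an assumed refinement, not part of the sample files:

```ruby
class MyCollection
  include Enumerable

  def initialize( someItems )
    @items = someItems
  end

  def each
    # Assumed refinement: hand back an Enumerator when no block is given.
    return to_enum( :each ) unless block_given?
    @items.each{ |i| yield( i ) }
  end
end

things = MyCollection.new(['x','yz','defgh','ij','klmno'])

p( things.sort )                          #=> ["defgh", "ij", "klmno", "x", "yz"]
p( things.select{ |i| i.length == 2 } )   #=> ["yz", "ij"]
p( things.include?( 'ij' ) )              #=> true
p( things.first )                         #=> "x"
p( things.each.next )                     #=> "x"
```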
You could similarly use your MyCollection class to process arrays such as the keys or values of hashes. Currently the min and max methods adopt the default behavior: they perform comparisons based on numerical values. This means that “xy” is considered to have a “higher” value than “abcd” on the basis of the characters’ ASCII values. If you want to perform some other type of comparison (say, by string length, so that “abcd” would be deemed to be higher than “xy”), you can override the min and max methods:
def min
  @items.to_a.min{|a,b| a.length <=> b.length }
end

def max
  @items.to_a.max{|a,b| a.length <=> b.length }
end
Here is the complete class definition with its versions of each, min, and max:
include_enum3.rb
class MyCollection
  include Enumerable

  def initialize( someItems )
    @items = someItems
  end

  def each
    @items.each{ |i| yield i }
  end

  def min
    @items.to_a.min{|a,b| a.length <=> b.length }
  end

  def max
    @items.to_a.max{|a,b| a.length <=> b.length }
  end
end
A MyCollection object can now be created, and its overridden methods can be used in this way:
things = MyCollection.new(['z','xy','defgh','ij','abc','klmnopqr'])
x = things.collect{ |i| i }
p( x )      #=> ["z", "xy", "defgh", "ij", "abc", "klmnopqr"]
y = things.max
p( y )      #=> "klmnopqr"
z = things.min
p( z )      #=> "z"
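Overriding min and max changes their meaning for every caller of the class. One possible alternative, sketched here with the hypothetical method names shortest and longest, is to leave the default methods alone and add separately named helpers built on min_by and max_by:

```ruby
class MyCollection
  include Enumerable

  def initialize( someItems )
    @items = someItems
  end

  def each
    @items.each{ |i| yield( i ) }
  end

  # Hypothetical helper names; min and max keep their default behavior.
  def shortest
    min_by{ |i| i.length }
  end

  def longest
    max_by{ |i| i.length }
  end
end

things = MyCollection.new(['z','xy','defgh','ij','abc','klmnopqr'])

p( things.min )        #=> "abc"
p( things.max )        #=> "z"
p( things.shortest )   #=> "z"
p( things.longest )    #=> "klmnopqr"
```

Which approach suits a given class is a design choice: overriding is more compact, while separate names keep the standard comparison available.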
each and yield
So what is really going on when a method from the Enumerable module uses the each method that you’ve written? It turns out that the Enumerable methods (min, max, collect, and so forth) pass to the each method a block of code. This block of code expects to receive one piece of data at a time (namely, each item from a collection of some sort). Your each method supplies it with that item in the form of a block parameter, such as the parameter i here:
def each
  @items.each{ |i|
    yield( i )
  }
end
As mentioned earlier, the keyword yield tells the code to run the block that was passed to the each method; that is, to run the code supplied by the Enumerable module’s min, max, or collect methods. This means that the code of those methods can be used with all kinds of different types of collections. All you have to do is include the Enumerable module in your class and write an each method that determines which values will be used by the Enumerable methods.
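To make that division of labor concrete, here is a toy stand-in for Enumerable. The module TinyEnumerable and its methods tiny_collect and tiny_min are hypothetical names invented for this sketch. It is not how Ruby implements Enumerable internally, but it shows how a method written once in a module can rely on nothing more than the each method of whatever class includes it:

```ruby
# TinyEnumerable and its method names are hypothetical stand-ins.
module TinyEnumerable
  # Pass a block to each; that block forwards every item to the
  # caller's own block and gathers the results.
  def tiny_collect
    result = []
    each{ |item| result << yield( item ) }
    result
  end

  # Track the smallest item seen so far, using <=> to compare.
  def tiny_min
    smallest = nil
    first = true
    each do |item|
      if first || (item <=> smallest) < 0
        smallest = item
        first = false
      end
    end
    smallest
  end
end

class MyCollection
  include TinyEnumerable

  def initialize( someItems )
    @items = someItems
  end

  def each
    @items.each{ |i| yield( i ) }
  end
end

things = MyCollection.new(['x','yz','defgh','ij','klmno'])
p( things.tiny_collect{ |i| i.upcase } )   #=> ["X", "YZ", "DEFGH", "IJ", "KLMNO"]
p( things.tiny_min )                       #=> "defgh"
```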