Where’s My for Loop?

Clojure has no for loop and no direct mutable variables. (Clojure does provide indirect mutable references, but these must be explicitly called out in your code; see Chapter 6, State and Concurrency, for details.) So how do you write all that code you’re accustomed to writing with for loops?

Rather than create a hypothetical example, we decided to grab a piece of open source Java code (sort of) randomly, find a method with some for loops and variables, and port it to Clojure. We opened the Apache Commons project, which is very widely used. We selected the StringUtils class in Commons Lang, assuming that such a class would require little domain knowledge to understand. We then browsed for a method that had multiple for loops and local variables and found indexOfAny:

data/snippets/StringUtils.java
 // From Apache Commons Lang, http://commons.apache.org/lang/
 public static int indexOfAny(String str, char[] searchChars) {
   if (isEmpty(str) || ArrayUtils.isEmpty(searchChars)) {
     return -1;
   }
   for (int i = 0; i < str.length(); i++) {
     char ch = str.charAt(i);
     for (int j = 0; j < searchChars.length; j++) {
       if (searchChars[j] == ch) {
         return i;
       }
     }
   }
   return -1;
 }

indexOfAny walks str and reports the index of the first char that matches any char in searchChars, returning -1 if no match is found.

Here are some example results from the documentation for indexOfAny:

 StringUtils.indexOfAny(null, *) = -1
 StringUtils.indexOfAny("", *) = -1
 StringUtils.indexOfAny(*, null) = -1
 StringUtils.indexOfAny(*, []) = -1
 StringUtils.indexOfAny("zzabyycdxx", ['z','a']) = 0
 StringUtils.indexOfAny("zzabyycdxx", ['b','y']) = 3
 StringUtils.indexOfAny("aba", ['z']) = -1

indexOfAny contains two ifs, two fors, three possible points of return, and three mutable local variables, and the method is 14 lines long, as counted by David A. Wheeler’s SLOCCount.[18]

Now let’s build a Clojure index-of-any, step by step. If we just wanted to find the matches, we could use a Clojure filter. But we want to find the index of a match. So we create indexed, a function that takes a collection and returns an indexed collection:

src/examples/exploring.clj
 (defn indexed [coll] (map-indexed vector coll))

indexed returns a sequence of pairs of the form [idx elt]. Try indexing a string:

 (indexed "abcde")
 -> ([0 \a] [1 \b] [2 \c] [3 \d] [4 \e])
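As a small sketch of why plain filter is not enough here, the following compares filter with indexed on the same kind of input. The sample inputs are made up for illustration; indexed is repeated from the listing above so the block can be pasted into a REPL on its own.

```clojure
;; indexed, repeated from the text so this block is self-contained.
(defn indexed [coll] (map-indexed vector coll))

;; filter keeps the matching elements but discards their positions.
(filter #{\a \b} "abcdbbb")
;; -> (\a \b \b \b \b)

;; indexed pairs each element with its position. It accepts any
;; seqable collection, not only strings.
(indexed [:x :y :z])
;; -> ([0 :x] [1 :y] [2 :z])

;; A nil collection yields an empty sequence rather than an error.
(indexed nil)
;; -> ()
```

Because indexed is built on map-indexed, the result is a lazy sequence; the pairs are produced only as they are consumed.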

Next, we want to find the indices of all the characters in the string that match the search set.

Create an index-filter function that is similar to Clojure’s filter but that returns the indices instead of the matches themselves:

src/examples/exploring.clj
 (defn index-filter [pred coll]
   (when pred
     (for [[idx elt] (indexed coll) :when (pred elt)] idx)))

Clojure’s for is not a loop but a sequence comprehension (see Transforming Sequences). The index/element pairs of (indexed coll) are bound to the names idx and elt. The comprehension yields the value of idx for only those pairs where (pred elt) is logically true. The surrounding when makes index-filter return nil if no predicate is supplied.
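To see the comprehension in isolation, here is a minimal sketch that runs for over a hand-written vector of pairs. The names pairs, a-indices, and a-indices-longhand are hypothetical and exist only for this illustration.

```clojure
;; Index/element pairs written out by hand, so this block does not
;; depend on indexed. The name pairs is illustrative only.
(def pairs [[0 \a] [1 \b] [2 \c] [3 \a]])

;; Destructure each pair into idx and elt; yield idx only when elt is \a.
(def a-indices
  (for [[idx elt] pairs :when (= elt \a)] idx))

a-indices
;; -> (0 3)

;; The same result spelled out with filter and map, for comparison.
(def a-indices-longhand
  (map first (filter (fn [[_ elt]] (= elt \a)) pairs)))

a-indices-longhand
;; -> (0 3)
```

The :when clause plays the role of the inner if in the Java version, but it filters a sequence instead of steering control flow.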

Clojure sets are functions that test membership in the set. So you can pass a set of characters and a string to index-filter and get back the indices of all characters in the string that belong to the set. Try it with a few different strings and character sets:

 (index-filter #{\a \b} "abcdbbb")
 -> (0 1 4 5 6)

 (index-filter #{\a \b} "xyz")
 -> ()
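The following sketch shows what calling a set as a function actually returns, which is what lets a set stand in for pred. The name vowels is hypothetical and chosen only for this example.

```clojure
;; A set of characters; the name vowels is illustrative only.
(def vowels #{\a \e \i \o \u})

;; Calling a set with a member returns that member, which is truthy.
(vowels \e)
;; -> \e

;; Calling it with a non-member returns nil, which is falsey.
(vowels \x)
;; -> nil

;; So a set can be passed anywhere a predicate is expected.
(filter vowels "clojure")
;; -> (\o \u \e)

;; Caveat: a set that contains false (or nil) returns that falsey
;; element, so it does not work as a predicate for those values.
;; contains? answers the membership question directly.
(#{false} false)
;; -> false
(contains? #{false} false)
;; -> true
```

For sets of characters, keywords, numbers, and other truthy values, the caveat does not arise.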

At this point, we’ve accomplished more than the stated objective. index-filter returns the indices of all the matches, and we need only the first index. So, index-of-any simply takes the first result from index-filter:

src/examples/exploring.clj
 (defn index-of-any [pred coll]
   (first (index-filter pred coll)))

Test that index-of-any works correctly with a few different inputs:

 (index-of-any #{\z \a} "zzabyycdxx")
 -> 0
 (index-of-any #{\b \y} "zzabyycdxx")
 -> 3
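It is also worth checking the cases that the Java documentation lists as returning -1. The sketch below repeats the three definitions from the text so that it runs on its own, then tries the no-match, empty, and nil inputs. In each case the Clojure version returns nil rather than -1.

```clojure
;; Definitions repeated from the text so this block is self-contained.
(defn indexed [coll] (map-indexed vector coll))

(defn index-filter [pred coll]
  (when pred
    (for [[idx elt] (indexed coll) :when (pred elt)] idx)))

(defn index-of-any [pred coll]
  (first (index-filter pred coll)))

;; No character matches: the comprehension is empty, so first gives nil.
(index-of-any #{\z} "aba")
;; -> nil

;; Empty string and nil string: nothing to index.
(index-of-any #{\z} "")
;; -> nil
(index-of-any #{\z} nil)
;; -> nil

;; nil predicate: the when guard in index-filter returns nil.
(index-of-any nil "aba")
;; -> nil

;; Empty set: it is a valid predicate that never matches.
(index-of-any #{} "aba")
;; -> nil
```

None of these cases needed a dedicated branch in index-of-any; they fall out of how first and for treat empty and nil sequences.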

As the following table shows, the Clojure version is simpler than the imperative version by every metric.

Metric               LOC   Branches   Exits/Method   Variables
Imperative version   14    4          3              3
Functional version   6     1          1              0

What accounts for the difference?

Unnecessary complexity tends to snowball. For example, the special-case branches in the imperative indexOfAny use the magic number -1 to indicate a nonmatch. Should the magic number be a symbolic constant? Whatever you think the right answer is, the question itself disappears in the functional version, which simply returns nil when nothing matches.

While shorter and simpler, the functional index-of-any is also vastly more general. indexOfAny searches only strings and matches only against characters, whereas index-filter accepts any predicate and any sequence and returns every match, not just the first. For example, you could use code like we just wrote to find the third occurrence of “heads” in a series of coin flips:

 (nth
  (index-filter #{:h} [:t :t :h :t :h :t :t :t :h :h])
  2)
 -> 8
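As a further sketch of that generality, the same functions work with ordinary function predicates and, because for and map-indexed are lazy, with infinite sequences. The inputs here are invented for illustration, and the definitions are repeated so the block stands alone.

```clojure
;; Definitions repeated from the text so this block is self-contained.
(defn indexed [coll] (map-indexed vector coll))

(defn index-filter [pred coll]
  (when pred
    (for [[idx elt] (indexed coll) :when (pred elt)] idx)))

(defn index-of-any [pred coll]
  (first (index-filter pred coll)))

;; Any one-argument function can serve as the predicate.
(index-filter even? [1 3 4 7 10])
;; -> (2 4)

;; Index of the first number greater than 5.
(index-of-any #(> % 5) [1 3 4 7 10])
;; -> 3

;; The result is lazy, so searching an infinite sequence is fine as
;; long as only a finite part of the result is consumed.
(index-of-any #{:h} (cycle [:t :t :h]))
;; -> 2

(take 3 (index-filter #{:h} (cycle [:t :h])))
;; -> (1 3 5)
```

The Java method would need a new signature, and likely a new loop, for each of these variations.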

So index-of-any, written in a functional style without loops or variables, is simpler, less error-prone, and more general than the imperative indexOfAny. On larger units of code, these advantages become even more telling.