Spaceships, Elvis, and Groovy inject

When I first started learning Groovy, I took to collect pretty quickly. The current trend of adopting “functional programming” practices works well in Groovy, though the names of the methods are somewhat surprising. For example, collect is like map, findAll is the equivalent of filter, and inject is the proposed replacement for reduce (or whatever the similar process is for your favorite language).

As a trivial example, in Groovy you can write:

(1..20).collect { it * 2 }      // double them all
       .findAll { it % 3 == 0 } // find the doubles divisible by 3
       .sum()                   // add them up

which is a functional style, even though Groovy is an object-oriented language.

Notice, however, that I didn’t use inject. That’s not an accident, of course. For years, while I “got” collect and findAll, I never found a usage for inject that couldn’t be done in an easier way. The inject version of the above example would be:

(1..20).collect { it * 2 }                   // double them all
       .findAll { it % 3 == 0 }              // find the doubles divisible by 3
       .inject(0) { acc, val -> acc + val }  // add them up

That seems like a lot of work compared to the sum method, especially when I always had trouble remembering exactly what the arguments to the closure meant.

That changed recently. One of the examples I use when teaching Groovy to Java developers is do some basic sorting. I like that example, because it shows not only how easy it is to replace anonymous inner classes with closures, but also because it shows how much the Groovy JDK simplifies coding.

As a preface to my example, consider making an ArrayList of strings:

def strings = 'this is a list of strings'.split()
assert strings.class == java.lang.String[]

The split method splits the string at spaces by default, and returns, sadly, a string array. What I want is a List, and converting an array into a List is a special blend of Java awkwardness and verbosity, though the code isn’t too bad once you’ve seen it.

The conversion is trivial in Groovy, however.

List strings = 'this is a list of strings'.split()
assert strings.class == java.util.ArrayList

Just replace def with the datatype you want, and Groovy will do its best to do the conversion for you. 🙂

To sort a list, Java has the various static sort methods in the java.util.Collections class. The sort method with no arguments does the natural, alphabetical (more properly, lexicographical, where there capital letters come before the lowercase letters) sort.

List strings = 'this is a list of strings'.split()
Collections.sort()  // natural sort (alphabetical)
assert strings == ['a', 'is', 'list', 'of', 'strings', 'this']

This sorts the strings in place (a destructive sort) and returns void, so to see the actual sort you have to print it.

How do you test this? I wrote an assert that hardwired the results, because I knew what they had to be. That’s hardly generalizable, however, and this is where inject comes in.

Have you ever looked at the definition of inject in the GroovyDocs? Here it is, from the class org.codehaus.groovy.runtime.DefaultGroovyMethods.

public static T inject(E[] self, U initialValue, @ClosureParams(value=FromString.class,options=”U,E”) Closure closure)

Iterates through the given array, passing in the initial value to the closure along with the first item. The result is passed back (injected) into the closure along with the second item. The new result is injected back into the closure along with the third item and so on until all elements of the array have been used. Also known as foldLeft in functional parlance.

Parameters:
self – an Object[]
initialValue – some initial value
closure – a closure

Returns:
the result of the last closure call

You would be forgiven for being seriously confused right now. I’ve been using Groovy since about 2007 and if I didn’t already know what inject did, I’d be seriously confused, too.

The DefaultGroovyMethods class contains lots of methods that are added to the library at runtime via Groovy metaprogramming. In this case, the inject method is added to collection (the first argument above). The second argument to inject is an initial value. An initial value to what, you say? The third argument to inject is a closure, and it takes two arguments, and the second argument to inject is the initial value of the first argument of the closure. The subsequent values to that argument are the result of the closure. The second argument to the closure is each element of the collection, in turn.

I expect that almost nobody actually read that last paragraph. Or, more likely, you started it and abandoned it somewhere in the middle. I can hardly blame you.

As usual, the indefatigable Mr. Haki comes to the rescue. Here is one of his examples:

(1..4).inject(0) { result, i ->
    println "$result + $i = ${result + i}"
    result + i
}

and the output is:

0 + 1 = 1
1 + 2 = 3
3 + 3 = 6
6 + 4 = 10

The value of result starts at the initial value (here, 0), and is assigned the result of each execution of the closure.

That’s the sort (no pun intended) of example I’d seen before, and because I always associated inject with accumulators, I never actually needed it. After all, the Groovy JDK adds a sum method already.

The key is to note that the value of “result” is actually whatever is returned from the closure (Mr. Haki’s second example illustrates this beautifully — seriously, go read his post). This actually makes it easy to use to test the sort.

Try this out for size:

List strings = 'this is a list of strings'.split()
Collections.sort(strings)
strings.inject('') { prev, curr ->
    println "prev: $prev, curr: $curr"
    assert prev <= curr
    curr  // value of 'prev' during next iteration
}

The result is:

prev: , curr: a
prev: a, curr: is
prev: is, curr: list
prev: list, curr: of
prev: of, curr: strings
prev: strings, curr: this

Since the closure returns the current value, that becomes the value of prev during the next iteration.

Actually, this can be simplified too. As of Groovy 1.8, there’s now an inject method that leaves out the initialization value. When you call it, the first two elements of the collection become the first two arguments of inject. In other words, now I can do this:

List strings = 'this is a list of strings'.split()
Collections.sort(strings)
strings.inject { prev, curr ->
    println "prev: $prev, curr: $curr"
    assert prev <= curr
    curr  // value of 'prev' during next iteration
}

The output now is:

prev: a, curr: is
prev: is, curr: list
prev: list, curr: of
prev: of, curr: strings
prev: strings, curr: this

Sweet. Now I have a real, live use case for inject. 🙂 I promised you more, however. The title of this post refers to Elvis and spaceships, too.

Assume you want to sort the strings by length. In Java, when you can’t modify the class to be sorted (String), you use the two-argument sort method in Collections. The second argument is of type java.util.Comparator, which gives rise to the dreaded anonymous inner class monster:

List strings = 'this is a list of strings'.split()
Collections.sort(strings, new Comparator<String>()) { // R: Holy anonymous inner class, Batman!
    int compare(String s1, String s2) {               // B: Yes, Robin, with generics and everything.
        s1.size() <=> s2.size()                       // R: Gosh, gee, and a spaceship!
    }
})
assert strings == ['a', 'is', 'of', 'this', 'list', 'strings']
assert strings*.size() == [1, 2, 2, 4, 4, 7]          // R: Holy spread-dot operator, too!
                                                      // B: Dude, seriously, get a grip.

The spaceship operator returns -1, 0, or 1 when the left side is less than, equal to, or greater than the right side. It’s like a comparator, except the values are fixed to -1, 0, and 1.

The nice thing about the spread-dot operator here is that it doesn’t care whether the resulting strings are also alphabetical or not. The fact that equals length strings were also sorted alphabetically is just a side effect of the algorithm.

One of the common idioms in the Groovy JDK is to take static methods in Java and make them instance methods in Groovy. Here, the sort method is a static method in the Collections class. The Groovy JDK makes it an instance method in Collection (singular).

List strings = 'this is a list of strings'.split()
strings.sort { s1, s2 -> s1.size() <=> s2.size() }  // R: Holy closure coercion, Batman!
                                                    // B: No, the instance method takes a closure.
                                                    //    Seriously, did you remember your ADHD meds today?
assert strings*.size() == [1, 2, 2, 4, 4, 7]

The sort method in Groovy is now an instance method, which takes a one- or two-argument closure. The two-argument variety is implemented like a traditional comparator, meaning you return negative, zero, or positive as usual.

Even better, the one-argument version of sort says, in effect, transform each element into a number and Groovy will sort the numbers and use that as a way to sort the collection.

List strings = 'this is a list of strings'.split()
strings.sort { it.size() }                     // R: Everything is awesome!
assert strings*.size() == [1, 2, 2, 4, 4, 7]   // B: Shut up kid, or you'll find yourself floating home.

Here’s the best part. What if you want to sort by length, and then sort equal lengths reverse alphabetically? (I’ll use reverse alpha because the length sort also did alphabetical by accident).

Now I can use Elvis and spaceships together:

List strings = 'this is a list of strings'.split()
strings.sort { s1, s2 -> 
    s1.size() <=> s2.size() ?: s2 <=> s1
}

The Elvis operator says if the result is true by the Groovy Truth, use it, else use a default. Here it means do the length comparison and if the result is non-zero, we’re good. Otherwise do the (reverse) alphabetical comparison.

R: So that’s Elvis being carried back to his home planet by two tandem spaceships, right? Meaning it’s the fat Elvis from the 70s and not the thin Elvis from the 50s?
B: BAM! POW! OOF!

Here, finally, is a test based on inject for that sort:

strings.inject { prev, curr ->
    assert prev.size() <= curr.size()
    if (prev.size() == curr.size()) {
        assert prev >= curr
    }
    curr
}

There you have it: Elvis, spaceships, and inject all in a dozen lines of code. Now you add that to your Groovy utility belt.

(B: POW! BAM! OOF!)

%d bloggers like this: