Adding a license to source files

A few years ago I remember seeing a blog post by a person interviewing potential new developers. He said that when the prospect featured Java prominently on their resume, he would make sure to give them a programming test that would be easy to solve in a different language but hard in Java, just to see how they handled it. He normally used some sort of file manipulation as an example, because Java makes that particularly challenging while scripting languages often make those problems simple.

About a week ago, someone contacted me about the source code from my book, Making Java Groovy. The source code is located at my GitHub repository, https://github.com/kousen/Making-Java-Groovy. The requester noted that I didn’t have any sort of license file on my code and wondered what the terms were.

Leaving aside the wonder that (1) somebody found the book code useful (ack! humblebrag alert!) and (2) he actually asked permission to use it, it occurred to me I really ought to have something in place for that eventuality. I asked my editor at Manning about it and she didn’t answer right away, so I interpreted that as freedom to do whatever I wanted. 🙂

A friend on a mailing list suggested that the Apache 2 license is appropriate if I don’t care too much how the code is used (and I don’t), so I decided to add that license to each source file. That brings me, at long last, to the original subject of this post: how do I add a license statement to the top of a large number of source files nested in many subdirectories?

I thought I would solve the problem with the eachFileRecurse method that Groovy adds to Java’s java.io.File class. I quickly realized, though, that there were directories I wanted to skip, and that lead me to the traverse method, which takes a Map of parameters.

Here’s the result:

import static groovy.io.FileType.*
import static groovy.io.FileVisitResult.*
String license = '''/* ===================================================
 * Copyright 2012 Kousen IT, Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 * ========================================================== */
'''
dir = '/Users/kousen/mjg'
new File(dir).traverse(
    type       : FILES, 
    nameFilter : ~/.*(java|groovy)$/,
    preDir     : { if (it.name == '.metadata') return SKIP_SUBTREE }) { file ->
    // only add license if not already there
    if (!file.text.contains(license)) {
        def source = file.text
        file.text = "$license$source"
    }
    assert file.text.contains(license)
}

I used static imports for the FileType and FileVisitResult classes. The FILES constant comes from FileType, and the SKIP_SUBTREE constant comes from FileVisitResult. The parameters I used return only files whose name ends in either ‘java’ or ‘groovy’ and aren’t in any directory tree including ‘.metadata’.

Ultimately everything is based on the getText and setText methods that the Groovy JDK adds to the java.io.File class. Both are called using the text property. The getText method returns all the existing source code, and the setText method acts as an alias for the write method, which automatically closes (and therefore flushes) the file when finished. I used a multiline string for the license and included the training carriage return, so writing the license followed by the source did the trick.

The documentation for these methods is, shall we say, a little thin. I therefore did what I normally do in these situations: I found the test case in the Groovy source distribution. The test in question is called FileTest and can be viewed in the Groovy GitHub repository here. The test cases showed how to use all of the methods, including traverse, so it was just a question of looking for the right example.

(Incidentally, one of the less publicized but really sweet features of GitHub is the code browser. Just find the project you want and dig into the directories until you find a file, and then GitHub provides syntax highlighting and everything. It’s a great, great feature, especially if you don’t want to clone the source of every project you care about onto your own local disk.)

Since I hadn’t known about the traverse method ahead of time, and I messed up the regular expressions for a while (sigh), solving the problem took longer than I expected. Still, it’s hard to beat a solution that takes less than a dozen lines. Hopefully someone else will find this helpful as well. And regarding the job interview situation described above, to paraphrase John Lennon in the rooftop concert at the end of Let It Be, on behalf of Groovy and myself, I hope I passed the audition.

%d bloggers like this: