Groovyness with Excel and XML

Today in class one of the students mentioned that they need to read data from an Excel spreadsheet supplied by one of their clients and transform the data into XML adhering to their own schema.

I’ve thought about similar problems for some time and looked at the various Java APIs for accessing Excel. I spent a fair amount of time working with the POI project at Apache, which is a poor substitute but at least worked.

On the XML side, the Java libraries have gotten better, but working with XML in Java is rarely fun. I know the Apache group has built a few helper projects to make it easier, but I haven’t used them that much. In class, students don’t really want to talk about other projects; they want to know what’s in the standard libraries.

In short, I know I could write the necessary code to take data out of Excel and write it out to XML, but it would be long and awkward. It certainly wouldn’t be much fun.

Now, though, I’m spending a lot of time with Groovy. I’m working my way through the book Groovy in Action (Manning), which has jumped to the top of my favorite technical books list. I’m still learning, but I knew there was a Groovy library for accessing Excel, and I knew Groovy had a “builder” for outputting XML. I just needed to see how to write the actual code. I set up a sample Excel spreadsheet with a few rows of data and went to work.

Here’s the result. It’s about 25 lines of code all told. In other words, it’s almost trivial. I’m amazed.

package com.kousenit;


import org.codehaus.groovy.scriptom.ActiveXProxy


def addresses = new File('addresses.xls').canonicalPath
def xls = new ActiveXProxy('Excel.Application')

// get the workbooks object
def workbooks = xls.Workbooks
def workbook = workbooks.Open(addresses)

// select the active sheet
def sheet = workbook.ActiveSheet

// get the XML builder ready
def builder = new groovy.xml.MarkupBuilder()
builder.people {

for (row in 2..1000) {
def ID = sheet.Range("A${row}").Value.value
if (!ID) break

// use the builder to write out each person
person (id: ID) {
name {
firstName sheet.Range("B${row}").Value.value
lastName sheet.Range("C${row}").Value.value
}

address {
street sheet.Range("D${row}").Value.value
city sheet.Range("E${row}").Value.value
state sheet.Range("F${row}").Value.value
zip sheet.Range("G${row}").Value.value
}
}
}
}

// close the workbook without asking for saving the file
workbook.Close(false, null, false)
// quits excel
xls.Quit()
xls.release()

I’d call that a successful experiment. It certainly was a happy one. I know I’ll do more in the future. I’d bet that somebody with more experience could show me how to condense that even further.

Groovy is just plain fun, and I haven’t felt that way about Java for a long, long time.

4 responses to “Groovyness with Excel and XML”

  1. Thanks for the tip!
    I’m happy to discover all these useful libraries for Groovy, they’re helping a lot trying to leave PHP.
    Hope I can contribute some of my own at some point (I’m too still working through the GIA book, which is relly great as you said).

    Thanks again!

  2. fantastic, this is what i’m looking for.

  3. I almost never comment, but I read a few of the comments on this page Groovyness with Excel and XML | Stuff I’ve learned recently…. I actually do have 2 questions for you if it’s okay.
    Could it be simply me or does it seem like a
    few of the comments come across like they are left by brain dead individuals?
    😛 And, if you are posting on other online sites, I
    would like to keep up with anything fresh you have to post.
    Could you make a list of every one of all your social sites like your linkedin profile, Facebook
    page or twitter feed?

  4. “for (row in 2..1000) {
    def ID = sheet.Range(“A${row}”).Value.value
    if (!ID) break”

    Is there no other way to determine if we have reached the end of used excel rows?

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.