Stuff I've learned recently…

I teach this stuff. I didn't say I could do it.

Nothing makes you want Groovy more than XML

I’m in Delaware this week teaching a course in Java Web Services using RAD7. The materials include a chapter on basic XML parsing using Java. An exercise at the end of the chapter presented the students with a trivial XML file, similar to:


<library>
  <book isbn="1932394842">
    <title>Groovy in Action</title>
    <author>Dierk Koenig</author>
  </book>
  <book isbn="1590597583">
    <title>Definitive Guide to Grails</title>
    <author>Graeme Rocher</author>
  </book>
  <book isbn="0978739299">
    <title>Groovy Recipes</title>
    <author>Scott Davis</author>
  </book>
</library>

(with different books, of course) and asked the students to find the book with a particular ISBN and print its title and author values.

I sighed and went to work, producing a solution roughly like this:


import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ParseLibrary {
    public static void main(String[] args) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        Document doc = null;
        try {
            DocumentBuilder builder = factory.newDocumentBuilder();
            doc = builder.parse("books.xml");
        } catch (Exception e) {
            e.printStackTrace();
            return;
        }
        NodeList books = doc.getElementsByTagName("book");
        for (int i = 0; i < books.getLength(); i++) {
            Element book = (Element) books.item(i);
            if (book.getAttribute("isbn").equals("1932394842")) {
                NodeList children = book.getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    if (child.getNodeType() == Node.ELEMENT_NODE) {
                        if (child.getNodeName().equals("title")) {
                            System.out.println("Title: "
                                + child.getFirstChild().getNodeValue());
                        } else if (child.getNodeName().equals("author")) {
                            System.out.println("Author: "
                                + child.getFirstChild().getNodeValue());
                        }
                    }
                }
            }
        }
    }
}

The materials didn’t supply a DTD, so I didn’t have any ID attributes to make it easier to get to the book I wanted. That meant I was reduced to continually using getElementsByTagName(String). I certainly didn’t want to traverse the tree, what with all those whitespace nodes containing the carriage-return/line-feed characters. So I found the book nodes, cast them to Element (because only Elements have attributes), found the book I wanted, got all of its children, found the title and author child elements, then grabbed their text values, remembering to go to the element’s first child before doing so.
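
To make those two complaints concrete, here is a minimal sketch (not part of the course materials) that counts what getChildNodes() actually returns for a book element and checks whether getElementById can find anything when no DTD declares an ID attribute. The class name WhitespaceNodes, the inline SAMPLE string, and the helper names are assumptions made up for this illustration; the sample stands in for books.xml.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class WhitespaceNodes {
    // Hypothetical inline sample standing in for books.xml (indented, no DTD).
    public static final String SAMPLE =
          "<library>\n"
        + "  <book isbn=\"1932394842\">\n"
        + "    <title>Groovy in Action</title>\n"
        + "    <author>Dierk Koenig</author>\n"
        + "  </book>\n"
        + "</library>\n";

    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    /** Returns {element children, text children} of the first book element. */
    public static int[] countBookChildren(String xml) throws Exception {
        Element book = (Element) parse(xml).getElementsByTagName("book").item(0);
        NodeList children = book.getChildNodes();
        int elements = 0;
        int texts = 0;
        for (int i = 0; i < children.getLength(); i++) {
            short type = children.item(i).getNodeType();
            if (type == Node.ELEMENT_NODE) {
                elements++;          // title and author
            } else if (type == Node.TEXT_NODE) {
                texts++;             // the indentation and line breaks between them
            }
        }
        return new int[] { elements, texts };
    }

    /** Without a DTD (or schema) declaring isbn as an ID, this lookup finds nothing. */
    public static boolean idLookupFindsBook(String xml) throws Exception {
        return parse(xml).getElementById("1932394842") != null;
    }

    public static void main(String[] args) throws Exception {
        int[] counts = countBookChildren(SAMPLE);
        System.out.println("Element children: " + counts[0]);
        System.out.println("Text children:    " + counts[1]);
        System.out.println("getElementById found the book: " + idLookupFindsBook(SAMPLE));
    }
}
```

The text children are why the solution above checks for Node.ELEMENT_NODE before looking at a child's name.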

What an unsightly mess. The only way to simplify it significantly would be to use a third-party library, which the students didn’t have, and it would still be pretty ugly.

One of the students said, “I kept waiting for you to say, ‘this is the hard way, now for the easy way,’ but you never did.”

I couldn’t resist replying, “well, if I had Groovy available, the whole program reduces to:


def library = new XmlSlurper().parse('books.xml')
def book = library.book.find { it.@isbn == '1932394842' }  // 'book' is singular, as noted in the comments
println "Title: ${book.title}\nAuthor: ${book.author}"

“and I could probably shorten that if I thought about it. How’s that for easy?”

On the bright side, as a result I may have sold another Groovy course. 🙂 For all of Groovy’s advantages over raw Java (and I keep finding more all the time), nothing sells it to Java developers like dealing with XML.

Ken Kousen

March 12, 2008

Groovy

Groovy, Java, XML

24 responses to “Nothing makes you want Groovy more than XML”

Brett Knights

March 12, 2008 at 7:51 pm

Well if they can’t find and install Jaxen it’s unlikely they’re going to find and install Groovy.

Also for the task at hand your code is way wordy. What’s below is shorter and could still benefit from a couple of methods to make the main body more readable. It’s not quite as efficient as yours but if you’re going to go to Groovy efficiency isn’t your primary driver anyway.

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ParseLibrary {
    // "throws" belongs on the method, not on the class declaration
    public static void main(String[] args) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse("books.xml");

        NodeList books = doc.getElementsByTagName("book");
        for (int i = 0; i < books.getLength(); i++) {
            Element book = (Element) books.item(i);
            if (book.getAttribute("isbn").equals("1932394842")) {
                // Search the book's descendants by tag name instead of walking its children
                NodeList titles = book.getElementsByTagName("title");
                if (titles != null) {
                    for (int t = 0; t < titles.getLength(); t++) {
                        System.out.println("Title: " + titles.item(t).getFirstChild().getNodeValue());
                    }
                }

                NodeList authors = book.getElementsByTagName("author");
                if (authors != null) {
                    for (int a = 0; a < authors.getLength(); a++) {
                        System.out.println("Author: " + authors.item(a).getFirstChild().getNodeValue());
                    }
                }

                break; // just one book per isbn
            }
        }
    }
}
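
As a rough illustration of the "couple of methods" idea mentioned above, the sketch below pulls the lookup and the text extraction into two small helpers. The class name ParseLibraryHelpers, the helper names findBook and childText, and the inline SAMPLE string are all hypothetical. It also swaps getFirstChild().getNodeValue() for getTextContent(), which is a different call from the one used in the code above, and it reports only the first matching title and author.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ParseLibraryHelpers {
    // Hypothetical inline sample standing in for books.xml.
    public static final String SAMPLE =
          "<library>"
        + "<book isbn=\"1932394842\"><title>Groovy in Action</title><author>Dierk Koenig</author></book>"
        + "<book isbn=\"0978739299\"><title>Groovy Recipes</title><author>Scott Davis</author></book>"
        + "</library>";

    /** Returns the first book element with the given isbn attribute, or null. */
    static Element findBook(Document doc, String isbn) {
        NodeList books = doc.getElementsByTagName("book");
        for (int i = 0; i < books.getLength(); i++) {
            Element book = (Element) books.item(i);
            if (isbn.equals(book.getAttribute("isbn"))) {
                return book;
            }
        }
        return null;
    }

    /** Returns the text of the first descendant with the given tag name, or null. */
    static String childText(Element parent, String tagName) {
        NodeList matches = parent.getElementsByTagName(tagName);
        return matches.getLength() == 0 ? null : matches.item(0).getTextContent();
    }

    /** Parses the XML and formats the title and author of the requested book. */
    public static String describe(String xml, String isbn) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element book = findBook(doc, isbn);
        if (book == null) {
            return "No book with ISBN " + isbn;
        }
        return "Title: " + childText(book, "title") + "\nAuthor: " + childText(book, "author");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(describe(SAMPLE, "1932394842"));
    }
}
```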

Brett Knights

March 12, 2008 at 7:54 pm

Or would be more readable if your comments let me format it properly.

Ken Kousen

March 12, 2008 at 8:04 pm

Hi Brett,

Yes, your code is somewhat shorter, but I’d still take the Groovy solution any time. And as for Jaxen, yes, that helps a lot, but Groovy not only makes XML easier, it makes everything easier.

One thing is indisputable, though. As much as I like the overall product, WordPress is a truly lousy way to display source code.

Thanks for commenting, though. 🙂

Jon Chase

March 14, 2008 at 9:51 am

I hear ya – I just had the same Groovy XML experience:

The original Java code (and a so-so Groovy impl): http://www.juliesoft.com/blog/jon/index.php/2008/03/09/groovy-is-coming/

The final Groovy code: http://www.juliesoft.com/blog/jon/index.php/2008/03/12/groovy-micro-benchmark-revisited-groovy-is-fast/

Jon Chase

March 14, 2008 at 9:52 am

And yes, WordPress’s formatting could be better (that’s why I just use screenshots for my code!!).

🙂

Ken Kousen

March 14, 2008 at 10:07 am

Jon, those are very interesting results. I’m glad you found a way to get the efficiency back. Personally, I worry a lot less about efficiency in a technology as new as Groovy, figuring that’ll come automatically with time. I’ve heard many reports of progress in that area already.

And Brett, you’re right, I should think about doing screen shots for my code. What a pain, though. My current system is to paste in the code, then go to code view and add tabs and sprinkle in <pre> and <code> tags as necessary. It’s a really lousy system.

Groovy on Grails » Blog Archive » Nothing Makes You Want Groovy More Than XML (Ken Kousen)

March 16, 2008 at 2:51 am

[…] – Ken Kousen […]

Jim

March 16, 2008 at 9:57 pm

Good article. BTW,

library.books.find { it.@isbn == '1932394842' }

should be

library.book.find { it.@isbn == '1932394842' } // 'book' should be singular

Ken Kousen

March 16, 2008 at 10:03 pm

Of course, you’re right. I really am going to have to start pasting in images of my source code rather than trying to just type it into WordPress.

Thanks for catching that.

Pavan Sibal

March 25, 2008 at 6:41 pm

I like Groovy too, but I also think XPath expressions can extract a particular node just as easily as Groovy expressions can.

name

September 1, 2008 at 1:56 am

Hello!,

Michael Mellinger

September 18, 2008 at 2:53 pm

This line:
def book = library.books.find { it.@isbn == '1932394842' }

Should be:
def book = library.book.find { it.@isbn == '1932394842' }

def library = new XmlSlurper().parse('books.xml')
def book = library.book.find { it.@isbn == '1932394842' }
println "Title: ${book.title}\nAuthor: ${book.author}"

Ken Kousen

September 18, 2008 at 2:56 pm

Thanks for the typo catch. Entering code in WordPress is really annoying. 🙂

Paul

September 28, 2008 at 10:34 pm

For displaying source code in WordPress, use the syntaxhighlighter plugin: http://wordpress.org/extend/plugins/syntaxhighlighter/

Just wrap your code in
```
code here
```
. Languages are defined on the plugin’s homepage (even though not all languages are implemented, you can still use ‘java’, and it does a good job with Groovy).

Have a look here to see it in action:
http://www.javathinking.com/?p=95

Paul

September 28, 2008 at 10:38 pm

Hey, it looks like you do have the plugin installed, because my comment is rendering code using it! It should say:

|sourcecode language='css'|code here|/sourcecode|

where | should really be [ and ]

Ken Kousen

September 28, 2008 at 10:45 pm

Paul, that is so sweet! I had no idea the plugin was installed here, at WordPress. I guess maybe it should have been obvious, but I didn’t see it documented anywhere.

Chalk that up as yet another thing that I wish I realized years ago. 🙂

Thanks!

José A. Romero L.

June 29, 2012 at 9:01 am

Well, I’m afraid you’ve succeeded at selling another Groovy course; not so at teaching them good Java programming. Maybe next time you could consider presenting your students with something cleaner? 😉 e.g.

import java.io.FileInputStream;

import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;

import org.xml.sax.InputSource;

public class ParseLibrary {
    public static void main( String[] args ) {
 
        XPathFactory xpathFactory = XPathFactory.newInstance();
        XPath xpath = xpathFactory.newXPath();
        XPathExpression xpathExpression = null;
        try {
            xpathExpression = xpath.compile( "/library/book[@isbn = '1932394842']/title" );
            InputSource is = new InputSource( new FileInputStream( "/home/ja/books.xml" ) );
            String title = xpathExpression.evaluate( is );
            System.out.println( "The title is: " + title );
        } catch( Exception e ) {
            e.printStackTrace();
        }
    }
}
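
The XPath version above prints only the title, while the exercise asked for the author as well. One possible way to get both, sketched here under assumptions, is to parse the file into a DOM once, select the book node with XPath, and then evaluate the relative paths title and author against that node. The class name XPathLibrary, the describe helper, and the inline SAMPLE string are hypothetical, and the ISBN is concatenated into the expression only because it is a trusted constant in this sketch.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class XPathLibrary {
    // Hypothetical inline sample standing in for books.xml.
    public static final String SAMPLE =
          "<library>"
        + "<book isbn=\"1932394842\"><title>Groovy in Action</title><author>Dierk Koenig</author></book>"
        + "<book isbn=\"1590597583\"><title>Definitive Guide to Grails</title><author>Graeme Rocher</author></book>"
        + "</library>";

    /** Returns the title and author of the book with the given ISBN. */
    public static String describe(String xml, String isbn) throws Exception {
        // Parse once so that several XPath expressions can run against the same tree.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();

        // Assumption: isbn is a trusted value; do not build expressions from user input like this.
        Node book = (Node) xpath.evaluate(
                "/library/book[@isbn='" + isbn + "']", doc, XPathConstants.NODE);
        if (book == null) {
            return "No book with ISBN " + isbn;
        }

        // Relative expressions are evaluated with the book node as the context.
        String title = xpath.evaluate("title", book);
        String author = xpath.evaluate("author", book);
        return "Title: " + title + "\nAuthor: " + author;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(describe(SAMPLE, "1932394842"));
    }
}
```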

Matthias Hryniszak

June 29, 2012 at 9:07 am

But still the Groovy-way beats the hell out of Java…

Groovy, The Gateway Drug | Should Be Simple

July 27, 2012 at 4:41 pm

[…] enough, you will realize Java feels stifling. “Dicing all this XML sure would be easier in Groovy,” you will think. You will notice and understand dynamic language zealots. The cool kids […]

Tomek Kaczanowski

February 19, 2013 at 7:34 am

love the way Groovy allows me to work with XML! Thanks for this post!

Mohammad K

June 4, 2013 at 4:36 pm

Thanks for the helpful post!
I was wondering if you could help me with my issue.
What would be the best way to programmatically remove all the nodes from the whole XML document, so the xml looks like this:

Dierk Koenig

Graeme Rocher

Scott Davis

I would like to do that using Groovy. Any help?

Morgan Conrad

July 23, 2014 at 2:39 pm

I am so sold by the “Groovy-way” that I wrote a smallish Java library to largely (somewhat?) mimic it. Please check it out at https://github.com/MorganConrad/xen. It’s still *very* preliminary, but if you check out test/GeocoderDemo.java it more or less matches the example from “Making Java Groovy”.

Ken Kousen

July 23, 2014 at 3:57 pm

Very impressive. 🙂 I’ll have to give it a try next time I have to deal with XML.

Krishna R

September 1, 2014 at 7:44 am

@Jose, your example is a clean way using 2 languages – Java & XPath… the Groovy example demonstrates how easy it is for Groovy programmers to parse XML w/o learning another language such as XPath…
