Categories
Google Maps Groovy

Trivial Geocoding with Google and Groovy

I’m building a simple Google Maps mashup that will show where I’ve given classes over the past few years. It’s not much, but it’s an easy demonstration of the technology and, even better, an easy way for me to learn the Google Maps API and to play with Groovy some more.

As for Google Maps, the documentation online isn’t bad, but I have a better alternative. Scott Davis wrote an excellent introductory book for the technology called Google Maps API, v2. I bought the eBook at Pragmatic Programmers for a whopping $8.50. I’m glad to have the eBook, but it’s a bit of a shame, too, because I can’t think of a way to get it autographed when I see Scott at the No Fluff, Just Stuff conference in September.

One of the key elements of any Google map is the latitude and longitude of a particular location. To determine that, you need a geocoder, which is an application that turns place names into lat, long pairs. There are many free ones available on the Internet, but since I’m using Google anyway, I figured I might as well take advantage of theirs.

The data is available at a URL with the appropriate query parameters set. That’s an example of a RESTful web service, of course. Here’s the query for the home office of Kousen IT, Inc.:

http://maps.google.com/maps/geo?q=Marlborough,+CT&output=csv&key=xyz…

I’m using the csv response type, which returns a string of the form:

200,4,41.63257,-72.46314

The first element is the response code (200 for OK, 602 for an unknown address, 610 for “you forgot to add in your key,” etc.). The second element is the accuracy of the match, and the last two elements are the latitude and longitude. There are also response types for XML, JSON, and others.

I’m going to put all my city, state locations in a database table, but just to test it out I wrote the following Groovy script:


def key = 'my Google Maps API key'
def base = 'http://maps.google.com/maps/geo'
def cities = ['Marlborough','Camarillo','Boston','Winston-Salem']
def states = ['CT','CA','MA','NC']
for (i in 0..3) {
    def city = cities[i]
    def state = states[i]
    // Encode the city so names with spaces survive the trip into the URL
    def url_string = base + '?q=' +
        URLEncoder.encode(city, "UTF-8") +
        ',+' + state + "&output=csv&key=${key}"
    def results = new URL(url_string).text
    // The last two comma-separated values are latitude and longitude
    println "${city}, ${state}: " +
        results.split(',')[-2,-1]*.toDouble()
}

I know there are better ways to store and access the data, but as I say I’m going to move to a database soon anyway. I used the URLEncoder class (from Java) to make sure that if I have any cities with spaces in the name (“Los Angeles”) it gets added to the URL in an encoded form (“Los+Angeles”).
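As a quick check of that encoding behavior, here is a small sketch that uses only java.net.URLEncoder. The city names are just sample inputs.

```groovy
// URLEncoder applies form encoding: a space becomes '+', while letters,
// digits, '-', '_', '.' and '*' pass through unchanged.
def encodedLa = URLEncoder.encode('Los Angeles', 'UTF-8')
def encodedWs = URLEncoder.encode('Winston-Salem', 'UTF-8')

println encodedLa
println encodedWs
```

The hyphen in “Winston-Salem” is left alone, so only names containing spaces or other special characters actually change.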

The other parts I like are:

  1. Using the split(String) method in the String class to tokenize the String. I’m still used to using the old StringTokenizer class, but the split method has been available since Java 1.4. I might as well get used to it.
  2. The beautiful array access from the end of the array [-2,-1] rather than from the beginning [2,3]. If they ever add any data to the response, I’m still fine as long as the lat and long are still the last two elements.
  3. Using the “spread-dot” operator to convert each element of the resulting list into a double.
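Those three idioms can be tried without touching the network. This is a minimal sketch that runs them against the sample response string shown earlier; the variable names are only for illustration.

```groovy
// The sample csv response from the geocoder
def results = '200,4,41.63257,-72.46314'

// 1. split(String) tokenizes the response into a String[]
def tokens = results.split(',')

// 2. Negative indexes count from the end, so [-2, -1] picks the last two tokens
def lastTwo = tokens[-2, -1]

// 3. Spread-dot calls toDouble() on every element and collects the results
def latLng = lastTwo*.toDouble()

println latLng
```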

If this code were going to be any longer I’d create a DAO class for the conversion. I might do that anyway, but this was so easy I couldn’t resist just writing it directly. The results were:

Marlborough, CT: [41.63257, -72.46314]
Camarillo, CA: [34.22291, -119.05074]
Boston, MA: [42.35864, -71.05665]
Winston-Salem, NC: [36.0996, -80.24105]

I still can’t get over how much easier it is to do anything in Groovy compared to raw Java. Programming is fun again. 🙂

Categories
Baseball Groovy

Groovier Box Scores

I made a couple more fixes to my box scores script to make it a bit groovier. First is a trivial one, but it’s much more in the Groovy idiom than in Java.

I replaced

def cal = Calendar.getInstance()

with

def cal = Calendar.instance

Groovy calls the getter automatically when you use property syntax, so Calendar.instance invokes Calendar.getInstance(). In your own Groovy classes, a field declared without an access modifier becomes a property: a private field with public getter and setter methods generated for it, which is much more intuitive than Java’s “package-private” default. Of course, methods are public by default.
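To see property syntax going through getters, here is a minimal sketch. The Team class is made up purely for illustration.

```groovy
class Team {
    String name                      // a property: private field plus generated getName()/setName()
    String getCity() { 'Boston' }    // a getter with no backing field at all
}

def team = new Team(name: 'Red Sox')

// Property syntax and the explicit getter call return the same thing
assert team.name == team.getName()

// Property syntax works for any getter, even without a field behind it
assert team.city == 'Boston'

// The same rule applies to static getters on Java classes
def cal = Calendar.instance
assert cal instanceof Calendar
```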

The other modification I made had to do with the fact that I was concerned about reading the remote XML file line by line. I thought it might be more appropriate to read the entire file into a local variable and then parse the file.

To do that, I found that the URL class had a getText() method (or, more in the Groovy spirit, a text property). That meant I could read the entire page by writing

def gamePage = new URL(url).text

Now the matching can be done all at once via

def m = gamePage =~ pattern

which results in a collection of matches. The only complication is that the pattern I’m searching for (/${day}_(\w*)mlb_(\w*)mlb_(\d)/) appears twice in each line, once as the text value of the <a> tag and once as its href attribute. I figured the easiest way to deal with that was to use eachWithIndex and only worry about every other match:

def m = gamePage =~ pattern
if (m) {
    (0..<m.count).eachWithIndex { line, i ->
      if (i % 2) {
          away = m[line][1]
          home = m[line][2]
          num = m[line][3]

etc. The rest is essentially the same.
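A different way to handle the doubled matches, not the one used above, is to collect the captured groups and drop the duplicates. This is only a sketch: the page text is a hand-written sample in the same link format, and the names page and games are hypothetical.

```groovy
def day = '25'

// Hand-written sample: each game id appears in the href and again as link text
def page = '''<li><a href="gid_2007_08_25_atlmlb_slnmlb_1/">gid_2007_08_25_atlmlb_slnmlb_1/</a></li>
<li><a href="gid_2007_08_25_bosmlb_chamlb_1/">gid_2007_08_25_bosmlb_chamlb_1/</a></li>'''

def pattern = ~/${day}_(\w*)mlb_(\w*)mlb_(\d)/
def m = page =~ pattern

// m[i] is a list [wholeMatch, group1, group2, group3]; keep only the groups,
// then let unique() remove the second copy of each game
def games = (0..<m.count).collect { m[it][1..3] }.unique()

games.each { g ->
    println "away=${g[0]} home=${g[1]} num=${g[2]}"
}
```

This avoids depending on the exact order in which the two copies show up on the page.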

A good source for figuring out the Groovy way to do things is the PLEAC Groovy page. It rocks.

Categories
Baseball Groovy

Groovy Box Scores (minor correction)

I noticed while running the Groovy code I posted the other day that I accidentally reversed home and away. It’s not critical, because I still got the URL right, but it’s better to be right.

The fix was just to switch the groups:

away = m.group(1)
home = m.group(2)

and then to update the ${away} and ${home} in the URL link for the individual games.

I’m not sure that the best way to go is to use the eachLine method on the open stream, either. It’s probably better to download the whole page and then process it. I’m not sure how eachLine is working under the hood. If it’s sending a new HTTP request per line, it’s going to be pretty slow.

I also did some very rudimentary Date processing, always an ugly and awkward thing in Java. The URLs for each game need the day, month, and year, where the day and month have two digits and the year has four. Just to keep things simple, I did it this way:

def cal = Calendar.getInstance()
def year = cal.get(Calendar.YEAR)
def m = cal.get(Calendar.MONTH) + 1  // Ugly off-by-one correction
def d = cal.get(Calendar.DAY_OF_MONTH)
def month = (m < 10)? "0" + m : m
def day = (d < 10) ? "0" + d : d

Now I can run the script without arguments and it checks on the status of the current day’s games. I’ll update it soon so that I can enter in a date, but dates are always awkward so I’m hesitating. When I turn all this into a web app (probably using Grails), I’ll try to insert some calendar widget with some Ajaxy goodness.
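The zero-padding could also be handed to String.format, which saves the two ternary expressions. This is a sketch of that alternative; it uses a fixed date so the output is predictable, where the real script would use the current date.

```groovy
// Fixed date for a predictable result; swap in Calendar.instance for today
def cal = new GregorianCalendar(2007, Calendar.AUGUST, 5)

def year  = cal.get(Calendar.YEAR)
// %02d pads with a leading zero to two digits; MONTH is still zero-based
def month = String.format('%02d', cal.get(Calendar.MONTH) + 1)
def day   = String.format('%02d', cal.get(Calendar.DAY_OF_MONTH))

def path = "year_${year}/month_${month}/day_${day}/".toString()
println path
```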

Categories
Baseball Groovy

Groovy Box Scores

Long ago I decided the best thing about Ruby on Rails was Ruby. Ruby is a great language, with a friendly community and lots of samples to learn from. Still, it’s quite a radical change from Java, which is the language where I am most comfortable.

That brought me to Grails, on a journey I’ve discussed here before. Since Rails taught me about Ruby, I suspected that the coolest aspect of Grails was going to be Groovy. As I spend more and more time with both Groovy and Grails, I’m not sure I want to downplay Grails while praising Groovy, but Groovy sure is a lot of fun.

As we get deep into the baseball pennant races, I’ve been spending more time on the game online. Recently, to my surprise, I discovered that MLB actually makes the box scores from each game available online in XML format. In other words, just by processing some XML, I can access whatever game data I like.

I found out about this from the interesting book Baseball Hacks, by Joseph Adler. I got the book when it came out but didn’t get very far into it because the language of choice in the book was Perl. I’m really not a Perl hacker by any means, so I kind of lost interest. Then I saw the hacks on accessing data online, and I was hooked all over again.

Processing XML with Java is never a fun thing to do. The programming model is awkward at best, and filled with indirection (you have to get a factory to get a DocumentBuilder / SAXParser / Transformer, then set properties on it, then get the object you really wanted, etc). Then getting the element you wanted isn’t terribly fun, either. It wasn’t until Java 5 that the standard library finally included an XPath processor.

(Incidentally, I usually say that my two least favorite things to do in programming are debugging JavaScript and traversing DOM trees in Java. Ajax gives me the chance to do both at the same time! Fortunately, JavaScript is much friendlier to XML than Java is, and the great Ajax libraries like Prototype, Dojo, and Scriptaculous make everything easier. But I digress…)

The JSP Standard Tag Library (JSTL) makes all of that much easier, too. Not only are the tags simple (imports and transforms and the like), but you can just use a JavaScript-like EL dot notation to traverse the tree. Unfortunately, though, that doesn’t seem to have made its way into Java yet.

Enter Groovy. Since the data is online already in XML form, I wondered how I could access it in Groovy. It turns out that accessing and parsing the data takes about two lines:

def url = ... // whatever the url is, online or otherwise
def boxscore = new XmlParser().parse(url)

and we’re done. (Note that an XmlSlurper works just as well as an alternative.)

Traversing the resulting tree is also trivial.

To see an example, I was going to paste a box score here, but it’s probably just as easy to see it online. Here’s a link to the box score for the game Boston at Chicago from today, which the Red Sox won 14-2.

(Gee, I wonder why I picked that game? :))

The Baseball Hacks book shows how to examine all files of that form using Perl. Since I’m trying to learn more Groovy, I’m redoing the examples. Of course, since Groovy is object-oriented, the next step will be to create actual classes and objects out of these things, not just live on functional programming, but that will come later.

Here’s a snippet to grab the box score and do some basic processing with it.

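// Just a sample for the moment:
def year = '2007'
def month = '08'
def day = '25'
def away = 'bos'  // away team abbreviation (Boston, the game mentioned above)
def home = 'cha'  // home team abbreviation (Chicago White Sox)
def num = '1' // 1 for single game, 1 or 2 for double header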

// Build the URL
def base = 'http://gd2.mlb.com/components/game/mlb/'
def url = base + "year_${year}/month_${month}/day_${day}/"
url += "gid_${year}_${month}_${day}_${away}mlb_${home}mlb_${num}/boxscore.xml"

// Read and parse the box score
def boxscore = new XmlParser().parse(url)

// Collect all the <batter> elements inside all the <batting> elements
def batters = boxscore.batting.batter
for (b in batters) {
    println b.'@name' + ' went ' + b.'@h' + ' for ' + b.'@ab'
}
println batters.size() + " total batters"
println 'Total hits: ' + batters.'@h'*.toInteger().sum()

println "Batters with at least one hit:"
println batters.findAll {
    it.'@h'.toInteger() > 0
}.collect {
    it.'@name' + '(' + it.'@h' + ')'
}

Note how easy it is to access child elements and even attributes (prefaced by the @ sign). I also love the spread operator (*.) which allows me to grab the “hits” attribute of each batter, convert them all into integers, and then add them up. I also get to use closures to find all the batters with at least one hit, collect them into a list, and print their names. There may be a more elegant (read “groovier”) way to do that, but this worked for me.
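To try the same navigation without a network connection, the parser can be fed a string instead of a URL. In this sketch the XML is made-up sample data that only mimics the batting and batter elements used above, and the import line assumes Groovy 3 or later (older versions find XmlParser in groovy.util with no import needed).

```groovy
import groovy.xml.XmlParser

// Made-up sample data, not a real box score
def xml = '''
<boxscore>
  <batting>
    <batter name="Ortiz" ab="4" h="2"/>
    <batter name="Lowell" ab="5" h="0"/>
  </batting>
  <batting>
    <batter name="Konerko" ab="4" h="1"/>
  </batting>
</boxscore>
'''.trim()

def boxscore = new XmlParser().parseText(xml)

// Child elements are reached with dots; this gathers batters from both <batting> elements
def batters = boxscore.batting.batter

// Attributes are reached with '@'; spread-dot converts each hit count to an integer
def totalHits = batters.'@h'*.toInteger().sum()

// Closures filter and transform the list
def hitters = batters.findAll { it.'@h'.toInteger() > 0 }
                     .collect { it.'@name' + ' (' + it.'@h' + ')' }

println "Total hits: " + totalHits
println hitters
```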

The URL for each individual game is in a directory corresponding to the string above that begins with “gid”. The parent directory for that date lists all the games for that day. In order to process all the games for a given date, somehow I need a list of those directories.

Adler does the Perl equivalent of screen scraping to get those values. In other words, he basically reads the HTML page and looks for the link tags that have that href in them. Of course, as a Perl hacker, he uses regular expressions.

I’m a relatively normal Java programmer, which means I’ve spent most of my career avoiding regular expressions unless absolutely necessary. One of my absolute favorite programming quotes is in the Groovy in Action book (GinA), p. 76:

Once a programmer had a problem. He thought he could solve it with a regular expression. Now he had two problems.

That slays me. Unfortunately (or not, since I really do need to learn this stuff), the best way I could find to solve the same problem was still a regular expression. Since regexes have been a part of Java for a couple of versions now, it’s high time I got better at them, especially if I want to make any progress in Groovy.

It took some time for me to realize it, but the key to making my program work was “grouping”. I hadn’t realized that if you put parentheses in a regular expression, you can easily get at the grouped values. In this particular case, the base URL for the day is a web page that contains a series of links in the form I want:

<li>
<a href="gid_2007_08_25_atlmlb_slnmlb_1/">gid_2007_08_25_atlmlb_slnmlb_1/</a>
</li>

and so on for each game. Here’s what I ultimately did:

println "Games for ${month}/${day}/${year}"
def url = base + "year_${year}/month_${month}/day_${day}/"
def gamePage = new URL(url)

def pattern = ~/${day}_(\w*)mlb_(\w*)mlb_(\d)/

gamePage.openStream().eachLine() { line ->
    def m = pattern.matcher(line)
    if (m) {
        home = m.group(1)  // group 1 is the home team abbrev
        away = m.group(2)  // group 2 is the away team abbrev
        num = m.group(3)   // group 3 is the num (1 or 2)
        def game = "gid_${year}_${month}_${day}_${home}mlb_${away}mlb_${num}/boxscore.xml"

        // if the game hasn't started, the box score won't be there
        // Use a try/catch block for this situation
        try {
            def boxscore = new XmlParser().parse(url + game)

            // Team names are attributes of <boxscore>
            // Run totals are attributes of the single <linescore> child of <boxscore>
            def awayName = boxscore.'@away_fname'
            def awayScore = boxscore.linescore[0].'@away_team_runs'
            def homeName = boxscore.'@home_fname'
            def homeScore = boxscore.linescore[0].'@home_team_runs'
            println awayName + " " + awayScore + ", " +  homeName + " " + homeScore +
                 " (game " + num + ")"

            // Winning and losing pitchers are in a "note" attribute of <pitcher>
            def pitchers = boxscore.pitching.pitcher
            pitchers.each { p ->
                if (p.'@note' && p.'@note' =~ /W|L|S/) {
                    println "  " + p.'@name' + " " + p.'@note'
                }
            }
        } catch (Exception e) {
            println abbrevs[away] + " at " +  abbrevs[home] + " not started yet"
        }
    }
}

At the top of my script I have a map called “abbrevs” which looks like:

def abbrevs = [atl:"Atlanta", bos:"Boston",
    sln:"St. Louis", cha:"Chicago (A)", chn:"Chicago (N)"  // ... and so on ...
]

The result is a listing like:

Games for 08/25/2007
Atlanta Braves 3, St. Louis Cardinals 0 (game 1)
Boston Red Sox 14, Chicago White Sox 2 (game 1)
    Wakefield (W, 16-10)
    Buehrle (L, 9-9)
Arizona at Chicago (N) not started yet

and so on.

The next step is to use a builder to convert the box score into HTML. I did that, but since this post is already getting out of hand I think I’ll save that for my next one. I also did some very rudimentary date processing so that I could get the box scores for the current date without having to hard-wire anything.

It’s amazing how much easier this is than basic Java processing, but what also makes it so cool is that I was able to use my Java knowledge to help. For example, to get the basic web page for processing I already knew about the URL class and its openStream() method. The rest I got from Groovy.

Next time I’ll get into the builder and the date processing. Then I can start developing a much more object-oriented version, which will probably contain classes called Boxscore, Pitcher, Batter, and so on.

Categories
Grails Groovy

Moving from Groovy to Grails

When I first heard about Ruby on Rails late in 2005, I got very excited about it and was eager to learn more. Over the Xmas/New-Years break I purchased both the so-called “pick-axe” book (Programming Ruby, by Dave Thomas) and Agile Web Development with Rails (by Dave Thomas and David Heinemeier Hansson).

I tried plowing through the Rails book, but while the text is clear and well-written, I quickly realized that in order to make any real progress I needed to learn as much Ruby as I could. Learning both at the same time was just too much. I therefore went back to Ruby, and after making progress on that, I eventually dug into Rails again. Since my primary skills are Java, J2EE, and XML related, I couldn’t spend full time on those topics, but slowly made progress over the first few months of 2006.

Eventually, I abandoned that effort. I’ve talked about it here in earlier posts, but the short answer is that I came to believe that Ruby (and Rails, even more so) is really just a niche product. While both are very interesting and helped me learn a ton of new concepts (Ruby especially helped my JavaScript when I got to Ajax), I don’t believe that the industry is ready to abandon its considerable investment in Java technologies so it can start rewriting everything in Ruby. Add to that the fact that Rails is great if you’re starting from scratch but very annoying if you’re not (especially if you’re stuck with a legacy database that violates Rails conventions and even uses, horror of horrors, compound primary keys), and it just didn’t seem worth the effort.

I also became rather disenchanted with the arrogance and even haughtiness of the Rails team, from DHH on down. I liked that they violated accepted conventional wisdom and made it work, but I got really tired of the superior attitude and unwillingness to acknowledge that not everyone has the freedom or interest in doing everything their way. In the end, I decided that (1) Ruby is totally cool and can do almost anything (see Enterprise Integration with Ruby and Ruby Cookbook), and (2) Rails has a lot of great ideas, but neither was really my future.

In the fall of 2006, I then had the great opportunity of attending a talk by Jason Rudolph, who is a committer on the Grails project. Grails, as it sounds, is a framework rather analogous to Rails, based on the Groovy language. The cool part, though, is that Groovy code compiles directly to Java bytecode and runs on the JVM, so it cleanly interoperates with existing Java classes. Grails itself uses Groovy in many places, but really gets its power by leveraging existing, well-established Java projects like Spring and Hibernate.

In other words, Groovy and Grails are perfect for me. I’ve therefore been trying to absorb them off-and-on for the past six months or so. It’s hard to dedicate contiguous blocks of time to them, but I keep plugging away at them. From my Ruby experience, I decided that the real key was to learn Groovy before trying to absorb Grails. Therefore, I purchased Groovy in Action (by Dierk Koenig and others) and The Definitive Guide to Grails (by Graeme Rocher), and also acquired Jason Rudolph’s book Getting Started with Grails. That latter book has gotten dated very quickly, but is still a lot of fun.

I’ve mentioned it here many times, but Groovy is seriously cool. In fact, pretty much every Java developer I’ve shown it to gets excited about it. Even silly, trivial things are great.

For example, to make a Java POJO that will be an entity in a database somewhere, I have to write a class like:

public class Employee {

  private int id;
  private String name;
  private double salary;

  // ... other attributes as necessary ...

  public Employee() {}

  public Employee(int id, String name) {
    this.id = id;
    this.name = name;
  }

  // ... other constructors as necessary ...

  public int getId() { return id; }

  public void setId(int id) { this.id = id; }

  // ... all of the other getters and setters ...

  // ... overrides of equals(), hashCode(), and toString() ...

}

That’s a fair amount of code for what is essentially a trivial data structure. Even though Eclipse can generate much of that for me, it’s still rather tedious and verbose, especially when you compare it to the analogous Groovy bean:

class Employee {
  int id
  String name
  // ... other properties ...
}

That’s it. Not only don’t I need any semicolons (trivial, I know, but still kind of cool), but the attributes are assumed to be private, any required public getters and setters show up whenever I access or set a property, and any constructors I might need are already there, too. Truly sweet.

I can populate my Employee instance as easily as

Employee fred = new Employee()
fred.id = 1
fred.name = "Fred Flintstone"
// ... etc ...

which uses the public setter methods even though it looks like I’m accessing private properties. Actually, there’s an easier way.

Employee fred = new Employee(id:1, name:"Fred Flintstone", ...)

where I’m using Groovy map properties to populate the attributes, even though I didn’t write a constructor at all.

I can also do it this way:

def values = [id:1, name:"Fred Flintstone",...]
Employee fred = new Employee()
values.each {  key, value ->
  fred.setProperty("$key",value)
}

which looks Ruby-like and is also pretty cool. That’s still too complicated, though. I can also do this:

def values = ... // as above
Employee fred = new Employee()
fred.properties = values

That’s better, but I can even do:

def values = ... // as above
Employee fred = new Employee(values)

using the map directly. The best part is that I can even write the Employee class in Java and populate in a Groovy script much the same way. How cool is that?
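Pulled together into one runnable script, the named-argument and map-based construction look roughly like this. It is a minimal sketch: the salary figure and the second employee are made up for illustration.

```groovy
class Employee {
    int id
    String name
    double salary
}

// Named arguments: Groovy calls the no-arg constructor, then the setters
def fred = new Employee(id: 1, name: 'Fred Flintstone', salary: 50000.0d)

// The same thing, driven by an existing map
def values = [id: 2, name: 'Barney Rubble']
def barney = new Employee(values)

// Property syntax and the generated getters are interchangeable
println fred.name + ' earns ' + fred.getSalary()
println barney.id + ': ' + barney.name
```

Properties that the map leaves out, such as salary for the second employee, simply keep their default values.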

Groovy therefore simplifies Java dramatically. Its dynamic typing makes coding much simpler, too. The Java simplifications are what first make Groovy attractive, but once you dig into it, the closure support is exciting and the builders are simply incredible. There’s nothing better than being able to build an XML document by writing

def builder = new groovy.xml.MarkupBuilder()
builder.employee (id:fred.id) {
  name fred.name
  // ... other instance variables ...
}

which generates the XML with all the tags, attributes, and text values automagically inserted. That’s unbelievably sweet.
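For a version that can be run as is, the builder can write into a StringWriter instead of standard output. This is a small sketch with made-up values standing in for the Employee instance.

```groovy
import groovy.xml.MarkupBuilder

// Capture the generated markup in a string rather than printing it directly
def writer = new StringWriter()
def builder = new MarkupBuilder(writer)

// Method names become element names, map entries become attributes,
// and a plain argument becomes the element's text
builder.employee(id: 1) {
    name 'Fred Flintstone'
    salary 50000
}

def employeeXml = writer.toString()
println employeeXml
```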

On my current trip down here to Atlanta, though, I decided I’d finally read enough Groovy, even though I’m far from good at it. It was time to start digging into Grails again.

Grails is still evolving quickly, so every book I have is effectively out of date. Still, what I’ve seen is fantastic. I can’t wait to learn more, and, even better, start building some real sites with it. I’m also eagerly awaiting the Grails 1.0 release, tentatively scheduled for October.

I’d write more about it, but this post has already gone on way too long. Still, the more I learn the more enjoyable the whole thing is. It’s been a while since I felt that way about a programming language.

Categories
Groovy

Groovyness with Excel and XML

Today in class one of the students mentioned that they need to read data from an Excel spreadsheet supplied by one of their clients and transform the data into XML adhering to their own schema.

I’ve thought about similar problems for some time and looked at the various Java APIs for accessing Excel. I spent a fair amount of time working with the POI project at Apache, which is a poor substitute but at least worked.

On the XML side, the Java libraries have gotten better, but working with XML in Java is rarely fun. I know the Apache group has built a few helper projects to make it easier, but I haven’t used them that much. In class, students don’t really want to talk about other projects; they want to know what’s in the standard libraries.

In short, I know I could write the necessary code to take data out of Excel and write it out to XML, but it would be long and awkward. It certainly wouldn’t be much fun.

Now, though, I’m spending a lot of time with Groovy. I’m working my way through the book Groovy in Action (Manning), which has jumped to the top of my favorite technical books list. I’m still learning, but I knew there was a Groovy library for accessing Excel, and I knew Groovy had a “builder” for outputting XML. I just needed to see how to write the actual code. I set up a sample Excel spreadsheet with a few rows of data and went to work.

Here’s the result. It’s about 25 lines of code all told. In other words, it’s almost trivial. I’m amazed.

package com.kousenit;

import org.codehaus.groovy.scriptom.ActiveXProxy

def addresses = new File('addresses.xls').canonicalPath
def xls = new ActiveXProxy('Excel.Application')

// get the workbooks object
def workbooks = xls.Workbooks
def workbook = workbooks.Open(addresses)

// select the active sheet
def sheet = workbook.ActiveSheet

// get the XML builder ready
def builder = new groovy.xml.MarkupBuilder()
builder.people {

    for (row in 2..1000) {
        def ID = sheet.Range("A${row}").Value.value
        if (!ID) break

        // use the builder to write out each person
        person (id: ID) {
            name {
                firstName sheet.Range("B${row}").Value.value
                lastName sheet.Range("C${row}").Value.value
            }

            address {
                street sheet.Range("D${row}").Value.value
                city sheet.Range("E${row}").Value.value
                state sheet.Range("F${row}").Value.value
                zip sheet.Range("G${row}").Value.value
            }
        }
    }
}

// close the workbook without asking for saving the file
workbook.Close(false, null, false)
// quits excel
xls.Quit()
xls.release()

I’d call that a successful experiment. It certainly was a happy one. I know I’ll do more in the future. I’d bet that somebody with more experience could show me how to condense that even further.
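The builder half of that script can be tried on any platform by swapping the spreadsheet for an in-memory list. This is only a sketch: the rows below are made-up stand-ins for the Excel data, and the element names follow the script above.

```groovy
import groovy.xml.MarkupBuilder

// Made-up rows standing in for the spreadsheet contents
def rows = [
    [id: 1, firstName: 'Fred',   lastName: 'Flintstone', city: 'Bedrock', state: 'CA'],
    [id: 2, firstName: 'Barney', lastName: 'Rubble',     city: 'Bedrock', state: 'CA']
]

def writer = new StringWriter()
new MarkupBuilder(writer).people {
    // Ordinary Groovy iteration works inside the builder closure
    rows.each { row ->
        person(id: row.id) {
            name {
                firstName row.firstName
                lastName row.lastName
            }
            address {
                city row.city
                state row.state
            }
        }
    }
}

def peopleXml = writer.toString()
println peopleXml
```

Keeping the data source separate from the builder like this also makes it easy to replace Excel with a database query later.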

Groovy is just plain fun, and I haven’t felt that way about Java for a long, long time.