Persistence providers are not all alike

At the NFJS conference I attended over the weekend, I wound up going to two sessions by Mark Richards.  His topic was the JPA specification as part of the overall EJB3 spec.

As I’ve mentioned here, I’m quite interested in that.  Will and I just finished an introductory EJB3 course for Capstone where I wrote the bulk of the JPA stuff.

(Actually, I took advantage of Will shamefully.  I can’t believe how much he ultimately wrote.  Still, he seems okay with it.  I still feel like I got lucky.)

I sat in on part of the “Introduction to Java Persistence API” talk and the bulk of the “Advanced Java Persistence API” talk.  I was already familiar with the material, but there’s always more to learn.  Plus, I had awkward questions to ask. 🙂

After the introductory talk, I approached Mark and asked him two questions.  It helps to know that Mark is a “Certified Senior IT Architect” at IBM.

My first question: what the heck is IBM’s problem?  In other words, why don’t they have an application server that supports Java EE 5, or at least JPA and EJB3?  What are they waiting for?  I mean, Sun is already there (of course) and their Sun Java Application Server (Glassfish) is actually usable.  JBoss is pretty much ready to go.  WebLogic is in their last betas.  Where is WebSphere?  Don’t they care at all?

You may say that it was an unfair question, and you’d be right.  It’s just that I have no one else to ask it to.  Or, stated a bit more honestly, I have no one else in any position of influence to vent my frustrations to.

He started off by saying that IBM was moving “deliberately” because they don’t own a persistence provider, the way Oracle has TopLink or JBoss uses Hibernate.  Now, I’m basically a friendly person, but I knew that was nonsense and said so.  He backed down and admitted there were actually “lots of reasons.”  I let it go at that.

A clue can probably be found in the excellent podcasts made by the Java Posse.  In podcast #106 they had an interview with an IBMer, and he basically said that IBM is not a technology-driven company.  Instead, they’ve made the business decision to wait to ask their clients to upgrade to a new version, even implying that the demand for Java EE 5 is not there in the marketplace.

I find that highly questionable.  IBM developers are at the heart of many of the major open source projects we use today.  IBM even donated Eclipse, for crying out loud, and Eclipse is basically the Emacs of our generation.  IBM tries to be on the forefront of technology.

That said, RAD6 is a mess and from what I’ve gathered, RAD7 is worse.  The Eclipse part is great, but there are huge numbers of bugs and problems in the products, not to mention that they suck up all the memory on your machine and go looking for more, much like the aliens in Independence Day.  I think they’re going slowly because they think they’ve still got the market under control and that they don’t want to jump to the new version quickly.  From what I gather, the version of WebSphere that will support Java EE 5 (WAS 7?) will not be out until at least the middle of 2008.  Only time will tell if that decision costs them market share.

Anyway, that’s not Mark’s fault.  It was amusing having him sit next to Brian Goetz from Sun at the last Birds of a Feather session and watch them disagree on fundamental issues.

(Aside: Brian Goetz is a huge name in multithreaded programming.  He wrote the Addison-Wesley book on  threading (Java Concurrency in Practice), and, in a much-discussed move, actually joined Sun about a year ago as most of the best developers were leaving.  One of the best lines of the conference was when he admitted that people said to him, “That’s the first time I’ve seen a rat jump ONTO a sinking ship.” :))

I don’t blame Mark for not having a decent answer — I suspect it’s more a marketing issue than anything else.

The other question I asked him about was the strange unidirectional one-to-many behavior I discussed here in an earlier post.  That’s where the “one” class has a collection of the “many” type, but the “many” class doesn’t have an attribute of “one” type.

Trivial example: an Order may have a collection of Product instances, but the Product doesn’t have a reference to the Order.

Now, the database implementation doesn’t care whether the relationship is bidirectional or not.  The PRODUCT table is going to have a foreign key to the ORDER table, because there’s no way to know how many columns you’d need in the ORDER table to do it the other way around.  If the association is bidirectional, then there’s no problem.  The Product class has an attribute of type Order, called “order”, and adds a @ManyToOne annotation on it, while the Order class has a collection of Product instances called “products”, on which it adds a @OneToMany(mappedBy="order") annotation; mappedBy names the attribute on the Product side that owns the relationship.  Everybody’s happy.
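For the record, here is roughly what that bidirectional mapping looks like.  Treat it as a sketch, not as the code from our course: the field name order, the addProduct helper, and the two locally declared annotations are my assumptions.  The annotations are stand-ins so the snippet compiles without a JPA jar; real code would import them from javax.persistence, and the entity and id annotations are left out for brevity.  Note that mappedBy has to name the field on the Product side.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-ins so the sketch compiles without a JPA jar on the classpath.
// Real code would import javax.persistence.ManyToOne and OneToMany instead.
@interface ManyToOne {}
@interface OneToMany { String mappedBy() default ""; }

class Product {
    // Owning side: this field corresponds to the foreign key column in PRODUCT.
    @ManyToOne
    Order order;
}

class Order {
    // Inverse side: mappedBy names the field in Product that owns the association.
    @OneToMany(mappedBy = "order")
    List<Product> products = new ArrayList<>();

    // Hypothetical helper that keeps both sides of the association in sync,
    // so nobody can add a Product and forget to set its Order.
    void addProduct(Product p) {
        products.add(p);
        p.order = this;
    }
}
```

Routing every addition through a helper like addProduct is one common way to avoid a Product whose Order was never set.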

Except that it’s wrong.  Why make the Product know about the Order?  And what happens if you forget to set the Order attribute of the Product?  Do you get referential integrity issues or worse?  Add to that the fact that it’s just ugly and you see there’s an issue.

The problem gets much worse, however, if the relationship is unidirectional on the collection side.  The JPA specification states that in a unidirectional association like that, the database implementation should use a link table between the two entity tables.  But nobody does it that way, for good and sound reasons.

Therefore, I asked Mark Richards about it.  Why did the spec recommend that?  Once he realized I wasn’t asking about how to map it, but rather disagreeing with the recommendation in the spec, he lowered his voice and became a conspirator.  “It’s that way,” he said, “because Sun wanted it that way.”

He claimed that this issue was roundly debated during the JSR specification meetings and the debates weren’t terribly friendly, either, but that Sun ultimately made a decision and therefore this is what we have.

Hmm.  It’s hard to know how much truth is in that.  I wasn’t there, and don’t know anybody who was.  It’s so in character for someone from IBM to blame Sun for any problems that it could be true or not.

(Obvious example of an awkward IBM/Sun relationship: why name their flagship editor platform Eclipse, anyway?  Is that supposed to say something about its relationship to Sun?  Inquiring minds want to know, but of course IBM claims it was all a coincidence.)

So the answer to that is also, essentially: that’s what the spec says and we have to live with it.

In the talks themselves, Mark took great pains to try to show how the same code could use either TopLink or Hibernate as its persistence provider.  He switched back and forth many times and showed how it all worked.

Except when it didn’t.  He demonstrated several cases where Hibernate violated the spec in significant ways.  He told a story about asking a member of the Hibernate core team about a particular issue, only to receive an earful about how they knew better than the spec and weren’t going to change their product for something they didn’t agree with.  It came across as arrogance worthy of the Rails team, which apparently flows from Gavin King on down.

I have no idea if that’s actually true or not, either, but it’s not the first time I’ve heard it.

Finally, I get to the reason for today’s post.  As part of our EJB3 materials, we implemented a system where a Proposal has both public Comments and professional Comments.  In other words, we had two one-to-many relationships between Proposal and Comment, and both were unidirectional.  To illustrate the issues with the spec, we decided to show how to implement the public comments using a link table (as the spec suggests) and the professional comments by making the relationship bidirectional.  So far, so good.

The problem came when we tried to do the cascade delete.  Deleting a Proposal ought to delete both kinds of Comments.  We set CascadeType.ALL on both relationships and hoped for the best.
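A rough sketch of that setup, under some loud assumptions: the field names, the addProfessionalComment helper, and the locally declared CascadeType and annotations are mine, not the course code.  The stand-ins exist only so the snippet compiles without a JPA jar; real code would use the javax.persistence versions, plus the entity and id annotations omitted here.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-ins so the sketch compiles without a JPA jar; real code would use
// javax.persistence.CascadeType, OneToMany and ManyToOne instead.
enum CascadeType { ALL }
@interface ManyToOne {}
@interface OneToMany {
    String mappedBy() default "";
    CascadeType[] cascade() default {};
}

class Comment {
    // Only set for professional comments (the bidirectional relationship).
    @ManyToOne
    Proposal proposal;
}

class Proposal {
    // Unidirectional: no mappedBy, so the spec default is a link table.
    @OneToMany(cascade = CascadeType.ALL)
    List<Comment> publicComments = new ArrayList<>();

    // Bidirectional: the foreign key is owned by Comment.proposal.
    @OneToMany(mappedBy = "proposal", cascade = CascadeType.ALL)
    List<Comment> professionalComments = new ArrayList<>();

    // Hypothetical helper that keeps both sides of the association in sync.
    void addProfessionalComment(Comment c) {
        professionalComments.add(c);
        c.proposal = this;
    }
}
```

With both collections marked CascadeType.ALL, removing a Proposal is supposed to remove every Comment in either list, whichever way the relationship is stored.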

What happened?  TopLink deleted the professional comments without a problem (the bidirectional version), but failed with a foreign key violation when trying to delete the public comments (through the link table).

I decided to re-write the test to use Hibernate, and, lo and behold, that worked like a charm.  Go figure.

So what’s the conclusion?  I’m not trying to change the spec.  I’m an instructor and developer who has to use what’s available and show others how to deal with it.  Frankly, it’s an interesting demonstration to see one provider work and the other fail.  I guess I was just surprised, given the build-up, which one succeeded.

I have other, more philosophical comments to make about those presentations and the conference in general, but that’ll wait until another post.

Just to leave on an up note, however, one of the best lines of the conference was when Neal Ford asked everyone, “do you think it would have been easier to introduce Groovy into your organization if the language had been called Enterprise Business Execution Language?”


