Caveat Container

I said that I'd blog about a couple of painful lessons learned and that's a bit overdue. Sorry, I went to Holland for a week! However, here's one of those lessons learned...

If you work with containers in Java (collection classes), you'll find that they have a contains(obj) method. The idea is that you can quickly check whether a given object is already in the collection by saying:


if (myCollection.contains(myObject)) ...
In many cases, you'll find this "just works" because contains() has the same idea of "equal" that you do for your objects. That is, it checks whether that exact same object is a member of the collection. Unfortunately, there are situations where "the same" object might be a separate instance that just happens to have all the same data (or at least the same key data). That can happen pretty easily when you are working with an ORM (like Transfer - although in our case it's Hibernate). You may be manipulating an object with, say, id = 783 and ask the ORM to load it from the database (or cache) so you can work on the "original" object. If the object you were working with has become disconnected from the ORM's cached version, you will now have two instances of the "same" object. They are "equal" as far as you are concerned but contains() will not consider them the same and you might be surprised to see it return false when you know the object is in the collection!
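To make that concrete, here is a minimal Java sketch of the situation. The Customer class and the method names are hypothetical, invented for illustration; the class deliberately does not override equals(), so it inherits the default identity comparison from Object.

```java
import java.util.ArrayList;
import java.util.List;

public class DetachedDemo {
    // Hypothetical domain class. It does NOT override equals(),
    // so two instances are only "equal" if they are the same object.
    static class Customer {
        final long id;
        final String name;

        Customer(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // A second instance carrying the same data as the one in the list.
    public static boolean containsDetachedCopy() {
        List<Customer> loaded = new ArrayList<>();
        loaded.add(new Customer(783, "Acme"));
        Customer detached = new Customer(783, "Acme");
        return loaded.contains(detached);
    }

    // The very same instance that was added to the list.
    public static boolean containsSameInstance() {
        List<Customer> loaded = new ArrayList<>();
        Customer original = new Customer(783, "Acme");
        loaded.add(original);
        return loaded.contains(original);
    }

    public static void main(String[] args) {
        System.out.println(containsDetachedCopy()); // false
        System.out.println(containsSameInstance()); // true
    }
}
```

The two objects carry identical data, but only the lookup with the original instance succeeds.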

This happened to me. My unit tests all passed but they had no disconnected objects. The real application began to fail and it took me a while to realize why.

For me, objects are "equal" if their id properties are equal (and they are the same type of course). In my Groovy domain objects, I define equals() to specify this behavior and Groovy conveniently calls equals() when I say objA == objB. Operator overloading is very useful (and easy) in Groovy and it can help make your code much more readable.

Unfortunately (for me), Java's collection classes don't respect that when contains() is called.
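One possible cause, offered as an assumption rather than a diagnosis: hash-based collections such as HashSet locate an element by hashCode() before they ever call equals(), so an equals() override on its own may not be enough for them. The sketch below uses a hypothetical Entity class that overrides both methods, with "equal" meaning same class and same id.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class IdEqualityDemo {
    // Hypothetical entity: "equal" means same class and same id.
    static class Entity {
        final long id;

        Entity(long id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (other == null || getClass() != other.getClass()) return false;
            return id == ((Entity) other).id;
        }

        // Kept consistent with equals(): equal objects return equal hash codes.
        @Override
        public int hashCode() {
            return Long.hashCode(id);
        }
    }

    // ArrayList.contains() walks the list and calls equals() on each element.
    public static boolean listContainsCopy() {
        List<Entity> list = new ArrayList<>();
        list.add(new Entity(783));
        return list.contains(new Entity(783));
    }

    // HashSet.contains() picks a bucket via hashCode() first, then calls equals().
    public static boolean setContainsCopy() {
        Set<Entity> set = new HashSet<>();
        set.add(new Entity(783));
        return set.contains(new Entity(783));
    }
}
```

If the hashCode() override is removed, the list lookup is unaffected, but the set lookup will usually miss because the two instances inherit different identity-based hash codes.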

Fortunately, Groovy provides a fairly simple workaround. You can use the find() method and a closure to specify the filter. find() returns the first matching object - or null if no objects match. myCollection.contains(myObject) can therefore be replaced with myCollection.find { it == myObject } or, if you want to be explicit, myCollection.find { obj -> obj == myObject }
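For anyone on plain Java, a rough analogue of the Groovy find { ... } call can be sketched with streams. The Item class and findMatching method are hypothetical names; the predicate compares ids explicitly, so the result does not depend on how contains(), equals() or hashCode() behave for the class.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Optional;

public class FindById {
    // Hypothetical class with no equals() override.
    static class Item {
        final long id;

        Item(long id) {
            this.id = id;
        }
    }

    // Returns the first element whose id matches the target's id,
    // or an empty Optional if there is none (Groovy's find() returns null instead).
    public static Optional<Item> findMatching(Collection<Item> items, Item target) {
        return items.stream()
                .filter(it -> it.id == target.id)
                .findFirst();
    }

    public static void main(String[] args) {
        List<Item> items = new ArrayList<>();
        items.add(new Item(42));
        items.add(new Item(783));

        System.out.println(findMatching(items, new Item(783)).isPresent()); // true
        System.out.println(findMatching(items, new Item(999)).isPresent()); // false
    }
}
```

The value returned is the instance stored in the collection, not the one passed in, which is usually what you want when the stored instance is the one the ORM is tracking.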

Since CFML does not allow you to compare objects directly, you're always going to have to write your own equality method. That means that any collection-based operations you do must explicitly call the appropriate equality method so, hopefully, you won't fall foul of Java's very literal "equality" test for the contains() method. If you work across both Java and CFML tho', you'll need to bear this in mind.

Comments
Hey Sean, if you also override hashCode contains() should work as expected. One of the nice little things I just found in NetBeans is ctl+i > generate equals and hashCode, which pops up a dialog for you to select which fields you would like to use in these methods and writes the proper math to return an int in hashCode.
# Posted By Chris Scott | 11/8/08 12:10 PM
Yes, a couple of folks have noted that overriding hashCode() would solve this problem but there are times when I need to establish "same" vs "equal" (I was already overriding equals() to define my own "equal" behavior). The caveat I was trying to highlight is that contains() is "same" not "equal" and it surprised me. Perhaps I wasn't clear...
# Posted By Sean Corfield | 11/8/08 1:44 PM
I'm pretty sure that what you've done breaks one of the invariants for java.lang.Object - sounds like a world of pain to me. No doubt you've worked things out for your case, but the listeners out there should be clear about the fact that introducing inconsistencies between hashCode() and equals() is explicitly forbidden.
# Posted By Jaime Metcher | 11/9/08 7:19 PM
@Jaime, it's not quite that simple. a.equals(b) implies a.hashCode() == b.hashCode() but the reverse is not required. MyObject.hashCode() can return 1 in all cases and still be a viable implementation. The hashCode() "method is supported for the benefit of hashtables" - equals() is intended to be "the most discriminating possible equivalence relation on objects".

However, that's a very narrow view of equality and it doesn't work well in a lot of situations.

It's also worth pointing out that Collection.contains() is supposed to rely on equals() - which didn't seem to be the case for my code.

And then I found this comment (about equals() and hashCode() usage):

"Some of the code shipped with the standard Java libraries gets it wrong."

*groan*
# Posted By Sean Corfield | 11/9/08 7:43 PM
@Sean,

So did you in fact override hashCode()? Given the contract for hashCode(), contains() is quite within its rights to check the hashcode first, then only proceed to calling equals() if the hashcodes are the same.

Either way, it's definitely a gotcha. IIRC, this can cause issues for Hibernate as well.
# Posted By Jaime Metcher | 11/9/08 8:46 PM
@Jaime, you're missing my point: the default behavior for hashCode() is intended to support object *identity* (same object) - and that concept is distinct from *equality*.

Java has painted itself into a bit of a corner here. It provides a default behavior based on identity but then requires you override *two* functions if you need equality instead and forces you to lose identity (because you are forced to change hashCode() so that it no longer provides object identity). Essentially, if you apply a less discriminating equivalence relation to your objects, you are *forced* to make HashTable performance worse by changing hashCode() to be less unique as well.

The documentation for Collection.contains() specifies a dependency on equals() but that is clearly incorrect - and your assertion that it may use hashCode() seems correct. If contains() behaved per the documentation, it would obey object equality and ignore object identity.

That was the caveat I was trying to highlight (so perhaps we are just in agreement but coming at this from different positions?).
# Posted By Sean Corfield | 11/9/08 9:01 PM