Understand_Don_t_Assume

Quick Profile

Alex Ooi Profile Picture
Hometown: Melbourne, Australia
Specializations: Java, Ruby on Rails
University: Software Engineering & Economics, Melbourne University
High School: VCE, Melbourne High School
Links:
 

Popular Articles

Saturday, May 29, 2010

for the sake of this post ... assume I'm using JDK 1.3 because I'm unfortunately working on a legacy app :(

oh, and make a note of the little bit of personal reflection i do at the end of this post ... relates to recent events that have me doing a bit of reflection

On with it already!!!

Today I was using a java.util.TreeMap. I actually didn't need to, and I got around the bug in our code by reverting back to a standard java.util.HashMap. Nonetheless, I thought this was interesting in two parts. First, because java.util.TreeMap did something I didn't expect. Secondly, and most importantly, because it re-inforced in me the need to carefully read the docs of external classes before using them.

The Bug

Lets assume that I'm using an instance of the following class as the key to the TreeMap.

public class MyKey implements Serializable, Comparable
{
    private String fieldOne;
    private String fieldTwo;

    // more java code ...

    public int compareTo(Object other)
    {

         // compares this objects fieldOne and
         // fieldTwo to other objects fieldOne and fieldTwo

    }

    // notice that there is no implementation of 
    // Object#equals(Object) or Object#hashCode()

}

The bug itself can be summarized by this code (it doesn't actually resemble what I was doing, but merely to illustrate my misunderstanding):


....

  Map dataHolder = new TreeMap()
  MyKey keyOne = new MyKey( ... ); // same field one and two
  MyKey keyTwo = new MyKey( ... ); // same field one and two
  dataHolder.put(keyOne, "one");
  dataHolder.put(keyTwo, "two"); 
  assertThat(dataHolder.values().size(), is(2); // fails ... it is 1

....

My first reaction (since I hadn't actually had much to do with the MyKey class up until now) was that #equals and #hashCode had been overridden to compare equality based upon fieldOne and/or fieldTwo. This was because other keys were working as I expected, where those keys had different fieldOne and fieldTwo values. But the MyKey classs doesn't overwrite equals and hashCode ...

Essentially, a java.util.TreeMap determines if a key exists by its #compareTo method. Not by its #equals method (as java.util.HashMap does). This meant that my initial assumption that the map would match keys by object equality was incorrect. I assumed that ,since java.util.HashMap matched keys by object equality, so too would TreeMap with the only difference being that TreeMap would use the compareTo simply to order. A simple check of the java.util.TreeMap documentation clearly states that:

"... but a map performs all key comparisons using its compareTo 
(or compare) method, so two keys that are deemed equal by this
 method are, from the standpoint of the sorted map, equal."

So even if two candidate keys are not equal under the Object#equals contract, they are still deemed equal by the java.util.TreeMap if their #compareTo methods return 0! I never had the pleasure of digging into this because, like most developers out there, whenever I've needed an implementation of a Map I would inevitably turn to using a HashMap ...

In fact, that key is just plain broken!!

Digging further, I took a closer look at java.lang.Comparable and realized that the implementation of the key class was broken according to the contract of Comparable. Any developer worth their salt would know of the general rule of Object#equals and Object#hashCode. But (and perhaps I'm the exception here??? or maybe not???), how many know of the contract defined in the Comparable interface that states:

"The natural ordering for a class C is said to be consistent 
with equals if and only if e1.compareTo(e2) == 0 has
the same boolean value as e1.equals(e2) for every e1 and 
e2 of class C"

This is why the key that we were using is broken. #compareTo was not implemented in a way that is consistent with Object#equals. #compareTo was implemented to observe the two String fields, whilst #equals hadn't been implemented, meaning it was checking equality based upon instance equality! The author of this class should have overridden #equals and #hashCode when they implemented #compareTo! The doco for Comparable even goes on to state that using a non-conforming class as a key to a TreeMap will cause "strange" behavior! Exactly what I was observing!

Understand, Don't Assume

Yeh, so that brings me to my second and most important point. I made the mistake of assuming how a class worked without either reading the doco or looking at the source code. That got me thinking. How often do developers simply use some code simply because it sounds like it does the job or it seems to do the job (after only a few simple tests)? How often do developers type the ubiquitous "dot" after referring to an instance of a class and wait for the "auto-completion" popup to appear before carelessly choosing a method to use? How often do developers just copy-and-paste random code from the internet without properly verifying it?

Anyway, fortunately in this case I had the chance to correct my initial mistake. I did several things. First, I read the javadoc for java.util.TreeMap carefully to understand, without digging into the actual source, how it works. Second, I had a good think about the differences between TreeMap and its more common brother, HashMap. And Thirdly, I discussed this particular bug with a couple of colleagues in order to gain a better understanding, and re-inforce my subsequent understanding, of the differences between the two classes.

Finally, on a strange tangent...

Recent events have had me re-thinking things a bit. Sometimes, in life, we don't always have the chance to redeem our mistakes. In those cases the best we can do is analyse what went wrong and, because we can't change others and we can't change what is gone, we can at least attempt to better ourselves from our experience. It reminds me of a quote from a recent movie that left a real mark on me... For those without a decent internet connection, or if that link has gone down, it was a line that Alfred (played by the excellent Michael Caine) delivered to Bruce (the ever brilliant Christian Bale) during one of his darkest moments in the movie. it goes something like this:

"Why do we fall? So that we can learn to pick ourselves up again."

perhaps I'm being dramatic ... but ... well ... this is one small step in my attempt to get back up again ...

Fortunately, this particular bug wasn't particularly serious or unfixable. It was also particularly interesting ... Certainly the most interesting thing I've come across in the last few months (damn my projects have been boring lately!!!) ... So don't forget. The next time you are tempted to simply use a method/class because it looks like it doest the job ... remember ... don't assume it does! Find the javadoc or the source, and verify that it really does behave the way you expect it should! It sure is damn easy to slip ...

Comments ...

That incident also points out the importance of documentation. I hope you learn
the importance of documentation, and not just saying "oh.. my code is well
documented through the test cases".

Did you find what you learnt through running the test cases that Sun wrote on
TreeMap, or did you find them through reading the Javadoc? :)

I'm not saying that test cases are not important. But undocumented code is as
equally bad as untested code.
Posted by Ultimate Bat Pig on Monday, May 31, 2010 at 10:56 AM
Its important to strive for all of these things ... high test coverage and
better documentation. Even self-documenting code!! ... but traditional
processes are just as bad as hard-core unit-test fanatics. Traditional
processes place a huge emphasis on super fat documents instead of the software
itself ...
Posted by Alex Ooi on Monday, May 31, 2010 at 03:30 PM
It sounds like someone is impersonating the Ultimate Bat Pig!
Posted by Alex Ooi on Monday, May 31, 2010 at 04:04 PM
Fooking imitators!
Posted by Ultimate Bat Pig on Tuesday, June 01, 2010 at 12:08 PM
> Traditional processes place a huge emphasis on super fat documents instead
of the software itself ...

That's quite a myopic view point.
Posted by tester on Wednesday, June 02, 2010 at 06:45 PM

Add a Comment

*
*
You must answer the following simple maths question before your comment will be accepted.
*