Equals and HashCode in Java

This is one of my pet peeves. Partially because it isn’t really explained that well in Java documentation, but mostly because I’ve seen it wrong so many places. As I’ve come across this twice in the past two days in our code, I felt the need to blog about it.

Let’s look at the contract for hashCode.

  • Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
  • If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
  • It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.

The summary is “if two things are equal (according to equal()), they must have the same hashCode”. Which probably means that we should be considering the same data in the “equals” method as we are in the “hashCode” method.

What’s the deal with HashCode?

Let’s say you need a map. You want to store key/value pairs, and you want to be able to get a value via a key. How would you implement this? Well, one thing you could do is have a list of all the key/value pairs and loop through the whole list. For each key/value pair, you could see if that key matches the one you are looking for. If it does, you return that value. Hooray!

The only problem with this is that it is slow. What if you have a map of a million items? A billion? You potentially have to loop through the *whole* list. On every lookup. That makes puppies sad.

Sad puppy

How could we do better? Well, one way would be to keep a bunch of lists of key/value pairs inside your object. When you add a key/value to your map, you put it in the “right list” somehow. Then when you go to get something from your map, you look in just the right list, and only have to go through all the key/values in that list. If you have a lot of lists, each list should hold very little, so there isn’t much to search through. Now you’re fast. Yay!

What is this magic “somehow”? This is the HashCode. It’s an int that lets you say what “bucket” to stick your key/value pair into.

The HashCode/Equals connection

This “bucketizing” is why equals and hashCode have to match up. Let’s look at an example:

Map capitals = new HashMap();
capitals.put("Alaska", "Juneau");
capitals.put("Arkansas", "Little Rock");
capitals.put("Virginia", "Richmond");
String foo = capitals.get("Virginia");

Let’s imagine for a second that HashMap made 26 buckets, and hashCode returned a number based on the first letter of the string. So, the “A” bucket would have two items in it and the “V” bucket would only have one. When we try to get the capital of Virginia, the hash map would know to only look in the “V” bucket. If we were to ask for the capital of Nevada, it would go straight to the “N” bucket, see it was empty, and return null. For getting the capital of Arkansas, it would go to the “A” bucket and through everything in that bucket, calling equals() on each key.

So that means that if two things are equal but don’t have the same hashcode, HashMap might not look in the right bucket, and therefore wouldn’t find the key in the map, even though it is really there.

Real example
Now let’s look at a real example (only slightly obfuscated for publication):


Message upper = new Message("I like e. e. cummings!");
Message lower = new Message("I like E. E. Cummings!");

HashMap myMap = new HashMap();
myMap.put(upper, "PoetryFan");
System.out.println(myMap.get(lower));
System.out.println(upper.equals(lower));


which outputs:

null
true


So what this means is that these two Messages are equal, but the HashMap doesn’t think so. BAD!!!

The implementation of this is:

    
public boolean equals(Object obj) {
    if (!(obj instanceof Message)) {
        return false;
    }
    if (!caseSensitiveCompare()) {
        return StringUtils.nullSafeEqualsIgnoreCase(
                 this.getString(), 
                 ((Message) obj).getText());
    } else {
        return StringUtils.nullSafeEquals(
                 this.getString(), 
                 ((Message) obj).getText());
    }
}
    
public int hashCode() {
    return string.hashCode();
}

HashCode is hard – let’s go shopping!

As we can see from the above example, it is very easy to get this stuff wrong. Here, the “case insensitivity” was added later, but the hashCode method wasn’t updated to match. Any time I touch anything having to do with equals/hashcode, I have someone else review what I did. We should all probably do that.

So, what makes a good hashCode method? I’ll leave that for the subject of a future post, but for now it suffices to to say that first and foremost it should be correct as stated above. Making it good is a bit harder.

One thought on “Equals and HashCode in Java”

  1. Also as a quick reminder, if you are going to implement equals() or hashCode(), you should always implement the other.

    We have a checkstyle check (http://checkstyle.sourceforge.net/config_coding.html#EqualsHashCode) and a many findbugs checks** on the subject.

    I totally agree with Ben. HashCode is hard, and so is equals…

    ** Here are some of the less repetitive ones (and these cannot even check Ben’s case where the code is simply wrong):

    http://findbugs.sourceforge.net/bugDescriptions.html#EQ_COMPARETO_USE_OBJECT_EQUALS
    http://findbugs.sourceforge.net/bugDescriptions.html#EQ_SELF_NO_OBJECT
    http://findbugs.sourceforge.net/bugDescriptions.html#HE_INHERITS_EQUALS_USE_HASHCODE
    http://findbugs.sourceforge.net/bugDescriptions.html#EQ_OTHER_NO_OBJECT
    http://findbugs.sourceforge.net/bugDescriptions.html#EQ_OTHER_USE_OBJECT
    http://findbugs.sourceforge.net/bugDescriptions.html#EQ_OVERRIDING_EQUALS_NOT_SYMMETRIC

Leave a Reply

Your email address will not be published. Required fields are marked *