Hibernate anti-pattern #1 – hashCode/equals

I’ve been doing a lot of Hibernate refactoring and performance tuning at work lately and have been spending time wrestling with some interesting anti-patterns that have crept into the code. One is our particularly bad hashCode/equals implementation which has some pretty frustrating implications1.

There’s a lengthy discussion here about various ways to implement hashCode and equals for Hibernate-managed entities. The post is great, and the discussion in the comments is also good.


The assumption is that the 1st-level cache (and also the 2nd-level cache if implemented) ensures that instances are unique. So our base entity class has a final equals method that uses the same logic as Object – two instances are equal if they’re the same (==). Likewise, hashCode is final and returns System.identityHashCode(this), the same value as if the method weren’t overridden (we have a base class between Object and our base entity class that necessitates explicitly reverting the behavior to that of Object):

@Override
public final boolean equals(final Object o) {
  return o == this;
}

@Override
public final int hashCode() {
  return System.identityHashCode(this);
}

This turns out to be overly simplistic and optimistic – it sure would be nice if that’s all it took, since there’s no equals/hashCode maintenance for new entity classes, no custom entity-specific logic to implement.

If you call Foo foo1 = session.get(Foo.class, 123) and Foo foo2 = session.get(Foo.class, 123), the 2nd call will return a reference to the cached 1st instance – there’s only 1 database hit. This is one of the primary benefits of using an ORM framework like Hibernate – there’s some overhead in its approach but optimizations like this will tend to far outweigh the costs as long as you’re using the framework appropriately.

So in this case, foo1 == foo2, and the equals method works fine.

But if you mix eagerly loaded entity instances and proxies, weird things happen. For example, suppose Bar has a lazy many-to-one reference to Foo (or a lazy-loaded collection of Foos); == will always be false even if the target entity of the proxy is the same as the non-proxy instance:

Foo foo = (Foo)session.get(Foo.class, 123);
Bar bar = (Bar)session.get(Bar.class, 321);

// always false since bar's Foo is a proxy
System.out.println(bar.getFoo() == foo);

// true for proper equals, always false for us
System.out.println(bar.getFoo().equals(foo));

The result is that data access implementation logic has bled up past the service tier and into the web tier; the web tier shouldn’t have to be aware of proxies but now has to be. If there’s any risk of comparing proxied and non-proxied instances, we have to use primary key equality to test instance equality instead of the much more natural equals.

What’s worse, we can’t mix persistent and new entity instances in collections that maintain uniqueness (Sets/Maps). A new instance can never be equal to a persistent instance even if every non-pk property value is the same, so we again have to deal with the data tier bleeding into other tiers and jump through hoops to get Set uniqueness and contains to work.


So here’s my proposed approach. Comparing two persistent instances using primary key is very efficient since primary keys are unique. This doesn’t help when comparing persistent and new instances though, and is inappropriate for usage in hashCode for the reasons discussed in the Hibernate post referenced above. Specifically, the hashCode will change from whatever you use before the instance is persisted to the primary key (or some value derived from it) afterwards. This will confuse hash-based collections since the instance will most likely be in the wrong hash bucket and cannot be found.

So equals compares primary keys if both are non-null. If either is null, then the class must define the fields that determine uniqueness and compare those.

hashCode always uses the same fields used in the equals comparison and never the primary key. If the entity is immutable or the fields that are used for the calculation are, then the hashCode could be cached to save the cost of unnecessary recalculation.

Most of this logic is in the base entity class, with abstract methods that force each entity class to define entity-specific behavior:

import java.io.Serializable;

import javax.persistence.Column;
import javax.persistence.MappedSuperclass;
import javax.persistence.Transient;
import javax.persistence.Version;

import org.apache.commons.lang.builder.HashCodeBuilder;
import org.hibernate.Hibernate;

/**
 * Abstract base class for entities.
 * @param <T> the type
 */
@MappedSuperclass
public abstract class BaseEntity<T extends BaseEntity<T>>
       extends ValueObject {

  private int _version;

  /**
   * Default.
   */
  protected BaseEntity() {
    // default
  }

  // optimistic locking version, private since only Hibernate uses these
  @Version @Column(name = "oplock", nullable = false)
  @SuppressWarnings("unused")
  private int getVersion() {
    return _version;
  }
  @SuppressWarnings("unused")
  private void setVersion(final int version) {
    _version = version;
  }

  /**
   * Utility method for <code>equals()</code> methods.
   * @param o1  one object
   * @param o2  another object
   * @return <code>true</code> if they're both <code>null</code> or both equal
   */
  protected boolean areEqual(final Object o1, final Object o2) {
    if (o1 == null) {
      if (o2 != null) {
        return false;
      }
    }
    else if (!o1.equals(o2)) {
      return false;
    }

    return true;
  }

  /**
   * Utility method for <code>equals()</code> methods.
   * @param s1  one string
   * @param s2  another string
   * @param ignoreCase if <code>false</code> do case-sensitive comparison
   * @return <code>true</code> if they're both <code>null</code> or both equal
   */
  protected boolean areEqual(
      final String s1,
      final String s2,
      final boolean ignoreCase) {
    // for use in custom equals() methods

    if (s1 == null && s2 == null) {
      return true;
    }

    if (s1 == null || s2 == null) {
      return false;
    }

    return ignoreCase ? s1.equalsIgnoreCase(s2) : s1.equals(s2);
  }

  /**
   * Utility method for <code>equals()</code> methods.
   * @param d1  one date
   * @param d2  another date
   * @return <code>true</code> if they're both <code>null</code> or both equal
   */
  protected boolean areEqual(
      final Date d1,
      final Date d2) {
    // for use in custom equals() methods

    if (d1 == null && d2 == null) {
      return true;
    }

    if (d1 == null || d2 == null) {
      return false;
    }

    return d1.getTime() == d2.getTime();
  }

  /**
   * Utility method for <code>equals()</code> methods.
   * @param f1  one float
   * @param f2  another float
   * @return <code>true</code> if they're equal
   */
  protected boolean areEqual(
      final float f1,
      final float f2) {
    // for use in custom equals() methods

    return Float.floatToIntBits(f1) == Float.floatToIntBits(f2);
  }

  /**
   * Utility method for <code>hashCode()</code> methods.
   * @param values  the values to use in calculation
   * @return  the hash code value
   */
  protected int calculateHashCode(final Object... values) {
    HashCodeBuilder builder = new HashCodeBuilder();
    for (Object value : values) {
      builder.append(value);
    }
    return builder.toHashCode();
  }

  /**
   * {@inheritDoc}
   */
  @Override
  public int hashCode() {
    return calculateHashCode(getHashCodeData());
  }

  /**
   * Get the data used to calculate hash code;
   * use getters not fields in case the instance is a proxy.
   * @return the data
   */
  @Transient
  protected abstract Object[] getHashCodeData();

  /**
   * Allows type-specific "this".
   * @return  this
   */
  @Transient
  protected abstract T getThis();

  /**
   * Get the primary key.
   * @return  the pk, or <code>null</code> if not set or if not supported
   */
  @Transient
  public abstract Serializable getPk();

  /**
   * {@inheritDoc}
   */
  @SuppressWarnings("unchecked")
  @Override
  public final boolean equals(final Object other) {
    if (this == other) {
      return true;
    }

    if (other == null
        || // looks into the target class of a proxy if necessary
        !Hibernate.getClass(other).equals(Hibernate.getClass(this))) {
      return false;
    }

    // if pks are both set, compare
    if (getPk() != null) {
      Serializable otherPk = ((BaseEntity)other).getPk();
      if (otherPk != null) {
        return getPk().equals(otherPk);
      }
    }

    return dataEquals((T)other);
  }

  /**
   * Compare data only; null, class, and pk have been checked.
   * @param other the other instance
   * @return <code>true</code> if equal
   */
  protected abstract boolean dataEquals(T other);
}

A concrete subclass must implement getHashCodeData and dataEquals, for example using the username property for a User class:

/**
 * {@inheritDoc}
 */
@Override
protected boolean dataEquals(final User other) {
  return areEqual(getUsername(), other.getUsername(), true);
}

/**
 * {@inheritDoc}
 */
@Override
@Transient
protected Object[] getHashCodeData() {
  return new Object[] { getUsername() };
}

This has worked well so far. It’s consistent, even when an instance transitions from new to persistent and its primary key becomes non-null. The data used for hashCode calculation never changes, so that’s consistent. The data for equals does change, but it stays consistent – comparing two persistent entities or mixing persistent and new returns the correct result in all cases since non-pk data is used for comparison.

Note that equals is final but hashCode isn’t to allow caching.


My original implementation of this was way too simplistic, just using the Jakarta Commons Lang reflection methods that I described here. This approach is great for regular value objects and DTOs, but is way too expensive for Hibernate entities since every field is used, even collections and the primary key. Lazy-loaded collections will be retrieved from the database to calculate hashCode and equals, which is almost always inappropriate.

So I worked on a slightly smarter approach that skipped the primary key and collections but still used reflection for the rest of the fields. This is still too expensive, and breaks when using proxies since the fields will be null – the getter methods delegate to the data that’s in the proxy’s target instance.

Eventually I ended up with the current implementation, which has lasted for a while with no issues.




  1. Ironically, The Decidertm was asked recently why this choice was made and he couldn’t defend it, but was sure that there was a good reason for it and wasn’t willing to change anything even given the implications of the decision.[back]

15 Responses to “Hibernate anti-pattern #1 – hashCode/equals”

  1. Peter Lawrey says:

    A more cryptic by shorter way to test equality is the following.

    return (o1 == o2) || (o1 != null && o1.equals(o2));

  2. Mike Funk says:

    Many folks, for valid technical reasons, assign guids in their entity constructors and don’t use lazy loading. Web services and Swing apps come to mind, even web apps that prefer to use detached objects. Never assume lazy loading is the ideal object graph navigation solution. Remember only one rule: every app is, indeed, unique in its requirements.

  3. Maulin Shah says:

    I really like your idea, thanks a lot. The one thing that I dn’t like is that there is reliance in this model on Hibernate. And there really doesn’t have to be. The only reason you use it is to handle the proxy issue. But with generics, you can actually get the “real” class of the object at runtime without the help of hibernate.

    @SuppressWarnings(“unchecked”)
    public Class entityClass() {
    ParameterizedType thisType = (ParameterizedType) getClass().getGenericSuperclass();
    return (Class) thisType.getActualTypeArguments()[0];
    }

    and then modify you equals:

    @SuppressWarnings(“unchecked”)
    @Override
    public final boolean equals(final Object other) {
    if (this == other) {
    return true;
    }

    if (other == null) {
    return false;
    }

    if (!entityClass().isAssignableFrom(other.getClass())) {
    return false;
    }

    // if pks are both set, compare
    if (getPk() != null) {
    Serializable otherPk = ((BaseEntity)other).getPk();
    if (otherPk != null) {
    return getPk().equals(otherPk);
    }
    }
    }

    return dataEquals((T) other); //we can cast this because of the class checking above
    }

  4. Maulin Shah says:

    sorry — you lose all the parameterization in my previous post! <T> was filtered out…

  5. Maulin Shah says:

    Er… that had some bugs as well…. but you get the idea.

  6. Chris Messina says:

    What about cases where your fields are not immutable? Wouldn’t this break the relationship between equals and hashcode? Maybe I am wrong but this will ONLY be valid when the objects are immutable. I know you mention that if the entity or fields are immutable that the hashcode could be cached, but I would think it would not only be uncacheable, but also invalid otherwise.

    example, you have two instances of the same entity which has already been saved and given an id. You change a field (lets say name) in one of the instances. equals will return true, but hashcode will be different (which will cause issues when storing in a collection).

    instance A: id = 1, name=”A”
    instance B: id = 1, name=”B”

    A.equals(B) returns true, but A.hashcode() != B.hashcode().

  7. Burt says:

    @Chris
    You’re right that this is a potential problem, but this is Hibernate after all, and you’re not going to have two different instances of a persistent entity with the same pk. It happens, unfortunately (especially mixing lazy-loaded proxies with regular instances) but it’s not the normal behavior.

  8. @Maulin

    You can actually “tunnel” the classes of proxied entities by having a new method that delegates to getClass() rather than using generic subclassing and reflection:

    public Class getRealClass() { getClass(); }

  9. Jan says:

    @Burt
    You could also use only id object in the hashCode calculation when present this will fix the problem as noticed by Chris.

  10. lucapette says:

    Fine solution. Thank you for sharing it, this post saves me a lot of typing.

  11. lucapette says:

    And many problems too. :)

  12. The lack of equals and hashCode implementations promotes the same issues as when using Java/JPA/Hibernate – they need to be defined on the “natural key” of the entity as the author noted.

    My vote is that GORM should always require you to declare a natural key as well as the object’s primary key – just as you would by convention with JPA/Hibernate and Java.

  13. Thomas Haines says:

    I had been struggling with this issue for a while – I found your solution very useful, thanks ))

  14. Octavian says:

    Hi guys,

    Interesting (and complex :) ) general approach!

    What if I have an entity that does not have a natural key? E.g. some representation of an index card:

    Icard {
    String id;
    String title;
    String side1;
    }

    In my case, I want to be able to change both the title and the side1 information and none of them should be unique…

    /Tavi

  15. Wim Ockerman says:

    We just take a UUID for the primary key and don’t leave it to hibernate to assign it. So we assign it during object construction. This way you can have a simple equals/hashcode just over the UUID id.

Leave a Reply

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.