Equality And Hashcodes

The equals() and hashCode() methods belong to the Object class, so every object inherits them. However, toString() is usually the only Object method that gets mentioned; the other two rarely come up outside certification material. In fact, they are important, and every Java programmer should understand them.

Before Equality

Consider a simple BankAccount object:


  public class BankAccount {
    // attributes ...

    public BankAccount(int acctNo, String name, double balance) {
      // set attributes
    }

    // get/set methods
  }

To check whether two bank accounts are the same, we can use


    if (account1 == account2)

But this only works if they are the exact same object, because == compares references. In practice, two accounts should count as the same if they have the same acctNo (perhaps two get calls were executed to retrieve the same account from the database). To handle this, we say that two accounts are identical if they have the same acctNo.


    if (account1.getAcctNo() == account2.getAcctNo())

This is rather un-object-oriented: by design, the BankAccount class should be responsible for knowing whether another instance is the same as itself. It is also impractical: if the BankAccount’s “key” changes (e.g. two accounts are identical only if they are from the same bank and have the same acctNo), the comparison becomes more complicated (and bug-prone), and you have to change code everywhere you compare these objects.
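The difference between the two comparisons can be seen in a minimal sketch. The BankAccount here is a stripped-down, hypothetical version with only an account number; the class and variable names are chosen for illustration.

```java
// Minimal sketch: == compares references, not keys.
class IdentityDemo {
    // Hypothetical cut-down BankAccount holding only the key.
    static class BankAccount {
        private final int acctNo;

        BankAccount(int acctNo) {
            this.acctNo = acctNo;
        }

        int getAcctNo() {
            return acctNo;
        }
    }

    public static void main(String[] args) {
        BankAccount account1 = new BankAccount(100);
        BankAccount account2 = new BankAccount(100); // e.g. a second fetch of the same record
        BankAccount alias = account1;                // another reference to the same object

        System.out.println(account1 == alias);       // true: same object
        System.out.println(account1 == account2);    // false: two distinct objects
        System.out.println(account1.getAcctNo() == account2.getAcctNo()); // true: same key
    }
}
```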

The Graceful Solution

Have BankAccount override the equals() method, whose job is to check whether two objects are equivalent.


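  public class BankAccount {
    // as above...

    @Override
    public boolean equals(Object o) {
      // equals() should return false, not throw, for null or other types
      if (!(o instanceof BankAccount)) return false;
      BankAccount acct = (BankAccount) o;
      return getAcctNo() == acct.getAcctNo();
    }
  }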

The check is now performed within the object (encapsulation), and the users are shielded from how to check for equality. Note that the getAcctNo() method is used instead of directly accessing the private member: this is good practice as the accessor method may perform other tasks such as calculation to return the correct value. Now to compare two BankAccounts, simply call


    if (acct1.equals(acct2))

If the BankAccount’s key later changes to include the Bank, only the BankAccount class needs to change:


  public class BankAccount {
    private Bank bank;
    // as above...

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof BankAccount)) return false;
      BankAccount acct = (BankAccount) o;
      // assumes getBank() never returns null
      return getBank().equals(acct.getBank()) &&
        (getAcctNo() == acct.getAcctNo());
    }
  }

All the callers still call .equals() unchanged and get the correct result. Notice that .equals() is used again to test the equality of the Bank, so the Bank class should also override equals().
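Putting the pieces together, a self-contained sketch of the composite key might look like this. The Bank class is not defined in this article, so the version here (identified by a bank code string) is purely an assumption for illustration. hashCode() is overridden alongside equals() for the reasons covered in the Hashcodes section.

```java
import java.util.Objects;

class CompositeKeyDemo {
    // Hypothetical Bank, identified by an assumed "code" field.
    static class Bank {
        private final String code;

        Bank(String code) {
            this.code = code;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Bank)) return false; // also false for null
            return code.equals(((Bank) o).code);
        }

        @Override
        public int hashCode() {
            return code.hashCode();
        }
    }

    static class BankAccount {
        private final Bank bank;
        private final int acctNo;

        BankAccount(Bank bank, int acctNo) {
            this.bank = bank;
            this.acctNo = acctNo;
        }

        Bank getBank() { return bank; }
        int getAcctNo() { return acctNo; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof BankAccount)) return false;
            BankAccount other = (BankAccount) o;
            // Objects.equals tolerates a null bank on either side
            return Objects.equals(getBank(), other.getBank())
                && getAcctNo() == other.getAcctNo();
        }

        @Override
        public int hashCode() {
            return Objects.hash(bank, acctNo);
        }
    }

    public static void main(String[] args) {
        BankAccount a = new BankAccount(new Bank("ABC"), 100);
        BankAccount b = new BankAccount(new Bank("ABC"), 100);
        BankAccount c = new BankAccount(new Bank("XYZ"), 100);

        System.out.println(a.equals(b));     // true: same bank, same acctNo
        System.out.println(a.equals(c));     // false: different bank
        System.out.println(a.equals(null));  // false, no exception
        System.out.println(a.equals("100")); // false, no ClassCastException
    }
}
```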

Using equals() in Collections

equals() becomes even more useful when these objects are stored in collections. The traditional way to check whether an object exists in a collection is


  ArrayList list = new ArrayList();
  BankAccount acct = new BankAccount(100, "John Smith", 0);
  BankAccount acct2 = new BankAccount(101, "Foo Bar", 1000);
  list.add(acct);
  list.add(acct2);

  int acctNo = 100;
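  for (int i = 0; i < list.size(); i++) {
    BankAccount a = (BankAccount) list.get(i);
    if (a.getAcctNo() == acctNo) {
      System.out.println("Account Found.");
      break;
    }
  }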

Once the equals method is implemented correctly, you can change it to the following:


  ArrayList list = new ArrayList();
  BankAccount acct = new BankAccount(100, "John Smith", 0);
  BankAccount acct2 = new BankAccount(101, "Foo Bar", 1000);
  list.add(acct);
  list.add(acct2);

  // assuming constructor with acctNo only exists.
  BankAccount newAcct = new BankAccount(100);
  if (list.contains(newAcct))
    System.out.println("Account Found.");

This code prints "Account Found." because a BankAccount with the account number 100 already exists in the list. The contains() method uses equals() to compare objects, so instead of coding your own loop to search for an object, you can use this method. The advantages: less code (no loop), fewer bugs (less code), and easier reading (contains is more meaningful than a search loop).

The remove(Object) method of the collection classes also uses equals() to decide which object to remove. Be warned, therefore, that if you abuse equals() (use it for other purposes, or always return true), core API methods that depend on it may behave erratically.
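As a rough illustration of both methods relying on equals(), here is a sketch using a generic List. The BankAccount below is a simplified stand-in keyed on acctNo only, and the sampleList helper is a name made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class ContainsRemoveDemo {
    // Simplified stand-in: equality is based on acctNo only.
    static class BankAccount {
        private final int acctNo;
        private final String name;

        BankAccount(int acctNo, String name) {
            this.acctNo = acctNo;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof BankAccount)) return false;
            return acctNo == ((BankAccount) o).acctNo;
        }

        @Override
        public int hashCode() {
            return acctNo;
        }
    }

    // Hypothetical helper that builds the two-account list used in the text.
    static List<BankAccount> sampleList() {
        List<BankAccount> list = new ArrayList<>();
        list.add(new BankAccount(100, "John Smith"));
        list.add(new BankAccount(101, "Foo Bar"));
        return list;
    }

    public static void main(String[] args) {
        List<BankAccount> list = sampleList();
        BankAccount probe = new BankAccount(100, null); // only the key matters

        System.out.println(list.contains(probe)); // true: found via equals()
        System.out.println(list.remove(probe));   // true: removes the equal element
        System.out.println(list.size());          // 1
        System.out.println(list.contains(probe)); // false: it is gone now
    }
}
```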

Workaround to retrieve equivalent objects from Lists

So now you know that the object exists. How do you retrieve it? There is no direct get method that accepts an Object parameter, because theoretically you already have the equivalent object. You could use the "traditional" method of looping and key checking. With the following workaround, however, you can do keyed gets from a List by way of the equals method:


  BankAccount johnAccount = new BankAccount(100);
  // the list is untyped, so get() returns Object and needs a cast
  johnAccount = (BankAccount) list.get(list.indexOf(johnAccount));

The indexOf method returns the index of the object within the List, based on the equals() method. Given the index, you can retrieve the equivalent object using the get method, thereby obtaining the object from its key. Of course, you should first make sure the object exists in the List by checking whether indexOf returns -1 before attempting the get.
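A guarded version of this lookup could be sketched as follows. The findEquivalent helper and its choice to return null when nothing matches are assumptions made for this example, not part of any standard API.

```java
import java.util.ArrayList;
import java.util.List;

class IndexOfDemo {
    // Simplified stand-in: equality is based on acctNo only.
    static class BankAccount {
        private final int acctNo;
        private final String name;

        BankAccount(int acctNo, String name) {
            this.acctNo = acctNo;
            this.name = name;
        }

        // Key-only constructor, as assumed in the text.
        BankAccount(int acctNo) {
            this(acctNo, null);
        }

        String getName() { return name; }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof BankAccount)) return false;
            return acctNo == ((BankAccount) o).acctNo;
        }

        @Override
        public int hashCode() {
            return acctNo;
        }
    }

    // Hypothetical helper: returns the stored object equal to probe, or null.
    static BankAccount findEquivalent(List<BankAccount> list, BankAccount probe) {
        int index = list.indexOf(probe); // -1 when no element equals probe
        return index == -1 ? null : list.get(index);
    }

    public static void main(String[] args) {
        List<BankAccount> list = new ArrayList<>();
        list.add(new BankAccount(100, "John Smith"));
        list.add(new BankAccount(101, "Foo Bar"));

        BankAccount john = findEquivalent(list, new BankAccount(100));
        System.out.println(john.getName());                              // John Smith
        System.out.println(findEquivalent(list, new BankAccount(999)));  // null
    }
}
```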

Side note: list searches are linear, so if the list is large and you search by key frequently, performance may become an issue. In that case, consider an alternative collection, such as a sorted collection or a hash table, instead of this "workaround".
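One possible alternative, sketched below, is to index the accounts in a HashMap keyed by account number, which avoids the linear scan. This assumes the account number alone identifies an account; the indexByAcctNo helper is a name invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

class KeyedLookupDemo {
    // Simplified stand-in for the BankAccount in the text.
    static class BankAccount {
        private final int acctNo;
        private final String name;

        BankAccount(int acctNo, String name) {
            this.acctNo = acctNo;
            this.name = name;
        }

        int getAcctNo() { return acctNo; }
        String getName() { return name; }
    }

    // Hypothetical helper: builds an index from account number to account.
    static Map<Integer, BankAccount> indexByAcctNo(BankAccount... accounts) {
        Map<Integer, BankAccount> byAcctNo = new HashMap<>();
        for (BankAccount account : accounts) {
            byAcctNo.put(account.getAcctNo(), account);
        }
        return byAcctNo;
    }

    public static void main(String[] args) {
        Map<Integer, BankAccount> byAcctNo = indexByAcctNo(
            new BankAccount(100, "John Smith"),
            new BankAccount(101, "Foo Bar"));

        BankAccount john = byAcctNo.get(100);          // hashed lookup, no loop
        System.out.println(john.getName());            // John Smith
        System.out.println(byAcctNo.containsKey(999)); // false
    }
}
```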

Hashcodes

To follow this section, I assume that you not only know how to work with Hashtables but also know how they are implemented (and have hopefully coded one yourself). If you do not need to use hashtables, or have yet to understand hashing, you may skip this section.

In short, a hash table needs to decide which "bucket" an added object goes into, so that the object can be retrieved from the same "bucket" later. It does this by calling the object's hashCode() method, which is inherited from the Object class. Quoted from the Hashtable API:

To successfully store and retrieve objects from a hashtable, the objects used as keys must implement the hashCode method and the equals method.

Doing so ensures that even if you have different instances that are identical according to equals(), you can retrieve the same value from the hashtable. The hashCode method you implement should return the same integer for identical objects (whenever .equals() returns true), and multiple invocations on the same unmodified object should yield the same integer. At the other extreme, always returning 1 as the hashCode satisfies these rules but is not useful at all: it causes all your objects to be hashed into the same single bucket, defeating the purpose of a hashtable.

Suppose that the BankAccount class overrides equals() but not hashCode(). Check this code out:


  Hashtable accounts = new Hashtable();
  BankAccount acct = new BankAccount(100, "John Smith", 0);
  Investment investment = new Investment(...);
  accounts.put(acct, investment); // map bank account to investment

  System.out.println(accounts.get(acct));
  BankAccount newAcct = new BankAccount(100);
  System.out.println(accounts.get(newAcct));

What happens is that the first println still retrieves the Investment, since the default hashCode returns the same number across multiple invocations on the same object. The second instance, however, will in practice return a different number, so the second println prints null. Only when hashCode() is implemented properly will the second println produce the Investment object as well.
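The behaviour can be reproduced with a small self-contained sketch. The two account classes below are hypothetical stand-ins (one overrides only equals(), the other overrides both methods), and plain strings take the place of the Investment objects.

```java
import java.util.Hashtable;

class HashCodeDemo {
    // Overrides equals() only; hashCode() is deliberately left as the default.
    static class EqualsOnlyAccount {
        final int acctNo;

        EqualsOnlyAccount(int acctNo) {
            this.acctNo = acctNo;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof EqualsOnlyAccount)) return false;
            return acctNo == ((EqualsOnlyAccount) o).acctNo;
        }
    }

    // Overrides both equals() and hashCode(), based on the same key.
    static class HashedAccount {
        final int acctNo;

        HashedAccount(int acctNo) {
            this.acctNo = acctNo;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof HashedAccount)) return false;
            return acctNo == ((HashedAccount) o).acctNo;
        }

        @Override
        public int hashCode() {
            return acctNo;
        }
    }

    public static void main(String[] args) {
        Hashtable<Object, String> accounts = new Hashtable<>();

        EqualsOnlyAccount acct = new EqualsOnlyAccount(100);
        accounts.put(acct, "investment A");
        System.out.println(accounts.get(acct)); // investment A (same instance)
        // Almost always null: the new instance has its own default hash code.
        System.out.println(accounts.get(new EqualsOnlyAccount(100)));

        accounts.put(new HashedAccount(100), "investment B");
        // Found, because equal objects now produce equal hash codes.
        System.out.println(accounts.get(new HashedAccount(100))); // investment B
    }
}
```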

The general approach to a good hashCode is to use your object key as the hashCode. For example, objects with integer keys may return the int directly as a hashcode. For Strings, the String object provides a reasonable hashCode that can be used. For composite keys you may try to combine their hashCode values to form a new hashCode.


  // Example of integer key
  private int acctNo;
  public int hashCode() {
    return acctNo;
  }

  // Example of String key
  private String name;
  public int hashCode() {
    return name.hashCode();
  }

  // Example of composite key
  private Bank bank;
  private int acctNo;
  public int hashCode() {
    return bank.hashCode() + acctNo;
  }

Note that it is not necessary to return distinct integers for different objects. It is sufficient to maintain a spread in numbers so that the hashtable can work efficiently.
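To illustrate that colliding hash codes are harmless as long as equals() is correct, here is a sketch in which two different accounts deliberately share a hash code. An int bankId is used as a hypothetical stand-in for the Bank object so that the collision is easy to see.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    static class BankAccount {
        private final int bankId; // hypothetical stand-in for Bank
        private final int acctNo;

        BankAccount(int bankId, int acctNo) {
            this.bankId = bankId;
            this.acctNo = acctNo;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof BankAccount)) return false;
            BankAccount other = (BankAccount) o;
            return bankId == other.bankId && acctNo == other.acctNo;
        }

        @Override
        public int hashCode() {
            // Same additive scheme as the composite key example above.
            return bankId + acctNo;
        }
    }

    public static void main(String[] args) {
        BankAccount a = new BankAccount(1, 2); // hash code 3
        BankAccount b = new BankAccount(2, 1); // hash code 3 as well

        System.out.println(a.hashCode() == b.hashCode()); // true: a collision
        System.out.println(a.equals(b));                  // false: different keys

        Map<BankAccount, String> map = new HashMap<>();
        map.put(a, "first");
        map.put(b, "second");

        System.out.println(map.size());                   // 2: both entries kept
        System.out.println(map.get(new BankAccount(2, 1))); // second
    }
}
```

The map falls back on equals() to tell the two keys apart inside the shared bucket, which is why a collision costs some speed but not correctness.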
