Implementing the hashCode() Method – Object Comparison
14.3 Implementing the hashCode() Method
Hashing is an efficient technique for storing and retrieving data. A common hashing scheme uses an array where each element is a list of items. The array elements are called buckets. Operations in a hashing scheme involve computing an array index from an item. Converting an item to its array index is done by a hash function. The array index returned by the hash function is called the hash code of the item— it is also called the hash value of the item. The hash code identifies a particular bucket.
Storing an item involves the following steps:
Hash the item to determine the bucket.
If the item does not match an item already in the bucket, it is stored in the bucket.
Note that no duplicate items are stored. A lookup for an item is also a two-step process:
Hash the item to determine the bucket.
If the item matches an item in the bucket, the item is present; otherwise, it is not.
Different items can hash to the same bucket, meaning that the hash function returns the same hash code for these items. This condition is called a collision. The list maintained by a bucket contains the items that hash to the bucket.
The hash code of an item only identifies the bucket. Finding an item in the bucket entails a search, and requires an equality function to compare items. The items maintained in a hash-based storage scheme must, therefore, provide two essential functions: a hash function and an equality function.
The performance of a hashing scheme is largely affected by how well the hash function distributes a collection of items over the available buckets. A hash function should not be biased toward any particular hash codes. An ideal hash function produces a uniform distribution of hash codes for a collection of items across all possible hash codes. Such a hash function is not an easy task to design. Fortunately, heuristics exist for constructing adequate hash functions.
Here we mention two abstract data types that can be implemented using a hashing scheme:
- A set is an abstract data type that stores an unordered collection of unique items. Items in a set are called elements and the element to search for in a set is referred to as the search key. The hashing is used to store and lookup an item in a set, as explained earlier in this section.
- A map (also called, a hash table) is an abstract data type that maintains its items as key–value entries. An entry associates a key with a value. The keys in a hash table are unique. The hashing is done on a search key to provide efficient lookup of the entry with the associated value. Matching the search key with a key in an entry determines the value.
If objects of a class are to be maintained in hash-based collections and maps provided by the java.util package (see Table 15.2, p. 788), the class must provide appropriate implementations of the following methods from the Object class:
- A hashCode() method that produces hash codes for the objects
- An equals() method that tests objects for object value equality
int hashCode()
When storing objects in hash tables, this method can be used to get a hash code for an object. This method tries to return distinct integers for distinct objects as their default hash code. The hashCode() method is usually overridden by a class, as in the case of the wrapper classes and the String class.
As a general rule for implementing these methods, a class that overrides the equals() method must override the hashCode() method. Consequences of not doing so are illustrated in Example 14.5 using the class UsableVNO that does not override the hash-Code() method. Elements of this class are used as elements in a set that uses the hashCode() method of its elements to maintain them in the set. The output from the program shows that a set with the following elements is created:
Set: [(8.19.81), (2.48.28), (3.49.1), (9.1.1), (10.23.78)]
The hashCode() method from the Object class is not overridden by the UsableVNO class and is, therefore, used to compute the hash codes of the elements. This method returns the memory address of the object as the default hash code. The attempt to find the search key (9.1.1) in the set is unsuccessful:
Search key (9.1.1) contained in set: false
The output from the program shows the hash codes assigned by this method to the search key and the elements in the set:
Search key (9.1.1) has hash code: 2036368507
Hash codes for the elements:
(3.49.1): 1705929636
(8.19.81): 1221555852
(2.48.28): 1509514333
(10.23.78): 1556956098
(9.1.1): 1252585652
The hash codes of two objects, which are equal according to the equals() method of the class UsableVNO, are not equal according to the hashCode() method of the Object class. Therefore, the version number (9.1.1) in the set has a different hash code than the search key (9.1.1). These objects hash to different buckets. The lookup for the search key is done in one bucket and does not find the element (9.1.1), which is to be found in a completely different bucket. Just overriding the equals() method is not enough. The class UsableVNO violates the key tenet of the hashCode() contract: Equal objects must produce equal hash codes.
Example 14.5 Implications of Not Overriding the Object.hashCode() Method
import static java.lang.System.out;
import java.util.*;
public class TestUsableVNO2 {
public static void main(String[] args) {
// Print name of version number class:
out.println(UsableVNO.class);
// An array of version numbers.
UsableVNO[] versions = new UsableVNO[] { // (1)
new UsableVNO( 3,49, 1), new UsableVNO( 8,19,81),
new UsableVNO( 2,48,28), new UsableVNO(10,23,78),
new UsableVNO( 9, 1, 1)};
// Search key:
UsableVNO searchKey = new UsableVNO(9,1,1); // (2)
// Create a list:
List<UsableVNO> vnoList = Arrays.asList(versions); // (3)
// Searching in a set:
Set<UsableVNO> vnoSet = new HashSet<>(vnoList); // (4)
out.println(“Set: ” + vnoSet);
out.printf(“Search key %s contained in set: %s%n”, searchKey,
vnoSet.contains(searchKey)); // (5)
out.println();
// Search key and its hash code:
out.printf(“Search key %s has hash code: %d%n”, searchKey,
searchKey.hashCode()); // (6)
// Hash values for elements:
out.println(“Hash codes for the elements:”);
for (UsableVNO element : versions) { // (7)
out.printf(” %10s: %s%n”, element, element.hashCode());
}
}
}
Output from the program:
class UsableVNO
Set: [(8.19.81), (2.48.28), (3.49.1), (9.1.1), (10.23.78)]
Search key (9.1.1) has hash code: 2036368507
Hash codes for the elements:
(3.49.1): 1705929636
(8.19.81): 1221555852
(2.48.28): 1509514333
(10.23.78): 1556956098
(9.1.1): 1252585652