Set vs HashSet

After studying Set, HashSet and TreeSet individually, let us summarize how they relate, starting with Set and HashSet. The two are closely related: Set is an interface and HashSet is a class that implements it, so HashSet has all the properties of Set. The most frequently used implementations of Set are HashSet and TreeSet.

A comparison of HashSet and TreeSet is available at HashSet vs TreeSet.

Interface Set

The defining feature of the Set interface is that it does not allow duplicates; it stores only unique elements. If a duplicate is added, add() returns false and the set is left unchanged: the existing element is kept, and there is no compilation error or exception. Whether null is permitted depends on the implementation. HashSet accepts a single null element, while TreeSet with natural ordering throws NullPointerException when null is added. The uniqueness guarantee holds for every implementation, including HashSet and TreeSet.
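
The behavior of a duplicate add can be checked with a short sketch. The class and method names here (SetDuplicateDemo, secondAddResult) are illustrative and not part of the original tutorial; the sketch uses HashSet because it accepts null.

```java
import java.util.HashSet;
import java.util.Set;

class SetDuplicateDemo {
    // Adds the same value twice and returns what the second add() reported.
    static boolean secondAddResult(Set<String> set, String value) {
        set.add(value);
        return set.add(value); // false: the value is already present
    }

    public static void main(String[] args) {
        Set<String> names = new HashSet<>();
        System.out.println(secondAddResult(names, "alpha")); // false
        System.out.println(secondAddResult(names, null));    // false, null is stored once
        System.out.println(names.size());                    // 2
    }
}
```

The return value of add() is the usual way to detect that an element was already present.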

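Set and HashSet are part of the collections framework: Set extends the Collection interface and HashSet implements Set. The Set interface itself does not prescribe a lookup algorithm. HashSet uses hashing to look up (search for) an element, while TreeSet keeps its elements in a sorted tree.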
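
Because HashSet and TreeSet share the Set interface, code written against Set works with either one. The following is a minimal sketch of that idea; SetInterfaceDemo and countDistinct are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetInterfaceDemo {
    // Depends only on the Set interface, so any implementation can be passed in.
    static int countDistinct(Set<String> target, List<String> words) {
        target.addAll(words);
        return target.size();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "apple", "pear", "fig");

        System.out.println(countDistinct(new HashSet<>(), words)); // 3

        Set<String> sorted = new TreeSet<>();
        System.out.println(countDistinct(sorted, words)); // 3
        System.out.println(sorted); // [apple, fig, pear], TreeSet iterates in sorted order
    }
}
```

HashSet makes no promise about iteration order, so only the TreeSet contents are printed here.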

The list of methods of the Set interface is available at interface Set Tutorial, and a study of HashSet at HashSet Tutorial.

class HashSet

HashSet is an implementation of Set, so it has all the properties of Set discussed earlier. For better performance, a HashSet can be created with an explicit initial capacity and load factor. The initial capacity is the number of buckets allocated when the set is created (16 by default). The load factor (0.75 by default) is the fill ratio at which the capacity is increased: once the number of elements exceeds the capacity multiplied by the load factor, the table is enlarged and the elements are rehashed.
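
As a rough sketch of how the two constructor arguments might be chosen, the helper below sizes a HashSet so that an expected number of elements fits without a resize. The names HashSetSizingDemo, capacityFor and withExpectedSize are assumptions made for this example, not library methods.

```java
import java.util.HashSet;
import java.util.Set;

class HashSetSizingDemo {
    // Smallest capacity whose resize threshold (capacity * loadFactor)
    // is not exceeded by the expected number of elements.
    static int capacityFor(int expectedElements, float loadFactor) {
        return (int) Math.ceil(expectedElements / loadFactor);
    }

    // Creates an empty set pre-sized for the expected number of elements.
    static Set<Integer> withExpectedSize(int expectedElements) {
        float loadFactor = 0.75f; // same value as the HashSet default
        return new HashSet<>(capacityFor(expectedElements, loadFactor), loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(capacityFor(1000, 0.75f)); // 1334
        Set<Integer> ids = withExpectedSize(1000);
        for (int i = 0; i < 1000; i++) {
            ids.add(i);
        }
        System.out.println(ids.size()); // 1000
    }
}
```

Pre-sizing only affects performance; the set behaves the same whatever capacity it starts with.
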

Like HashMap, HashSet is not synchronized. A synchronized view of a HashSet can be obtained as follows:

// String is only an illustrative element type
HashSet<String> hs = new HashSet<>();
Set<String> mySet = Collections.synchronizedSet(hs);

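Now the methods of mySet are synchronized, but those of hs are not. mySet is only a wrapper backed by hs, so both refer to the same elements. In a multithreaded environment all access must go through mySet, because any direct use of hs bypasses the synchronization. Iterating over mySet must additionally be done inside a synchronized (mySet) block.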
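
A minimal sketch of using the synchronized wrapper from two threads is given below. SynchronizedSetDemo and addFromTwoThreads are hypothetical names, and the raw HashSet is never exposed, so every access goes through the wrapper.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class SynchronizedSetDemo {
    // Two threads add disjoint ranges of numbers; returns how many elements were stored.
    static int addFromTwoThreads(int perThread) {
        // Only the wrapper is kept, so no code can reach the unsynchronized HashSet.
        Set<Integer> shared = Collections.synchronizedSet(new HashSet<>());

        Thread first = new Thread(() -> {
            for (int i = 0; i < perThread; i++) shared.add(i);
        });
        Thread second = new Thread(() -> {
            for (int i = perThread; i < 2 * perThread; i++) shared.add(i);
        });
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }

        int count = 0;
        // Iteration is not atomic, so the caller locks on the wrapper itself.
        synchronized (shared) {
            for (Integer ignored : shared) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(addFromTwoThreads(1000)); // 2000
    }
}
```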

HashSet implements all the methods of the Collection and Set interfaces and, being Cloneable, also provides the following method.

Object clone() : Returns a shallow copy of the HashSet, that is, a new HashSet containing the same elements as the original; the elements themselves are not cloned.
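
The following sketch shows that the clone is an independent set: adding to the copy does not affect the original. HashSetCloneDemo and copyOf are illustrative names. The cast is needed because clone() is declared to return Object.

```java
import java.util.HashSet;

class HashSetCloneDemo {
    // clone() returns Object, so the result has to be cast back to HashSet<String>.
    @SuppressWarnings("unchecked")
    static HashSet<String> copyOf(HashSet<String> original) {
        return (HashSet<String>) original.clone();
    }

    public static void main(String[] args) {
        HashSet<String> original = new HashSet<>();
        original.add("a");

        HashSet<String> copy = copyOf(original);
        copy.add("b"); // changes the copy only

        System.out.println(original.size()); // 1
        System.out.println(copy.size());     // 2
    }
}
```

The copy constructor, new HashSet<>(original), gives the same result without a cast.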

Programs on HashSet are available at HashSet General.

About Hashing and Hashcode

Comparing two strings character by character in a loop takes time. To speed up lookups, the hashCode() method reduces each string to an integer called its hash code. Comparing integers is fast, so the hash code serves as a quick first check when searching for an element or detecting duplicates. Equal strings always have the same hash code, and different strings usually, but not always, have different hash codes. When two different strings share a hash code (a collision), equals() makes the final decision.
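
A small sketch of a hash code collision follows. The strings "Aa" and "BB" are a well-known pair that produce the same String hash code; the class and method names are illustrative only.

```java
import java.util.HashSet;
import java.util.Set;

class HashCollisionDemo {
    static boolean sameHashCode(String a, String b) {
        return a.hashCode() == b.hashCode();
    }

    // Number of distinct elements a HashSet keeps for the given values.
    static int distinctCount(String... values) {
        Set<String> set = new HashSet<>();
        for (String value : values) {
            set.add(value);
        }
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sameHashCode("Aa", "BB"));  // true: both hash to 2112
        System.out.println("Aa".equals("BB"));         // false
        System.out.println(distinctCount("Aa", "BB")); // 2: equals() tells them apart
    }
}
```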

Hashing is the process of converting a string or any other object into a 32-bit integer hash code. hashCode() is used in combination with the equals() method: two objects that are equal according to equals() must return the same hash code, but two objects with the same hash code are not necessarily equal. A class that overrides equals() must therefore override hashCode() as well. In hash-based collections, hashing is done implicitly inside basic operations such as add(), contains() and remove(). It pays off most with large sets, where an element can be located without comparing it against every other element.
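
The pairing of equals() and hashCode() can be sketched with a small value class. GridPoint and EqualsHashCodeDemo are hypothetical classes written for this example. Because both methods are overridden consistently, a HashSet finds an equal object even though it is a different instance.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof GridPoint)) return false;
        GridPoint that = (GridPoint) other;
        return x == that.x && y == that.y;
    }

    // Built from the same fields as equals(), so equal points get equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

class EqualsHashCodeDemo {
    static boolean containsEqualPoint() {
        Set<GridPoint> points = new HashSet<>();
        points.add(new GridPoint(1, 2));
        return points.contains(new GridPoint(1, 2)); // a different but equal instance
    }

    public static void main(String[] args) {
        System.out.println(containsEqualPoint()); // true
    }
}
```

If hashCode() were left at the Object default, the lookup would most likely fail, because the two instances would usually land in different buckets.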
