Collections Framework
Introduction to the Java Collections Framework, covering ArrayList, LinkedList, HashMap, HashSet, and other essential data structures.
HashSet in Java
What is a HashSet?
A HashSet
is a class in the Java Collections Framework that implements the Set
interface. It represents a collection of elements that contains no duplicate elements. More formally, sets contain no pair of elements e1
and e2
such that e1.equals(e2)
, and at most one null element.
HashSet
is backed by a HashMap
instance. Essentially, the keys of the HashMap
are the elements of the HashSet
, and the values are dummy objects (usually just PRESENT
, a private static final Object
). This design leverages the HashMap's efficient lookup capabilities.
Key Characteristics of HashSet
:
- Unordered:
HashSet
does not guarantee any particular order of elements. The order can change over time as elements are added or removed. If you need a specific order, consider usingLinkedHashSet
orTreeSet
. - No Duplicates: It ensures that there are no duplicate elements. If you try to add an element that already exists, it will be ignored (the
add()
method will returnfalse
). - Allows One Null Element: You can add one
null
element to aHashSet
. - Not Synchronized:
HashSet
is not synchronized, meaning it's not thread-safe. If multiple threads access aHashSet
concurrently, and at least one of the threads modifies it, it must be synchronized externally. UseCollections.synchronizedSet(new HashSet(...))
to create a synchronizedHashSet
. - Fast Performance: Provides excellent performance for adding, removing, and checking for the existence of elements, typically with O(1) average time complexity.
Creating a HashSet
You can create a HashSet
using several constructors:
HashSet()
: Creates an emptyHashSet
with the default initial capacity (16) and load factor (0.75).HashSet(int initialCapacity)
: Creates an emptyHashSet
with the specified initial capacity and the default load factor (0.75).HashSet(int initialCapacity, float loadFactor)
: Creates an emptyHashSet
with the specified initial capacity and load factor.HashSet(Collection<? extends E> c)
: Creates aHashSet
containing the elements of the specified collection.
import java.util.HashSet;
import java.util.Arrays;
import java.util.List;
public class HashSetExample {
public static void main(String[] args) {
// Creating an empty HashSet with default capacity and load factor
HashSet<String> set1 = new HashSet<>();
// Creating a HashSet with specified initial capacity
HashSet<Integer> set2 = new HashSet<>(32);
// Creating a HashSet with specified initial capacity and load factor
HashSet<Double> set3 = new HashSet<>(64, 0.8f);
// Creating a HashSet from a collection (e.g., a List)
List<String> list = Arrays.asList("apple", "banana", "cherry");
HashSet<String> set4 = new HashSet<>(list);
System.out.println("Set 1: " + set1);
System.out.println("Set 2: " + set2);
System.out.println("Set 3: " + set3);
System.out.println("Set 4: " + set4);
}
}
Adding Elements
You can add elements to a HashSet
using the add()
method. The add()
method returns true
if the element was successfully added (i.e., it wasn't already present), and false
otherwise.
import java.util.HashSet;
public class HashSetAddExample {
public static void main(String[] args) {
HashSet<String> fruits = new HashSet<>();
fruits.add("apple");
fruits.add("banana");
fruits.add("cherry");
System.out.println("HashSet: " + fruits); // Output: HashSet: [banana, apple, cherry] (order is not guaranteed)
boolean added = fruits.add("apple"); // Trying to add a duplicate
System.out.println("Was 'apple' added again? " + added); // Output: Was 'apple' added again? false
System.out.println("HashSet after attempting to add a duplicate: " + fruits); // Output: HashSet after attempting to add a duplicate: [banana, apple, cherry]
fruits.add(null);
System.out.println("HashSet with null: " + fruits); // Output: HashSet with null: [null, banana, apple, cherry]
fruits.add(null); //Trying to add another null
System.out.println("HashSet with null: " + fruits); // Output: HashSet with null: [null, banana, apple, cherry]
}
}
Removing Elements
You can remove elements from a HashSet
using the remove()
method. The remove()
method returns true
if the element was successfully removed, and false
otherwise (if the element was not present in the set).
import java.util.HashSet;
public class HashSetRemoveExample {
public static void main(String[] args) {
HashSet<String> colors = new HashSet<>();
colors.add("red");
colors.add("green");
colors.add("blue");
System.out.println("HashSet before removal: " + colors); // Output: HashSet before removal: [blue, red, green]
boolean removed = colors.remove("red");
System.out.println("Was 'red' removed? " + removed); // Output: Was 'red' removed? true
System.out.println("HashSet after removing 'red': " + colors); // Output: HashSet after removing 'red': [blue, green]
removed = colors.remove("purple"); // Trying to remove an element that doesn't exist
System.out.println("Was 'purple' removed? " + removed); // Output: Was 'purple' removed? false
System.out.println("HashSet after attempting to remove 'purple': " + colors); // Output: HashSet after attempting to remove 'purple': [blue, green]
colors.remove(null); // Trying to remove null that doesn't exist. If it exists, it will be removed.
System.out.println("HashSet after attempting to remove null: " + colors); // Output: HashSet after attempting to remove null: [blue, green]
colors.add(null);
System.out.println("HashSet before removing null: " + colors); //Output: [null, blue, green]
colors.remove(null);
System.out.println("HashSet after removing null: " + colors); //Output: [blue, green]
}
}
You can also remove all elements from the set using clear()
method
import java.util.HashSet;
public class HashSetClearExample {
public static void main(String[] args) {
HashSet<String> names = new HashSet<>();
names.add("Alice");
names.add("Bob");
names.add("Charlie");
System.out.println("HashSet before clear: " + names); // Output: HashSet before clear: [Bob, Alice, Charlie]
names.clear();
System.out.println("HashSet after clear: " + names); // Output: HashSet after clear: []
}
}
Checking for Existence
You can check if an element exists in a HashSet
using the contains()
method. This method returns true
if the element is present in the set, and false
otherwise.
import java.util.HashSet;
public class HashSetContainsExample {
public static void main(String[] args) {
HashSet<String> animals = new HashSet<>();
animals.add("dog");
animals.add("cat");
animals.add("bird");
System.out.println("HashSet: " + animals); // Output: HashSet: [dog, cat, bird]
boolean containsDog = animals.contains("dog");
System.out.println("Contains 'dog'? " + containsDog); // Output: Contains 'dog'? true
boolean containsElephant = animals.contains("elephant");
System.out.println("Contains 'elephant'? " + containsElephant); // Output: Contains 'elephant'? false
}
}
Iterating over a HashSet
You can iterate over the elements of a HashSet
using several techniques:
- Iterator: Using an
Iterator
is the most general way to iterate over any collection. - Enhanced For Loop (for-each loop): This is a more concise and readable way to iterate.
- Stream API (Java 8+): Using streams allows you to perform functional-style operations on the elements.
import java.util.HashSet;
import java.util.Iterator;
public class HashSetIterationExample {
public static void main(String[] args) {
HashSet<String> cities = new HashSet<>();
cities.add("London");
cities.add("Paris");
cities.add("New York");
cities.add("Tokyo");
System.out.println("HashSet: " + cities); // Output: HashSet: [New York, Paris, London, Tokyo] (order not guaranteed)
// 1. Using Iterator
System.out.println("Iterating using Iterator:");
Iterator<String> iterator = cities.iterator();
while (iterator.hasNext()) {
String city = iterator.next();
System.out.println(city);
}
// 2. Using Enhanced For Loop (for-each loop)
System.out.println("\nIterating using Enhanced For Loop:");
for (String city : cities) {
System.out.println(city);
}
// 3. Using Stream API (Java 8+)
System.out.println("\nIterating using Stream API:");
cities.stream().forEach(System.out::println);
}
}
Importance of hashCode()
and equals()
Methods
The correct functioning of a HashSet
heavily relies on the proper implementation of the hashCode()
and equals()
methods of the objects stored in the set. Here's why:
hashCode()
:- When you add an element to a
HashSet
, thehashCode()
method of the element is used to determine the bucket (index) in the underlyingHashMap
where the element should be stored. - A good
hashCode()
implementation should distribute objects evenly across the buckets to avoid collisions. A collision occurs when multiple objects have the same hash code, leading to multiple elements being stored in the same bucket. - If two objects are equal according to the
equals()
method, they must have the samehashCode()
value. This is a critical contract!
- When you add an element to a
equals()
:- When you try to add an element to a
HashSet
, and thehashCode()
of the element matches thehashCode()
of an existing element in the same bucket, theequals()
method is used to determine if the two elements are actually equal. - If
equals()
returnstrue
, the new element is considered a duplicate and is not added to the set. Ifequals()
returnsfalse
, the new element is added to the same bucket (creating a collision, which can impact performance). - The
equals()
method defines what it means for two objects of a class to be considered equivalent.
- When you try to add an element to a
Consequences of Incorrect Implementations:
- If two equal objects have different hash codes, the
HashSet
might allow duplicates, violating the fundamental property of a set. - If the
hashCode()
method is poorly implemented and generates the same hash code for many different objects, theHashSet
's performance will degrade significantly (approaching O(n) for operations likeadd()
andcontains()
). - If
equals()
is implemented incorrectly, theHashSet
will not accurately determine equality between objects, potentially leading to unexpected behavior.
Example: Overriding hashCode()
and equals()
import java.util.HashSet;
import java.util.Objects;
class Point {
private int x;
private int y;
public Point(int x, int y) {
this.x = x;
this.y = y;
}
public int getX() {
return x;
}
public int getY() {
return y;
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Point point = (Point) o;
return x == point.x && y == point.y;
}
@Override
public int hashCode() {
return Objects.hash(x, y);
}
@Override
public String toString() {
return "(" + x + ", " + y + ")";
}
}
public class HashCodeEqualsExample {
public static void main(String[] args) {
HashSet<Point> points = new HashSet<>();
points.add(new Point(1, 2));
points.add(new Point(3, 4));
points.add(new Point(1, 2)); // Duplicate
System.out.println("HashSet of Points: " + points); // Output: HashSet of Points: [(1, 2), (3, 4)] (no duplicates)
}
}
In this example, the Point
class overrides both equals()
and hashCode()
. Two Point
objects are considered equal if their x
and y
coordinates are the same. The hashCode()
method is implemented using Objects.hash()
, which generates a hash code based on the x
and y
coordinates. This ensures that equal Point
objects have the same hash code.
Time Complexity of Operations
The time complexity of operations on a HashSet
is generally very efficient, provided that the hashCode()
method is well-distributed and collisions are minimized.
add(element)
: Average case: O(1). Worst case: O(n) (rare, occurs with many hash collisions).remove(element)
: Average case: O(1). Worst case: O(n) (rare, occurs with many hash collisions).contains(element)
: Average case: O(1). Worst case: O(n) (rare, occurs with many hash collisions).size()
: O(1)isEmpty()
: O(1)clear()
: O(1)- Iteration (using Iterator or for-each loop): O(n), where n is the number of elements in the set.
The "average case" complexity assumes that the hashCode()
method is well-distributed, resulting in a small number of collisions. The "worst case" complexity occurs when many elements have the same hash code, forcing the HashSet
to search through a long chain of elements to find or add the correct one.