Collections Framework – 3 [ Set ]

The Set interface defines the functionality of a collection that cannot contain duplicates in its elements.

Items added to a Set type collection must reimplement their equals() and hashCode() methods.

These methods are used when adding an item to determine if it is already present in the collection.

The value returned by hashCode() is searched in the collection :

  • If no object in the collection has the same hash value then the object is not yet in the collection and can be added;
  • If one or more objects in the collection have the same hash value then the equals() method of the object to add is invoked on each of the objects to determine whether the object is already present or not in the collection.

What You Need

  • About 13 minutes
  • A favorite text editor or IDE
  • Java 8 or later

1. Methods of Set Interface

The Set interface defines several methods :

MethodRole
boolean add(E e)Adds the specified element to this set if it is not already present.
boolean addAll(Collection<? extends E> c)Adds all of the elements in the specified collection to this set if they’re not already present.
void clear()Removes all of the elements from this set.
boolean contains(Object o)Returns true if this set contains the specified element.
boolean containsAll(Collection<?> c)Returns true if this set contains all of the elements of the specified collection.
boolean isEmpty()Returns true if this set contains no elements.
boolean remove(Object o)Removes the specified element from this set if it is present.
boolean removeAll(Collection<?> c)Removes from this set all of its elements that are contained in the specified collection.
boolean retainAll(Collection<?> c)Retains only the elements in this set that are contained in the specified collection.
int size()Returns the number of elements in this set.

2. Implementations of Set Interface

ImplementationRole
java.util.HashSet<E>It is not a thread-safe implementation but it is possible to add a null element
java.util.LinkedHashSet<E>Similar to HashSet by defining the traversal order which is the one in which the elements were added
java.util.TreeSet<E>The elements are sorted but it is not thread safe and cannot add null element
java.util.concurrent.ConcurrentSkipListSet<E>An ordered set of elements capable of handling strong concurrency
java.util.concurrent.CopyOnWriteArraySet<E>A thread-safe implementation by creating a new copy when invoking a method that modifies the contents of the collection
java.util.EnumSet<E extends Enum<E>>All elements of the collection must belong to the same enumeration

2.1 HashSet

The HashSet class, added in Java 1.2, is a simple implementation of the Set interface.

It has below characteristics :

  • It does not offer any guarantee on the order of traversal during the iteration on the elements which it contains;
  • It does not allow adding duplicates but it allows adding a null element.

It internally uses a HashMap whose key is the element and whose value is an identical Object instance for all elements.

import java.util.ArrayList;
import java.util.HashSet;
import java.util.Set;

public class HashSetTest {
    public static void main(String[] args) {
        Set<ArrayList<Integer>> set = new HashSet<>();

        ArrayList<Integer> list1 = new ArrayList<>();

        ArrayList<Integer> list2 = new ArrayList<>();

        list1.add(1);
        list1.add(2);
        list2.add(1);
        list2.add(2);
        set.add(list1);
        set.add(list2);

        System.out.println(list1.hashCode() == list2.hashCode());
        System.out.println(list1.equals(list2));

        System.out.println(set.size());
    }
}

The output of above code snippet is below :

true
true
1

2.2 LinkedHashSet

The LinkedHashSet is a direct descendant of the HashSet.

It has a doubly-linked list running through all of its entries and maintains a predictable iteration order which is defined by the insertion order of the elements into the set.

import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class LinkedHashSetTest {
    public static void main(String[] args) {
        Set<Integer> set = new HashSet<>();

        set.add(5);
        set.add(1);
        set.add(4);
        set.add(2);
        set.add(3);

        set.spliterator().forEachRemaining(System.out::println);

        System.out.println();

        set = new LinkedHashSet<>();

        set.add(5);
        set.add(1);
        set.add(4);
        set.add(2);
        set.add(3);

        set.spliterator().forEachRemaining(System.out::println);
    }
}

The output of above code snippet is below :

1
2
3
4
5

5
1
4
2
3

2.3 TreeSet

The TreeSet class, added in Java 1.2, stores its elements in an ordered manner by comparing them to each other.

This class makes it possible to insert elements in any order and to restore these elements in a specific order during its traversal.

The order of the items in a TreeSet can be determined in two ways :

  • The natural order of elements if they implement the Comparable interface;
  • The order obtained by using a Comparator provided as a parameter of the TreeSet constructor.
import java.util.Collections;
import java.util.TreeSet;

public class TreeSetTest {
    public static void main(String[] args) {
        TreeSet<Integer> set = new TreeSet<>();

        set.add(2);
        set.add(3);
        set.add(1);
        set.add(5);
        set.add(4);

        set.spliterator().forEachRemaining(System.out::println);

        System.out.println();

        set = new TreeSet<>(Collections.reverseOrder());

        set.add(2);
        set.add(3);
        set.add(1);
        set.add(5);
        set.add(4);

        set.spliterator().forEachRemaining(System.out::println);
    }
}

The output of above code snippet is below :

1
2
3
4
5

5
4
3
2
1

The TreeSet class is not thread-safe, only one thread should be able to modify its content.

If several threads must be able to modify the content, it has to invoke the synchronizedSortedSet() method of the Collections class which will create a wrapper whose methods are synchronized.

SortedSet set = Collections.synchronizedSortedSet(new TreeSet());

With this solution, several threads can modify the collection but only one at a time, which can lead to performance degradations in the event of strong concurrency.

In this case, it is better to use another Set type implementation like the ConcurrentSkipListSet class.

2.4 ConcurrentSkipListSet

The ConcurrentSkipListSet class, added to Java 1.6, makes it possible to implement an ordered set of elements capable of managing strong concurrency.

The elements in a ConcurrentSkipListSet are ordered in their natural order if they implement the Comparable interface or according to an order defined by the instance of type Comparator supplied to the constructor of ConcurrentSkipListSet.

It is particularly useful for managing an ordered set of items that can be accessed and modified by multiple threads.

Adding, removing, and getting an item from the collection is done concurrently by multiple threads.

Its operations are CAS (Compare And Swap) : they do not set any lock that could introduce contention.

It uses a skip list type data structure, unlike a binary tree, the organization does not need to be readjusted when adding or removing an element.

import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ConcurrentSkipListSetTest {
    public static void main(String[] args) throws InterruptedException {
        Set<Integer> set = new ConcurrentSkipListSet<>(Collections.reverseOrder());

        set.add(3);
        set.add(3);
        set.add(2);
        set.add(2);
        set.add(1);
        set.add(1);
        set.add(5);
        set.add(5);
        set.add(4);
        set.add(4);

        set.spliterator().forEachRemaining(System.out::println);

        set.clear();

        ExecutorService es = Executors.newFixedThreadPool(5);

        try {
            for (int i = 0; i < 10000; i++) {
                final Integer element = Integer.valueOf(i);
                es.execute(() -> {
                    set.add(element);
                });
            }
        } finally {
            es.shutdown();
            es.awaitTermination(5, TimeUnit.SECONDS);
        }

        System.out.println("Set size = " + set.size());
    }
}

Below is the output of above code snippet :

5
4
3
2
1
Set size = 10000

2.5 CopyOnWriteArraySet

The CopyOnWriteArraySet class, added in Java 1.5, is a Set type implementation that is thread safe and provides good read performance.

It internally uses an instance of type CopyOnWriteArrayList to store the elements.

The update operations are costly in terms of resources because it involves an integral copy of the elements.

So the elements in CopyOnWriteArraySet should therefore preferably not be modified too frequently.

import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class CopyOnWriteArraySetTest {
    public static void main(String[] args) throws InterruptedException {
        Set<String> set = new CopyOnWriteArraySet<>();

        AtomicBoolean writen = new AtomicBoolean(false);

        Runnable reader = () -> {
            while (!writen.get()) {
                System.out.println(
                        "[" + Instant.now() + "] " + Thread.currentThread().getName() + " : " + set);
                try {
                    TimeUnit.MILLISECONDS.sleep(500);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        };

        Runnable writer = () -> {
            String name = Thread.currentThread().getName();
            set.add(name);
            System.out.println();
            System.out.println("[" + Instant.now() + "] " + name + " : " + set);
            System.out.println();
        };

        int nReader = 10;
        int nWriter = 3;

        List<Thread> threads = new ArrayList<>();

        for (int i = 0; i < nReader; i++) {
            Thread thread = new Thread(reader, "Reader_" + i);

            thread.start();

            threads.add(thread);
        }

        for (int j = 0; j < nWriter; j++) {
            Thread thread = new Thread(writer, "Writer_" + j);

            TimeUnit.SECONDS.sleep(1);

            thread.start();

            threads.add(thread);
        }

        TimeUnit.SECONDS.sleep(1);

        writen.set(true);

        for (Thread thread : threads) {
            thread.join();
        }
    }
}

The output of above code snippet is below :

[2025-04-10T20:27:28.330806536Z] Reader_8 : []
[2025-04-10T20:27:28.324553425Z] Reader_0 : []
[2025-04-10T20:27:28.327419594Z] Reader_3 : []
[2025-04-10T20:27:28.325943619Z] Reader_4 : []
[2025-04-10T20:27:28.324544664Z] Reader_1 : []
[2025-04-10T20:27:28.328862563Z] Reader_6 : []
[2025-04-10T20:27:28.333435310Z] Reader_9 : []
[2025-04-10T20:27:28.326574689Z] Reader_5 : []
[2025-04-10T20:27:28.332556073Z] Reader_7 : []
[2025-04-10T20:27:28.324528499Z] Reader_2 : []
[2025-04-10T20:27:28.896590193Z] Reader_8 : []
[2025-04-10T20:27:28.896893784Z] Reader_1 : []
[2025-04-10T20:27:28.896589979Z] Reader_0 : []
[2025-04-10T20:27:28.896853964Z] Reader_3 : []
[2025-04-10T20:27:28.897042180Z] Reader_4 : []
[2025-04-10T20:27:28.897137308Z] Reader_6 : []
[2025-04-10T20:27:28.900170004Z] Reader_9 : []
[2025-04-10T20:27:28.900302708Z] Reader_2 : []
[2025-04-10T20:27:28.900170032Z] Reader_5 : []
[2025-04-10T20:27:28.900169996Z] Reader_7 : []

[2025-04-10T20:27:29.335263899Z] Writer_0 : [Writer_0]

[2025-04-10T20:27:29.397473990Z] Reader_8 : [Writer_0]
[2025-04-10T20:27:29.397879955Z] Reader_1 : [Writer_0]
[2025-04-10T20:27:29.397868512Z] Reader_0 : [Writer_0]
[2025-04-10T20:27:29.397950077Z] Reader_3 : [Writer_0]
[2025-04-10T20:27:29.398302023Z] Reader_6 : [Writer_0]
[2025-04-10T20:27:29.398093690Z] Reader_4 : [Writer_0]
[2025-04-10T20:27:29.401070901Z] Reader_2 : [Writer_0]
[2025-04-10T20:27:29.401081531Z] Reader_5 : [Writer_0]
[2025-04-10T20:27:29.401522104Z] Reader_7 : [Writer_0]
[2025-04-10T20:27:29.401070887Z] Reader_9 : [Writer_0]
[2025-04-10T20:27:29.898192574Z] Reader_8 : [Writer_0]
[2025-04-10T20:27:29.898678842Z] Reader_1 : [Writer_0]
[2025-04-10T20:27:29.898701624Z] Reader_6 : [Writer_0]
[2025-04-10T20:27:29.898440653Z] Reader_0 : [Writer_0]
[2025-04-10T20:27:29.898701583Z] Reader_3 : [Writer_0]
[2025-04-10T20:27:29.898830545Z] Reader_4 : [Writer_0]
[2025-04-10T20:27:29.901652469Z] Reader_2 : [Writer_0]
[2025-04-10T20:27:29.901963421Z] Reader_5 : [Writer_0]
[2025-04-10T20:27:29.902180824Z] Reader_9 : [Writer_0]
[2025-04-10T20:27:29.903282143Z] Reader_7 : [Writer_0]

[2025-04-10T20:27:30.335877315Z] Writer_1 : [Writer_0, Writer_1]

[2025-04-10T20:27:30.399013194Z] Reader_8 : [Writer_0, Writer_1]
[2025-04-10T20:27:30.399345995Z] Reader_6 : [Writer_0, Writer_1]
[2025-04-10T20:27:30.399584785Z] Reader_0 : [Writer_0, Writer_1]
[2025-04-10T20:27:30.399013169Z] Reader_1 : [Writer_0, Writer_1]
[2025-04-10T20:27:30.400083167Z] Reader_4 : [Writer_0, Writer_1]
[2025-04-10T20:27:30.400281996Z] Reader_3 : [Writer_0, Writer_1]
[2025-04-10T20:27:30.402481348Z] Reader_2 : [Writer_0, Writer_1]
[2025-04-10T20:27:30.403120769Z] Reader_5 : [Writer_0, Writer_1]
[2025-04-10T20:27:30.403082701Z] Reader_9 : [Writer_0, Writer_1]
[2025-04-10T20:27:30.404289260Z] Reader_7 : [Writer_0, Writer_1]
[2025-04-10T20:27:30.899791561Z] Reader_6 : [Writer_0, Writer_1]
[2025-04-10T20:27:30.899791682Z] Reader_8 : [Writer_0, Writer_1]
[2025-04-10T20:27:30.900121338Z] Reader_0 : [Writer_0, Writer_1]
[2025-04-10T20:27:30.901073035Z] Reader_3 : [Writer_0, Writer_1]
[2025-04-10T20:27:30.900194542Z] Reader_1 : [Writer_0, Writer_1]
[2025-04-10T20:27:30.900717837Z] Reader_4 : [Writer_0, Writer_1]
[2025-04-10T20:27:30.903574952Z] Reader_5 : [Writer_0, Writer_1]
[2025-04-10T20:27:30.903575528Z] Reader_2 : [Writer_0, Writer_1]
[2025-04-10T20:27:30.903757900Z] Reader_9 : [Writer_0, Writer_1]
[2025-04-10T20:27:30.905284828Z] Reader_7 : [Writer_0, Writer_1]

[2025-04-10T20:27:31.336743570Z] Writer_2 : [Writer_0, Writer_1, Writer_2]

[2025-04-10T20:27:31.400843474Z] Reader_6 : [Writer_0, Writer_1, Writer_2]
[2025-04-10T20:27:31.401392494Z] Reader_0 : [Writer_0, Writer_1, Writer_2]
[2025-04-10T20:27:31.401873343Z] Reader_3 : [Writer_0, Writer_1, Writer_2]
[2025-04-10T20:27:31.401739967Z] Reader_8 : [Writer_0, Writer_1, Writer_2]
[2025-04-10T20:27:31.402335305Z] Reader_1 : [Writer_0, Writer_1, Writer_2]
[2025-04-10T20:27:31.402346061Z] Reader_4 : [Writer_0, Writer_1, Writer_2]
[2025-04-10T20:27:31.404100111Z] Reader_5 : [Writer_0, Writer_1, Writer_2]
[2025-04-10T20:27:31.404628415Z] Reader_9 : [Writer_0, Writer_1, Writer_2]
[2025-04-10T20:27:31.404621450Z] Reader_2 : [Writer_0, Writer_1, Writer_2]
[2025-04-10T20:27:31.405888367Z] Reader_7 : [Writer_0, Writer_1, Writer_2]
[2025-04-10T20:27:31.901537475Z] Reader_6 : [Writer_0, Writer_1, Writer_2]
[2025-04-10T20:27:31.901937393Z] Reader_0 : [Writer_0, Writer_1, Writer_2]
[2025-04-10T20:27:31.902411397Z] Reader_3 : [Writer_0, Writer_1, Writer_2]
[2025-04-10T20:27:31.902783859Z] Reader_1 : [Writer_0, Writer_1, Writer_2]
[2025-04-10T20:27:31.902975326Z] Reader_4 : [Writer_0, Writer_1, Writer_2]
[2025-04-10T20:27:31.902548594Z] Reader_8 : [Writer_0, Writer_1, Writer_2]
[2025-04-10T20:27:31.904783788Z] Reader_5 : [Writer_0, Writer_1, Writer_2]
[2025-04-10T20:27:31.905391114Z] Reader_9 : [Writer_0, Writer_1, Writer_2]
[2025-04-10T20:27:31.905390883Z] Reader_2 : [Writer_0, Writer_1, Writer_2]
[2025-04-10T20:27:31.906644246Z] Reader_7 : [Writer_0, Writer_1, Writer_2]

2.6 EnumSet

An EnumSet is a specialized Set collection to work with enum classes.

It can contain only enum values and all the values have to belong to the same enum.

The elements are stored following the order in which they are declared in the enum.

EnumSet should always be preferred over any other Set implementation when we are storing enum values.

import java.util.EnumSet;
import java.util.Iterator;
import java.util.Set;

public class EnumSetTest {
    enum Color {
        RED, YELLOW, GREEN, BLUE, BLACK, WHITE
    }

    public static void main(String[] args) {
        Set<Color> set = EnumSet.allOf(Color.class);

        System.out.println(set);
        System.out.println();

        Iterator<Color> it = set.iterator();

        while (it.hasNext()) {
            Color color = it.next();

            System.out.println(color);
        }
    }
}

Below is the output of above code snippet :

[RED, YELLOW, GREEN, BLUE, BLACK, WHITE]

RED
YELLOW
GREEN
BLUE
BLACK
WHITE

3. The choice of a set implementation

Some Set implementations are specialized for use in particular situations, for instance :

  • the EnumSet class should only be used to manage a set of enumerations;
  • the CopyOnWriteArraySet class should only be used for small thread-safe collections, where the operations performed are essentially reads.

If the collection is not used by several threads, it is possible to use the HashSet, LinkedHashSet and TreeSet classes.

Otherwise, the ConcurrentSkipListSet or CopyOnWriteArraySet class should be used only if the accesses are essentially reads.

It is also possible to use a synchronized version of a Set type implementation using the synchronizedSet() method of the Collections class.

If the data must be sorted, the TreeSet class must be used.

If the data in the collection needs to be traversed frequently, it is better to use the LinkedHashSet class.