## 1. Overview

Set is one of the commonly used collection types in Java. Today, we’ll discuss how to find the difference between two given sets.

## 2. Introduction to the Problem

Before we look at the implementations, we first need to understand the problem. As usual, an example helps clarify the requirement quickly.

Let’s say we have two Set objects, set1 and set2:

```
set1: {"Kotlin", "Java", "Rust", "Python", "C++"}
set2: {"Kotlin", "Java", "Rust", "Ruby", "C#"}
```

As we can see, both sets contain some programming language names. The requirement “Finding the difference between two Sets” may have two variants:

• Asymmetric difference – Finding those elements that are contained by set1 but not contained by set2; in this case, the expected result is {“Python”, “C++”}
• Symmetric difference – Finding the elements in either of the sets but not in their intersection; if we look at our example, the result should be {“Python”, “C++”, “Ruby”, “C#”}

In this tutorial, we’ll address the solution to both scenarios. First, we’ll focus on finding the asymmetric differences. After that, we’ll explore finding the symmetric difference between the two sets.

Next, let’s see them in action.

## 3. Asymmetric Difference

### 3.1. Using the Standard removeAll Method

The Set interface provides a removeAll method, which it inherits from the Collection interface.

The removeAll method accepts a Collection object as the parameter and removes all of that collection's elements from the Set it is called on. So, if we pass set2 as the parameter, as in “set1.removeAll(set2)”, the elements remaining in set1 are the result.

For simplicity, let’s show it as a unit test:

```java
// Collect into sets we can modify in place
Set<String> set1 = Stream.of("Kotlin", "Java", "Rust", "Python", "C++").collect(Collectors.toSet());
Set<String> set2 = Stream.of("Kotlin", "Java", "Rust", "Ruby", "C#").collect(Collectors.toSet());
Set<String> expectedOnlyInSet1 = Set.of("Python", "C++");

// Removes every element of set2 from set1; set1 itself is modified
set1.removeAll(set2);

assertThat(set1).isEqualTo(expectedOnlyInSet1);
```

As the method above shows, first, we initialize the two Set objects using Stream. Then, after calling the removeAll method, the set1 object contains the expected elements.

This approach is pretty straightforward. However, the drawback is obvious: After removing the common elements from set1, the original set1 is modified.

Therefore, we need to back up the original set1 object if we still need it after calling the removeAll method. Likewise, if set1 is an immutable Set, we have to create a new mutable set from it first.
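The copy-first approach mentioned above could look like the following minimal sketch. The class and method names (AsymmetricDiffCopy, onlyInFirst) are hypothetical and chosen only for illustration; the sketch assumes Java 9+ because it uses Set.of to build the inputs.

```java
import java.util.HashSet;
import java.util.Set;

class AsymmetricDiffCopy {

    // Returns the elements of 'first' that are not in 'second',
    // leaving both arguments untouched.
    static <T> Set<T> onlyInFirst(Set<T> first, Set<T> second) {
        // Mutable copy; this also works when 'first' is immutable
        Set<T> copy = new HashSet<>(first);
        copy.removeAll(second);
        return copy;
    }

    public static void main(String[] args) {
        Set<String> set1 = Set.of("Kotlin", "Java", "Rust", "Python", "C++");
        Set<String> set2 = Set.of("Kotlin", "Java", "Rust", "Ruby", "C#");

        Set<String> diff = onlyInFirst(set1, set2);

        System.out.println(diff.equals(Set.of("Python", "C++"))); // true
        System.out.println(set1.size());                          // 5, set1 is unchanged
    }
}
```

The cost of this variant is one extra copy of the first set, which is usually acceptable when the original must be preserved.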

Next, let’s take a look at another approach to returning the asymmetric difference in a new Set object without modifying the original set.

### 3.2. Using the Stream.filter Method

The Stream API has been around since Java 8. It allows us to filter elements from a collection using the Stream.filter method.

We can also solve this problem using Stream.filter without modifying the original set1 object. Let’s first initialize the two sets as immutable sets:

```java
Set<String> immutableSet1 = Set.of("Kotlin", "Java", "Rust", "Python", "C++");
Set<String> immutableSet2 = Set.of("Kotlin", "Java", "Rust", "Ruby", "C#");
Set<String> expectedOnlyInSet1 = Set.of("Python", "C++");
```

Java 9 added the static of factory method to the Set interface. It allows us to create an immutable Set object conveniently. That is to say, if we attempt to modify immutableSet1, an UnsupportedOperationException will be thrown.
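To make the immutability point concrete, here is a small sketch showing that the removeAll approach from the previous section cannot be applied directly to a set created with Set.of. The class name ImmutableSetDemo and the helper rejectsRemoveAll are hypothetical names used only for this illustration.

```java
import java.util.HashSet;
import java.util.Set;

class ImmutableSetDemo {

    // Returns true if calling removeAll on 'target' is rejected
    // with an UnsupportedOperationException.
    static boolean rejectsRemoveAll(Set<String> target, Set<String> toRemove) {
        try {
            target.removeAll(toRemove);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Set<String> immutableSet1 = Set.of("Kotlin", "Java", "Rust", "Python", "C++");
        Set<String> immutableSet2 = Set.of("Kotlin", "Java", "Rust", "Ruby", "C#");

        // A set created by Set.of refuses the modification
        System.out.println(rejectsRemoveAll(immutableSet1, immutableSet2)); // true

        // A mutable copy accepts it
        System.out.println(rejectsRemoveAll(new HashSet<>(immutableSet1), immutableSet2)); // false
    }
}
```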

Next, let’s write a unit test that uses Stream.filter to find the difference:

```java
Set<String> actualOnlyInSet1 = immutableSet1.stream()
    .filter(e -> !immutableSet2.contains(e))
    .collect(Collectors.toSet());
assertThat(actualOnlyInSet1).isEqualTo(expectedOnlyInSet1);
```

As we can see in the method above, the key is “filter(e -> !immutableSet2.contains(e))”. Here, we only take the elements that are in immutableSet1 but not in immutableSet2.

If we execute this test method, it passes without any exception. It means this approach works, and the original sets are not modified.
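The same filter can be wrapped in a reusable generic helper. The sketch below is one possible shape, not part of the original tests: the names StreamDiff and onlyInFirst are hypothetical. It accepts any Collection and copies the second argument into a HashSet first, on the assumption that the caller may pass a List, whose contains method scans linearly.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class StreamDiff {

    // Returns the elements of 'first' that do not occur in 'second'.
    // Neither argument is modified.
    static <T> Set<T> onlyInFirst(Collection<T> first, Collection<T> second) {
        // Hash-based lookup keeps each contains() check cheap
        Set<T> lookup = new HashSet<>(second);
        return first.stream()
            .filter(e -> !lookup.contains(e))
            .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        Set<String> set1 = Set.of("Kotlin", "Java", "Rust", "Python", "C++");
        List<String> list2 = List.of("Kotlin", "Java", "Rust", "Ruby", "C#");

        Set<String> diff = onlyInFirst(set1, list2);
        System.out.println(diff.equals(Set.of("Python", "C++"))); // true
    }
}
```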

### 3.3. Using the Guava Library

Guava is a popular Java library that ships with some new collection types and convenient helper methods. Guava has provided a method to find the asymmetric differences between two sets. Therefore, we can use this method to solve our problems easily.

But first, we need to include the library in our classpath. Let’s say we manage the project dependencies with Maven. Then we add the Guava dependency to the pom.xml:

```xml
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>31.1-jre</version>
</dependency>
```

Once Guava is available in our Java project, we can use its Sets.difference method to get the expected result:

```java
Set<String> actualOnlyInSet1 = Sets.difference(immutableSet1, immutableSet2);
assertThat(actualOnlyInSet1).isEqualTo(expectedOnlyInSet1);
```

It’s worth mentioning that the Sets.difference method returns an unmodifiable view of the result (a Sets.SetView) rather than a copy. It means:

• We cannot modify the returned set
• If the original set is a mutable one, changes to the original set may be reflected in our resulting set view
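To illustrate what “view” means in the two points above, here is a simplified, hand-written analogue in plain Java. It is NOT Guava's implementation; the class DifferenceView is a hypothetical name, and the sketch only mimics the two observable properties: the result rejects modification, and it reflects later changes to a mutable backing set.

```java
import java.util.AbstractSet;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Hypothetical, simplified analogue of a difference view; not Guava code.
class DifferenceView<T> extends AbstractSet<T> {
    private final Set<T> first;
    private final Set<T> second;

    DifferenceView(Set<T> first, Set<T> second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public Iterator<T> iterator() {
        // Recomputed from the backing sets on every call, so changes show through
        return first.stream().filter(e -> !second.contains(e)).iterator();
    }

    @Override
    public int size() {
        return (int) first.stream().filter(e -> !second.contains(e)).count();
    }

    public static void main(String[] args) {
        Set<String> mutableSet1 = new HashSet<>(Set.of("Kotlin", "Java", "Python"));
        Set<String> set2 = Set.of("Kotlin", "Java");

        Set<String> view = new DifferenceView<>(mutableSet1, set2);
        System.out.println(view);                 // [Python]

        mutableSet1.add("C++");                   // change the backing set
        System.out.println(view.contains("C++")); // true, the view reflects the change

        // Copying the view yields a snapshot that no longer tracks the backing set
        Set<String> snapshot = new HashSet<>(view);
        mutableSet1.add("Go");
        System.out.println(snapshot.contains("Go")); // false
    }
}
```

With Guava itself, a stable snapshot can be obtained in a similar way by copying the returned view into a new set; Sets.SetView also offers immutableCopy() for this purpose.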

### 3.4. Using the Apache Commons Library

Apache Commons is another widely used library. The Apache Commons Collections4 library provides many useful collection-related methods that complement the standard Collection API.

Before we start using it, let’s add the dependency to our pom.xml:

```xml
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-collections4</artifactId>
    <version>4.4</version>
</dependency>
```

The commons-collections4 library has a CollectionUtils.removeAll method. It’s similar to the standard Collection.removeAll method but returns the result in a new Collection object instead of modifying the first Collection object.

Next, let’s test it with two immutable Set objects:

```java
Set<String> actualOnlyInSet1 = new HashSet<>(CollectionUtils.removeAll(immutableSet1, immutableSet2));
assertThat(actualOnlyInSet1).isEqualTo(expectedOnlyInSet1);
```

The test will pass if we execute it. However, we should note that the CollectionUtils.removeAll method returns the result as a Collection.

If a concrete type is required – for instance, Set in our case – we’ll need to convert it manually. In the test method above, we’ve initialized a new HashSet object using the returned collection.

## 4. Symmetric Difference

So far, we’ve learned how to get the asymmetric difference between two sets. Now, let’s take a closer look at the other scenario: finding the symmetric difference between two sets.

We’ll address two approaches to get the symmetric difference from our two immutable set examples.

The expected result is:

``Set<String> expectedDiff = Set.of("Python", "C++", "Ruby", "C#");``

Next, let’s see how to solve the problem.

### 4.1. Using HashMap

One idea to solve the problem is first creating a Map<T, Integer> object.

Then, we iterate through the two given sets and put each element into the map as the key. If the key already exists in the map, the element is common to both sets, so we set a special number as the value, for example, Integer.MAX_VALUE. Otherwise, we put the element and the value 1 as a new entry in the map.

Finally, we find out the keys whose value is 1 in the map, and these keys are the symmetric difference between two given sets.

Next, let’s implement the idea in Java:

```java
public static <T> Set<T> findSymmetricDiff(Set<T> set1, Set<T> set2) {
    Map<T, Integer> map = new HashMap<>();
    set1.forEach(e -> putKey(map, e));
    set2.forEach(e -> putKey(map, e));
    // Keys still mapped to 1 were seen in exactly one of the two sets
    return map.entrySet().stream()
        .filter(e -> e.getValue() == 1)
        .map(Map.Entry::getKey)
        .collect(Collectors.toSet());
}

private static <T> void putKey(Map<T, Integer> map, T key) {
    if (map.containsKey(key)) {
        // Seen before: mark the key as common to both sets
        map.replace(key, Integer.MAX_VALUE);
    } else {
        map.put(key, 1);
    }
}
```

Now, let’s test our solution and see if it can give the expected result:

```java
Set<String> actualDiff = SetDiff.findSymmetricDiff(immutableSet1, immutableSet2);
assertThat(actualDiff).isEqualTo(expectedDiff);
```

The test passes if we run it. That is to say, our implementation works as expected.
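As a cross-check on the definition from section 2 (the elements in either set but not in their intersection), the symmetric difference can also be sketched with only the standard set operations addAll, retainAll, and removeAll. This is an alternative to the HashMap approach, not a replacement for it; the class name SymmetricDiffSetOps and the method name symmetricDiff are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

class SymmetricDiffSetOps {

    // Computes (set1 ∪ set2) minus (set1 ∩ set2) without modifying the inputs.
    static <T> Set<T> symmetricDiff(Set<T> set1, Set<T> set2) {
        Set<T> union = new HashSet<>(set1);
        union.addAll(set2);

        Set<T> intersection = new HashSet<>(set1);
        intersection.retainAll(set2);

        union.removeAll(intersection);
        return union;
    }

    public static void main(String[] args) {
        Set<String> immutableSet1 = Set.of("Kotlin", "Java", "Rust", "Python", "C++");
        Set<String> immutableSet2 = Set.of("Kotlin", "Java", "Rust", "Ruby", "C#");

        Set<String> diff = symmetricDiff(immutableSet1, immutableSet2);
        System.out.println(diff.equals(Set.of("Python", "C++", "Ruby", "C#"))); // true
    }
}
```

Because the work happens on copies, this variant also accepts immutable inputs such as the ones created with Set.of.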

### 4.2. Using the Apache Commons Library

We’ve already introduced the Apache Commons library when finding the asymmetric difference between two sets. Actually, the commons-collections4 library has a handy SetUtils.disjunction method to return the symmetric difference between two sets directly:

```java
Set<String> actualDiff = SetUtils.disjunction(immutableSet1, immutableSet2);
assertThat(actualDiff).isEqualTo(expectedDiff);
```

As the method above shows, unlike the CollectionUtils.removeAll method, the SetUtils.disjunction method returns a Set object. We don’t need to manually convert it to Set.

## 5. Conclusion

In this article, we’ve explored how to find differences between two Set objects through examples. Further, we’ve discussed two variants of this problem: finding asymmetric differences and symmetric differences.

We’ve addressed solving the two variants using the standard Java API and widely used external libraries, such as Apache Commons-Collections and Guava.

As always, the source code used in this tutorial is available over on GitHub.