120. Filtering a Collection by a List

A common problem that we encounter in applications is filtering a Collection by a List. Mainly, we start from a huge Collection, and we want to extract from it the elements that match the elements of a List.

In the following examples, let's consider the Melon class:

public class Melon {

private final String type;
private final int weight;

// constructor, getters, equals(), hashCode(),
// toString() omitted for brevity
}

Here, we have a huge Collection (in this case, an ArrayList) of Melon:

List<Melon> melons = new ArrayList<>();
melons.add(new Melon("Apollo", 3000));
melons.add(new Melon("Jade Dew", 3500));
melons.add(new Melon("Cantaloupe", 1500));
melons.add(new Melon("Gac", 1600));
melons.add(new Melon("Hami", 1400));
...

And we also have a List containing the types of melons that we want to extract from the preceding ArrayList:

List<String> melonsByType 
= Arrays.asList("Apollo", "Gac", "Crenshaw", "Hami");

One solution to this problem may involve looping both collections and comparing the types of melons, but the resultant code will be pretty slow. Another solution to this problem may involve the List.contains() method and a lambda expression:

List<Melon> results = melons.stream()
.filter(t -> melonsByType.contains(t.getType()))
.collect(Collectors.toList());

The code is compact and fast. Behind the scenes, List.contains() relies on the following check:

// size - the size of melonsByType
// o - the current element to search from melons
// elementData - melonsByType
for (int i = 0; i < size; i++)
if (o.equals(elementData[i])) {
return i;
}
}

However, we can give another boost to performance via a solution that relies on HashSet.contains() instead of List.contains(). While List.contains() uses the preceding for statement to match the elements, HashSet.contains() uses Map.containsKey(). Mainly, Set is implemented based on a Map, and each added element is mapped as a key-value of the element-PRESENT type. So, element is a key in this Map, while PRESENT is just a dummy value.

When we call HashSet.contains(element), we actually call Map.containsKey(element). This method matches the given element with the proper key in the map based on its hashCode(), which is much faster than equals().

Once we convert the initial ArrayList to a HashSet, we are ready to go:

Set<String> melonsSetByType = melonsByType.stream()
.collect(Collectors.toSet());

List<Melon> results = melons.stream()
.filter(t -> melonsSetByType.contains(t.getType()))
.collect(Collectors.toList());

Well, this solution is faster than the previous one. It should run in half of the time required by the previous solution.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.138.119.106