190. Partitioning

Partitioning is a special case of grouping that relies on a Predicate to divide a stream into exactly two groups: one for true and one for false. The group for true holds the elements that pass the predicate, while the group for false holds the elements that fail it.

This Predicate is the classification function of the operation and is known as the partitioning function. Because a Predicate evaluates to a boolean value, the partitioning operation returns a Map<Boolean, V>, where V is the type of the collected result for each group.
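As a minimal sketch of the idea, the following example partitions plain integers rather than melons. The class name PartitionBasics and the helper split() are hypothetical names chosen for illustration. It also shows that the returned map contains both keys even when one group is empty:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class PartitionBasics {

    // Hypothetical helper: the predicate plays the role of the partitioning function.
    static Map<Boolean, List<Integer>> split(List<Integer> values,
                                             Predicate<Integer> predicate) {
        return values.stream().collect(Collectors.partitioningBy(predicate));
    }

    public static void main(String[] args) {
        Map<Boolean, List<Integer>> evenOdd =
            split(Arrays.asList(1, 2, 3, 4, 5), n -> n % 2 == 0);
        System.out.println(evenOdd); // {false=[1, 3, 5], true=[2, 4]}

        // No element passes the predicate, yet the true key is still present.
        Map<Boolean, List<Integer>> none =
            split(Arrays.asList(1, 3), n -> n % 2 == 0);
        System.out.println(none); // {false=[1, 3], true=[]}
    }
}
```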

Let's assume that we have the following Melon class and List of Melon:

public class Melon {

  private final String type;
  private int weight;

  // constructors, getters, setters, equals(),
  // hashCode(), toString() omitted for brevity
}

List<Melon> melons = Arrays.asList(new Melon("Crenshaw", 1200),
  new Melon("Gac", 3000), new Melon("Hemi", 2600),
  new Melon("Hemi", 1600), new Melon("Gac", 1200),
  new Melon("Apollo", 2600), new Melon("Horned", 1700),
  new Melon("Gac", 3000), new Melon("Hemi", 2600));

Partitioning is done via Collectors.partitioningBy(). This method comes in two flavors. The first receives a single argument, the partitioning function: partitioningBy(Predicate<? super T> predicate).

For example, partitioning the melons around a weight threshold of 2,000 g, while keeping duplicates, can be done as follows:

// assumes: import static java.util.stream.Collectors.partitioningBy;
Map<Boolean, List<Melon>> byWeight = melons.stream()
  .collect(partitioningBy(m -> m.getWeight() > 2000));

The output will be as follows:

{
  false=[Crenshaw(1200g), Hemi(1600g), Gac(1200g), Horned(1700g)],
  true=[Gac(3000g), Hemi(2600g), Apollo(2600g), Gac(3000g), Hemi(2600g)]
}

The advantage of partitioning over filtering is that partitioning keeps both groups of stream elements, whereas filtering discards the elements that fail the predicate.
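The difference can be sketched with the weights from the melons list, used here as plain integers to keep the example short. The class and method names are hypothetical. Obtaining both groups through filter() takes one traversal per group, while partitioningBy() produces both in a single traversal:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitionVsFilter {

    // filter() keeps only the elements that pass; the rest are discarded.
    static List<Integer> heavyByFilter(List<Integer> weights) {
        return weights.stream().filter(w -> w > 2000).collect(Collectors.toList());
    }

    // A second traversal with the negated condition recovers the other group.
    static List<Integer> lightByFilter(List<Integer> weights) {
        return weights.stream().filter(w -> w <= 2000).collect(Collectors.toList());
    }

    // A single traversal yields both groups.
    static Map<Boolean, List<Integer>> byPartition(List<Integer> weights) {
        return weights.stream().collect(Collectors.partitioningBy(w -> w > 2000));
    }

    public static void main(String[] args) {
        List<Integer> weights =
            Arrays.asList(1200, 3000, 2600, 1600, 1200, 2600, 1700, 3000, 2600);
        System.out.println(byPartition(weights));
        // {false=[1200, 1600, 1200, 1700], true=[3000, 2600, 2600, 3000, 2600]}
    }
}
```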

The following diagram depicts how partitioningBy() works internally:

If we want to reject duplicates, then we can rely on the second flavor of partitioningBy(), that is, partitioningBy(Predicate<? super T> predicate, Collector<? super T, A, D> downstream). The second argument allows us to specify another Collector that implements the downstream reduction applied to each group:

Map<Boolean, Set<Melon>> byWeight = melons.stream()
  .collect(partitioningBy(m -> m.getWeight() > 2000, toSet()));

The output will not contain duplicates (since each group is a Set, the order of its elements is not guaranteed):

{
  false=[Horned(1700g), Gac(1200g), Crenshaw(1200g), Hemi(1600g)],
  true=[Gac(3000g), Hemi(2600g), Apollo(2600g)]
}

Of course, in this case, distinct() will do the job as well. Here, each group remains a List and keeps the encounter order of the stream:

Map<Boolean, List<Melon>> byWeight = melons.stream()
  .distinct()
  .collect(partitioningBy(m -> m.getWeight() > 2000));
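Both toSet() and distinct() detect duplicates through equals() and hashCode(), which the Melon listing omits. The following sketch assumes that two melons are equal when they have the same type and the same weight; the nested Melon class, the sample() helper, and the class name DistinctMelons are illustrative stand-ins, not the original code:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

class DistinctMelons {

    // Assumed minimal Melon: equality is based on type and weight.
    static final class Melon {
        private final String type;
        private final int weight;

        Melon(String type, int weight) {
            this.type = type;
            this.weight = weight;
        }

        int getWeight() { return weight; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Melon)) return false;
            Melon other = (Melon) o;
            return weight == other.weight && type.equals(other.type);
        }

        @Override
        public int hashCode() { return Objects.hash(type, weight); }
    }

    // Same data as the melons list used throughout this section.
    static List<Melon> sample() {
        return Arrays.asList(new Melon("Crenshaw", 1200),
            new Melon("Gac", 3000), new Melon("Hemi", 2600),
            new Melon("Hemi", 1600), new Melon("Gac", 1200),
            new Melon("Apollo", 2600), new Melon("Horned", 1700),
            new Melon("Gac", 3000), new Melon("Hemi", 2600));
    }

    static Map<Boolean, Set<Melon>> partitionDistinct(List<Melon> melons) {
        return melons.stream().collect(
            Collectors.partitioningBy(m -> m.getWeight() > 2000, Collectors.toSet()));
    }

    public static void main(String[] args) {
        Map<Boolean, Set<Melon>> byWeight = partitionDistinct(sample());
        System.out.println(byWeight.get(false).size() + " light, "
            + byWeight.get(true).size() + " heavy"); // 4 light, 3 heavy
    }
}
```

Without these two overrides, Melon would inherit identity-based equality from Object, and neither toSet() nor distinct() would remove the repeated Gac(3000g) and Hemi(2600g) entries.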

Other collectors can be used as well. For example, we can count the elements from each of these two groups via counting():

Map<Boolean, Long> byWeightAndCount = melons.stream()
  .collect(partitioningBy(m -> m.getWeight() > 2000, counting()));

The output will be as follows:

{false=4, true=5}

We can also count the elements without duplicates:

Map<Boolean, Long> byWeight = melons.stream()
  .distinct()
  .collect(partitioningBy(m -> m.getWeight() > 2000, counting()));

This time, the output will be as follows:

{false=4, true=3}

Finally, partitioningBy() can be combined with collectingAndThen(), which we introduced in the Grouping section. For example, let's partition the melons around the 2,000 g threshold and keep only the heaviest melon from each partition:

Map<Boolean, Melon> byWeightMax = melons.stream()
  .collect(partitioningBy(m -> m.getWeight() > 2000,
    collectingAndThen(maxBy(comparingInt(Melon::getWeight)),
      Optional::get)));

The output will be as follows:

{false=Horned(1700g), true=Gac(3000g)}
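One caveat is worth noting: partitioningBy() applies the downstream collector to both groups, so if one group is empty, maxBy() yields an empty Optional and Optional::get throws NoSuchElementException. A possible variant, sketched below with an assumed minimal Melon class and hypothetical names, drops collectingAndThen() and keeps the Optional as the map value, so the caller decides how to handle an empty group:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class HeaviestPerPartition {

    // Assumed minimal Melon with only what this example needs.
    static final class Melon {
        private final String type;
        private final int weight;

        Melon(String type, int weight) {
            this.type = type;
            this.weight = weight;
        }

        int getWeight() { return weight; }

        @Override
        public String toString() { return type + "(" + weight + "g)"; }
    }

    // Keeps the Optional produced by maxBy() instead of unwrapping it.
    static Map<Boolean, Optional<Melon>> heaviest(List<Melon> melons) {
        return melons.stream().collect(
            Collectors.partitioningBy(m -> m.getWeight() > 2000,
                Collectors.maxBy(Comparator.comparingInt(Melon::getWeight))));
    }

    public static void main(String[] args) {
        // No melon exceeds 2,000 g, so the true group is empty.
        List<Melon> light = Arrays.asList(new Melon("Crenshaw", 1200),
            new Melon("Hemi", 1600), new Melon("Horned", 1700));

        System.out.println(heaviest(light));
        // {false=Optional[Horned(1700g)], true=Optional.empty}
    }
}
```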
