Utilizing the groupBy operation

The groupBy operation by itself doesn't involve any repartitioning. It converts the input stream into a grouped stream, and its main purpose is to modify the behavior of the subsequent aggregate function. The following diagram shows how the groupBy operation groups the tuples of a single partition:

Working of the groupBy operation

  • If the groupBy operation is used before a partition aggregate, the partition aggregate runs the aggregate on each group created within the partition. No tuples move between partitions.
  • If the groupBy operation is used before an aggregate, tuples of the same batch are first repartitioned so that all tuples of a group land in a single partition, and the groupBy operation is then applied within each partition. Finally, the aggregate operation runs on each group.
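The two cases above can be modeled in plain Java without any Storm dependency. This is only a rough sketch of the semantics, not Trident code: the class GroupBySketch, its method names, the use of a count as the aggregate, and the hash-based routing of keys to partitions are all assumptions made for illustration. A batch is modeled as a list of partitions, and each tuple is reduced to its grouping key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of the two groupBy cases; not part of the Trident API.
final class GroupBySketch {

    // Groups the tuples of ONE partition by key and counts each group.
    public static Map<String, Integer> countPerGroup(List<String> partition) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : partition) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    // groupBy followed by a partition aggregate: tuples stay where they are,
    // so the same key may produce one result per partition.
    public static List<Map<String, Integer>> groupThenPartitionAggregate(
            List<List<String>> partitions) {
        List<Map<String, Integer>> result = new ArrayList<>();
        for (List<String> partition : partitions) {
            result.add(countPerGroup(partition));
        }
        return result;
    }

    // groupBy followed by an aggregate: tuples are first routed by key
    // (assumed here: hash of the key modulo the partition count), so every
    // group ends up in exactly one partition before it is aggregated.
    public static List<Map<String, Integer>> groupThenAggregate(
            List<List<String>> partitions, int targetPartitions) {
        List<List<String>> repartitioned = new ArrayList<>();
        for (int i = 0; i < targetPartitions; i++) {
            repartitioned.add(new ArrayList<>());
        }
        for (List<String> partition : partitions) {
            for (String key : partition) {
                int target = Math.floorMod(key.hashCode(), targetPartitions);
                repartitioned.get(target).add(key);
            }
        }
        return groupThenPartitionAggregate(repartitioned);
    }

    public static void main(String[] args) {
        List<List<String>> batch = Arrays.asList(
                Arrays.asList("a", "b", "a"),
                Arrays.asList("b", "c", "a"));
        System.out.println(groupThenPartitionAggregate(batch));
        System.out.println(groupThenAggregate(batch, 2));
    }
}
```

In the first call, the key "a" is counted separately in each partition. In the second, all tuples with key "a" are brought together before counting, so each group yields a single result for the batch.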

So far, we have covered the basics of the Trident APIs. In the following section, we will cover how to write a non-transactional topology in Trident.
