The groupBy operation doesn't involve any repartitioning. It converts the input stream into a grouped stream, and its main function is to modify the behavior of the subsequent aggregate function. The following diagram shows how the groupBy operation groups the tuples of a single partition:
If the groupBy operation is used before the partition aggregate, then the partition aggregate runs the aggregate on each group created within the partition.

If the groupBy operation is used before the aggregate, then tuples of the same batch are first repartitioned into a single partition, and the groupBy operation is then applied on each single partition. At the end, the aggregate operation is performed on each group.

So far, we have covered the basics of the Trident APIs. In the following section, we will cover how to write a non-transactional topology in Trident.
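To make the two groupBy cases described above concrete, the following is a minimal sketch that simulates them with plain Java collections. It does not use the Trident API: the class name GroupBySketch, its method names, and the use of a count as the aggregate are all assumptions made for illustration, and a batch is modeled simply as a list of partitions holding string keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, plain-Java simulation of the semantics described in the text.
// This is not Trident code; all names here are illustrative.
class GroupBySketch {

    // Groups the tuples of ONE partition by key and counts each group.
    public static Map<String, Integer> countPerGroup(List<String> partition) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : partition) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    // Case 1: groupBy before the partition aggregate. Each partition is
    // grouped and aggregated independently, so the same key can produce
    // one result per partition.
    public static List<Map<String, Integer>> groupThenPartitionAggregate(
            List<List<String>> partitions) {
        List<Map<String, Integer>> results = new ArrayList<>();
        for (List<String> partition : partitions) {
            results.add(countPerGroup(partition));
        }
        return results;
    }

    // Case 2: groupBy before the aggregate, as the text describes it.
    // Tuples of the batch are first brought into a single partition,
    // then grouped, then aggregated per group.
    public static Map<String, Integer> groupThenAggregate(
            List<List<String>> partitions) {
        List<String> merged = new ArrayList<>();
        for (List<String> partition : partitions) {
            merged.addAll(partition);
        }
        return countPerGroup(merged);
    }

    public static void main(String[] args) {
        // One batch spread over two partitions.
        List<List<String>> batch = Arrays.asList(
                Arrays.asList("a", "b", "a"),
                Arrays.asList("b", "c"));

        // prints [{a=2, b=1}, {b=1, c=1}]
        System.out.println(groupThenPartitionAggregate(batch));
        // prints {a=2, b=2, c=1}
        System.out.println(groupThenAggregate(batch));
    }
}
```

In this sketch, the key "b" appears in both partitions, so the first case reports it twice (once per partition), while the second case reports a single combined count for the batch.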