This is the cumulative form of mapPartitions and mapToPair. Like mapPartitions, it runs map transformations on every partition of the RDD, and instead of JavaRDD<T>, this transformation returns JaPairRDD <K,V>.
In the following example, will convert JavaPairRDD of <String, Integer> type using mapPartitionsToPair:
Java 7:
JavaPairRDD<String, Integer> pairRDD = intRDD.mapPartitionsToPair(new PairFlatMapFunction<Iterator<Integer>, String,Integer>() {
@Override
public Iterator<Tuple2<String, Integer>> call(Iterator<Integer> t) throws Exception {
List<Tuple2<String,Integer>> list =new ArrayList<>();
while(t.hasNext()) {
int element=t.next();
list.add(element%2==0?new Tuple2<String, Integer>("even",element): new
Tuple2<String, Integer>("odd",element));
}
return list.iterator();
}});
Java 8:
JavaPairRDD<String, Integer> pairRDD = intRDD.mapPartitionsToPair(t -> {
List<Tuple2<String,Integer>> list =new ArrayList<>();
while(t.hasNext()){
int element=t.next();
list.add(element%2==0?new Tuple2<String, Integer>("even",element):
new Tuple2<String, Integer>("odd",element));
}
return list.iterator();});
Like mapPartitions and mapPartitionsWithIndex, mapPartitionsToPair also provides an overoaded method that helps user to specify value for preservePartioning.