Cartesian

The cartesian transformation generates the Cartesian product of two RDDs: each element of the first RDD is paired with each element of the second RDD. If the operation is executed on an RDD of type X and an RDD of type Y, it returns an RDD of <X,Y> pairs that contains every possible combination, so the number of elements in the result is the product of the sizes of the two input RDDs.

The cartesian transformation of an RDD of Strings and an RDD of Integers can be executed as follows:

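JavaRDD<String> rddStrings = javaSparkContext.parallelize(Arrays.asList("A", "B", "C"));
JavaRDD<Integer> rddIntegers = javaSparkContext.parallelize(Arrays.asList(1, 4, 5));
// Pairs every String with every Integer: 3 x 3 = 9 <String, Integer> pairs
JavaPairRDD<String, Integer> rddCartesian = rddStrings.cartesian(rddIntegers);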
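
To see which pairs the transformation produces without starting a Spark context, the same pairing can be sketched with plain Java collections. This only illustrates the semantics and is not how Spark computes the result. The class CartesianSketch and its cartesian helper are hypothetical names introduced for this example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Spark: mimics the pairing that cartesian() performs.
final class CartesianSketch {

    // Pairs every element of `left` with every element of `right`.
    static <X, Y> List<Map.Entry<X, Y>> cartesian(List<X> left, List<Y> right) {
        List<Map.Entry<X, Y>> pairs = new ArrayList<>();
        for (X x : left) {
            for (Y y : right) {
                pairs.add(new SimpleEntry<>(x, y));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Same inputs as the Spark example: 3 Strings and 3 Integers.
        List<Map.Entry<String, Integer>> pairs =
                cartesian(Arrays.asList("A", "B", "C"), Arrays.asList(1, 4, 5));
        System.out.println(pairs.size() + " pairs: " + pairs);
    }
}
```

With the three Strings and three Integers used above, the sketch yields nine pairs, from (A, 1) through (C, 5). In Spark, the pairs are spread across partitions, so the order in which they come back from an action should not be relied on.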

The next set of transformations works on PairRDDs:
