flatMap

In a flatMap() transformation, each element of the source RDD can be mapped to zero or more elements of the target RDD. A function is executed on every element of the source RDD and may produce any number of outputs for it. The function passed to flatMap() returns a java.util.Iterator, that is, a sequence of elements, and these sequences are flattened into the target RDD.

The following example of the flatMap() transformation executes a function on an RDD of strings that splits every string element into words and produces an RDD with those words as its individual elements.

We will start with creating an RDD of strings using the javaSparkContext:

JavaRDD<String> stringRDD = javaSparkContext.parallelize(
    Arrays.asList("Hello Spark", "Hello Java"));

Java 7:

stringRDD.flatMap(new FlatMapFunction<String, String>() {
    @Override
    public Iterator<String> call(String t) throws Exception {
        return Arrays.asList(t.split(" ")).iterator();
    }
});

Java 8:

stringRDD.flatMap(t -> Arrays.asList(t.split(" ")).iterator());

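In this case, the source RDD has two elements and the target RDD has four (Hello, Spark, Hello, Java). Because every string here yields at least one word, the number of elements in the target RDD is greater than or equal to the number of elements in the source RDD.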
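
The general behavior, including inputs that produce no output at all, can be sketched in plain Java without a Spark dependency. This is only a minimal model of the semantics, not Spark's implementation: the class FlatMapSketch and the helpers flatMap and splitWords are hypothetical names introduced for illustration, and ordinary lists stand in for RDDs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

class FlatMapSketch {

    // Hypothetical helper (not a Spark API): applies f to every element of
    // source and appends everything the returned iterators yield to one list.
    public static <T, R> List<R> flatMap(List<T> source, Function<T, Iterator<R>> f) {
        List<R> target = new ArrayList<>();
        for (T element : source) {
            Iterator<R> outputs = f.apply(element);
            while (outputs.hasNext()) {
                target.add(outputs.next());
            }
        }
        return target;
    }

    // Hypothetical helper: splits a line into words. An empty line yields
    // an empty iterator, so it contributes no elements to the result.
    public static Iterator<String> splitWords(String line) {
        if (line.isEmpty()) {
            return Collections.emptyIterator();
        }
        return Arrays.asList(line.split(" ")).iterator();
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("Hello Spark", "", "Hello Java");
        List<String> words = flatMap(source, FlatMapSketch::splitWords);
        System.out.println(words); // prints [Hello, Spark, Hello, Java]
    }
}
```

Here the three-element source yields four elements, and the empty string contributes none. In this sketch, a source consisting only of empty strings would yield an empty result, so the target can also be smaller than the source.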
