Distributed computational jobs are a lot of fun, but they are far more useful when the results get stored somewhere accessible. While the methods for loading an RDD are largely found in the SparkContext class, the methods for saving an RDD are defined on the RDD classes. In Scala, an implicit conversion exists so that an RDD that can be saved as a sequence file is converted to the appropriate type; in Java, explicit conversion must be used.
Here are the different ways to save an RDD:
rddOfStrings.saveAsTextFile("out.txt")
keyValueRdd.saveAsSequenceFile("sequenceOut")
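To put the two calls in context, here is a minimal Scala sketch of a complete job that builds both RDDs and saves them. The object name SaveRddSketch, the run helper, the sample data, the local[2] master, and the temporary output directory are all assumptions made for illustration; only saveAsTextFile and saveAsSequenceFile come from the text above. The sketch assumes Spark 1.3 or later, where the implicit conversion is picked up automatically; older releases also need import org.apache.spark.SparkContext._.

```scala
import java.nio.file.Files

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object SaveRddSketch {

  // Hypothetical helper: builds two small RDDs and saves them under baseDir.
  def run(baseDir: String): Unit = {
    // Local master and app name are assumptions so the sketch runs standalone.
    val conf = new SparkConf().setMaster("local[2]").setAppName("SaveRddSketch")
    val sc = new SparkContext(conf)
    try {
      val rddOfStrings: RDD[String] = sc.parallelize(Seq("spark", "hadoop", "hive"))

      // String keys and Int values can both be converted to Hadoop Writables.
      val keyValueRdd: RDD[(String, Int)] = rddOfStrings.map(s => (s, s.length))

      // Defined directly on RDD. "out.txt" becomes a directory of part files,
      // one per partition, not a single file.
      rddOfStrings.saveAsTextFile(s"$baseDir/out.txt")

      // Not defined on RDD itself: the compiler applies an implicit conversion
      // to SequenceFileRDDFunctions because this is an RDD of key-value pairs.
      keyValueRdd.saveAsSequenceFile(s"$baseDir/sequenceOut")
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    // A fresh temporary directory avoids failing on an existing output path.
    val baseDir = Files.createTempDirectory("rdd-save").toString
    run(baseDir)
    println(s"Saved RDDs under $baseDir")
  }
}
```

Both save methods refuse to overwrite an existing output path, which is why the sketch writes into a newly created temporary directory rather than the working directory.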