Saving your data

While distributed computational jobs are a lot of fun, they are much more useful when the results are stored somewhere accessible. The methods for loading an RDD are mostly found in the SparkContext class, whereas the methods for saving an RDD are defined on the RDD classes themselves. In Scala, an implicit conversion ensures that an RDD that can be saved as a sequence file is converted to the appropriate type; in Java, the conversion must be made explicitly.

Here are the different ways to save an RDD:

  • Scala:
    rddOfStrings.saveAsTextFile("out.txt")
    keyValueRdd.saveAsSequenceFile("sequenceOut")
  • Java:
    rddOfStrings.saveAsTextFile("out.txt");
    keyValueRdd.saveAsSequenceFile("sequenceOut");
  • Python:
    rddOfStrings.saveAsTextFile("out.txt")
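
One detail worth keeping in mind is that saveAsTextFile treats its argument as a directory, so "out.txt" ends up as a folder holding one part file per partition (part-00000, part-00001, and so on) rather than a single text file. The following is a minimal sketch, in plain Python without Spark, of how such a directory could be read back on a local filesystem. The helper name read_text_output is hypothetical, and the sketch assumes uncompressed output.

```python
import glob
import os


def read_text_output(output_dir):
    """Collect the lines from every part file in a saveAsTextFile output directory.

    Hypothetical helper for illustration; assumes a local, uncompressed output.
    """
    lines = []
    # Part files are zero-padded (part-00000, part-00001, ...), so a plain
    # lexicographic sort visits them in partition order. Marker files such as
    # _SUCCESS do not match the pattern and are skipped.
    for part in sorted(glob.glob(os.path.join(output_dir, "part-*"))):
        with open(part, encoding="utf-8") as handle:
            lines.extend(line.rstrip("\n") for line in handle)
    return lines
```

For the examples above, read_text_output("out.txt") would return the saved strings as a Python list once the job has finished writing.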