Errors and recovery

Generally, the question that needs to be asked of your application is: is it critical that you receive and process all of the data? If not, then on failure you might simply be able to restart the application and discard the missing or lost data. If that is not acceptable, then you will need to use checkpointing, which will be described in the next section.

It is also worth noting that your application's error management should be robust and self-sufficient. What we mean by this is that if an exception is non-critical, you should handle it, perhaps log it, and continue processing. If you do not, Spark will eventually step in: for instance, when a task reaches the maximum number of failures (specified by spark.task.maxFailures), processing will be terminated.
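As a rough sketch of handling a non-critical failure locally (the input values and the "skip bad records" policy here are purely illustrative), the following Scala snippet logs a record it cannot parse and carries on, rather than letting the exception propagate and count against the task's failure limit:

import org.apache.spark.{SparkConf, SparkContext}
import scala.util.{Failure, Success, Try}

object NonCriticalErrorExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("non-critical-errors")
      .setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Illustrative input: one record cannot be parsed as an integer.
    val lines = sc.parallelize(Seq("1", "2", "oops", "4"))

    // Treat a parse failure as non-critical: log it and skip the record
    // instead of letting the exception fail the whole task.
    val parsed = lines.flatMap { line =>
      Try(line.toInt) match {
        case Success(n) => Some(n)
        case Failure(e) =>
          println(s"Skipping bad record '$line': ${e.getMessage}")
          None
      }
    }

    println(s"Valid records: ${parsed.collect().mkString(", ")}")
    sc.stop()
  }
}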

This property, among others, can be set when the SparkContext object is created, or as additional command-line parameters when invoking spark-shell or spark-submit.
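As a minimal sketch (the application name, class name, and jar file below are placeholders), the property could be set programmatically when building the configuration for the SparkContext:

import org.apache.spark.{SparkConf, SparkContext}

// Allow up to eight attempts per task before the job is aborted.
val conf = new SparkConf()
  .setAppName("error-recovery-example")
  .set("spark.task.maxFailures", "8")

val sc = new SparkContext(conf)

Alternatively, the same property can be supplied on the command line with the --conf option, for example:

spark-submit --conf spark.task.maxFailures=8 --class com.example.MyApp myapp.jar

spark-shell --conf spark.task.maxFailures=8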