Summary

In this chapter, you saw how difficult testing and debugging Spark applications can be, and how much more critical both become in a distributed environment. We also discussed some advanced ways to tackle them. In summary, you first learned how to test in a distributed environment, then a better way of testing your Spark application, and finally some advanced ways of debugging Spark applications.

We believe that this book will help you gain a good understanding of Spark. Nevertheless, due to page limitations, we could not cover many APIs and their underlying functionalities. If you face any issues, please report them to the Spark user mailing list at user@spark.apache.org. Before doing so, make sure that you have subscribed to it.

This is more or less the end of our little journey through advanced topics in Spark. Our general suggestion to you as a reader, especially if you are relatively new to data science, data analytics, machine learning, Scala, or Spark, is to first understand what type of analytics you want to perform. For example, if your problem is a machine learning problem, try to work out which type of learning algorithm is the best fit: classification, clustering, regression, recommendation, or frequent pattern mining. Then define and formulate the problem, and after that generate or download the appropriate data and prepare it using the Spark feature engineering concepts that we discussed earlier. On the other hand, if you think that you can solve your problem using deep learning algorithms or APIs, you can use third-party libraries, integrate them with Spark, and start working straight away.

Our final recommendation to readers is to browse the Spark website (http://spark.apache.org/) regularly for updates, and to try combining the standard Spark-provided APIs with third-party applications or tools to get the best results from using them together.
