Cloud

Although most of this book concentrates on examples of Apache Spark installed on physical, server-based clusters (with the exception of https://databricks.com/), I want to make the point that there are multiple cloud-based options available. Some cloud-based systems use Apache Spark as an integrated component, while others offer Spark as a service. Even though this book cannot cover all of them in depth, I thought that it would be useful to mention some of them:

  • Databricks is covered in two chapters of this book. It offers a cloud-based Spark service, currently hosted on AWS EC2, and there are plans to extend the service to other cloud suppliers (https://databricks.com/).
  • At the time of writing this book (July 2015), Microsoft Azure has been extended to offer Spark support.
  • Apache Spark and Hadoop can be installed on Google Cloud.
  • The Oryx system has been built on top of Spark and Kafka for real-time, large-scale machine learning (http://oryx.io/).
  • The Velox system for serving machine learning predictions is based upon Spark and KeystoneML (https://github.com/amplab/velox-modelserver).
  • PredictionIO is an open source machine learning service built on Spark, HBase, and Spray (https://prediction.io/).
  • SeldonIO is an open source predictive analytics platform, based upon Spark, Kafka, and Hadoop (http://www.seldon.io/).