Standalone

By specifying a local URL as the Spark master configuration, it is possible to have the application run locally. By specifying local[n], it is possible to have Spark use n threads to run the application locally. This is a useful development and test option because you can test parallelization scenarios while keeping all log files on a single machine. The standalone mode, by contrast, uses a basic cluster manager that is supplied with Apache Spark. The Spark master URL will be as follows:
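As a quick illustration, a local-mode run might be submitted as follows; the class name SimpleApp and the JAR file simple-app.jar are placeholders, not names from this chapter:

```shell
# Run the driver and executors in a single JVM, using 4 worker threads.
# SimpleApp and simple-app.jar are hypothetical names for illustration.
spark-submit \
  --class SimpleApp \
  --master local[4] \
  simple-app.jar
```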

spark://<hostname>:7077

Here, <hostname> is the name of the host on which the Spark master is running. I have specified 7077 as the port, which is the default value, but this is configurable. This simple cluster manager currently supports only FIFO (first in, first out) scheduling. You can contrive to allow concurrent application scheduling by setting the resource configuration options for each application; for example, spark.cores.max is used to share the processor cores between applications. A more detailed discussion will follow later in this chapter.
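A standalone-mode submission that caps the cores one application may claim could be sketched as shown here; again, the host name, class name, and JAR file are placeholders:

```shell
# Submit to a standalone master, limiting this application to 4 cores
# cluster-wide so that other applications can be scheduled concurrently.
spark-submit \
  --class SimpleApp \
  --master spark://hostname:7077 \
  --conf spark.cores.max=4 \
  simple-app.jar
```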
