YARN is Apache Hadoop's NextGen MapReduce. The Spark project provides an easy way to schedule jobs on YARN once you have built a Spark assembly. The Spark job you create must use standalone as its master URL. The example Spark applications all read the master URL from the command-line arguments, so specify --args standalone.
To run the same example as in the SSH section, do the following:
    sbt/sbt assembly    # Build the assembly
    SPARK_JAR=./core/target/spark-core-assembly-0.7.0.jar ./run spark.deploy.yarn.Client \
      --jar examples/target/scala-2.9.2/spark-examples_2.9.2-0.7.0.jar \
      --class spark.examples.GroupByTest \
      --args standalone \
      --num-workers 2 \
      --worker-memory 1g \
      --worker-cores 1
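If you launch several examples this way, the client invocation above can be wrapped in a small helper. The following is only a sketch: the function name yarn_submit_cmd and the SPARK_VERSION and SCALA_VERSION variables are hypothetical and not part of Spark. It prints the command rather than running it, so you can inspect it first.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark): prints the YARN client command
# for a given example class. Paths and flags mirror the command above.

SPARK_VERSION="0.7.0"   # assumed to match the assembly you built
SCALA_VERSION="2.9.2"   # assumed to match the examples jar directory

yarn_submit_cmd() {
  ysc_class="$1"          # fully qualified class to run
  ysc_workers="${2:-2}"   # value for --num-workers (default 2)
  ysc_memory="${3:-1g}"   # value for --worker-memory (default 1g)
  ysc_cores="${4:-1}"     # value for --worker-cores (default 1)
  # "--args standalone" is fixed because the job must read standalone
  # as its master URL.
  printf '%s\n' "SPARK_JAR=./core/target/spark-core-assembly-${SPARK_VERSION}.jar ./run spark.deploy.yarn.Client --jar examples/target/scala-${SCALA_VERSION}/spark-examples_${SCALA_VERSION}-${SPARK_VERSION}.jar --class ${ysc_class} --args standalone --num-workers ${ysc_workers} --worker-memory ${ysc_memory} --worker-cores ${ysc_cores}"
}

# Print the command for the GroupByTest example with the default settings.
yarn_submit_cmd spark.examples.GroupByTest
```

Once the printed command looks right, paste it into the shell from the Spark source directory to submit the job.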