Yet Another Resource Negotiator (YARN)

Hadoop YARN is one of the most popular resource managers in the big data world, and Apache Spark provides seamless integration with it. Spark applications can be deployed to YARN using the same spark-submit command used for the standalone cluster.

Apache Spark requires the HADOOP_CONF_DIR or YARN_CONF_DIR environment variable to be set and pointing to the Hadoop configuration directory, which contains core-site.xml, yarn-site.xml, and so on. These configuration files are required to connect to the YARN cluster.
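As a minimal sketch, the environment variable can be exported and an application submitted as follows. The configuration path and the SparkPi example jar are assumptions; use the directory that actually holds your core-site.xml and yarn-site.xml, and your own application jar:

```shell
# Point Spark at the Hadoop client configuration
# (path is an assumption; adjust to your installation).
export HADOOP_CONF_DIR=/etc/hadoop/conf

# Submit the bundled SparkPi example to YARN in cluster mode.
# Guarded so the snippet is a no-op where spark-submit is not installed.
if command -v spark-submit >/dev/null 2>&1; then
  spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --class org.apache.spark.examples.SparkPi \
    "$SPARK_HOME"/examples/jars/spark-examples_*.jar 100 \
    || echo "submission failed (is the YARN cluster running?)"
fi
```

Note that with --master yarn there is no host:port to specify, unlike the standalone master URL; Spark discovers the resource manager's address from the configuration files under HADOOP_CONF_DIR.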

To run Spark applications on YARN, the YARN cluster must be started first. Refer to the official Hadoop documentation, which describes how to start a YARN cluster: https://hadoop.apache.org/docs

YARN, in general, consists of a resource manager (RM) and multiple node managers (NMs), where the resource manager is the master node and the node managers are slave nodes. Each NM sends a detailed report to the RM at a defined interval, telling the RM how many resources (such as CPU cores and RAM) are available on that node.
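The NM reports aggregated by the RM can be inspected from the command line. The snippet below queries the RM for all registered NodeManagers and falls back to a message when no cluster (or no yarn CLI) is available:

```shell
# Ask the RM which NodeManagers have registered and what they reported
# (state, running containers, memory and vcores used/available per node).
nodes=$(yarn node -list -all 2>/dev/null \
        || echo "yarn CLI or cluster unavailable")
echo "$nodes"
```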

To deploy an application, a client connects to the RM and submits the application. Based on the application's configuration, the RM finds an available slot on one of the NMs and runs the application master (AM) for the submitted application in that slot. As soon as the AM initializes, it connects to the RM and negotiates resources, that is, it asks for containers to run the application. Then, based on cluster resource availability, the RM gives the AM the addresses of NMs along with credentials (security tokens) to launch containers on them. The AM connects to those NMs using the security credentials, and each NM schedules containers for the application. When the containers are launched, they connect to the AM and run the application.
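Once an application has been submitted through this flow, the RM can report on it. A sketch of how to list running applications (the application ID printed here is what you would pass to yarn logs -applicationId to retrieve the AM and container logs):

```shell
# List applications the RM is currently running; falls back gracefully
# when no cluster / yarn CLI is available.
apps=$(yarn application -list -appStates RUNNING 2>/dev/null \
       || echo "yarn CLI or cluster unavailable")
echo "$apps"
```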

In YARN, if the AM crashes, the RM restarts it (on the same node or a different one); however, in that case all of the application's containers also need to be restarted.
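The number of AM restart attempts is configurable. For Spark, spark.yarn.maxAppAttempts controls how many times the AM is retried (capped by YARN's own yarn.resourcemanager.am.max-attempts). A sketch, with the attempt count, example class, and jar path as assumptions:

```shell
# Assumption: allow up to 4 AM attempts before the application is failed.
MAX_ATTEMPTS=4

# Guarded so the snippet is a no-op where spark-submit is not installed.
if command -v spark-submit >/dev/null 2>&1; then
  spark-submit --master yarn --deploy-mode cluster \
    --conf spark.yarn.maxAppAttempts=$MAX_ATTEMPTS \
    --class org.apache.spark.examples.SparkPi \
    "$SPARK_HOME"/examples/jars/spark-examples_*.jar 100 \
    || echo "submission failed (is the YARN cluster running?)"
fi
```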

Similar to the standalone cluster, Spark applications on YARN also run in two modes: client mode and cluster mode.
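The two modes differ only in where the driver runs: in client mode the driver stays in the submitting process and only the executors run in YARN containers, while in cluster mode the driver runs inside the AM container. A client-mode sketch (class and jar path are illustrative):

```shell
DEPLOY_MODE=client   # the driver runs in this shell's JVM

# Guarded so the snippet is a no-op where spark-submit is not installed.
if command -v spark-submit >/dev/null 2>&1; then
  spark-submit --master yarn --deploy-mode "$DEPLOY_MODE" \
    --class org.apache.spark.examples.SparkPi \
    "$SPARK_HOME"/examples/jars/spark-examples_*.jar 100 \
    || echo "submission failed (is the YARN cluster running?)"
fi
```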
