Container failures

Whenever a container finishes, the ResourceManager informs the ApplicationMaster of the event. The ApplicationMaster then interprets the container status it receives through the ResourceManager, using the container's exit status to decide whether the container succeeded or failed. The ApplicationMaster is responsible for handling the failures of the job's containers.

Managing container failures is the responsibility of the application framework; the responsibility of the YARN framework is only to provide the application framework with the information it needs. As part of the response to the allocate API call, the ResourceManager returns information on finished containers to the corresponding ApplicationMaster. It is then the responsibility of the ApplicationMaster to examine each container's status, exit code, and diagnostic information and to take appropriate action. For example, the MapReduce ApplicationMaster retries failed map and reduce tasks by requesting new containers, until the configured number of tasks have failed for a single job.
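The decision the ApplicationMaster makes for each finished container can be sketched as follows. This is a minimal, dependency-free illustration and not the MapReduce ApplicationMaster's actual code: the class `ContainerFailureSketch`, the `CompletedContainer` type, the `Action` enum, and the retry policy are all hypothetical. Only the three exit-status values are taken from YARN's `ContainerExitStatus` constants, and they are copied here as local constants so that the example compiles without the Hadoop libraries.

```java
// Hypothetical sketch: none of these types come from the Hadoop client libraries.
class ContainerFailureSketch {
    // Values copied from org.apache.hadoop.yarn.api.records.ContainerExitStatus.
    static final int SUCCESS = 0;
    static final int ABORTED = -100;    // container released or lost by the framework
    static final int PREEMPTED = -102;  // container taken back by the scheduler

    enum Action { MARK_DONE, RETRY_NOT_COUNTED, RETRY_COUNTED, FAIL_JOB }

    // Stand-in for the per-container status an ApplicationMaster receives.
    static final class CompletedContainer {
        final String containerId;
        final int exitStatus;
        final String diagnostics;

        CompletedContainer(String containerId, int exitStatus, String diagnostics) {
            this.containerId = containerId;
            this.exitStatus = exitStatus;
            this.diagnostics = diagnostics;
        }
    }

    private final int maxCountedFailures;
    private int countedFailures = 0;

    ContainerFailureSketch(int maxCountedFailures) {
        this.maxCountedFailures = maxCountedFailures;
    }

    // Decides what to do with one finished container.
    Action handle(CompletedContainer c) {
        if (c.exitStatus == SUCCESS) {
            return Action.MARK_DONE;
        }
        // Assumed policy: a container that was aborted or preempted did not fail
        // because of the task itself, so request a replacement without counting it.
        if (c.exitStatus == ABORTED || c.exitStatus == PREEMPTED) {
            return Action.RETRY_NOT_COUNTED;
        }
        // Any other exit status counts toward the configured failure limit.
        countedFailures++;
        return countedFailures >= maxCountedFailures ? Action.FAIL_JOB : Action.RETRY_COUNTED;
    }

    int countedFailures() {
        return countedFailures;
    }
}
```

With a limit of two counted failures, for instance, a preempted container leads to a replacement request without affecting the count, the first non-zero exit code leads to a counted retry, and the second one fails the job. A real ApplicationMaster would also log the diagnostics string, which this sketch only carries along.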

To address container allocation failure scenarios, note that the ApplicationMaster obtains its containers from the ResourceManager through the allocate call, and that an AllocateResponse usually does not return any containers. The allocate call should therefore be made periodically until all requested containers have been assigned. When a container does arrive, the framework is guaranteed to have the resources it requested, and the ApplicationMaster will not receive more containers than it asked for. The ApplicationMaster can also make separate container requests (ResourceRequests), typically one per second.
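The periodic allocate pattern described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not YARN code: `AllocateLoopSketch`, `FakeResourceManager`, and `runUntilAssigned` are hypothetical names, and the fake scheduler simply replays a fixed list of grants, one entry per heartbeat, where the real ResourceManager would make scheduling decisions.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical simulation: it does not talk to a real ResourceManager.
class AllocateLoopSketch {
    // Stand-in for the scheduler side of the allocate call.
    static final class FakeResourceManager {
        private final Deque<Integer> grantsPerHeartbeat = new ArrayDeque<>();

        FakeResourceManager(int... grants) {
            for (int g : grants) {
                grantsPerHeartbeat.add(g);
            }
        }

        // Returns how many containers are handed out on this heartbeat.
        // The result is capped at the outstanding ask, so the caller never
        // receives more containers than it requested.
        int allocate(int outstanding) {
            int offered = grantsPerHeartbeat.isEmpty() ? 0 : grantsPerHeartbeat.poll();
            return Math.min(offered, outstanding);
        }
    }

    // Calls allocate once per heartbeat until every requested container has been
    // assigned. Returns the number of heartbeats used, or -1 if the request is
    // still unsatisfied after maxHeartbeats.
    static int runUntilAssigned(FakeResourceManager rm, int requested, int maxHeartbeats) {
        int assigned = 0;
        for (int beat = 1; beat <= maxHeartbeats; beat++) {
            // Most responses carry zero containers; that is not an error.
            assigned += rm.allocate(requested - assigned);
            if (assigned == requested) {
                return beat;
            }
            // A real ApplicationMaster would wait for its heartbeat interval here.
        }
        return -1;
    }
}
```

For example, `runUntilAssigned(new FakeResourceManager(0, 0, 2, 0, 5), 3, 10)` needs five heartbeats: the first two responses are empty, the third grants two containers, the fourth is empty again, and the fifth offers five containers of which only the one still outstanding is handed over.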
