Common problems and their solutions

The following is a list of common problems and their solutions:

  • When I try to format the HDFS node, I get the exception java.io.IOException: Incompatible clusterIDs in namenode and datanode?

    This issue usually appears if you have a different/older cluster and you are trying to format a new namenode; however, the datanodes still point to older cluster ids. This can be handled by one of the following:

    1. By deleting the DFS data folder, you can find the location from hdfs-site.xml and restart the cluster
    2. By modifying the version file of HDFS usually located at <HDFS-STORAGE-PATH>/hdfs/datanode/current/
    3. By formatting namenode with the problematic datanode's cluster ID:
        $ hdfs namenode -format -clusterId <cluster-id>
      
  • My Hadoop instance is not starting up with the ./start-all.sh script? When I try to access the web application, it shows the page not found error?

    This could be happening because of a number of issues. To understand the issue, you must look at the Hadoop logs first. Typically, Hadoop logs can be accessed from the /var/log folder if the precompiled binaries are installed as the root user. Otherwise, they are available inside the Hadoop installation folder.

  • I have setup N node clusters, and I am running the Hadoop cluster with ./start-all.sh. I am not seeing many nodes in the YARN/NameNode web application?

    This again can be happening due to multiple reasons. You need to verify the following:

    1. Can you reach (connect to) each of the cluster nodes from namenode by using the IP address/machine name? If not, you need to have an entry in the /etc/hosts file.
    2. Is the ssh login working without password? If not, you need to put the authorization keys in place to ensure logins without password.
    3. Is datanode/nodemanager running on each of the nodes, and can you connect to namenode/AM? You can validate this by running ssh on the node running namenode/AM.
    4. If all these are working fine, you need to check the logs and see if there are any exceptions as explained in the previous question.
    5. Based on the log errors/exceptions, specific action has to be taken.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.222.22.216