ClusterID mismatch for namenode and datanodes in fully distributed cluster.

This topic contains 4 replies, has 3 voices, and was last updated by Juan-Manuel Clavijo 2 weeks, 6 days ago.

    #57700

    Rushikesh Deshmukh
    Participant

    I have set up a fully distributed Hadoop cluster with a namenode, a secondary namenode, and 6 datanodes, each in a separate Docker container on the same machine. After formatting the namenode, I start all processes using ./start-dfs.sh. I can see the namenode and secondary namenode processes, but not the datanode. The datanode logs show the error below:

    FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-1934698098-..-1406095928965 (storage id DS-1439452174-10.15.10.209-50010-1405956351221) service to /..:8020
    java.io.IOException: Incompatible clusterIDs in /root/datanode: namenode clusterID = CID-7553ffee-bf82-44ba-8eea-75d96fc9ffee; datanode clusterID = CID-58e351b4-e758-4051-8267-1857e1de1fb0
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:391)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:191)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:219)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:837)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:808)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:280)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:222)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:664)
    at java.lang.Thread.run(Thread.java:744)
    2014-07-23 16:10:42,809 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-1934698098-..-1406095928965 (storage id DS-1439452174-..-50010-1405956351221) service to /..:8020

Viewing 4 replies - 1 through 4 (of 4 total)


    #60229

    Juan-Manuel Clavijo
    Participant

    This can happen if you’re testing out one of the examples, or have somehow accidentally ended up with a different clusterID, and have not set a specific directory to store the data. By default, Hadoop puts the data and the VERSION file under the /tmp/hadoop-{user}/dfs folder. In there you can see that the data and name folders each have a current folder containing a VERSION file. This is where the CID is located. I deleted these folders since I wasn’t working on anything critical, but I think shutting down and changing the cluster ID in all the VERSION files that you are using will solve this problem.
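To compare the two IDs without opening the files by hand, something like the sketch below could help. It assumes the VERSION file is plain key=value text with '#' comment lines and a clusterID key, as described above. The function names are made up for illustration, and the paths you pass in depend on your own setup.

```python
from pathlib import Path
from typing import Optional


def read_cluster_id(version_file: Path) -> Optional[str]:
    """Return the clusterID value from a VERSION file, or None if it has none."""
    for line in version_file.read_text().splitlines():
        line = line.strip()
        # Assumption: key=value lines, with '#' marking comment lines.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "clusterID":
            return value.strip()
    return None


def cluster_ids_match(name_version: Path, data_version: Path) -> bool:
    """True only when both files carry a clusterID and the two values are equal."""
    namenode_id = read_cluster_id(name_version)
    datanode_id = read_cluster_id(data_version)
    return namenode_id is not None and namenode_id == datanode_id
```

With the default layout mentioned above, the two files would be /tmp/hadoop-{user}/dfs/name/current/VERSION and /tmp/hadoop-{user}/dfs/data/current/VERSION. In a multi-container setup each file lives in its own container, so you would run read_cluster_id in each one and compare the printed values.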

    #57713

    Ramesh Babu
    Participant

    Stop all the Hadoop services. Copy the namenode CID to the datanode CID. Then start the Hadoop services.
    (or)
    Stop all the Hadoop services. Delete the namenode and datanode data directories. Format the namenode and start the datanode.

    In both cases, check that the directory paths are correct. Check hdfs-site.xml on each node accordingly.
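For the hdfs-site.xml check, a small sketch like this one lists which storage directories a given config file actually sets, so you can confirm you are editing or deleting the directory the daemon really uses. The function name is hypothetical. The property names are the Hadoop 2 ones plus the older dfs.name.dir / dfs.data.dir forms (the original post uses dfs.data.dir). Where hdfs-site.xml lives is an assumption you need to fill in for your install.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Storage-directory properties to look for, newer names first, then the older ones.
STORAGE_KEYS = (
    "dfs.namenode.name.dir",
    "dfs.datanode.data.dir",
    "dfs.name.dir",
    "dfs.data.dir",
)


def storage_dirs(hdfs_site_path: str) -> Dict[str, str]:
    """Return the storage-directory properties explicitly set in hdfs-site.xml."""
    root = ET.parse(hdfs_site_path).getroot()
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in STORAGE_KEYS:
            found[name] = value
    return found
```

An empty result means none of these properties is set in that file, in which case the /tmp/hadoop-{user}/dfs default mentioned elsewhere in this thread would apply.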

    #57702

    Ramesh Babu
    Participant

    The error is incompatible clusterIDs: namenode clusterID = CID-7553ffee-bf82-44ba-8eea-75d96fc9ffee; datanode clusterID = CID-58e351b4-e758-4051-8267-1857e1de1fb0.
    Change the datanode cluster ID in the VERSION file under the datanode HDFS directory path.
    (or)
    If you don’t have any data in the cluster, please remove the namenode and datanode directories and format the namenode.
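If you go with the first option, the edit to the VERSION file could be scripted roughly as below. This is only a sketch: it assumes the file is plain key=value text with a single clusterID line, the function name and the .bak backup are my own choices, and it should only be run while the Hadoop services are stopped.

```python
import shutil
from pathlib import Path


def set_cluster_id(version_file: Path, new_cluster_id: str) -> bool:
    """Rewrite the clusterID line in a VERSION file; return True if one was found."""
    lines = version_file.read_text().splitlines()
    found = False
    for i, line in enumerate(lines):
        if line.strip().startswith("clusterID="):
            lines[i] = f"clusterID={new_cluster_id}"
            found = True
    if found:
        # Keep a copy of the original file next to it before overwriting.
        backup = version_file.with_name(version_file.name + ".bak")
        shutil.copy2(version_file, backup)
        version_file.write_text("\n".join(lines) + "\n")
    return found
```

For the IDs in this thread, the call on a datanode would pass the path to that datanode's current/VERSION file and the namenode value CID-7553ffee-bf82-44ba-8eea-75d96fc9ffee.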

    #57701

    Rushikesh Deshmukh
    Participant

    I have also tried clearing the data in the dfs.data.dir directory, which is “/root/datanode”, and then starting the datanode process again, either from the datanode container or from the namenode container. In both cases the datanode’s ClusterID remains the same, ‘CID-58e351b4-e758-4051-8267-1857e1de1fb0’, which doesn’t match the namenode’s ClusterID. If I copy the ClusterID from the VERSION file in the namenode container and paste it into the VERSION file in the datanode container, then after starting the datanode its ClusterID is still ‘CID-58e351b4-e758-4051-8267-1857e1de1fb0’, so it still does not match the namenode. If I format the namenode, the namenode’s ClusterID changes, but after starting the datanode process its ClusterID remains the old one, i.e. ‘CID-58e351b4-e758-4051-8267-1857e1de1fb0’, so I am not able to start the datanode processes on the datanode containers. Can you suggest what I might be missing?
