HDP on Linux – Installation Forum

HDFS down and not coming back

  • #30842
    Ardavan Moinzadeh

I have uploaded my error log for the HDFS failure, named Snamenode failed.txt. Can you tell me what caused the failure?

I have a cluster of three nodes, A, B, and C, with the following architecture:
A: Namenode / Nagios / Ganglia collector / Hiveserver2 / Hive Metastore / WebHCat / HBase master / Oozie server / Zookeeper
B: Snamenode / Jobtracker / Zookeeper
What is the solution to bring HDFS up again?


  • #30911
    Sasha J

    Just start HDFS again.
    It should start normally from the second try.

    Thank you!

    Ardavan Moinzadeh

Before submitting this post I did try to start HDFS, both from Ambari and from the boxes by running ./start-dfs.sh.
It's not coming up!
I have 3 alerts on my hosts:
JobTracker and SNamenode are down on B and not coming up.
Ironically, A & C are green, but HDFS is still down.

    Sasha J

This means you have something misconfigured somehow…
Or some permission issues.

Take a look at the NN, DN, and SNN logs.
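That log check can be scripted. Below is a minimal sketch that pulls the most recent errors out of each daemon's .log file; the log directory is an assumption (HDP 1.x commonly uses /var/log/hadoop/&lt;user&gt;), so adjust it for your install.

```shell
#!/bin/sh
# Minimal sketch: show recent errors from the newest NN/DN/SNN .log file.
# ASSUMPTION: logs live under $HADOOP_LOG_DIR; the default below is
# illustrative, not a guaranteed HDP path.
show_errors() {
  dir=$1
  for daemon in namenode datanode secondarynamenode; do
    # newest .log (not .out) file for this daemon, if any
    log=$(ls -t "$dir"/hadoop-*-"$daemon"-*.log 2>/dev/null | head -1)
    [ -n "$log" ] || { echo "no $daemon log found in $dir"; continue; }
    echo "=== recent problems in $log ==="
    grep -E 'ERROR|FATAL|Exception' "$log" | tail -20
  done
}

show_errors "${HADOOP_LOG_DIR:-/var/log/hadoop/hdfs}"
```

Run it on each node; the .log files carry the full daemon history, whereas the .out files only capture stdout/stderr from the launcher.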


    Ardavan Moinzadeh

Why can't I start the SNN?
It says it cannot assign the requested address! What does that mean? I am able to SSH between all 3 nodes, and the /etc/hosts file on each node is correct!

logging to /var/log/hadoop/root/hadoop-root-secondarynamenode-bddec1v6-0011.out
localhost: Exception in thread "main" java.net.BindException: Cannot assign requested address
localhost: at sun.nio.ch.Net.bind(Native Method)
localhost: at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
localhost: at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
localhost: at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
localhost: at org.apache.hadoop.http.HttpServer.start(HttpServer.java:602)
localhost: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initializeHttpWebServer(SecondaryNameNode.java:278)
localhost: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:218)
localhost: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.&lt;init&gt;(SecondaryNameNode.java:150)
localhost: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:676)

    Sasha J

Is the NameNode process running?
Are the DataNode processes running?
What do your /etc/hosts files contain?
It seems to me that you use "localhost" as the node name on all of them, which is incorrect.

What does the NameNode LOG file say? Not .out, but .log.

    Thank you!
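Both of these questions can be checked from the shell on each node. A minimal sketch (it assumes a JDK with jps on the PATH; the /etc/hosts rule is the usual one, that no cluster hostname should resolve to 127.0.0.1):

```shell
#!/bin/sh
# Sketch: confirm which HDFS daemons are alive and sanity-check /etc/hosts.
echo "--- HDFS-related Java processes ---"
jps 2>/dev/null | grep -E 'NameNode|DataNode|SecondaryNameNode' \
  || echo "no HDFS daemons running"

echo "--- /etc/hosts sanity check ---"
# The loopback line should map 127.0.0.1 to localhost only; cluster
# hostnames must resolve to their real (non-loopback) addresses.
check_hosts() {
  file=$1
  if awk '$1 == "127.0.0.1" {
           for (i = 2; i <= NF; i++)
             if ($i != "localhost" && $i != "localhost.localdomain") bad = 1
         } END { exit bad }' "$file"; then
    echo "OK: no extra hostname is bound to 127.0.0.1"
  else
    echo "WARNING: a non-localhost name resolves to 127.0.0.1 in $file"
  fi
}
check_hosts /etc/hosts
```

A daemon that binds its hostname to 127.0.0.1 will listen on loopback only, which produces exactly the kind of bind/connect confusion seen in the SNN trace above.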

    Ardavan Moinzadeh

/etc/hosts on all nodes is identical:
((Private IP address node1
Private IP address node2
Private IP address node3 localhost

My attempt to upload the log file to ftp://ftp.hortonworks.com/ was not successful.
This is the error I see in the NameNode log file:
java.io.FileNotFoundException: /data/b/hadoop/hdfs/namenode/in_use.lock (Permission denied)
Is this an SSH issue?


    Hi Ardavan,
What are the permissions on the files located in /data/b/hadoop/hdfs/namenode/? Ideally, they should be owned by hdfs:hadoop.
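A quick way to audit that ownership (the directory path is the one from this thread, and hdfs:hadoop matches HDP defaults; the repair step needs root and is left commented out):

```shell
#!/bin/sh
# Minimal sketch: list files under the NameNode storage directory that
# are NOT owned by hdfs:hadoop -- these are the suspects for the
# "in_use.lock (Permission denied)" error.
list_wrong_owner() {
  # print everything under $1 not owned by user $2 and group $3
  find "$1" \( ! -user "$2" -o ! -group "$3" \) -print 2>/dev/null
}

list_wrong_owner "${NN_DIR:-/data/b/hadoop/hdfs/namenode}" hdfs hadoop

# Repair, as root, once you have confirmed the suspects:
#   chown -R hdfs:hadoop /data/b/hadoop/hdfs/namenode
```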


    Ardavan Moinzadeh


I was able to resolve that issue. Now that I am trying to bring up the SNN, it fails on me! For some reason I can't log in to your FTP to upload my log files ==> FTP Listing of Root at http://ftp.support.hortonworks.com

    This is a part of my log
    2013-08-06 23:58:18,240 INFO org.apache.hadoop.hdfs.server.common.Storage: Recovering storage directory /data/b/hadoop/hdfs/namesecondary from failed checkpoint
    2013-08-06 23:58:18,252 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint:
    2013-08-06 23:58:18,253 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.security.AccessControlException: Access denied for user hdfs. Superuser privilege is required
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkSuperuserPrivilege(FSPermissionChecker.java:93)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkSuperuserPrivilege(FSNamesystem.java:5927)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:5824)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.rollEditLog(NameNode.java:1022)
    at sun.reflect.GeneratedMethodAccessor122.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1444)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1440)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1438)

    at org.apache.hadoop.ipc.Client.call(Client.java:1118)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
    at $Proxy5.rollEditLog(Unknown Source)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:512)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:396)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:360)
    at java.lang.Thread.run(Thread.java:662)

    Thank you

    Seth Lyubich

    Hi Ardavan ,

    Can you please let us know at which point you are getting the error and which user you are using to start the process? Also, looking at your log file:

    Recovering storage directory /data/b/hadoop/hdfs/namesecondary from failed checkpoint

    Access denied for user hdfs. Superuser privilege is required

Did you change any permissions, or try to start the process as user root? You might need to check the permissions of the /data/b/hadoop/hdfs/namesecondary directory.

    Hope this helps,
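One way to confirm this suspicion is to check which Unix user each HDFS daemon is actually running as; the rollEditLog "Superuser privilege is required" error typically shows up when the SNN checkpoints as a different user than the one running the NameNode. A sketch, assuming the standard daemon class names appear in the ps output:

```shell
#!/bin/sh
# Sketch: print "user daemon" for each running NN/SNN java process.
classify() {
  awk '
    /org\.apache\.hadoop\.hdfs\.server\.namenode\.SecondaryNameNode/ { print $1, "SecondaryNameNode"; next }
    /org\.apache\.hadoop\.hdfs\.server\.namenode\.NameNode/          { print $1, "NameNode" }'
}

# user= / args= suppress headers so awk sees clean "user command..." lines
ps -eo user=,args= | classify
```

If the two lines show different users (for example root for one and hdfs for the other), stop the daemon started as the wrong user and restart it as hdfs.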


    Ardavan Moinzadeh

    Hello Seth,
Yes, at first I did try starting the process as root; later on I continued with the hdfs user. Some of the permissions were changed, so I matched them with other working clusters.

On node B, where the SNN is installed, under /data/b …. /data/i a directory called namenode is shown; based on the initial configuration this folder should not be there, so I deleted it. After formatting the namenode and:
a: starting the namenode
b: starting all datanodes
c: starting the secondary namenode
it seems like none of my 3 datanodes are coming up, and the same goes for the SNN.

This is what I have so far!

    What do you suggest?
    Thank you
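For reference, the start sequence a/b/c above can be written as explicit per-daemon commands. The sketch below only prints the commands rather than executing them (the /usr/lib/hadoop/bin path is an HDP 1.x assumption; run each printed command on the node that hosts that daemon, as the hdfs user):

```shell
#!/bin/sh
# Sketch: emit the manual HDFS start commands in the correct order.
start_cmd() {
  # build the start command for one HDFS daemon ($1)
  printf 'su - hdfs -c "%s/hadoop-daemon.sh start %s"\n' \
    "${HADOOP_BIN:-/usr/lib/hadoop/bin}" "$1"
}

# Order matters: namenode first, then datanodes, then the SNN.
for d in namenode datanode secondarynamenode; do
  start_cmd "$d"
done
```

After starting, verify with `su - hdfs -c jps` on each node that the expected daemon is listed.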

    Ardavan Moinzadeh

Suddenly Nagios is not showing the alerts in the right-hand tab in Ambari. Accessing it through the web UI, I see the following error: Error: Could not read object configuration data!

    Has this anything to do with my issue with HDFS?

    Ardavan Moinzadeh

And I can't even start or stop services anymore! I tried restarting ambari-server and ambari-agent; it didn't help.

    Sasha J

The best way to handle this situation is to wipe out everything and start from scratch.
Something seems to be seriously damaged in there…

    Thank you!

    Seth Lyubich

    Hi Ardavan,

You can also check whether any PID files are owned by root, since you tried to start processes as user root. If you find any, try removing the PID file(s) and restarting the process.
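That check can be sketched like this; the /var/run/hadoop location is an assumption typical of HDP installs, so adjust PID_DIR if yours differs.

```shell
#!/bin/sh
# Sketch: find Hadoop PID files left behind by a start attempt as root.
PID_DIR=${PID_DIR:-/var/run/hadoop}

stale_pids() {
  # list *.pid files under $1 owned by user $2
  find "$1" -name '*.pid' -user "$2" -print 2>/dev/null
}

stale_pids "$PID_DIR" root

# After confirming with jps that the daemon is NOT actually running,
# remove the stale file and restart the daemon as hdfs, e.g.:
#   rm "$PID_DIR/hdfs/hadoop-hdfs-namenode.pid"   # illustrative path
```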


    Ardavan Moinzadeh

I had an incompatible namespaceID issue. There are two ways to fix it:
A: Delete the data directory, reformat HDFS, and start the service again. This is not recommended if your cluster is in production.

B: Edit the value of namespaceID in /current/version to match the value of the current NameNode, and then restart the service.

    Thank you
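Option B can be sketched as a small helper that rewrites namespaceID in a VERSION file. The example paths in the comments are placeholders for your dfs.name.dir / dfs.data.dir (on HDP 1.x the file is typically current/VERSION under each storage directory), and the daemon must be stopped before editing:

```shell
#!/bin/sh
# Sketch: align a DataNode's namespaceID with the NameNode's.
set_namespace_id() {
  # rewrite the namespaceID line in a VERSION file ($1) to the value $2
  sed -i "s/^namespaceID=.*/namespaceID=$2/" "$1"
}

# Illustrative usage (placeholder paths; stop the DataNode first):
#   NN_ID=$(grep '^namespaceID=' /path/to/namenode/current/VERSION | cut -d= -f2)
#   set_namespace_id /path/to/datanode-data-dir/current/VERSION "$NN_ID"
```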


    Hi Ardavan,

    Thanks for letting us know that this issue is now resolved.


