Namenode stuck on safe mode

This topic contains 12 replies, has 3 voices, and was last updated by  Sivagamasundari Veerabahu 1 year ago.

Topic
#43120

The namenode has been stuck in safe mode for a long time.
Installation was successful and I was trying to create a directory in HDFS:

hadoop fs -mkdir /test

All of a sudden the namenode went into safe mode.
I tried to leave it by using
hadoop dfsadmin -safemode leave
It says OFF but then goes back ON again.

I see the resource manager heap is at 93%.
Is this why it went into safe mode?
What can I do to bring the resource manager heap size down?
Please advise.

I tried the following:

[hdfs@if21t01ha ~]$ hadoop dfsadmin -safemode get
    DEPRECATED: Use of this script to execute hdfs command is deprecated.
    Instead use the hdfs command for it.

    Safe mode is ON
    [hdfs@if21t01ha ~]$ hadoop dfsadmin -safemode leave
    DEPRECATED: Use of this script to execute hdfs command is deprecated.
    Instead use the hdfs command for it.

    Safe mode is OFF
    [hdfs@if21t01ha ~]$ hadoop dfsadmin -safemode get
    DEPRECATED: Use of this script to execute hdfs command is deprecated.
    Instead use the hdfs command for it.

    Safe mode is ON
    [hdfs@if21t01ha ~]$ hadoop dfsadmin -report
    DEPRECATED: Use of this script to execute hdfs command is deprecated.
    Instead use the hdfs command for it.

    Safe mode is ON
    Configured Capacity: 460570681344 (428.94 GB)
    Present Capacity: 438981455872 (408.83 GB)
    DFS Remaining: 438246457344 (408.15 GB)
    DFS Used: 734998528 (700.95 MB)
    DFS Used%: 0.17%
    Under replicated blocks: 380
    Blocks with corrupt replicas: 0
    Missing blocks: 0

-------------------------------------------------
    Datanodes available: 2 (2 total, 0 dead)

    Live datanodes:
    Name: xxxxxxxxxxxxxxxxxxxxxxx
    Hostname: xxxxxxxxxxxxxxxxxxxxxx
    Decommission Status : Normal
    Configured Capacity: 230285340672 (214.47 GB)
    DFS Used: 367499264 (350.47 MB)
    Non DFS Used: 10797742080 (10.06 GB)
    DFS Remaining: 219120099328 (204.07 GB)
    DFS Used%: 0.16%
    DFS Remaining%: 95.15%
    Last contact: Wed Nov 06 15:06:29 EST 2013

    Name: xxxxxxxxxxxxxxxxxxxxxxxxxxx
    Hostname: xxxxxxxxxxxxxxxxxxxxxxxxx
    Decommission Status : Normal
    Configured Capacity: 230285340672 (214.47 GB)
    DFS Used: 367499264 (350.47 MB)
    Non DFS Used: 10791483392 (10.05 GB)
    DFS Remaining: 219126358016 (204.08 GB)
    DFS Used%: 0.16%
    DFS Remaining%: 95.15%
    Last contact: Wed Nov 06 15:06:27 EST 2013


Replies
#44160

Thanks Kenny, this information was very helpful.

    #44083

    Kenny Zhang
    Moderator

    Hi,

You can set up a cron job to either delete the old logs or archive them. As long as you don't move the current log, which is ambari-server.log or ambari-agent.log, it will not impact the running service. Please make sure you only move the old logs, e.g. ambari-server.log.1, ambari-server.log.2, and so on.
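
A minimal sketch of such a cron job, assuming the default log locations /var/log/ambari-server and /var/log/ambari-agent (the script name, paths, and retention periods here are illustrative, not a definitive setup):

#!/bin/sh
# Hypothetical /etc/cron.daily/ambari-log-cleanup
# Compress rotated logs older than 7 days; delete compressed archives older than 30 days.
# Only rotated files (*.log.*) are touched, so the live ambari-server.log and
# ambari-agent.log are never moved.
for dir in /var/log/ambari-server /var/log/ambari-agent; do
    find "$dir" -name '*.log.*' ! -name '*.gz' -mtime +7 -exec gzip {} \;
    find "$dir" -name '*.log.*.gz' -mtime +30 -delete
done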

    Thanks,
    Kenny

    #43655

How can I purge the logs when they accumulate? Can I do it through Ambari?
For now I added space, but I see the logs growing, and I would like to know the best way to purge them without impacting the services.

Please advise.

    #43613

    Dave
    Moderator

    Hi,

Also note that if /tmp or the log directory fills up towards 100%, then all the services will stop suddenly.
This was probably causing the issue, as you stated it was 98% full.
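
A quick way to keep an eye on this (the directories shown are the usual defaults and may differ on your install):

df -h /tmp /var/log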

    Thanks

    Dave

    #43472

    Kenny Zhang
    Moderator

    Hi,

The first could be the reason; could you please attach the namenode log so we can confirm?
FYI, the ResourceManager heap size shown in Ambari is not accurate; it's a bug. You should be fine now.

    Thanks
    Kenny

    #43448

Dave,
Sure, I can increase it, but my question is: we did not do anything, and we were not running anything.
So I am wondering what caused the namenode to go into safe mode.
1. My log dir was 98% full, so I thought that could be the reason and added more storage to the dir. I have not encountered the safe mode issue since adding storage.
2. I have 2 datanodes (it's a 2-node cluster) and my replication factor is the default of 3. Is this a problem? (See the sketch below.)
Could either of the above cause the issue? Please advise.
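
If the replication factor does turn out to be part of the problem, a sketch of how it could be brought in line with a 2-node cluster (the target value of 2 is illustrative, and /test is the directory from the original post):

# Set the replication factor of existing files under /test to 2 and wait for completion
hdfs dfs -setrep -R -w 2 /test

For new files, dfs.replication would also need to be set to 2 in hdfs-site.xml (or the equivalent HDFS setting in Ambari).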

    #43440

    Dave
    Moderator

    Hi,

If you have 24 GB of RAM available on the machines, why don't you consider upping the amount of heap available to the ResourceManager and NameNode?
You should also consider tweaking the other components' Java options to get the most out of your cluster.
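
A sketch of what that could look like, assuming a Hadoop 2.x layout (with Ambari you would normally change the corresponding "Java heap size" fields in the HDFS and YARN service configs and restart the affected components instead of editing files by hand; the 4 GB and 2 GB values are only illustrative):

# hadoop-env.sh: raise the NameNode heap from the 1024 MB default
export HADOOP_NAMENODE_OPTS="-Xms4096m -Xmx4096m ${HADOOP_NAMENODE_OPTS}"

# yarn-env.sh: raise the ResourceManager heap
export YARN_RESOURCEMANAGER_HEAPSIZE=2048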

    Thanks

    Dave

    #43367

    Dave,

The namenode keeps going back into safe mode, complaining that it is running low on resources.
I am restarting the services to bring the resources back and then bringing the namenode out of safe mode.

Namenode Java heap size in the Ambari config is 1024 MB (that's the default).
ResourceManager Java heap size is 1024 MB.
Can you please advise?

    #43273

    Dave,

Namenode Java heap size in the Ambari config is 1024 MB (that's the default).
ResourceManager Java heap size is 1024 MB.

I only saw the resource manager heap size in red on the dashboard, at 93%.
Again, why does it go up when nothing was running?

If I change this value, do I need to restart the namenode?
Does the namenode need to be stopped while I am making this change?
What's the recommended value?

FYI, we have 24 GB of RAM on each of the 2 nodes.

    #43181

    Dave
    Moderator

    Hi,

    How much resource do you have available to your NN?

    Thanks

    Dave

    #43175

    Dave

The reason was that the resource manager heap size showed 93% on the dashboard.
I don't understand why the heap size was 93%, because we were not running anything. I brought all the services down and then back up, which brought the heap size to 11%, and then I was able to leave safe mode.
The question is: why was the heap size high when nothing was running?
And do I need to add more? If so, how and where? Currently it has the default size.
Is there any recommended size? I have a 2-node cluster on RHEL 5.8 / VMware.

This is the message from the namenode:

    Resources are low on NN. Please add or free up more resources then turn off safe mode manually. NOTE: If you turn off safe mode before adding resources, the
    NN will immediately return to safe mode. Use “hdfs dfsadmin -safemode leave” to turn safe mode off.
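
For context, that message comes from the NameNode resource checker, which watches free space on the disks holding the NameNode metadata directories and puts the NameNode into safe mode when it drops below a threshold. The threshold lives in hdfs-site.xml (a sketch; the value shown is the usual default):

<property>
  <!-- Minimum free space, in bytes, required on the NameNode storage volumes -->
  <name>dfs.namenode.resource.du.reserved</name>
  <value>104857600</value>
</property>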

    #43173

    Dave
    Moderator

    Hi,

    Can you attach the namenode log here so we can see what is causing it to be in safemode?

    Thanks

    Dave
