Namenode does not start – HDFS error ulimit

This topic contains 5 replies, has 2 voices, and was last updated by Vincent Wylenzek 4 months, 3 weeks ago.

  • Creator
    Topic
  • #54456

    Vincent Wylenzek
    Participant

    Today we installed 5 nodes with HDP 2.1.
    We initiated the install through Ambari 1.5.1 on svr 36. There is one NameNode, on svr 38.
    The NameNode on svr 38 doesn't start up.

    The context: RHEL 5.9 / HDP 2.1

    It throws the following error during HDFS startup (the script ends at ulimit -a; process 22513 fails without an error):

    + echo starting namenode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-namenode-svr38.rdo01.local.out
    starting namenode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-namenode-svr38.rdo01.local.out
    + cd /usr/lib/hadoop
    + case $command in
    + '[' -z /usr/lib/hadoop-hdfs ']'
    + hdfsScript=/usr/lib/hadoop-hdfs/bin/hdfs
    + java -version
    java version "1.7.0_19"
    OpenJDK Runtime Environment (rhel-2.3.9.1.el5_9-x86_64)
    OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)
    + echo 22513
    + sleep 1
    + nohup nice -n 0 /usr/lib/hadoop-hdfs/bin/hdfs --config /etc/hadoop/conf namenode
    + head /var/log/hadoop/hdfs/hadoop-hdfs-namenode-svr38.rdo01.local.out
    + '[' true = '' ']'
    + echo 'ulimit -a for user hdfs'
    + ulimit -a
    + sleep 3
    + ps -p 22513
    + whoami
    hdfs
    + id
    uid=1008(hdfs) gid=502(hadoop) groups=502(hadoop)
    + echo 'NameNode is no longer running, exiting'
    NameNode is no longer running, exiting
    + exit 1

    (We added some verbose logging)
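
    For what it's worth, one way to surface the real error should be to run the NameNode in the foreground instead of via hadoop-daemon.sh (a sketch, assuming the default HDP paths from the trace above):

    # Run the NameNode in the foreground as the hdfs user, so the real
    # startup failure prints to the terminal instead of dying silently
    sudo -u hdfs /usr/lib/hadoop-hdfs/bin/hdfs --config /etc/hadoop/conf namenode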

    Any ideas?


  • Author
    Replies
  • #54547

    Vincent Wylenzek
    Participant

    OS = RHEL 5.9

    #54546

    Vincent Wylenzek
    Participant

    Tried to move the NameNode to svr 36, but the same issue popped up.
    I then tried to execute an ambari-reset on svr 36, but the reset Python script fails with a DDL error (PK constraint?).
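
    If the Ambari database schema is wedged, it may help to stop the server before resetting (a sketch, assuming the default embedded PostgreSQL backend; reset drops and recreates the Ambari schema, so back it up first):

    # Stop the Ambari server, then reset its database schema
    ambari-server stop
    ambari-server reset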

    #54545

    Vincent Wylenzek
    Participant

    Thanks for your reply.
    With set -x a lot is logged, but no significant errors.
    Only:

    I am using STARTUP_MSG: java = 1.7.0_19 <-- could this version be causing these issues?
    2014-05-26 11:33:16,276 WARN common.Util (Util.java:stringAsURI(56)) - Path /var/hadoop/hdfs/data should be specified as a URI in configuration files. Please update hdfs configuration.
    2014-05-26 11:33:16,277 WARN common.Util (Util.java:stringAsURI(56)) - Path /tmp/hadoop/hdfs/data should be specified as a URI in configuration files. Please update hdfs configuration.
    2014-05-26 11:33:16,277 WARN common.Util (Util.java:stringAsURI(56)) - Path /var/tmp/hadoop/hdfs/data should be specified as a URI in configuration files. Please update hdfs configuration.
    2014-05-26 11:33:16,277 WARN common.Util (Util.java:stringAsURI(56)) - Path /usr/hadoop/hdfs/data should be specified as a URI in configuration files. Please update hdfs configuration.
    2014-05-26 11:33:16,277 WARN common.Util (Util.java:stringAsURI(56)) - Path /usr/local/hadoop/hdfs/data should be specified as a URI in configuration files. Please update hdfs configuration.
    2014-05-26 11:33:16,278 WARN common.Util (Util.java:stringAsURI(56)) - Path /data1/hadoop/hdfs/data should be specified as a URI in configuration files. Please update hdfs configuration.
    2014-05-26 11:33:16,278 WARN common.Util (Util.java:stringAsURI(56)) - Path /opt/hadoop/hdfs/data should be specified as a URI in configuration files. Please update hdfs configuration.
    2014-05-26 11:33:16,278 WARN common.Util (Util.java:stringAsURI(56)) - Path /hadoop/hadoop/hdfs/data should be specified as a URI in configuration files. Please update hdfs configuration.
    ....... (lots of connection timeouts)

    2014-05-26 11:54:26,843 ERROR impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(140)) - Got sink exception, retry in 4982ms
    org.apache.hadoop.metrics2.MetricsException: Failed to putMetrics
        at org.apache.hadoop.metrics2.sink.ganglia.GangliaSink30.putMetrics(GangliaSink30.java:193)
        at org.apache.hadoop.metrics2.impl.MetricsSinkAdapter.consume(MetricsSinkAdapter.java:173)
        at org.apache.hadoop.metrics2.impl.MetricsSinkAdapter.consume(MetricsSinkAdapter.java:41)
        at org.apache.hadoop.metrics2.impl.SinkQueue.consumeAll(SinkQueue.java:87)
        at org.apache.hadoop.metrics2.impl.MetricsSinkAdapter.publishMetricsFromQueue(MetricsSinkAdapter.java:127)
        at org.apache.hadoop.metrics2.impl.MetricsSinkAdapter$1.run(MetricsSinkAdapter.java:86)
    Caused by: java.io.IOException: Network is unreachable
        at java.net.PlainDatagramSocketImpl.send(Native Method)
        at java.net.DatagramSocket.send(DatagramSocket.java:676)
        at org.apache.hadoop.metrics2.sink.ganglia.AbstractGangliaSink.emitToGangliaHosts(AbstractGangliaSink.java:259)
        at org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31.emitMetric(GangliaSink31.jav
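
    The URI warnings above should go away once the data directories are listed as file:// URIs in hdfs-site.xml. A sketch (the dfs.datanode.data.dir key is an assumption; match it to whichever *.dir property actually holds these paths):

    <property>
      <name>dfs.datanode.data.dir</name>
      <value>file:///var/hadoop/hdfs/data,file:///data1/hadoop/hdfs/data</value>
    </property>

    The Ganglia "Network is unreachable" trace looks like a separate metrics/network issue; the sink just retries, so it should not by itself stop the NameNode.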

    #54542

    Robert Molina
    Moderator

    Hi Vincent,
    Can you look in the NameNode logs or .out file to see if there are any messages mentioned there? Can you also provide the ulimit -a output for the machine running the NameNode? And what OS version is it running?
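
    For example, to gather that (the .log filename is a guess, following the same pattern as the .out file in your trace):

    # Limits as the hdfs user, i.e. what the daemon actually runs with
    sudo -u hdfs bash -c 'ulimit -a'
    # Tail of the .out and .log files from the startup attempt
    tail -n 50 /var/log/hadoop/hdfs/hadoop-hdfs-namenode-svr38.rdo01.local.out
    tail -n 200 /var/log/hadoop/hdfs/hadoop-hdfs-namenode-svr38.rdo01.local.log
    # OS release
    cat /etc/redhat-release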

    Regards,
    Robert

    #54509

    Vincent Wylenzek
    Participant

    Additional info:
    Traceback (most recent call last):
      File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 106, in execute
        method(env)
      File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/namenode.py", line 38, in start
        namenode(action="start")
      File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/hdfs_namenode.py", line 45, in namenode
        create_log_dir=True
      File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
        self.env.run()
      File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 149, in run
        self.run_action(resource, action)
      File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 115, in run_action
        provider_action()
      File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 239, in action_run
        raise ex
    Fail: Execution of 'ulimit -c unlimited; export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec && /usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start namenode' returned 1.
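
    Re-running that exact command by hand, outside Ambari, may show the stderr that the agent swallows (same command as in the Fail line, run as the hdfs user per the earlier trace):

    sudo -u hdfs bash -c 'ulimit -c unlimited; export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec && /usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start namenode'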
