HDP on Windows – Installation Failed_start namenode secondarynamenode datanode

This topic contains 22 replies, has 4 voices, and was last updated by Dave 8 months ago.

  • Creator
    Topic
  • #48764

    Lucho Farje
    Participant

    Hello, I have successfully installed Hadoop 2.0 on a single-node Windows Server 2012 machine, but when I run start_local_hdp_services.cmd from the PowerShell command line or the command prompt I get the errors below:

    I have also tried to run start_local_hdp_services.cmd as the hadoop user, but I don’t know whether these errors are caused by permissions or by something else. Any clues to overcoming them would be very much appreciated!

    PS D:\hdp> .\start_local_hdp_services.cmd
    starting namenode
    starting secondarynamenode
    starting datanode
    starting resourcemanager
    starting nodemanager
    starting historyserver
    starting zkServer
    starting master
    starting regionserver
    starting hwi
    starting hiveserver2
    starting metastore
    Start-Service : Failed to start service ‘Apache Hadoop metastore (metastore)’.
    At D:\hdp\manage_local_hdp_services.ps1:77 char:16
    + $foo = Start-Service -Name $serviceName.Name -ErrorAction Continue
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : OpenError: (System.ServiceProcess.ServiceController:ServiceController) [Start-Service],
    ServiceCommandException
    + FullyQualifiedErrorId : StartServiceFailed,Microsoft.PowerShell.Commands.StartServiceCommand

    starting derbyserver
    Start-Service : Failed to start service ‘Apache Hadoop derbyserver (derbyserver)’.
    At D:\hdp\manage_local_hdp_services.ps1:77 char:16
    + $foo = Start-Service -Name $serviceName.Name -ErrorAction Continue
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : OpenError: (System.ServiceProcess.ServiceController:ServiceController) [Start-Service],
    ServiceCommandException
    + FullyQualifiedErrorId : StartServiceFailed,Microsoft.PowerShell.Commands.StartServiceCommand

    starting templeton
    Start-Service : Failed to start service ‘Apache Hadoop templeton (templeton)’.
    At D:\hdp\manage_local_hdp_services.ps1:77 char:16
    + $foo = Start-Service -Name $serviceName.Name -ErrorAction Continue
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : OpenError: (System.ServiceProcess.ServiceController:ServiceController) [Start-Service],
    ServiceCommandException
    + FullyQualifiedErrorId : StartServiceFailed,Microsoft.PowerShell.Commands.StartServiceCommand

    starting oozieservice
    Sent all start commands.
    total services
    15
    running services
    6
    not yet running services
    9
    Failed_Start namenode secondarynamenode datanode historyserver hwi hiveserver2 metastore derbyserver templeton

    Best regards,
    Lucho Farje

Viewing 22 replies - 1 through 22 (of 22 total)


  • Author
    Replies
  • #50657

    Dave
    Moderator

    Hi Lucho,

    What is shown in the installation log, or the logs for the failed services?
    Do the services fail if you go to Run > services.msc and start them manually?

    Thanks

    Dave
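
    Dave’s suggestion can be scripted from PowerShell; a minimal sketch, assuming the default HDP-for-Windows service naming (display names start with “Apache Hadoop”) and HDP_LOG_DIR=d:\hadoop\logs as in Lucho’s configuration:

    ```powershell
    # List every HDP service with its current state
    Get-Service | Where-Object { $_.DisplayName -like "Apache Hadoop*" } |
        Format-Table -AutoSize Name, Status, DisplayName

    # Try starting one service by hand; a failure here raises the same
    # ServiceCommandException, but services.msc and the event log show why
    Start-Service -Name namenode

    # Tail the matching log (path assumes HDP_LOG_DIR=d:\hadoop\logs)
    Get-Content d:\hadoop\logs\hadoop\hadoop-namenode-LUKE.log -Tail 50
    ```

    (`Get-Content -Tail` needs PowerShell 3.0 or later; on older shells pipe through `Select-Object -Last 50` instead.)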

    #50620

    Lucho Farje
    Participant

    Hello, I am sure that I chose not to format HDFS the first time. It still doesn’t work :(

    Thanks for your help.
    Lucho

    #50356

    L Vadhula
    Participant

    I just updated the other thread, but I could work around the issue by not formatting HDFS the first time I try to generate data on *NEW DISKS*. I first run teragen, which fails, and then go ahead and format and run teragen again (which succeeds). This approach has helped me avoid the namenode crashes.

    Posting in case it helps someone.

    #50055

    L Vadhula
    Participant

    Yeah, I am not sure I am of much help to you at this point. The one thing I have seen in the past is when I had the master node double as the ZooKeeper host. Once I changed that, I got past that error. Please refer to this: http://hortonworks.com/community/forums/topic/teragen-error-jobtracker-is-not-yet-running/.
    I have since been able to run up to 16 nodes successfully. Once I go past 16, the namenode service just keeps terminating, as you mentioned (it starts, stops, starts again, and then stops again almost immediately).

    #50026

    Lucho Farje
    Participant

    Hello, thanks for answering my questions.

    The root problem is that when I run the batch script to start the Hadoop processes, the Windows services try to start up, but some of them can’t. They keep looping like daemons: sometimes I see them as running, but a second later I see them stopped!

    I have read the bk_installing_hdp_for_windows-20140120.pdf document, but I think I’m going to read it one more time.
    I have also followed this link: http://hortonworks.com/blog/install-hadoop-windows-hortonworks-data-platform-2-0/

    If you have more information about my problem, or you get answers to your questions that you can share with me, I would appreciate it.
    Thanks in advance.
    Lucho

    #50012

    L Vadhula
    Participant

    Also, did you follow the bk_installing_hdp_for_windows-20130612.pdf document? I am surprised you didn’t have to use the clusterproperties.txt file. I’m just asking this so you don’t have to wait for my reply tomorrow. Most of what I know comes from reading that document.
    Please let me know what your error is, though, and if you manage to overcome it, the solution too.

    #50011

    L Vadhula
    Participant

    You are mostly right: you have a single-node installation, and my suggestions really don’t apply to you.
    But you do have a slave; in your case the master and the slave are simply the same machine. Your SLAVE_HOSTS is LUKE, and so is your NAMENODE_HOST.
    I am using a multi-node installation. Sorry, but what error are you running into? Is it your NAMENODE service that fails, or is it something else?

    #49972

    Lucho Farje
    Participant

    Hello, I’m using a single-node Hadoop installation. I found that file, and it contains the following lines:

    #Log directory
    HDP_LOG_DIR=d:\hadoop\logs

    #Data directory
    HDP_DATA_DIR=d:\hdpdata

    #hosts
    NAMENODE_HOST=LUKE
    SECONDARY_NAMENODE_HOST=LUKE
    RESOURCEMANAGER_HOST=LUKE
    HIVE_SERVER_HOST=LUKE
    OOZIE_SERVER_HOST=LUKE
    WEBHCAT_HOST=LUKE
    SLAVE_HOSTS=LUKE
    CLIENT_HOSTS=LUKE
    HBASE_MASTER=LUKE
    HBASE_REGIONSERVERS=LUKE
    ZOOKEEPER_HOSTS=LUKE
    FLUME_HOSTS=LUKE

    #Database host
    DB_FLAVOR=DERBY
    DB_HOSTNAME=LUKE
    DB_PORT=1527

    Besides that, there are HIVE and OOZIE properties.

    I’m not using slaves, and I don’t think I should make the change you suggested, because I’ve done a single-node installation and I believe you have done a cluster installation. Am I right?

    What else do you suggest I do to overcome this problem? Or should I just wait?

    Thanks for your answer,
    Lucho Farje

    #49968

    L Vadhula
    Participant

    No. I think once installed, the clusterproperties.txt file is renamed to “cluster.properties”. That’s the filename.
    As for the slave hosts, I don’t understand the question. So, you are using only 3 slaves?
    If you change the slave hosts, you will have to reinstall Hadoop, yes. This information is gathered from the clusterproperties.txt file, and all the hosts are set up that way (well, in your case they may be VMs). Maybe there is a way to avoid redoing the whole process (you might be able to just update cluster.properties and install HDP on each of your newly added slaves), but I am not sure.

    I am still hoping to get an answer from the others as to why the namenode service terminates itself soon after starting. I am going to reduce my cluster size to fewer machines (everything up to 16 works perfectly fine)! I shall update this thread if I find something out, but please let us know what your configuration is and what is working or failing. That will help someone.
    They were generally very responsive on these questions, but for some reason I don’t see much here :( Hopefully someone will look at this soon, Seth?

    #49941

    Lucho Farje
    Participant

    Hello, I cannot find a clusterproperties.txt file in my Hadoop installation folders, but I do find one in the folder to which I copied all the installation files.
    I installed Hadoop and left this file with all its default values. I found the parameter you mentioned, but I don’t know whether changing it will be picked up by the Hadoop services….
    #Hosts
    NAMENODE_HOST=NAMENODE_MASTER.acme.com
    SECONDARY_NAMENODE_HOST=SECONDARY_NAMENODE_MASTER.acme.com
    RESOURCEMANAGER_HOST=RESOURCEMANAGER_MASTER.acme.com
    HIVE_SERVER_HOST=HIVE_SERVER_MASTER.acme.com
    OOZIE_SERVER_HOST=OOZIE_SERVER_MASTER.acme.com
    WEBHCAT_HOST=WEBHCAT_MASTER.acme.com
    FLUME_HOSTS=FLUME_SERVICE1.acme.com,FLUME_SERVICE2.acme.com,FLUME_SERVICE3.acme.com
    HBASE_MASTER=HBASE_MASTER.acme.com
    HBASE_REGIONSERVERS=slave1.acme.com, slave2.acme.com, slave3.acme.com
    ZOOKEEPER_HOSTS=slave1.acme.com, slave2.acme.com, slave3.acme.com
    SLAVE_HOSTS=slave1.acme.com, slave2.acme.com, slave3.acme.com

    Do I have to reinstall Hadoop and enter the proper values in this clusterproperties.txt file?
    I will wait for your answer regarding the other problem.
    Thanks for your help!
    Lucho

    #49939

    L Vadhula
    Participant

    Sorry Lucho, I have not got an answer to my question yet :(
    To increase the worker node count, just modify your clusterproperties.txt to list the names under SLAVE_HOSTS.
    The problem with the services restarting is actually a big blocker for me. I am still waiting to hear from the folks here to see what I can do to get around it.

    Thanks,
    VS

    #49590

    Lucho Farje
    Participant

    Hello, I haven’t got any update on this since your answer. I would like to try what you have written here. Could you please let me know how I may increase the worker node count? This is a single-node installation.
    I don’t know what else I need to overcome this problem, but I can start with your suggestion and then report back here on how it went. I’d like to have Hadoop running on my virtual machine.
    I’m experiencing something like what you’ve described: some Apache Hadoop services keep restarting repeatedly, and it’s not possible to start them.

    Regards,
    Lucho

    #49589

    L Vadhula
    Participant

    Any update to this? I seem to be running into the same issue once I increased the worker node count beyond 20!
    Please let me know what needs to be done to overcome the problem. In my case the namenode service keeps restarting repeatedly :(

    #48966

    Lucho Farje
    Participant

    Hello, I can’t find the hive.log file, and I’m not using MSSQL as a repository; instead I’ve installed Derby, but it does not start up either!
    I have added a CLASSPATH environment variable to the server and added the java\jre\bin folder to the PATH variable, and that does not help either!
    But when I run javac I get an error. I wonder whether Hadoop needs javac at startup?

    Best regards,
    Lucho Farje

    #48906

    BILL FRIESENHAHN
    Participant

    I just solved my problem. When I looked in the c:\hadoop\logs\hive\hive.log file, it printed an exception stating:
    The specified datastore driver (“com.microsoft.sqlserver.jdbc.SQLServerDriver”) was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.

    My mistake: I had copied sqljdbc4.jar to the wrong path. Copying it to the $HIVE_HOME/lib directory fixed the problem.

    Thanks, Bill.
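
    In command form, Bill’s fix is roughly the following (the HIVE_HOME path here is the Hive build mentioned later in this thread; adjust it to your own install):

    ```powershell
    # Hive's service classpath includes %HIVE_HOME%\lib\*, so the JDBC
    # driver jar must live there for the metastore to find it
    $hiveHome = "c:\hdp\hive-0.12.0.2.0.6.0-0009"
    Copy-Item .\sqljdbc4.jar "$hiveHome\lib\"

    # Verify the jar landed where Hive will look for it
    Get-ChildItem "$hiveHome\lib\sqljdbc4.jar"
    ```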

    #48898

    BILL FRIESENHAHN
    Participant

    I am also having a very similar problem except that it is the metastore and the hiveserver2 services that are failing to remain started. The metastore.trace file keeps logging the following contents:
    HadoopServiceTraceSource Information: 0 : Tracing successfully initialized
    DateTime=2014-02-18T19:29:22.5025768Z
    Timestamp=120549545823
    HadoopServiceTraceSource Information: 0 : Loading service xml: c:\hdp\hive-0.12.0.2.0.6.0-0009\bin\metastore.xml
    DateTime=2014-02-18T19:29:22.5181769Z
    Timestamp=120549582535
    HadoopServiceTraceSource Information: 0 : Successfully parsed service xml for service Metastore
    DateTime=2014-02-18T19:29:22.5649769Z
    Timestamp=120550104049
    HadoopServiceTraceSource Information: 0 : Command line: C:\java\jdk1.6.0_31\bin\java -Xmx1000m -Dfile.encoding=UTF-8 -Dhadoop.log.dir=c:\hadoop\logs\hadoop -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=c:\hdp\hadoop-2.2.0.2.0.6.0-0009 -Dhadoop.id.str=asip -Dhadoop.root.logger=INFO,console -Djava.library.path=;c:\hdp\hadoop-2.2.0.2.0.6.0-0009\bin -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -classpath c:\hdp\hadoop-2.2.0.2.0.6.0-0009\etc\hadoop;c:\hdp\hadoop-2.2.0.2.0.6.0-0009\share\hadoop\common\lib\*;c:\hdp\hadoop-2.2.0.2.0.6.0-0009\share\hadoop\common\*;c:\hdp\hadoop-2.2.0.2.0.6.0-0009\share\hadoop\hdfs;c:\hdp\hadoop-2.2.0.2.0.6.0-0009\share\hadoop\hdfs\lib\*;c:\hdp\hadoop-2.2.0.2.0.6.0-0009\share\hadoop\hdfs\*;c:\hdp\hadoop-2.2.0.2.0.6.0-0009\share\hadoop\yarn\lib\*;c:\hdp\hadoop-2.2.0.2.0.6.0-0009\share\hadoop\yarn\*;c:\hdp\hadoop-2.2.0.2.0.6.0-0009\share\hadoop\mapreduce\lib\*;c:\hdp\hadoop-2.2.0.2.0.6.0-0009\share\hadoop\mapreduce\*;;c:\hdp\hive-0.12.0.2.0.6.0-0009\conf;c:\hdp\hive-0.12.0.2.0.6.0-0009\lib\*;; org.apache.hadoop.hive.metastore.HiveMetaStore -hiveconf hive.querylog.location=c:\hadoop\logs\hive\history -hiveconf hive.log.dir=c:\hadoop\logs\hive
    DateTime=2014-02-18T19:29:22.5649769Z
    Timestamp=120550104697
    HadoopServiceTraceSource Information: 0 : ServiceHost#OnStart
    DateTime=2014-02-18T19:29:22.7053772Z
    Timestamp=120551465513
    HadoopServiceTraceSource Information: 0 : Child process started, PID: 5488
    DateTime=2014-02-18T19:29:22.7053772Z
    Timestamp=120551497090
    HadoopServiceTraceSource Information: 0 : Child process exited with exit code: 1
    DateTime=2014-02-18T19:29:25.6130518Z
    Timestamp=120580510683
    HadoopServiceTraceSource Information: 0 : Service host not in shutdown mode, terminating service host
    DateTime=2014-02-18T19:29:25.6130518Z
    Timestamp=120580510885

    I looked for the hadoop.log file that was mentioned in the metastore.trace file but it does not exist on the machine.

    My system was installed on 3 Windows 2008 R2 machines using VMWare. The Java version is: 1.6.0_31
    HDP Setup:
    Except for the Slave Hosts all hosts were set to: HadoopHome
    Slave hosts: HadoopData1,HadoopData2
    DB Flavor:MSSQL

    I would very much appreciate some guidance to this problem. Thanks, Bill.
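
    One way to see why the child process exits with code 1 is to re-run the exact “Command line:” recorded in metastore.trace in a console window, where the Java exception prints to the screen instead of being swallowed by the service host. A sketch only; the long classpath is deliberately elided here, so paste the complete command from your own trace file:

    ```powershell
    cd c:\hdp\hive-0.12.0.2.0.6.0-0009\bin
    # Shape only -- substitute the full command line from metastore.trace:
    & C:\java\jdk1.6.0_31\bin\java -Xmx1000m -classpath "<full classpath from trace>" `
        org.apache.hadoop.hive.metastore.HiveMetaStore
    ```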

    #48879

    Lucho Farje
    Participant

    Hello Dave, I can’t find a file named hadoop-namenode-LUKE.log, but under d:\hadoop\logs\hadoop I have found a group of files with a similar filename structure, such as yarn-nodemanager-LUKE.log and yarn-resourcemanager-LUKE.log. Those two services, nodemanager and resourcemanager, run stably in my virtual machine, meaning that I can see them with status “running” all the time in Windows services.

    But I have found this error in yarn-resourcemanager-LUKE.log:
    java.io.IOException: An existing connection was forcibly closed by the remote host
    at sun.nio.ch.SocketDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(Unknown Source)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
    at sun.nio.ch.IOUtil.read(Unknown Source)
    at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
    at org.apache.hadoop.ipc.Server.channelRead(Server.java:2602)
    at org.apache.hadoop.ipc.Server.access$3200(Server.java:122)
    at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1505)
    at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:792)
    at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:591)
    at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:566)

    I have also seen in the Services window that the Apache* services sometimes show status “running” but suddenly disappear; it seems they are not running. In other words, they are trying to start up, but they can’t.

    I hope what I have written here is a useful clue.

    Thanks,
    Lucho

    #48865

    Dave
    Moderator

    Hi Lucho,

    Can you paste any errors you see in this file:
    hadoop-namenode-LUKE.log

    Thanks

    Dave

    #48802

    Lucho Farje
    Participant

    Hello again, I was looking into the D:\hdp\hadoop-2.2.0.2.0.6.0-0009\bin directory and found a namenode.trace file which is updated every few minutes. It contains these messages:
    HadoopServiceTraceSource Information: 0 : Command line: C:\Java\jdk1.6.0_31\bin\java -Xmx1000m -Dhadoop.log.dir=d:\hadoop\logs\hadoop -Dhadoop.log.file=hadoop-namenode-LUKE.log -Dhadoop.home.dir=d:\hdp\hadoop-2.2.0.2.0.6.0-0009 -Dhadoop.id.str=Lucho Farje -Dhadoop.root.logger=INFO,DRFA -Djava.library.path=;d:\hdp\hadoop-2.2.0.2.0.6.0-0009\bin -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dhadoop.security.logger=INFO,RFAS -Dhdfs.audit.logger=INFO,NullAppender -Dhadoop.security.logger=INFO,RFAS -Dhdfs.audit.logger=INFO,NullAppender -classpath d:\hdp\hadoop-2.2.0.2.0.6.0-0009\etc\hadoop;d:\hdp\hadoop-2.2.0.2.0.6.0-0009\share\hadoop\common\lib\*;d:\hdp\hadoop-2.2.0.2.0.6.0-0009\share\hadoop\common\*;d:\hdp\hadoop-2.2.0.2.0.6.0-0009\share\hadoop\hdfs;d:\hdp\hadoop-2.2.0.2.0.6.0-0009\share\hadoop\hdfs\lib\*;d:\hdp\hadoop-2.2.0.2.0.6.0-0009\share\hadoop\hdfs\*;d:\hdp\hadoop-2.2.0.2.0.6.0-0009\share\hadoop\yarn\lib\*;d:\hdp\hadoop-2.2.0.2.0.6.0-0009\share\hadoop\yarn\*;d:\hdp\hadoop-2.2.0.2.0.6.0-0009\share\hadoop\mapreduce\lib\*;d:\hdp\hadoop-2.2.0.2.0.6.0-0009\share\hadoop\mapreduce\* org.apache.hadoop.hdfs.server.namenode.NameNode
    DateTime=2014-02-17T20:20:12.4737782Z
    Timestamp=11259994933
    HadoopServiceTraceSource Information: 0 : ServiceHost#OnStart
    DateTime=2014-02-17T20:20:12.5357763Z
    Timestamp=11260159031
    HadoopServiceTraceSource Information: 0 : Child process started, PID: 4580
    DateTime=2014-02-17T20:20:12.5357763Z
    Timestamp=11260163400
    HadoopServiceTraceSource Information: 0 : Child process exited with exit code: 1
    DateTime=2014-02-17T20:20:13.0037957Z
    Timestamp=11261420687
    HadoopServiceTraceSource Information: 0 : Service host not in shutdown mode, terminating service host
    DateTime=2014-02-17T20:20:13.0037957Z
    Timestamp=11261420798

    It seems that the service cannot start.

    Thanks for your help
    Lucho
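
    A quick way to surface the real startup error when the service dies like this is to run the NameNode in a foreground console rather than as a Windows service; the exception that kills the child process then prints directly to the console. A sketch, using the hadoop.cmd launcher that ships in the bin directory quoted above:

    ```powershell
    cd d:\hdp\hadoop-2.2.0.2.0.6.0-0009\bin
    # Foreground run: the same exception that makes the service's child
    # process exit with code 1 is printed straight to this console
    .\hadoop.cmd namenode
    ```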

    #48800

    Lucho Farje
    Participant

    Hello again! I was checking the Windows event logs and found these events:

    The Apache Hadoop metastore service entered the running state.

    and some milliseconds later:

    The Apache Hadoop metastore service terminated unexpectedly. It has done this 1 time(s). The following corrective action will be taken in 5000 milliseconds: Restart the service.

    Exactly the same happens with the namenode service. I tried to start the service from the Windows Services tab, but I got the error message “Error 1067: The process terminated unexpectedly”.

    I am more confused now, and I don’t know whether this is a matter of configuration or of Windows resources.

    Thanks in advance for your help
    Lucho

    #48799

    Lucho Farje
    Participant

    Hi Dave, thanks for the reply. I see that almost all the Apache* services have “Manual” as their startup type. I got the errors below when I ran start_local_hdp_services.cmd again, but I see other services as “running”, such as HBase regionserver, zookeeper, resourcemanager, oozieservice and nodemanager. Where can I find the logs you asked about?
    I suspect this is something to do with configuration, but I don’t know which file I should configure.

    Start-Service : Failed to start service ‘Apache Hadoop metastore (metastore)’.
    At D:\hdp\manage_local_hdp_services.ps1:77 char:16
    + $foo = Start-Service -Name $serviceName.Name -ErrorAction Continue
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : OpenError: (System.ServiceProcess.ServiceController:ServiceController) [Start-Service],
    ServiceCommandException
    + FullyQualifiedErrorId : StartServiceFailed,Microsoft.PowerShell.Commands.StartServiceCommand

    Start-Service : Failed to start service ‘Apache Hadoop namenode (namenode)’.
    At D:\hdp\manage_local_hdp_services.ps1:77 char:16
    + $foo = Start-Service -Name $serviceName.Name -ErrorAction Continue
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : OpenError: (System.ServiceProcess.ServiceController:ServiceController) [Start-Service],
    ServiceCommandException
    + FullyQualifiedErrorId : StartServiceFailed,Microsoft.PowerShell.Commands.StartServiceCommand

    Thanks in advance for your help!
    Lucho

    #48784

    Dave
    Moderator

    Hi Lucho,

    Can you check the services tab on your local machine (where you run “start_local_hdp_services”) ?
    Do the namenode and metastore show they are running?
    If so, can you attach the logs for them here?

    Thanks

    Dave
