Properly restarting the cluster after power off (Ambari install)

This topic contains 7 replies, has 2 voices, and was last updated by Santiago Basaldúa 12 months ago.

  • Creator
    Topic
  • #46151

    Hi,

    I have finally succeeded in installing HDP 2.0 using Ambari (after some failed manual attempts). The cluster is for software development only, so it is not powered on 24 hours a day. After powering it off I could not restart it (I shut down the machines without stopping all services from Ambari, so it may be my fault) and had to reinstall via ‘sudo ambari-server reset’.

    My question is: what is the correct procedure for stopping and starting the cluster? What does “starting manually” mean (as mentioned in the troubleshooting part of the Ambari installation manual)? Since Ambari uses its own directories and users, the method described in the manual HDP installation guide no longer works. Where are the scripts created by Ambari to start and stop individual components?

    Thank you in advance.

Viewing 7 replies - 1 through 7 (of 7 total)


  • Author
    Replies
  • #46172

    Hi Jeff,

    HBase is the only service I cannot start in the order in which they are listed. Starting all services in this order works fine: HDFS, YARN, MapReduce2, Hive, WebHCat, Oozie, Ganglia, Nagios, ZooKeeper and finally HBase. Using ‘start all’ still doesn’t work. It is only a matter of making many clicks, because starting the services in that order works even without waiting for each one to start before starting the next: Ambari keeps the order and does not run one background operation before the previous one has finished. It would still be nice to automate this task or to understand why ‘start all’ doesn’t work (maybe HDFS takes too long to start).
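
The click sequence described in this post could in principle be scripted against Ambari's REST API. The following is a minimal sketch, not a tested procedure for Ambari 1.4.2. It assumes that the server listens on port 8080, that a service is started with a PUT that sets `ServiceInfo/state` to `STARTED`, that the `X-Requested-By` header is required, and that the internal service names are the upper-case ones listed. The cluster name, host and credentials are placeholders.

```python
import base64
import json
import urllib.request

# Order reported to work in this thread (HBase last).
# The upper-case identifiers are assumed to be Ambari's internal service names.
START_ORDER = [
    "HDFS", "YARN", "MAPREDUCE2", "HIVE", "WEBHCAT",
    "OOZIE", "GANGLIA", "NAGIOS", "ZOOKEEPER", "HBASE",
]


def build_start_request(base_url, cluster, service, user, password):
    """Build (but do not send) a PUT request asking Ambari to start one service."""
    url = f"{base_url}/api/v1/clusters/{cluster}/services/{service}"
    body = json.dumps({
        "RequestInfo": {"context": f"Start {service} from script"},
        "Body": {"ServiceInfo": {"state": "STARTED"}},
    }).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, data=body, method="PUT")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("X-Requested-By", "ambari")  # assumed to be required
    return request


def start_in_order(base_url, cluster, user, password, send=urllib.request.urlopen):
    """Submit one start request per service, in START_ORDER; return the names submitted."""
    submitted = []
    for service in START_ORDER:
        response = send(build_start_request(base_url, cluster, service, user, password))
        if response is not None:
            response.close()
        submitted.append(service)
    return submitted
```

A call such as `start_in_order("http://ambari-host:8080", "mycluster", "admin", "admin")` would submit the ten requests in the order that worked by hand. Since Ambari appears to queue background operations, the sketch does not wait between requests.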

    #46168

    Jeff Sposetti
    Moderator

    Hi Santiago,

    Good to hear about your progress. Was HBase the only service you had trouble with during “start all”? For HBase, you should make sure the NameNode (NN) is fully started before attempting to start HBase.

    Jeff
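
One rough way to check that the NameNode is up before starting HBase is to poll its web UI port (50070, the port mentioned elsewhere in this thread). The sketch below only tests that the port accepts TCP connections. That is an approximation: an open port does not prove that the NameNode has finished starting or has left safe mode. The function names and the retry values are illustrative.

```python
import socket
import time


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host, port, attempts=30, delay=10.0,
                  probe=port_is_open, sleep=time.sleep):
    """Probe host:port up to `attempts` times, sleeping `delay` seconds after each failure."""
    for _ in range(attempts):
        if probe(host, port):
            return True
        sleep(delay)
    return False
```

For example, `wait_for_port("namenode-host", 50070)` returns `True` as soon as the port answers, or `False` after roughly five minutes with the default values.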

    #46166

    Hi Jeff,

    Almost solved! As you mentioned, the Ambari agents had to be running (my misunderstanding) and were not. Once I started the agents, the Ambari server was able to start everything, but not automatically by clicking on ‘start all’. ‘Start all’ failed to start most services, and I had to start them manually, since the ‘Start’ button for each failed service was now enabled. HBase failed to start twice more, but I succeeded in starting it once all the other services were running. It also passed all the smoke tests. The cluster is 100% healthy now, but the starting procedure needs some improvement. Any ideas?

    Thank you very much for your help.

    #46164

    Jeff Sposetti
    Moderator

    Hi,

    You say “All services are in yellow with ‘heartbeat lost’”. I think your Ambari Agents are not started, so go onto the machines and run “ambari-agent start”.

    FWIW, you do have Ambari Agents installed. The choice between automatic (via SSH) and manual host registration is really about how you expect the Ambari Server to connect to the machines and install the Ambari Agents. If you choose SSH, the Ambari Server can SSH to the machines, install the Ambari Agent, start it, and continue. If you choose manual (for those who do not have SSH access), you have to install and start the Agents yourself. In both cases, the end result is that the agents are on the boxes.

    The “Yellow” indicates heartbeat lost, which makes me think the Ambari Agents aren’t started on the machines, so your start/stop from Ambari can’t talk to the machines. Go onto the machines, start the agents and see if the heartbeat comes back. You can also use chkconfig to set the Ambari Agents to auto-start.

    Let us know how it goes. Thanks!

    Jeff
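
Besides looking at the yellow status in the web UI, it may be possible to list the hosts whose agents have stopped reporting by asking the Ambari REST API. The sketch below parses a response that is assumed to come from `GET /api/v1/clusters/<cluster>/hosts?fields=Hosts/host_state` and assumes that a lost agent is reported as `host_state` equal to `HEARTBEAT_LOST`. Both the endpoint and the field value are assumptions to verify against your Ambari version, and the function name is illustrative.

```python
import json


def hosts_without_heartbeat(hosts_json):
    """Return the host names whose reported state is HEARTBEAT_LOST.

    `hosts_json` is the raw JSON text of the (assumed) hosts listing.
    """
    data = json.loads(hosts_json)
    lost = []
    for item in data.get("items", []):
        host = item.get("Hosts", {})
        if host.get("host_state") == "HEARTBEAT_LOST":
            lost.append(host.get("host_name"))
    return lost
```

Each host returned is a candidate for logging in and running “ambari-agent start”.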

    #46156

    Oops. ambari-server does have a --version option, even if it is not displayed. The version is: 1.4.2.104

    #46155

    Hi Jeff,

    Thanks for the quick answer. Yes, after the install everything was running fine at 100%: smoke tests, dashboard and all services. I do not have Ambari agents, since I enabled password-free SSH across the nodes.

    I power the machines back on with exactly the same DNS server resolving all nodes forwards and backwards, and with the same password-free SSH I used for the install. I start the Ambari console. All services are in yellow with ‘heartbeat lost’ (except HBase in red).

    I click on ‘start all’ and it fails. After a while I get 100% in orange and, clicking to see the details, all I can see is ‘HBase check ..’, ‘Pig check ..’, etc.
    It looks like it is trying to run checks rather than starting the services. HDFS is not even started, as there is no service on port 50070 and the hadoop fs -xxx commands are not working. No Hadoop service is running.

    I cannot just select the HDFS service and click on ‘start’ because the button is disabled. Only the ‘start all’ button is enabled and it is not starting anything.

    I am not sure about the Ambari version (ambari-server has no version option and the web interface shows the version of anything but Ambari), but I downloaded and installed it yesterday. When I click on Ambari it goes to ambari.apache.org and says Version 1-4-1 SNAPSHOT, but that may be the latest version regardless of what is installed.

    #46152

    Jeff Sposetti
    Moderator

    Hi Santiago,

    To confirm your steps:

    1) You performed the install.
    2) All Hadoop Services were started and running fine, and the Ambari Server and the Ambari Agents were fine
    3) You powered off the machines
    4) You powered the machines back on and started the Ambari Server and the Ambari Agents
    5) Now you are unable to get the Hadoop Services to start?
    6) Also, can you confirm what version of Ambari you are using? Run “ambari-server --version”

    Thanks,

    Jeff
