HDP on Linux – Installation Forum

Can't restart cluster – ambari not proving useful

  • #58050
    Brian Greeson

    Hey guys, I’m running an HDP 2.1 cluster on a set of machines running CentOS 6.5.

    I have successfully installed the cluster (a few times now), but I consistently run into issues when restarting the machines that make up the cluster. I’d like to resolve the issue and get my cluster running again, as well as figure out a procedure for avoiding it in the future.

    First, I made sure to run Stop All from the Ambari interface. Then I shut down all the systems, including the one running ambari-server. Is there anything wrong with this procedure? Should I run ambari-server stop on the ambari-server system before shutting that machine down?

    Anyhow, I powered on all the systems and waited to ensure they were all up. Then I opened the web interface on the ambari-server machine, logged in, and ran Start All. It consistently fails, usually after around 11 or 12 seconds.

    The odd thing is that when you look at the individual services on the nodes, a failed service sometimes has the red circle with an exclamation point and sometimes the yellow bar. It’s not really clear what either of those means, since there is no key, but I assume the red is a complete failure.

    When I click on any of the failed services, I get messages indicating that the service failed to install, although it was previously installed and I never wanted to install it again. I only wanted to start my cluster.

    I would attach a screenshot of one of the task lists for a node, but I don’t see any option to do so.

    I can paste the stdout and stderr for one of the services on a particular node that shows red, if that would be helpful.

    What does the yellow line mean as compared to the red circle?


  • #58053
    Jeff Sposetti

    Please post the stdout and stderr from the Start All. Just pick a host and a component task where you see a red exclamation failure.

    Red exclamation means the task failed.

    The yellow bar means the task was cancelled. If a master component start task fails (red exclamation), Ambari cancels the remaining tasks, so as not to bother attempting tasks that depend on the failed master component.

    If you try to start services individually (and not Start All), how does that work? Start with HDFS > Start, and so on.
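For anyone who prefers the command line to the web UI, starting services one at a time can also be sketched against Ambari's REST API. This is only a rough sketch, not something posted in the thread: the helper names are hypothetical, and `AMBARI_URL`, `CLUSTER` and the `admin:admin` credentials are placeholders you would replace with your own values.

```shell
# Sketch: start one service at a time through Ambari's REST API.
# ambari_state_payload and ambari_start_service are hypothetical helper names.
# AMBARI_URL, CLUSTER, AMBARI_USER and AMBARI_PASS are placeholders.

# Build the JSON body that asks Ambari to move a service to a given state.
ambari_state_payload() {
    service=$1   # e.g. HDFS
    state=$2     # STARTED to start, INSTALLED to stop
    printf '{"RequestInfo":{"context":"Set %s to %s"},"Body":{"ServiceInfo":{"state":"%s"}}}' \
        "$service" "$state" "$state"
}

# Send the request for a single service. Ambari requires the X-Requested-By header.
ambari_start_service() {
    service=$1
    curl -s -u "${AMBARI_USER:-admin}:${AMBARI_PASS:-admin}" \
        -H 'X-Requested-By: ambari' \
        -X PUT -d "$(ambari_state_payload "$service" STARTED)" \
        "${AMBARI_URL:-http://localhost:8080}/api/v1/clusters/${CLUSTER:?set CLUSTER}/services/$service"
}

# Usage (not run automatically here), HDFS first and then the rest:
#   CLUSTER=mycluster ambari_start_service HDFS
```

Starting HDFS on its own and then moving on service by service makes it easier to see which component fails first, rather than reading through the cancelled tasks of a failed Start All.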

    Brian Greeson

    Hi Jeff,

    Thanks for the response. I will report back with that information. However, I’ve noticed one other thing first: the yum database seems to be corrupted on the nodes. Any idea what could have caused this? I feel like this could be causing the errors, so I want to correct that first, and then I’ll let you know if I still have issues.

    Brian Greeson

    Hi Jeff,

    I was able to successfully restart the cluster after resolving an issue with the yum database that had occurred on all master and slave nodes.

    The error messages were something like “failed to install service X”.
    I’m assuming what happened is this:
    – Ambari used yum to check whether packages existed
    – Yum is broken
    – Since yum is broken, Ambari assumed packages were missing
    – Ambari attempted to use yum (which is broken) to install the missing packages and failed
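If the chain of events above is right, a quick check of the RPM database on each node before pressing Start All would catch the problem early. Here is a minimal sketch, assuming a damaged database makes `rpm -qa` either exit non-zero or print errors on stderr. The function name and the `RPM_QUERY_CMD` override are hypothetical, added only so the check can be tried without touching rpm.

```shell
# Sketch: test the RPM database before starting the cluster.
# Assumption: a damaged database makes `rpm -qa` exit non-zero or write
# errors to stderr. RPM_QUERY_CMD is a hypothetical override for testing.
rpm_db_looks_healthy() {
    # Capture stderr only; the package list on stdout is discarded.
    err=$(${RPM_QUERY_CMD:-rpm -qa} 2>&1 >/dev/null) || return 1
    # Healthy only if the query also stayed silent on stderr.
    [ -z "$err" ]
}

# Usage on each node (not run automatically here):
#   rpm_db_looks_healthy || echo "rpm database needs attention on $(hostname)"
```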

    Doing the following on all affected nodes fixed my yum issues:
    # rm -f /var/lib/rpm/__db*
    # rpm --rebuilddb
    # yum update
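Since the root cause is still unknown, it may be worth moving the `__db*` files aside rather than deleting them, so their timestamps and contents can be looked at later. This is a sketch of that variation on the fix above, not what was actually run in the thread. The helper name and the backup path in the usage comment are made up for illustration.

```shell
# Sketch: move the Berkeley DB environment files (__db*) aside instead of
# deleting them, so they can be inspected later.
# backup_rpm_db_env is a hypothetical helper name, not part of rpm or yum.
backup_rpm_db_env() {
    db_dir=$1        # normally /var/lib/rpm
    backup_dir=$2    # any writable directory
    mkdir -p "$backup_dir" || return 1
    for f in "$db_dir"/__db*; do
        [ -e "$f" ] || continue      # the glob matched nothing
        mv -- "$f" "$backup_dir"/ || return 1
    done
}

# Usage on an affected node, as root (not run automatically here):
#   backup_rpm_db_env /var/lib/rpm /root/rpm-db-backup \
#       && rpm --rebuilddb \
#       && yum update
```

Only the `__db*` files are moved; the package data itself (for example `Packages`) stays in place for `rpm --rebuilddb` to work from.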

    However, the question remains: what caused this? All I’ve done is successfully install the cluster, start it, and stop it. Then I shut down the nodes and powered them up. That’s when this issue manifested. It seems to me that Ambari must be the cause, then.

    Any thoughts?

    Thanks again,

    Brian Greeson

    After shutting down the cluster via Stop All, powering off the machines, and attempting to start the cluster again, I’ve encountered the same issues with the yum databases. Any clue?


    Hi Brian, are you able to reply with how much RAM was allocated to your agent, and the memory block size?
    I found some related issues on the Red Hat Bugzilla page that may point to the RAM being less than 1GB, and/or the memory block size being less than 4KB.
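The two numbers asked about can be read off each node with a few lines of shell. This is a rough sketch that assumes "memory block size" means the kernel page size, which is a guess at what the post intends. The function names are hypothetical, and the thresholds are simply the 1GB and 4KB figures mentioned above.

```shell
# Sketch: report total RAM and memory page size on a node.
# Assumption: "memory block size" in the post means the kernel page size.

# Reads MemTotal (in kB) from a meminfo-style file; defaults to /proc/meminfo.
mem_total_kb() {
    awk '/^MemTotal:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Prints the page size in bytes.
page_size_bytes() {
    getconf PAGESIZE
}

# Succeeds when either value is under the thresholds mentioned in the post
# (1 GB of RAM = 1048576 kB, 4 KB pages = 4096 bytes).
below_thresholds() {
    mem_kb=$1
    page_bytes=$2
    [ "$mem_kb" -lt 1048576 ] || [ "$page_bytes" -lt 4096 ]
}

# Usage on each node (not run automatically here):
#   below_thresholds "$(mem_total_kb)" "$(page_size_bytes)" && echo "below the thresholds"
```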
