HDP on Linux – Installation Forum

Cluster uninstallation failed

  • #11962
    Shyam Kadari
    Member

    Hello:
    I am installing HDP (latest version) using “hmc” on a five-node VM cluster. The installation failed because of puppet cert problems.
    I deleted the puppet certs on the nodes so they could be regenerated.
    hmc required me to uninstall before attempting to install again.
    When I uninstalled, the uninstall failed.
    hmc shows the following error:

    Cluster uninstallation failed
    Failed to uninstall the cluster. View the troubleshooting guide.
    What should I do to recover from this uninstall error?

    S

    Here is part of the error file:

    [2012:11:08 22:44:07][WARN][PuppetInvoker][PuppetInvoker.php:344][waitForResults]: Kick timed out, waited 120 seconds
    [2012:11:08 22:44:08][INFO][PuppetInvoker][PuppetInvoker.php:237][createGenKickWaitResponse]: Response of genKickWait:
    Array
    (
    [result] => 0
    [error] =>
    [nokick] => Array
    (
    )

    [failed] => Array
    (
    [0] => logproc-name-1.highwire.org
    [1] => logproc-data-2.highwire.org
    [2] => logproc-data-3.highwire.org
    [3] => logproc-data-1.highwire.org
    [4] => logproc-data-4.highwire.org
    )

    [success] => Array
    (
    )

    [timedoutnodes] => Array
    (
    [0] => logproc-data-2.highwire.org
    [1] => logproc-data-3.highwire.org
    [2] => logproc-data-1.highwire.org
    [3] => logproc-data-4.highwire.org
    )

    )

    [2012:11:08 22:44:08][INFO][Cluster:HighWireDev01][Cluster.php:150][_uninstallAllServices]: Persisting puppet report for uninstall HDP
    [2012:11:08 22:44:08][ERROR][Cluster:HighWireDev01][Cluster.php:164][_uninstallAllServices]: Puppet kick failed, no successful nodes
    [2012:11:08 22:44:08][INFO][OrchestratorDB][OrchestratorDB.php:610][persistTransaction]: persist: 4-60-2:FAILED: Cluster uninstall:FAILED
    [2012:11:08 22:44:08][INFO][Cluster:HighWireDev01][Cluster.php:1039][setState]: HighWireDev01 – FAILED
    [2012:11:08 22:44:08][INFO][OrchestratorDB][OrchestratorDB.php:610][persistTransaction]: persist: 4-61-60:FAILED:Miscellaneous uninstall:FAILED
    [2012:11:08 22:44:08][INFO][OrchestratorDB][OrchestratorDB.php:556][setServiceState]: MISCELLANEOUS – FAILED
    [2012:11:08 22:44:08][INFO][Service: MISCELLANEOUS (HighWireDev01)][Service.php:130][setState]: MISCELLANEOUS – FAILED dryRun=
    [2012:11:08 22:44:08][INFO][OrchestratorDB][OrchestratorDB.php:610][persistTransaction]: persist: 4-62-60:FAILED:Nagios uninstall:FAILED
    [2012:11:08 22:44:08][INFO][OrchestratorDB][OrchestratorDB.php:556][setServiceState]: NAGIOS – FAILED
    [2012:11:08 22:44:08][INFO][Service: NAGIOS (HighWireDev01)][Service.php:130][setState]: NAGIOS – FAILED dryRun=
    [2012:11:08 22:44:08][INFO][OrchestratorDB][OrchestratorDB.php:610][persistTransaction]: persist: 4-63-60:FAILED:Ganglia uninstall:FAILED
    [2012:11:08 22:44:08][INFO][OrchestratorDB][OrchestratorDB.php:556][setServiceState]: GANGLIA – FAILED
    [2012:11:08 22:44:08][INFO][Service: GANGLIA (HighWireDev01)][Service.php:130][setState]: GANGLIA – FAILED

  • #11963
    Shyam Kadari
    Member

    The original problem that led me to uninstall was the following set of puppet errors:

    Thu Nov 08 12:59:25 -0800 2012 Puppet (notice): Using cached catalog
    Thu Nov 08 12:59:25 -0800 2012 Puppet (debug): catalog supports formats: b64_zlib_yaml dot marshal pson raw yaml; using pson
    Thu Nov 08 12:59:25 -0800 2012 Puppet (err): Could not retrieve catalog from remote server: certificate verify failed. This is often because the time is out of sync on the server or client
    Thu Nov 08 12:59:25 -0800 2012 Puppet (info): Not using expired catalog for logproc-data-3.highwire.org from cache; expired at Wed Nov 07 17:59:33 -0800 2012
    Thu Nov 08 12:59:25 -0800 2012 Puppet (notice): Using cached catalog
    Thu Nov 08 12:59:25 -0800 2012 Puppet (err): Could not retrieve catalog; skipping run
    Thu Nov 08 12:59:25 -0800 2012 Puppet (debug): Value of ‘preferred_serialization_format’ (pson) is invalid for report, using default (marshal)
    Thu Nov 08 12:59:25 -0800 2012 Puppet (debug): report supports formats: b64_zlib_yaml marshal raw yaml; using marshal
    Thu Nov 08 12:59:25 -0800 2012 Puppet (err): Could not send report: certificate verify failed. This is often because the time is out of sync on the server or client

    Shyam Kadari

    #11966
    Robert
    Participant

    Hi Shyam,
    You need to uninstall puppet on the node where hmc is installed. You can simply run ‘yum erase puppet’, which should remove puppet along with hmc, since hmc depends on it. After that, reinstall hmc with ‘yum install hmc’ and make sure the clocks are in sync on all the machines. Ideally, enable the ntp service on your nodes so they all stay in sync. Once you have verified that the times match, run ‘service hmc start’ and try the install again.
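
    The steps above can be sketched as a short shell script. This is only a sketch under assumptions that are not stated in the thread: a RHEL/CentOS-style node where the NTP package is called "ntp" and its service "ntpd", and root ssh access from the hmc node to the other nodes for the clock check. The function names and the 5-second tolerance are made up for illustration and are not part of hmc.

```shell
#!/bin/sh
# Sketch of the reinstall-and-resync steps. Nothing runs until a
# function is called. Function names and MAX_SKEW are illustrative.

MAX_SKEW=5   # hypothetical tolerance, in seconds, between node clocks

# Print the absolute difference between two epoch timestamps.
skew_seconds() {
    if [ "$1" -ge "$2" ]; then
        echo $(( $1 - $2 ))
    else
        echo $(( $2 - $1 ))
    fi
}

# Compare the local clock with each host passed as an argument.
# The measured value includes ssh latency, so treat it as approximate.
check_clocks() {
    for host in "$@"; do
        remote=$(ssh "$host" date +%s) || { echo "$host: unreachable"; continue; }
        skew=$(skew_seconds "$(date +%s)" "$remote")
        if [ "$skew" -gt "$MAX_SKEW" ]; then
            echo "$host: clock differs by ${skew}s"
        else
            echo "$host: ok"
        fi
    done
}

# On the hmc node, as root: remove puppet (hmc goes with it), then reinstall hmc.
reinstall_hmc() {
    yum -y erase puppet &&
    yum -y install hmc
}

# On every node, as root: install NTP and enable it at boot (assumed package/service names).
enable_ntp() {
    yum -y install ntp &&
    chkconfig ntpd on &&
    service ntpd start
}
```

    With these loaded, you would call reinstall_hmc on the hmc node, enable_ntp on each node, then check_clocks followed by the node hostnames, and run ‘service hmc start’ only once every host reports ok.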

    HTH

    Robert

    #11967

    I generated a key using the following command:
    ssh-keygen

    For the private key file for root I gave the path /root/.ssh/id_rsa.pub.
    For the host file I gave hostname.txt, and in that file I put the hostname.

    I still got the following error:

    Failed. Reason: Permission denied, please try again.
    Permission denied, please try again.
    Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).

    Could anyone please help me?

    Regards,
    T Chenna krishna

    #11970
    Shyam Kadari
    Member

    Hello Robert:

    I have uninstalled and reinstalled hmc and puppet multiple times and tried to create a cluster.
    At different points of the installation, different puppet agents failed to complete.

    A) All the RPMs are already installed on all the nodes.
    B) The VMs are running with a SAN as storage.

    C) The following is from hmc.log on node logproc-name-1:

    [2012:11:09 06:29:47][WARN][PuppetInvoker][PuppetInvoker.php:344][waitForResults]: Kick timed out, waited 120 seconds
    [2012:11:09 06:29:48][INFO][PuppetInvoker][PuppetInvoker.php:292][genKickWait]: Kick attempt (2/3)
    [2012:11:09 06:29:49][INFO][PuppetInvoker][PuppetInvoker.php:97][sendKick]: logproc-name-1 previous kick still running, will continue to wait
    [2012:11:09 06:29:49][INFO][PuppetInvoker][PuppetInvoker.php:332][waitForResults]: Waiting for results from logproc-name-1.highwire.org
    [2012:11:09 06:29:49][INFO][PuppetInvoker][PuppetInvoker.php:336][waitForResults]: 0 out of 1 nodes have reported for txn 3-58-57
    [2012:11:09 06:30:54][INFO][PuppetInvoker][PuppetInvoker.php:336][waitForResults]: 1 out of 1 nodes have reported for txn 3-58-57
    [2012:11:09 06:30:55][INFO][PuppetInvoker][PuppetInvoker.php:237][createGenKickWaitResponse]: Response of genKickWait:
    Array
    (
    [result] => 0
    [error] =>
    [nokick] => Array
    (
    )

    [failed] => Array
    (
    [0] => logproc-name-1
    )

    [success] => Array
    (

    Your help would be much appreciated.

    Shyam Kadari

    #11971
    tedr
    Member

    T Chenna Krishna,

    It looks like you pointed HMC at your public key rather than the private key it needs. Please try pointing it at id_rsa, not id_rsa.pub.
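
    If it is unclear which file is which, a quick check can be sketched as below. It relies on the usual file layout: a private key written by ssh-keygen starts with a "-----BEGIN ... PRIVATE KEY-----" header line, while the .pub file is a single line starting with the key type (for example ssh-rsa). The function name is_private_key is made up for illustration.

```shell
# Return 0 if the first line of the file looks like a private key header.
is_private_key() {
    head -n 1 "$1" 2>/dev/null | grep -q '^-----BEGIN .*PRIVATE KEY-----'
}
```

    For example, `is_private_key /root/.ssh/id_rsa && echo "private key"` should print the message for id_rsa and print nothing for id_rsa.pub.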

    Ted.

    Shyam,

    Could you run the script described here: http://hortonworks.com/community/forums/topic/hmc-installation-support-help-us-help-you/ and upload the results?

    Ted.

    #12027

    Thanks Shyam,
    I changed the file, but I still got the same error.
    I used the following Linux command to generate the key:
    ssh-keygen
    Is this correct or not?
    Could you please advise me?

    #12028
    tedr
    Member

    T Chenna Krishna,

    ssh-keygen is the right command for generating the keys. Are you running the installer as root or as some other user?

    Ted.

    #12121

    Yes, I am running the cluster as root only.

    #12127
    tedr
    Member

    T Chenna Krishna,

    Have you fully set up passwordless ssh to all nodes on the cluster? Have you copied your ssh key to the other boxes?
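
    Setting that up might look like the sketch below. It assumes root's key pair is /root/.ssh/id_rsa and /root/.ssh/id_rsa.pub, that the host file (the hostname.txt mentioned earlier in the thread) lists one host per line, and that ssh-copy-id is installed. The function names are illustrative only.

```shell
# Print the hosts from a file, skipping blank lines and "#" comments.
read_hosts() {
    sed -e 's/#.*//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d' "$1"
}

# Copy root's public key to every host (asks for each root password once).
push_key() {
    for host in $(read_hosts "$1"); do
        ssh-copy-id -i /root/.ssh/id_rsa.pub "root@$host"
    done
}

# Check that key-only login works; BatchMode makes ssh fail instead of prompting.
verify_login() {
    for host in $(read_hosts "$1"); do
        if ssh -o BatchMode=yes "root@$host" true; then
            echo "$host: ok"
        else
            echo "$host: key login failed"
        fi
    done
}
```

    Running push_key hostname.txt and then verify_login hostname.txt should report ok for every node before you retry the installer.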

    Ted
