Registration with the server failed

This topic contains 7 replies, has 2 voices, and was last updated by  Larry Liu 1 year, 5 months ago.

  • #23812

    Knut Nordin
    Member

    Hi,

    I am having trouble getting past the step “Confirm Hosts” in the Ambari cluster install wizard. Installation of the Ambari server and agents seems to succeed, but then the process stops at the step “Registering with the server”.

    My setup
    ==============
    Hardware: Three Amazon instances running in a Virtual Private Cloud (to allow permanent private hostnames)
    OS: RedHat Enterprise Linux 6.4
    SELinux disabled
    iptables disabled
    No firewall between instances
    SSH set up and confirmed to be working from Ambari host to all nodes, including itself
    All hostnames in lower case (and confirmed to be reachable by ping and ssh)

    Registration log from GUI (tail)
    ==================
    INFO 2013-04-30 06:32:20,232 Controller.py:153 – Got server response: {u’executionCommands’: [],
    u’registrationCommand’: None,
    u’responseId’: 1,
    u’restartAgent’: False,
    u’statusCommands’: []}
    INFO 2013-04-30 06:32:20,232 Controller.py:116 – No commands from the server : []
    INFO 2013-04-30 06:32:20,232 Controller.py:116 – No commands from the server : []
    INFO 2013-04-30 06:32:20,232 Controller.py:180 – No commands sent from the Server.
    “, None)

    STDERR
    Connection to ip-10-0-0-233 closed.
    Registering with the server…
    Registration with the server failed.

    Ambari server log (tail)
    ====================
    06:47:08,254 INFO QueryImpl:130 – Executing resource query: {Host=null}
    06:47:08,255 INFO ClusterControllerImpl:92 – Using resource provider org.apache.ambari.server.controller.internal.HostResourceProvider for request type Host
    06:47:11,315 INFO QueryImpl:130 – Executing resource query: {Host=null}
    06:47:11,316 INFO ClusterControllerImpl:92 – Using resource provider org.apache.ambari.server.controller.internal.HostResourceProvider for request type Host
    06:47:14,373 INFO QueryImpl:130 – Executing resource query: {Host=null}

    Ambari agent log (tail)
    =================
    INFO 2013-04-30 07:17:34,031 Heartbeat.py:68 – Heartbeat dump: {‘componentStatus’: [],
    ‘hostname’: ‘ip-10-0-0-233.eu-west-1.compute.internal’,
    ‘nodeStatus’: {’cause’: ‘NONE’, ‘status’: ‘HEALTHY’},
    ‘reports’: [],
    ‘responseId’: 265,
    ‘timestamp’: 1367320654029}
    INFO 2013-04-30 07:17:34,080 Controller.py:153 – Got server response: {u’executionCommands’: [],
    u’registrationCommand’: None,
    u’responseId’: 266,
    u’restartAgent’: False,
    u’statusCommands’: []}
    INFO 2013-04-30 07:17:34,080 Controller.py:116 – No commands from the server : []
    INFO 2013-04-30 07:17:34,080 Controller.py:116 – No commands from the server : []
    INFO 2013-04-30 07:17:34,080 Controller.py:180 – No commands sent from the Server.

    I cannot find anything that looks like a proper error message in the logs.

    I have retried the registration, reset the ambari server, restarted the machines and retried the entire installation from scratch several times, but always with the same result. Please advise!

Viewing 7 replies - 1 through 7 (of 7 total)


  • #24073

    Larry Liu
    Moderator

    Hi, Knut,

    It is great news that you got it to work.

    To answer your question, if you use DNS, /etc/hosts is not required.

    This is a weird scenario. In my experience, resetting the Ambari server while it is still running does not work. The Ambari server has to be stopped before the reset.
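The stop-before-reset order can be sketched as a small Python wrapper. This is only an illustration: `RESET_STEPS` and `run_steps` are hypothetical names, and it assumes the `ambari-server` command is on the PATH and is run with sufficient privileges. The `reset` step may prompt for confirmation on the terminal.

```python
import subprocess

# The three ambari-server subcommands, in the order suggested above:
# stop the running server, reset it, then start it again.
RESET_STEPS = [
    ["ambari-server", "stop"],
    ["ambari-server", "reset"],
    ["ambari-server", "start"],
]

def run_steps(steps, runner=subprocess.call):
    """Run each command in order and stop at the first non-zero exit code.

    `runner` is any callable that takes a command list and returns an
    exit code; it defaults to subprocess.call. Returns the list of
    commands that were actually attempted.
    """
    attempted = []
    for cmd in steps:
        attempted.append(cmd)
        if runner(cmd) != 0:
            break  # do not reset or start if an earlier step failed
    return attempted

if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    run_steps(RESET_STEPS, runner=lambda cmd: print(" ".join(cmd)) or 0)
```

Dropping the `runner` argument in the last line would execute the commands for real; the dry run is the default here so the sketch is safe to try.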

    Larry

    #24035

    Knut Nordin
    Member

    Well... After resetting the Ambari server for the nth time, the entire installation process suddenly worked!

    The only difference from previous times is that I had not started the web gui at all after rebooting the host. Could there be some state inconsistency issues with the installation wizard perhaps?

    #24032

    Knut Nordin
    Member

    Hi again!

    I have now figured out how to query PostgreSQL for this info. Although I did successfully register the hosts in the GUI, they do not seem to have been added to the ambari.hosts table.

    The table is actually completely empty, so this is not just a mismatch between hostname formats.

    \Knut

    #24031

    Knut Nordin
    Member

    Hi Larry,

    Thanks for the reply!

    I have actually not added the host names to /etc/hosts. Since I can use DNS to resolve them, I assumed it was not necessary. Is that incorrect?

    The contents of /etc/hosts are simply as follows:
    127.0.0.1 localhost.localdomain localhost
    ::1 localhost6.localdomain6 localhost6

    Is there a way to manually interact with the PostgreSQL database to find out what is in the hosts table? I have tried running psql from the command line but am not sure what credentials to use.

    Thanks!
    \Knut

    #23817

    Larry Liu
    Moderator

    Hi, Knut

    The error ‘Detail: Key (host_name)=(ip-10-0-0-233.eu-west-1.compute.internal) is not present in table “hosts”.’ means that the hostname ip-10-0-0-233.eu-west-1.compute.internal has not been added to the hosts table. It seems there are still issues with registering hosts.
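To illustrate this kind of foreign key failure, here is a minimal sketch that reproduces the same class of error with an in-memory SQLite database instead of Ambari's PostgreSQL database. The two-table schema is a simplified stand-in that borrows only the table and column names seen in the log, and `try_map_host` is a hypothetical helper, not Ambari code.

```python
import sqlite3

def try_map_host(conn, cluster_id, host_name):
    """Try to insert a cluster-to-host mapping.

    Returns None on success, or the database error text if the insert
    is rejected by a constraint.
    """
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO clusterhostmapping (cluster_id, host_name) "
                "VALUES (?, ?)",
                (cluster_id, host_name),
            )
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.execute("CREATE TABLE hosts (host_name TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE clusterhostmapping ("
    " cluster_id INTEGER,"
    " host_name TEXT REFERENCES hosts(host_name))"
)

host = "ip-10-0-0-233.eu-west-1.compute.internal"

# With an empty hosts table the mapping insert is rejected.
print("before registration:", try_map_host(conn, 1, host))

# Once the host row exists, the same insert goes through.
with conn:
    conn.execute("INSERT INTO hosts (host_name) VALUES (?)", (host,))
print("after registration: ", try_map_host(conn, 1, host))
```

The point of the sketch is only that the mapping row cannot be created until the host row exists, which is why an empty hosts table blocks the deploy step.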

    I assume you use /etc/hosts. Can you please make sure that all the hosts are in /etc/hosts on all the servers?

    Please also list the content of your /etc/hosts here.

    Larry

    #23814

    Knut Nordin
    Member

    Update 2: Deployment hangs

    Having gotten past the registration step, the installation process now hangs on the deployment step.
    After hitting “Deploy” in the “Review” step, the progress bar reaches step 34 out of 59 and then hangs.

    When reviewing the Ambari server log, there seems to be an issue with the hostnames added to the Ambari database:

    08:15:21,295 ERROR BaseManagementHandler:65 – Caught a runtime exception while attempting to create a resource
    javax.persistence.RollbackException: Exception [EclipseLink-4002] (Eclipse Persistence Services – 2.4.0.v20120608-r11652): org.eclipse.persistence.exceptions.DatabaseException
    Internal Exception: org.postgresql.util.PSQLException: ERROR: insert or update on table “clusterhostmapping” violates foreign key constraint “fk_clusterhostmapping_host_name”
    Detail: Key (host_name)=(ip-10-0-0-233.eu-west-1.compute.internal) is not present in table “hosts”.
    Error Code: 0
    Call: INSERT INTO ambari.ClusterHostMapping (cluster_id, host_name) VALUES (?, ?)
    bind => [2 parameters bound]
    Query: DataModifyQuery(name=”clusterEntities” sql=”INSERT INTO ambari.ClusterHostMapping (cluster_id, host_name) VALUES (?, ?)”)
    at org.eclipse.persistence.internal.jpa.transaction.EntityTransactionImpl.commitInternal(EntityTransactionImpl.java:102)
    at org.eclipse.persistence.internal.jpa.transaction.EntityTransactionImpl.commit(EntityTransactionImpl.java:63)
    at com.google.inject.persist.jpa.JpaLocalTxnInterceptor.invoke(JpaLocalTxnInterceptor.java:87)
    at org.apache.ambari.server.state.cluster.ClustersImpl.mapHostToCluster(ClustersImpl.java:318)
    at org.apache.ambari.server.controller.AmbariManagementControllerImpl.createHosts(AmbariManagementControllerImpl.java:599)
    at org.apache.ambari.server.controller.internal.HostResourceProvider$1.invoke(HostResourceProvider.java:109)
    at org.apache.ambari.server.controller.internal.HostResourceProvider$1.invoke(HostResourceProvider.java:106)
    at org.apache.ambari.server.controller.internal.AbstractResourceProvider.createResources(AbstractResourceProvider.java:249)

    I have tried running ambari-server reset and restarting the installation process, but it seems like the reset is not clearing out everything that is needed because I get the exact same problem again.

    Is there a way to manually check and if necessary update the database entries that are involved here?

    #23813

    Knut Nordin
    Member

    Update: I finally managed to get past this step. I noticed that the agent configs were using the fully qualified host name of the server, while I had filled in only the “internal” hostnames as reported by the machine itself. Using the fully qualified names in the GUI resolved things.
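For anyone comparing the two name forms on a node, a minimal Python sketch follows. It prints the short and fully qualified host names as the local resolver reports them and checks whether a name typed into the wizard equals the name the agent reports. The helpers `host_name_forms` and `names_match` are hypothetical, and the exact-match rule is an assumption based on the fix described here, not a documented Ambari behavior.

```python
import socket

def host_name_forms():
    """Return (short_name, fqdn) as reported by the local resolver."""
    return socket.gethostname(), socket.getfqdn()

def names_match(entered, agent_reported):
    """Assumption: the name entered in the wizard must equal the
    agent-reported name exactly (surrounding whitespace ignored)."""
    return entered.strip() == agent_reported.strip()

if __name__ == "__main__":
    short_name, fqdn = host_name_forms()
    print("short name:", short_name)
    print("FQDN:      ", fqdn)
    print("identical: ", names_match(short_name, fqdn))
```

If the two printed names differ, entering the fully qualified one in the wizard is the form that worked in this thread.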
