Installation failed – Add nodes step

This topic contains 28 replies, has 5 voices, and was last updated by  zuhair attya 3 months ago.

  • Creator
    Topic
  • #13650

    fbvdka
    Member

    I cannot complete the installation when I try to add nodes.
    It fails during the installation of ruby-libs.
    When I run yum install ruby-libs, my system tells me that it requires libreadline.so.5 even though compat-readline5 is installed!
    My system is RHEL 6.

Viewing 13 replies - 16 through 28 (of 28 total)

  • Author
    Replies
  • #13908

    fbvdka
    Member

    Hi Larry and tedr,

    I am using DNS. I can successfully ssh from the master to the other nodes, and from the master to itself, all passwordless.

    #13799

    tedr
    Member

    Hi Fabrice,

    Can you successfully ssh from the node on which Ambari is being run to all other nodes in the cluster? Also check that you can ssh from the Ambari machine to itself. All of this ssh’ing must be passwordless.

    Thanks,
    Ted.
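
The passwordless ssh check described above can be scripted. The following is a minimal sketch, assuming Python 3 and an OpenSSH client on the Ambari machine; the function names and the default `root` user are illustrative and not part of Ambari. `BatchMode=yes` makes ssh fail instead of prompting, so a password prompt (or an unconfirmed host key) shows up as a failure.

```python
import subprocess


def ssh_check_command(host, user="root", timeout=5):
    """Build an ssh command that fails instead of prompting for input."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                # never prompt for a password
        "-o", f"ConnectTimeout={timeout}",    # give up quickly on dead hosts
        f"{user}@{host}",
        "true",                               # trivial remote command
    ]


def passwordless_ssh_ok(host, user="root", timeout=5):
    """Return True if the remote command runs without any prompt."""
    try:
        result = subprocess.run(
            ssh_check_command(host, user, timeout),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout + 5,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0
```

Calling `passwordless_ssh_ok(host)` for every node in the cluster, including the Ambari machine's own host name, covers both checks mentioned in the post.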

    #13797

    Larry Liu
    Moderator

    Hi, Fabrice

    Thanks for providing the test information.

    Are you using /etc/hosts or DNS? If you use /etc/hosts, can you please check if all the hosts in the cluster are in the /etc/hosts on each node?

    Hope this helps.

    Larry
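
For clusters that rely on /etc/hosts, the check above can be sketched as a small parser that reports which expected host names are missing from a node's file. This assumes Python 3; the function names and any host names passed in are hypothetical and would need to be replaced with the real cluster nodes.

```python
def parse_hosts(text):
    """Map every host name and alias in /etc/hosts-style text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            # Keep the first entry for a name, as the resolver would
            mapping.setdefault(name.lower(), ip)
    return mapping


def missing_hosts(text, expected):
    """Return the expected host names that have no entry in the text."""
    known = parse_hosts(text)
    return [host for host in expected if host.lower() not in known]
```

Running `missing_hosts(open("/etc/hosts").read(), cluster_hosts)` on each node should return an empty list when every cluster host is present.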

    #13784

    fbvdka
    Member

    Hi Larry,

    I am trying to install one master and three nodes.

    I ran the netstat command and got the following results:


    tcp 0 0 0.0.0.0:8441 0.0.0.0:* LISTEN 10434/java
    tcp 0 0 10.192.135.146:8441 10.192.135.148:53701 TIME_WAIT -
    tcp 0 0 10.192.135.146:8441 10.192.135.149:62078 FIN_WAIT2 -
    tcp 0 0 10.192.135.146:8441 10.192.135.148:53703 FIN_WAIT2 -

    In /var/log/ambari-server/ambari-server.log there is a warning:

    09:09:24,332 WARN nio:651 - javax.net.ssl.SSLHandshakeException: General SSLEngine problem

    And on one node, in /var/log/ambari-agent/ambari-agent.log:

    INFO 2013-01-17 16:53:28,979 security.py:48 - SSL Connect being called.. connecting to the server
    INFO 2013-01-17 16:53:29,093 Controller.py:103 - Unable to connect to: https://@ipmaster:8441/agent/v1/register/@ipnode
    Traceback (most recent call last):
      File "/usr/lib/python2.6/site-packages/ambari_agent/Controller.py", line 88, in registerWithServer
        response = self.sendRequest(self.registerUrl, data)
      File "/usr/lib/python2.6/site-packages/ambari_agent/Controller.py", line 235, in sendRequest
        self.cachedconnect = security.CachedHTTPSConnection(self.config)
      File "/usr/lib/python2.6/site-packages/ambari_agent/security.py", line 76, in __init__
        self.connect()
      File "/usr/lib/python2.6/site-packages/ambari_agent/security.py", line 81, in connect
        self.httpsconn.connect()
      File "/usr/lib/python2.6/site-packages/ambari_agent/security.py", line 65, in connect
        ca_certs=server_crt)
      File "/usr/lib64/python2.6/ssl.py", line 338, in wrap_socket
        suppress_ragged_eofs=suppress_ragged_eofs)
      File "/usr/lib64/python2.6/ssl.py", line 120, in __init__
        self.do_handshake()
      File "/usr/lib64/python2.6/ssl.py", line 279, in do_handshake
        self._sslobj.do_handshake()
    SSLError: [Errno 8] _ssl.c:490: EOF occurred in violation of protocol

    #13731

    Larry Liu
    Moderator

    Hi, Fabrice

    From your log file, I see the following error: Unable to connect to: https://xxxxx:8441/agent/v1/register/xxxx

    Can you please also check whether your Ambari master server can connect to https://xxxxx:8441/agent/v1/register/xxxx? Please run the following command on the Ambari master:
    netstat -anp | grep 8441

    If the command above returns no records, can you please attach the log files from the following directories?

    /var/log/ambari-server
    /var/log/ambari-agent

    Thanks

    Larry
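
The output of `netstat -anp | grep 8441` can also be checked programmatically. Below is a minimal sketch, assuming Python 3 and the Linux netstat column layout (protocol, receive queue, send queue, local address, remote address, state) seen in the output quoted in this thread; the function names are illustrative.

```python
def connections_on_port(netstat_output, port):
    """Return (local, remote, state) for TCP lines whose local port matches."""
    rows = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and non-TCP sockets
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        if local.rsplit(":", 1)[-1] == str(port):
            rows.append((local, remote, state))
    return rows


def is_listening(netstat_output, port):
    """Return True if some socket is in LISTEN state on the given local port."""
    return any(
        state == "LISTEN"
        for _, _, state in connections_on_port(netstat_output, port)
    )
```

If `is_listening(output, 8441)` is false on the master, the Ambari server is not accepting agent registrations on that port; if it is true but agents still fail, the problem is more likely in the connection itself (for example the SSL handshake) than in the listener.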

    #13718

    Larry Liu
    Moderator

    Hi Fabrice,

    Can you please provide the following information?

    1. How many nodes are in your cluster?
    2. Do you have passwordless ssh set up?
    3. What is in /etc/hosts on each node?
    4. What steps have you performed?

    Thanks

    Larry

    #13714

    Larry Liu
    Moderator

    Hi Fabrice,

    Thanks for trying Ambari 1.2. I am looking into the issue you are experiencing.

    Larry

    #13713

    fbvdka
    Member

    I have reinstalled with Ambari 1.2.
    Now I get the following error because the registration failed:
    ('INFO 2013-01-17 14:57:12,147 security.py:48 - SSL Connect being called.. connecting to the server
    INFO 2013-01-17 14:57:12,256 Controller.py:103 - Unable to connect to: https://xxxxx:8441/agent/v1/register/xxxx
    Traceback (most recent call last):
      File "/usr/lib/python2.6/site-packages/ambari_agent/Controller.py", line 88, in registerWithServer
        response = self.sendRequest(self.registerUrl, data)
      File "/usr/lib/python2.6/site-packages/ambari_agent/Controller.py", line 235, in sendRequest
        self.cachedconnect = security.CachedHTTPSConnection(self.config)
      File "/usr/lib/python2.6/site-packages/ambari_agent/security.py", line 76, in __init__
        self.connect()
      File "/usr/lib/python2.6/site-packages/ambari_agent/security.py", line 81, in connect
        self.httpsconn.connect()
      File "/usr/lib/python2.6/site-packages/ambari_agent/security.py", line 65, in connect
        ca_certs=server_crt)
      File "/usr/lib64/python2.6/ssl.py", line 338, in wrap_socket
        suppress_ragged_eofs=suppress_ragged_eofs)
      File "/usr/lib64/python2.6/ssl.py", line 120, in __init__
        self.do_handshake()
      File "/usr/lib64/python2.6/ssl.py", line 279, in do_handshake
        self._sslobj.do_handshake()
    SSLError: [Errno 8] _ssl.c:490: EOF occurred in violation of protocol
    ', None)

    STDERR
    Connection to xxxxx closed.
    Registering with the server…
    Registration with the server failed.

    #13712

    fbvdka
    Member

    Hi Ted,

    Yes I have.

    #13690

    tedr
    Member

    Hi Fabrice,

    Thanks for trying HDP.

    Have you made sure that the firewall (iptables) is off and SELinux is disabled on all nodes on your cluster?

    Thanks!
    Ted.
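
The SELinux half of the check above can be sketched in code. This is a minimal example, assuming Python 3, the standard `SELINUX=` key in /etc/selinux/config, and the `getenforce` utility (which prints Enforcing, Permissive, or Disabled); the function names are illustrative. The iptables state is not covered here and would still need to be checked separately on each node.

```python
import subprocess


def selinux_mode_from_config(text):
    """Return the SELINUX= value from /etc/selinux/config-style text, or None."""
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "SELINUX":  # exact match, so SELINUXTYPE is ignored
            return value.strip().lower()
    return None


def selinux_runtime_mode():
    """Return the running SELinux mode via getenforce, or None if unavailable."""
    try:
        result = subprocess.run(
            ["getenforce"], capture_output=True, text=True, timeout=5
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout.strip().lower() or None
```

The config file reflects what applies after the next reboot, while `getenforce` reports the current state, so the two can legitimately differ on a node that was reconfigured but not restarted.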

    #13689

    fbvdka
    Member

    I have finally succeeded in installing ruby-libs. I think it was a versioning problem.
    But now, when I try to configure my cluster, I receive the message “Async call failed” at the “Add Nodes” step.

    #13686

    fbvdka
    Member

    We had not installed any version of ruby-libs before.
    Here is the message I received:
    Preparing discovered nodes
    Entry Id : 103
    Final result : TOTALFAILURE
    Progress at the end : : 1 / 4 in progress; 3 failed
    Additional information :
    Host Info
    xxxx
    Failed. Reason: Error: Package: ruby-libs-1.8.7.352-7.el6_2.i686 (G02R01C00)
    Requires: libreadline.so.5
    xxxx:_ERROR_:retcode:[1], CMD:[out=`yum install -y ruby-devel rubygems`]: OUT:[Loaded plugins: priorities, security
    35 packages excluded due to repository priority protections
    Setting up Install Process
    No package rubygems available.
    Resolving Dependencies
    --> Running transaction check
    ---> Package ruby-devel.i686 0:1.8.7.352-7.el6_2 will be installed
    --> Processing Dependency: libruby.so.1.8 for package: ruby-devel-1.8.7.352-7.el6_2.i686
    ---> Package ruby-devel.x86_64 0:1.8.7.352-7.el6_2 will be installed
    --> Running transaction check
    ---> Package ruby-libs.i686 0:1.8.7.352-7.el6_2 will be installed
    --> Processing Dependency: libgdbm.so.2 for package: ruby-libs-1.8.7.352-7.el6_2.i686
    --> Processing Dependency: libreadline.so.5 for package: ruby-libs-1.8.7.352-7.el6_2.i686
    --> Running transaction check
    ---> Package gdbm.i686 0:1.8.0-36.el6 will be installed
    ---> Package ruby-libs.i686 0:1.8.7.352-7.el6_2 will be installed
    --> Processing Dependency: libreadline.so.5 for package: ruby-libs-1.8.7.352-7.el6_2.i686
    --> Finished Dependency Resolution
    You could try using --skip-broken to work around the problem
    You could try running: rpm -Va --nofiles --nodigest]
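
Long yum transcripts like the one above are easier to read once the dependency lines are pulled out. The following is a minimal sketch, assuming Python 3 and the `Processing Dependency: X for package: Y` line format shown in the output; the function names are illustrative. It also extracts the architecture suffix of each package name, which makes it visible that the package asking for libreadline.so.5 here is the i686 build of ruby-libs.

```python
import re

# Matches yum lines such as:
# --> Processing Dependency: libreadline.so.5 for package: ruby-libs-1.8.7.352-7.el6_2.i686
DEP_RE = re.compile(r"Processing Dependency: (\S+) for package: (\S+)")


def requested_dependencies(yum_output):
    """Return unique (dependency, package) pairs in order of first appearance."""
    seen = []
    for dep, pkg in DEP_RE.findall(yum_output):
        if (dep, pkg) not in seen:
            seen.append((dep, pkg))
    return seen


def package_arch(package):
    """Return the trailing architecture of an RPM name like 'foo-1.0-1.el6.i686'."""
    return package.rsplit(".", 1)[-1]
```

Comparing `package_arch()` of the failing package with the architecture of the installed compat-readline5 package is one way to see whether an installed library can actually satisfy the dependency; whether that is the cause in this thread is not confirmed by the posts.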

    #13651

    Sasha J
    Moderator

    Fabrice,
    this may be a versioning problem…
    HMC tries to install an exact version of ruby-libs, and if you already have it installed at a different version, the installation may fail.
    Please remove your installed package and rerun the node addition. This should pull and install the expected version.

    Thank you!
    Sasha
