SSH issues with Single node install using HMC

This topic contains 15 replies, has 2 voices, and was last updated by Sasha J 1 year, 11 months ago.

  • Creator
    Topic
  • #8169

    Max
    Member

    I am attempting to install a single-node cluster on CentOS 6.3 (VMware). I have gone through all the steps but must have taken a wrong turn somewhere: during “Node Discovery and Preparation” I get “Failed. Reason: ssh: connect to host node1.localdomain port 22: Connection timed out” for the host “node1.localdomain”.

    - The firewall is disabled
    - I already ran ssh-keygen -t rsa and cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    - Passwordless ssh to localhost is working fine
    - My Hostdetail.txt contains node1.localdomain
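
    For reference, a few checks that help narrow down a port-22 “Connection timed out” (illustrative commands, assuming CentOS 6 defaults):

    getent hosts node1.localdomain     # does the name resolve, and to which address?
    ping -c 1 node1.localdomain        # is that address reachable?
    service sshd status                # is sshd actually running?
    service iptables status            # confirm the firewall really is off
    ssh -v root@node1.localdomain      # verbose output shows where the connection stalls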

    I appreciate any help to get this resolved.

    Thanks!
    Max.



  • Author
    Replies
  • #8320

    Sasha J
    Moderator

    Max,
    this is related to RHEL 6.3 as well.
    Please comment out the checkJDK line in the “start” section of the /etc/init.d/hmc script.
    This will also be addressed in a future release; for now, this is the workaround.
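
    After the edit, the start section of the script would look something like this (only the checkJDK line changes):

    start)
        checkHDPRepo
        #checkJDK
        bootPuppet
        ...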

    Thank you!
    Sasha

    #8319

    Max
    Member

    Hi Sasha, and as always, thank you for your assistance. That seems to have fixed the issue.

    When I attempt to start the HMC service, I consistently get the following message:

    [root@bigdata-node /]# service hmc start
    Do you agree to Oracle’s Java License at
    /usr/share/hmc/licenses/ORACLE_JDK_LICENSE.txt?(y/n)y
    Would you like us to download the JDK binaries for you?(y/n)n
    Please download jdk-6u31-linux-x64.bin and jdk-6u31-linux-i586.bin from Oracle to /var/run/hmc/downloads/
    [root@bigdata-node /]#

    It seems that the HMC start process wants to download the 32-bit and 64-bit jdk-6u31-linux binaries each and every time. To get around it, I manually copy the two files into the /var/run/hmc/downloads/ directory, but something (probably the HMC stop process) deletes these files. Your thoughts?

    Thanks,
    M.

    #8314

    Sasha J
    Moderator

    Max,
    this is a recently discovered issue; a fix has been requested and should be available soon.
    For some reason, RHEL 6.3 behaves differently from RHEL 5.x during startup and removes files from /var/run.
    HMC checks for the existence of the file /var/run/hadoop/hdfs/namenode-formatted during HDFS start. Since the file is removed by RHEL, the namenode process tries to format the namenode location, but it is already formatted, so the namenode process fails.
    Here is a workaround for this problem:
    add the following line:

    touch /var/run/hadoop/hdfs/namenode-formatted

    to the /etc/init.d/hmc script.
    It should look like this:
    ….

    case "$1" in
    start)
        checkHDPRepo
        #checkJDK
        bootPuppet
        echo "Starting HMC Installer "
        /etc/init.d/httpd start
        RETVAL=$?
        if [ $RETVAL = 0 ]; then
            echo -n "Starting HMC"
        else
            echo -n "Failed to start HMC"
        fi
        echo
        touch /var/run/hadoop/hdfs/namenode-formatted
        ;;
    stop)
    ….

    and restart the HMC service.
    HDFS should start normally once this workaround is applied.
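
    With the touch line in place, a plain restart should recreate the marker (assuming the stock init script):

    service hmc stop
    service hmc start
    ls -l /var/run/hadoop/hdfs/namenode-formatted    # the marker file should now exist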

    Thank you!
    Sasha

    #8311

    Max
    Member

    Good morning Sasha.

    I followed all your instructions and installed HMC successfully. Thank you for the step-by-step assistance. You were a great help!

    I rebooted the OS, and now the attempt to restart the services is failing.

    The HMC service and HMC agent start fine:
    [root@bigdata-node /]# service hmc start
    Starting HMC Installer [ OK ]
    Starting httpd: [Tue Aug 14 23:44:13 2012] [warn] The Alias directive in /etc/httpd/conf.d/hdp_mon_nagios_addons.conf at line 1 will probably never match because it overlaps an earlier Alias.
    [ OK ]
    Starting HMC
    [root@bigdata-node /]# service hmc-agent start
    Starting puppet: debug: Failed to load library 'ldap' for feature 'ldap'
    debug: Puppet::Type::User::ProviderLdap: feature ldap is missing
    debug: Puppet::Type::User::ProviderUser_role_add: file roleadd does not exist
    debug: Puppet::Type::User::ProviderDirectoryservice: file /usr/bin/dscl does not exist
    debug: Puppet::Type::User::ProviderPw: file pw does not exist
    debug: /File[/var/lib/puppet/client_data]: Autorequiring File[/var/lib/puppet]
    debug: /File[/var/lib/puppet/ssl/certs/bigdata-node.localdomain.pem]: Autorequiring File[/var/lib/puppet/ssl/certs]
    debug: /File[/etc/puppet/agent/namespaceauth.conf]: Autorequiring File[/etc/puppet/agent]
    debug: /File[/var/lib/puppet/ssl]: Autorequiring File[/var/lib/puppet]
    debug: /File[/var/lib/puppet/ssl/crl.pem]: Autorequiring File[/var/lib/puppet/ssl]
    debug: /File[/var/lib/puppet/ssl/certs/ca.pem]: Autorequiring File[/var/lib/puppet/ssl/certs]
    debug: /File[/etc/puppet/agent/puppet.conf]: Autorequiring File[/etc/puppet/agent]
    debug: /File[/var/lib/puppet/ssl/certs]: Autorequiring File[/var/lib/puppet/ssl]
    debug: /File[/var/lib/puppet/state/last_run_report.yaml]: Autorequiring File[/var/lib/puppet/state]
    debug: /File[/var/lib/puppet/state/resources.txt]: Autorequiring File[/var/lib/puppet/state]
    debug: /File[/var/lib/puppet/facts]: Autorequiring File[/var/lib/puppet]
    debug: /File[/var/lib/puppet/ssl/public_keys]: Autorequiring File[/var/lib/puppet/ssl]
    debug: /File[/var/lib/puppet/ssl/private_keys]: Autorequiring File[/var/lib/puppet/ssl]
    debug: /File[/var/lib/puppet/state/graphs]: Autorequiring File[/var/lib/puppet/state]
    debug: /File[/var/lib/puppet/state/state.yaml]: Autorequiring File[/var/lib/puppet/state]
    debug: /File[/var/lib/puppet/ssl/private]: Autorequiring File[/var/lib/puppet/ssl]
    debug: /File[/var/lib/puppet/state/last_run_summary.yaml]: Autorequiring File[/var/lib/puppet/state]
    debug: /File[/var/lib/puppet/ssl/public_keys/bigdata-node.localdomain.pem]: Autorequiring File[/var/lib/puppet/ssl/public_keys]
    debug: /File[/var/lib/puppet/ssl/private_keys/bigdata-node.localdomain.pem]: Autorequiring File[/var/lib/puppet/ssl/private_keys]
    debug: /File[/var/lib/puppet/client_yaml]: Autorequiring File[/var/lib/puppet]
    debug: /File[/var/lib/puppet/lib]: Autorequiring File[/var/lib/puppet]
    debug: /File[/var/lib/puppet/classes.txt]: Autorequiring File[/var/lib/puppet]
    debug: /File[/var/lib/puppet/clientbucket]: Autorequiring File[/var/lib/puppet]
    debug: /File[/var/lib/puppet/ssl/certificate_requests]: Autorequiring File[/var/lib/puppet/ssl]
    debug: /File[/var/lib/puppet/state]: Autorequiring File[/var/lib/puppet]
    debug: Finishing transaction 70347664200980
    [ OK ]

    However, the HDFS service start failed:

    {
      "2": {
        "nodeReport": [],
        "nodeLogs": []
      },
      "3": {
        "nodeReport": [],
        "nodeLogs": []
      },
      "4": {
        "nodeReport": {
          "PUPPET_KICK_FAILED": [],
          "PUPPET_OPERATION_FAILED": [
            "bigdata-node.localdomain"
          ],
          "PUPPET_OPERATION_TIMEDOUT": [],
          "PUPPET_OPERATION_SUCCEEDED": []
        },
        "nodeLogs": {
          "bigdata-node.localdomain": {
            "reportfile": "/var/lib/puppet/reports/5-4-3/bigdata-node.localdomain",
            "overall": "FAILED",
            "finishtime": "2012-08-15 09:11:58.587629 -07:00",
            "message": [
              "Loaded state in 0.01 seconds",

              ......
              "Not using expired catalog for bigdata-node.localdomain from cache; expired at Tue Aug 14 00:30:04
              "\"Wed Aug 15 09:12:22 -0700 2012 Puppet (debug): Executing 'test -e /usr/lib/jvm/java-1.6.0.31.x64/jdk1.6.0_31/bin/java'\"",
              "\"Wed Aug 15 09:12:23 -0700 2012 Exec[/tmp/checkForFormat.sh](provider=posix) (debug): Executing check 'test -f /var/run/hadoop/hdfs/namenode-formatted'\"",
              "\"Wed Aug 15 09:12:23 -0700 2012 Puppet (debug): Executing 'test -f /var/run/hadoop/hdfs/namenode-formatted'\"",
              "\"Wed Aug 15 09:12:23 -0700 2012 Exec[/tmp/checkForFormat.sh](provider=posix) (debug): Executing 'sh /tmp/checkForFormat.sh hdfs /etc/hadoop/conf /var/run/hadoop/hdfs/namenode-formatted /usr/local/hadoop/hdfs/namenode '\"",
              "\"Wed Aug 15 09:12:23 -0700 2012 Puppet (debug): Executing 'sh /tmp/checkForFormat.sh hdfs /etc/hadoop/conf /var/run/hadoop/hdfs/namenode-formatted /usr/local/hadoop/hdfs/namenode '\"",
              "\"Wed Aug 15 09:12:23 -0700 2012 /Stage[2]/Hdp-hadoop::Namenode::Format/Exec[/tmp/checkForFormat.sh]/returns (notice): DIrname = /usr/local/hadoop/hdfs/namenode\"",
              "\"Wed Aug 15 09:12:23 -0700 2012 /Stage[2]/Hdp-hadoop::Namenode::Format/Exec[/tmp/checkForFormat.sh]/returns (notice): ERROR: Namenode directory(s) is non empty. Will not format the namenode. List of non-empty namenode dirs /usr/local/hadoop/hdfs/namenode\"",
              "\"Wed Aug 15 09:12:23 -0700 2012 /Stage[2]/Hdp-hadoop::Namenode::Format/Exec[/tmp/checkForFormat.sh]/returns (err): change from notrun to 0 failed: sh /tmp/checkForFormat.sh hdfs /etc/hadoop/conf /var/run/hadoop/hdfs/namenode-formatted /usr/local/hadoop/hdfs/namenode returned 1 instead of one of [0] at /etc/puppet/agent/modules/hdp-hadoop/manifests/namenode/format.pp:49\"",
              "\"Wed Aug 15 09:12:23 -0700 2012 /Stage[2]/Hdp-hadoop::Namenode::Format/Hdp::Exec[set namenode mark]/Anchor[hdp::exec::set namenode mark::begin] (notice): Dependency Exec[/tmp/checkForFormat.sh] has failures: true\"",
              "\"Wed Aug 15 09:12:23 -0700 2012 /Stage[2]/Hdp-hadoop::Namenode::Format/Hdp::Exec[set namenode mark]/Anchor[hdp::exec::set namenode mark::begin] (warning): Skipping because of failed dependencies\"",
              "\"Wed Aug 15 09:12:23 -0700 2012 /Stage[2]/Hdp-hadoop::Namenode::Format/Hdp::Exec[set namenode mark]/Exec[set namenode mark] (notice): Dependency Exec[/tmp/checkForFormat.sh] has failures: true\"",
              "\"Wed Aug 15 09:12:23 -0700 2012 /Stage[2]/Hdp-hadoop::Namenode::Format/Hdp::Exec[set namenode mark]/Exec[set namenode mark] (warning): Skipping because of failed dependencies\"",
              "\"Wed Aug 15 09:12:23 -0700 2012 /Stage[2]/Hdp-hadoop::Namenode::Format/Hdp::Exec[set na

    Any thoughts on what might be the cause?

    Thanks,
    M.

    #8192

    Sasha J
    Moderator

    OK, this looks good.
    Now, please install HMC (“yum -y install hmc”),
    create your hosts file pointing to the FQDN (bigdata-node.localdomain), and copy the id_rsa key from the node to your local machine.
    Then start HMC and run the installation again.
    Do not forget to set the HBase region server heap size to at least 1024 MB.
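
    A minimal sketch of those steps on this single-node setup (the Hostdetail.txt name follows the original post; the paths are only examples):

    yum -y install hmc
    echo "bigdata-node.localdomain" > /root/Hostdetail.txt    # hosts file with the FQDN, one per line
    service hmc start
    # when the HMC wizard asks for the SSH private key, supply /root/.ssh/id_rsa from this node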

    Thank you!
    Sasha

    #8191

    Max
    Member

    Sasha,

    I changed the IP address in /etc/hosts to 192.168.61.130, rebooted, and reran all the commands you requested:

    [root@bigdata-node ~]# ifconfig
    eth5 Link encap:Ethernet HWaddr 00:0C:29:71:96:E7
    inet addr:192.168.61.130 Bcast:192.168.61.255 Mask:255.255.255.0
    inet6 addr: fe80::20c:29ff:fe71:96e7/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:17 errors:0 dropped:0 overruns:0 frame:0
    TX packets:42 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:9733 (9.5 KiB) TX bytes:6141 (5.9 KiB)

    lo Link encap:Local Loopback
    inet addr:127.0.0.1 Mask:255.0.0.0
    inet6 addr: ::1/128 Scope:Host
    UP LOOPBACK RUNNING MTU:16436 Metric:1
    RX packets:58 errors:0 dropped:0 overruns:0 frame:0
    TX packets:58 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:0
    RX bytes:11593 (11.3 KiB) TX bytes:11593 (11.3 KiB)

    [root@bigdata-node ~]# cat /etc/hosts
    192.168.61.130 bigdata-node.localdomain bigdata-node
    127.0.0.1 localhost.localdomain localhost
    ::1 localhost6.localdomain6 localhost6

    [root@bigdata-node ~]# cat /etc/resolv.conf
    # Generated by NetworkManager
    domain localdomain
    search localdomain
    nameserver 192.168.61.2

    [root@bigdata-node ~]# hostname
    bigdata-node

    [root@bigdata-node ~]# hostname -f
    bigdata-node.localdomain

    [root@bigdata-node ~]# ssh localhost
    Last login: Mon Aug 13 12:13:07 2012 from localhost.localdomain

    [root@bigdata-node ~]# ssh bigdata-node.localdomain
    Last login: Mon Aug 13 13:17:32 2012 from bigdata-node.localdomain

    [root@bigdata-node ~]# nslookup www.google.com
    Server: 192.168.61.2
    Address: 192.168.61.2#53

    Non-authoritative answer:
    Name: www.google.com
    Address: 66.152.109.110
    Name: www.google.com
    Address: 69.16.143.110

    Thanks again,
    M.

    #8189

    Sasha J
    Moderator

    Max,
    your ifconfig has:
    inet addr:192.168.61.130 Bcast:192.168.61.255 Mask:255.255.255.0

    but your hosts file has:
    192.168.61.128 bigdata-node.localdomain bigdata-node

    Sorry, I had a typo earlier; it should be “cat /etc/resolv.conf”.

    The last command (ssh) should be

    ssh bigdata-node.localdomain

    in your case, and it will not work at all because of the address mismatch above.
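
    For reference, the /etc/hosts entry should carry the address that ifconfig reports, i.e. something like:

    192.168.61.130 bigdata-node.localdomain bigdata-node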

    Please fix this, then try the ssh command again, as well as cat /etc/resolv.conf.

    Also, please run “nslookup www.google.com” and give us the output.

    Thank you!
    Sasha

    #8188

    Max
    Member

    Thanks Sasha for the prompt reply.

    I have not installed HMC on this VM. Regardless – here are the results:
    [root@bigdata-node ~]# ifconfig
    eth5 Link encap:Ethernet HWaddr 00:0C:29:71:96:E7
    inet addr:192.168.61.130 Bcast:192.168.61.255 Mask:255.255.255.0
    inet6 addr: fe80::20c:29ff:fe71:96e7/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:16 errors:0 dropped:0 overruns:0 frame:0
    TX packets:42 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:9681 (9.4 KiB) TX bytes:6268 (6.1 KiB)

    lo Link encap:Local Loopback
    inet addr:127.0.0.1 Mask:255.0.0.0
    inet6 addr: ::1/128 Scope:Host
    UP LOOPBACK RUNNING MTU:16436 Metric:1
    RX packets:49 errors:0 dropped:0 overruns:0 frame:0
    TX packets:49 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:0
    RX bytes:9473 (9.2 KiB) TX bytes:9473 (9.2 KiB)

    [root@bigdata-node ~]# cat /etc/hosts
    192.168.61.128 bigdata-node.localdomain bigdata-node
    127.0.0.1 localhost.localdomain localhost
    ::1 localhost6.localdomain6 localhost6

    [root@bigdata-node ~]# cat /etc/resolve.conf
    cat: /etc/resolve.conf: No such file or directory

    [root@bigdata-node ~]# hostname
    bigdata-node

    [root@bigdata-node ~]# hostname -f
    bigdata-node.localdomain

    [root@bigdata-node ~]# ssh localhost
    Last login: Mon Aug 13 11:39:22 2012 from localhost.localdomain

    [root@bigdata-node ~]# ssh
    usage: ssh [-1246AaCfgKkMNnqsTtVvXxYy] [-b bind_address] [-c cipher_spec]
    [-D [bind_address:]port] [-e escape_char] [-F configfile]
    [-i identity_file] [-L [bind_address:]port:host:hostport]
    [-l login_name] [-m mac_spec] [-O ctl_cmd] [-o option] [-p port]
    [-R [bind_address:]port:host:hostport] [-S ctl_path]
    [-w local_tun[:remote_tun]] [user@]hostname [command]

    Thanks again,
    M.

    #8187

    Sasha J
    Moderator

    Max,
    it is not clear what you have now or what the status of your VM is.
    Please run:
    yum -y erase hmc puppet
    reboot

    When it comes back up, run the following commands and send us the output:

    ifconfig
    cat /etc/hosts
    cat /etc/resolve.conf
    hostname
    hostname -f
    ssh localhost
    ssh

    Thank you!
    Sasha

    #8186

    Max
    Member

    Sasha,

    Your help is much appreciated.

    Yes, I went through it. It is a great document, but a bit difficult to follow, for me anyway, as it covers all types of installation (single node, multi-node, and Amazon), some of which may or may not apply to a single-node installation.

    I’ll tell you upfront that I am not a sysadmin on the Linux platform. Having said that, and frankly speaking: for individuals who simply want to try out individual components (HBase, Hive, Pig, etc.), it would be a great help if Hortonworks provided a downloadable CentOS VM with a single node already pre-installed.

    As for the network settings, I made some changes and my Device configuration now shows:
    Name: eth5
    Device: eth5
    Use DHCP [*]
    Static IP
    Netmask
    Default gateway IP
    Primary DNS Server 192.168.61.2
    Secondary DNS Server

    Any additional help is greatly appreciated.

    Thanks,
    M.

    #8184

    Sasha J
    Moderator

    Max,
    did you have a chance to look at the installation document?
    It seems like not all requirements are met; please look at it and make sure you have completed all the preparation steps.

    http://hortonworks.com/download/thankyou_hdp1a/

    Preparing Your Cluster

    In order to prepare your cluster for HDP, you will need to perform steps on each host that will be part of your cluster, as well as prepare the entire cluster to accept installation of the HDP software. This section provides information on those configurations.

    Perform the following steps on the HMC Server and each host you plan to include as part of your cluster.

    - Confirm the Fully Qualified Domain Name (FQDN) for each host using the command hostname -f. If deploying your cluster to Amazon EC2, be sure to use the Private DNS host name.

    - Confirm each host has Internet access via HTTP, HTTPS, and FTP. When performing the HDP install, each host in the cluster will access the Internet to obtain the software packages required for installation. If your hosts will use a proxy to access the Internet, configure each host machine to use an Internet proxy; check with your IT or network team for these settings. If you do not have Internet access available to your cluster hosts, refer to the Hortonworks documentation on how to set up a Local Mirror Repository.

    - Remove or disable any existing Puppet agent configurations. HDP performs the software installation (and ongoing cluster management) using Puppet. With HDP, the HMC Server is the Puppet master and each host in your cluster acts as a Puppet agent.

    - Disable SELinux.

    - Enable NTP on the cluster to synchronize the clocks across the hosts.

    - Prepare password-less SSH login for the root user between the HMC Server and each host in the cluster. This enables the HMC Server to reach each host in the cluster via SSH without prompting for a password, and is required for the HMC Server to install the necessary software components on each host. For more information, please refer to the Hortonworks documentation.

    - Confirm the HMC Server can SSH to itself without prompting for a password. This can be done using the ssh root@localhost command.
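
    A rough sketch of those preparation steps on a stock CentOS 6 host (service names and paths assume CentOS defaults; adjust for your environment):

    hostname -f                                                      # confirm the FQDN
    setenforce 0                                                     # disable SELinux for the running system
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config     # ...and across reboots
    chkconfig ntpd on && service ntpd start                          # enable NTP
    ssh-keygen -t rsa                                                # prepare password-less SSH for root (if not already done)
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    ssh root@localhost                                               # should log in without prompting for a password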

    Check the dependencies on each host in the cluster using the yum info [dependency] command. Confirm the following are either not installed or, if installed, are at these versions:

    Name             Dependency          Version-Release
    Ruby             ruby                1.8.5-24.el5
    Puppet           puppet              2.7.9-2
    Ruby Rack        rubygem-rack        1.1.0-2.el5
    Ruby Passenger   rubygem-passenger   3.0.12-1.el5.centos
    Nagios           nagios              3.0.12-1.el5.centos
    Nagios Plugins   nagios-plugins      1.4.15-2.el5
    Nagios Common    nagios-common       2.12-10.el5
    MySQL            mysql               5.*
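
    One way to run that check for all of the listed packages at once (a sketch; yum prints Name, Version, and Release fields for each package it finds):

    yum info ruby puppet rubygem-rack rubygem-passenger nagios nagios-plugins nagios-common mysql | grep -E '^(Name|Version|Release)'
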
    Pre-Flight Operations Checklist

    Complete the Preparing Your Cluster steps above and confirm you have the following handy before you install and start HMC:

    - SSH Private Key: obtain the SSH private key (typically id_rsa) to use during the installation. Refer to the Hortonworks documentation for more information on how the private key is used during cluster provisioning.
    - Host names text file: create a text file of the host names that will be part of your cluster, one host name per line. The host name should be the FQDN for the host, not the IP address. Refer to the Preparing Your Cluster section above for more information on obtaining the host name for each host, and to the Hortonworks documentation for details.

    #8183

    Max
    Member

    Hi Sasha,

    I changed the hostname to bigdata-node. Aside from that, here is the content of:

    /etc/hosts
    192.168.61.128 bigdata-node.localdomain bigdata-node
    127.0.0.1 localhost.localdomain localhost
    ::1 localhost6.localdomain6 localhost6

    /etc/sysconfig/network
    NETWORKING=yes
    NETWORKING_IPV6=no
    HOSTNAME=bigdata-node

    /etc/sysconfig/network-scripts/ifcfg-eth0
    I don’t have this file.

    When I go to setup:
    My DNS configuration shows:
    Hostname = bigdata-node
    Primary DNS = 192.168.61.2
    Secondary DNS =
    Tertiary DNS =
    DNS search path = localdomain

    Firewall Configuration shows
    Firewall = Enabled

    Device configuration shows
    There is no device information

    What else would you like me to provide?

    Thanks,
    M.

    #8182

    Sasha J
    Moderator

    Please do not post this to another thread; let us continue in this one.

    ssh: connect to host node1 port 22: No route to host
    this usually means that your network configuration is set up incorrectly.
    Check all the settings (/etc/hosts, /etc/sysconfig/network, /etc/sysconfig/network-scripts/ifcfg-eth0, etc.).
    You should have working networking and working SSH before you move forward with HMC.
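
    A quick way to review those settings (a sketch; the interface name may differ, e.g. eth5 instead of eth0):

    cat /etc/hosts
    cat /etc/sysconfig/network
    cat /etc/sysconfig/network-scripts/ifcfg-eth0
    ifconfig                           # does the interface address match what /etc/hosts says?
    ping -c 1 node1.localdomain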

    Thank you!
    Sasha

    #8178

    Max
    Member

    Sasha,

    On one of the VMs, I installed HMC without changing the hostname (just localhost). I got much further, but I am running into other issues now; I’ll post those on another thread.

    On this VM, with the hostname set to node1, even ssh node1 returns the following:
    ssh: connect to host node1 port 22: No route to host

    or ssh node1.localdomain
    ssh: connect to host node1.localdomain port 22: No route to host

    Please assist.

    Thanks,
    Max.

    #8177

    Sasha J
    Moderator

    Passwordless ssh to node1.localdomain should work as well, not just to localhost.
    Make sure you have correct name resolution and that your FQDN resolves as well.
    run:

    yum -y erase hmc puppet
    yum -y install hmc
    service hmc start

    and run the installation again.

    Thank you!
    Sasha
