HDP on Linux – Installation Forum

Node discovery and preparation fails to find all nodes

  • #7644

    I have a five-node cluster, and during the add-nodes step of the HDP installation it fails to find four of the five nodes. It only finds the node where I installed HDP. I am able to ssh to all of the nodes in the cluster as root, and the hostnames all resolve via DNS from the install node. I even added the host names to /etc/hosts and tried IP addresses, but I get the same error for the four nodes.

    Failed. Reason: ssh: hadoop02.dev.corp.oversee.net
    : Name or service not known

    Failed. Reason: ssh: hadoop03.dev.corp.oversee.net
    : Name or service not known

    Failed. Reason: ssh: hadoop04.dev.corp.oversee.net
    : Name or service not known


  • #7645
    Sasha J
    Moderator

    Hi Juan,

    All the nodes must be resolvable from the host running the HMC process.

    If you are certain that the entries in the hosts file you uploaded exactly match the host names in /etc/hosts on ALL hosts, then your name resolution service may be configured to query an external DNS source BEFORE the local /etc/hosts.
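
That lookup order is controlled by the `hosts:` line in /etc/nsswitch.conf: `files` must be listed before `dns` for local /etc/hosts entries to win. A minimal sketch of that check (the helper name is illustrative):

```shell
# files_before_dns: succeed only when the "hosts:" line in nsswitch.conf
# lists "files" ahead of "dns", i.e. /etc/hosts entries take precedence
# over the external DNS server.
files_before_dns() {
    local conf="${1:-/etc/nsswitch.conf}"
    local line
    line=$(grep '^hosts:' "$conf") || return 1
    case "$line" in
        *dns*files*) return 1 ;;  # dns consulted first -- external DNS wins
        *files*)     return 0 ;;  # files first (or dns absent)
        *)           return 1 ;;
    esac
}
```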

    A good test is to manually ssh into the host running HMC, then manually ssh from there into each target node. Make sure you can do so using the same names that appear in /etc/hosts, and with no password prompt.
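
That manual check can be scripted. The sketch below mirrors what HMC's findSshableNodes step needs to succeed: the name must resolve, and root ssh must work without any password prompt (the 5-second timeout is an arbitrary choice):

```shell
# check_node: verify that a target node both resolves and accepts
# non-interactive (passwordless) root ssh, as HMC's node discovery requires.
check_node() {
    local host="$1"
    # A failure here corresponds to "Name or service not known".
    if ! getent hosts "$host" > /dev/null; then
        echo "$host: name resolution FAILED"
        return 1
    fi
    # BatchMode forbids password prompts, so this only succeeds with keys.
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$host" true 2>/dev/null; then
        echo "$host: ssh OK"
    else
        echo "$host: ssh FAILED"
        return 1
    fi
}
```

Run it from the HMC host against each node, e.g. `for h in hadoop02.dev.corp.oversee.net hadoop03.dev.corp.oversee.net; do check_node "$h"; done`.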

    -Sasha

    #7646

    I am able to ssh to all of the nodes with no password. The hmc.log shows the following error.

    [2012:07:26 21:37:20][INFO][HMCTxnUtils][HMCTxnUtils.php:116][execBackgroundProcess]: Found child pid, command=/usr/bin/php ./addNodes/findSshableNodes.php, txnId=1, output=Executing /usr/bin/php ./addNodes/findSshableNodes.php metroid root 1 100 2 /var/run/hmc/clusters/metroid/hosts.txt > /var/log/hmc/hmc.txn.1.log 2>&1
    Background Child Process PID:15911
    , pid=15911
    [2012:07:26 21:37:20][INFO][findSshableNodes][commandUtils.php:76][runPdsh]: Hosts for this operation: "\/var\/run\/hmc\/clusters\/metroid\/hosts.txt"
    [2012:07:26 21:37:20][INFO][findSshableNodes][commandUtils.php:80][runPdsh]: Going to execute findSshableNodes : pdsh -R exec /var/run/hmc/clusters/metroid/findSshableNodes//ssh.sh %h
    [2012:07:26 21:37:20][INFO][findSshableNodes][commandUtils.php:7][launchCmd]: Env variable WCOLL is "\/var\/run\/hmc\/clusters\/metroid\/hosts.txt"
    [2012:07:26 21:37:20][INFO][findSshableNodes][findSshableNodes.php:121][]: Going to persist information sshAble nodes
    [2012:07:26 21:37:20][INFO][Add nodes poller][nodesActionProgress.php:33][]: Cluster Name: metroid Root Txn ID: 1
    [2012:07:26 21:37:21][ERROR][sequentialScriptExecutor][sequentialScriptRunner.php:251][]: Encountered total failure in transaction 100 while running cmd: /usr/bin/php ./addNodes/findSshableNodes.php with args: metroid root 1 100 2 /var/run/hmc/clusters/metroid/hosts.txt

    #7647

    Now I am getting the following error. A negative node?

    Finding reachable nodes: -1 / 5 in progress; 6 failed

    #7654
    Sasha J
    Moderator

    Hi Juan,

    Thank you for taking the time to investigate these issues with us.

    Would you have some time to do a webex?

    If so, please send your contact information to POC-SUPPORT@HORTONWORKS.COM and we will follow up with you.

    Thanks in advance,

    Sasha

    #7657

    I just sent you an email with my contact information. Thanks for your help.

    #7664
    Sasha J
    Moderator

    Hi Juan,

    Can you please post your OS version, whether it is a clean install, and any modifications you have made since it was installed?

    -Sasha

    #7683

    [jsandoval@hadoop05 ~]$ sudo /usr/local/sbin/memconf
    memconf: V2.22 30-Jan-2012 http://www.4schmidts.com/unix.html
    hostname: hadoop05.dev.corp.oversee.net
    Dell Inc. PowerEdge R610 (2 X Six-Core Hyper-Threaded Intel(R) Xeon(R) X5690 @ 3.47GHz)
    Memory Error Correction: Multi-bit ECC
    Maximum Memory: 196608MB (192GB)
    DIMM_A1: 8192MB 1333MHz Synchronous DDR3 DIMM, Hynix Semiconductor (Hyundai Electronics) HMT31GR7BFR4A-H9
    DIMM_A2: 8192MB 1333MHz Synchronous DDR3 DIMM, Hynix Semiconductor (Hyundai Electronics) HMT31GR7BFR4A-H9
    DIMM_A3: 8192MB 1333MHz Synchronous DDR3 DIMM, Hynix Semiconductor (Hyundai Electronics) HMT31GR7BFR4A-H9
    DIMM_B1: 8192MB 1333MHz Synchronous DDR3 DIMM, Hynix Semiconductor (Hyundai Electronics) HMT31GR7BFR4A-H9
    DIMM_B2: 8192MB 1333MHz Synchronous DDR3 DIMM, Hynix Semiconductor (Hyundai Electronics) HMT31GR7BFR4A-H9
    DIMM_B3: 8192MB 1333MHz Synchronous DDR3 DIMM, Hynix Semiconductor (Hyundai Electronics) HMT31GR7BFR4A-H9
    empty memory sockets: DIMM_A4, DIMM_A5, DIMM_A6, DIMM_B4, DIMM_B5, DIMM_B6
    total memory = 49152MB (48GB)

    [jsandoval@hadoop05 ~]$ cat /etc/redhat-release
    CentOS release 5.5 (Final)

    [jsandoval@hadoop05 ~]$ uname -a
    Linux hadoop05.dev.corp.oversee.net 2.6.18-194.32.1.el5 #1 SMP Wed Jan 5 17:52:25 EST 2011 x86_64 x86_64 x86_64 GNU/Linux

    #7684

    It was a clean install.

    #7711
    Sasha J
    Moderator

    Juan,

    I’m told that someone from POC support has contacted you. Your version of CentOS 5.5 requires manual installation of some packages; the person who contacted you will assist you further.

    Sasha

The topic ‘Node discovery and preparation fails to find all nodes’ is closed to new replies.
