Home Forums HDP on Linux – Installation HDP gsInstaller on EC2 hangs at Hadoop smoke test

This topic contains 10 replies, has 3 voices, and was last updated by Ravi Veeramachaneni 1 year, 12 months ago.

  • Creator
    Topic
  • #7665

    Hi,

    I’m trying to set up a mini 3-node cluster on EC2 using gsInstaller. The installer hangs while doing the Hadoop smoke test; I’d appreciate any help on what’s going wrong. Environment details:
    RHEL6.3
    Master – m2.2xlarge (80GB EBS, NN, SN, JT, Oozie, Pig, Sqoop, Nagios, Ganglia, Dashboard)
    DNs – 3 × m1.large (100GB EBS each; DN, TT)
    Non-secured cluster

    Prior to running gsInstaller, I successfully finished the gsPreRequisites and createUsers scripts. Here is the output from gsInstaller:

    [root@ip-10-30-128-116 gsInstaller]# sh gsInstaller.sh

    ===============================================================================
    Grid Stack Installer
    ===============================================================================

    ===============================================================================
    Installation Details
    ===============================================================================
    Loaded plugins: amazon-id, priorities, product-id, rhui-lb, security, subscription-manager
    Updating certificate-based repositories.
    Unable to read consumer identity
    HDP-1.0.0.12 | 1.3 kB 00:00
    HDP-1.0.0.12/primary | 33 kB 00:00
    HDP-1.0.0.12 110/110
    epel/metalink | 13 kB 00:00
    epel | 4.0 kB 00:00
    epel/primary_db | 4.6 MB 00:00
    rhui-us-east-1-client-config-server-6 | 2.6 kB 00:00
    rhui-us-east-1-client-config-server-6/primary_db | 3.1 kB 00:00
    rhui-us-east-1-rhel-server-releases | 3.7 kB 00:00
    rhui-us-east-1-rhel-server-releases/primary_db | 15 MB 00:00
    rhui-us-east-1-rhel-server-releases-optional | 3.5 kB 00:00
    rhui-us-east-1-rhel-server-releases-optional/primary_db | 2.1 MB 00:00
    28 packages excluded due to repository priority protections
    Setting up Install Process
    Package pdsh-2.27-1.el5.rf.x86_64 already installed and latest version
    Nothing to do
    hdfsuser : hdfs
    mapreduser : mapred
    installhbase : no
    installhive : no
    installtempleton : no
    installoozie : yes
    oozieuser : oozie
    installhcat : no
    installpig : yes
    installsqoop : yes
    smoke_test_user : hdptestuser
    enablesecurity : no
    enablemon : yes
    package : rpm
    ===============================================================================
    ===============================================================================
    Cluster Details
    ===============================================================================
    Namenode host : ip-10-30-128-116.ec2.internal
    Jobtracker host : ip-10-30-128-116.ec2.internal
    SNamenode host : ip-10-30-128-116.ec2.internal
    Gateway host : ip-10-30-128-116.ec2.internal
    Oozie Server : ip-10-30-128-116.ec2.internal
    Pig Client : ip-10-30-128-116.ec2.internal
    Sqoop Client : ip-10-30-128-116.ec2.internal
    ===============================================================================
    Proceed with above options? (y/N) y
    ===============================================================================
    check your Install logs at : /tmp/gsinstaller-8987.out
    ===============================================================================
    Pseudo-terminal will not be allocated because stdin is not a terminal.
    Pseudo-terminal will not be allocated because stdin is not a terminal.
    Pseudo-terminal will not be allocated because stdin is not a terminal.
    Pseudo-terminal will not be allocated because stdin is not a terminal.
    ===============================================================================
    Checking for non empty directories
    ===============================================================================
    ===============================================================================
    Download all necessary Artifacts
    ===============================================================================
    Downloading ext-2.2.zip from http://public-repo-1.hortonworks.com/HDP-1.0.0.12/repos/centos6/tars/ext-2.2.zip
    ===============================================================================
    Installing JDK
    ===============================================================================
    ===============================================================================
    Deploying Hadoop
    ===============================================================================
    ===============================================================================
    Deploying Hadoop RPMs
    ===============================================================================
    ===============================================================================
    Deploying Hadoop Configs
    ===============================================================================
    ===============================================================================
    Installing Hadoop Configs
    ===============================================================================
    ===============================================================================
    Installing Snappy
    ===============================================================================
    ===============================================================================
    Deploying Pig
    ===============================================================================
    ===============================================================================
    Installing Pig Configs
    ===============================================================================
    ===============================================================================
    Deploying Sqoop
    ===============================================================================
    ip-10-30-128-116: /tmp/HDP-artifacts-8987/mysql-connector-java-5.1.18.zip: No such file or directory
    pdsh@ip-10-30-128-116: ip-10-30-128-116: scp exited with exit code 1
    ===============================================================================
    Installing Sqoop Configs
    ===============================================================================
    ===============================================================================
    Deploying Oozie
    ===============================================================================
    ===============================================================================
    Installing Oozie Configs
    ===============================================================================
    ===============================================================================
    Setting up various directories required by hadoop
    ===============================================================================
    ===============================================================================
    Setting up various directories required by Oozie
    ===============================================================================
    ===============================================================================
    Starting All Hadoop Services
    ===============================================================================
    ===============================================================================
    Waiting 600 seconds for namenode to come out of safe mode
    ===============================================================================
    on ip-10-30-128-116.ec2.internal
    StartHadoop completed
    ===============================================================================
    Hadoop smoke test – wordcount using /etc/passwd file
    ===============================================================================
    12/07/26 20:38:45 INFO hdfs.DFSClient: Exception in createBlockOutputStream 10.28.101.155:50010 java.net.ConnectException: Connection timed out
    12/07/26 20:38:45 INFO hdfs.DFSClient: Abandoning block blk_6496503739321890453_1002
    12/07/26 20:38:45 INFO hdfs.DFSClient: Excluding datanode 10.28.101.155:50010
    12/07/26 20:39:48 INFO hdfs.DFSClient: Exception in createBlockOutputStream 10.29.134.104:50010 java.net.ConnectException: Connection timed out
    12/07/26 20:39:48 INFO hdfs.DFSClient: Abandoning block blk_2917393608499314041_1002
    12/07/26 20:39:48 INFO hdfs.DFSClient: Excluding datanode 10.29.134.104:50010
    12/07/26 20:40:51 INFO hdfs.DFSClient: Exception in createBlockOutputStream 10.29.160.185:50010 java.net.ConnectException: Connection timed out
    12/07/26 20:40:51 INFO hdfs.DFSClient: Abandoning block blk_4285604635442882215_1002
    12/07/26 20:40:51 INFO hdfs.DFSClient: Excluding datanode 10.29.160.185:50010
    12/07/26 20:40:51 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hdptestuser/passwd-372612 could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1566)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:673)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1122)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)

    at org.apache.hadoop.ipc.Client.call(Client.java:1092)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
    at $Proxy1.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
    at $Proxy1.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3595)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3456)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2672)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2912)

    12/07/26 20:40:51 WARN hdfs.DFSClient: Error Recovery for block blk_4285604635442882215_1002 bad datanode[0] nodes == null
    12/07/26 20:40:51 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/hdptestuser/passwd-372612" - Aborting...
    copyFromLocal: java.io.IOException: File /user/hdptestuser/passwd-372612 could only be replicated to 0 nodes, instead of 1
    12/07/26 20:40:51 ERROR hdfs.DFSClient: Exception closing file /user/hdptestuser/passwd-372612 : org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hdptestuser/passwd-372612 could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1566)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:673)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1122)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)

    org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hdptestuser/passwd-372612 could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1566)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:673)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1122)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)

    at org.apache.hadoop.ipc.Client.call(Client.java:1092)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
    at $Proxy1.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
    at $Proxy1.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3595)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3456)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2672)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2912)
    Found 1 items
    -rw------- 3 hdptestuser hdptestuser 0 2012-07-26 20:37 /user/hdptestuser/passwd-372612
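
    The ConnectException on port 50010 makes me wonder whether the datanode port is even reachable from the master (EC2 security group rules come to mind). One quick check from the master, using the datanode IPs from the log above, would be (nc may need to be installed first):

    nc -vz 10.28.101.155 50010
    nc -vz 10.29.134.104 50010
    nc -vz 10.29.160.185 50010
    su - hdfs -c "hadoop dfsadmin -report"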


  • Author
    Replies
  • #7896

    Miguel,

    Thanks for the update. I wasn’t using Hive anyway. I got side-tracked with a few other things, so I may not get to this until the weekend. My goal is to use gsInstaller on RHEL or CentOS for a minimum 3-node deployment. I know HMC may make my life easier, but I want to try the hard way and see how that goes. I’ll let you know how it went.

    Thanks,
    Ravi

    #7893

    Oh, one thing I forgot: just disable Hive to continue with the installation.
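
    (With HMC that just means deselecting Hive during service selection; with gsInstaller, the equivalent appears to be the install options that show up in the installer output above, along the lines of the following, though the exact file those options live in may vary by installer version:)

    installhive=no
    installhcat=no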

    #7892

    I successfully installed HDP with HMC on RHEL 6.3.
    The process is the same as on CentOS 6.2.
    The only issue was mysql-connector-java: version 5.0.8-1 is required, but I have 5.1.17-6.
    I thought yum downgrade mysql-connector-java-5.0.8-1 would do the trick, but for some reason 5.1.17-6 is reinstalled during the deployment phase. Maybe there is a way to disable this in
    /etc/puppet/agent/modules/hdp-hive/manifests/mysql-connector.pp
    or
    /etc/puppet/agent/modules/hdp-sqoop/manifests/mysql-connector.pp
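
    Another idea I have not tried: pin the package with the yum versionlock plugin before kicking off the deployment, so the puppet run cannot pull 5.1.17-6 back in:

    yum install -y yum-plugin-versionlock
    yum downgrade -y mysql-connector-java-5.0.8-1
    yum versionlock mysql-connector-java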

    Cheers,
    Miguel

    #7885

    Also, I am curious about RHEL with HMC; I’ll let you know what happens.

    #7859

    Ravi,
    The example I used was just for a single-node cluster. If you have more nodes you wish to use, you can specify them as well. For example, here is the /etc/hosts file on a 5-node cluster I recently deployed with HMC; it is the same on all nodes (a quick way to verify that is shown after the listing). Each line has the format: IP address (hostname -i), FQDN (hostname -f), short name.

    127.0.0.1 localhost localhost.localdomain

    10.206.30.48 domU-12-31-39-14-1D-C2.compute-1.internal Master
    10.190.191.106 ip-10-190-191-106.ec2.internal Worker
    10.206.30.48 domU-12-31-39-14-1D-C2.compute-1.internal Worker2
    10.241.107.175 domU-12-31-39-05-68-41.compute-1.internal Worker3
    10.140.16.66 ip-10-140-16-66.ec2.internal Worker4
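
    To double-check the file really is identical everywhere, something like this (pdsh with the short names from the listing above) does the trick:

    pdsh -w Master,Worker,Worker2,Worker3,Worker4 md5sum /etc/hosts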

    Sorry, I haven’t tried using gsInstaller yet, but with HMC you can specify where you want your NN, SN, and so on during the initial install.

    Hope this helps in some way,
    Miguel

    #7851

    Miguel,

    Thanks for the reply. Does the FQDN you were referring to in your post map to the master IP (where the NN, SN, and the other services run), and should that entry be synchronized into /etc/hosts on all the data nodes?

    As far as the hang goes, I’m not using HMC; I’m using gsInstaller.

    Thanks,
    Ravi

    #7731

    Ravi, I had a similar issue; here is what worked for me.

    First, you should associate your fully qualified domain name with your IP in your /etc/hosts file, and make sure it’s the same on all your nodes, e.g.:
    10.190.111.104 ip-10-190-111-104.ec2.internal Deploy
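
    You can then sanity-check the mapping on each node with:

    getent hosts ip-10-190-111-104.ec2.internal
    hostname -f
    hostname -i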

    Secondly, and specifically for the hang issue, try uninstalling/reinstalling HMC and pre-installing all the required packages, as mentioned here:

    http://hortonworks.com/community/forums/topic/puppet-failed-no-cert/

    #7719

    Hi Sasha,

    I thought I could try RHEL 5.x or CentOS 5.x. The result is the same: I end up getting the same error. I’m providing the details you requested below.

    Nodes:
    [root@ip-10-30-128-116 gsInstaller]# cat nodes
    ip-10-28-101-155.ec2.internal
    ip-10-29-160-185.ec2.internal
    ip-10-29-134-104.ec2.internal

    Hostname:
    [root@ip-10-30-128-116 gsInstaller]# hostname -f
    ip-10-30-128-116.ec2.internal

    Yes, the /etc/hosts files are identical:

    [root@ip-10-30-128-116 gsInstaller]# cat /etc/hosts
    127.0.0.1 localhost.localdomain localhost
    ::1 localhost6.localdomain6 localhost6
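
    Or does /etc/hosts need to list every node explicitly? Since the EC2-internal hostnames embed the IPs, I guess that would look like this:

    10.30.128.116 ip-10-30-128-116.ec2.internal
    10.28.101.155 ip-10-28-101-155.ec2.internal
    10.29.134.104 ip-10-29-134-104.ec2.internal
    10.29.160.185 ip-10-29-160-185.ec2.internal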

    Any further help would be appreciated. I’m not sure what else is going wrong here.

    Thanks,
    Ravi

    #7678

    Sasha J
    Moderator

    Hi Ravi,

    gsInstaller should work on RHEL 6.

    Please post the contents of your nodes files (the files you used to specify the host FQDNs).

    Then post the output of

    hostname -f

    for ALL nodes, including the master.

    Finally, ensure /etc/hosts is IDENTICAL on ALL hosts, and post it here.
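
    Since gsPreRequisites already sets up pdsh, you can collect this from every node in one shot, e.g. (assuming your nodes file lists one host per line):

    pdsh -w ^nodes hostname -f
    pdsh -w ^nodes cat /etc/hosts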

    #7666

    After running a single-node instance on CentOS 6.x, I thought I could try RHEL 6.x, but I realized that RHEL 6.x is not yet supported. I’ll go back and try RHEL 5.x.
