HDP 1.2 Install Hangs on the Wizard "Install, Start and Test"

This topic contains 20 replies, has 8 voices, and was last updated by  Robert 1 year, 4 months ago.

  • Creator
    Topic
  • #14910

    Roger Hill
    Participant

    I am installing HDP 1.2 via the wizard (ambari server) on a small 5 node cluster …

    – I have all of my yum repos configured properly,
    – SSH Passwordless Key access between Ambari Host server and targets is configured properly
    – NTP, DNS and all other Linux services are properly configured
    – SELinux is disabled
    – IPtables is turned off and disabled
    – JDK 1.6.33 is present and configured properly
    – ambari-server start command will start the ambari server correctly

    I get all of the way through the WUI installer wizard, to the last step “Install, Start and Test”, and it fails every time …

    Have tried the command for ambari server reset …

    [root@hw01 ~]# ambari-server reset
    Using python /usr/bin/python2.6
    Resetting ambari-server
    **** WARNING **** You are about to reset and clear the Ambari Server database. This will remove all cluster host and configuration information from the database. You will be required to re-configure the Ambari server and re-run the cluster wizard.
    Are you SURE you want to perform the reset [yes/no]? yes
    Confirm server reset [yes/no]? yes
    Reseting the Server database…
    Ambari Server ‘reset’ complete

    I have also gone to the other nodes and restarted the ambari agents, and have even rebooted the servers (6 GB RAM) and added more memory. Checking with the ‘free -m’ command tells me they are not using much memory after a failed install.

    I have tried checking the ambari server logs and ambari agent logs, with not much help there, and the reset is not working. I checked the documentation for HDP 1.2 and can’t seem to find what my issue is.

    Perhaps multiple failed installs have resulted in a “partially installed system” … ???

    Please throw any advice that you can in my direction.

    Thanks,

    -Roger Hill
    618-698-6055

Viewing 20 replies - 1 through 20 (of 20 total)

  • Author
    Replies
  • #32870

    Robert
    Participant

    Hi Rajesh,
    In order for the community to assist you better, it would be helpful to start a new thread and provide some specifics on the issue you are having as far as what errors or messages you are seeing.

    Regards,
    Robert

    #32846


    Member

    Hi
    I am having the same issue. What can I do to find the root cause?

    thx
    Rajesh

    #15155

    tedr
    Member

    Hi Roger,

    It looks like you may have forgotten to put “namenode” on the end of the /usr/lib/hadoop/bin/hadoop-daemon.sh command, so the namenode daemon has not actually started. What is shown as running in your ps output below is the ganglia monitor for the namenode.

    I hope this helps,
    Ted.

    #15154

    Sasha J
    Moderator

    By the way, command should be:

    /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start namenode

    There should be 2 dashes before the word “config”.

    But you may need to format the namenode first: “hadoop namenode -format”,
    but even before this you should fix your java problem.

    So, I suggest the following scenario:
    1. wipe out all your nodes completely by re-imaging OS.
    2. Make sure your name resolution works correctly, firewall disabled, SELinux disabled, etc
    3. Follow installation documentation precisely
    4. Have an installed and running cluster in less than 1 hour (assuming you follow the installation steps precisely).

    Thank you!
    Sasha

    PS. If there are still problems, we can do a WebEx with you and walk you through the process.

    #15153

    Sasha J
    Moderator

    Roger,
    it seems like we are closer to resolving the problem…
    Now the question is:
    where did you get Java and how did you install it?
    An execute “Permission denied” on Java looks weird….

    Thank you!
    Sasha

    #15152

    Roger Hill
    Participant

    [root@hw01 ~]# su - hdfs
    -bash-4.1$ /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start
    starting , logging to /var/log/hadoop/hdfs/hadoop-hdfs--hw01.savvis.lab.out
    Usage: hadoop [--config confdir] COMMAND
    where COMMAND is one of:
      namenode -format     format the DFS filesystem
      secondarynamenode    run the DFS secondary namenode
      namenode             run the DFS namenode
      datanode             run a DFS datanode
      dfsadmin             run a DFS admin client
      mradmin              run a Map-Reduce admin client
      fsck                 run a DFS filesystem checking utility
      fs                   run a generic filesystem user client

    -bash-4.1$
    -bash-4.1$ ps -ef | grep -i namenode
    nobody 7893 1 0 Feb12 ? 00:00:03 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPNameNode/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPNameNode/gmond.pid
    hdfs 23040 22957 0 15:32 pts/1 00:00:00 grep -i namenode

    [root@hw01 ~]# cat /var/run/ganglia/hdp/HDPNameNode/gmond.pid
    7893

    [root@hw01 ~]# /usr/jdk/jdk1.6.0_31/bin/jps
    10501 AmbariServer
    23137 Jps

    I am going to check the ambari web console now …

    #15151

    Roger Hill
    Participant

    Permission denied ???

    [root@hw01 ~]# cat /var/log/hadoop/hdfs/hadoop-hdfs-namenode-hw01.savvis.lab.out
    /usr/lib/hadoop/libexec/../bin/hadoop: line 320: /usr/jdk/jdk1.6.0_31/bin/java: Permission denied
    /usr/lib/hadoop/libexec/../bin/hadoop: line 390: /usr/jdk/jdk1.6.0_31/bin/java: Permission denied
    /usr/lib/hadoop/libexec/../bin/hadoop: line 390: exec: /usr/jdk/jdk1.6.0_31/bin/java: cannot execute: Permission denied

    [root@hw01 ~]# ls -ld /var/log/hadoop/hdfs/hadoop-hdfs-namenode-hw01.savvis.lab.out
    -rw-r--r-- 1 hdfs hadoop 316 Feb 12 23:02 /var/log/hadoop/hdfs/hadoop-hdfs-namenode-hw01.savvis.lab.out

    Why would this be happening ???

    [root@hw01 ~]# /usr/jdk/jdk1.6.0_31/bin/java -version
    java version "1.6.0_31"
    Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
    Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)

    [root@hw01 ~]# su - hdfs
    -bash-4.1$ /usr/jdk/jdk1.6.0_31/bin/java -version
    -bash: /usr/jdk/jdk1.6.0_31/bin/java: Permission denied

    Perhaps that's why ????
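
The transcript above shows root running the JDK binary while the hdfs user gets "Permission denied". One possible cause, assuming hdfs is neither the owner nor in the group of the JDK files, is a missing execute bit for "other" on the binary or on one of its parent directories. The function below is a minimal sketch for spotting that; `check_other_exec` is a hypothetical helper name, and it only looks at classic mode bits (not ACLs or mount options).

```shell
# Sketch: print every component of a path that lacks the execute bit for "other".
# Assumption: the user in question (e.g. hdfs) is neither owner nor group member,
# so the "other" bits are the ones that apply.
check_other_exec() {
    p="$1"
    rc=0
    while :; do
        # Character 10 of the 'ls -ld' mode string is the execute slot for "other"
        # ('t' is the sticky bit with execute set, as on /tmp).
        bit=$(ls -ld "$p" 2>/dev/null | cut -c10)
        case "$bit" in
            x|t) ;;
            *) echo "no o+x: $p"; rc=1 ;;
        esac
        # Stop once the root (or the start of a relative path) has been checked.
        case "$p" in
            /|.) break ;;
        esac
        p=$(dirname "$p")
    done
    return $rc
}

# Example call, using the path from the thread:
#   check_other_exec /usr/jdk/jdk1.6.0_31/bin/java
```

Any line it prints names a file or directory to inspect with `ls -ld`; whether to loosen the mode or reinstall the JDK is a separate decision.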

    #15144

    Yusaku Sako
    Participant

    BTW, you can find hadoop-hdfs-namenode-*.log under /var/log/hadoop/hdfs/ (again, this is on the NameNode host).

    #15143

    Yusaku Sako
    Participant

    Hi Roger,

    Sorry to hear that you are having trouble setting up your cluster.
    If I understand you correctly, you are on the “Install, Start and Test” page of the Install Wizard, you get through the installation part fine and the overall status is beyond 34%, but the NameNode fails to start.

    Could you look into the content of hadoop-hdfs-namenode-*.log from the host where NameNode is assigned?
    If there are some obvious errors there, it might help you pinpoint what the underlying issue is.
    You want to get it to the point where manually running:
    # su - hdfs
    $ /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start namenode
    starts up the NameNode successfully. You can verify that the NameNode is up by running “ps -ef | grep namenode”. The PID should match /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid.

    Hope this helps.
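
The PID check described in this reply can be sketched as a small shell function. This is an illustration only: `pidfile_alive` is a hypothetical name, the default path is the pid file quoted above, and it uses `kill -0` rather than `ps`, so it needs to run as root or as the user that owns the process.

```shell
# Sketch: succeed only if the process recorded in a pid file is still alive.
# The default path is the NameNode pid file quoted in the thread; pass another as $1.
pidfile_alive() {
    pidfile="${1:-/var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid}"
    # A missing or empty pid file means the daemon never recorded a start.
    [ -s "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # 'kill -0' sends no signal; it only reports whether the process can be signalled.
    kill -0 "$pid" 2>/dev/null
}

# Example call:
#   pidfile_alive && echo "NameNode is running" || echo "NameNode is not running"
```

A failing result right after a start attempt is the cue to read the matching .out and .log files under /var/log/hadoop/hdfs/.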

    #15135

    Roger Hill
    Participant

    Again, all services seem to install correctly, but cannot start, getting errors like this ..

    NameNode component failure messages …

    warning: Dynamic lookup of $service_state at /var/lib/ambari-agent/puppet/modules/hdp-hadoop/manifests/init.pp:161 is deprecated. Support will be removed in Puppet 2.8. Use a fully-qualified variable name (e.g., $classname::variable) or parameterized classes.
    warning: Dynamic lookup of $service_state at /var/lib/ambari-agent/puppet/modules/hdp-hadoop/manifests/service.pp:74 is deprecated. Support will be removed in Puppet 2.8. Use a fully-qualified variable name (e.g., $classname::variable) or parameterized classes.
    warning: Dynamic lookup of $service_state at /var/lib/ambari-agent/puppet/modules/hdp-hadoop/manifests/service.pp:83 is deprecated. Support will be removed in Puppet 2.8. Use a fully-qualified variable name (e.g., $classname::variable) or parameterized classes.
    warning: Dynamic lookup of $ambari_db_server_host is deprecated. Support will be removed in Puppet 2.8. Use a fully-qualified variable name (e.g., $classname::variable) or parameterized classes.
    notice: /Stage[1]/Hdp::Snappy::Package/Hdp::Snappy::Package::Ln[32]/Hdp::Exec[hdp::snappy::package::ln 32]/Exec[hdp::snappy::package::ln 32]/returns: executed successfully
    notice: /Stage[2]/Hdp-hadoop::Initialize/Configgenerator::Configfile[core-site]/File[/etc/hadoop/conf/core-site.xml]/content: content changed '{md5}d06b24016cbc85fa3b83d34ac916f1f3' to '{md5}b17f7c9365b5cf938d7ed4e76d056d91'
    notice: /Stage[2]/Hdp-hadoop::Initialize/Configgenerator::Configfile[hdfs-site]/File[/etc/hadoop/conf/hdfs-site.xml]/content: content changed '{md5}8fe38e9fc18dd6a180af84f8b992d917' to '{md5}14431967bae039d2124f2dd36a3da894'
    notice: /Stage[2]/Hdp-hadoop::Namenode/Hdp-hadoop::Service[namenode]/Hdp::Exec[su - hdfs -c '/usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start namenode']/Exec[su - hdfs -c '/usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start namenode']/returns: executed successfully
    err: /Stage[2]/Hdp-hadoop::Namenode/Hdp-hadoop::Service[namenode]/Hdp::Exec[sleep 5; ls /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid` >/dev/null 2>&1]/Exec[sleep 5; ls /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid` >/dev/null 2>&1]/returns: change from notrun to 0 failed: sleep 5; ls /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid` >/dev/null 2>&1 returned 1 instead of one of [0] at /var/lib/ambari-agent/puppet/modules/hdp/manifests/init.pp:299
    notice: /Stage[2]/Hdp-hadoop::Namenode/Hdp-hadoop::Service[namenode]/Hdp::Exec[sleep 5; ls /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid` >/dev/null 2>&1]/Anchor[hdp::exec::sleep 5; ls /var/run/hadoop/hdfs/had

    #15097

    Roger Hill
    Participant

    Fixed the crashing browser install, got thru to the end of the WUI installation….guess what…crashes at the end …AGAIN starting services….lol

    #15094

    Larry Liu
    Moderator

    Hi, Roger

    I left a voicemail on your number 618-698-6055. It is a weird issue that Firefox crashed due to Ambari. My phone is 408-645-7043 if you want to talk about this issue.

    Please try again if you could.

    Thanks

    Larry

    #15092

    Roger Hill
    Participant

    I set up a proxy env variable … so now I got the install to finally go smoothly ..

    export http_proxy=http://sl7labproxy01.savvis.lab:8080

    [root@hw01 ~]# ambari-server setup
    Using python /usr/bin/python2.6
    Run postgresql initdb
    Run postgresql start
    Starting postgresql service: [ OK ]
    Setup ambari-server
    Checking SELinux…
    SELinux status is ‘disabled’
    Checking iptables…
    iptables is disabled now
    Checking PostgreSQL…
    Configuring database…
    Configuring PostgreSQL…
    Backup for pg_hba found, reconfiguration not required
    Checking JDK…
    Downloading JDK from http://public-repo-1.hortonworks.com/ARTIFACTS/jdk-6u31-linux-x64.bin to /var/lib/ambari-server/resources/jdk-6u31-linux-x64.bin
    JDK distribution size is 85581913 bytes
    jdk-6u31-linux-x64.bin… 100% (81.6 MB of 81.6 MB)
    Successfully downloaded JDK distribution to /var/lib/ambari-server/resources/jdk-6u31-linux-x64.bin
    To install the Oracle JDK you must accept the license terms found at http://www.oracle.com/technetwork/java/javase/downloads/jdk-6u21-license-159167.txt. Not accepting will cancel the Ambari Server setup.
    Do you accept the Oracle Binary Code License Agreement [y/n] (y)? y
    Installing JDK to /usr/jdk64
    Successfully installed JDK to /usr/jdk64/jdk1.6.0_31
    Completing setup…
    Ambari Server ‘setup’ finished successfully
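
The earlier failed run reported that the response lacked Content-Length, so one way to test a proxy setting before re-running setup is to fetch only the headers of the JDK URL and look for that header. The following is a sketch under that assumption; `has_content_length` is a hypothetical helper, and the curl call relies on curl honouring the `http_proxy` environment variable.

```shell
# Sketch: read HTTP response headers on stdin and succeed if one of them
# is Content-Length (header names are case-insensitive).
has_content_length() {
    grep -qi '^content-length:'
}

# Example call, using the proxy and URL from the thread:
#   export http_proxy=http://sl7labproxy01.savvis.lab:8080
#   curl -sI http://public-repo-1.hortonworks.com/ARTIFACTS/jdk-6u31-linux-x64.bin \
#       | has_content_length && echo "download looks reachable"
```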

    Trying to hit the UI wizard now at http://hw01.savvis.lab:8080 crashes the firefox browser ??! (yes it is up to date), won’t even start in IE at all ….is this software really this difficult to install ???

    Extremely frustrated ….

    #15091

    Larry Liu
    Moderator

    Hi, Roger

    In order to install HDP in your environment, please try to set up a local repo. Here is the documentation:

    http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.2.1/bk_using_Ambari_book/content/ambari-chap1-6.html

    After that, we can continue.

    Larry

    #15086

    Roger Hill
    Participant

    Soooo…

    [root@hw01 init.d]# tail /var/lib/pgsql/pgstartup.log
    FATAL: could not create any TCP/IP sockets
    LOG: could not translate host name “localhost”, service “5432” to address: Temporary failure in name resolution
    WARNING: could not create listen socket for “localhost”
    FATAL: could not create any TCP/IP sockets
    LOG: could not translate host name “localhost”, service “5432” to address: Temporary failure in name resolution
    WARNING: could not create listen socket for “localhost”
    FATAL: could not create any TCP/IP sockets
    LOG: could not translate host name “localhost”, service “5432” to address: Temporary failure in name resolution
    WARNING: could not create listen socket for “localhost”
    FATAL: could not create any TCP/IP sockets

    Got past the postgres not starting error…my /etc/hosts for some reason was only 640 …

    [root@hw01 ~]# chmod 644 /etc/hosts
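
With mode 640, /etc/hosts is presumably unreadable to the unprivileged postgres account, which would explain the failed "localhost" lookup in the log above. A minimal sketch of a check for this, where `world_readable` is a hypothetical helper that inspects only the classic mode bits:

```shell
# Sketch: succeed only if the file is readable by "other".
# Character 8 of the 'ls -ld' mode string is the read slot for "other".
world_readable() {
    [ "$(ls -ld "$1" 2>/dev/null | cut -c8)" = "r" ]
}

# Example call:
#   world_readable /etc/hosts || echo "/etc/hosts is not world-readable"
```

After fixing the mode, running `getent hosts localhost` as the affected user is a quick way to confirm that name resolution works again.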

    Fixed that … now this jdk error … won’t download, and not sure why because now I know I have 25 GB of free space under “/” … (i.e. … /var)

    [root@hw01 ~]# ambari-server setup
    Using python /usr/bin/python2.6
    Run postgresql initdb
    Run postgresql start
    Starting postgresql service: [ OK ]
    Setup ambari-server
    Checking SELinux…
    SELinux status is ‘disabled’
    Checking iptables…
    iptables is disabled now
    Checking PostgreSQL…
    Configuring database…
    Configuring PostgreSQL…
    Restarting PostgreSQL
    Checking JDK…
    Downloading JDK from http://public-repo-1.hortonworks.com/ARTIFACTS/jdk-6u31-linux-x64.bin to /var/lib/ambari-server/resources/jdk-6u31-linux-x64.bin
    Request headr doesn’t contain Content-Length
    ERROR: Downloading or installing JDK failed. Exiting.
    [root@hw01 ~]#

    So actually, the server I am using does not have direct internet access; that must be the problem … is there any way for me to use a proxy with the installer ??? Or do I need to configure squid or something?

    #15084

    Roger Hill
    Participant

    Rebuilt servers with increased “/” partition …

    [root@hw01 ~]# df -ha
    Filesystem            Size  Used Avail Use% Mounted on
    /dev/mapper/vg_root-lv_root
                           27G  1.8G   24G   7% /


    .. and more memory …

    [root@hw01 ~]# free -m
                 total       used       free     shared    buffers     cached
    Mem:          5853        465       5388          0         23        272
    -/+ buffers/cache:        169       5683
    Swap:         4095          0       4095
    [root@hw01 ~]#

    I have rebooted, and now postgres won't start up ???

    [root@hw01 ~]# ambari-server setup
    Using python /usr/bin/python2.6
    Run postgresql initdb
    Run postgresql start
    Starting postgresql service: [FAILED]
    Setup ambari-server
    Checking SELinux…
    SELinux status is ‘disabled’
    Checking iptables…
    iptables is disabled now
    Checking PostgreSQL…
    About to start PostgreSQL
    ERROR: Unable to start PostgreSQL server. Exiting

    #15073

    Jeff Sposetti
    Moderator

    What are you seeing in /var/log/ambari-server/ambari-server.log?
    What browser are you using?

    #15062

    Sasha J
    Moderator

    Roger,
    it is not clear what you mean by “at the ‘Starting Services’… and bombs out, but allows me to continue with warnings”.
    What were the warnings?
    Also, “but I cannot even start the HDFS service”… What are the symptoms? Any error messages? How are you trying to start it?

    It would be useful if you could provide the ambari-agent log for the NameNode machine.
    And the HDFS logs as well…

    Thank you!
    Sasha.

    #15060

    Roger Hill
    Participant

    I’d installed java on the servers myself because your installer was failing to do so … I went back, re-examined, and determined on my own, that your installer needs much more space than what we have on our default RHEL6 build for the root partition “/” . Sooo…rebuilt the servers, and added diskspace to /usr, and /var, re-installed everything …install was working better, but it gets to about 34% totally complete, at the ‘Starting Services’… and bombs out, but allows me to continue with warnings….

    So it seems like everything got installed finally on my 5 node cluster , but I cannot even start the HDFS service ??? (Checking both the ambari-server and ambari client logs yields no help)

    [root@hw01 ~]# tail /var/log/ambari-server/ambari-server.log
    17:09:49,987 INFO ClusterControllerImpl:92 – Using resource provider org.apache.ambari.server.controller.internal.HostComponentResourceProvider for request type HostComponent
    17:09:49,987 INFO ClusterControllerImpl:92 – Using resource provider org.apache.ambari.server.controller.internal.ComponentResourceProvider for request type Component
    17:09:49,988 INFO ClusterControllerImpl:92 – Using resource provider org.apache.ambari.server.controller.internal.HostComponentResourceProvider for request type HostComponent
    17:09:49,988 INFO ClusterControllerImpl:92 – Using resource provider org.apache.ambari.server.controller.internal.ComponentResourceProvider for request type Component
    17:09:49,988 INFO ClusterControllerImpl:92 – Using resource provider org.apache.ambari.server.controller.internal.HostComponentResourceProvider for request type HostComponent
    17:09:49,989 INFO ClusterControllerImpl:92 – Using resource provider org.apache.ambari.server.controller.internal.HostComponentResourceProvider for request type HostComponent
    17:09:49,989 INFO ClusterControllerImpl:92 – Using resource provider org.apache.ambari.server.controller.internal.ComponentResourceProvider for request type Component
    17:09:49,990 INFO ClusterControllerImpl:92 – Using resource provider org.apache.ambari.server.controller.internal.HostComponentResourceProvider for request type HostComponent
    17:09:49,990 INFO ClusterControllerImpl:92 – Using resource provider org.apache.ambari.server.controller.internal.ComponentResourceProvider for request type Component
    17:09:49,990 INFO ClusterControllerImpl:92 – Using resource provider org.apache.ambari.server.controller.internal.HostComponentResourceProvider for request type HostComponent

    [root@hw02 ~]# tail /var/log/ambari-agent/ambari-agent.log
    {u'clusterName': u'hwlabsvvs1',
    u'commandType': u'STATUS_COMMAND',
    u'componentName': u'MYSQL_SERVER',
    u'serviceName': u'HIVE'}
    INFO 2013-02-11 17:10:25,067 ActionQueue.py:80 - The STATUS_COMMAND from the server is
    {u'clusterName': u'hwlabsvvs1',
    u'commandType': u'STATUS_COMMAND',
    u'componentName': u'MYSQL_SERVER',
    u'serviceName': u'HIVE'}
    INFO 2013-02-11 17:10:25,067 Controller.py:177 – No commands sent from the Server.

    #14913

    Sasha J
    Moderator

    Roger,
    It looks like you have all the prerequisites OK, but still cannot install HDP…
    What are the error messages you got?
    Also, it seems like you are going too far with your pre-installation steps…
    In general, you should not install Java; Ambari does this for you on all nodes, and it installs 1.6.31.
    You have 1.6.33 preinstalled. Please make sure you point to the correct JAVA_HOME on the second page (the one where you provide names for the hosts and the SSH key).
    Please do the following:

    “service ambari-agent stop” on all nodes.
    “yum erase ambari-agent” on all nodes.
    “ambari-server stop”, “ambari-server reset”, “ambari-server start” on the Ambari node.

    Then connect to the UI and start over. Do not install or start agents; Ambari will do this for you.
    If there is a problem and an error again, post the error messages here.
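
The clean-up order in this reply can be sketched as a script run from the Ambari server. Assumptions that are not in the original post: root SSH access to each agent host (the thread mentions passwordless SSH from the Ambari host), the `-y` flag on yum to skip the confirmation prompt, and the helper names `run` and `reset_cluster`. Commands are only printed unless `DRY_RUN=0` is set.

```shell
# Sketch of the clean-up sequence. With DRY_RUN=1 (the default here) each
# command is printed instead of executed, so the order can be reviewed first.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: reset_cluster host1 host2 ...   (the agent hosts)
reset_cluster() {
    for h in "$@"; do
        run ssh "root@$h" "service ambari-agent stop"
        run ssh "root@$h" "yum -y erase ambari-agent"   # -y is an addition, not from the post
    done
    run ambari-server stop
    run ambari-server reset      # prompts for confirmation when actually executed
    run ambari-server start
}

# Example call (host names are placeholders):
#   reset_cluster hw02 hw03 hw04 hw05
```

Reviewing the dry-run output first is cheap, since `ambari-server reset` clears the server database.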

    Thank you!
    Sasha
