How to configure multi node cluster


This topic contains 10 replies, has 9 voices, and was last updated by  savita sheoran 3 months, 2 weeks ago.

  • Creator
    Topic
  • #43839

    Durga Prasad
    Participant

    Hi,

    Can anyone please explain how to set up a multi-node cluster?

    Thanks
    Durga

Viewing 10 replies - 1 through 10 (of 10 total)


  • Author
    Replies
  • #70480

    savita sheoran
    Participant

    Hi,
    I have installed VirtualBox 4.3 and imported the sandbox into it, but after boot, when I try to log in with the command
    ssh root@122.0.0.1 -p 2222
    I get a "connection refused" response. Why is that? Please reply as soon as possible.

    #68121

    vikash pandey
    Participant

    Hi

    Ubuntu 12.04 is a supported OS for a multi-node cluster using Ambari 1.7 and HDP 2.2.

    Follow the steps below.
    echo "OS check"
    cat /etc/issue

    echo "program check"

    whereis rpm
    whereis scp
    whereis curl
    whereis unzip
    whereis tar
    whereis wget
    whereis openssl
    whereis python
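
The program check above can also be scripted so that missing tools stand out. This is a minimal sketch, assuming a POSIX shell; the function name missing_tools is hypothetical, and the tool list is simply the one from the whereis lines above.

```shell
#!/bin/sh
# Print every command from the argument list that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tool list taken from the "program check" step.
missing=$(missing_tools rpm scp curl unzip tar wget openssl python)
if [ -n "$missing" ]; then
  echo "Missing tools:"
  echo "$missing"
else
  echo "All required tools found"
fi
```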

    echo "run hostname, then nslookup on the host name, to verify that the name resolves to the correct IP address"
    hostname
    nslookup "$(hostname -f)"

    In /etc/hosts, add a line with the IP address and FQDN of each host:
    sudo vi /etc/hosts

    echo "set up ulimit"
    Open the file:
    sudo vi /etc/security/limits.conf

    Add the following lines to the end of the file:
    * soft nofile 10000
    * hard nofile 10000

    Then log out and log back in.
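
After logging back in, the new limit can be verified from the shell. This is a small sketch, assuming the value 10000 from the limits.conf lines above; the helper name nofile_at_least is hypothetical.

```shell
#!/bin/sh
# Succeed if the soft open-file limit of this shell is at least $1.
nofile_at_least() {
  limit=$(ulimit -n)
  # "unlimited" trivially satisfies any minimum.
  [ "$limit" = "unlimited" ] && return 0
  [ "$limit" -ge "$1" ]
}

if nofile_at_least 10000; then
  echo "open-file limit OK: $(ulimit -n)"
else
  echo "open-file limit too low: $(ulimit -n)"
fi
```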

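    Passwordless SSH

    ssh-keygen
    Copy id_rsa.pub to the hosts and append it to authorized_keys there:

    cat .ssh/id_rsa.pub >> .ssh/authorized_keys

    Then test the login (f.q.d.n stands for the host's FQDN):
    ssh root@f.q.d.n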
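
The append step is easy to get subtly wrong: with its default StrictModes setting, sshd ignores an authorized_keys file whose permissions are too open, and repeated appends leave duplicate keys. This is a minimal sketch of an idempotent append, assuming a POSIX shell; add_authorized_key is a hypothetical helper, not part of OpenSSH.

```shell
#!/bin/sh
# add_authorized_key PUBKEY_FILE [SSH_DIR]
# Append the public key to SSH_DIR/authorized_keys unless it is already
# there, and tighten the permissions on the directory and the file.
add_authorized_key() {
  pubkey_file=$1
  ssh_dir=${2:-$HOME/.ssh}
  auth_file=$ssh_dir/authorized_keys

  key=$(cat "$pubkey_file") || return 1
  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
  touch "$auth_file" && chmod 600 "$auth_file" || return 1

  # -x: whole-line match, -F: fixed string, so the key is added only once.
  grep -qxF -- "$key" "$auth_file" || printf '%s\n' "$key" >> "$auth_file"
}
```

On each host, after copying the key over, this would be called as add_authorized_key id_rsa.pub. Where it is available, ssh-copy-id root@f.q.d.n performs the copy and the append in one step.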

    NTPD
    Alternative to chkconfig
    ------------------------
    sudo apt-get install ntp
    sudo apt-get install sysv-rc-conf
    sudo sysv-rc-conf ntp on
    sysv-rc-conf --list
    sysv-rc-conf --list ntp

    echo "set umask"
    umask 022

    Also set umask 022 in the .bashrc file.
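
To see what umask 022 does: new files and directories are created without write permission for group and others. This is a short sketch in a throwaway directory; the file and directory names are arbitrary.

```shell
#!/bin/sh
# Show the permissions that umask 022 gives to newly created files and directories.
workdir=$(mktemp -d)
umask 022
touch "$workdir/file"
mkdir "$workdir/dir"

# The first 10 characters of `ls -ld` output are the type and permission bits.
file_mode=$(ls -ld "$workdir/file" | cut -c1-10)
dir_mode=$(ls -ld "$workdir/dir" | cut -c1-10)
echo "file: $file_mode"   # -rw-r--r-- (644)
echo "dir:  $dir_mode"    # drwxr-xr-x (755)
rm -rf "$workdir"
```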

    ----------------------------------------
    Ambari installation
    ----------------------------------------
    Ambari 1.7.0 repository file link for Ubuntu 12:

    wget -nv http://public-repo-1.hortonworks.com/ambari/ubuntu12/1.x/updates/1.7.0/ambari.list
    or
    wget -nv http://public-repo-1.hortonworks.com/ambari/ubuntu12/1.x/updates/1.7.0/ambari.list -O /etc/apt/sources.list.d/ambari.list

    ----------------------------------------
    HDP 2.2 repository file link for Ubuntu 12:

    wget -nv http://public-repo-1.hortonworks.com/HDP/ubuntu12/2.x/GA/2.2.0.0/hdp.list -O /etc/apt/sources.list.d/HDP.list
    ----------------------------------------

    Get the repository key on Ubuntu 12:
    sudo apt-key adv --recv-keys --keyserver hkp://keyserver.ubuntu.com:80 B9733A7A07513CAD

    apt-get update

    apt-cache pkgnames

    apt-get install ambari-server

    #65422
    #62109

    Prem Kumar
    Participant

    Hi Son Hai Ha,

    Apologies for the late reply.

    Could you kindly tell me how I can check that the ports are not blocked?
    Even with the firewall disabled, I still need to verify that the ports are not blocked.
    Please clarify.

    Thanks
    Prem

    #61746

    Son Hai Ha
    Participant

    Hi Kumar,
    Did you also install the Nagios and Ganglia services on the cluster? Those services report the usage metrics. Just make sure the Ganglia Monitor is running on each node and the Ganglia Server is running to receive the reports.
    Please make sure these ports are not blocked for Ganglia: TCP 8625, 8552, 8649, 8651, 8652, 8655, 8656, 8658, 8659, 8660, 8661, 8662, 8663, 8666 and UDP 6343, 8649, 8656, 8658, 8659, 8660, 8661, 8662, 8663, 8666 (most of these ports are not mentioned in the manual).
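
One way to check the TCP ports from another node is to try opening a connection to each of them. This is a minimal sketch, assuming nc (netcat) is installed and supports the common -z and -w options; the function names are hypothetical. A failed connection only shows that the port is unreachable: the cause may be a firewall, or simply that no service is listening on it. UDP ports cannot be verified reliably this way.

```shell
#!/bin/sh
# probe_port HOST PORT: succeed if a TCP connection to HOST:PORT can be opened.
# Assumes nc with -z (connect without sending data) and -w (timeout in seconds).
probe_port() {
  nc -z -w 2 "$1" "$2" >/dev/null 2>&1
}

# report_ports HOST PORT...: print one status line per port.
report_ports() {
  host=$1
  shift
  for port in "$@"; do
    if probe_port "$host" "$port"; then
      echo "$port reachable"
    else
      echo "$port unreachable"
    fi
  done
}
```

For example, running report_ports with the FQDN of the Ganglia Server followed by 8649 8651 8652, from each of the other nodes, would show which of those ports can be reached.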
    Sincerely yours,

    #61743

    Prem Kumar
    Participant

    Hi,

    I set up a two-node cluster and all the services are up and running; every service shows green. But when I click the metrics button, I see the following:
    Disk usage : n/a
    Datanodes Live : 1/1
    Namenode & SecondaryNamenode : 1 Datanode
    Memory Usage : There was no data available. Possible reason including inaccessible Ganglia Service
    Network Usage : There was no data available. Possible reason including inaccessible Ganglia Service
    CPU Usage : There was no data available. Possible reason including inaccessible Ganglia Service
    Cluster Load : There was no data available. Possible reason including inaccessible Ganglia Service
    Namenode Heap : n/a
    Namenode RPC : n/a
    Namenode CPU WIO : n/a
    Namenode Uptime : n/a
    Namenode Master Heap : n/a
    Hbase Links: No active Master, 1 regionserver, n/a
    HBase Avg Load : n/a
    HBase Master Uptime : n/a
    Resource Manager Heap : n/a
    Resource Manager uptime : n/a
    NodeManagers Live : 1/1
    Yarn Memory : n/a
    Supervisors Live : 1/1

    How can I get values for all the metrics?

    #53994

    Son Hai Ha
    Participant

    Hi,
    I hope this helps. Here is my summary of the manual at: http://docs.hortonworks.com/HDPDocuments/Ambari-1.5.1.0/bk_using_Ambari_book/content/ambari-chap1.html

    The process below describes installing Ambari 1.5.1 on a cluster of VMs in OpenStack, where some ports and resource websites are blocked by the company firewall. The VMs running Ambari use the standard “CentOS 6.4 minimal” image. We intended to install Hadoop 1.3.3 on the cluster.

    + Edit the file /etc/hosts on all hosts to use fully qualified domain names. Append records like these to the end of the file:
    ###.###.###.### fully.qualified.domain.name hostname
    10.10.0.1 node1.hadoop.test node1
    10.10.0.2 node2.hadoop.test node2

    so that the nodes can ping each other by hostname.
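
A quick way to confirm that every node has such a record is to look the names up in the hosts file itself. This is a minimal sketch using awk; hosts_ip is a hypothetical helper, and the node names are the example ones above.

```shell
#!/bin/sh
# hosts_ip NAME [FILE]: print the IP address of the first record in FILE
# (default /etc/hosts) that lists NAME as a host name or alias.
hosts_ip() {
  awk -v name="$1" '
    { sub(/#.*/, "") }                       # strip comments
    { for (i = 2; i <= NF; i++)              # field 1 is the address
        if ($i == name) { print $1; exit } }
  ' "${2:-/etc/hosts}"
}
```

With the records above in place, hosts_ip node1.hadoop.test should print 10.10.0.1 on every node. Empty output means the record is missing on that node.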

    + Set the hostname on each node:
    hostname fully.qualified.domain.name
    vi /etc/sysconfig/network

    NETWORKING=yes
    HOSTNAME=fully.qualified.domain.name

    + Disable iptables for Ambari on all hosts
    chkconfig iptables off
    /etc/init.d/iptables stop

    + Disable SELinux on all hosts
    setenforce 0
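
Note that setenforce 0 only switches SELinux to permissive mode until the next reboot. To keep the setting across reboots, the SELINUX= line in /etc/selinux/config has to be changed as well. This is a minimal sketch; set_selinux_mode is a hypothetical helper, and it writes through a temporary file rather than relying on sed -i.

```shell
#!/bin/sh
# set_selinux_mode MODE [FILE]: rewrite the SELINUX= line of FILE
# (default /etc/selinux/config) to MODE: enforcing, permissive or disabled.
set_selinux_mode() {
  mode=$1
  config=${2:-/etc/selinux/config}
  case $mode in
    enforcing|permissive|disabled) ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  tmp=$(mktemp) || return 1
  # Only the SELINUX= line is touched; SELINUXTYPE= and comments are kept.
  sed "s/^SELINUX=.*/SELINUX=$mode/" "$config" > "$tmp" && cat "$tmp" > "$config"
  status=$?
  rm -f "$tmp"
  return $status
}
```

Run as root on each host, for example as set_selinux_mode disabled.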

    + Set the umask value on all hosts
    umask 022

    + Run the NTP service on all hosts
    yum install ntp ntpdate ntp-doc    # install
    chkconfig ntpd on                  # enable the service at boot
    ntpdate pool.ntp.org               # update the time
    /etc/init.d/ntpd start             # start the service

    + Disable IPv6 (optional, in case ambari-server listens on an IPv6 port)
    sysctl -w net.ipv6.conf.all.disable_ipv6=1
    sysctl -w net.ipv6.conf.default.disable_ipv6=1

    + Set up a local repository (optional, if the Ambari server cannot connect to the Hortonworks repositories)
    ++ Install the Apache web server:
    yum install httpd
    /etc/init.d/httpd start

    ++ Download the HDP packages from: http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.16/repos/centos6/HDP-UTILS-1.1.0.16-centos6.tar.gz
    yum install yum-utils createrepo
    mkdir -p /var/www/html/
    cd /var/www/html/

    Untar the file here.

    Note:
    – Open ports 8440 and 8441 in the security group; otherwise the Ambari agent cannot register with the Ambari server.
    – Open ports 2181, 2888, and 3888 for ZooKeeper
    – Open ports 60000, 60010, 60020, and 60030 for HBase
    – Open port 50111 for WebHCat
    – Open ports 50070, 50470, 8020, 9000, 50075, 50475, 50010, 50020, and 50090 for HDFS
    – Open ports 51111, 19888, 50060, 50030, and 9021 for MapReduce (13562 and 50300 are not listed in the manual but should also be opened)
    – Open ports 10000 and 9083 for Hive

    Run Ambari Server Setup
    ambari-server setup

    Start Ambari Server
    ambari-server start

    Access Ambari Web at: http://ambari.server.host:8080/
    Follow the wizard to create your cluster.
    It will ask for the list of nodes that you want to set up; enter their FQDNs.

    #51181

    Vidy G
    Participant

    I am trying to set up a two-node cluster using the HDP 2.0 sandbox. I believe we need to use two different VMs or physical machines to set up a two-node cluster. Is that correct?

    I set up a sandbox VM and cloned it to create a second VM. I enabled Ambari in sandbox 1 to configure sandbox 2 as the second node in the cluster, but Ambari failed to register the second sandbox. The log file reported issues with the host name. I tried to modify the host name of the second VM, with no luck. Has anyone tried this before? If so, what would be a simple way of setting up a two-node HDP cluster?

    #49781
    #43856

    Robert Molina
    Moderator

    Hi Durga,
    Have you looked into using HDP’s Ambari product to set up a multi-node cluster? Here is documentation with the steps for doing so:
    http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.3.2/bk_using_Ambari_book/content/ambari-chap1.html

    Regards,
    Robert
