Storm on YARN Install on HDP2 Cluster

These are the installation instructions for Storm on YARN. Our work is based on the code and documentation provided by Yahoo in the Storm-YARN repository at https://github.com/yahoo/storm-yarn.

We started from a minimal installation of CentOS 6.4 on a single VM. This installation can be scaled up to a multinode configuration.

You will need to make the following changes to prepare for the HDP 2.0 beta installation:

Disable SELinux enforcement for the current session using the command:

setenforce 0

Edit the SELinux configuration file so that the change survives a reboot:

vi /etc/selinux/config

Change SELINUX=enforcing to SELINUX=disabled
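
The same edit can be made without opening an editor. This is a minimal sketch, assuming GNU sed (as shipped with CentOS) and a config file that contains a line beginning with SELINUX=. The function name set_selinux_disabled is hypothetical; it takes the file path as an argument so it can be tried on a copy first.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the SELINUX= line of a config file to "disabled".
# The path is an argument so the edit can be rehearsed on a copy before
# touching /etc/selinux/config. Assumes GNU sed for the -i option.
set_selinux_disabled() {
    config_file="$1"
    # Only the line starting with SELINUX= changes; SELINUXTYPE= is left alone.
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$config_file"
}

# Example (as root):
# set_selinux_disabled /etc/selinux/config
```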

Stop the iptables firewall and disable it at boot:

service iptables stop
chkconfig iptables off

Now you are ready to start the HDP 2.0 beta install:

Install the wget package

yum -y install wget

Get the repo for Ambari and copy it to /etc/yum.repos.d. If you're not using CentOS, please visit the HDP2 documentation for the correct repo URL.

wget http://public-repo-1.hortonworks.com/ambari-beta/centos6/1.x/beta/ambari.repo
cp ambari.repo /etc/yum.repos.d

Install Java 7 on all your nodes. Run the following on every node:

yum -y install jdk-7u40-linux-x64.rpm

Verify that Java 1.7.0_40 is the active version. JAVA_HOME should be /usr/java/jdk1.7.0_40/.

[root@yarndev ~]# java -version
java version "1.7.0_40"
Java(TM) SE Runtime Environment (build 1.7.0_40-b43)
Java HotSpot(TM) 64-Bit Server VM (build 24.0-b56, mixed mode)
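
When there are several nodes, the check can be scripted rather than read by eye. This is a minimal sketch: parse_java_version is a hypothetical helper that pulls the quoted version string out of the first line of the java -version output shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the version string found on the first line of
# `java -version` output, read from stdin.
# Note that `java -version` writes to stderr, so callers redirect it with 2>&1.
parse_java_version() {
    # The first line looks like: java version "1.7.0_40"
    sed -n '1s/.*version "\([^"]*\)".*/\1/p'
}

# Example: warn if the JDK on the PATH is not the one installed above.
# [ "$(java -version 2>&1 | parse_java_version)" = "1.7.0_40" ] || echo "unexpected JDK on PATH"
```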

If java -version reports a different version, you will need to create new symbolic links and update JAVA_HOME. As root, do the following on all nodes:

rm /usr/bin/java
rm /usr/bin/javac
rm /usr/bin/javadoc
rm /usr/bin/javaws
ln -s  /usr/java/jdk1.7.0_40/bin/java /usr/bin/java
ln -s  /usr/java/jdk1.7.0_40/bin/javac /usr/bin/javac
ln -s /usr/java/jdk1.7.0_40/bin/javadoc /usr/bin/javadoc
ln -s /usr/java/jdk1.7.0_40/bin/javaws /usr/bin/javaws
echo "export JAVA_HOME=/usr/java/jdk1.7.0_40/" >> /etc/profile

Install ntp, start the ntpd service, and let it sync the time

yum -y install ntp
service ntpd start

Verify time is the same on all nodes.
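
One rough way to compare clocks is to collect the epoch time from each node and look at the spread. This is a sketch only: clock_spread is a hypothetical helper, the host names in the example are placeholders, and the SSH round trips themselves add some delay, so treat a spread of a second or two as noise.

```shell
#!/bin/sh
# Hypothetical helper: given epoch timestamps (one per node) as arguments,
# print the difference in seconds between the largest and the smallest.
clock_spread() {
    min="$1"
    max="$1"
    for t in "$@"; do
        if [ "$t" -lt "$min" ]; then min="$t"; fi
        if [ "$t" -gt "$max" ]; then max="$t"; fi
    done
    echo $((max - min))
}

# Example, assuming passwordless SSH to the placeholder hosts node1 and node2:
# clock_spread "$(ssh node1 date +%s)" "$(ssh node2 date +%s)"
```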

Install Ambari server

yum -y install ambari-server

Run the Ambari server setup

ambari-server setup -s -j /usr/java/jdk1.7.0_40/

Details on the Ambari install can be found in the HDP 2.0 beta docs. Make sure the -j option points to your JDK 7 install.

Start Ambari server

ambari-server start

Install and start agents

On each node, install the agent, edit the Ambari agent config file so that it points to the Ambari server host, and then start the agent:

yum -y install ambari-agent
ambari-agent start
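
The config edit can be scripted when there are many agent nodes. This is a minimal sketch under stated assumptions: the agent config is an ini file with a hostname= line naming the Ambari server, and it lives at /etc/ambari-agent/conf/ambari-agent.ini. Check both against your install. The function name set_ambari_server_host is hypothetical, and GNU sed is assumed.

```shell
#!/bin/sh
# Hypothetical helper: set the Ambari server host in an agent config file.
# Assumes the file has a line of the form "hostname=..." and that GNU sed
# is available for the -i option.
set_ambari_server_host() {
    ini_file="$1"
    server_host="$2"
    sed -i "s/^hostname=.*/hostname=${server_host}/" "$ini_file"
}

# Example (as root, on each agent node; the path is an assumption):
# set_ambari_server_host /etc/ambari-agent/conf/ambari-agent.ini yarndev
```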

Install Maven 3.1.1

wget http://mirror.symnds.com/software/Apache/maven/maven-3/3.1.1/binaries/apache-maven-3.1.1-bin.tar.gz

Untar the Maven archive

tar -zxvf apache-maven-3.1.1-bin.tar.gz

Move the Maven directory to /usr/lib/maven

mv apache-maven-3.1.1 /usr/lib/maven

Add Maven to the PATH environment variable

export PATH=$PATH:/usr/lib/maven/bin
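
An export like this lasts only for the current shell and, if repeated, adds the same directory again. A small guard avoids the duplicates. This is a sketch; path_append is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: append a directory to PATH only if it is not
# already listed, so running it twice has no further effect.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
    export PATH
}

# Example:
# path_append /usr/lib/maven/bin
```

Putting the call in a shell startup file such as ~/.bash_profile would make it apply to new sessions as well.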

Get a copy of the repository for Storm on YARN from GitHub

wget https://github.com/yahoo/storm-yarn/archive/master.zip

Unzip master

unzip master
cd storm-yarn-master

Edit pom.xml so that the repositories and the Hadoop version point at Hortonworks.

Set up Storm on your cluster:

Create a work folder to hold working files for Storm. Copy these files to your work folder and set up the environment variables.

cp lib/storm.zip  /your/work/folder

Go to your work folder and unzip storm.zip, then add storm.zip to HDFS at /lib/storm/0.9.0-wip2/storm.zip:

hdfs dfs -put storm.zip /lib/storm/0.9.0-wip2/

Add the storm-0.9.0-wip2 and storm-yarn-master bin folders to your PATH. Replace /your/work/folder with your actual work folder, and make sure the Storm directory name matches the one created when you unzipped storm.zip:

export PATH=$PATH:/usr/lib/maven/bin:/your/work/folder/storm-0.9.0-wip21/bin:/your/work/folder/storm-yarn-master/bin

Build the project with Maven in the storm-yarn-master folder.

cd storm-yarn-master
mvn package

Start Storm

Edit the storm.yaml file in storm-0.9.0-wip2/conf/ to include your ZooKeeper servers. Keep a copy of this file for safekeeping if desired. Then run:

storm-yarn launch <path to your storm.yaml file>
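
For the ZooKeeper edit, Storm reads the server list from the storm.zookeeper.servers key, which is a YAML list of host names. The sketch below prints such a fragment; zookeeper_yaml is a hypothetical helper, and the host names in the example are placeholders for your own ZooKeeper nodes.

```shell
#!/bin/sh
# Hypothetical helper: print a storm.zookeeper.servers YAML fragment
# listing each host name given as an argument.
zookeeper_yaml() {
    echo "storm.zookeeper.servers:"
    for host in "$@"; do
        printf '    - "%s"\n' "$host"
    done
}

# Example (placeholder hosts). Review the output before merging it into
# storm.yaml, and remove any existing storm.zookeeper.servers entry first.
# zookeeper_yaml zk1.example.com zk2.example.com
```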

Get the Storm configuration using the YARN application ID. First list the running applications to find the ID:

yarn application -list

We store the storm.yaml file in the .storm directory so the storm command can find it when it is submitting jobs.

storm-yarn getStormConfig -appId application_1381089732797_0025  -output ~/.storm/storm.yaml
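
The application ID can also be pulled out of the listing by pattern instead of copied by hand. This is a minimal sketch, assuming IDs have the application_<digits>_<digits> form seen above; extract_app_ids is a hypothetical helper name. If more than one application is listed, it prints every ID, so pick the Storm one yourself.

```shell
#!/bin/sh
# Hypothetical helper: read text on stdin and print each distinct
# YARN application ID found in it, one per line.
extract_app_ids() {
    grep -o 'application_[0-9][0-9]*_[0-9][0-9]*' | sort -u
}

# Example:
# yarn application -list 2>/dev/null | extract_app_ids
```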

Try running two of the sample topologies. You can find the Nimbus host with cat ~/.storm/storm.yaml | grep nimbus.host:

Word Count:

[hdfs@yarndev storm-yarn-master]$ storm jar lib/storm-starter-0.0.1-SNAPSHOT.jar storm.starter.WordCountTopology WordCountTopology -c nimbus.host=<your nimbus host>

Exclamation:

[hdfs@yarndev storm-yarn-master]$ storm jar lib/storm-starter-0.0.1-SNAPSHOT.jar storm.starter.ExclamationTopology ExclamationTopology -c nimbus.host=<your nimbus host>
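
The <your nimbus host> placeholder can be filled in from the downloaded configuration instead of by hand. This is a sketch, assuming storm.yaml has a single top-level line of the form nimbus.host: value, quoted or not; nimbus_host is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the value of nimbus.host from a storm.yaml
# file, with any surrounding double quotes removed.
nimbus_host() {
    awk -F': *' '/^nimbus\.host:/ { gsub(/"/, "", $2); print $2 }' "$1"
}

# Example, reusing the command shown above:
# storm jar lib/storm-starter-0.0.1-SNAPSHOT.jar storm.starter.ExclamationTopology ExclamationTopology -c nimbus.host="$(nimbus_host ~/.storm/storm.yaml)"
```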

Monitor the results:

First find the node on which the ApplicationMaster (AM) was spawned. This is almost always where Nimbus runs as well.

cat ~/.storm/storm.yaml | grep nimbus.host

Then open port 7070 on that host in a browser (yarndev:7070 in our setup).
