Storm on YARN Install on HDP2 Cluster

These are the installation instructions for Storm on YARN. Our work is based on the code and documentation provided by Yahoo in the Storm-YARN repository at

We started with a minimal CentOS 6.4 installation on a single VM. The same installation can be scaled up to a multi-node configuration.

You will need to make the following changes to prepare for the HDP 2.0 beta installation:

Disable SELinux enforcement on the running system using the command:

setenforce 0

Edit the SELinux configuration file:

vi /etc/selinux/config

Change SELINUX=enforcing to SELINUX=disabled
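
If you would rather script this edit than use vi, the change can be sketched as a small shell function. This is only a sketch: the function name and the .bak backup suffix are our own choices, and it assumes the file contains a single uncommented SELINUX= line.

```shell
# Sketch: set SELINUX=disabled in an SELinux config file.
# The function name and the ".bak" backup suffix are illustrative choices.
disable_selinux_config() (
    config_file="${1:-/etc/selinux/config}"
    tmp_file="$(mktemp)" || exit 1
    # Keep a backup copy before touching the file.
    cp "$config_file" "$config_file.bak" || exit 1
    # Rewrite only the uncommented SELINUX= line; SELINUXTYPE= is left alone.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$config_file" > "$tmp_file" || exit 1
    # Write the result back in place so ownership and permissions are kept.
    cat "$tmp_file" > "$config_file"
    status=$?
    rm -f "$tmp_file"
    exit "$status"
)

# Usage (as root): disable_selinux_config /etc/selinux/config
```

The change in the configuration file takes effect at the next reboot; setenforce 0 covers the current session.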

Stop the iptables firewall and disable it at boot:

service iptables stop
chkconfig iptables off

Now you are ready to start the HDP 2.0 beta install:

Install the wget package

yum -y install wget

Get the repo file for Ambari and copy it to /etc/yum.repos.d. If you are not using CentOS, please visit the HDP2 documentation for the correct repo URL.

cp ambari.repo /etc/yum.repos.d

Install Java 7 on all your nodes. Run the following on every node:

yum -y install jdk-7u40-linux-x64.rpm

Verify that JAVA_HOME is /usr/java/jdk1.7.0_40/ and that java -version reports the expected version:

[root@yarndev ~]# java -version
java version "1.7.0_40"
Java(TM) SE Runtime Environment (build 1.7.0_40-b43)
Java HotSpot(TM) 64-Bit Server VM (build 24.0-b56, mixed mode)

If java -version reports the wrong version, you will need to create new symbolic links and update JAVA_HOME. As root, do the following on all nodes:

rm /usr/bin/java
rm /usr/bin/javac
rm /usr/bin/javadoc
rm /usr/bin/javaws
ln -s  /usr/java/jdk1.7.0_40/bin/java /usr/bin/java
ln -s  /usr/java/jdk1.7.0_40/bin/javac /usr/bin/javac
ln -s /usr/java/jdk1.7.0_40/bin/javadoc /usr/bin/javadoc
ln -s /usr/java/jdk1.7.0_40/bin/javaws /usr/bin/javaws
echo "export JAVA_HOME=/usr/java/jdk1.7.0_40/" >> /etc/profile
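
The rm and ln commands above can also be written as a loop, which is easier to repeat on many nodes. This is a minimal sketch; the function name and the optional second argument (the directory that receives the links) are our own additions, not part of any JDK tooling.

```shell
# Sketch: point the java, javac, javadoc and javaws links at one JDK.
# Arguments: the JDK home, and optionally the directory holding the links.
relink_jdk_tools() (
    jdk_home="$1"
    bin_dir="${2:-/usr/bin}"
    for tool in java javac javadoc javaws; do
        if [ -x "$jdk_home/bin/$tool" ]; then
            # -f replaces an existing file or link, so no separate rm is needed.
            ln -sfn "$jdk_home/bin/$tool" "$bin_dir/$tool"
        else
            echo "skipping $tool: not found in $jdk_home/bin" >&2
        fi
    done
)

# Usage (as root): relink_jdk_tools /usr/java/jdk1.7.0_40
```

Tools that are missing from the JDK are skipped rather than linked, so a broken link is never left behind.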

Install ntp, start the ntpd service, and sync the time:

yum -y install ntp
service ntpd start

Verify time is the same on all nodes.
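
One way to check the clocks is to collect the epoch time from each node and compare it with a reference. The sketch below assumes you can gather "host epoch_seconds" lines yourself (for example over passwordless ssh); the function name, the host names in the usage comment, and the default tolerance of 2 seconds are all illustrative.

```shell
# Sketch: read "host epoch_seconds" lines on stdin and print every host
# whose clock differs from the reference by more than the allowed skew.
report_clock_skew() (
    reference="$1"
    max_skew="${2:-2}"
    awk -v ref="$reference" -v max="$max_skew" '
        NF >= 2 {
            diff = $2 - ref
            if (diff < 0) diff = -diff
            if (diff > max) print $1 " is off by " diff " seconds"
        }'
)

# Usage, assuming passwordless ssh and hypothetical hosts node1 and node2:
# for h in node1 node2; do echo "$h $(ssh "$h" date +%s)"; done \
#     | report_clock_skew "$(date +%s)" 2
```

Because the ssh calls run one after another, allow a tolerance of a few seconds; no output means every node is within that tolerance.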

Install Ambari server

yum -y install ambari-server

Run the Ambari server setup

ambari-server setup -s -j /usr/java/jdk1.7.0_40/

Details on the Ambari install can be found in the HDP 2.0 beta docs. Make sure the -j option points to your JDK 7 install.

Start Ambari server

ambari-server start

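Install and start the agents

yum -y install ambari-agent

Edit the Ambari agent configuration file (/etc/ambari-agent/conf/ambari-agent.ini) so that it points to the Ambari server host, then start the agent:

ambari-agent start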
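
The agent configuration edit can be scripted as well. The sketch below assumes the agent file is /etc/ambari-agent/conf/ambari-agent.ini and that it contains a hostname= line naming the Ambari server; check both against your install. The function name is our own.

```shell
# Sketch: set the hostname= entry in the Ambari agent ini file.
# Assumes a single "hostname=" line that names the Ambari server.
set_ambari_server_host() (
    server_host="$1"
    ini_file="${2:-/etc/ambari-agent/conf/ambari-agent.ini}"
    tmp_file="$(mktemp)" || exit 1
    sed "s/^hostname=.*/hostname=$server_host/" "$ini_file" > "$tmp_file" || exit 1
    # Write back in place so ownership and permissions are kept.
    cat "$tmp_file" > "$ini_file"
    status=$?
    rm -f "$tmp_file"
    exit "$status"
)

# Usage (as root), with yarndev standing in for your Ambari server host:
# set_ambari_server_host yarndev
```

Restart the agent after changing the file so that it picks up the new setting.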

Install Maven 3.1.1


Untar the Maven archive

tar -zxvf apache-maven-3.1.1-bin.tar.gz

Move the extracted Maven directory to /usr/lib/maven

mv apache-maven-3.1.1 /usr/lib/maven

Add Maven to the PATH environment variable

export PATH=$PATH:/usr/lib/maven/bin
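
The export above only lasts for the current shell. If you want it to survive a new login, a sketch like the following appends the line to a profile file once and does nothing on later runs. The function name and the default profile file (~/.bash_profile) are our own assumptions; adjust them for your shell.

```shell
# Sketch: append an "export PATH" line to a profile file, but only once.
add_to_path_profile() (
    new_dir="$1"
    profile_file="${2:-$HOME/.bash_profile}"
    line="export PATH=\$PATH:$new_dir"
    # -F matches the text literally and -x requires a whole-line match,
    # so the line is added only if it is not already present.
    if ! grep -qxF "$line" "$profile_file" 2>/dev/null; then
        echo "$line" >> "$profile_file"
    fi
)

# Usage: add_to_path_profile /usr/lib/maven/bin
```

The same function can be reused later for the Storm and storm-yarn-master bin folders.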

Get a copy of the repository for Storm on YARN from GitHub


Unzip master

unzip master
cd storm-yarn-master

Edit the repositories and the Hadoop version in pom.xml so that they point at Hortonworks.


Set up Storm on your cluster:

Create a work folder to hold the working files for Storm. Copy the files from lib/ to your work folder and set up the environment variables.

cp -r lib/* /your/work/folder

Go to your work folder and unzip the Storm archive. Then add it to HDFS under /lib/storm/0.9.0-wip2/:

hdfs dfs -put  /lib/storm/0.9.0-wip2/

Add the storm-0.9.0-wip2 and storm-yarn-master bin folders to your PATH. Make sure to replace /your/work/folder with your actual work folder!

export PATH=$PATH:/usr/lib/maven/bin:/your/work/folder/storm-0.9.0-wip21/bin:/your/work/folder/storm-yarn-master/bin

Run the Maven build in the storm-yarn-master folder.

cd storm-yarn-master
mvn package

Start Storm

Edit the storm.yaml file from storm-0.9.0-wip2/conf/storm.yaml to include your ZooKeeper servers. Keep a copy of this file for safekeeping if desired. Then run:
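
For the ZooKeeper edit, Storm reads the server list from the storm.zookeeper.servers key in storm.yaml. The sketch below just prints a block in that shape so you can paste it into the file; the function name and the host names in the usage comment are hypothetical.

```shell
# Sketch: print a storm.zookeeper.servers block for the given hosts.
# The output is meant to be pasted into storm.yaml by hand.
zk_servers_yaml() {
    echo "storm.zookeeper.servers:"
    for host in "$@"; do
        printf '    - "%s"\n' "$host"
    done
}

# Usage, with hypothetical ZooKeeper hosts:
# zk_servers_yaml zk1.example.com zk2.example.com
```

Printing the block instead of appending it avoids ending up with two copies of the key if storm.yaml already defines one.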

storm-yarn launch <path to your storm.yaml file>

Get the Storm configuration using the YARN application ID. First list the running applications to find the ID:

yarn application -list

We store the storm.yaml file in the .storm directory so the storm command can find it when it is submitting jobs.

storm-yarn getStormConfig -appId application_1381089732797_0025  -output ~/.storm/storm.yaml
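
If you do not want to copy the application ID by hand, it can be pulled out of the yarn application -list output. This is a sketch under two assumptions: the ID is the first column of each application row, and you pass a pattern that matches the name your Storm application shows in that listing (the "Storm" pattern in the usage comment is a guess). The function name is our own.

```shell
# Sketch: read "yarn application -list" output on stdin and print the ID of
# the first application whose row matches the given pattern.
find_app_id() (
    name_pattern="$1"
    awk -v pat="$name_pattern" \
        '$1 ~ /^application_/ && $0 ~ pat { print $1; exit }'
)

# Usage, assuming the application name contains "Storm":
# app_id="$(yarn application -list | find_app_id Storm)"
# mkdir -p ~/.storm
# storm-yarn getStormConfig -appId "$app_id" -output ~/.storm/storm.yaml
```

If several Storm applications are running, only the first match is returned, so check the listing yourself in that case.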

Try running two of the sample topologies. You can find the Nimbus host with cat ~/.storm/storm.yaml | grep

Word Count:

[hdfs@yarndev storm-yarn-master]$ storm jar lib/storm-starter-0.0.1-SNAPSHOT.jar storm.starter.WordCountTopology WordCountTopology -c<your nimbus host>


Exclamation:

[hdfs@yarndev storm-yarn-master]$ storm jar lib/storm-starter-0.0.1-SNAPSHOT.jar storm.starter.ExclamationTopology ExclamationTopology -c<your nimbus host>

Monitor the results:

First find which node the ApplicationMaster (AM) was spawned on; this is almost always where Nimbus is spawned as well.

cat ~/.storm/storm.yaml | grep
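
The lookup can be wrapped in a small helper. The sketch below assumes that the Nimbus host is stored under the nimbus.host key in storm.yaml, which is the key Storm 0.9 uses; verify this against your own file. The function name is our own.

```shell
# Sketch: print the value of nimbus.host from a storm.yaml file.
# Assumes a line of the form: nimbus.host: "somehost"
nimbus_host() (
    yaml_file="${1:-$HOME/.storm/storm.yaml}"
    # Drop the key, then strip quotes and spaces from what remains.
    sed -n 's/^nimbus\.host:[[:space:]]*//p' "$yaml_file" \
        | tr -d "\"' " | head -n 1
)

# Usage: nimbus_host ~/.storm/storm.yaml
```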

Then visit yarndev:7070 in a browser, substituting the host you found for yarndev.

