Get Started: Ambari for provisioning, managing and monitoring Hadoop

Ambari is 100% open source and included in HDP, greatly simplifying installation and initial configuration of Hadoop clusters. In this article we'll walk through the installation steps to get started with Ambari. Most of these steps are covered in the main HDP documentation.

The first order of business is installing Ambari Server itself. There are different approaches to this, but for the purposes of this short tour we'll assume Ambari is already installed on its own dedicated node, or on one of the nodes of the (future) cluster itself; instructions can be found under the installation steps linked above. Once Ambari Server is running, the hard work is essentially done: Ambari simplifies cluster installation and initial configuration with a wizard interface, taking care of everything with just a few clicks and decisions from the end user. Hit http://<server_you_installed_ambari>:8080 and log in with admin/admin. Upon logging in, we are greeted by a user-friendly wizard. Welcome to Apache Ambari! Name that cluster and let's get going.

[Screenshot: naming the cluster in the Ambari wizard]
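Everything the wizard does is also reachable over Ambari's REST API, so the login step above can be scripted. Here's a minimal sketch that builds the Basic-auth header for the default admin/admin credentials; the host name is a placeholder, and the `X-Requested-By` header is the one Ambari requires on write requests.

```python
import base64

def ambari_auth_header(user="admin", password="admin"):
    """Build the HTTP Basic-auth headers for Ambari's REST API."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return {"Authorization": f"Basic {token}",
            "X-Requested-By": "ambari"}  # required by Ambari on write calls

# Placeholder host; substitute the node where Ambari Server runs.
url = "http://ambari.example.com:8080/api/v1/clusters"
headers = ambari_auth_header()
print(url)
```

From here, any HTTP client can GET that URL with those headers to list clusters once one exists.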

Now we can target hosts for installation by providing a full listing of host names, or regular expressions when there are many nodes with similar names:

[Screenshot: targeting hosts for installation]
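To see why patterns matter on large clusters, here's a hypothetical expander for bracketed host ranges. Ambari's actual expression syntax is described in its install guide; this sketch just illustrates the idea of turning one pattern into many host names.

```python
import re

def expand_hosts(pattern):
    """Expand a bracketed range like 'node[01-03].example.com' into
    concrete host names (illustrative only, not Ambari's parser)."""
    m = re.match(r"(.*)\[(\d+)-(\d+)\](.*)", pattern)
    if not m:
        return [pattern]  # plain host name, nothing to expand
    prefix, lo, hi, suffix = m.groups()
    width = len(lo)  # preserve zero-padding, e.g. '01'
    return [f"{prefix}{i:0{width}d}{suffix}"
            for i in range(int(lo), int(hi) + 1)]

print(expand_hosts("node[01-03].example.com"))
# ['node01.example.com', 'node02.example.com', 'node03.example.com']
```

One pattern line in the wizard can thus stand in for dozens of similarly named nodes.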

The next step is node registration, with Ambari doing all of the heavy lifting for us. An interface to track progress and drill down into log files is made available:

[Screenshot: host registration progress]

Once registration completes, a detailed view of the host checks that were run is available, along with the option to re-run them:

[Screenshot: host check results]

Next, we select which high-level components we want for the cluster. Dependency checks are built in, so there's no need to know which services are prerequisites for others:

[Screenshot: service selection]
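The dependency checking above can be pictured as a transitive closure over a service graph. The dependency table below is a hypothetical illustration (Ambari's real one ships with its stack definitions):

```python
# Hypothetical dependency table; Ambari's real one lives in its stack definitions.
DEPS = {
    "HIVE": {"MAPREDUCE2", "ZOOKEEPER"},
    "MAPREDUCE2": {"YARN"},
    "YARN": {"HDFS"},
    "HBASE": {"HDFS", "ZOOKEEPER"},
}

def with_dependencies(selected):
    """Return the selected services plus everything they transitively require."""
    needed, stack = set(), list(selected)
    while stack:
        svc = stack.pop()
        if svc in needed:
            continue
        needed.add(svc)
        stack.extend(DEPS.get(svc, ()))
    return needed

print(sorted(with_dependencies({"HIVE"})))
# ['HDFS', 'HIVE', 'MAPREDUCE2', 'YARN', 'ZOOKEEPER']
```

Pick HIVE and the wizard quietly pulls in everything it stands on, which is exactly why the user never has to think about it.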

After service selection, node-specific service assignments are as simple as checking boxes:

[Screenshot: assigning services to hosts]

This is where some minor typing may be required. Ambari allows simple configuration of the cluster via an easy-to-use interface, calling out required fields when necessary:

[Screenshot: cluster configuration screen]
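The required-field call-outs can be sketched as a simple validation pass. The property names here are purely illustrative, not Ambari's actual configuration schema:

```python
# Illustrative required properties per service; not Ambari's actual schema.
REQUIRED = {
    "HIVE": ["hive_metastore_db_password"],
    "NAGIOS": ["nagios_admin_password", "nagios_contact_email"],
}

def missing_fields(service, config):
    """Return required properties the user has not filled in yet."""
    return [key for key in REQUIRED.get(service, [])
            if not config.get(key)]

print(missing_fields("NAGIOS", {"nagios_admin_password": "s3cret"}))
# ['nagios_contact_email']
```

Any non-empty result is what the wizard surfaces as a red call-out next to the field.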

Once configuration has been completed, a review pane is displayed. This is a good point to pause and check for anything that requires adjustment. The Ambari wizard makes that simple. Things look fabulous here, though, so onwards!

[Screenshot: review pane]

Ambari will now execute the actual installation and the necessary smoke tests on all nodes in the cluster. Sit back and relax; Ambari performs the heavy lifting yet again:

[Screenshot: installation and smoke-test progress]

If you are itching to get involved, detailed drill-downs are available to monitor progress:

[Screenshots: per-host and per-task progress drill-downs]

Ambari tracks all progress and activities for you, dynamically updating the interface:

[Screenshot: dynamically updated progress view]
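Under the hood, a progress bar like this is just an aggregation of per-task state. A minimal sketch of that roll-up, with an illustrative task-list shape rather than Ambari's actual wire format:

```python
def overall_progress(tasks):
    """Aggregate per-task progress (0-100 each) into one percentage,
    the way a wizard progress bar summarizes an install run.
    The task-list shape is illustrative, not Ambari's wire format."""
    if not tasks:
        return 0.0
    return sum(t["progress"] for t in tasks) / len(tasks)

tasks = [
    {"host": "node01", "progress": 100},  # install finished
    {"host": "node02", "progress": 100},
    {"host": "node03", "progress": 40},   # still running
]
print(overall_progress(tasks))  # 80.0
```

As each node reports task completions, the aggregate ticks upward and the interface redraws itself.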

And just like that, we have our Hortonworks Data Platform cluster up and running, ready for that high-priority POC.


Go forth and prosper, my friends. May the (big) data be with you.
