Apache Ambari

A framework for provisioning, managing and monitoring Apache Hadoop clusters.

Apache Ambari is a completely open operational framework for provisioning, managing and monitoring Apache Hadoop clusters. It includes an intuitive collection of operator tools and a set of APIs that hide the complexity of Hadoop, simplifying the operation of clusters.

What Ambari Does

Ambari enables system administrators to:

  • Provision a Hadoop cluster. No matter the size of your Hadoop cluster, Ambari simplifies the deployment and maintenance of hosts. An intuitive Web interface lets you easily provision, configure and test all the Hadoop services and core components.
  • Manage a Hadoop cluster. Ambari provides tools to simplify cluster management. The Web interface allows you to start, stop and test Hadoop services, change configurations and manage the ongoing growth of your cluster.
  • Monitor a Hadoop cluster. Gain instant insight into the health of your cluster. Ambari pre-configures alerts for watching Hadoop services and visualizes cluster operational data in a simple Web interface.

    Ambari also includes job diagnostic tools that visualize job interdependencies and task timelines, helping you troubleshoot the performance of historical jobs.

  • Integrate Hadoop with other applications. Ambari provides a RESTful API that enables integration with existing tools, such as Microsoft System Center and Teradata Viewpoint. Ambari also leverages standard technologies and protocols, such as Nagios and Ganglia, for deeper customization.
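As a concrete illustration of that RESTful API, the sketch below builds the URL, headers and JSON body for a service state change against Ambari's v1 API. The host, cluster name and service name are placeholder assumptions; Ambari expresses "stop" as the target state INSTALLED and "start" as STARTED.

```python
import json

AMBARI_HOST = "http://ambari.example.com:8080"  # assumption: default Ambari server port
CLUSTER = "mycluster"                           # assumption: your cluster's name

def service_state_request(service, state):
    """Build the URL, headers and body for an Ambari service state change.

    The request is sent as an HTTP PUT, authenticated as an Ambari admin user.
    """
    url = f"{AMBARI_HOST}/api/v1/clusters/{CLUSTER}/services/{service}"
    headers = {
        "X-Requested-By": "ambari",  # header Ambari requires on modifying requests
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "RequestInfo": {"context": f"Set {service} to {state} via REST"},
        "Body": {"ServiceInfo": {"state": state}},
    })
    return url, headers, body

# Prepare a request that stops HDFS (state INSTALLED = stopped).
url, headers, body = service_state_request("HDFS", "INSTALLED")
```

The same payload shape, with state STARTED, restarts the service; an external tool such as System Center would issue these calls instead of driving the Web interface.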

How Ambari Works

Hadoop cluster provisioning and ongoing management can be a complicated task, especially when there are hundreds or thousands of hosts involved. Ambari provides a single control point for viewing, updating and managing Hadoop service life cycles, with these important features:

  • Wizard-driven installation of Hadoop services across any number of hosts
  • Granular configuration of Hadoop services and components
  • Ganglia for metrics collection and Nagios for system alerts
  • Advanced job diagnostic and troubleshooting tools
  • Robust RESTful APIs for customization and integration with enterprise systems
  • Cluster heatmaps
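The RESTful APIs above also expose cluster state for monitoring integrations. The sketch below parses the service list returned by Ambari's v1 API (a `/clusters/{name}/services` query) into a name-to-state map; the sample payload and service names are invented for illustration.

```python
import json

def service_states(api_response):
    """Map service name -> state from an Ambari /services API reply."""
    return {
        item["ServiceInfo"]["service_name"]: item["ServiceInfo"]["state"]
        for item in api_response.get("items", [])
    }

# Invented sample mimicking the v1 response shape.
sample = json.loads("""
{"items": [
  {"ServiceInfo": {"service_name": "HDFS",      "state": "STARTED"}},
  {"ServiceInfo": {"service_name": "MAPREDUCE", "state": "INSTALLED"}}
]}
""")

states = service_states(sample)
```

A monitoring dashboard could poll this endpoint and alert when any service leaves the STARTED state.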

Ambari graduated to an Apache Top-Level Project in December 2013.

Try it with Sandbox

Hortonworks Sandbox is a self-contained virtual machine with Apache Hadoop pre-configured alongside a set of hands-on, step-by-step Hadoop tutorials.


