Apache Ambari

A framework for provisioning, managing and monitoring Apache Hadoop clusters

Apache Ambari is a completely open operational framework for provisioning, managing and monitoring Apache Hadoop clusters. Ambari includes an intuitive collection of operator tools and a set of APIs that mask the complexity of Hadoop, simplifying the operation of clusters. With hundreds of years of combined experience, Hortonworks and members of the Hadoop community have answered the call to deliver the key services required for enterprise Hadoop.

What Ambari Does

Ambari enables system administrators to provision, manage and monitor a Hadoop cluster, and also to integrate Hadoop with the existing enterprise infrastructure.

Provision a Hadoop Cluster

No matter the size of your Hadoop cluster, the deployment and maintenance of hosts is simplified using Ambari. Ambari includes an intuitive Web interface that allows you to easily provision, configure and test all the Hadoop services and core components. Ambari also provides the powerful Ambari Blueprints API for automating cluster installations without user intervention.
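As a rough illustration of what automating an install through Ambari Blueprints can look like, the sketch below builds a small blueprint document and the HTTP request that would register it. The server address `ambari.example.com`, the blueprint name, the host-group layout and the helper names are assumptions made for this example, and the request is only constructed, never sent.

```python
import json
import urllib.request

AMBARI_URL = "http://ambari.example.com:8080"  # hypothetical Ambari server


def build_blueprint(name, stack_name, stack_version, host_groups):
    """Return a blueprint document as a plain dict.

    host_groups maps a group name to the list of component names
    that should run on every host in that group.
    """
    return {
        "Blueprints": {
            "blueprint_name": name,
            "stack_name": stack_name,
            "stack_version": stack_version,
        },
        "host_groups": [
            {
                "name": group,
                "cardinality": "1",
                "components": [{"name": component} for component in components],
            }
            for group, components in host_groups.items()
        ],
    }


def build_register_request(blueprint):
    """Build (but do not send) the POST that would register the blueprint."""
    name = blueprint["Blueprints"]["blueprint_name"]
    return urllib.request.Request(
        url=f"{AMBARI_URL}/api/v1/blueprints/{name}",
        data=json.dumps(blueprint).encode("utf-8"),
        headers={"X-Requested-By": "ambari"},
        method="POST",
    )


# Illustrative single-node layout; the component list is an assumption.
blueprint = build_blueprint(
    "single-node", "HDP", "2.2",
    {"host_group_1": ["NAMENODE", "DATANODE", "RESOURCEMANAGER", "NODEMANAGER"]},
)
request = build_register_request(blueprint)
```

Keeping the blueprint as plain data means the same document can be reviewed, versioned and reused for every cluster that should share the layout.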

Manage a Hadoop Cluster

Ambari provides tools to simplify cluster management. The Web interface allows you to control the lifecycle of Hadoop services and components, modify configurations and manage the ongoing growth of your cluster.
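The same lifecycle control is also reachable from scripts. The following is a minimal sketch, assuming a hypothetical server at `ambari.example.com` and a cluster called `demo`, of how a request to start or stop a service might be put together. The helper name is invented for this example, and the request is built but not sent.

```python
import json
import urllib.request

AMBARI_URL = "http://ambari.example.com:8080"  # hypothetical Ambari server

# In Ambari's state model a stopped service is reported as INSTALLED.
VALID_STATES = ("STARTED", "INSTALLED")


def build_service_state_request(cluster, service, state):
    """Build (but do not send) a PUT that asks for a new service state."""
    if state not in VALID_STATES:
        raise ValueError(f"unsupported target state: {state}")
    body = {
        "RequestInfo": {"context": f"Set {service} to {state} via REST"},
        "Body": {"ServiceInfo": {"state": state}},
    }
    return urllib.request.Request(
        url=f"{AMBARI_URL}/api/v1/clusters/{cluster}/services/{service}",
        data=json.dumps(body).encode("utf-8"),
        headers={"X-Requested-By": "ambari"},
        method="PUT",
    )


stop_hdfs = build_service_state_request("demo", "HDFS", "INSTALLED")
```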

Monitor a Hadoop Cluster

Gain instant insight into the health of your cluster. Ambari pre-configures alerts for watching Hadoop services and visualizes cluster operational data in a simple Web interface.

Integrate Hadoop with the Enterprise

Ambari provides a RESTful API that enables integration with existing tools, such as Microsoft System Center Operations Manager, HP Operations Manager and Teradata Viewpoint, to merge Hadoop with your established operational processes.
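To make the integration point concrete, here is a small sketch of how an external tool might query service states over the REST API and reduce the answer to a simple mapping. The server address, cluster name, credentials and sample response are assumptions for illustration, and the request object is constructed without being sent.

```python
import base64
import urllib.request


def build_services_request(base_url, cluster, user, password):
    """Build (but do not send) an authenticated GET for service states."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/clusters/{cluster}/services?fields=ServiceInfo/state",
        headers={"Authorization": f"Basic {token}"},
    )


def service_states(payload):
    """Map each service name to its reported state."""
    return {
        item["ServiceInfo"]["service_name"]: item["ServiceInfo"]["state"]
        for item in payload.get("items", [])
    }


# Hand-written sample in the shape this sketch expects from the server.
sample_response = {
    "items": [
        {"ServiceInfo": {"service_name": "HDFS", "state": "STARTED"}},
        {"ServiceInfo": {"service_name": "YARN", "state": "INSTALLED"}},
    ]
}

request = build_services_request("http://ambari.example.com:8080", "demo", "admin", "admin")
states = service_states(sample_response)
```

A monitoring or ticketing tool could poll such a mapping and raise its own events whenever a service leaves the `STARTED` state.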

 

Hadoop cluster provisioning and ongoing management can be a complicated task, especially when there are hundreds or thousands of hosts involved. Ambari provides a single control point for viewing, updating and managing Hadoop service life cycles, with these important features:

Feature | Benefit
Wizard-driven interface | Facilitates installation of Hadoop across any number of hosts
API-driven installations | Ambari Blueprints for automated provisioning
Granular service control | Precise management of Hadoop services and component lifecycles
Configuration change history | Ongoing management of Hadoop service configurations
RESTful APIs | Enables integration with enterprise systems
Extensible framework | Brings custom services under management via Ambari Stacks
Customizable user interface | Develop innovative user experiences via Ambari Views Framework
User Views | Advanced capabilities for cluster optimization and tuning for Hadoop DevOps

Hortonworks Collaboration for Ambari

Hortonworks is focused on going to market with a 100% open source solution. This focus allows us to collectively provide the product management guidance for Enterprise Grade Hadoop to mainstream enterprises and our partner ecosystem, and further innovate the core of Hadoop.

 

Open
Deliver a complete set of features for Hadoop operations, in public and with the community, by defining the operational framework and lifecycle. 
Integrated
Ensure that Hadoop operations can be integrated with existing IT tools, behind a single pane of glass, by providing REST APIs and multiple views of the cluster.
Initiative
Make Hadoop’s most complex operational challenges easy to manage with more insight and visibility into cluster performance.

Recent Improvements to Manage and Monitor Hadoop

Our completely open approach via Apache Ambari is unique, and we are excited to have HP, Pivotal and VMware join other data center leaders such as Microsoft and Teradata in supporting Ambari. This openness lets everyone use new features as soon as they are delivered, and the Ambari community is growing quickly. Apache Ambari 2.0 is now generally available and includes automated rolling upgrades for Hortonworks Data Platform 2.2 clusters, simplified administration of cluster security and a new Ambari Alerts framework. This release makes it simple to provision, manage and monitor enterprise-ready Hadoop clusters. For the complete list of new features, see the What’s New in Apache Ambari 2.0 presentation. Some of the biggest new features include the following:

Automated Rolling Upgrades for HDP

As enterprises everywhere adopt Hadoop, they deploy more and more mission-critical analytic applications. Because of these mission-critical workloads, the platform must incur little or no cluster downtime during upgrades from one version to the next. That means the Hadoop platform needs to be “rolling upgradeable” and that process needs to be automated. Ambari orchestrates a series of operations on the cluster (with checks along the way) that help you move components to a newer version in a rolling fashion, minimizing downtime and impact on the Hadoop users.
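The general idea of a rolling upgrade with checks along the way can be sketched in a few lines. This is a simplified illustration of the pattern, not Ambari's actual orchestration logic: the `upgrade` and `is_healthy` callbacks, the host names and the version strings are all hypothetical.

```python
def rolling_upgrade(hosts, upgrade, is_healthy):
    """Upgrade hosts one at a time, halting at the first failed check.

    Returns a pair (upgraded, remaining): the hosts that were upgraded
    and passed their check, and the hosts still waiting, starting with
    the one whose check failed.
    """
    upgraded = []
    for index, host in enumerate(hosts):
        upgrade(host)
        if not is_healthy(host):
            # Stop here so the rest of the cluster keeps serving
            # on the old version while the failure is investigated.
            return upgraded, list(hosts[index:])
        upgraded.append(host)
    return upgraded, []


# Toy in-memory cluster used only to exercise the function.
versions = {"node1": "2.1", "node2": "2.1", "node3": "2.1"}


def upgrade(host):
    versions[host] = "2.2"


def is_healthy(host):
    # Pretend node2 fails its post-upgrade check.
    return host != "node2"


done, pending = rolling_upgrade(["node1", "node2", "node3"], upgrade, is_healthy)
```

Because only one host is touched at a time, a failed check leaves every host that has not yet been reached running the old version.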

More details on rolling upgrades are available in this blog post.

Simplified, Comprehensive Hadoop Security

Ambari helps provision, manage and monitor Hadoop security in two ways. First, Ambari simplifies the setup, configuration and maintenance of Kerberos for strong authentication in the cluster. Second, Ambari includes support for installing and configuring Apache Ranger for centralized security administration, authorization and audit.

Kerberos has long been the central technology for enabling strong authentication for Hadoop, but configuring it has been a challenge, particularly creating the principals and keytabs. Ambari makes this easier with an automated, wizard-driven Kerberos configuration approach that eliminates time-consuming administration tasks. Ambari can work with your existing Kerberos infrastructure, including Active Directory, to automatically generate your cluster’s requisite principals and keytabs. Then, as you expand your cluster with more hosts or new services, Ambari can talk to your Kerberos infrastructure and automatically adjust the cluster configuration.
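To give a sense of why this bookkeeping is tedious by hand, the sketch below enumerates the service principals a small cluster would need, using the conventional Kerberos `service/host@REALM` naming. The host names, service short names and realm are made up for the example, and the function only illustrates the naming scheme rather than how Ambari itself generates principals.

```python
def principal_names(realm, services_by_host):
    """List one service principal per (host, service) pair.

    services_by_host maps a fully qualified host name to the short
    names of the services running on that host.
    """
    names = []
    for host in sorted(services_by_host):
        for service in sorted(services_by_host[host]):
            names.append(f"{service}/{host}@{realm}")
    return names


# Hypothetical two-host layout.
layout = {
    "master1.example.com": ["nn", "rm"],
    "worker1.example.com": ["dn", "nm"],
}
principals = principal_names("EXAMPLE.COM", layout)
```

Every added host or service grows this list, which is the kind of growth the automated adjustment described above is meant to absorb.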

Apache Ranger provides centralized management of access control services for administration, authorization and audit. Ranger was added as a GA component in Hortonworks Data Platform 2.2, and Ambari can now automatically install and configure Ranger with the rest of your cluster components.

Ambari Alerts

The enterprise Hadoop operator needs maximum visibility into the health of the cluster. As the operational framework for Hadoop, Ambari must provide that visibility out-of-the-box and also flexibly integrate with existing enterprise monitoring systems. Ambari Alerts aims to strike that balance between ease and flexibility.

Ambari Alerts provides centralized management of health alerts and checks for the services in your cluster. Ambari automatically configures the particular set of alerts based on the services installed. As a Hadoop operator, you have control over which alerts are enabled, their thresholds and their reporting output. For maximum flexibility, alert groups and multiple notification targets give you very granular control of the “who, what, why and how” around alerts. This puts both flexibility and power in the hands of the Hadoop operator. Ambari also exposes alerts REST API endpoints to enable integration with existing systems.
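As an illustration of consuming the alerts endpoints from an existing system, the following sketch filters an alerts response down to the entries an operator would act on and orders them by severity. The sample payload is hand-written in the shape this example assumes for a response from a path such as `/api/v1/clusters/<cluster>/alerts`; the definition names and the helper are hypothetical.

```python
# Lower number means more urgent.
SEVERITY_ORDER = {"CRITICAL": 0, "WARNING": 1, "UNKNOWN": 2, "OK": 3}


def actionable_alerts(payload, states=("CRITICAL", "WARNING")):
    """Return alerts in the given states, most severe first."""
    alerts = [item["Alert"] for item in payload.get("items", [])]
    selected = [alert for alert in alerts if alert.get("state") in states]
    return sorted(
        selected,
        key=lambda alert: SEVERITY_ORDER.get(alert["state"], len(SEVERITY_ORDER)),
    )


# Hand-written sample response for demonstration purposes.
sample_alerts = {
    "items": [
        {"Alert": {"definition_name": "disk_usage", "service_name": "HDFS", "state": "WARNING"}},
        {"Alert": {"definition_name": "web_ui", "service_name": "YARN", "state": "OK"}},
        {"Alert": {"definition_name": "process", "service_name": "HDFS", "state": "CRITICAL"}},
    ]
}

urgent = actionable_alerts(sample_alerts)
```

The filtered list could then be forwarded to whatever paging or ticketing system is already in place.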

Ambari User Views

Ambari 2.0 includes a new User View for Tez. The Tez View helps you understand and optimize your cluster resource usage. Using the view, you can optimize and accelerate individual SQL queries or Pig jobs to get the best performance in a multitenant Hadoop environment.

But that is just the beginning. It’s time to put a new face on Hadoop using the Ambari Views framework. It’s the only open source and open community effort designed to provide a compelling user experience for Hadoop while delivering consistent lifecycle management and security. Check out additional Ambari User Views in tech preview below.

Download Ambari 2.0 and Learn More

For additional details about this release review the following resources:

Recent Ambari Releases

Ambari Version Prior Enhancements
2.0.0
  • Automated Rolling Upgrades for HDP
  • Simplified Kerberos Setup
  • Ranger and Spark support
  • Ambari Alerts & Ambari Metrics
  • New Tez View
1.7.0
  • Configuration versioning and history
  • Introduced the Ambari Stacks “Stack Advisor” for configuration validation
  • Introduced Ambari Views Framework for customizable user interfaces
1.6.0
  • Introduced Ambari Blueprints for automating cluster installs
  • Improved usability guardrails with more host and environment checks
  • Support for PostgreSQL database

Hortonworks Focus for Ambari

The Ambari community is already hard at work improving its capabilities to provision, manage and monitor Hadoop clusters.

The community will continue to innovate Ambari so that its operational capabilities keep pace with Hadoop’s ever-expanding functionality for data management, data access, governance and security.

Ambari User Views Technical Preview

Most notably, the Ambari User Views contributions are being actively developed in the community. Ambari User Views are designed to provide capabilities that assist with the operational aspects of data application development and workload management. For more information on how to download and configure a Technical Preview of the Ambari User Views listed below, use this document.

 

Tech Preview User Views | Description
Hive

Hive View allows the user to write and execute SQL queries on the cluster. It shows the history of all Hive queries executed on the cluster, whether run from Hive View or from another source such as JDBC/ODBC or the CLI. It also provides a graphical view of the query execution plan, which helps the user debug the query for correctness and tune its performance. It integrates the Tez View, which allows the user to debug any Tez job, including monitoring the progress of a job (whether from Hive or Pig) while it is running.

You can find the source code on GitHub along with the associated repo for the jar. For details about how to download and configure the view, use the Ambari User Views Tech Preview doc.

Pig

Pig View is similar to the Hive View. It allows writing and running a Pig script. It has support for saving scripts, and loading and using existing UDFs in scripts.


Capacity Scheduler

Capacity Scheduler View helps a Hadoop operator set up YARN workload management easily to enable multi-tenant and multi-workload processing. This view provisions cluster resources by creating and managing YARN queues.


Files

Files View allows the user to manage, browse and upload files and folders in HDFS.

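The YARN queues that the Capacity Scheduler View creates and manages correspond to Capacity Scheduler configuration properties. As a rough sketch, assuming two hypothetical top-level queues named `etl` and `adhoc`, the function below derives the queue list and per-queue capacity properties and checks that the capacities add up to 100 percent. The helper name and queue names are invented for this example.

```python
def capacity_scheduler_properties(queues):
    """Derive Capacity Scheduler properties for top-level queues.

    queues maps a queue name to its capacity as a whole-number
    percentage of the cluster; the percentages must total 100.
    """
    total = sum(queues.values())
    if total != 100:
        raise ValueError(f"queue capacities must total 100, got {total}")
    properties = {"yarn.scheduler.capacity.root.queues": ",".join(queues)}
    for name, capacity in queues.items():
        properties[f"yarn.scheduler.capacity.root.{name}.capacity"] = str(capacity)
    return properties


properties = capacity_scheduler_properties({"etl": 70, "adhoc": 30})
```

Validating the total up front catches the most common mistake when splitting a cluster between tenants before the configuration ever reaches YARN.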

If you have questions or feedback on the User Views, please post them to the Ambari User View Forum.

It is exciting to see Ambari come together, and we are very interested in hearing feedback as these contributions mature. To make it easier for you to try them out, we have made the Ambari Operations and User Views available within the Hortonworks Sandbox. For questions and feedback on Ambari operations, please post to the Ambari Forum.


Apache Top-Level Project Since: December 2013
Hortonworks Committers: 39
