A completely open framework for provisioning, managing and monitoring Apache Hadoop clusters
Ambari offers an intuitive collection of tools and APIs that mask the complexity of Hadoop, simplifying the operation of clusters. Hortonworks, along with members of the Hadoop community, has answered the call to deliver the key services required for enterprise Hadoop.
What Ambari Does
Ambari enables system administrators to provision, manage and monitor a Hadoop cluster, and also to integrate Hadoop with the existing enterprise infrastructure.
Provision a Hadoop Cluster
No matter the size of your Hadoop cluster, Ambari simplifies the deployment and maintenance of hosts. Ambari includes an intuitive Web interface that allows you to easily provision, configure and test all the Hadoop services and core components. Ambari also provides the powerful Ambari Blueprints API for automating cluster installations without user intervention.
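As a rough illustration of a Blueprints-driven install, the sketch below builds the two JSON documents the workflow relies on (a blueprint that describes host groups, and a cluster creation template that maps real hosts onto those groups) and prepares, without sending, the matching REST requests. The server address `ambari.example.com`, the blueprint and cluster names, the host FQDNs, the stack version and the component layout are all assumptions made for the example. The layout is deliberately incomplete and would likely need more components before a real server accepted it.

```python
import json
import urllib.request

AMBARI = "http://ambari.example.com:8080/api/v1"  # hypothetical server address


def make_blueprint(stack_name, stack_version):
    """Describe the cluster layout as named host groups (illustrative layout only)."""
    return {
        "Blueprints": {"stack_name": stack_name, "stack_version": stack_version},
        "host_groups": [
            {
                "name": "master",
                "cardinality": "1",
                "components": [{"name": "NAMENODE"}, {"name": "RESOURCEMANAGER"}],
            },
            {
                "name": "workers",
                "cardinality": "1+",
                "components": [{"name": "DATANODE"}, {"name": "NODEMANAGER"}],
            },
        ],
    }


def make_cluster_template(blueprint_name, hosts_by_group):
    """Map concrete hosts onto the host groups declared in the blueprint."""
    return {
        "blueprint": blueprint_name,
        "host_groups": [
            {"name": group, "hosts": [{"fqdn": fqdn} for fqdn in hosts]}
            for group, hosts in hosts_by_group.items()
        ],
    }


def post_request(path, payload):
    """Prepare (but do not send) a POST; Ambari expects X-Requested-By on writes."""
    return urllib.request.Request(
        AMBARI + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-Requested-By": "ambari", "Content-Type": "application/json"},
        method="POST",
    )


# Step 1: register the blueprint. Step 2: create a cluster from it.
blueprint_req = post_request("/blueprints/demo-blueprint", make_blueprint("HDP", "2.3"))
cluster_req = post_request(
    "/clusters/demo",
    make_cluster_template(
        "demo-blueprint",
        {"master": ["m1.example.com"], "workers": ["w1.example.com", "w2.example.com"]},
    ),
)
for req in (blueprint_req, cluster_req):
    print(req.get_method(), req.full_url)
```

Authentication is left out on purpose. A real script would add credentials and pass each request to `urllib.request.urlopen`.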
Manage a Hadoop Cluster
Ambari provides tools to simplify cluster management. The Web interface allows you to control the lifecycle of Hadoop services and components, modify configurations and manage the ongoing growth of your cluster.
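The lifecycle operations offered in the Web interface can also be scripted against the REST API. The sketch below assumes the commonly documented convention that a service is started or stopped by a PUT that sets its desired state, with `STARTED` meaning running and `INSTALLED` meaning stopped. The server address, the cluster name `demo` and the helper name `service_state_request` are hypothetical, and the request is only built, not sent.

```python
import json
import urllib.request

AMBARI = "http://ambari.example.com:8080/api/v1"  # hypothetical server address

# Ambari models "stopped" as the INSTALLED state and "running" as STARTED.
DESIRED_STATE = {"start": "STARTED", "stop": "INSTALLED"}


def service_state_request(cluster, service, action):
    """Build a PUT request that asks Ambari to start or stop one service."""
    if action not in DESIRED_STATE:
        raise ValueError(f"unsupported action: {action!r}")
    body = {
        # The context string is a free-text label for the resulting operation.
        "RequestInfo": {"context": f"{action.capitalize()} {service} via REST"},
        "Body": {"ServiceInfo": {"state": DESIRED_STATE[action]}},
    }
    return urllib.request.Request(
        f"{AMBARI}/clusters/{cluster}/services/{service}",
        data=json.dumps(body).encode("utf-8"),
        headers={"X-Requested-By": "ambari"},
        method="PUT",
    )


request = service_state_request("demo", "HDFS", "stop")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```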
Monitor a Hadoop Cluster
Gain instant insight into the health of your cluster. Ambari pre-configures alerts for watching Hadoop services and visualizes cluster operational data in a simple Web interface.
Integrate Hadoop with the Enterprise
Ambari provides a RESTful API that enables integration with existing tools, such as Microsoft System Center Operations Manager, HP Operations Manager and Teradata Viewpoint, to merge Hadoop with your established operational processes.
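To make the integration point concrete, here is a small sketch of how an external tool might interpret a service listing returned by the REST API. It assumes the response has the shape returned for a query such as `GET /api/v1/clusters/demo/services?fields=ServiceInfo/state`. The sample payload, the cluster name and the function names are invented for illustration, and no network call is made.

```python
import json

# Hypothetical sample of a services listing; real responses carry more fields.
SAMPLE_RESPONSE = """
{
  "items": [
    {"ServiceInfo": {"cluster_name": "demo", "service_name": "HDFS", "state": "STARTED"}},
    {"ServiceInfo": {"cluster_name": "demo", "service_name": "YARN", "state": "INSTALLED"}},
    {"ServiceInfo": {"cluster_name": "demo", "service_name": "ZOOKEEPER", "state": "STARTED"}}
  ]
}
"""


def service_states(response_text):
    """Return a {service_name: state} mapping from a services listing."""
    document = json.loads(response_text)
    return {
        item["ServiceInfo"]["service_name"]: item["ServiceInfo"]["state"]
        for item in document.get("items", [])
    }


def not_running(states):
    """List services whose state is anything other than STARTED."""
    return sorted(name for name, state in states.items() if state != "STARTED")


states = service_states(SAMPLE_RESPONSE)
print("services needing attention:", not_running(states))
```

An operations console could poll such a listing and raise its own events for anything reported by `not_running`.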
Provisioning a Hadoop cluster and managing it over time can be complicated, especially when hundreds or thousands of hosts are involved. Ambari provides a single control point for viewing, updating and managing Hadoop service lifecycles, with these important features:
|Wizard-driven interface||Facilitates installation of Hadoop across any number of hosts|
|API-driven installations||Ambari Blueprints for automated provisioning|
|Granular service control||Precise management of Hadoop services and component lifecycles|
|Configuration change history||Ongoing management of Hadoop service configurations|
|RESTful APIs||Enable integration with enterprise systems|
|Extensible framework||Brings custom services under management via Ambari Stacks|
|Customizable user interface||Develop innovative user experiences via Ambari Views Framework|
|User Views||Advanced capabilities for cluster optimization and tuning for Hadoop DevOps|
Hortonworks Collaboration for Ambari
Hortonworks is focused on going to market with a 100% open source solution. This focus allows us to collectively provide the product management guidance for Enterprise Grade Hadoop to mainstream enterprises and our partner ecosystem, and further innovate the core of Hadoop.
- Open: Deliver a complete set of features for Hadoop operations, in public and with the community, by defining the operational framework and lifecycle.
- Integrated: Ensure that Hadoop operations can be integrated with existing IT tools, behind a single pane of glass, by providing REST APIs and multiple views of the cluster.
- Intuitive: Make Hadoop’s most complex operational challenges easy to manage with more insight and visibility into cluster performance.
Recent Improvements to Manage and Monitor Hadoop
Our completely open approach via Apache Ambari is unique, and we are excited to have HP, Pivotal and VMware jump on board to support Ambari alongside other data center leaders such as Microsoft and Teradata. This openness allows everyone to enjoy new features as they are delivered, and the Ambari community is developing at an amazing rate. Apache Ambari 2.1 includes automated rolling upgrades for Hortonworks Data Platform clusters, simplified administration of cluster security and a “guided configuration” experience. This release of Ambari makes it simple to provision, manage and monitor enterprise-ready Hadoop clusters. For the complete list of new features introduced with Apache Ambari 2.1, check out the What’s New in Apache Ambari 2.1 presentation. Some of the notable Ambari features include:
Automated Rolling Upgrades for HDP
As enterprises everywhere adopt Hadoop, they deploy more and more mission-critical analytic applications. Because of these mission-critical workloads, the platform must incur little or no cluster downtime during upgrades from one version to the next. That means the Hadoop platform needs to be “rolling upgradeable” and that process needs to be automated. Ambari orchestrates a series of operations on the cluster (with checks along the way) that help you move components to a newer version in a rolling fashion, minimizing downtime and impact on the Hadoop users.
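The general idea of a rolling upgrade with checks along the way can be sketched in a few lines. This is a conceptual illustration only, not Ambari's actual orchestration logic. The component names, version strings and the `upgrade` and `healthy` callbacks are hypothetical stand-ins for real cluster operations.

```python
def rolling_upgrade(components, upgrade, healthy):
    """Upgrade components one at a time, checking health before and after each.

    Returns (upgraded, failed): the components upgraded successfully, and the
    component at which the run stopped (None if every step passed).
    """
    upgraded = []
    for component in components:
        if not healthy(component):   # pre-check: never touch an unhealthy component
            return upgraded, component
        upgrade(component)
        if not healthy(component):   # post-check: stop before the problem spreads
            return upgraded, component
        upgraded.append(component)
    return upgraded, None


# Toy cluster state standing in for real hosts and services.
versions = {"zookeeper-1": "2.2", "namenode-1": "2.2", "datanode-1": "2.2"}


def upgrade(component):
    versions[component] = "2.3"


def healthy(component):
    return component in versions


upgraded, failed = rolling_upgrade(list(versions), upgrade, healthy)
print("upgraded:", upgraded, "stopped at:", failed)
```

Because only one component is touched at a time and the run halts at the first failed check, the rest of the cluster keeps serving users throughout.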
Simplified, Comprehensive Hadoop Security
Ambari helps provision, manage and monitor Hadoop security in two ways. First, Ambari includes support for installing and configuring Apache Ranger, which provides centralized security administration, authorization and audit. Ranger was added as a GA component in Hortonworks Data Platform 2.2, and Ambari can now automatically install and configure Ranger with the rest of your cluster components.
Secondly, Ambari simplifies the setup, configuration and maintenance of Kerberos for strong authentication in the cluster. Kerberos has long been the central technology for enabling strong authentication for Hadoop, but configuring it, and in particular creating the principals and keytabs, posed quite a challenge. Ambari makes this easier with an automated wizard-driven Kerberos configuration approach that eliminates time-consuming administration tasks. Ambari can work with your existing Kerberos infrastructure, including Active Directory, to automatically generate your cluster’s requisite principals and keytabs. Then, as you expand your cluster with more hosts or new services, Ambari can talk to your Kerberos infrastructure and automatically adjust the cluster configuration.
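To give a feel for the bookkeeping that this automation removes, the sketch below works out which service principals a cluster layout needs and which are still missing after a host is added. It assumes the usual Kerberos `service/host@REALM` naming. The short service names, host names, realm and function names are illustrative, and this is not a description of how Ambari itself computes principals.

```python
def required_principals(layout, realm):
    """layout maps a host FQDN to the service short names running on it."""
    return {
        f"{service}/{host}@{realm}"
        for host, services in layout.items()
        for service in services
    }


def missing_principals(layout, realm, existing):
    """Principals the layout needs that the KDC does not have yet."""
    return sorted(required_principals(layout, realm) - set(existing))


REALM = "EXAMPLE.COM"  # hypothetical realm
layout = {
    "m1.example.com": ["nn", "rm"],
    "w1.example.com": ["dn", "nm"],
}
existing = required_principals(layout, REALM)

# Expanding the cluster: a new worker needs its own principals (and keytabs).
layout["w2.example.com"] = ["dn", "nm"]
print(missing_principals(layout, REALM, existing))
```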
A Hadoop cluster certainly has its share of configurations. That is a double-edged sword: clusters are highly configurable and can be tuned at all levels, but knowing what to change, the recommended values, the boundaries, and the possible dependencies is a massive challenge.
The new Guided Configurations feature of Ambari helps eliminate that challenge. Guided Configurations make it clearer to the Hadoop operator what configuration changes are recommended and what the dependencies and boundaries of those changes are. This reduces the time and expertise needed to install and configure a Hadoop cluster. The following are a few of the key features found in the new Ambari Web user interface for Guided Configurations:
- The key configs are put first in the UI so that you can clearly see the “most” important configurations up-front.
- Configs are logically grouped so that related configs are shown in the UI near each other.
- The UI control is tailored for the value being modified. No more guessing values and units: the UI now shows more intuitive controls that better fit the most important configs.
- Recommended values are displayed, as well as the recommended “bounds” (max, min) to help limit the choices to only the best options.
- When you make a change, you are alerted to dependent changes that should happen so you know to take action.
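The behaviour described in the list above (recommended values, bounds and dependency alerts) can be pictured with a small model. The sketch below is a conceptual illustration rather than Ambari code. The `GuidedProperty` class, the `review_change` function and all numeric values are hypothetical, and the two YARN property names are used only as a plausible example of a dependency.

```python
from dataclasses import dataclass, field


@dataclass
class GuidedProperty:
    name: str
    recommended: int
    minimum: int
    maximum: int
    dependents: list = field(default_factory=list)  # settings to revisit on change


def review_change(prop, new_value):
    """Return the messages an operator would want to see for a proposed change."""
    messages = []
    if not prop.minimum <= new_value <= prop.maximum:
        messages.append(
            f"{prop.name}: {new_value} is outside the recommended bounds "
            f"[{prop.minimum}, {prop.maximum}]"
        )
    if new_value != prop.recommended:
        messages.append(f"{prop.name}: recommended value is {prop.recommended}")
    for dependent in prop.dependents:
        messages.append(f"review dependent setting {dependent}")
    return messages


# Hypothetical numbers, in megabytes.
nm_memory = GuidedProperty(
    name="yarn.nodemanager.resource.memory-mb",
    recommended=8192,
    minimum=1024,
    maximum=16384,
    dependents=["yarn.scheduler.maximum-allocation-mb"],
)

for message in review_change(nm_memory, 32768):
    print(message)
```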
A Hadoop cluster emits a tremendous amount of operational data as it stores and processes information. It is critical that a Hadoop operator knows which metrics best indicate the operational health of a cluster. Out of the box, Ambari pre-configures a set of service dashboards that highlights a set of critical metrics for the operator.
That’s a great start, but often, depending on workload type, operators want to customize the dashboard and show different metrics. The new Customizable Dashboards feature allows the operator to customize the service dashboards to display new metrics and charts that are tailored to their environment. This improves the operator’s visibility into cluster activity, which makes it easier to administer a Hadoop cluster and to bring new operators up to speed. With Customizable Dashboards, you can:
- Change the layout of the default dashboard widgets.
- Create new widgets built from Hadoop metrics.
- Share widgets in a Widget library for other operators to include in their dashboards.
Ambari User Views
It’s time to put a new face on Hadoop using the Ambari Views framework. A “view” is a way of extending Ambari that allows third parties to plug in new resource types along with the APIs, providers and UI to support them. Ambari is the only open source and open community effort designed to provide a compelling user experience for Hadoop while delivering consistent lifecycle management and security.
Most notably, the Ambari User Views contributions are being actively developed in the community. Ambari User Views are designed to provide capabilities that assist with the operational aspects of data application development and workload management. The following Ambari User Views are available with Apache Ambari 2.1.
|Tez||The Tez View helps you understand and optimize your cluster resource usage. Using the view, you can optimize and accelerate individual SQL queries or Pig jobs to get the best performance in a multi-tenant Hadoop environment.|
|Hive||Hive View allows the user to write and execute SQL queries on the cluster. It shows the history of all Hive queries executed on the cluster, whether run from Hive View or another source such as JDBC/ODBC or the CLI. It also provides a graphical view of the query execution plan, which helps the user debug the query for correctness and tune its performance. It integrates Tez View, which allows the user to debug any Tez job, including monitoring the progress of a job (whether from Hive or Pig) while it is running.|
|Pig||Pig View is similar to the Hive View. It allows writing and running a Pig script. It supports saving scripts, and loading and using existing UDFs in scripts.|
|Capacity Scheduler||Capacity Scheduler View helps a Hadoop operator set up YARN workload management easily to enable multi-tenant and multi-workload processing. This view provisions cluster resources by creating and managing YARN queues.|
|Files||Files View allows the user to manage, browse and upload files and folders in HDFS.|
Beyond these out-of-the-box User Views, there is a growing ecosystem of Ambari User Views being developed by the community. You can find community User Views in the Hortonworks Gallery.
Recent Ambari Releases
Download Ambari 2.2 and Learn More
For additional details about this release, review the following resources:
- Ambari 2.2 Documentation
- Ambari 2.2 Release Notes
- Ambari 2.2 Install Guide
- Learn more about Ambari Views, Ambari Blueprints and Ambari Stacks
Hortonworks' Focus for Ambari
The Ambari community is already hard at work improving its capabilities to provision, manage and monitor Hadoop clusters.
The community will continue to innovate Ambari so that its operational capabilities keep pace with Hadoop’s ever-expanding functionality for data management, data access, governance and security.
It is exciting to see Ambari come together, and we are very interested in hearing feedback as these contributions mature. To make it easier for you to try them out, we have made the Ambari Operations and User Views available within the Hortonworks Sandbox. For questions and feedback on Ambari operations, please post to the Ambari Forum. If you have questions or feedback on the User Views, please post them to the Ambari User View Forum.