Advances in Hadoop security, governance and operations have accelerated adoption of the platform by enterprises everywhere. Apache Ambari is the open source operational platform for provisioning, managing and monitoring Hadoop clusters from a single pane of glass, and with the Apache Ambari 1.7.0 release last year, Ambari made it far easier for enterprises to adopt Hadoop.
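Today, we are excited to announce the community release of Apache Ambari 2.0, which will further accelerate enterprise Hadoop usage by simplifying the technical challenges that slow adoption the most. Ambari 2.0 includes many features, the most notable of which are automated rolling upgrades for the HDP stack, simplified Kerberos setup together with Apache Ranger support, and Ambari Alerts.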
Many thanks to all of the contributors and committers who collaborated on this release and resolved more than 1,700 JIRA issues. For the complete list of new features, check out this What’s New in Ambari 2.0 presentation.
Enough of the chit-chat. Here are some details of the exciting new features in Apache Ambari 2.0.
The Hortonworks Dev team did a great job describing rolling upgrades in this blog post. To summarize: as enterprises everywhere adopt Hadoop, they deploy more and more mission-critical analytic applications. Because of these mission-critical workloads, the platform must undergo minimal to no cluster downtime during upgrades from one version to the next. That means the Hadoop platform needs to be “rolling upgradeable.”
The effort in the open source community to make the Hadoop platform rolling upgradeable goes beyond packaging (even though that is one of the key components of rolling upgrades). Developers need to consider API compatibility between components; the components need to be able to restart jobs already underway on the cluster; and the system needs to maintain high availability among the Hadoop components so that master components can switch over seamlessly during upgrades.
That’s a lot of work. But the Hortonworks Dev team brought it all together with Hortonworks Data Platform 2.2 and the Ambari Automated Rolling Upgrade for HDP Stack capability. This allows Hadoop operators to perform a rolling upgrade from one version of HDP to the next with minimal disruption to the cluster. Ambari orchestrates a series of operations on the cluster (with checks along the way) that help you move components to a newer version.
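The "series of operations with checks along the way" can be pictured with a small sketch. This is not Ambari's implementation: the function, the component order and the callbacks below are all hypothetical, and the only point is the shape of the loop, which upgrades one component at a time and refuses to continue past a failed check.

```python
from typing import Callable, List


def rolling_upgrade(
    components: List[str],
    upgrade: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Upgrade components one at a time, stopping at the first failed check.

    Returns the components that were upgraded and passed their check.
    """
    done: List[str] = []
    for name in components:
        upgrade(name)             # move this one component to the new version
        if not is_healthy(name):  # check before touching the next component
            raise RuntimeError(f"{name} failed its post-upgrade check; pausing")
        done.append(name)
    return done


if __name__ == "__main__":
    # Hypothetical component order and stand-in callbacks, for illustration only.
    order = ["ZOOKEEPER", "NAMENODE", "DATANODE"]
    versions = {}
    finished = rolling_upgrade(
        order,
        upgrade=lambda c: versions.update({c: "new"}),
        is_healthy=lambda c: versions.get(c) == "new",
    )
    print(finished)
```

Because only one component is in flight at any moment, the rest of the cluster keeps serving work, and a failed check leaves the operator with a clear point at which to pause.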
This only scratches the surface. Stay tuned for subsequent blogs with more details on automated rolling upgrades.
Ambari 2.0 helps provision, manage and monitor Hadoop security in two ways. First, Ambari now simplifies the setup, configuration and maintenance of Kerberos for strong authentication in the cluster. Second, Ambari now includes support for installing and configuring Apache Ranger for centralized security administration, authorization and audit.
Kerberos has long been the central technology for enabling strong authentication for Hadoop, but configuring it, and in particular creating the principals and keytabs, posed quite a challenge. Ongoing maintenance of those artifacts could be cumbersome.
Ambari 2.0 makes this easier with an automated wizard-driven Kerberos configuration approach that eliminates time-consuming administration tasks. Ambari can work with your existing Kerberos infrastructure, including Active Directory, to automatically generate your cluster’s requisite principals and keytabs. Then, as you expand your cluster with more hosts or new services, Ambari can talk to your Kerberos infrastructure and automatically adjust the cluster configuration.
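To get a feel for the bookkeeping being automated here, consider a rough sketch of working out which principals a cluster needs. It only assumes the standard Kerberos `service/host@REALM` naming form; the function names, host names, service short names and realm are hypothetical, and this is not how Ambari itself is implemented.

```python
from typing import Dict, List


def required_principals(
    host_services: Dict[str, List[str]], realm: str
) -> List[str]:
    """Return one service/host@REALM principal per service instance, sorted."""
    return sorted(
        f"{service}/{host}@{realm}"
        for host, services in host_services.items()
        for service in services
    )


def missing_principals(existing: List[str], required: List[str]) -> List[str]:
    """Principals that still need to be created, e.g. after adding a host."""
    have = set(existing)
    return [p for p in required if p not in have]


if __name__ == "__main__":
    # Hypothetical hosts, service short names and realm, for illustration only.
    realm = "EXAMPLE.COM"
    before = required_principals({"host1.example.com": ["nn", "dn"]}, realm)
    after = required_principals(
        {"host1.example.com": ["nn", "dn"], "host2.example.com": ["dn"]}, realm
    )
    print(missing_principals(before, after))
```

Every added host or service changes the required set, which is why doing this by hand (and then generating and distributing a keytab for each principal) gets tedious quickly.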
Apache Ranger is the other side of the security equation, providing centralized management of access control services for administration, authorization and audit. Ranger was added as a GA component in Hortonworks Data Platform 2.2, and with Ambari 2.0 it can now be automatically installed and configured with the rest of your cluster components.
Watch this blog for future posts digging deeper into Kerberos, Apache Ranger and comprehensive security support with Ambari 2.0.
The enterprise Hadoop operator needs maximum visibility into the health of the cluster. As the operational framework for Hadoop, Ambari must provide that visibility out-of-the-box and also flexibly integrate with existing enterprise monitoring systems. Ambari Alerts aims to strike that balance between ease and flexibility.
Ambari Alerts provides centralized management of health alerts and checks for the services in your cluster. Ambari automatically configures the particular set of alerts based on the services installed. As a Hadoop operator, you have control over which alerts are enabled, their thresholds and their reporting output. For maximum flexibility, alert groups and multiple notification targets give you very granular control of the “who, what, why and how” around alerts. This puts both flexibility and power in the hands of the Hadoop operator, who can now:
Ambari also exposes alerts REST API endpoints to enable integration with existing systems. There are a few integration patterns in the What’s New in Ambari 2.0 slides to give you a better sense of the possibilities. As one example of the Ambari community rallying around Alerts, our partners at SequenceIQ dove in head-first and have already integrated alerts with Periscope. Be sure to check out what they have done, since it’s a great example of community innovation in action.
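As a rough idea of what such an integration might look like, here is a minimal Python sketch that pulls alerts over REST and counts them by state. The endpoint path, the `fields` parameter and the `Alert/state` field name are assumptions written from memory of the Ambari REST API and should be checked against the API documentation for your version; the function names are hypothetical.

```python
import base64
import json
import urllib.request
from collections import Counter
from typing import Any, Dict


def fetch_alerts(base_url: str, cluster: str, user: str, password: str) -> Dict[str, Any]:
    """GET the cluster's alerts as JSON using HTTP basic authentication."""
    # Assumed endpoint and field names; verify against your Ambari version.
    url = f"{base_url}/api/v1/clusters/{cluster}/alerts?fields=Alert/state,Alert/label"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def count_by_state(payload: Dict[str, Any]) -> Counter:
    """Count alerts per state; items without a state are counted as UNKNOWN."""
    return Counter(
        item.get("Alert", {}).get("state", "UNKNOWN")
        for item in payload.get("items", [])
    )


if __name__ == "__main__":
    # Offline sample shaped like the assumed response, for illustration only.
    sample = {
        "items": [
            {"Alert": {"state": "OK", "label": "check one"}},
            {"Alert": {"state": "CRITICAL", "label": "check two"}},
            {"Alert": {"state": "OK", "label": "check three"}},
        ]
    }
    print(count_by_state(sample))
```

An existing monitoring system could call something like `fetch_alerts` on a schedule and forward the counts, or the individual alerts, into its own dashboards.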
The Ambari community is already hard at work improving Apache Ambari capabilities to provision, manage and monitor Hadoop clusters. Watch this blog for more news on enhancements to core features and extensibility features. But in the meantime, check out the community release of Ambari 2.0 with the following resources: