The Future of Ambari
What a difference a year makes! Last fall, Ambari was a nascent Apache project that had recently shipped its inaugural community release. Fast forward to the beginning of this year, when Ambari shipped the release that has become the foundation for rapid innovation. Now Ambari is a key member of the Apache Hadoop project ecosystem and a trusted operational platform for many companies.
Let’s take a brief look at the community’s amazing accomplishments over the past year, and then take some time to look forward.
A Year In Review
Since the Ambari 1.2.0 release in January 2013, the project has progressed at an amazing pace. The community-driven approach to development has enabled the delivery of over a half-dozen releases, with more than 2,000 JIRAs resolved and contributions from over 50 individuals. With the recent release of Ambari 1.4.1, Ambari now supports both the Hadoop 1 and Hadoop 2 stacks, the core pillars of the Hadoop ecosystem.
We are all well aware of this fact: Hadoop adoption is accelerating. As more and more enterprises extract valuable information with Hadoop, their clusters are getting bigger: more data, more workloads, more processing.
Growth at this pace makes Hadoop operations more complex. The Ambari team believes this rapid growth makes it more important for us to offer ways to easily manage this complexity. In addition to Hadoop operational tasks, we also have to offer simpler ways to add new services beyond the traditional “Hadoop Stack” and integrate monitoring & management of these new services into the Ambari management fabric and Ambari Web.
To promote this simplification, we plan to drive Ambari on three fronts: Extensibility, User Experience and Operational Insight.
Extensibility for the Community and Enterprise
Ambari already includes a robust REST API. This API provides a single point of access to Hadoop operational information and control, which greatly simplifies integration with Hadoop. The API has been successfully used with Teradata Viewpoint and the Ambari System Center Operations Manager (SCOM) Management Pack (thank you to the folks at Teradata and Microsoft for their contributions).
These “front-end” integrations via the REST API are critical, not just for software vendors to integrate their products with Hadoop but also for enterprises to integrate their tooling and infrastructure with Hadoop.
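To give a feel for what a “front-end” integration looks like, here is a minimal sketch in Python that prepares an authenticated request against the REST API and pulls cluster names out of the response. It assumes an Ambari server reachable at a hypothetical host on port 8080, HTTP Basic authentication, and a `/api/v1/clusters` resource whose response lists each cluster under `items` with a `Clusters.cluster_name` field. The helper names (`build_request`, `cluster_names`, `fetch_json`) are ours for illustration and are not part of Ambari.

```python
import base64
import json
import urllib.request


def build_request(base_url, path, user, password):
    """Build an authenticated GET request for an Ambari REST resource.

    Assumes the API is rooted at /api/v1 and accepts HTTP Basic auth.
    """
    url = base_url.rstrip("/") + "/api/v1/" + path.lstrip("/")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req = urllib.request.Request(url, method="GET")
    req.add_header("Authorization", "Basic " + token)
    return req


def cluster_names(payload):
    """Extract cluster names from a parsed /clusters response.

    Assumes the shape {"items": [{"Clusters": {"cluster_name": ...}}, ...]}.
    """
    return [item["Clusters"]["cluster_name"] for item in payload.get("items", [])]


def fetch_json(req, timeout=10):
    """Send the request and parse the JSON body (performs a network call)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

An external tool would call `build_request("http://ambari.example.com:8080", "clusters", user, password)`, pass the result to `fetch_json`, and hand the parsed body to `cluster_names`. Keeping request construction and response parsing separate from the network call makes the integration easy to test without a live cluster.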
But the Ambari community understands that many software vendors and enterprises also want Ambari to be able to manage more than “the traditional Stack”. Ambari needs to provide “back-end” extensibility. How easily can I plug a new service in to Ambari? What is the defined lifecycle for a service? What operational and monitoring services does Ambari make available for my component, to save me from having to do a custom build?
I’m glad to report that there is already a great deal of work occurring in the community towards defining Service “pluggability” (see AMBARI-2714 on dynamically adding a Service). This work will become the foundation for adding services to Ambari (quick shout out to the folks at Red Hat for their contributions).
We also want to make it easier to add complete Stacks to Ambari, hence the work on universal cluster layout definition (see AMBARI-1783 on cluster blueprints — shout out again to the folks at Microsoft for the contributions).
Bigger Operations with a Simplified User Experience
In years past, cluster operations could be handled with custom scripts and homegrown infrastructure. As clusters grow, that approach is proving to be a big task to manage. The size of operational teams has not grown at the same pace as the clusters, which means Ambari has to do two things: 1) continue to make the tools easier to use and 2) continue to make Hadoop easier to operate.
We expect to see the community spend time delivering growth-driven features with the goal of enabling a single Hadoop operator to perform the critical cluster administration functions without a lot of custom coding or having to call on an army of operations engineers.
More Operational Insight
Operational insight isn’t just about cluster health or access to operational metrics. It’s also about a historical picture that combines health, workloads and resources. Show me how my cluster is being used so I know I am getting the most out of my cluster resources. I need this information to predict and plan for growth in my cluster. We believe Ambari has to help operators answer these questions.
It’s been an exciting year and we are excited about the opportunities ahead of us. Thank you for the contributions and support thus far. We’ve only just begun.
The Ambari Team