Announcing Apache Hadoop 2.0.3 Release and Roadmap

 

As the Release Manager for hadoop-2.x, I’m very pleased to announce the next major milestone for the Apache Hadoop community, the release of hadoop-2.0.3-alpha!

2.0 Enhancements in this Alpha Release

This release delivers significant enhancements and improved stability over previous releases in the hadoop-2.x series. Notably, it includes:

  • Quorum Journal Manager (QJM) for HDFS NameNode HA (HDFS-3077) and related stability fixes to HDFS HA
  • Multi-resource scheduling (CPU and memory) for YARN (YARN-2, YARN-3 & friends)
  • YARN ResourceManager Restart (YARN-230)
  • Significant stability improvements at scale for YARN (over 30,000 nodes and 14 million applications so far, at time of release; the folks at Yahoo! have published more details)

Where is hadoop-2 and What is Left?

It is important to note that this release is still considered alpha, as a few items still need to be addressed before we enter beta in the next couple of months. Most importantly, some of the APIs, particularly the HDFS and YARN protobuf-based protocols, aren’t fully baked. Also note that there are some API changes from the previous hadoop-2.0.2-alpha release, so you will need to recompile your applications against the new hadoop-2.0.3-alpha. Please see the Hadoop 2.0.3-alpha release notes for details.

We are converging fast on ironing out the API issues (in both HDFS and YARN/MapReduce) and currently plan to cut a hadoop-2.0.4-beta release in the next couple of months, once this effort is complete. It also helps to have a major presence like Yahoo! test out hadoop-2 HDFS HA over the course of the coming months, as they’ve noted in their blog. The code base has also gone through significant churn and, as with any alpha, we expect to uncover some further issues as this testing continues.

There is still a lot of work ahead of us, but we believe that hadoop-2.0.4-beta will be a major step toward a fully stable, supported hadoop-2 release. Exciting times! Stay tuned!

Acknowledgements

As always, it’s a pleasure to work with everyone in the community. Thank *you* to everyone who has contributed to this release. A special mention goes to Todd Lipcon for his contributions to QJM for HDFS HA, and to the Yahoo! Hadoop team (Robert Evans, Thomas Graves, Daryn Sharp, Jason Lowe and everyone else) for their efforts in getting YARN to stability and into large-scale deployments on their clusters.

Arun C. Murthy
