Update on Apache Hadoop-0.23

There has been a lot of progress on hadoop-0.23. We’re continuing to crank through issues as we get ready to ship.

We are mostly past the initial challenges of moving our entire build infrastructure to Maven. Many thanks to Alejandro, Tom, Giri & Eric Yang for making it happen.

HDFS is nearly there:

  • HDFS Federation and client-side mount tables have been tested on ~300-node clusters with security on.
  • HDFS upgrades have been tested from 0.20.2xx.
  • Functional tests for HDFS are complete.

NextGen MapReduce (aka MRv2, aka YARN) is making great progress:

  • We are happy to report we’ve done extensive scale testing to confirm stability:
    • Sort/GridMixv3 etc. at ~350 nodes
    • Scale testing with simulated clusters of ~1500 nodes
  • Functional tests covering all MapReduce functionality
  • Pig (0.9 & 0.9.1) working with NextGen MapReduce
  • All of the above has been done with no regressions in security.

We expect to finish performance certification for both HDFS & MapReduce within the next couple of weeks. Once that is complete, we will start integration tests with HBase, Hive, Oozie, etc.

We fixed 75 bugs in September alone and have another 50 or so to go. At least four organizations contributed patches to MRv2 in September alone: Yahoo, Hortonworks, LinkedIn & Huawei.

Given our current state, I’m confident we will have a strong hadoop-0.23.0 release by late October. The current plan is to deploy to alpha clusters in November. Citius, Altius, Fortius! :)

Thanks to everyone who contributed and we look forward to continued help.

Arun C. Murthy (@acmurthy)

