Introducing Apache Hadoop YARN

Other posts in this series:
Introducing Apache Hadoop YARN
Apache Hadoop YARN – Background and an Overview
Apache Hadoop YARN – Concepts and Applications
Apache Hadoop YARN – ResourceManager
Apache Hadoop YARN – NodeManager

I’m thrilled to announce that the Apache Hadoop community has decided to promote the next-generation Hadoop data-processing framework, i.e. YARN, to be a sub-project of Apache Hadoop in the ASF!

Apache Hadoop YARN joins Hadoop Common (core libraries), Hadoop HDFS (storage) and Hadoop MapReduce (the MapReduce implementation) as a sub-project of Apache Hadoop, which is itself a Top Level Project in the Apache Software Foundation. Until this milestone, YARN was part of the Hadoop MapReduce project; it is now poised to stand on its own as a sub-project of Hadoop.

In a nutshell, Hadoop YARN is an attempt to take Apache Hadoop beyond MapReduce for data-processing.

As folks are aware, Hadoop HDFS is the data storage layer for Hadoop and MapReduce was the data-processing layer. However, the MapReduce algorithm, by itself, isn’t sufficient for the very wide variety of use-cases we see Hadoop being employed to solve. With YARN, Hadoop now has a generic resource-management and distributed application framework, whereby one can implement multiple data-processing applications customized for the task at hand. Hadoop MapReduce is now one such application for YARN, and from my vantage point I see several others coming: in the future you will see MPI, graph-processing, simple services, etc., all co-existing with MapReduce applications in a Hadoop YARN cluster.
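To make the idea of a generic resource-management layer a little more concrete, here is a minimal toy sketch in Python. It is not YARN's API or its scheduler: the `ToyResourceManager` class, its methods, and the application names are all hypothetical, and the "first come, first served" policy is an assumption made purely for illustration. It only shows the shape of the idea, i.e. that different kinds of applications ask one shared layer for containers instead of MapReduce owning the cluster.

```python
from dataclasses import dataclass, field


@dataclass
class ToyResourceManager:
    """Hypothetical stand-in for a cluster-wide resource manager (not YARN's API)."""

    total_containers: int
    # Maps an application name to the number of containers it currently holds.
    allocations: dict = field(default_factory=dict)

    def free(self) -> int:
        """Containers not currently held by any application."""
        return self.total_containers - sum(self.allocations.values())

    def request(self, app_name: str, wanted: int) -> int:
        """Grant up to `wanted` containers, limited by what is still free."""
        granted = min(wanted, self.free())
        if granted > 0:
            self.allocations[app_name] = self.allocations.get(app_name, 0) + granted
        return granted

    def release(self, app_name: str) -> None:
        """Return all containers held by a finished application."""
        self.allocations.pop(app_name, None)


# Three different kinds of application share the same toy cluster.
rm = ToyResourceManager(total_containers=10)
granted_mr = rm.request("mapreduce-job", 6)
granted_mpi = rm.request("mpi-job", 3)
granted_graph = rm.request("graph-job", 4)  # asks for more than is left
```

In this sketch the MapReduce job is just one more client of the shared layer; when it finishes and calls `release`, its containers become available to whatever application asks next.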

Implications for the Apache Hadoop Developer community

I’d like to take a brief moment to walk folks through the implications of making Hadoop YARN a sub-project, particularly for members of the Hadoop developer community.

  • We will now see a top-level hadoop-yarn-project source folder in Hadoop trunk.
  • We will now use a separate JIRA project for issue tracking for YARN, i.e. https://issues.apache.org/jira/browse/YARN
  • We will also use a new yarn-dev@hadoop.apache.org mailing list for collaboration.
  • We will continue to co-release a single Apache Hadoop release that will include the Common, HDFS, YARN and MapReduce sub-projects.

If you would like to play with YARN, please download the latest hadoop-2 release from the ASF and start contributing, either to the core YARN sub-project or by building your cool application on top!

Please do remember that hadoop-2 is still deemed alpha quality by the Apache Hadoop community, but YARN itself shows a lot of promise and we are excited by the future possibilities!

Conclusion

Overall, having Hadoop YARN as a sub-project of Apache Hadoop is a significant milestone for Hadoop, several years in the making. Personally, it is very exciting given that this journey started more than 4 years ago with https://issues.apache.org/jira/browse/MAPREDUCE-279. It’s a great pleasure, and an honor, to get to this point by collaborating with the fantastic community that is driving Apache Hadoop.

Kudos to everyone!
