Apache Hadoop YARN wins Best Paper award at SoCC 2013!

This post is from Vinod Kumar Vavilapalli of Hortonworks and Chris Douglas and Carlo Curino of Microsoft Research.

Great news from the Apache Hadoop YARN community! A paper describing Apache Hadoop YARN was accepted at the 2013 ACM Symposium on Cloud Computing (SoCC 2013), where it won the Best Paper award. Here are the title and abstract:

Title

Apache Hadoop YARN: Yet Another Resource Negotiator [Industrial Paper]

Abstract

The initial design of Apache Hadoop was tightly focused on running massive, MapReduce jobs to process a web crawl. For increasingly diverse companies, Hadoop has become the data and computational agorá—the de facto place where data and computational resources are shared and accessed. This broad adoption and ubiquitous usage has stretched the initial design well beyond its intended target, exposing two key shortcomings: 1) tight coupling of a specific programming model with the resource management infrastructure, forcing developers to abuse the MapReduce programming model, and 2) centralized handling of jobs’ control flow, which resulted in endless scalability concerns for the scheduler.

In this paper, we summarize the design, development, and current state of deployment of the next generation of Hadoop’s compute platform: YARN. The new architecture we introduced decouples the programming model from the resource management infrastructure, and delegates many scheduling functions (e.g., task fault-tolerance) to per-application components. We provide experimental evidence demonstrating the improvements we made, confirm improved efficiency by reporting the experience of running YARN on production environments (including 100% of Yahoo! grids), and confirm the flexibility claims by discussing the porting of several programming frameworks onto YARN viz. Dryad, Giraph, Hoya, Hadoop MapReduce, REEF, Spark, Storm, Tez.

You can access the full paper here.

Acknowledgements

We are proud of this award. With the Apache Hadoop 2 GA release right around the corner, recognition of its potential validates all the hard work that’s gone into the YARN project. We are equally humbled by the challenges still ahead of us, as we work to deliver on the promise of this platform. We hope this paper can open YARN to new audiences of developers and researchers; we welcome them to our community.

As you can see from the author list and the acknowledgements in the paper, this gigantic effort would not have been possible without the extraordinary work of so many people. YARN has been, and continues to be, a completely community-driven project. Our congratulations and thanks to everyone who contributed to YARN.

The full list of papers accepted into SoCC 2013 is here.


Comments

Subash DSouza | October 14, 2013 at 4:01 pm

Congratulations, guys! Great win for Hortonworks, Hadoop, and the community.
