June 26, 2013

HDP 2.0 Community Preview and launch of Hortonworks Certification Program for Apache Hadoop YARN

Four years ago, Arun Murthy filed a JIRA ticket (MAPREDUCE-279) proposing a re-architecture of the original MapReduce. The ticket outlined a set of capabilities that would let processes share resources more effectively, and an architecture that would allow Hadoop to extend beyond batch data processing.

It turned out that this ticket anticipated true enterprise requirements for Hadoop. As enterprise adoption accelerated, it became even clearer that supporting multiple processing models – moving beyond batch – was critical for Hadoop to broaden its applicability for mainstream usage in the modern enterprise architecture. The common pattern: enterprises want to store data in HDFS and then access it in a variety of ways, simultaneously, and with a consistent level of service. Hadoop must therefore support a range of interaction patterns, from batch to streaming to MPI and more.


Delivering multiple processing models with the YARN based architecture of Hadoop 2.0

This JIRA ticket ultimately resulted in a new branch of the open source Apache code trunk (Hadoop 2.0) and a new sub-project, Apache Hadoop YARN.

We’ve posted a series of blogs on the technical aspects of YARN, but in simplest terms YARN separates out the resource management capabilities that previously lived inside MapReduce, and thereby provides a framework for introducing a whole new range of processing engines. A simple graphical depiction is below, and shows that the YARN-based architecture of Hadoop 2.x is fundamentally different from the architecture of Hadoop 1.x.
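Alongside the diagram, the idea of pulling resource management out of a single engine can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyResourceManager`, its methods, and the notion of a fixed pool of identical containers are hypothetical names invented for illustration, not the YARN API. It shows two different kinds of workload drawing from one shared pool rather than each owning its own slots.

```python
from dataclasses import dataclass, field


@dataclass
class ToyResourceManager:
    """Hypothetical stand-in for a cluster-wide resource manager (not the YARN API)."""

    total_containers: int
    allocations: dict = field(default_factory=dict)

    def free(self) -> int:
        """Number of containers not currently granted to any application."""
        return self.total_containers - sum(self.allocations.values())

    def request(self, app: str, containers: int) -> int:
        """Grant up to `containers` containers to `app`; return how many were granted."""
        granted = min(containers, self.free())
        if granted > 0:
            self.allocations[app] = self.allocations.get(app, 0) + granted
        return granted

    def release(self, app: str) -> None:
        """Return all containers held by `app` to the shared pool."""
        self.allocations.pop(app, None)


# Two different processing engines share one pool of resources.
rm = ToyResourceManager(total_containers=10)
batch_granted = rm.request("mapreduce-job", 6)    # a batch workload
stream_granted = rm.request("streaming-job", 6)   # a streaming workload
print(batch_granted, stream_granted, rm.free())

# When the batch job finishes, its containers become available to any engine.
rm.release("mapreduce-job")
print(rm.free())
```

In this toy model the manager knows nothing about what each application does with its containers, which is the point of the separation: the processing logic belongs to the engine, and the allocation logic belongs to the shared layer.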

Announcing HDP 2.0 Community Preview

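With Hadoop 2.0 working its way through the community process at the Apache Software Foundation and soon to be released as Beta, today we are excited to make two significant announcements: the HDP 2.0 Community Preview and the launch of the Hortonworks Certification Program for Apache Hadoop YARN.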

The HDP 2.0 Community Preview is the first delivery to include YARN. It will enable us to work with an ecosystem of partners to advance this new technology over the coming weeks and months and ensure it is ready for mainstream usage.

We have already seen several announcements of YARN enablement from the community – including Storm/YARN from Yahoo! and Weave from Continuuity – and anticipate that the Hortonworks Certification Program for Apache Hadoop YARN will further broaden the range of applications that will be able to run natively in Hadoop.

Igniting an ecosystem of YARN application development

Most of all, we are excited about the ecosystem of applications that will result from this program and are proud to announce over 15 partners who have already joined, including Altiscale, Concurrent, Continuuity, DataTorrent, Elasticsearch, Karmasphere, Microsoft, MicroStrategy, Platfora, Red Hat, SAS, Splunk, Sqrrl, Tableau Software and TIBCO.


Today, the HDP 2.0 Community Preview is available for download as a single-node VM, and we will release a full preview distribution within the next week. The VM package, available from our website in our Sandbox form factor, also includes two tutorials: one on Apache Tez and one on YARN. It also includes the essential tutorials from Hortonworks Sandbox 1.3, so you can see the Stinger initiative in action.

Join us!



Pavan says:

When are you likely to release a Windows version of HDP 2.0?
