The Hortonworks Blog

More from Vinod Kumar Vavilapalli

We are right on the verge of some great celebrations of 10 years of Apache Hadoop! Hadoop Summit San Jose 2016 is almost here too, marking these celebrations! Held on June 28-30, 2016, it is the event for technical and business audiences to learn how big data continues to be a major force in transforming the […]

The Apache Hadoop community is happy to announce the release of Apache Hadoop 2.7.0! We want to express our gratitude to every contributor, reviewer and committer. The Hadoop community fixed 923 JIRAs in total as part of the 2.7.0 release. Of the 923 fixes: 259 were in Hadoop Common, 350 were in HDFS, 253 were […]

This is the third post in a series exploring recent innovations in the Hadoop ecosystem that are included in Hortonworks Data Platform (HDP) 2.2. In this post, we introduce the theme of supporting rolling upgrades and downgrades of a Hadoop YARN cluster. HDP 2.2 offers substantial innovations in Apache™ Hadoop YARN, enabling Hadoop users to […]

This is the second post in a series that explores recent innovations in the Hadoop ecosystem that are included in HDP 2.2. In this post, we introduce the theme of running service-workloads in YARN to set context for deeper discussion in subsequent blogs. HDP 2.2 brings substantial innovations in Apache Hadoop YARN, enabling users of […]

This is the first post in a series that explores recent innovations in the Hadoop ecosystem that are included in HDP 2.2. In this post, we introduce themes to set context for deeper discussion in subsequent blogs. HDP 2.2 represents another major step forward for Enterprise Hadoop. With thousands of enhancements across all elements of […]

Jian He (Apache Hadoop YARN committer) and I discuss Apache Hadoop YARN’s Resource Manager resiliency upon restart in this blog. This is the third blog post in the series on motivations and architecture for improvements to Apache Hadoop YARN’s Resource Manager (RM) resiliency. Others in the series are: Introduction: Apache YARN Resource Manager Resiliency […]

This is the second post in our series on the motivations and architecture for improvements to Apache Hadoop YARN’s Resource Manager Restart resiliency. Others in the series are: Introduction: Apache YARN Resource Manager Restart Resiliency Introduction: Phase I – Preserve Application-queues In the introductory blog, we previewed what RM Restart Phase I entails. In essence, […]

This is the first post in our series on the motivations and architecture for improvements to Apache Hadoop YARN’s Resource Manager Restart resiliency. Others in the series are: Resilience of Apache YARN Applications across ResourceManager Restart – Phase 1 Resilience of Apache Hadoop YARN across ResourceManager Restart – Phase 2 Resource Manager (RM) […]

User logs of Hadoop jobs serve multiple purposes. First and foremost, they can be used to debug issues while running a MapReduce application – correctness problems with the application itself, race conditions when running on a cluster, and task/job failures due to hardware or platform bugs. Secondly, one can do historical analyses of the […]

This post is authored by Omkar Vinit Joshi with Vinod Kumar Vavilapalli and is the ninth post in the multi-part blog series on Apache Hadoop YARN – a general-purpose, distributed, application management framework that supersedes the classic Apache Hadoop MapReduce framework for processing data in Hadoop clusters. Other posts in this series: Introducing Apache Hadoop […]

This post is authored by Omkar Vinit Joshi with Vinod Kumar Vavilapalli and is the eighth post in the multi-part blog series on Apache Hadoop YARN – a general-purpose, distributed, application management framework that supersedes the classic Apache Hadoop MapReduce framework for processing data in Hadoop clusters. Other posts in this series: Introducing Apache Hadoop […]

I’ve been sitting on this post for a while as Apache Hadoop 2 GA work was keeping me extremely busy. As they say, better late than never, so here we go 🙂 – the slides are at the end of the post. Three weeks ago, we had an Apache Hadoop YARN meetup at LinkedIn. Kind […]

This post is from Vinod Kumar Vavilapalli of Hortonworks and Chris Douglas and Carlo Curino of Microsoft Research. Great news from the Apache Hadoop YARN community! A paper describing Apache Hadoop YARN was accepted at the 2013 ACM Symposium on Cloud Computing (SoCC 2013), where it won the award for best paper! Here’s the title and abstract: […]

This post is authored by Jian He with Vinod Kumar Vavilapalli and is the seventh post in the multi-part blog series on Apache Hadoop YARN – a general-purpose, distributed, application management framework that supersedes the classic Apache Hadoop MapReduce framework for processing data in Hadoop clusters. Other posts in this series: Introducing Apache Hadoop YARN […]

This post is authored by Zhijie Shen with Vinod Kumar Vavilapalli. This is the sixth blog in the multi-part series on Apache Hadoop YARN – a general-purpose, distributed, application management framework that supersedes the classic Apache Hadoop MapReduce framework for processing data in Hadoop clusters. Other posts in this series: Introducing Apache Hadoop YARN Apache Hadoop […]