Viewing posts by: Vinod Kumar Vavilapalli

This is the second post in the Engineering @ Hortonworks blog series that explores how we in Hortonworks Engineering build, test and release new versions of our platforms. In this post, we deep dive into something that we are extremely excited about – Running a container cloud on YARN! We have been using this next-generation […]

This is the introductory post in a blog series that explores how we in Hortonworks Engineering build, test and release new versions of our platforms. In this post, we introduce the basic themes and set context for deeper discussions in subsequent blogs. We at Hortonworks are very proud of the work we do. Along with […]

This post introduces some of the talks and sessions from Dataworks Summit San Jose 2017 that cover the efforts of the Apache Hadoop YARN community. Come explore the latest of Apache Hadoop YARN at Dataworks Summit San Jose 2017! Dataworks Summit / Hadoop Summit San Jose 2017 is almost upon us! Held between June 13-15, […]

Thank you for reading our Data Lake 3.0 series! In part 1 of the series, we briefly introduced the power of leveraging prepackaged applications in Data Lake 3.0 and how the focus will shift from platform management to solving business problems. In this post, we further deliberate on this idea to help answer […]

We are right on the verge of some great celebrations of 10 years of Apache Hadoop! Hadoop Summit San Jose 2016 is almost here too, marking these celebrations! Held on June 28-30, 2016, it is the event for technical and business audiences to learn how big data continues to be a major force in transforming the […]

The Apache Hadoop community is happy to announce the release of Apache Hadoop 2.7.0! We want to express our gratitude to every contributor, reviewer and committer. The Hadoop community fixed 923 JIRAs in total as part of the 2.7.0 release. Of the 923 fixes: 259 were in Hadoop Common, 350 were in HDFS, 253 were […]

This is the third post in a series exploring recent innovations in the Hadoop ecosystem that are included in Hortonworks Data Platform (HDP) 2.2. In this post, we introduce the theme of supporting rolling upgrades and downgrades of a Hadoop YARN cluster. HDP 2.2 offers substantial innovations in Apache™ Hadoop YARN, enabling Hadoop users to […]

This is the second post in a series that explores recent innovations in the Hadoop ecosystem that are included in HDP 2.2. In this post, we introduce the theme of running service-workloads in YARN to set context for deeper discussion in subsequent blogs. HDP 2.2 brings substantial innovations in Apache Hadoop YARN, enabling users of […]

This is the first post in a series that explores recent innovations in the Hadoop ecosystem that are included in HDP 2.2. In this post, we introduce themes to set context for deeper discussion in subsequent blogs. HDP 2.2 represents another major step forward for Enterprise Hadoop. With thousands of enhancements across all elements of […]

Jian He (Apache Hadoop YARN committer) and I discuss Apache Hadoop YARN’s Resource Manager resiliency upon restart in this blog. This is the third blog post in the series on motivations and architecture for improvements to Apache Hadoop YARN’s Resource Manager (RM) resiliency. Others in the series are: Introduction: Apache YARN Resource Manager Resiliency […]

This is the second in our series on the motivations and architecture for improvements to Apache Hadoop YARN’s Resource Manager Restart resiliency. Others in the series are: Introduction: Apache YARN Resource Manager Restart Resiliency Introduction: Phase I – Preserve Application-queues. In the introductory blog, we previewed what RM Restart Phase I entails. In essence, […]

This is the first post in our series on the motivations and architecture for improvements to Apache Hadoop YARN’s Resource Manager Restart resiliency. Others in the series are: Resilience of Apache YARN Applications across ResourceManager Restart – Phase 1; Resilience of Apache Hadoop YARN across ResourceManager Restart – Phase 2. Resource Manager (RM) […]

User logs of Hadoop jobs serve multiple purposes. First and foremost, they can be used to debug issues while running a MapReduce application – correctness problems with the application itself, race conditions when running on a cluster, and debugging task/job failures due to hardware or platform bugs. Secondly, one can do historical analyses of the […]

This post is authored by Omkar Vinit Joshi with Vinod Kumar Vavilapalli and is the ninth post in the multi-part blog series on Apache Hadoop YARN – a general-purpose, distributed, application management framework that supersedes the classic Apache Hadoop MapReduce framework for processing data in Hadoop clusters. Other posts in this series: Introducing Apache Hadoop […]

This post is authored by Omkar Vinit Joshi with Vinod Kumar Vavilapalli and is the eighth post in the multi-part blog series on Apache Hadoop YARN – a general-purpose, distributed, application management framework that supersedes the classic Apache Hadoop MapReduce framework for processing data in Hadoop clusters. Other posts in this series: Introducing Apache Hadoop […]