Building YARN Apps for Hadoop

Develop with YARN as the Data Operating System for Enterprise Hadoop

YARN is the data operating system of Hadoop that enables you to process data simultaneously in multiple ways. YARN provides the resource management and pluggable architecture that enable a wide variety of data access methods to operate on data stored in Hadoop with predictable performance and service levels.

Develop with YARN Engines

Engines such as Apache Tez and Apache Slider provide powerful frameworks for rapidly integrating third-party processing and services. The YARN APIs can also be used natively where complete control is needed. As a developer, you can choose the option that suits your needs.


Apache Tez

Apache™ Tez generalizes the MapReduce paradigm to a more powerful framework for executing a complex DAG (directed acyclic graph) of tasks. By eliminating unnecessary tasks, synchronization barriers, and reads from and writes to HDFS, Tez speeds up data processing across both small-scale, low-latency and large-scale, high-throughput workloads. More about Tez »
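To make the DAG idea concrete, here is a minimal sketch in plain Python. It does not use the Tez API; the names run_dag, TASKS and DEPS are hypothetical and exist only for this illustration. Each task runs once all of its upstream tasks have finished, and intermediate results are handed directly from one task to the next in memory, which loosely mirrors how a DAG engine can avoid writing to HDFS between stages.

```python
from collections import Counter
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dag(tasks, deps, source):
    """Run every task after all of its upstream tasks have finished.

    tasks:  name -> function that takes a list of upstream outputs
    deps:   name -> list of upstream task names (in argument order)
    source: input handed to tasks that have no upstream tasks
    Returns (results by task name, order in which tasks ran).
    """
    results = {}
    order = []
    # static_order() yields each node only after all of its predecessors.
    for name in TopologicalSorter(deps).static_order():
        upstream = deps.get(name, [])
        inputs = [results[d] for d in upstream] if upstream else [source]
        results[name] = tasks[name](inputs)  # output stays in memory
        order.append(name)
    return results, order


# Hypothetical four-task graph: one source, two parallel branches, one sink.
TASKS = {
    "tokenize": lambda ins: ins[0].split(),
    "count": lambda ins: Counter(ins[0]),
    "lengths": lambda ins: [len(word) for word in ins[0]],
    "report": lambda ins: (ins[0]["to"], max(ins[1])),
}
DEPS = {
    "tokenize": [],
    "count": ["tokenize"],
    "lengths": ["tokenize"],
    "report": ["count", "lengths"],
}

if __name__ == "__main__":
    results, order = run_dag(TASKS, DEPS, "to be or not to be")
    print(order)
    print(results["report"])
```

A classic MapReduce job would express this as a chain of separate map and reduce jobs with storage between them; in the DAG form, "count" and "lengths" both consume the output of "tokenize" directly, and "report" joins the two branches. A real engine would also schedule the independent branches in parallel across containers, which this single-process sketch does not attempt.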

Apache Slider

Apache™ Slider is an engine that runs other applications in a YARN environment. With Slider, distributed applications that aren’t YARN-aware can now participate in the YARN ecosystem – usually with no code modification. Slider allows applications to use Hadoop’s data and processing resources, as well as the security, governance, and operations capabilities of enterprise Hadoop.

Data processing engines such as Apache Hive, HBase and Storm already take advantage of the available YARN APIs and engines, which makes them more powerful and versatile than ever before.

Applications integrating with Slider and Tez are eligible for certification in the YARN Ready program.

Develop with YARN APIs

YARN has become the data operating system for Hadoop and is the architectural center for development of Hadoop-based applications. The resources below can help you understand the YARN-based architecture of Hadoop 2 and how to build apps that can take full advantage of the possibilities.


Get an overview of Apache Hadoop YARN concepts in this slide deck.

STEP 1. Understand the motivations and architecture for YARN.

Concepts

Building Apps

STEP 2. Explore example applications on YARN.

The simple applications in this section show how to build and deploy apps against the YARN APIs and are a simple way to get started. These apps can be easily replicated in the Hortonworks Sandbox VM environment.

  • Simple YARN App. This ‘Hello World’ app for YARN runs n copies of a Unix command.
  • Distributed Shell. This fuller example implements a distributed shell on YARN.
  • MemcacheD on YARN. A tutorial showing how to deploy the very popular MemcacheD framework on YARN.

STEP 3. Examine real-world applications on YARN.

These richer applications are built on YARN and demonstrate real-world use and deployment.

Further Resources

The following resources can also assist with developing Hadoop-based Apps on YARN.

Companies using YARN
