Tools and applications that are YARN Ready have been verified to work within YARN, which means they are able to use the resources of the customer’s Hadoop system to process Hadoop data in-place, without interfering with other YARN Ready tools and applications.

Apache Hadoop YARN is the data operating system for Hadoop 2, responsible for managing access to Hadoop’s critical resources. YARN enables a user to interact with all data in multiple ways simultaneously, making Hadoop a true multi-use data platform and allowing it to take its place in a modern data architecture. Customers building a data lake expect to operate on the data without moving it to other systems, leveraging the processing resources of the data lake. Applications that use YARN fulfill that promise, lowering operational costs while improving quality and time-to-insight.

HOW DO YOU BECOME YARN READY?

The first step is to determine the appropriate integration framework for your application. If it uses YARN natively, or uses a YARN framework (e.g. Apache Tez or Apache Slider), you’re well on your way. If it does not use YARN (e.g. it reads directly from HDFS), you’ll want to look into the growing number of approaches that allow you to move to a YARN-based application.

There are four options available for integration into YARN:

Style                      | Engine      | Example
Full control               | YARN Native | Custom or packaged app where fine grained control of cluster resources is required
Batch (Legacy)             | MapReduce   | Existing MapReduce application written for the 1.x code line
Batch or Interactive Query | Tez         | Business Intelligence or analytic applications that optimize throughput while reducing latency
Online or Realtime Service | Slider      | “Always-on” services, e.g. online or streaming applications
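
As a rough illustration, the four options above can be written down as a small lookup that maps an application's processing style to the engine the table suggests. This is only a sketch: the class `YarnIntegrationGuide`, the enum `Option`, and the method `engineFor` are hypothetical names invented for this example and are not part of Hadoop, YARN, Tez, or Slider.

```java
import java.util.Optional;

/**
 * Hypothetical helper that mirrors the integration table above.
 * None of these names come from a Hadoop API; they exist only for illustration.
 */
final class YarnIntegrationGuide {

    /** One row of the table: a processing style and its suggested engine. */
    enum Option {
        FULL_CONTROL("Full control", "YARN Native"),
        BATCH_LEGACY("Batch (Legacy)", "MapReduce"),
        BATCH_OR_INTERACTIVE_QUERY("Batch or Interactive Query", "Tez"),
        ONLINE_OR_REALTIME_SERVICE("Online or Realtime Service", "Slider");

        final String style;
        final String engine;

        Option(String style, String engine) {
            this.style = style;
            this.engine = engine;
        }
    }

    private YarnIntegrationGuide() {
        // Static utility class; not meant to be instantiated.
    }

    /**
     * Returns the engine listed for the given style, or an empty Optional
     * when the style does not match any row of the table.
     * Matching ignores case and surrounding whitespace.
     */
    static Optional<String> engineFor(String style) {
        if (style == null) {
            return Optional.empty();
        }
        String wanted = style.trim();
        for (Option option : Option.values()) {
            if (option.style.equalsIgnoreCase(wanted)) {
                return Optional.of(option.engine);
            }
        }
        return Optional.empty();
    }
}
```

For instance, `YarnIntegrationGuide.engineFor("Batch or Interactive Query")` yields an `Optional` containing `"Tez"`, while an unrecognized style yields an empty `Optional`, leaving the caller to decide how to handle applications that do not fit any of the four rows.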