July 23, 2014
Apache Hadoop YARN: Present and Future

Although Hadoop Summit San Jose 2014 has come and gone, its invaluable content (keynotes, sessions, and tracks) is available here. We've selected a few sessions for Hadoop developers, practitioners, and architects, curating them under the theme of Apache Hadoop YARN, the architectural center of Hadoop and its data operating system.

In most of the keynotes and tracks three themes resonated:

  1. Enterprises are transitioning from traditional Hadoop to modern Hadoop 2.
  2. YARN is an enabler, the central orchestrator that facilitates multiple workloads, runs multiple data engines, and supports multiple access patterns—batch, interactive, streaming, and real-time—in Apache Hadoop 2.
  3. Apache Hadoop 2, as part of a Modern Data Architecture (MDA), is enterprise-ready.

    Multiple Workloads, Multiple Access Patterns

    Whether your data are at rest or in motion, enterprise demands for multiple data access patterns, multiple workloads, and efficient resource management have driven Hadoop’s evolution over the last two years. At the core of that evolution is YARN, a data operating system with a pluggable framework that manages resources for all the data access engines running natively on it.

    YARN has fundamentally changed Hadoop. In a way, YARN’s evolution and potential remind me of how UNIX liberated us from a single-process, single-user model and gave us a multi-process, multi-user pluggable platform: think of shared libraries (.so) in UNIX, pluggable Linux kernel modules (.ko), and myriad data applications running on top of the same platform. In this case, the platform is a Hadoop 2 cluster.

    In short, YARN supports multiple programming paradigms. You are not confined to MapReduce. Instead, you can implement purpose-built programming paradigms as YARN-native applications for your distributed tasks.
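
    To make the idea of several paradigms sharing one resource manager concrete, here is a minimal toy model in Python. It is only a sketch and does not use the real YARN API: `ToyResourceManager`, `Container`, `run_batch_sum`, `run_streaming_totals`, and the memory sizes are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Container:
    """A slice of cluster memory granted to one application (toy model)."""
    app_name: str
    memory_mb: int


@dataclass
class ToyResourceManager:
    """Hypothetical stand-in for a cluster-wide scheduler; NOT the YARN API."""
    total_memory_mb: int
    granted: List[Container] = field(default_factory=list)

    def available_mb(self) -> int:
        # Free memory is the total minus everything currently granted.
        return self.total_memory_mb - sum(c.memory_mb for c in self.granted)

    def request(self, app_name: str, memory_mb: int) -> Optional[Container]:
        """Grant a container if capacity remains, otherwise return None."""
        if memory_mb <= 0 or memory_mb > self.available_mb():
            return None
        container = Container(app_name, memory_mb)
        self.granted.append(container)
        return container

    def release(self, container: Container) -> None:
        """Return a container's memory to the shared pool."""
        self.granted.remove(container)


def run_batch_sum(rm: ToyResourceManager, records: List[int]) -> Optional[int]:
    """Batch-style app: process all records in one container, then release it."""
    container = rm.request("batch", 2048)  # assumed size, for illustration only
    if container is None:
        return None
    try:
        return sum(records)
    finally:
        rm.release(container)


def run_streaming_totals(rm: ToyResourceManager,
                         records: List[int]) -> Optional[List[int]]:
    """Streaming-style app: emit a running total after each record."""
    container = rm.request("streaming", 512)  # assumed size, for illustration only
    if container is None:
        return None
    try:
        totals, running = [], 0
        for record in records:
            running += record
            totals.append(running)
        return totals
    finally:
        rm.release(container)
```

    Both functions follow different processing styles but obtain capacity from the same pool, which is the point the paragraph above makes; a real YARN application would negotiate containers through YARN's own client libraries instead.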

    These two tweets sum up the sentiment around YARN:

    [Image: tweet_1]

    [Image: tweet_2]

    Apache Hadoop 2 and Apache Hadoop YARN

    Here are a few keynotes and sessions that speak to these themes:

    | Session Title | Watch | View |
    |---|---|---|
    | Unlocking Hadoop’s Potential | Video | |
    | Enterprise Hadoop for Pools, Ponds, Clouds and Beyond | Video | |
    | Apache Hadoop YARN: Present and Future | Video | Slides |
    | YARN: The Key to Overcoming the Challenges of Broad-based Hadoop Adoption | Video | Slides |
    | One Grid to Rule them All: Building a Multi-tenant Data Cloud with YARN | Video | Slides |
    | Lessons Learned from Migration of a Large-analytics Platform from MPP Databases to Hadoop YARN | Video | Slides |

    We cherry-picked these few sessions because they explore the potential of YARN. However, you can peruse all the tracks in the schedule’s session descriptions, for any time slot on any day, and pick whatever piques your curiosity. For example, when you click on a session description, a popup lets you either watch the video or view the slides.

    What’s Next?

    You can learn more about YARN by
