June 26, 2014

Making Apache Spark YARN Ready

Spark on YARN

Hadoop 2 and its YARN-based architecture have ushered in a new wave of innovation in and around Hadoop. One technology benefiting from this maturation is Apache Spark. Spark is a powerful engine for building and executing iterative algorithms for advanced analytics such as clustering and classification of datasets.

In early May, we made Spark available as a Technology Preview download for use with Hortonworks Data Platform 2.1, and in June we announced our broader “YARN Ready” program aimed at accelerating the number of data processing solutions that take advantage of YARN as the architectural center of Hadoop.

Today, we announce certification of Apache Spark as YARN Ready. This certification ensures that memory- and CPU-intensive Spark-based applications can co-exist within a single Hadoop cluster with all the other workloads you have deployed. As a result, you can use a single cluster with a single set of data for multiple purposes rather than siloing your Spark workloads in a separate cluster.

This means you can deploy interactive SQL query applications with Hive and low-latency applications using HBase alongside your iterative machine learning workloads deployed using Spark. As such, you eliminate the need for a separate system or a separate set of resources for your data science work.

Certifying Spark as “YARN Ready” assures end users deploying data lakes that their YARN-based applications, including Spark applications, work cooperatively with predictable performance.

The Hortonworks Commitment

Hortonworks’ tech preview of Apache Spark is part of a larger initiative that will bring the best of heterogeneous, tiered storage, and resource-based models of computing together with the broader Hadoop community starting at the core of HDFS and working outward and upward. Certifying Spark as YARN Ready, integrating Spark with Ambari so it’s easily provisioned, managed, and monitored, and integrating Spark with XA Secure (our recent security-related acquisition) for centralized authentication and audit are just some of the efforts that prepare Spark for use within a broader Enterprise Hadoop platform.

Our focus remains on delivering a fast, safe, scalable, and manageable data platform on a consistent footprint that includes HDFS, YARN, Tez, Ambari, Knox, Falcon, and Spark, to name just a few of the critical components of enterprise Hadoop. Working within this comprehensive set of components, we make Apache Spark “enterprise ready” so that our customers can confidently adopt it.

Dave Moore says:

Spark 1.0 or 0.9? Great news either way.
