Announcing HDP 2.1 Tech Preview Component: Apache Spark

Lighting Up Machine Learning & Data Science

Hadoop 2 and its YARN-based architecture have increased interest in new engines that run on Hadoop. One such workload is in-memory computing for machine learning and data science use cases. Apache Spark has emerged as an attractive option for this type of processing, and today we announce the availability of our HDP 2.1 Tech Preview Component of Apache Spark. This is a key addition to the platform and brings another workload supported by YARN on HDP.

There has been a marked increase in interest in Apache Spark among data scientists and enthusiasts as they explore new ways to perform their unique yet complex tasks in Hadoop. Our customers are investigating this technology because Spark allows key resources to implement iterative algorithms for advanced analytics, such as clustering and classification of datasets, effectively and simply. It provides three key value points to developers:

  • in-memory compute for iterative workloads,
  • a simplified programming model in Scala,
  • and machine learning libraries to simplify programming.
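
To make the first point concrete, here is a minimal sketch of an iterative clustering algorithm (k-means). It is plain Python rather than Spark code, and the function names, sample points, and starting centers are all made up for illustration. What it shows is the access pattern: every iteration makes a full pass over the same dataset, which is the kind of workload that benefits when the data stays in memory between passes.

```python
# Illustrative only: plain Python, not Spark. Names and data are hypothetical.
# Every iteration re-reads the same dataset, which is why keeping it in
# memory helps iterative workloads.

def closest(point, centers):
    """Return the index of the center nearest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centers[i])),
    )

def kmeans(points, centers, iterations=10):
    """Run a fixed number of k-means iterations over an in-memory list of points."""
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        # Assignment step: one full pass over the dataset.
        groups = [[] for _ in centers]
        for point in points:
            groups[closest(point, centers)].append(point)
        # Update step: move each center to the mean of its group.
        # A center with no assigned points is left where it is.
        centers = [
            tuple(sum(dim) / len(group) for dim in zip(*group)) if group else center
            for group, center in zip(groups, centers)
        ]
    return centers

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centers = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
```

With ten iterations the sample data is scanned ten times. If each scan had to reload the data from disk, that cost would be paid on every iteration.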

The Hortonworks Commitment

Hortonworks’ tech preview of Apache Spark is part of a larger initiative that will bring the best of heterogeneous, tiered storage, and resource-based models of computing together with the broader Hadoop community starting at the core of HDFS and working outward from there.

Our focus remains on delivering a fast, safe, scalable, and manageable platform on a consistent footprint that includes HDFS, YARN, Tez, Ambari, Knox, and Falcon to name just a few of the critical components of enterprise Hadoop.  We are working within this comprehensive set of components and hope to make Apache Spark “enterprise ready” so that our customers can confidently adopt it.


We have already completed baseline work to integrate Spark on YARN with HDP, and we want to make this available so that we can work with customers to identify the use cases best suited to this technology. Spark has the potential to meet the complex requirements of data science and machine learning use cases. We encourage you to download our Tech Preview today and engage with us so we can build out this key technology together.

Evaluate Spark on HDP Here

Abel Coronado
July 9, 2014 at 8:51 pm

Hi !!!

We want to deploy spark 1.0.0 in a HDP cluster, is it possible to have a step by step guide???

July 16, 2014 at 4:08 pm

Right now the Spark 0.9.1-based tech preview is out, which provides step-by-step instructions to evaluate Spark.

Very shortly we will provide an update of this tech preview with the latest version of Spark.

February 10, 2016 at 6:05 am

good post and informative one.keep posting this kind of stuff.
thank you.
