Announcing HDP 2.1 Tech Preview Component: Apache Spark

Lighting Up Machine Learning & Data Science

Hadoop 2 and its YARN-based architecture have increased interest in running new engines on Hadoop. One such workload is in-memory computing for machine learning and data science use cases. Apache Spark has emerged as an attractive option for this type of processing, and today we announce the availability of our HDP 2.1 Tech Preview Component of Apache Spark. This is a key addition to the platform and brings another YARN-supported workload to HDP.

There has been a marked increase in interest in Apache Spark among data scientists and enthusiasts as they explore new ways to perform their unique yet complex tasks in Hadoop. Our customers are investigating this technology because Spark lets key resources implement iterative algorithms for advanced analytics, such as clustering and classification of datasets, effectively and simply. It provides three key value points to developers:

  • in-memory compute for iterative workloads,
  • a simplified programming model in Scala,
  • and machine learning libraries to simplify programming.
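
To make the first point concrete, here is a minimal sketch of an iterative clustering loop. It is plain Python, not Spark code, and the function name `kmeans_1d` and the sample data are hypothetical. It only illustrates that every pass of an algorithm like k-means scans the same dataset again, which is why keeping that dataset in memory between passes matters.

```python
# Plain-Python illustration (an assumption for this post, not Spark's API):
# k-means on 1-D points, run for a fixed number of passes.
def kmeans_1d(points, centers, iterations=10):
    """Return updated cluster centers after repeated passes over `points`."""
    centers = list(centers)
    for _ in range(iterations):
        # Every pass re-reads the whole dataset. An in-memory engine keeps
        # the data resident instead of fetching it from disk each time.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


# Hypothetical sample data: two obvious clusters, near 1 and near 9.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
result = kmeans_1d(points, centers=[0.0, 5.0])
```

With this sample data the two centers settle at roughly 1.0 and 9.0 after the first pass, and the remaining passes rescan the same six points without changing the result.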

The Hortonworks Commitment

Hortonworks’ tech preview of Apache Spark is part of a larger initiative that will bring the best of heterogeneous, tiered storage, and resource-based models of computing together with the broader Hadoop community starting at the core of HDFS and working outward from there.

Our focus remains on delivering a fast, safe, scalable, and manageable platform on a consistent footprint that includes HDFS, YARN, Tez, Ambari, Knox, and Falcon to name just a few of the critical components of enterprise Hadoop.  We are working within this comprehensive set of components and hope to make Apache Spark “enterprise ready” so that our customers can confidently adopt it.

Availability

We have already completed baseline work to integrate Spark on YARN with HDP and want to make this available so that we can work with customers to identify the use cases that are best suited for this technology. Spark has the potential to meet the complex requirements of data science and machine learning use cases.  We encourage you to download our Tech Preview today and engage us so we can build out this key technology together.

Evaluate Spark on HDP Here


Comments

July 16, 2014 at 4:08 pm

Right now the Spark 0.9.1-based tech preview is out, and it provides step-by-step instructions to evaluate Spark: http://hortonworks.com/wp-content/uploads/2014/05/SparkTechnicalPreview.pdf

Very shortly we will provide an update of this tech preview with the latest version of Spark.

Abel Coronado, July 9, 2014 at 8:51 pm

Hi!

We want to deploy Spark 1.0.0 in an HDP cluster. Is it possible to have a step-by-step guide?
