October 31, 2014

Databricks: An Expanded Partnership with Hortonworks

Arsalan Tavakoli-Shiraji, customer engagement lead overseeing business development activities at Databricks, is our guest blogger today. In this blog, he discusses our expanded partnership built around Apache Spark on Apache Hadoop in three areas: customers, engineering, and open source.

Today Databricks and Hortonworks are announcing an expanded partnership built around Apache Spark; allow me to explain why we’re thrilled to be embarking on this journey with them.

When we started Databricks last summer, Apache Spark was in the early stages of enterprise adoption. There were fewer than 100 contributors, no commercial distributions, and few publicly shared production deployments. As we set out on this path, some of our primary goals were to rapidly expand and mature Spark’s capabilities and to simplify adoption by providing a broad array of deployment options. We knew early on that the Spark community and partner ecosystem would be critical to achieving these goals.

Fast forward to the present, and we are thrilled with the pace at which Spark has advanced. It is now the most active project in the open source Big Data ecosystem, with more than 330 contributors from over 50 organizations in the past year alone. It is distributed by over 10 platform vendors, including all the major Hadoop distributors. Spark has been deployed in production at scale across a broad array of verticals, including financial services, telecom, retail, public sector, and technology, with data sizes ranging from tens of gigabytes to hundreds of petabytes.

Yet, there is still much left to achieve, and that’s why we’re excited to be partnering with Hortonworks and further embracing them in the Apache Spark community. As part of this partnership, we are aligned around the following three areas:

  • Customers: As a leading Hadoop vendor, Hortonworks has a broad array of customers across a number of verticals, many of which have already started down the Spark path or are eager to get going. We’re excited to lend our expertise to help ensure these enterprises are able to gain deeper insights, faster with Spark.
  • Engineering: Hortonworks has a strong record of engineering contributions in the open source community, and we look forward to that trend continuing with Spark. Improving Spark’s performance on Apache Hadoop YARN, integration with open source security efforts such as Apache Ranger (incubator project), and better support for the ORC file format are core areas where Hortonworks brings expertise and can quickly provide meaningful contributions.
  • Open Source DNA: Hortonworks has always been unequivocal in its stance: everything it does is 100% open source, which manifests itself not only in its development model but also in its commitment to the community. Back in June, Hortonworks was part of our inaugural “Certified Spark Distribution” class because it recognized the importance of supporting the growing third-party application ecosystem.

There is no doubt that this is an exciting time for Big Data in general, and Spark in particular. Enterprises are increasingly adopting Spark to deliver greater ROI on their Big Data initiatives, and a rapidly growing set of third-party applications is being built on top of Spark to harness its capabilities as a blazing-fast and versatile data processing engine. We look forward to this important partnership helping push Spark even further.
