Arsalan Tavakoli-Shiraji, customer engagement lead overseeing business development activities at Databricks, is our guest blogger today. In this blog, he discusses our expanded partnership built around Apache Spark on Apache Hadoop in three areas: customers, engineering, and open source.
Today Databricks and Hortonworks are announcing an expanded partnership built around Apache Spark; allow me to explain why we’re thrilled to be embarking on this journey with them.
When we started Databricks last summer, Apache Spark was in the early stages of enterprise adoption. There were fewer than 100 contributors, no commercial distributions, and limited production deployments shared publicly. As we set out on this path, some of our primary goals were to rapidly expand and mature Spark’s capabilities and simplify the adoption process by providing a broad array of deployment options. We knew early on that the Spark community and partner ecosystem would be critical to achieving these goals.
Fast forward to the present, and we are thrilled with the pace at which Spark has advanced. It is now the most active project in the open source Big Data ecosystem, with more than 330 contributors from over 50 organizations in the past year alone. It is distributed by more than 10 platform vendors, including all the major Hadoop distributors. Spark has been deployed in production at scale across a broad array of verticals, including financial services, telecom, retail, public sector, and technology, with data sizes ranging from tens of gigabytes to hundreds of petabytes.
Yet there is still much left to achieve, and that’s why we’re excited to be partnering with Hortonworks and further embracing them in the Apache Spark community. As part of this partnership, we are aligned around three areas: customers, engineering, and open source.
There is no doubt that this is an exciting time for Big Data in general, and Spark in particular. Enterprises are increasingly adopting Spark to deliver greater ROI on their Big Data initiatives, and a rapidly growing set of third-party applications is being built on top of Spark to harness its capabilities as a blazing-fast and versatile data processing engine. We look forward to this important partnership helping push Spark even further.