2013 was certainly a revealing year for the Enterprise Hadoop market. We witnessed the emergence of the YARN-based architecture of Hadoop 2 and strong ecosystem adoption that will fuel its next big wave of innovation. The analyst community accurately predicted that Hadoop’s market momentum would greatly accelerate, but none predicted that a pure-play vendor would publicly declare its intent to pivot away from the Enterprise Hadoop market. Interesting times indeed!
Join us on Tuesday, January 21st, when we’ll be covering the Enterprise Hadoop State of the Union in more detail.
Mike Gualtieri at Forrester sums up the state of Hadoop quite nicely: “Hadoop’s momentum is unstoppable as its open source roots grow wildly into enterprises. Its refreshingly unique approach to data management is transforming how companies store, process, analyze, and share big data”.
Further, Tony Baer from Ovum talks about how “Hadoop is becoming a more ‘normal’ software market” and the “Hadoop vendor ecosystem [is] gaining critical mass”.
While Hadoop’s momentum has been impressive, its second act has only just begun. In YARN: Weaving the Future of Hadoop, Robin Bloor talks about how “YARN is the major innovation in Hadoop 2.0” and “With Hadoop 2.0 we expect this ecosystem to grow like bamboo in spring time”. He states that thanks to YARN “Hadoop is no longer just a MapReduce-based batch environment. You will be able to run many applications on it concurrently.” Those applications can span batch, interactive, online, and streaming use cases, all running in Hadoop. Bloor also discusses the new data processing engine called Apache Tez, which runs on YARN and “provides a customizable framework for low latency and high throughput workloads”. Tez is aimed at quenching the need for speed in technologies like Apache Hive (via the Stinger initiative), Apache Pig, Cascading and many commercial systems that previously depended on batch-only MapReduce.
And speaking of speed, GigaOm wrote about “non-batch functionality for Hadoop thanks to YARN, which lets Hadoop run all sorts of processing frameworks”, highlighting the integration of real-time stream processing, via Apache Storm, with Enterprise Hadoop.
Bottom-line: We’re not in Kansas (aka MapReduce land) anymore. With the introduction of YARN and Tez, analysts agree that the Enterprise Hadoop market is entering an even more compelling wave of innovation that is certain to see further adoption from major technology vendors.
One company’s rationale for trying to pivot away from the Enterprise Hadoop market to focus on its own proprietary big data product appears driven by the desire for more control and immediate profits. The pivot clearly discounts the power of community-driven open source innovation and its ability to outpace any single vendor’s agenda.
Cases in point: the progress of Apache Ambari (open source Hadoop management) and the Stinger Initiative (enhancing the Speed, Scale and SQL support of Apache Hive) has demonstrated that proprietary technology add-ons ultimately result in limited differentiation in the face of community collaboration. Stinger Phase 2, for example, included contributions from no fewer than 10 commercial entities including Microsoft, Facebook, SAP and many more.
Hortonworks has maintained a consistent focus on enabling Hadoop to be an enterprise-viable data platform that uniquely powers a new generation of data-driven applications and analytics. Our vision, in which “half the world’s data will be processed by Apache Hadoop”, keeps us laser-focused on innovating Enterprise Hadoop in the open and unlocking the broader opportunities beyond just Hortonworks.
In his Notes from Hadoop World paper, Cowen & Co. analyst Peter Goldmacher sums up the big, bigger and biggest opportunities quite nicely:
“We believe Hadoop is a big opportunity and we can envision a small number of billion dollar companies based on Hadoop. We think the bigger opportunity is Apps and Analytics companies selling products that abstract the complexity of working with Hadoop from end users and sell solutions into a much larger end market of business users. The biggest opportunity in our mind, by far, is the Big Data Practitioners that create entirely new business opportunities based on data where $1M spent on Hadoop is the backbone of a $1B business.”
While Enterprise Hadoop plays a Big role, unlocking its long-term potential hinges on helping companies like HP, Microsoft, Rackspace, SAP and Teradata bring more data under management in order to unlock the Bigger opportunity that Goldmacher describes. Making Hadoop easier to use and consume by mainstream enterprises accelerates its time to value – fueling the Biggest opportunities for the end users themselves.
Over the past few months, there’s been an interesting debate over which business model for Enterprise Hadoop is best. Jeff Kelly summarized things in his SiliconAngle article “The Better Model for Hadoop: Open Source or Proprietary Approach?”. Needless to say, Hortonworks believes a 100% open source approach is best, and I’d like to use Red Hat’s growth over the past 15 years to draw out some key points for the discussion.
While there are many drivers to Red Hat’s hockey stick growth, the key items to me are:
Red Hat placed bets and investments early on designed to grow the market with strategic partners like IBM, HP, and Oracle, as well as to enable a broad ecosystem of solutions built on its platform. The curve above illustrates this long game perfectly, including the compounding power of an open source software subscription model.
Completing the comparison, Red Hat established a vision for 100% open source Enterprise Linux that extended far beyond the work happening on the Linux Kernel.
At Hortonworks, our vision for Enterprise Hadoop is logically similar since it spans innovations within Apache Hadoop (i.e. the “kernel”) as well as a range of 100% open source projects focused on addressing the core platform, data, and operational services required in an enterprise data platform.
We believe that innovating within the open source community (specifically Apache Software Foundation projects), while making Hadoop easy for the enterprise to use and consume and deeply integrating it with key datacenter technologies, will pull the platform into enterprises far more quickly than a model that’s closed and/or driven by a single vendor.
Bottom-line: We’re committed to innovating in the open since it provides the fastest and most transparent path to value for everyone using Enterprise Hadoop. And unlike others, the long-term viability of our business is predicated on us always honoring this commitment to our customers and partners. That’s how we ALL win.
While I touched on the key innovations happening with YARN, Tez, and Stinger above, there’s even more happening, spanning Security, Dataset Management, OpenStack-powered Clouds, and more.
Rather than cover these details now, let’s get together on Tuesday, January 21st, when we’ll be covering the Enterprise Hadoop State of the Union in more detail.
Sources used for Red Hat chart: