As we speed towards widespread enterprise adoption of Apache Hadoop, it has become readily apparent that this new data platform must not only capture, process, and distribute data, but must also be deployable in a variety of ways: on-premises, in a VM, as an appliance, or better yet, in the cloud.
Today we announced a new relationship with Rackspace in which we will develop an OpenStack-based Hadoop solution for the public and private cloud. This is not just a paper relationship. It is a joint effort to produce and make available Hortonworks Data Platform for OpenStack in early 2013.
Customers today already deploy HDP-based Hadoop clusters on dedicated hardware at Rackspace. That capability is now available as a turnkey, on-demand service running on the Rackspace open cloud, as well as on private cloud infrastructure in Rackspace data centers or in a customer's own data center.
Why does this make sense?
Well, when you speak of OpenStack, you think of compute, networking, and storage as its three main components. OpenStack was created by Rackspace as a collaborative software project designed to produce freely available code, badly needed standards, and common ground for the benefit of both cloud providers and cloud customers. In this environment, Hortonworks is a natural fit. Our 100% open source approach is freely available, standards-based, and, better yet, open to integration with the ecosystem and other stack components. More importantly, core Hadoop is itself compute and storage, and Hortonworks provides the most stable and reliable distribution of it. For wide-scale adoption, Hadoop must be enterprise-ready, and HDP delivers exactly that.
Avoid Vendor Lock-In
The point of OpenStack is to provide an open and scalable operating system for building public and private clouds. It gives both large and small organizations an alternative to closed cloud environments, reducing the risk of lock-in associated with proprietary platforms. With Rackspace, you simply provision the service and you are "good to go". With Hortonworks, we add a new service to the stack that is also provisioned via Rackspace, so you can be up and running in minutes, without a license and without the vendor lock-in.
The main reason we can do this is that we package Apache Ambari, a fully open source tool for monitoring and managing a cluster. With other distributions you need to purchase these same capabilities, which not only locks you into the vendor's license but also closes off the ecosystem, since the open source community can no longer be a source for patches or upgrades. You have to wait for your vendor to release their proprietary fix, even for the open source bits they built on top of. Not with Hortonworks.
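Because Ambari is open, its management functionality is exposed through a plain REST API rather than a proprietary console. As a rough sketch, here is how you might query a running Ambari server from the command line; the hostname, cluster name, and default admin credentials below are placeholders for illustration, not values from any specific deployment.

```shell
# List the clusters this Ambari server manages.
# "ambari-host" and the admin/admin credentials are assumed placeholders.
curl -u admin:admin http://ambari-host:8080/api/v1/clusters

# Check the state of the HDFS service on a hypothetical cluster
# named "mycluster".
curl -u admin:admin \
  http://ambari-host:8080/api/v1/clusters/mycluster/services/HDFS
```

Anything you can do in the Ambari web UI can be scripted this way, which is part of what "fully open" means in practice: the management layer is inspectable and automatable, not a black box.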
This approach allows customers to invest in the open cloud with confidence that the technology will serve them for the long term.
Where exactly IS your data?
Many have turned to the cloud to store or process data. Doesn't it make sense to extend that to big data processing in the cloud, where much of the data already resides? With this new offering you can do just that, and in only a matter of minutes. You can easily extend your current Rackspace environment by firing up a Hadoop cluster, and there is no need to move data from internal resources to the cloud, because the data is already there. While this may not be the case for every Hadoop project, it makes sense for many, and it may make sense for many Rackspace customers.
Rackspace & Hortonworks… seems like a match made in heaven. Well, maybe in the clouds.