Big data and cloud computing are top priorities in enterprise IT today. Organizations are adopting these two disruptive technologies because of the promise of lower cost, flexibility, portability and ease of management.
Today’s blog is another in a series discussing Apache Hadoop in the cloud as a key deployment option. Our guest blogger today is Sean Anderson, Manager of Data Service at Rackspace, the managed cloud company.
In 2012, Rackspace and Hortonworks partnered to expand the capabilities of Enterprise Hadoop™ to both public cloud utility services and private clouds built on the popular open-source cloud platform OpenStack. Rackspace co-founded the OpenStack project along with NASA, and while Rackspace continues to build all of its public cloud platforms on OpenStack projects, the community that has formed around it encompasses more than a single technology vendor or school of thought.
Rackspace and Hortonworks share a common trait: a sincere focus on open-source technologies and on fostering the ecosystems they create. It was this common dedication that led to the creation of a Hadoop platform-as-a-service that spans deployment options and allows users to select the exact technology model that fits their use case and SLA. In addition, Rackspace’s years of hosting architecture expertise bring new, forward-thinking optimizations into the fold, coupled with the strong Hadoop expertise at Hortonworks.
Users often need to scale quickly from a small environment to a production-grade solution, which is why all of our solutions use the same HDP versions and much of the same tooling. Our combined team can help practitioners move between these solutions with minimal disruption to the business.
Cloud Big Data Platform – Cloud Big Data Platform was engineered for simplicity, enabling users to spin up their clusters in much the same way they spin up their applications in the cloud. Provide a few details about the size of your file system and the geo-location of your cluster, and you have an environment in minutes. This can be done through an API, control panel, or command-line utility. We have deployed specialized storage-dense and network-optimized machines to ensure optimal query performance. Users can start with an environment as small as 1.25TB and grow to multiple petabytes. One popular trend we see is that these resources do not have to be available all the time. We have built a connector to our Cloud Files (Object Storage) and ObjectRocket MongoDB services that allows data to stay in storage while compute is deployed only when a query needs to be run. Cloud Big Data Platform is also a perfect environment for an expanded Sandbox activity.
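To make the "store the data, deploy compute only when a query runs" idea concrete, here is a minimal Python sketch of that pattern. It does not use the Rackspace API; `ObjectStore`, `Cluster`, `ephemeral_cluster`, and `run_query` are hypothetical, in-memory stand-ins invented purely for illustration, and the word count is a placeholder for a real Hadoop job.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class ObjectStore:
    """Hypothetical stand-in for a durable object store; data stays here."""
    objects: dict = field(default_factory=dict)

    def put(self, key, lines):
        self.objects[key] = list(lines)

    def get(self, key):
        return self.objects[key]


@dataclass
class Cluster:
    """Hypothetical stand-in for compute that exists only while needed."""
    nodes: int
    running: bool = True

    def word_count(self, lines):
        # Placeholder "query": count word occurrences across all lines.
        if not self.running:
            raise RuntimeError("cluster has been torn down")
        counts = {}
        for line in lines:
            for word in line.split():
                counts[word] = counts.get(word, 0) + 1
        return counts


@contextmanager
def ephemeral_cluster(nodes):
    cluster = Cluster(nodes=nodes)   # "provision" compute on demand
    try:
        yield cluster
    finally:
        cluster.running = False      # always tear down, even on failure


def run_query(store, key, nodes=3):
    # Compute is created for this query only; the data never leaves the store.
    with ephemeral_cluster(nodes) as cluster:
        return cluster.word_count(store.get(key))


store = ObjectStore()
store.put("logs/day1", ["error timeout", "ok", "error disk"])
result = run_query(store, "logs/day1")
print(result)
```

The point of the design is that storage and compute have separate lifetimes: the data outlives every cluster, so compute is only needed for the duration of each query.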
Cloud Big Data OnMetal – Most multi-tenant services come with the cost of shared resources. Users often architect to achieve the performance they desire, but in the multi-tenant public cloud, things like IO and throughput become scarce, valuable resources. Rackspace has launched a new technology that deploys a dedicated server through an API in minutes without imposing the penalties of virtualization at the application layer. Our first flavor included a high-performance, RAM-optimized server that improves performance for Apache Tez and Apache Spark on HDP. Many users are also turning to OnMetal for the physical isolation they need to ensure their data integrity and security requirements are met.
Managed Big Data Platform – Not all scenarios are ideal for the cloud, especially when data is sensitive and regulations need to be met. Hadoop has had a deep bare-metal ethos since the beginning, and for many users it is still the model that gives them the control and predictability they need. Hortonworks’ expertise in Hadoop application support, combined with Rackspace’s deep knowledge and fanatical approach to problem solving, ensures that no component of a successful technology stack is ignored. We also offer private cloud Hadoop deployments on OpenStack, Microsoft, and VMware clouds, which is important if you have a multi-cloud strategy.
With plentiful deployment options, users can focus on building their applications and on harnessing the powerful capabilities of data processing platforms like Hadoop in the manner that best serves them today.