December 04, 2014

Deploying Hadoop On a Hybrid Cloud with Microsoft

As more organizations consider the cloud as a component of their Apache Hadoop deployments, we can look to our partners for a range of solutions designed to meet these needs. This is the first post in a series on partner solutions available for deploying Hadoop in the cloud. We will build on the Hybrid deployment post with general use cases for Hadoop in a Hybrid cloud. Through our partners, we have a broad set of options available today, spanning on-premises, virtual and cloud-based deployments.

Hadoop in the Hybrid cloud with Microsoft

This week we will focus on the cloud offerings with Microsoft. Microsoft's on-premises offerings include Hadoop on Windows and Hadoop in the Microsoft APS appliance. In the cloud, there are several deployment options offering flexibility and choice for the enterprise. The complete breadth of Hadoop offerings with Microsoft is shown below.

[Figure: Hadoop offerings with Microsoft]

Pushing data: Automated cloud backup for Microsoft Azure

Data architects require Hadoop to act like other systems in the data center, and business continuity through replication across on-premises and cloud-based storage targets is a critical requirement. In HDP 2.2, we jointly extended the capabilities of Apache Falcon to establish an automated policy for cloud backup to Microsoft Azure. Once cluster backup is in place, users have the option to dynamically spin up a Hadoop cluster using HDInsight for data processing and analytics in the cloud.

This is an important first step in a broader vision to enable seamlessly integrated hybrid deployment models for Hadoop, as illustrated in the diagram below.

[Figure: integrated hybrid deployment models for Hadoop]

Hadoop on Azure Infrastructure as a Service (IaaS)

Earlier this year we announced that the Hortonworks Data Platform (HDP) was the first platform to be certified to run on Microsoft Azure Infrastructure as a Service. This gave customers new deployment choices for small and large deployments in the cloud. With this certification, Hortonworks and Microsoft made Apache Hadoop more widely available and easier to deploy for data processing and analytic workloads, enabling enterprises to expand their modern data architecture. Last week we provided a getting started guide to help jumpstart initiatives that need to use Hadoop on Azure IaaS.

Hadoop on Azure Platform as a Service (PaaS)

We’ve been working with Microsoft on joint engineering initiatives for more than three years, and one of the first products of that engineering work was making Apache Hadoop available natively on Windows. Subsequently we announced the Hortonworks Data Platform for Windows, and not long after that Microsoft announced Microsoft HDInsight. HDInsight is Microsoft’s PaaS Hadoop offering and is built on HDP for Windows. It provides an enterprise-grade version of Apache Hadoop in the cloud and ties neatly into additional Microsoft Azure based offerings such as Azure Machine Learning.

Interoperability: Maximizing Hadoop Deployment choice for Microsoft Customers

These latest efforts further expand the deployment options for Microsoft customers while providing them with complete interoperability between workloads on-premises and in the cloud. This means that applications built on-premises can be moved to the cloud seamlessly. It also means that data analytics first created and validated in the cloud can be moved on-premises. Complete compatibility between these infrastructures gives customers the freedom to use the infrastructure that best meets their needs. You can back up data where the data resides (geographically) while giving others the flexibility and opportunity to do Hadoop analytics in the cloud (globally).

We are excited to be working with Microsoft to extend the Modern Data Architecture to the Hybrid cloud, and we look forward to continuing our long history of working with Microsoft to engineer and offer the most flexible and easy-to-use deployment options available for big data.
