On October 15 we announced support for Apache Hadoop on Microsoft Azure Infrastructure as a Service (IaaS). This made us the first Hadoop vendor to give customers and prospects access to that flexible and scalable cloud infrastructure for their big data deployments.
This guide walks you through using the Azure Gallery to quickly deploy Hortonworks Data Platform (HDP) clusters on Microsoft Azure IaaS.
What you need is:
Start by logging into the Azure Portal with your Azure account: https://portal.azure.com/
Navigate to ‘Marketplace’ -> ‘Virtual Machines.’ Expand the ‘Recommended’ row to open a side pane that shows the ‘Hortonworks Data Platform’ icon. Click the icon to launch the wizard that configures HDP for deployment.
Hortonworks provides you with two configuration choices for your cluster, based on the type of workloads you want to run.
Evaluation clusters have a predefined virtual machine size and node count for the HDP Master and HDP Worker nodes. You only need to choose the Azure Subscription to use and the Location in which to create the cluster, then click Create to start the deployment. HDP can be deployed in every Location where Azure is available.
The Standard configuration is ideal for running proof-of-concept and production workloads. It consists of flexibly sized cluster deployments that provide the option to use larger Azure Virtual Machines.
With Standard clusters, you can set the Virtual Machine sizes and count to use in the cluster. The following options are available:
|Master Node: Deploys three master nodes|Select from the following high-capacity Azure Virtual Machine sizes: A7, A8 or A9|Default is A7|
|Worker Node: Deploys between 6 and 45 worker nodes|Select from the following high-capacity Azure Virtual Machine sizes: A7, A8 or A9|Default is A7|
Choose the Azure Subscription to use and the Location in which to create the cluster, then click Create to start the deployment. HDP can be deployed in every Location where Azure is available.
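The Standard cluster limits listed above can be checked before you fill in the wizard. The following is a minimal Python sketch, not part of the Azure or Hortonworks tooling: the class name `StandardClusterConfig` and its fields are hypothetical, and the limits (three master nodes, 6 to 45 worker nodes, sizes A7, A8 or A9) are taken directly from the table above.

```python
from dataclasses import dataclass
from typing import List

# Limits copied from the Standard cluster table in this guide.
ALLOWED_SIZES = ("A7", "A8", "A9")
MASTER_NODE_COUNT = 3          # fixed by the wizard, shown here for reference
MIN_WORKERS, MAX_WORKERS = 6, 45


@dataclass
class StandardClusterConfig:
    """Hypothetical container for the choices made in the wizard."""
    worker_count: int
    master_size: str = "A7"    # default per the table
    worker_size: str = "A7"    # default per the table

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the plan fits the limits."""
        errors = []
        if not MIN_WORKERS <= self.worker_count <= MAX_WORKERS:
            errors.append(
                f"worker_count must be between {MIN_WORKERS} and {MAX_WORKERS}, "
                f"got {self.worker_count}"
            )
        if self.master_size not in ALLOWED_SIZES:
            errors.append(f"master_size must be one of {ALLOWED_SIZES}")
        if self.worker_size not in ALLOWED_SIZES:
            errors.append(f"worker_size must be one of {ALLOWED_SIZES}")
        return errors


plan = StandardClusterConfig(worker_count=12, worker_size="A8")
print(plan.validate())  # prints []
```

A plan with too few workers, such as `StandardClusterConfig(worker_count=5)`, returns one error message instead of an empty list.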
To log in to Ambari, first find the public hostname of the first master node. It is the name of your Resource group followed by “-master-01.cloudapp.net”. For example, if your Resource group is “my-hdp,” then the first master node’s public hostname is “my-hdp-master-01.cloudapp.net”.
In a browser, go to the HTTPS Ambari Server URL on that first master node: https://my-hdp-master-01.cloudapp.net:8443
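The naming rule above can be sketched in a few lines of Python. The helper name `ambari_url` is hypothetical. The hostname suffix and port 8443 come from the example in this guide.

```python
def ambari_url(resource_group: str, port: int = 8443) -> str:
    """Build the Ambari HTTPS URL for the first master node.

    Assumes the hostname pattern shown in this guide:
    the Resource group name followed by "-master-01.cloudapp.net".
    """
    if not resource_group:
        raise ValueError("resource_group must not be empty")
    host = f"{resource_group}-master-01.cloudapp.net"
    return f"https://{host}:{port}"


print(ambari_url("my-hdp"))  # prints https://my-hdp-master-01.cloudapp.net:8443
```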
Click through any warnings about access to reach the Ambari login page. Log in with the password that you configured earlier.
To secure the network ports, only the Ambari web UI port on the first master node is enabled. To enable the other management UIs, go to the virtual machine settings in the Microsoft Azure portal and configure Endpoints.
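After you configure an Endpoint, you may want to confirm that the port is reachable from your machine. The following is a minimal sketch using only Python's standard `socket` module. The function name `is_port_open` is hypothetical, and a successful TCP connection only shows that the port accepts connections, not that the service behind it is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the hostname and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `is_port_open("my-hdp-master-01.cloudapp.net", 8443)` checks the Ambari port from the example above, assuming a cluster with that hostname exists.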
For Evaluation clusters, the following hosts and ports are used:
For Standard clusters, the following hosts and ports are used: