October 31, 2014

Microsoft Azure IaaS Getting Started Guide

On October 15 we announced that we would support Apache Hadoop as an Infrastructure as a Service (IaaS) on Microsoft Azure. This made us the first Hadoop vendor to give customers and prospects access to that flexible and scalable cloud infrastructure for their big data deployments.

This guide walks you through using the Azure Gallery to quickly deploy Hortonworks Data Platform (HDP) clusters on Microsoft Azure IaaS.

What you need is:

  • A Microsoft Azure account
  • That’s it!

Start by logging into the Azure Portal with your Azure account: https://portal.azure.com/

Navigate to ‘MarketPlace’ -> ‘Virtual Machines’. Expand the ‘Recommended’ row, which opens a side pane showing the ‘Hortonworks Data Platform’ icon. Click that icon to launch the wizard that configures HDP for deployment.

First, set up the cluster name and credentials. Set the following:

  • Resource group: This represents a grouping for all resources that are tied to this HDP cluster instance. Choose a unique name (that you have not used before).
  • User name: This will represent the name of the Administrator account for the Linux virtual machines that are part of the cluster.
  • SSH Key: The content of the public SSH key that you will use to remotely log into the Linux virtual machines. If you don’t have a public/private key pair, you can create one on your local machine and then paste the contents of the public key here.
  • Ambari password: The password that will be used for the Ambari Server Web UI login.
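
Before pasting a key into the SSH Key field, it can help to confirm that the file really holds a single-line OpenSSH public key. The following is a minimal sketch, not part of the Azure wizard: the function name and the path `~/.ssh/id_rsa.pub` are assumptions chosen for illustration. It only checks the shape of the line (type, base64 blob, optional comment) and that the type name embedded in the blob matches the declared type.

```python
import base64
import struct
from pathlib import Path


def looks_like_openssh_public_key(line: str) -> bool:
    """Return True if `line` looks like '<type> <base64-blob> [comment]'
    and the type name stored inside the blob matches the declared type."""
    parts = line.strip().split(None, 2)
    if len(parts) < 2:
        return False
    key_type, blob_b64 = parts[0], parts[1]
    try:
        blob = base64.b64decode(blob_b64, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    if len(blob) < 4:
        return False
    # The blob starts with a 4-byte big-endian length, then the type name.
    (name_len,) = struct.unpack(">I", blob[:4])
    if name_len > len(blob) - 4:
        return False
    return blob[4:4 + name_len] == key_type.encode("utf-8")


if __name__ == "__main__":
    # Hypothetical default location; adjust to wherever your key lives.
    key_path = Path("~/.ssh/id_rsa.pub").expanduser()
    if key_path.exists():
        text = key_path.read_text()
        print(key_path, "looks valid:", looks_like_openssh_public_key(text))
    else:
        print("No key found at", key_path)
```

A passing check does not prove the key will be accepted; it only catches common mistakes such as pasting the private key or a truncated line.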

Hortonworks provides you with two configuration choices for your cluster, based on the type of workloads you want to run.

Evaluation Workload Clusters

This configuration is ideal for rapid evaluation of HDP: it consists of a five-node cluster running on cost-efficient Azure Virtual Machines.

Evaluation clusters have a predefined HDP Master and HDP Worker node virtual machine size and node count. You just need to set the appropriate Azure Subscription to use and Location to generate the cluster, and then click Create to kick off the deployment. HDP is deployable across all Locations that Azure is available in.

Standard Workload Clusters

This configuration is ideal for running proof-of-concept and production workloads: it consists of flexibly sized cluster deployments that provide the option to use larger Azure Virtual Machines.

With Standard clusters, you can set the Virtual Machine sizes and count to use in the cluster. The following options are available:

  • Master Node: Deploys three master nodes. Select from the following high-capacity Azure Virtual Machine sizes: A7, A8 or A9. Default is A7.
  • Worker Node: Deploys between 6 and 45 worker nodes. Select from the following high-capacity Azure Virtual Machine sizes: A7, A8 or A9. Default is A7.
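
The limits above can be written down as a small pre-flight check before you fill in the wizard. This is only a sketch of the constraints listed in this guide (three masters, 6 to 45 workers, sizes A7, A8 or A9, default A7); `StandardClusterPlan` and its fields are hypothetical names for illustration and do not correspond to any Azure or Hortonworks API.

```python
from dataclasses import dataclass
from typing import List

ALLOWED_SIZES = ("A7", "A8", "A9")  # sizes listed in this guide
DEFAULT_SIZE = "A7"                 # default stated in this guide
MASTER_COUNT = 3                    # Standard clusters deploy three masters
MIN_WORKERS, MAX_WORKERS = 6, 45    # inclusive worker range


@dataclass(frozen=True)
class StandardClusterPlan:
    """Hypothetical record of the choices made in the Standard wizard."""
    worker_count: int
    master_size: str = DEFAULT_SIZE
    worker_size: str = DEFAULT_SIZE

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means the plan fits."""
        issues = []
        if not MIN_WORKERS <= self.worker_count <= MAX_WORKERS:
            issues.append(
                f"worker_count {self.worker_count} is outside "
                f"{MIN_WORKERS}-{MAX_WORKERS}"
            )
        for role, size in (("master", self.master_size),
                           ("worker", self.worker_size)):
            if size not in ALLOWED_SIZES:
                issues.append(f"{role} size {size!r} is not one of {ALLOWED_SIZES}")
        return issues

    @property
    def total_nodes(self) -> int:
        """Masters plus workers."""
        return MASTER_COUNT + self.worker_count


if __name__ == "__main__":
    plan = StandardClusterPlan(worker_count=10, worker_size="A8")
    print(plan.total_nodes, plan.problems())
```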

Set the appropriate Azure Subscription to use and Location to generate the cluster, and then click Create to kick off the deployment. HDP is deployable across all Locations that Azure is available in.

Using Ambari and HDP

You will get a notification in the Azure portal when the HDP cluster is successfully deployed. Wait for the notification before proceeding.

Log into Ambari. Find the public hostname of the first master node. This will be of the format “[resource group]-master-01.cloudapp.net”. For example, if your Resource group was set to “my-hdp”, then the first master node’s public hostname is “my-hdp-master-01.cloudapp.net”.

In a browser, go to the HTTPS Ambari Server URL on that first master node: https://my-hdp-master-01.cloudapp.net:8443
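
The naming rule is simple enough to capture in a helper, which avoids typos when you manage several clusters. A minimal sketch, assuming the hostname pattern and port 8443 described in this guide; the function name `ambari_url` is made up for illustration.

```python
def ambari_url(resource_group: str, port: int = 8443) -> str:
    """Build the Ambari Server URL for the first master node.

    Follows the pattern in this guide: the resource group name,
    then '-master-01.cloudapp.net', served over HTTPS.
    """
    host = f"{resource_group}-master-01.cloudapp.net"
    return f"https://{host}:{port}"


if __name__ == "__main__":
    print(ambari_url("my-hdp"))
```

Note that the scheme is `https`, not `http`; the port in this guide serves HTTPS only.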

Click through any warnings about access. You will get to the Ambari login page. Use the password that you configured before to log in.

User: admin
Password: the Ambari password you set during cluster setup
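
The same credentials can be used from a script. The sketch below assumes that Ambari's REST API is served under `/api/v1` on the same HTTPS port and accepts HTTP Basic authentication, and that the response lists clusters under `items[].Clusters.cluster_name`; this guide does not confirm any of that for this deployment, so treat it as an assumption to verify. Certificate verification is switched off only because the guide expects browser warnings on this endpoint; do not do that outside a throwaway evaluation.

```python
import base64
import json
import ssl
import urllib.request
from typing import List


def build_clusters_request(base_url: str, user: str,
                           password: str) -> urllib.request.Request:
    """Prepare a GET for <base_url>/api/v1/clusters with Basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"{base_url}/api/v1/clusters",
        headers={"Authorization": f"Basic {token}"},
    )


def list_clusters(base_url: str, user: str, password: str,
                  timeout: float = 10.0) -> List[str]:
    """Return cluster names reported by Ambari (assumed response shape)."""
    # Evaluation-only: skip certificate checks, mirroring the
    # "click through any warnings" step in the browser.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    request = build_clusters_request(base_url, user, password)
    with urllib.request.urlopen(request, timeout=timeout,
                                context=context) as response:
        payload = json.load(response)
    return [item["Clusters"]["cluster_name"]
            for item in payload.get("items", [])]
```

For example, `list_clusters("https://my-hdp-master-01.cloudapp.net:8443", "admin", password)` would return the cluster names if the assumptions above hold.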

Enabling Management UI Ports

To keep the cluster’s network ports secure, only the Ambari web UI port on the first master node is enabled. To enable the other management UIs, open the virtual machine settings in the Microsoft Azure portal and configure Endpoints.

For Evaluation clusters, the following hosts and ports are used:

  • HDFS NameNode UI: :50070
  • YARN ResourceManager UI: :8088
  • HBase Master: :60010
  • Storm UI: :8744
  • Falcon UI: :15000

For Standard clusters, the following hosts and ports are used:

  • HDFS NameNode UI: :50070
  • YARN ResourceManager UI: :8088
  • HBase Master: :60010
  • Storm UI: :8744
  • Falcon UI: :15000
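
After adding an endpoint, a quick TCP probe shows whether the port is actually reachable from your machine. This is a minimal sketch using only the port numbers listed above; the host each UI runs on depends on your cluster, so it is passed in as a parameter, and the hostname in the example is the illustrative one from this guide.

```python
import socket

# Port numbers as listed in this guide; hosts vary by cluster layout.
MANAGEMENT_UI_PORTS = {
    "HDFS NameNode UI": 50070,
    "YARN ResourceManager UI": 8088,
    "HBase Master": 60010,
    "Storm UI": 8744,
    "Falcon UI": 15000,
}


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or name resolution failed
        return False


if __name__ == "__main__":
    host = "my-hdp-master-01.cloudapp.net"  # example host from this guide
    for name, port in MANAGEMENT_UI_PORTS.items():
        state = "open" if is_port_open(host, port) else "closed or filtered"
        print(f"{name} ({port}): {state}")
```

A closed result can mean either that the endpoint is not configured or that the service runs on a different node, so check both.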

Comments

  • Hi Rohit,

    I followed the instructions given on this page. I can see that Azure has created a 5-node cluster for Hortonworks. (I saw the notification that it had been created successfully, and I also see 13 resources created: 5 Virtual Machines, 5 Cloud Services, 2 Storage, 1 Virtual Network.)

    Resource name used is : “my-hdpcluster1” and so the
    Master name happens to be : “my-hdpcluster1-master-01”.

    As per this guide, the address of Ambari should be “http://my-hdpcluster1-master-01.cloudapp.net:8443/” when I enter and hit this URL in the browser I see
    No data received
    ERR_EMPTY_RESPONSE

    I am not sure if I did something wrong. Could you please suggest what needs to be modified to make it work?

    Thank you.
    Arpan
