In his blog, Tim Hall wrote, “Enterprises are embracing Apache Hadoop to enable their modern data architectures and power new analytic applications. The freedom to choose the on-premises or cloud environments for Hadoop that best meets the business needs is a critical requirement.”
One option for deploying Hadoop in the cloud is Microsoft Azure with Cloudbreak. Other options include Google Cloud Platform, OpenStack, and AWS.
But in this blog, I’ll show how you can deploy Hadoop on Azure with a few clicks by running an HDP multi-node cluster on Azure Linux VMs using Cloudbreak.
Azure is a cloud computing platform and infrastructure, created by Microsoft, for building, deploying and managing applications and services through a global network of Microsoft-managed datacenters.
Cloudbreak is a RESTful Hadoop as a Service API. Once Cloudbreak is deployed in your favorite servlet container, it exposes a REST API, allowing provisioning of Hadoop clusters of arbitrary sizes on your selected cloud provider.
Provisioning Hadoop has never been easier. Cloudbreak is built on the foundation of cloud provider APIs (Microsoft Azure, Amazon AWS, Google Cloud Platform, OpenStack), Apache Ambari, Docker containers, Swarm and Consul.
Before you get started, you must set up two accounts and understand Ambari Blueprints:
First, log in to your Azure portal and create a network manually.
Then create an X.509 certificate with a 2048-bit RSA key pair by running the command shown below on your local machine. You can name the output files as you like.
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout azuretest.key -out azuretest.pem
Accept the default values at each prompt.
In the directory where you executed the openssl command, you will see the two files listed below.
-rw-r--r-- 1 nsabharwal staff 1346 May 7 17:00 azuretest.pem --> We need this file to create credentials in Cloudbreak.
-rw-r--r-- 1 nsabharwal staff 1679 May 7 17:00 azuretest.key --> We need this file to log in to the host after cluster deployment.
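If you would rather skip the interactive prompts, the same key pair can probably be generated in one step with the `-subj` option of `openssl req`. The sketch below is only an illustration: the function name `make_azure_cert` is hypothetical, and using the file name as the certificate's common name is an assumption, not something the Cloudbreak or Azure documentation is known to require.

```shell
# Hypothetical helper: create <name>.key and <name>.pem without prompts.
# Assumption: a subject of "/CN=<name>" is acceptable for this certificate.
make_azure_cert() {
  name="$1"
  # Same options as the command above, plus -subj to avoid the prompts.
  openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
    -subj "/CN=${name}" \
    -keyout "${name}.key" -out "${name}.pem" 2>/dev/null || return 1
  # Print the subject and validity dates so the result can be checked by eye.
  openssl x509 -in "${name}.pem" -noout -subject -dates
}

# Usage (writes azuretest.key and azuretest.pem to the current directory):
#   make_azure_cert azuretest
```

The second openssl call only reads the certificate back, which is a quick way to confirm that the .pem file is well formed before you paste it into Cloudbreak.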
To avoid bad-permission and security compliance errors, restrict the permissions on the key file as shown below:
chmod 400 azuretest.key
You can then use the key to log in, for example: ssh -i azuretest.key cloudbreak@fqdn
When you use the .key file, ssh may ask for a passphrase. In that case, check your openssl version.
If it is a recent version, run the following command and use azuretest_login.key to log in:
openssl rsa -in azuretest.key -out azuretest_login.key
For reference, openssl version prints a line such as:
OpenSSL 0.9.8zc 15 Oct 2014
Recent versions of openssl create the .key file with this header:
-----BEGIN PRIVATE KEY-----
Older versions create keys with an RSA header, which is what we need:
-----BEGIN RSA PRIVATE KEY-----
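A quick way to tell which of the two formats a key file uses is to look at its first line. The sketch below does that; the function name `key_format` and the labels it prints are illustrative choices, not part of openssl or Cloudbreak.

```shell
# Hypothetical helper: report the format of a PEM private key
# by inspecting its header line.
key_format() {
  header=$(head -n 1 "$1")
  case "$header" in
    "-----BEGIN RSA PRIVATE KEY-----") echo "traditional" ;;  # the header we need
    "-----BEGIN PRIVATE KEY-----")     echo "pkcs8" ;;        # needs conversion
    *)                                 echo "unknown" ;;
  esac
}

# Usage:
#   key_format azuretest.key
```

If the result is `pkcs8`, convert the key with the openssl rsa command shown above and check the new file again. On OpenSSL 3.x, the rsa subcommand may also need the `-traditional` flag to write the RSA header; check the manual page for your version.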
Log in to the Cloudbreak portal and create an Azure credential. Once you fill in the information and hit create credential, you will get a file from Cloudbreak that needs to be uploaded into the Azure portal.
Save the file as azuretest.cert on your local machine.
Creating Blueprints on Azure
Log in to the Azure portal (switch to classic mode if you are using the new portal).
Click Settings --> Manage Certificates, then click Upload at the bottom of the screen.
There are 2 more actions that you must perform before creating your cluster on Azure.
1) Create a template
You can change the instance type and volume type as per your setup.
2) Create an Ambari Blueprint. You can grab sample Blueprints here (you may have to reformat the Blueprint if there is any issue).
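To give a rough idea of what a Blueprint looks like, the sketch below writes a skeleton to a file. It follows the general Ambari Blueprint layout (a `Blueprints` section plus a list of `host_groups`), but every value in it is an assumption made for illustration: the blueprint name, the host group names, the stack version and the shortened component lists. It is not a complete, deployable Blueprint, and `write_sample_blueprint` is a hypothetical helper name.

```shell
# Hypothetical helper: write a skeleton Ambari Blueprint to the given path.
# All names, the stack version and the component lists are placeholders.
write_sample_blueprint() {
  cat > "$1" <<'EOF'
{
  "Blueprints": {
    "blueprint_name": "sample-multinode",
    "stack_name": "HDP",
    "stack_version": "2.2"
  },
  "host_groups": [
    {
      "name": "master",
      "cardinality": "1",
      "components": [
        { "name": "NAMENODE" },
        { "name": "RESOURCEMANAGER" },
        { "name": "ZOOKEEPER_SERVER" }
      ]
    },
    {
      "name": "slave",
      "cardinality": "2",
      "components": [
        { "name": "DATANODE" },
        { "name": "NODEMANAGER" }
      ]
    }
  ]
}
EOF
}

# Usage:
#   write_sample_blueprint blueprint.json
```

If a Blueprint is rejected because of formatting, running it through a JSON formatter such as `python -m json.tool blueprint.json` will usually point to the syntax error.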
After successfully creating a template and a Blueprint, you are ready to deploy a cluster that reflects your Blueprint.
From the Azure GUI, select the credential and hit create cluster.
Log in to your host, where you can try some of these commands.
Get the FQDN from the Azure portal, then connect:
ssh -i azuretest.key ubuntu@fqdn or ssh -i azuretest.key cloudbreak@fqdn
To switch to the root user
sudo su -
To list the running containers
docker ps
To get a shell inside a container
docker exec -it <container id> bash
[root@azuretest ~]# docker ps
CONTAINER ID   IMAGE                                               COMMAND                CREATED       STATUS       PORTS   NAMES
f493922cd629   sequenceiq/docker-consul-watch-plugn:1.7.0-consul   "/start.sh"            2 hours ago   Up 2 hours           consul-watch
100e7c0b6d3d   sequenceiq/ambari:2.0.0-consul                      "/start-agent"         2 hours ago   Up 2 hours           ambari-agent
d05b85859031   sequenceiq/consul:v0.4.1.ptr                        "/bin/start -adverti   2 hours ago   Up 2 hours           consul
[root@test ~]# docker exec -it 100e7c0b6d3d bash
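Instead of copying the container ID by hand, you can pick it out of the docker ps listing by container name. The sketch below assumes the name is the last column of each row, as in the output above; `container_id_by_name` is a hypothetical helper, not a docker command.

```shell
# Hypothetical helper: read `docker ps` output on stdin and print the ID
# of the container whose NAMES column matches the given name exactly.
container_id_by_name() {
  # NR > 1 skips the header row; $NF is the last column (NAMES), $1 is the ID.
  awk -v name="$1" 'NR > 1 && $NF == name { print $1 }'
}

# Usage on the host:
#   docker exec -it "$(docker ps | container_id_by_name ambari-agent)" bash
```

Docker's own filtering, for example `docker ps -q --filter name=ambari-agent`, should give a similar result, although the name filter matches substrings instead of whole names.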