Analytics on HDInsight Service – Harnessing the power of Hadoop on the Cloud
This week, we announced the launch of Hortonworks Data Platform (HDP) 1.3 for Windows, which brings our native Windows Hadoop distribution to parity with our Linux distribution. HDP for Windows is also the Hadoop foundation for Microsoft’s HDInsight Service, which delivers Hadoop and BI capabilities in the Azure cloud.
Impetus, a Hortonworks System Integrator partner, is an early adopter of the Hortonworks Data Platform (HDP) and has combined Hadoop with the Microsoft Azure platform in a number of successful big data implementations using Microsoft’s HDInsight Service.
Vineet Tyagi, Associate VP & Head of Innovation Labs at Impetus, is our guest blogger.
Our customers tell us that they have derived significant business value from running Hadoop in the cloud. Impetus-led implementations have used Hadoop MapReduce jobs as well as Mahout on top of HDInsight for data analytics tasks such as recommendation, clustering, and classification.
A few example use cases from customer implementations that successfully use HDInsight include:
- A recommender that takes weblogs as input and generates recommendations for users based on item/user similarity. The visualization was implemented using PowerPivot in Excel. With this solution, the customer realized an increase of more than 15% in conversion rates and order value.
- A categorization/classification engine built with Apache Mahout (using the Naive Bayes, Random Forest, and Complementary Naive Bayes algorithms). Apache Hive was used extensively for data preprocessing. The solutions were deployed successfully on the Azure cloud.
- An analytics solution for media-related data built on the HDInsight Service platform. The use case involved ingesting high volumes of data with 20+ dimensions and supporting both low-latency queries and richer analytics across those dimensions. The source data first went through an initial process of massaging, merging, and segregating before being loaded into HDFS and Hive. It was then preprocessed so that aggregated data could be stored in an SSAS cube.
The environments used included:
- A high-end Windows Server on Azure running the SSAS cube
- Hadoop cluster with Hive for querying (HDInsight Service)
- Azure Blob storage for data ingestion
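To make the first use case above more concrete, here is a minimal Python sketch of an item-similarity recommender over weblog-style events. It is an illustration only and not the Impetus or Mahout implementation: the `events` data, the function names, and the choice of cosine similarity over binary "user viewed item" signals are all assumptions made for this example. At production scale the same computation would typically run as distributed jobs on the cluster rather than in memory.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical (user, item) page-view events, standing in for parsed weblog lines.
events = [
    ("alice", "camera"), ("alice", "tripod"),
    ("bob", "camera"), ("bob", "tripod"), ("bob", "lens"),
    ("carol", "camera"), ("carol", "lens"),
    ("dave", "tripod"),
]

def item_similarities(events):
    """Cosine similarity between items, treating each item as a set of users."""
    users_by_item = defaultdict(set)
    for user, item in events:
        users_by_item[item].add(user)

    sims = defaultdict(dict)
    items = sorted(users_by_item)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            common = len(users_by_item[a] & users_by_item[b])
            if common:
                score = common / sqrt(len(users_by_item[a]) * len(users_by_item[b]))
                sims[a][b] = score
                sims[b][a] = score
    return sims

def recommend(user, events, top_n=3):
    """Rank unseen items by their summed similarity to the items the user has seen."""
    seen = {item for u, item in events if u == user}
    sims = item_similarities(events)
    scores = defaultdict(float)
    for item in seen:
        for other, score in sims[item].items():
            if other not in seen:
                scores[other] += score
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda it: (-scores[it], it))[:top_n]

if __name__ == "__main__":
    for user in ("alice", "dave"):
        print(user, recommend(user, events))
```

Binary view signals keep the sketch short; a real deployment might weight events by recency or purchase intent and would evaluate the recommendations against conversion data before rollout.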
The administrative and deployment tools available with the HDInsight platform have helped Impetus quickly conceptualize and implement solutions that deliver business value to enterprises. We are very enthusiastic about the new HDP 1.3 for Windows release and have already started using this upgrade in the solutions we implement for customers.
For further information on Impetus, visit www.Impetus.com.