Analytics on HDInsight Service – Harnessing the power of Hadoop on the Cloud

This week, we announced the launch of Hortonworks Data Platform (HDP) 1.3 for Windows, which brings our native Windows Hadoop distribution to parity with our Linux distribution. HDP for Windows is also the Hadoop foundation for Microsoft’s HDInsight Service, which delivers Hadoop and BI capabilities in the Azure cloud.

Impetus, a Hortonworks System Integrator partner, is an early adopter of the Hortonworks Data Platform (HDP) and has leveraged the combined power of Hadoop and the Microsoft Azure platform for a number of successful big data implementations using Microsoft’s HDInsight Service.

Vineet Tyagi, Associate VP & Head of Innovation Labs at Impetus, is our guest blogger.

Our customers indicate that they have derived significant business value from leveraging the power of Hadoop in the cloud. Impetus-led implementations have used Hadoop MapReduce jobs as well as Mahout on top of HDInsight for data analytics tasks such as recommendations, clustering and classification.

A few example use cases from our customer implementations successfully using HDInsight include:

  1. A recommender that takes weblogs as input and generates recommendations for users based on item/user similarity. The visualization was implemented using PowerPivot over Excel. With this solution, the customer realized an increase of 15% or more in conversion rates and order value.
  2. A categorization/classification engine built with Apache Mahout (using the Naive Bayes, Random Forest, and Complementary Naive Bayes algorithms), with extensive use of Apache Hive for data preprocessing. The solutions were deployed successfully on the Azure cloud.
  3. An analytics solution for media-related data built on the HDInsight Service platform. The use case involved ingesting high volumes of data with 20+ dimensions and supporting both low-latency queries and richer analytics across those dimensions. The source data was first massaged, merged and segregated, then loaded into HDFS and Hive. It was then preprocessed so that aggregated data could be stored in an SSAS cube.

The environments used included:

  • High-end Windows Server on Azure running the SSAS cube
  • Hadoop cluster with Hive for querying (HDInsight Service)
  • Azure BLOB storage for data ingestion
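
To make the item-similarity idea behind the first use case more concrete, here is a minimal single-machine sketch in Python. It is not the Mahout or MapReduce code used in the customer implementation. The function names, the (user, item) input format and the choice of cosine similarity over binary "viewed" data are all assumptions made purely for illustration.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(views):
    """Cosine similarity between items from binary (user, item) view pairs.

    `views` stands in for records parsed out of weblogs (hypothetical format).
    """
    users_by_item = defaultdict(set)
    for user, item in views:
        users_by_item[item].add(user)

    sims = defaultdict(dict)
    items = sorted(users_by_item)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            shared = len(users_by_item[a] & users_by_item[b])
            if shared:
                score = shared / sqrt(len(users_by_item[a]) * len(users_by_item[b]))
                sims[a][b] = score
                sims[b][a] = score
    return sims


def recommend(user, views, top_n=3):
    """Rank unseen items by their summed similarity to the items the user has seen."""
    views = list(views)
    sims = item_similarities(views)
    seen = {item for u, item in views if u == user}

    scores = defaultdict(float)
    for item in seen:
        for other, score in sims.get(item, {}).items():
            if other not in seen:
                scores[other] += score
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda it: (-scores[it], it))[:top_n]


# Toy data: three users, three items (illustrative only).
sample_views = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "C"),
    ("u3", "B"), ("u3", "C"),
]

print(recommend("u1", sample_views))  # ['C']
```

On a real cluster the pairwise co-occurrence counting would be distributed across MapReduce jobs rather than held in memory, but the scoring logic follows the same outline.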

The administrative and deployment tools available with the HDInsight platform have helped Impetus quickly conceptualize and implement solutions that deliver business value to enterprises. We are very enthusiastic about the new HDP 1.3 for Windows release and have already started tapping the power of this upgrade in implementing solutions for customers.

For further information on Impetus, visit

Find out more about HDInsight Service, Hortonworks Data Platform for Windows and our partnership with Microsoft


