How To Use Microsoft Excel to Visualize Hadoop Data

Access to Big Data for Half a Billion Excel Users

Microsoft and Hortonworks have been working together for over two years now with the goal of bringing the power of Big Data to a billion people. As a result of that work, today we announced the General Availability of HDP 2.0 for Windows with the full power of YARN.

There are already over half a billion Excel users on this planet.

So, we have put together a short tutorial on the Hortonworks Sandbox where we walk through the end-to-end data pipeline using HDP and Microsoft Excel in the shoes of a data analyst at a financial services firm where she:

  • Cleans and aggregates 10 years of raw stock tick data from NYSE
  • Enriches the data model by looking up additional attributes from Wikipedia
  • Creates an interactive visualization on the model

You can find the tutorial here.

As part of this process you will experience how simple it is to integrate HDP with the Microsoft Power BI platform.

This integration is made possible by the community work to design and implement WebHDFS, an open REST API in Apache Hadoop. Microsoft used the API from Power Query for Excel to make the integration to Microsoft Business Intelligence platform seamless.

Happy Hadooping!!!

Categorized by :
Big Data Business Analytics Data Analyst & Scientist Developer Hadoop Ecosystem HDP 2 HDP for Windows Modern Data Architecture Other Sandbox

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

HDP 2.1 Webinar Series
Join us for a series of talks on some of the new enterprise functionality available in HDP 2.1 including data governance, security, operations and data access :
Integrate with existing systems
Hortonworks maintains and works with an extensive partner ecosystem from broad enterprise platform vendors to specialized solutions and systems integrators.
Get started with Sandbox
Hortonworks Sandbox is a self-contained virtual machine with Apache Hadoop pre-configured alongside a set of hands-on, step-by-step Hadoop tutorials.