How to Refine and Visualize Twitter Data

He loves me, he loves me not… Using daisies to figure out someone’s feelings is so last century. A much better way to determine whether someone likes you, your product, or your company is to analyze Twitter feeds and get better data on what the public is saying. But how do you take thousands of tweets and process them? Our video, Understand your customers’ sentiments with Social Media Data, shows how you can capture a Twitter stream and run sentiment analysis on it.

Now, when you boot up your Hortonworks Sandbox today, you’ll find Tutorial 13: Refining and Visualizing Sentiment Data as the companion step-by-step guide to the video. In this Hadoop tutorial, we show you how to take a Twitter stream and visualize it in Excel 2013 or in your own favorite visualization tool. Note that you can use any version of Excel, but Excel 2013 lets you plot the data on a map, whereas other versions limit you to the built-in charting functions.

In this tutorial, you’ll work through the following steps:

  • Download and extract the sentiment tutorial files that we’ve included with the tutorial.
  • Load Twitter data into the Hortonworks Sandbox.
  • Copy a Hive script to the Sandbox.
  • Run the Hive script to refine the raw data.
  • Access the refined sentiment data with Excel.
  • Visualize the sentiment data using Excel Power View.
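
To make the "refine" step above more concrete, here is a minimal Python sketch of what refining raw tweets for sentiment can look like. It is not the tutorial's Hive script, which is not reproduced in this post. It assumes one JSON tweet per line, a simple word-list scoring approach, and a flat `country` field; the word lists, the field names other than `text`, and the function names are all hypothetical and chosen only for illustration.

```python
import json
from collections import Counter

# Hypothetical word lists; a real analysis would use a much larger dictionary.
POSITIVE = {"love", "great", "awesome", "good"}
NEGATIVE = {"hate", "awful", "terrible", "bad"}


def score_tweet(text):
    """Label text as positive, negative, or neutral from simple word counts."""
    words = [w.strip(".,!?#@\"'").lower() for w in text.split()]
    balance = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if balance > 0:
        return "positive"
    if balance < 0:
        return "negative"
    return "neutral"


def refine(raw_lines):
    """Parse one JSON tweet per line and keep only the fields needed for charting."""
    rows = []
    for line in raw_lines:
        line = line.strip()
        if not line:
            continue
        try:
            tweet = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed records
        if not isinstance(tweet, dict):
            continue  # skip records that are valid JSON but not tweet objects
        rows.append({
            # "country" as a flat field is an assumption made for this sketch.
            "country": tweet.get("country", "unknown"),
            "sentiment": score_tweet(tweet.get("text", "")),
        })
    return rows


def summarize(rows):
    """Count tweets per (country, sentiment) pair, ready to chart or map."""
    return Counter((row["country"], row["sentiment"]) for row in rows)


sample = [
    '{"text": "I love this movie!", "country": "US"}',
    '{"text": "Terrible plot, bad acting.", "country": "GB"}',
    '{"text": "Seeing it tonight", "country": "US"}',
    "not json",
]
rows = refine(sample)
summary = summarize(rows)
for (country, sentiment), count in sorted(summary.items()):
    print(country, sentiment, count)
```

The per-country counts are the kind of refined table a charting or mapping tool can consume directly; in the tutorial itself, the equivalent work is done by the Hive script inside the Sandbox.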

If you don’t have Excel 2013 but you’d like to do some cool visualizations, you can hook up any visualization tool that can connect through an ODBC driver. For assistance refer to:

We’ve also included some tutorials from our partners, like Tableau, to help you with this. The Sandbox Partner Gallery can be found here.

Don’t have the Sandbox? You can download it here. It’s our free, single node HDP environment that can run on your laptop.

Already have the Sandbox and want to play with this new tutorial? On startup, the Sandbox will pull the new tutorial into your version, or you can tell the Sandbox to “Update” the tutorials from the “About Hortonworks Hue” button.

Now, put down that daisy and enjoy the new tutorial!



