September 05, 2014

Tableau on Hadoop Gets Better and Better with

Continuing our ecosystem momentum for the next generation of SQL in Hadoop, Dustin Smith, Product Marketing Manager at Tableau Software, is here to share his insights on the potential it holds for the individual data worker and the data-driven company alike.

The work delivered over the last year as part of Stinger has made a tremendous impact for our customers who use Tableau to analyze Hadoop data, and we are excited to see this momentum continue under the leadership of Hortonworks within the Apache Hive community. And now, with the next generation of that work, Tableau on Hadoop gets better and better. This initiative will increase the power and flexibility of Apache Hive, building on its petabyte-scale capabilities with sub-second queries against a Hadoop ecosystem (among other aims). This has major implications for organizations looking to put the power of their massive data assets into the hands of everyday business users who need ad hoc data discovery and reporting to drive insight and innovation.

There are few things in this life that data nerds like me love more than a giant new data set to play with. The anticipation of digging into trends and discovering outliers is a palpable feeling, much like the excitement you felt as a kid bursting through the school doors onto the playground for recess. The freedom to explore and the possibility of discovery are central to being naturally curious. Conversely, there are few things in this life that data nerds like me hate more than waiting for a query to run. It’s like watching those precious moments of playground freedom trickle through your fingers. Logically, as an adult, I know that the bigger my data playground is, the longer my queries will take to run, since the questions I ask may be pointed at billions of rows.

It doesn’t stop me from wanting to pout, though.

This is why I’m so excited for the initiative as a Tableau user. It means continuing to break away from the traditional way Hadoop is sometimes thought of (just for cold data storage or for pre-planned data projects with chunks of time blocked out) and constantly looking to Hadoop in the “I have a data question right now” situations. In the world of self-service analytics and the legions of data-hungry people who live there, the prospect of sub-second queries against structured and unstructured data sets scaling into the petabyte range means the true power of Hadoop becomes more than just accessible; it becomes something that can be leveraged in the day-to-day flow of answering data questions (while in a meeting, interacting with a client or troubleshooting a report).


The additional aims of the initiative, including improved complex query handling and better overall workload management, mean that organizations as a whole can keep widening the pool of users leveraging Apache Hive to empower data discovery and insight. To companies embracing a data-driven, self-service analytics culture, this will be of critical importance as more and more types of data are housed inside Hadoop, with the potential to help every aspect of a business. Whether you are a data nerd like me who actively plays in your company’s Hadoop data playground or you are more focused on the overall well-being of a Hadoop deployment, Hortonworks’ commitment to the initiative means data workers and IT groups alike are in store for amazing things.
