In this blog post, Kevin Petrie (Attunity Senior Director of Marketing) joins me to share thoughts on Hadoop and the Enterprise Data Warehouse.
Some believe that Hadoop and the Enterprise Data Warehouse (EDW) will continue to coexist side by side, each solving different use cases. The peanut butter is over here, and the chocolate is over there.
At Hortonworks and Attunity, we see something else. We see how Hortonworks subscribers use Hortonworks Data Platform (HDP) for EDW optimization. Together, we use Attunity’s Appfluent Visibility software to find just the right mix of ingredients for this Big Data confection.
Join Hortonworks, Attunity and RCG for an April 23 webinar on Data Warehouse Optimization
Like peanut butter and chocolate, Hadoop and the EDW are better in combination.
The simple dollar math of spotting cold EDW data and moving it to Hadoop is easily understood and compelling: if you don’t need to analyze data in the EDW, then storage in Hadoop costs you less.
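That dollar math can be sketched in a few lines. This is only an illustration: the function name and the per-terabyte cost parameters are hypothetical, and the figures in the usage line are placeholder cost units, not actual EDW or Hadoop prices.

```python
def annual_storage_savings(cold_tb: float,
                           edw_cost_per_tb: float,
                           hadoop_cost_per_tb: float) -> float:
    """Estimate yearly savings from storing cold data in Hadoop instead of the EDW.

    All inputs are assumptions supplied by the caller:
    cold_tb            -- terabytes of EDW data that is no longer analyzed there
    edw_cost_per_tb    -- yearly cost of keeping one terabyte in the EDW
    hadoop_cost_per_tb -- yearly cost of keeping one terabyte in Hadoop
    """
    if cold_tb < 0:
        raise ValueError("cold_tb must not be negative")
    # Savings are the per-terabyte cost difference times the volume moved.
    return cold_tb * (edw_cost_per_tb - hadoop_cost_per_tb)


if __name__ == "__main__":
    # Placeholder cost units for illustration only; substitute your own figures.
    print(annual_storage_savings(cold_tb=100, edw_cost_per_tb=10.0, hadoop_cost_per_tb=1.0))
```

The calculation only covers storage. It leaves out migration effort and the other value drivers discussed in this post.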
But enterprises manage their Big Data infrastructures on several dimensions. Let’s walk through a few Hadoop value drivers that recently caught the attention of a joint Hortonworks and Appfluent customer, a global retail giant.
- Optimize EDW storage by archiving cold data. If your EDW is bursting at the seams, step one is to offload cold data onto HDP. This particular retail enterprise estimates that by adding HDP it can accommodate 20-30% data growth this year without expanding its existing warehouse. Its EDW can support ad-hoc or planned analysis, while Hadoop holds the archival data for reporting and staging.
- Improve EDW performance by offloading ETL processing. A big chunk of EDW CPU cycles is spent on data transformation, which can slow ingestion, analytic queries and overall EDW performance. Instead, use Hadoop as the landing zone for incoming data. Transform it there more efficiently and economically, then load it into your EDW for analytics.
- Enhance analyst productivity. Freeing up valuable EDW capacity enabled analytic queries to run more quickly, more reliably and with less tuning. Data analysts put time they previously spent waiting for query results or troubleshooting their jobs to more productive uses.
- Increase confidence in business decisions. The retailer's EDW was stretched close to 100% utilization, so its CPUs could not run certain queries over more than one-third of the relevant data. Moving some of the data to Hadoop enabled this customer to run the same queries on 100% of the relevant data. Hadoop also enables the retailer to cost-effectively process this data alongside years of historical records, as well as structured and unstructured data from new sources.
More data, different types of data and data stretching back further in time: all of these factors combine to improve confidence in the results. Better business decisions flow from that confidence.
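As a rough illustration of the first step in the list above, spotting cold data can be thought of as filtering tables by how recently they were queried. The sketch below is not how Appfluent Visibility works. The `TableStats` record, the function names and the 365-day threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class TableStats:
    """Hypothetical usage record for one EDW table."""
    name: str
    size_tb: float
    last_queried: date


def find_cold_tables(tables: List[TableStats],
                     as_of: date,
                     cold_after_days: int = 365) -> List[TableStats]:
    """Return tables whose last query is older than the cutoff (an assumed rule)."""
    cutoff = as_of - timedelta(days=cold_after_days)
    return [t for t in tables if t.last_queried < cutoff]


def reclaimable_tb(tables: List[TableStats],
                   as_of: date,
                   cold_after_days: int = 365) -> float:
    """Total size of the cold tables, i.e. EDW capacity an offload would free."""
    return sum(t.size_tb for t in find_cold_tables(tables, as_of, cold_after_days))


if __name__ == "__main__":
    # Made-up inventory for illustration only.
    inventory = [
        TableStats("sales_current", 4.0, date(2015, 4, 1)),
        TableStats("sales_2009", 6.0, date(2012, 1, 15)),
    ]
    today = date(2015, 4, 20)
    for table in find_cold_tables(inventory, as_of=today):
        print(table.name, table.size_tb)
    print(reclaimable_tb(inventory, as_of=today))
```

In practice the threshold would depend on reporting cycles and retention rules, and last-query dates would come from whatever usage monitoring the warehouse provides.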
So as you make new investments in Hadoop, look for ways to optimize your familiar EDW investment. For the analyst or data scientist, these are two great tastes that taste great together.
For more on how this works, read the Hortonworks white paper: Data Architecture Optimization with Hortonworks Data Platform.
Register for the Webinar on Thursday April 23 at 11am Pacific Time: Optimizing the Modern Data Warehouse
Vishal Dhanuka is Managing Director and Head of Customer Innovation & Strategy at Hortonworks. He is responsible for developing Hortonworks’ value-based go-to-market strategy across industries.