April 10, 2015

Optimize Your Data Architecture with Hadoop

Opportunity abounds

According to the enterprise data usage experts at Appfluent, the typical Enterprise Data Warehouse (EDW) dedicates 70% of its storage volume to unused data and 55% of its processing capacity to low-value ETL workloads. That capacity is wasted in what could otherwise be a high-performance, finely tuned analytics and reporting environment that supports enterprise priorities. Even worse, EDW environments often cannot deal with the varied structures of new data sources that offer so much untapped value.


The business dashboards an EDW makes possible fuel today’s enterprises with insights and opportunity. But organizations are strained by the escalating cost of expanding the EDW, unnecessary complexity in the EDW environment, and an inability to expand the EDW to manage data growth and variety.

But what if you could free up 70% of storage capacity, put 55% of your processing to work on other tasks, and enrich your EDW with new data sources you’ve never brought together before? Would you change the cost structure of IT in your organization? Could you deliver new insights that provide a competitive advantage to your business? Might you uncover ways to turn your data into entirely new revenue streams?

Capturing opportunity, pocketing cash

Apache Hadoop adoption continues its high-velocity growth trajectory, due in large part to favorable economics: managing high volumes of any type of data is simply much less costly with Hadoop than with the alternatives. What was previously impossible in data management is suddenly happening, and the business impact is significant.

Hortonworks customer TrueCar is revolutionizing car buying with a business built on a thorough understanding of every car purchase made every day. It’s a business built on data, and they compete by managing that data better and at a lower cost than their competitors. With Hortonworks Data Platform (HDP), TrueCar now estimates they can manage data at 12 cents per GB. Traditional alternatives would have cost them $19 per GB. They’ve captured a cost benefit of two orders of magnitude in the resource that drives their business.
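The "two orders of magnitude" figure can be checked with a few lines of arithmetic. The sketch below uses only the two per-GB costs quoted above; the function name `cost_ratio` is ours, for illustration only, and the calculation assumes both figures measure the same thing (cost to manage one GB).

```python
import math


def cost_ratio(traditional_per_gb: float, hadoop_per_gb: float) -> float:
    """Return how many times more the traditional option costs per GB."""
    return traditional_per_gb / hadoop_per_gb


# Per-GB figures quoted in the text: $19 (traditional) vs. $0.12 (HDP).
ratio = cost_ratio(19.00, 0.12)

# log10 of the ratio gives the difference in orders of magnitude.
orders = math.log10(ratio)

print(f"ratio: {ratio:.0f}x, orders of magnitude: {orders:.1f}")
```

The ratio works out to roughly 158x, a little over two orders of magnitude, which is consistent with the estimate above.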

Long-established companies are taking advantage of Hadoop’s economics as well. A large bank expects to save over $40 million in data warehousing costs this year alone, simply by moving ETL processing to HDP. A major retailer deployed HDP to manage data at 1/50th the cost per TB of their EDW, freeing up millions of dollars to invest elsewhere.

Archive, Offload, Enrich

Because of these economics, every modern data architecture will include Hadoop as a core element. In our work with hundreds of customers, we see three approaches organizations take to optimize their data architecture:

  • Archive – moving high volumes of data to HDP as an active archive, starting with unused data, storing data longer, and building a data lake
  • Offload – moving ETL workloads to HDP, taking advantage of distributed processing in Hadoop and freeing existing systems for analytic workloads
  • Enrich – refining new types of data in HDP and bringing those into the EDW where they can add value to analytics and reporting


While archive and offload deliver significant cost savings, things get interesting when organizations enrich data with Hadoop. ZirMed provides a data service to help healthcare providers collect accurate payments. They use HDP to enrich their EDW with new data types, including pharmacy receipts, text messages, and patient web activity. By bringing together data that could not easily be combined in the past, ZirMed offers its customers new insights to help them bring in more revenue.

Getting started

The scale of Hadoop’s economic advantage over alternatives means nearly any organization can capture quick wins that drive measurable return on investment. If you are wrestling with how to expand your EDW or data architecture while controlling costs, Hadoop offers a solution.

Most organizations start small but move quickly. That’s why we designed HDP Jumpstart, which combines the support, services, and training you need to succeed.

Download Enterprise Data Warehouse Optimization White Paper

Contact Hortonworks today to learn more.

