August 31, 2015

Creating a Strategic and Dynamic Data Supply Chain

Our guest blogger today comes from our partner Talend, who has been working with us for many years to help organizations transition from data chaos to a modern data architecture. In this blog, Talend’s Ashley Stirrup, CMO, talks about helping organizations support a dynamic data supply chain.

In order to remain viable in increasingly competitive markets, companies must create ever-more detailed models of the business that incorporate all data – regardless of source or volume. In essence, companies need to expand from a one-megapixel view of business activity to a fine-grained gigapixel view. This visibility, paired with the creation of predictive models, enables companies to understand what is likely to happen and what activities lead to desired outcomes. Companies can then focus on encouraging those outcomes with data-driven management practices.

Data Chaos

What holds most companies back from becoming truly data-driven? Not surprisingly, it’s the data: accessing it from many sources to create a detailed view, preparing it, transforming it, loading it, reporting on it, storing it, and analyzing it. To become a data-driven organization, a company must change the way it thinks about data and treat it as a highly strategic asset. That means companies must become masters of their data – and therein lies the challenge.

Many companies today operate in data chaos. Multiple data silos, poor data quality, the growth of big data, new data sources and inconsistent data across systems – all of these are contributing factors. Data chaos has also resulted from legacy and accidental architectures that built up over time.

Meanwhile, the process of extracting, transforming, and loading data (ETL) is changing. The standard model of application data moving through an ETL process and ending up in a data warehouse has been under stress for a long time. For many organizations, it has been the primary use case for Hadoop, where it has been modified to ELT: extract, load, and transform, with most of the T taking place inside Apache Hadoop.

Transforming ETL

But the transformation that is really required is much bigger than this. With new sources of data everywhere, ETL is a vital process that must allow the E, the T, and the L to be combined, ordered, and located wherever needed. At times, you will land data in Hadoop, transform it there, and load it somewhere else. At other times, you will extract data from mobile gateways and distribute it to both big data repositories and the data warehouse. In other words, ETL has grown into a form of data logistics that supports a dynamic data supply chain. And like supply chains in the real world, the infrastructure must allow for constant adjustment, reconfiguration, and optimization as conditions change, new data sources arrive, new technology is installed, and disruptions occur. Instead of taking two months to add a field with traditional ETL, modern data logistics must allow changes to be made in near real time.
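To make the idea of freely ordered E, T, and L steps concrete, here is a minimal Python sketch. It is an illustration only, not Talend's or Hortonworks' implementation: the function names (`extract_from_gateway`, `normalize_amount`, `load_into`, `run`) are hypothetical, and plain in-memory lists stand in for a Hadoop cluster and a data warehouse.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Step = Callable[[List[Record]], List[Record]]


def extract_from_gateway() -> List[Record]:
    """Hypothetical source standing in for a mobile gateway feed."""
    return [
        {"user": "a", "amount": "10.5"},
        {"user": "b", "amount": "3"},
    ]


def normalize_amount(records: List[Record]) -> List[Record]:
    """Transform step: cast the string amount to a float (on copies)."""
    return [{**r, "amount": float(r["amount"])} for r in records]


def load_into(store: List[Record]) -> Step:
    """Build a load step that appends to `store` and passes records on."""
    def step(records: List[Record]) -> List[Record]:
        store.extend(records)
        return records
    return step


def run(records: List[Record], *steps: Step) -> List[Record]:
    """Apply the steps in whatever order the caller lists them."""
    for step in steps:
        records = step(records)
    return records


hadoop: List[Record] = []     # assumption: stand-in for a big data repository
warehouse: List[Record] = []  # assumption: stand-in for a data warehouse

# ELT-style ordering: land raw data first, transform it, then load it elsewhere.
run(
    extract_from_gateway(),
    load_into(hadoop),
    normalize_amount,
    load_into(warehouse),
)
```

Because every step has the same shape, reordering the arguments to `run` gives a classic ETL flow (transform before any load), and adding another `load_into(...)` step distributes the same records to an additional destination without changing the other steps.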

A Dynamic Data Supply Chain

Talend and Hortonworks have been working together for many years to help organizations transition quickly and efficiently from data chaos to modern data architectures capable of supporting a dynamic data supply chain. Hortonworks is the leading contributor to Apache Hadoop, and the Hortonworks Data Platform is the foundation of this modern data architecture. Talend integrates data from all types of data sources: real-time data, application data, data warehouses, and big data sources such as Hadoop. It is designed to solve the entire problem of creating a dynamic data supply chain: integrating all types of data, normalizing them, and providing governed access at scale.

Many organizations still live, to one degree or another, with data chaos. They know that part of the truth is in one system or data source and other parts are in big data, but they have no effective way to integrate it all in time to matter. Ultimately, what Talend and Hortonworks provide is a way to change this situation and create a dynamic data supply chain that is foundational for data-driven organizations.
