December 12, 2013

Modern Manufacturing Architectures Built with Hadoop

In God we trust, all others must bring data.
Dr. W. Edwards Deming

Dr. W. Edwards Deming was a statistician and manufacturing consultant who worked on Japanese reconstruction after WWII. His quality control methods influenced innovative Japanese manufacturing processes that simultaneously increased volume, reduced cost, and improved quality. Near the end of his career, Deming taught the same lessons to U.S. automakers.

To this day, the “Deming Prize” is one of the highest awards for Total Quality Management in the world.

Deming understood how to establish a manufacturing process and then use data to continuously improve that process. But during Deming’s life, the cost of capturing and storing large amounts of data was much higher than it is today.

Now relatively inexpensive sensors can gather and frequently transmit discrete bits of information at many points along the production line. This massive flow of real-time sensor data allows managers to identify problems as they occur. If sensor data indicates a problem, quality managers can “stop the line” and make corrections with minimal loss of time and material.
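To make the “stop the line” idea concrete, here is a minimal sketch of a control-limit check in Python. It assumes a simple three-sigma rule computed from a baseline of in-control readings. The function names, the torque values, and the three-sigma threshold are all illustrative assumptions, not a description of any customer's system.

```python
from statistics import mean, stdev

def control_limits(baseline, sigmas=3.0):
    """Return (lower, upper) limits from readings taken while the process was in control."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation
    return center - sigmas * spread, center + sigmas * spread

def should_stop_line(reading, limits):
    """Flag a reading that falls outside the control limits."""
    lower, upper = limits
    return reading < lower or reading > upper

# Hypothetical torque readings from one station (made-up values).
baseline = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.1, 9.9]
limits = control_limits(baseline)

incoming = [10.05, 9.95, 11.4]
flags = [should_stop_line(r, limits) for r in incoming]
print(flags)
```

In practice the limits would be recomputed from historical data held in the cluster, and the check would run against the live sensor feed.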

An Apache Hadoop modern data architecture also has strategic power. Our manufacturing customers use Hortonworks Data Platform (HDP) to move beyond reactive error avoidance to proactive process improvement.

WD, a Western Digital company, is a good example of this among our manufacturing customers. WD used to retain a relatively small portion of its manufacturing test data, and kept it for only 3 to 12 months. Now the company retains several times more data and keeps it for at least 24 months.

The WD manufacturing team provides a Critical Parameter Dashboard that allows other employees to drill down into the data. Now with HDP, they can create that dashboard ten times more quickly than before.

By satisfying the growing internal demand for data more quickly, the WD Hadoop team helps the entire organization make better decisions and respond more nimbly to changes in technology and the marketplace.

This helps WD further innovate its processes and products, which enhances manufacturing efficiency.

Here is a general manufacturing reference architecture that reflects common system components and processes used by some of our manufacturing customers:

[Figure: Manufacturing reference architecture]

Loading Manufacturing Data to Hadoop

Apache Sqoop is included in Hortonworks Data Platform as a tool to transfer data between external structured data stores (such as SQL Server, Teradata or Oracle) and HDFS or related systems like Hive and HBase. We also see our customers using WebHDFS and Hadoop file system shell commands like hadoop fs -put to ingest data into Hadoop.
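As a rough illustration of these three ingestion paths, the Python sketch below builds (but does not run) a Sqoop import command, a hadoop fs -put command, and a WebHDFS file-creation URL. The host names, port, table name, paths, and user name are hypothetical placeholders. The Sqoop options shown (--connect, --username, -P, --table, --target-dir, --num-mappers) and the WebHDFS op=CREATE operation are standard, but check them against the versions in your own cluster.

```python
import shlex
from urllib.parse import quote, urlencode

def sqoop_import_args(jdbc_url, table, target_dir, username, mappers=4):
    """Build the argument list for a Sqoop import from a relational table into HDFS."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "-P",                          # prompt for the password on the console
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(mappers),
    ]

def hdfs_put_args(local_path, hdfs_dir):
    """Build the argument list for copying a local file into HDFS."""
    return ["hadoop", "fs", "-put", local_path, hdfs_dir]

def webhdfs_create_url(namenode, port, hdfs_path, user):
    """Build the WebHDFS REST URL used to start creating a file (sent as an HTTP PUT)."""
    query = urlencode({"op": "CREATE", "user.name": user})
    return f"http://{namenode}:{port}/webhdfs/v1{quote(hdfs_path)}?{query}"

# Hypothetical values, for illustration only.
sqoop_cmd = sqoop_import_args(
    "jdbc:oracle:thin:@db.example.com:1521/MFG",
    "TEST_RESULTS", "/data/mfg/test_results", "etl",
)
put_cmd = hdfs_put_args("sensor_2013-12-12.csv", "/data/mfg/sensors")
url = webhdfs_create_url("nn.example.com", 50070, "/data/mfg/sensors/a.csv", "etl")

print(shlex.join(sqoop_cmd))
print(shlex.join(put_cmd))
print(url)
```

Building the commands as argument lists keeps them easy to log and to pass to subprocess.run once the values have been checked.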

Processing Manufacturing Data

Depending on the use case, manufacturing teams process data in batch (using Apache Hadoop MapReduce and Apache Pig), interactively (with Apache Hive), online (with Apache HBase), or as streams (with Apache Storm).
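To give a feel for the batch style, here is a small Python sketch that mimics the map, group, and reduce phases of a MapReduce job to compute a failure rate per station. It is a plain in-memory stand-in, not the Hadoop MapReduce or Pig API, and the station names and records are made up for illustration.

```python
from collections import defaultdict

def map_record(record):
    """Map phase: emit (station, (tests, failures)) for one test record."""
    station, passed = record
    yield station, (1, 0 if passed else 1)

def reduce_station(station, values):
    """Reduce phase: combine the counts for one station into a failure rate."""
    total = sum(tests for tests, _ in values)
    failures = sum(failed for _, failed in values)
    return station, failures / total

def run_batch(records):
    """Group mapped values by key, then reduce each group."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in map_record(record):
            grouped[key].append(value)
    return dict(reduce_station(k, v) for k, v in grouped.items())

# Hypothetical test records: (station, passed?)
records = [
    ("press-1", True), ("press-1", False),
    ("weld-2", True), ("weld-2", True),
]
rates = run_batch(records)
print(rates)
```

On a real cluster the same logic would be distributed across many nodes, with the framework handling the grouping step between map and reduce.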

Analyzing and Visualizing Manufacturing Data

Once data is stored and processed in Hadoop, it can either be analyzed in the cluster or exported to relational data stores for analysis there. These relational data stores might include:

  • MySQL
  • SQL Server
  • Teradata
  • Oracle

HDP 2.0, with its YARN-based architecture, allows multiple applications to access the same data set and perform different types of analysis in parallel. This means that the following analytic and visualization applications can all work with the same data, on the same Apache Hadoop cluster, at the same time.

Manufacturers might use the following tools (and others) to easily access, analyze and visualize data in Hadoop:

Keep What You Have, Then Add Apache Hadoop

Our manufacturing customers have added Apache Hadoop to their existing architectures, without removing their familiar components or processes. HDP is a “plus one” addition to their existing data architecture, to create a modern data architecture that is interoperable and familiar.

This means that the same team of analysts and practitioners can use their existing skills to pursue the same manufacturing goals as those Deming championed: increased volume, reduced cost and better quality.


As more companies integrate Hadoop into their manufacturing of electronics, aircraft, automobiles, chemicals, household appliances and industrial equipment, they can feed their data into manufacturing dashboards and share those dashboards with work groups outside the manufacturing operations group.

Improved visibility into the manufacturing process can help designers, product managers, and procurement teams make decisions that facilitate manufacturing and improve the final product.

This is the second in our series on modern data architectures across industry verticals. Recently we discussed Modern Healthcare Architectures Built with Hadoop.

Watch our blog in the coming weeks for reference architectures in other industry verticals.

