The factory of the future will merge the virtual world with the real world.
Domhall Carroll, industry sector head of Siemens Ltd, made this point as he addressed the Engineers Ireland annual conference in May 2014. Carroll described the prior industrial revolutions leading up to today’s transformation: a fourth industrial revolution based on cyber-physical systems. His remarks aligned with three ways that we see our manufacturing customers solve data challenges in Industry 4.0. With HDP, they build a single view of the customer, predictive analytics for manufacturing operations, and visibility across the supply chain.
In his remarks, Carroll concluded that the factory of the future will treat everything as a service (including tools and skills) across an integrated supply-chain, nourished with highly available data.
Some of the world’s largest manufacturers subscribe to Hortonworks Data Platform (HDP) for that “highly available data”. With HDP, they integrate Apache Hadoop into their modern data architectures built for the “Industry 4.0” that Carroll described. Hortonworks has partnered with these industrial innovators, and together we’ve solved common data challenges that face manufacturing as well as other industries such as insurance, telecommunications, retail, and healthcare.
Manufacturing is closely tied to the retail enterprises that distribute its products to customers. Today, customers expect the same level of personalization from their manufacturers as they do from their retailers. That customer intimacy (approaching a batch size of one) starts with integrating a single view of the customer into the entire engineering process.
Manufacturers need more data to know more about who will use their product, how they will use it, and how they might push that product to (or beyond) its factory specifications.
Previously, manufacturers gathered that customer insight with “small data” methods such as product surveys, focus groups, or analysis of products returned for defect. Those approaches have obvious limitations: samples are small, respondents self-select, and feedback arrives long after the product ships.
But now the amount of data on product use is growing at the same exponential rate that we see across all industries. Some manufactured goods constantly transmit their own usage data. Mobile phones, cars, aircraft or even household appliances can send sensor or geo-location data on how and where they are actually used (across the actual range of real-life users).
Other manufactured goods such as pharmaceuticals, clothing or packaged foods may not transmit their own data but consumers of those goods leave a much larger trail of opinions in their web clickstreams and social media interactions. Although small amounts of this data may be biased, big data on sentiment can reveal patterns with a high level of confidence. Confidence grows even greater when these “human-generated” streams can be combined and correlated with the raw telemetry data streaming directly from the machine.
The challenge for the manufacturer is how to capture, combine and correlate all those multi-faceted data streams in one place. Traditional storage platforms were built for one particular function or line of business, and so many manufacturing enterprises find themselves with multiple, disconnected views of the same customers and products. Enterprise Hadoop, with YARN at its architectural center, can capture and store all types of data, regardless of source or structure—and then process that data in many different ways at once.
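At toy scale, the “capture, combine and correlate” step can be sketched as joining machine telemetry with human-generated sentiment on a shared key. This is an illustrative sketch, not a real HDP API; the product models, metrics, and data are assumptions for the example.

```python
# Hypothetical sketch: combine machine telemetry with human sentiment
# per product model. Names and data are illustrative, not a real schema.
from collections import defaultdict

telemetry = [  # (model, error events per 1,000 operating hours)
    ("A100", 4.2), ("A100", 3.8), ("B200", 9.1), ("B200", 8.7),
]
sentiment = [  # (model, sentiment score in [-1, 1] from clickstream/social text)
    ("A100", 0.6), ("A100", 0.4), ("B200", -0.5), ("B200", -0.3),
]

def mean_by_model(rows):
    acc = defaultdict(list)
    for model, value in rows:
        acc[model].append(value)
    return {m: sum(v) / len(v) for m, v in acc.items()}

errors = mean_by_model(telemetry)
scores = mean_by_model(sentiment)

# Combine the two streams: one row per model with both signals side by side.
combined = {m: (errors[m], scores[m]) for m in errors if m in scores}
for model, (err, score) in sorted(combined.items()):
    print(f"{model}: {err:.1f} errors/kh, sentiment {score:+.2f}")
```

In a real deployment the two streams would land in HDP from different sources and be joined at far larger scale, but the shape of the correlation step is the same.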
One major aircraft manufacturer uses HDP to capture multiple data streams on how pilots operate each of its aircraft. The company gathers streaming data such as cruising velocity, pitch, takeoff and landing characteristics, altitude, outside temperature and air pressure. It can analyze this data as it streams in to diagnose issues and prioritize repairs. That same data can be stored in HDP for years, providing long-range diagnostics that span a wide range of climates, use cases and pilots—and deep insight to engineers working on the next model.
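The “diagnose issues as data streams in” step can be sketched with a simple rolling baseline: flag any reading that drifts far from the recent history. This is a minimal illustration, not the manufacturer’s actual pipeline; the window size, threshold, and readings are assumptions.

```python
# Hedged sketch of stream-side diagnostics: flag sensor readings that sit
# far outside the recent baseline, as one might to prioritize repairs.
from collections import deque
from statistics import mean, stdev

def flag_anomalies(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings more than `threshold` standard
    deviations from the mean of the preceding `window` readings."""
    history = deque(maxlen=window)
    for i, value in enumerate(readings):
        if len(history) == window:
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                yield i, value
        history.append(value)

# Cruise-altitude outside-temperature stream (deg C) with one bad spike:
stream = [-54.1, -54.3, -53.9, -54.0, -54.2, -54.1, -30.0, -54.0]
print(list(flag_anomalies(stream)))  # only the -30.0 reading is flagged
```

The same logic run over years of stored history, rather than a five-reading window, is what turns a repair trigger into the long-range diagnostics the paragraph describes.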
Today’s manufacturing processes are complex. As Carroll pointed out, Industry 4.0 involves a close interaction between cyber and physical systems. Data is the thread that ties those systems together, and big data in HDP allows manufacturers to build robust predictive analytics that will show them what is likely to happen as their cyber and physical systems interact.
Legacy data platforms limit predictive capabilities because they cannot scale economically and their architectures limit the types of data they can ingest. Those legacy platforms may still work well for the operational purposes they were designed for, but HDP, with YARN at its center, allows manufacturers to analyze any type of data for batch, interactive or real-time applications. These advanced analytic applications make operations smarter with self-healing processes. They can inform predictive equipment maintenance and proactive quality management.
Merck provides a good example of how to harness big data for predictive analytics. The company is one of the world’s largest manufacturers of vaccines, and a 2014 article in InformationWeek described how the company uses big data in Hadoop to improve its manufacturing yields.
The article points out the importance of controlling quality in vaccine manufacturing:
Vaccines often contain attenuated viruses, meaning they’re altered so they give you immunity but not the actual disease, and thus they have to be handled under precise conditions during every step in the manufacturing process. Components might have to be stored at exactly -8 degrees for a year or more, and with even a slight variance from regulator-approved manufacturing processes, the materials have to be discarded.
That discard was costing the company hundreds of millions of dollars in lost revenue. Now the team monitors many more variables and stores those measurements from each batch. Predictive analytics over that superior data set help Merck improve yields and revenue.
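A toy version of that batch monitoring might scan per-batch storage-temperature logs against the approved band. This is an illustrative sketch, not Merck’s actual pipeline; the −8 °C set point comes from the quoted article, while the ±0.5° tolerance, batch IDs, and readings are assumptions for the example.

```python
# Illustrative sketch (not a real pharma system): flag any batch whose
# storage temperature drifted outside the regulator-approved band,
# assumed here to be -8 degrees with a +/-0.5 degree tolerance.
SPEC, TOLERANCE = -8.0, 0.5

batches = {
    "B-1021": [-8.0, -8.1, -7.9, -8.0],
    "B-1022": [-8.0, -7.2, -8.1, -8.0],  # one excursion above spec
    "B-1023": [-8.4, -8.3, -8.5, -8.4],  # near the cold edge, still in band
}

def out_of_spec(readings):
    return [t for t in readings if abs(t - SPEC) > TOLERANCE]

for batch_id, readings in sorted(batches.items()):
    bad = out_of_spec(readings)
    status = f"DISCARD ({len(bad)} excursion(s))" if bad else "ok"
    print(batch_id, status)
```

Storing every such measurement for every batch, rather than discarding it after the run, is what makes the later predictive analysis over yields possible.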
WD, a Western Digital company, also uses HDP to optimize its manufacturing with predictive analytics. The WD manufacturing process is state of the art and driven by data. The first phase of the process occurs in the clean room, where each drive’s components are assembled. Sensors capture data at each step, for each drive. Across about 200 million devices per year, this adds up to petabytes of information.
Before Hadoop, WD retained data for between three months and one year. Hortonworks Data Platform extends the data retention horizon to two years. With this additional data, WD engineers can identify subtler patterns that may not surface over a shorter time horizon and build predictive analytics that further optimize its advanced manufacturing operations.
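Why a longer retention window surfaces subtler patterns can be shown with a small numeric sketch: a slow upward drift in a monthly defect rate is clear over 24 months but lost in the noise of the most recent 3. The data and the least-squares trend fit are illustrative assumptions, not WD figures.

```python
# Sketch: the same slow drift is visible over a long window but not a short one.
def slope(values):
    """Least-squares slope of values against their index."""
    n = len(values)
    x_mean, y_mean = (n - 1) / 2, sum(values) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den

# 24 monthly defect rates (%) drifting up ~0.02/month, with a small wobble:
rates = [1.00 + 0.02 * m + (0.0, 0.12, -0.12)[m % 3] for m in range(24)]

print(f"24-month trend: {slope(rates):+.4f} %/month")  # positive drift
print(f" 3-month trend: {slope(rates[-3:]):+.4f} %/month")  # noise dominates
```

Over the full window the fitted slope recovers the underlying drift; over the last three months the wobble swamps it and the fit even points the wrong way.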
Manufacturers face a paradox. They may design and manufacture an excellent product that meets rigorous design requirements, confirmed through exhaustive testing. But their ability to deliver that product to market relies on the upstream delivery of raw materials and the downstream shipping and distribution of what they produce.
Although manufacturers do not usually control the upstream and downstream components of their supply chains, they want to predict likely outcomes across the value chain that affects their products.
Hortonworks subscribers in manufacturing use HDP to manage inventory and avoid shortages of key raw materials for their production runs. With new data discovery capabilities in Hadoop, manufacturers increase their own operational efficiency with n-tier visibility into their supply chain. When products are returned for defect, they can analyze data from the unit’s production run—even if it occurred years prior. These forensics help with root cause analysis to identify whether problems arose from raw materials, the manufacturing process, or post-sale usage.
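The root-cause forensics described above can be sketched as a trace from returned serial numbers back through production runs to raw-material lots. The table shape, IDs, and counts here are hypothetical, chosen only to show the join-and-count pattern.

```python
# Hypothetical root-cause sketch: trace returned units to their production
# runs and raw-material lots, then count returns per lot to find suspects.
from collections import Counter

production_runs = {  # serial number -> (production run, raw-material lot)
    "SN-001": ("RUN-7", "LOT-A"),
    "SN-002": ("RUN-7", "LOT-A"),
    "SN-003": ("RUN-8", "LOT-B"),
    "SN-004": ("RUN-9", "LOT-A"),
}
returned = ["SN-001", "SN-002", "SN-004"]

returns_per_lot = Counter(production_runs[sn][1] for sn in returned)
print(returns_per_lot.most_common())  # LOT-A stands out as a suspect
```

Because returns cluster on one raw-material lot across different production runs, the analysis points upstream to the material rather than to the manufacturing process itself.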
With greater supply chain visibility, they can also connect logistics steps that they currently see as isolated data islands, freeing up working capital and reducing operating expenses.
At Hortonworks, we’ve seen manufacturing subscribers improve customer service, optimize their manufacturing systems, and better manage their supply chains through a number of data-driven initiatives. They leverage advanced analytic applications to improve visibility of data already under management and also to optimize their data architectures to reduce costs.
In my next post in the series, I’ll discuss some specific manufacturing solutions with HDP and describe how those solutions impacted the bottom-line results of our subscribers.