Biopharmaceutical companies are embracing advanced analytics and a single view of Big Data. Disparate data systems, inefficient clinical trials, slow time-to-market, and a lack of operational safety can cost these companies their competitive advantage. Thus, continuous innovation around data is mission-critical.
Join Mark Baker, Senior Business Systems Architect at CSL Behring, on June 13th at 5:50pm as he presents:
Integrating and Analyzing Data from Multiple Manufacturing Sites Using Apache NIFI and Apache Zeppelin to a Central Hadoop Data Lake at CSL Behring
Abstract: In this talk, Mark Baker (CSL) will show how CSL Behring integrates and analyzes data from multiple manufacturing sites, using Apache NiFi to move it into a central Hadoop data lake.
The challenge of merging data from disparate systems has been a leading driver behind investments in data warehousing systems, as well as in Hadoop. While data warehousing solutions are ready-built for RDBMS integration, Hadoop adds the benefits of economical, near-unlimited scale, not to mention the variety of structured and unstructured formats that it can handle. Whether using a data warehouse, Hadoop, or both, physical data movement and consolidation is the primary method of integration.
Synchronizing rapidly changing data from a system of record to a consolidated Hadoop platform can also be a challenge. This introduces the need for “data federation,” where data is integrated without copying it between systems.
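To make the difference between consolidation and federation concrete, here is a minimal sketch. It uses SQLite from the Python standard library purely as a stand-in for the site systems and the central platform; it is not the Hadoop, NiFi, or Hive stack from the talk, and the table, column, and function names (`batches`, `yield_kg`, `consolidated_total`, `federated_total`) are hypothetical.

```python
import os
import sqlite3
import tempfile


def make_site_db(path, rows):
    """Create a stand-in 'site' database holding (batch_id, yield_kg) rows."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE batches (batch_id TEXT, yield_kg REAL)")
    conn.executemany("INSERT INTO batches VALUES (?, ?)", rows)
    conn.commit()
    conn.close()


def consolidated_total(site_paths):
    """Consolidation: physically copy every site's rows into one central table."""
    lake = sqlite3.connect(":memory:")
    lake.execute("CREATE TABLE batches (site TEXT, batch_id TEXT, yield_kg REAL)")
    for site, path in site_paths.items():
        src = sqlite3.connect(path)
        rows = src.execute("SELECT batch_id, yield_kg FROM batches").fetchall()
        src.close()
        lake.executemany(
            "INSERT INTO batches VALUES (?, ?, ?)",
            [(site, batch_id, yield_kg) for batch_id, yield_kg in rows],
        )
    total = lake.execute("SELECT SUM(yield_kg) FROM batches").fetchone()[0]
    lake.close()
    return total


def federated_total(site_paths):
    """Federation: attach each source and query it in place, copying nothing."""
    hub = sqlite3.connect(":memory:")
    selects = []
    for i, path in enumerate(site_paths.values()):
        alias = f"site{i}"  # generated locally, so safe to interpolate
        hub.execute(f"ATTACH DATABASE ? AS {alias}", (path,))
        selects.append(f"SELECT yield_kg FROM {alias}.batches")
    query = f"SELECT SUM(yield_kg) FROM ({' UNION ALL '.join(selects)})"
    total = hub.execute(query).fetchone()[0]
    hub.close()
    return total


def demo():
    """Build two hypothetical site databases and total them both ways."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = {
            "site_a": os.path.join(tmp, "site_a.db"),
            "site_b": os.path.join(tmp, "site_b.db"),
        }
        make_site_db(paths["site_a"], [("A-1", 10.0), ("A-2", 5.0)])
        make_site_db(paths["site_b"], [("B-1", 15.0)])
        return consolidated_total(paths), federated_total(paths)
```

Both functions return the same total. The consolidated version has to be re-run (or incrementally synced) whenever a site changes, while the federated version always reads the current source rows, at the cost of querying the source systems directly.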
For historical and batch use cases, data is replicated from remote data hubs into a central data lake using Apache NiFi. We will demo analyzing that data in Apache Zeppelin using Apache Spark and Apache Hive.
To see how CSL Behring and other companies leverage data to maximize their business potential, register for the DataWorks Summit today!