May 26, 2017

Precision Medicine: a 5 Million Person Case Study

The San Jose DataWorks Summit (June 13-15) is nearly upon us! Our array of speakers is larger and even more impressive than last year. This year, one of the sessions in our Keynote and Enterprise Adoption tracks will feature Dr. Wade Schulz, Resident Physician, Clinical Pathology, at the Yale School of Medicine. Co-presenting will be Hao Dai, Deputy Director, Biobank Center, at the National Center for Cardiovascular Diseases in China.

Join Dr. Schulz and Hao Dai on Wednesday, June 14th, at 3:00pm, as they present:

International Precision Medicine – A Five Million Person Case Study in Hadoop

Precision medicine and digital health research have the potential to improve the efficiency and effectiveness of healthcare delivery. With the continued interest in population genomics and its effect on disease, it is now essential to provide robust technical platforms for data management and analysis in order to realize the full potential of these large studies. In this session, we will present the use case for an ongoing international research collaboration between Yale University, the National Center for Cardiovascular Disease, and colleagues with a shared goal of enrolling five million individuals. We will discuss the architecture and design of the Hadoop-based infrastructure used for data integration that enables researchers around the world to securely analyze the data acquired for this study.

To illustrate, we will present a specific example that uses Kafka, Storm, Spark, Python, and GPU-enabled Hadoop nodes to let our researchers complete machine learning projects that link biomedical imaging data, genomics, electrocardiograms, and patient outcomes. Similar architectural approaches can be used within nearly any industry, and we will demonstrate how others can apply them to gain valuable insights into their own data. The topic will be co-presented by collaborators from Yale University and the National Center for Cardiovascular Disease, China.

Be sure to register for the DataWorks Summit to catch this presentation and many others!

To learn more about predictive analytics and Big Data solutions for healthcare, visit

