Connected Data Platforms for Healthcare

Arizona State University uses HDP® to Unlock Insights

Arizona State University (ASU) is the largest university in the U.S. and was named by U.S. News & World Report as the “Most Innovative School in America” in 2016. ASU’s Complex Adaptive Systems Initiative (or CASI) built a genomic data lake with petabytes of genetic data on hundreds of individuals, powering research on how each individual variant in the genome can influence the expression of a cancer gene, thereby unlocking insight into potentially life-saving treatments.

Saving lives while delivering more efficient care

Difficult challenges and choices face today’s healthcare industry. Researchers, clinicians and administrators have to make important decisions—often without sufficient data. Hortonworks offers open source Connected Data Platforms (powered by Apache™ Hadoop® and Apache NiFi) to make healthcare data available and actionable. Researchers explore the genetic architecture of cancer cells. Nurses and physicians monitor intensive care patients. Administrators submit reimbursement claims before patients leave the hospital. Hortonworks is transforming healthcare.

Access genomic data for new cancer treatments

If we read that a given drug is “40% effective in treating cancer,” another interpretation could be that the drug is 100% effective for patients with a certain genetic profile. However, genomic data is Big Data. A single human genome includes approximately 20,000 genes. Stored in traditional data platforms, this is the equivalent of several hundred gigabytes. Combining each genome with one million variable DNA locations produces the equivalent of about 20 billion rows of data per person.
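
The 20 billion figure is consistent with one row for every pairing of a gene with a variable DNA location. The short sketch below works through that arithmetic. The pairing model, the function names, and the 1,000-person cohort are assumptions made for illustration, not a description of any actual storage schema.

```python
# Back-of-the-envelope sizing, assuming one row per (gene, variable location) pair.
GENES_PER_GENOME = 20_000          # approximate gene count quoted in the text
VARIABLE_LOCATIONS = 1_000_000     # variable DNA locations quoted in the text


def rows_per_person(genes: int = GENES_PER_GENOME,
                    locations: int = VARIABLE_LOCATIONS) -> int:
    """Rows needed to pair every gene with every variable location."""
    return genes * locations


def rows_for_cohort(people: int) -> int:
    """Rows needed for a cohort of the given size (hypothetical helper)."""
    return people * rows_per_person()


if __name__ == "__main__":
    print(f"{rows_per_person():,}")        # 20,000,000,000
    print(f"{rows_for_cohort(1_000):,}")   # 20,000,000,000,000
```

Under this assumption, row counts grow linearly with cohort size, so even a modest study reaches trillions of rows.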

Researchers at major universities and teaching hospitals are tackling that challenge with Hortonworks Data Platform as the cost-effective, reliable platform for storing genomic data and combining that with other data on demographics, trial outcomes, and real-time patient responses. They are adopting Hortonworks DataFlow to stream that data into HDP for real-time decisions and long-term cohort analyses. Connected Data Platforms help those doctors learn which drugs and treatments work best for groups of patients across the genetic spectrum.

Monitor patient vitals in real time

In a typical hospital setting, nurses do rounds and manually monitor patient vital signs. They may visit each bed every few hours to measure and record vital signs, but the patient’s condition may decline between scheduled visits. This means that caregivers often respond to problems reactively, in situations where arriving earlier might have made a huge difference in the patient’s wellbeing.

New wireless sensors can capture and transmit patient vitals far more frequently than human beings can visit the bedside, and these measurements can stream into a Hadoop cluster. Caregivers can use these signals for real-time alerts to respond more promptly to unexpected changes. Over time, this data accumulates in HDP, feeding algorithms that proactively help predict the likelihood of an emergency even before it could be detected with a bedside visit.
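
A real-time alert of this kind can be as simple as checking each incoming reading against configured limits. The sketch below shows that idea only. The `VitalReading` fields, the function name, and the numeric limits are placeholders invented for the example; they are not clinical guidance and not part of any Hortonworks API.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class VitalReading:
    """One sensor measurement (hypothetical record layout)."""
    patient_id: str
    heart_rate_bpm: float
    spo2_percent: float


# Placeholder limits for illustration only; real limits are set by clinicians.
HEART_RATE_RANGE: Tuple[float, float] = (40.0, 130.0)
MIN_SPO2_PERCENT: float = 90.0


def alerts(readings: Iterable[VitalReading]) -> Iterator[str]:
    """Yield one message for each reading that falls outside the limits."""
    low, high = HEART_RATE_RANGE
    for r in readings:
        if not (low <= r.heart_rate_bpm <= high):
            yield f"{r.patient_id}: heart rate {r.heart_rate_bpm:.0f} bpm out of range"
        if r.spo2_percent < MIN_SPO2_PERCENT:
            yield f"{r.patient_id}: SpO2 {r.spo2_percent:.0f}% below minimum"


if __name__ == "__main__":
    stream = [
        VitalReading("bed-1", 72, 98),
        VitalReading("bed-2", 150, 88),
    ]
    for message in alerts(stream):
        print(message)
```

Because `alerts` consumes any iterable, the same check could run over a live stream or over historical readings already stored in the cluster.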

Reduce cardiac re-admittance rates

Patients with heart disease can be closely monitored while they are in a hospital, but once those patients go home, they may skip their medications or ignore the dietary and self-care instructions their doctor gave them at discharge.

Congestive heart failure causes fluid retention, which leads to weight gain. In one innovative program at UC Irvine Health, patients could return home with a wireless scale and weigh themselves at regular intervals. Algorithms running on the Hortonworks platform determined unsafe weight gain thresholds and alerted a physician to see the patient proactively, before an emergency re-admittance was necessary.
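
The source does not describe how the UC Irvine Health algorithms worked. As a minimal sketch of the general idea, the example below flags a patient when the latest weight exceeds the lowest weight in a recent window by more than a set amount. The 7-day window and 2 kg limit are arbitrary placeholders, not the program's actual thresholds and not clinical guidance.

```python
from datetime import date, timedelta
from typing import Mapping


def unsafe_gain(weights_kg: Mapping[date, float],
                window_days: int = 7,       # placeholder window
                max_gain_kg: float = 2.0    # placeholder limit
                ) -> bool:
    """Return True if the latest weight exceeds the lowest weight recorded
    in the trailing window by more than max_gain_kg."""
    if not weights_kg:
        return False
    latest_day = max(weights_kg)
    window_start = latest_day - timedelta(days=window_days)
    in_window = [w for d, w in weights_kg.items() if d >= window_start]
    return weights_kg[latest_day] - min(in_window) > max_gain_kg


if __name__ == "__main__":
    readings = {
        date(2024, 1, 1): 80.0,
        date(2024, 1, 4): 81.0,
        date(2024, 1, 7): 82.5,
    }
    if unsafe_gain(readings):
        print("Alert physician: weight gain above threshold")
```

Comparing against the lowest weight in the window, rather than only the previous reading, catches a gradual gain that never jumps sharply between two consecutive weigh-ins.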

Machine learning to screen for autism with in-home testing

Autism spectrum disorders affect 1 in 100 children at an annual cost estimated at more than $100 billion. The condition can be detected through behavior at eighteen months, but more than 1 in 4 cases are still undiagnosed at 8 years of age. A small number of clinical testing facilities are oversubscribed, with long wait lists. The most common diagnostic test typically takes 2.5 hours to administer and score.

Dr. Dennis Wall is Director of the Computational Biology Initiative at Harvard Medical School. In a presentation, he describes a process his team developed for low-cost, mobile screening for autism. It takes less than five minutes and relies on the ability to store large volumes of semi-structured data from brief in-home tests administered and submitted by parents. Wall’s lab also used Facebook to capture user-reported information on autism.

Artificial intelligence running on those huge data sets helps maximize efficiency of diagnosis without loss of accuracy. This approach, in combination with data storage on a Hadoop cluster, can be used for other innovative machine learning diagnostic processes.

Store medical research data forever

Medical and scientific researchers at universities live by the “publish or perish” code. Data supporting a given paper used to be appended in an Excel spreadsheet, but many of today’s data sets are just too large. Nevertheless, supporting data sets must remain perpetually available in association with the paper they support. If the data disappears, the paper becomes unsubstantiated.

Universities can use a cluster running Hortonworks Data Platform as a cost-effective, perpetual storage platform for their scientists’ data. Easy and open querying capabilities allow scientific colleagues to share data, validate it and reuse it for more downstream research.

Track equipment, medicines and caregivers with RFID data

Hospitals have begun to use radio-frequency identification (RFID) to track equipment and medicines that move throughout their facilities. An RFID scan of an item or device can capture its contents, location, manufacture date, order numbers, and shipping data. One innovative hospital group was even able to determine how long its doctors stood in front of sinks to wash their hands, a practice that reduces the likelihood of disease transmission.

In the short run, this data can help staff use medicines before they expire or quickly locate an important piece of equipment. Over time, historical data on how medicines, equipment and doctors interact provides valuable information for planning purchases, training staff and improving operational efficiency.
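
The short-run use case amounts to filtering scan records by expiration date. The sketch below is one way to express that. The `TagScan` record, its fields, and the 30-day default are assumptions made for the example, not an actual RFID data format.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class TagScan:
    """Latest scan of a tagged medicine (hypothetical record layout)."""
    item: str
    location: str
    expires_on: date


def expiring_soon(scans: Iterable[TagScan], today: date,
                  within_days: int = 30) -> List[TagScan]:
    """Return unexpired items that expire within the window, soonest first."""
    cutoff = today + timedelta(days=within_days)
    due = [s for s in scans if today <= s.expires_on <= cutoff]
    return sorted(due, key=lambda s: s.expires_on)


if __name__ == "__main__":
    scans = [
        TagScan("saline-500ml", "ward-3", date(2024, 6, 20)),
        TagScan("heparin-vial", "pharmacy", date(2024, 6, 5)),
        TagScan("gauze-pack", "ward-1", date(2025, 1, 1)),
    ]
    for scan in expiring_soon(scans, today=date(2024, 6, 1)):
        print(scan.item, scan.location, scan.expires_on)
```

Sorting by expiration date puts the most urgent items first, and the `location` field tells staff where to find them.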


Mercy has partnered with Hortonworks to create the Mercy Data Library, a Hadoop-based data lake running on Hortonworks Data Platform (HDP). The Data Library will contain volumes of batch data extracts from relational systems like Clarity as well as real-time data sources including Epic access logs. They plan to ingest other data sources, including social…

Cardinal Health

Fuse by Cardinal Health is an innovation lab focused on improving the future of health and wellness by making healthcare safer and more cost effective. The Fuse team focuses on connected care, building a smarter supply chain, and discovering new insights through analytics. Fuse chose Hortonworks Data Platform to optimize its data architecture and enrich…


ZirMed, a leading provider of healthcare information management solutions, built a Hadoop cluster running HDP for Windows 2.0. The results were five times the amount of usable storage and greater processing power, all for 30% of the cost of traditional enterprise technologies. Louisville, KY-based ZirMed was founded in 1999 and is a leading provider…