Hortonworks Customer
UC Irvine Health

UC Irvine Health turned to Hadoop and Hortonworks Data Platform to improve clinical operations in the hospital and its scientific research at the medical school. Their team is building a quantified medical practice that reduces readmissions, speeds new research projects, and tracks patient vital stats on a minute-by-minute basis.

One Hadoop Platform for Two Different Missions

The Clinical Informatics Group (CIG) at UC Irvine Health (UCIH) was founded in 2009 to provide high quality information in support of the pioneering work done by researchers and clinicians at UC Irvine. The CIG began its journey towards Hadoop with an assessment of their data storage.

Some data was scattered across multiple Excel spreadsheets. UCIH also had 9 million semi-structured records for 1.2 million patients over 22 years, none of which were searchable or retrievable. These semi-structured records included dictated radiology reports, pathology reports, and rounding notes. They were very valuable in aggregate, but they were not accessible in aggregate.

The CIG first migrated data from that “low tech” platform to an enterprise data warehouse with integrated clinical business intelligence tools. Then they migrated again to their current modern data architecture, built on Hadoop with Hortonworks Data Platform (HDP).

The single Hadoop data lake at UC Irvine Health allows the CIG to make good on their “no data left behind” doctrine and serves two different constituencies: the UC Irvine School of Medicine for medical research and the UC Irvine Medical Center for the quality of its clinical practice. The medical school and the hospital have distinct big data use cases, but both are served by a unified data platform with HDP at its core.

Charles Boicey describes the efficiency of serving different stakeholders with the same comprehensive data platform:

"Hadoop is the only technology that allows healthcare to store data in its native form. If Hadoop didn’t exist we would still have to make decisions about what can come into our data warehouse or the electronic medical record (and what cannot). Now we can bring everything into Hadoop, regardless of data format or speed of ingest. If I find a new data source, I can start storing it the day that I learn about it. We leave no data behind."

Now back to those 9 million semi-structured legacy records. They are now searchable and retrievable in the Hadoop Distributed File System (HDFS). This allowed the UCIH team to turn off their legacy system that was used for view only, saving them more than $500,000.

Solutions for Clinicians

The CIG has already launched two new data-driven programs, one to reduce patient re-admittance and another to monitor patient vitals in real time.

Predictive analytics to reduce re-admittance

One of UCIH's goals is to predict the likelihood of hospital re-admittance within 30 days after discharge. Patients with congestive heart failure have a tendency to build up fluid, which causes them to gain weight. Rapid weight gain over a 1-2 day period is a sign that something is wrong and that the patient should see a doctor.

UCIH collaborated with medical device integration partner, iSirona, to develop a program that sends those heart patients home with a scale and instructions to weigh themselves once daily. The weight data is wirelessly transmitted to Hadoop where an algorithm determines which weight changes indicate risk of re-admittance. The system notifies clinicians about only those cases. All home monitoring data will be viewable in the EMR via an API to Hadoop.
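The article does not describe the algorithm itself, so the following is only a minimal sketch of the kind of rule the text implies: flag a patient when weight rises sharply between readings taken one or two days apart. The `WeightReading` type, the `flag_rapid_gain` function, and the 1.5 kg cutoff are all hypothetical names and values chosen for illustration, not details from UCIH or iSirona.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class WeightReading:
    patient_id: str
    day: date
    weight_kg: float


# Hypothetical cutoff for illustration only; not a clinical value from the article.
GAIN_THRESHOLD_KG = 1.5
# The article speaks of rapid gain "over a 1-2 day period".
WINDOW_DAYS = 2


def flag_rapid_gain(readings: list[WeightReading]) -> set[str]:
    """Return IDs of patients whose weight rose by more than the threshold
    between any two readings taken at most WINDOW_DAYS apart."""
    by_patient: dict[str, list[WeightReading]] = {}
    for r in readings:
        by_patient.setdefault(r.patient_id, []).append(r)

    flagged: set[str] = set()
    for patient_id, series in by_patient.items():
        series.sort(key=lambda r: r.day)
        for i, earlier in enumerate(series):
            for later in series[i + 1:]:
                if (later.day - earlier.day).days > WINDOW_DAYS:
                    break  # series is sorted, so remaining readings are further away
                if later.weight_kg - earlier.weight_kg > GAIN_THRESHOLD_KG:
                    flagged.add(patient_id)
    return flagged
```

Only the flagged IDs are returned, which mirrors the point in the text that clinicians are notified about the at-risk cases alone rather than every daily reading.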

UCIH chose Hortonworks over competitors like Cloudera and MapR because of its commitment to 100% open source Hadoop. This openness makes collaboration with systems partners like iSirona easier. The group also appreciated HDP’s unique Windows compatibility. “Healthcare is cost conscious so a proprietary Hadoop infrastructure with unclear future costs will not work for us. Hortonworks was healthcare friendly from the first phone call.”

Real time surveillance for rapid response

In a typical hospital, nurses manually measure patient vital signs every few hours. This means that a patient’s condition may change, undetected, in the hours between two vital sign measurements.

Even if you brought up Hadoop for the single purpose of eliminating view-only legacy systems, it would be well worth your while.

In January, the medical center will start piloting a new technology called SensiumVitals® to monitor and transmit patient vital signs every minute. Patients in the pilot will wear a SensiumVitals patch that will monitor and wirelessly transmit heart rate, respiratory rate, and temperature. Nurses will be alerted if any of a patient’s vital signs cross certain risk thresholds, so the staff can attend to the patient immediately.
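The alerting rule described above can be sketched as a simple range check on each one-minute sample. This is an illustrative sketch only: the `VitalsSample` type, the `out_of_range` function, and every limit in `LIMITS` are hypothetical, and none of them come from the article, from UCIH, or from the SensiumVitals product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VitalsSample:
    patient_id: str
    heart_rate_bpm: float
    respiratory_rate_bpm: float
    temperature_c: float


# Hypothetical (low, high) limits for illustration only; not clinical guidance
# and not taken from the article.
LIMITS = {
    "heart_rate_bpm": (50.0, 120.0),
    "respiratory_rate_bpm": (10.0, 24.0),
    "temperature_c": (35.5, 38.5),
}


def out_of_range(sample: VitalsSample) -> list[str]:
    """Return the names of vitals in this sample that fall outside their limits."""
    breaches = []
    for name, (low, high) in LIMITS.items():
        value = getattr(sample, name)
        if value < low or value > high:
            breaches.append(name)
    return breaches
```

An empty list means no alert is needed for that sample; a non-empty list names the vitals a nurse would be alerted about.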

But from a long-term perspective, this sensor data enables something much more profound: predictive analytics that can allow caregivers to respond before a patient’s vital signs ever cross a dangerous threshold.

Most of those minute-by-minute snapshots of vital signs will be unremarkable, but the data points they generate (4,320 per patient per day: three vital signs × 1,440 minutes) are the building blocks for algorithms that can predict near-term outcomes with an ever-increasing degree of certainty. As in the previous example with heart patients, this data will reduce average time to insight for important medical decisions the staff needs to make.

This is because an increased temperature, heart rate or respiratory rate in isolation from other data may not be cause for concern. But those same vitals, combined with all of the prior data on that patient, years of data on other patients with similar risk factors, and that patient’s medical history, physical characteristics, gender and age, will eventually paint a far more detailed picture, with more predictive power.

Again, Boicey:

“For healthcare, we have never had the ability to do this. We have always taken the approach that we think we know what data elements are important. Now with all the data, we let the data determine what is important for predictive analysis. Yogi Berra might have said it like this: we are now able to capture the data that we know that we need as well as the data that someday we will know that we needed.”

Solutions for Researchers

Researchers at the medical school will be using HDP for cohort discovery. For example, a biomedical researcher studying prostate cancer may want to identify males between the ages of 45 and 55 who:

  • had prostate cancer at a certain stage,

  • underwent a prostatectomy and

  • are taking a certain class of drugs.

Then researchers can easily present the anonymous sample cohort to their Institutional Review Board for approval, without ever having seen uniquely identifiable information. This speeds the process of preparing and approving a study, while assuring patient confidentiality.
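The cohort query in the example above amounts to a filter over patient records that returns only an aggregate. The sketch below assumes a hypothetical in-memory `PatientRecord` type and a `cohort_size` function; the field names and criterion labels are invented for illustration and do not reflect how UCIH stores or queries its data in HDP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientRecord:
    patient_id: str  # identifier; never returned by the query below
    sex: str
    age: int
    diagnoses: frozenset[str] = frozenset()
    procedures: frozenset[str] = frozenset()
    drug_classes: frozenset[str] = frozenset()


def cohort_size(records, *, diagnosis, procedure, drug_class,
                min_age=45, max_age=55):
    """Count male patients in the age range who match all three criteria.

    Only the count is returned, so no identifying field leaves the function.
    """
    return sum(
        1
        for r in records
        if r.sex == "M"
        and min_age <= r.age <= max_age
        and diagnosis in r.diagnoses
        and procedure in r.procedures
        and drug_class in r.drug_classes
    )
```

Returning a count rather than the matching records is the design choice that corresponds to the text: the researcher learns whether a viable cohort exists without seeing who is in it.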


Hadoop is the only technology that allows healthcare to store data in its native form...We leave no data behind.

Charles Boicey, Informatics Solution Architect

Future Plans

UC Irvine Health has already benefitted from integrating Hortonworks Data Platform into its modern data architecture. Now UCIH is ready to tackle additional use cases.

In the coming year, UCIH plans to extend its research capabilities with data mining and data exploration. Now that the group has all of the data in one data lake, it can find previously undiscovered factors that are indicative of a certain outcome.

They also plan to include genomic data in the future.

New research made possible by this combined data will reach other practitioners and policy-makers through the publications that come out of the medical school.

For its biomedical device maintenance, the team wants to use geolocation and sensor data to better manage its medical equipment. The biomedical team needs to know where all the equipment is, so they don’t waste time searching for an item.

Over time, they can determine the usage of different devices. For example, the biomedical engineers will know how often a heart monitor is being used. They can use this information to make rational decisions about when to repair or replace the monitor.

About UCIH

UC Irvine Health (UCIH) is a team of nationally regarded physicians and nurses, researchers and clinicians, educators and students united by a single calling—to improve the lives of the people in Orange County, California and beyond.

As the only university-based care provider in Orange County, UC Irvine Health’s multifaceted organization is dedicated to the discovery of new medical frontiers, to the teaching of future healers and to the delivery of the finest evidence-based care. The union of discovery, teaching and healing gives them the expertise to diagnose and treat exceedingly rare conditions and diseases.