Hortonworks Customer

UC Irvine Health

Healthcare is cost conscious so a proprietary Hadoop infrastructure with unclear future costs will not work for us. Hortonworks was healthcare friendly from the first phone call.

- Charles Boicey, Informatics Solution Architect

UC Irvine Health (UCIH) is a team of nationally regarded physicians and nurses, researchers and clinicians, educators and students united by a single calling—to improve the lives of the people in Orange County, California and beyond.

The Clinical Informatics Group (CIG) at UC Irvine Health (UCIH) was founded in 2009 to provide high quality information in support of the pioneering work done by researchers and clinicians at UC Irvine.


Some data was scattered across multiple Excel spreadsheets. UCIH also had 9 million semi-structured records for 1.2 million patients, spanning 22 years, none of which were searchable or retrievable. These semi-structured records included dictated radiology reports, pathology reports, and rounding notes. Taken together they were very valuable, but they could not be accessed in aggregate.


The CIG first migrated data from that “low tech” platform to an enterprise data warehouse with integrated clinical business intelligence tools. Then they migrated again to their current modern data architecture on Hadoop, on Hortonworks Data Platform.

UCIH chose Hortonworks over competitors because of its commitment to 100% open source Hadoop. This openness makes collaboration with systems partners easier.

The single Hadoop data lake at UC Irvine Health allows the CIG to make good on its “no data left behind” doctrine and serves two different constituencies: the UC Irvine School of Medicine, for medical research, and the UC Irvine Medical Center, for the quality of its clinical practice. The medical school and the hospital have distinct big data use cases, but both are served by a unified data platform with HDP at its core.


The 9 million semi-structured legacy records are now searchable and retrievable in the Hadoop Distributed File System (HDFS). This allowed the UCIH team to turn off a legacy system that had been kept running only for viewing those records, saving more than $500,000.

The CIG has already launched two new data-driven programs: one to reduce patient readmissions and another to monitor patient vitals in real time.

Researchers at the medical school will be using HDP for cohort discovery. For example, a biomedical researcher studying prostate cancer may want to identify males between the ages of 45 and 55 who:
• had prostate cancer at a certain stage,
• underwent a prostatectomy, and
• are taking a certain class of drugs.

Researchers can then present the anonymized sample cohort to their Institutional Review Board for approval without ever having seen uniquely identifiable information. This speeds up the process of preparing and approving a study while protecting patient confidentiality.
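As a rough illustration only, the prostate cancer cohort criteria described above can be thought of as a filter over patient records that returns aggregate results rather than identities. The sketch below is not UCIH's implementation: the `PatientRecord` fields, the `discover_cohort` function, and the sample values are all hypothetical, and a real query would run against data stored in Hadoop rather than an in-memory list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientRecord:
    # Hypothetical record layout, assumed for illustration only.
    patient_key: str          # de-identified key, never returned to the researcher
    sex: str                  # "M" or "F"
    age: int
    diagnoses: frozenset      # e.g. {("prostate_cancer", "II")}
    procedures: frozenset     # e.g. {"prostatectomy"}
    drug_classes: frozenset   # e.g. {"alpha_blocker"}


def in_cohort(record, stage, drug_class):
    """Apply the three criteria from the example, plus sex and age range."""
    return (
        record.sex == "M"
        and 45 <= record.age <= 55
        and ("prostate_cancer", stage) in record.diagnoses
        and "prostatectomy" in record.procedures
        and drug_class in record.drug_classes
    )


def discover_cohort(records, stage, drug_class):
    """Return only aggregate facts about the matching cohort, no identifiers."""
    ages = [r.age for r in records if in_cohort(r, stage, drug_class)]
    return {
        "count": len(ages),
        "age_range": (min(ages), max(ages)) if ages else None,
    }


# Made-up sample data to exercise the filter.
records = [
    PatientRecord("k1", "M", 50, frozenset({("prostate_cancer", "II")}),
                  frozenset({"prostatectomy"}), frozenset({"alpha_blocker"})),
    PatientRecord("k2", "M", 60, frozenset({("prostate_cancer", "II")}),
                  frozenset({"prostatectomy"}), frozenset({"alpha_blocker"})),
    PatientRecord("k3", "M", 47, frozenset({("prostate_cancer", "II")}),
                  frozenset(), frozenset({"alpha_blocker"})),
]

summary = discover_cohort(records, stage="II", drug_class="alpha_blocker")
print(summary)
```

Returning only a count and an age range mirrors the idea that a researcher can size a candidate cohort without seeing uniquely identifiable information.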