April 06, 2017

Applied Healthcare Informatics: A Healthcare Data Ecosystem Constructed on HDP and Utilizing HDF

This is a guest blog post by Charles Boicey, Chief Innovation Officer at Clearsense. Clearsense was born out of a passion for helping healthcare organizations realize the promise of their data and its ability to help them make better, faster clinical decisions to meet the challenges of value-based care, drive research, improve patient care, and ultimately help reduce preventable deaths. Charles is, at heart, a clinician technologist. With 25 years of combined nursing and healthcare informatics experience, he began his career as a trauma critical care nurse at Los Angeles County + USC Medical Center, a county hospital in Los Angeles. Since then, Charles has introduced several technology innovations to healthcare. He is passionate about providing clinicians with data-driven solutions that serve as cognitive accelerators in the clinical decision-making process.

I am now starting my eighth year of applying “Big Data” technologies to solve some of healthcare’s biggest challenges. Five of those years were in academic medical center environments at the University of California, Irvine and Stony Brook Medicine. For the past two years, I’ve been in the commercial sector at Clearsense. Hortonworks technologies and team members have been partners on my Big Data journey.

I thought I would take this opportunity to review our experiences at Clearsense over the past two years and share our discoveries with you. Starting with Hortonworks DataFlow (HDF), we now have a capable pipeline to ingest and route data of several different types from several different sources. Our data architecture is multi-tenant and requires a secure routing system capable of channeling both streaming and batched data from several tenants and routing each appropriately. HDF handles this challenge elegantly and efficiently, and we have complete end-to-end insight into the process.
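To make the multi-tenant routing idea concrete, here is a minimal sketch in plain Python. It is not HDF or NiFi code and not how Clearsense implements routing; the tenant names, the `tenant` and `mode` attributes, and the `quarantine` destination are all hypothetical, chosen only to illustrate routing records by their attributes.

```python
from collections import defaultdict

# Hypothetical tenant registry; the post does not name tenants or destinations.
KNOWN_TENANTS = {"hospital_a", "hospital_b"}
VALID_MODES = ("stream", "batch")


def route(record):
    """Return a destination key for one record, based only on its attributes.

    Records from unknown tenants, or with a missing or unknown mode, are
    sent to a quarantine destination so they never land in another
    tenant's data.
    """
    tenant = record.get("tenant")
    mode = record.get("mode")
    if tenant not in KNOWN_TENANTS or mode not in VALID_MODES:
        return "quarantine"
    return f"{tenant}/{mode}"


def route_all(records):
    """Group records by destination key."""
    destinations = defaultdict(list)
    for record in records:
        destinations[route(record)].append(record)
    return dict(destinations)


if __name__ == "__main__":
    sample = [
        {"tenant": "hospital_a", "mode": "stream", "payload": "vitals"},
        {"tenant": "hospital_b", "mode": "batch", "payload": "claims"},
        {"tenant": "unknown_org", "mode": "stream", "payload": "?"},
    ]
    for destination, grouped in route_all(sample).items():
        print(destination, len(grouped))
```

The point of the sketch is that the routing decision depends only on attributes carried with each record, so streaming and batched data from every tenant can pass through the same entry point and still be kept separate.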

HDF brings the data to Clearsense, but Hortonworks Data Platform (HDP) is the 100% open-source platform for storing and processing that data for whichever end-user application our clients need. HDP has allowed us to build out a non-proprietary, extensible landing zone for all types of data. It enables us to keep our promise to clients that we will ingest all of their data, in its entirety, regardless of type.

Our Clearsense HDP data lake then serves as the source that feeds multiple data ponds that serve the varied clinical, quality, operational, financial and research needs of a wide range of healthcare organizations. Advances in Apache data access projects such as Hive, Pig, and HBase allow our clients to access their data using familiar tools and their existing skill sets.

Another area of innovation that we’ve seen is the integration of Spark into the HDF ecosystem. This has eliminated the kludge that we used to face when stitching together Mahout over MongoDB or using Storm in ways it wasn’t intended to be used. In fact, we now have a tightly knit lambda architecture that uses HDF to route the streaming data through the Spark pipeline. With tools such as SparkR, we now produce offline models and drop them directly into Spark. For example, our models for the detection of patient condition changes and sepsis now run in a Spark environment.
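For readers unfamiliar with the term, a lambda architecture combines a batch layer that periodically recomputes views over historical data with a speed layer that handles recent events, and merges the two at query time. The sketch below shows that pattern in plain Python rather than Spark. The event shape (`key`, `ts`), the `cutoff` parameter, and all function and class names are assumptions made for illustration, not a description of the Clearsense pipeline.

```python
from collections import Counter


def build_batch_view(events, cutoff):
    """Batch layer: recompute per-key counts from all events up to `cutoff`."""
    return Counter(e["key"] for e in events if e["ts"] <= cutoff)


class SpeedLayer:
    """Speed layer: incrementally counts events newer than the batch cutoff."""

    def __init__(self, cutoff):
        self.cutoff = cutoff
        self.counts = Counter()

    def ingest(self, event):
        # Events at or before the cutoff are already covered by the batch view.
        if event["ts"] > self.cutoff:
            self.counts[event["key"]] += 1


def query(batch_view, speed_layer, key):
    """Serving layer: merge the batch view and the speed view for one key."""
    return batch_view[key] + speed_layer.counts[key]


if __name__ == "__main__":
    history = [
        {"key": "unit_1", "ts": 1},
        {"key": "unit_1", "ts": 2},
        {"key": "unit_2", "ts": 3},
    ]
    batch_view = build_batch_view(history, cutoff=2)

    speed = SpeedLayer(cutoff=2)
    for event in history + [{"key": "unit_1", "ts": 5}]:
        speed.ingest(event)

    print(query(batch_view, speed, "unit_1"))
```

Because the speed layer ignores anything at or before the cutoff, no event is counted twice when the two views are added together.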

Another advantage of open-source software is that we can make use of newer hardware architectures, such as DriveScale’s separation of compute from storage.

The introduction of robust security components in Apache Ranger and Apache Knox has given us the ability to strengthen security and maintain an auditable, traceable, HIPAA-compliant environment.

Hadoop in healthcare is here to stay, and every week we sense the momentum that the open-source community is sustaining. It has been fun seeing a community of two or three evangelists expand to several hundred technologists who have built healthcare production systems on HDP. It is a great community and I am happy to be part of it.

Please enjoy some of my previous Hortonworks guest blog posts that describe my work in applied healthcare informatics:


stellasweety says:

Nice blog about applied healthcare informatics.

robert downey says:

I have a basic knowledge of Healthcare Informatics. After reading your blog I have a clear view of Healthcare Informatics.

robertdowney says:

Very useful information for those of us who want to know the details about Healthcare Informatics. Thanks for sharing.
