This is a guest blog post by Charles Boicey, Chief Innovation Officer at Clearsense. Clearsense was born out of a passion for helping healthcare organizations realize the promise of their data and its ability to help them make better, faster clinical decisions—to meet the challenges of value-based care, drive research, improve patient care, and ultimately help reduce preventable deaths. Charles is, at heart, a clinician technologist. With 25 years of combined nursing and healthcare informatics experience, he began his career as a trauma critical care nurse at Los Angeles County + USC Medical Center, a county hospital in Los Angeles. Since then, Charles has introduced several technology innovations to healthcare. He is passionate about providing clinicians with data-driven solutions that serve as cognitive accelerators in the clinical decision-making process.
I am now starting on my eighth year applying “Big Data” technologies to solve some of healthcare’s biggest challenges. Five of those years were in academic medical center environments at the University of California, Irvine and Stony Brook Medicine. For the past two years, I’ve been in the commercial sector at Clearsense. Hortonworks technologies and team members have been partners on my Big Data journey.
I thought I would take this opportunity to review our experiences at Clearsense over the past two years and share our discoveries with you. Starting first with Hortonworks DataFlow (HDF), we now have a pipeline capable of ingesting and routing data of many different types from many different sources. Our data architecture is multi-tenant and requires a secure routing system capable of channeling both streaming and batched data from several tenants and routing them appropriately. HDF handles this challenge elegantly and efficiently, and we have complete end-to-end insight into the process.
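The routing logic that HDF (Apache NiFi) implements with processors such as RouteOnAttribute can be sketched in plain Python. This is a minimal illustration of the pattern, not NiFi itself; the tenant names, destinations, and attribute keys below are hypothetical:

```python
# Hypothetical tenant-to-destination routing table. In an HDF/NiFi flow,
# this mapping would be expressed as processor routing rules rather than code.
ROUTES = {
    ("tenant_a", "streaming"): "kafka://tenant_a/vitals",
    ("tenant_a", "batch"):     "hdfs://landing/tenant_a/",
    ("tenant_b", "streaming"): "kafka://tenant_b/vitals",
    ("tenant_b", "batch"):     "hdfs://landing/tenant_b/",
}

def route(record: dict) -> str:
    """Pick a destination from a record's tenant and delivery mode.

    Records with an unknown tenant or mode go to a quarantine path for
    review, much like NiFi's 'unmatched' relationship.
    """
    key = (record.get("tenant"), record.get("mode"))
    return ROUTES.get(key, "hdfs://quarantine/")

# Example attributes as they might arrive on an inbound feed.
msg = {"tenant": "tenant_a", "mode": "streaming", "payload": "..."}
print(route(msg))  # kafka://tenant_a/vitals
```

Keeping the routing declarative like this is what gives us the end-to-end visibility mentioned above: every record's path is determined by inspectable attributes rather than ad hoc code.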
HDF brings the data to Clearsense, but Hortonworks Data Platform (HDP) is the 100% open-source platform for storing and processing that data for whichever end-user application our clients need. HDP has allowed us to build out a non-proprietary, extensible landing zone for all types of data. It enables us to keep our promise to clients that we will ingest all of their data, in its entirety, regardless of type.
Our Clearsense HDP data lake then serves as the source that feeds multiple data ponds that serve the varied clinical, quality, operational, financial and research needs of a wide range of healthcare organizations. Advances in Apache data access projects such as Hive, Pig, and HBase allow our clients to access their data using familiar tools and their existing skill sets.
Another area of innovation that we’ve seen is the integration of Spark into the HDF ecosystem. This has eliminated the kludges we once faced stitching together Mahout over MongoDB or using Storm in ways it wasn’t intended. In fact, we now have a tightly knit lambda architecture that utilizes HDF to route the streaming data through the Spark pipeline. With tools such as SparkR, we now produce offline models and drop them directly into Spark. For example, our models for the detection of patient condition changes and sepsis now run in a Spark environment.
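The "train offline, score online" pattern above can be sketched in plain Python. This is a stand-in for the Spark streaming stage, not our production model: the coefficients, vital-sign features, and alert threshold below are all hypothetical, playing the role of a model trained offline (e.g., in SparkR) and exported for online scoring:

```python
import math

# Hypothetical logistic-regression coefficients, standing in for a model
# trained offline and dropped into the streaming pipeline.
WEIGHTS = {"heart_rate": 0.03, "resp_rate": 0.10, "temp_c": 0.5}
BIAS = -22.0
ALERT_THRESHOLD = 0.8

def sepsis_risk(vitals: dict) -> float:
    """Score one observation with the logistic model: sigmoid(w.x + b)."""
    z = BIAS + sum(WEIGHTS[k] * vitals[k] for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

def score_batch(batch: list) -> list:
    """Emit (patient_id, risk) alerts for observations over the threshold,
    as a Spark micro-batch stage would for each incoming window of data."""
    return [(r["patient_id"], round(sepsis_risk(r), 3))
            for r in batch if sepsis_risk(r) >= ALERT_THRESHOLD]

batch = [
    {"patient_id": "p1", "heart_rate": 128, "resp_rate": 30, "temp_c": 39.4},
    {"patient_id": "p2", "heart_rate": 72, "resp_rate": 14, "temp_c": 36.8},
]
print(score_batch(batch))  # only p1 crosses the alert threshold
```

The design point is the separation of concerns: the model is a pure function of one observation, so the same artifact can be validated offline against historical data and then applied unchanged to the live stream.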
Another advantage of open-source software is that we can make use of newer hardware architectures, such as DriveScale’s separation of compute from storage.
Hadoop in healthcare is here to stay, and every week we sense the momentum that the open-source community is sustaining. It has been fun watching a community of two or three evangelists grow to several hundred technologists who have built healthcare production systems on HDP. It is a great community, and I am happy to be part of it.
Please enjoy some of my previous Hortonworks guest blog posts that describe my work in applied healthcare informatics: