Modern Healthcare Architectures Built with Hadoop

We have heard plenty in the news lately about healthcare challenges and the difficult choices faced by hospital administrators, technology and pharmaceutical providers, researchers, and clinicians. At the same time, consumers are experiencing increased costs without a corresponding increase in health security or in the reliability of clinical outcomes.

One key obstacle in the healthcare market is poor data liquidity (for patients, practitioners, and payers), and some organizations are using Apache Hadoop to overcome this challenge as part of a modern data architecture. This post describes some healthcare use cases, a healthcare reference architecture, and how Hadoop can ease the pain caused by poor data liquidity.

New Value Pathways for Healthcare

In January 2013, McKinsey & Company published a report named “The ‘Big Data’ Revolution in Healthcare”. The report points out how big data is creating value in five “new value pathways”, allowing data to flow more freely. Below we present a summary of these five new value pathways and an example of how Hadoop can be used to address each. Thanks to the Clinical Informatics Group at UC Irvine Health for many of the use cases, described in their UCIH case study.

Pathway: Right Living
  • Benefit: Patients can build value by taking an active role in their own treatment, including disease prevention.
  • Hadoop use case (Predictive Analytics): Heart patients weigh themselves at home with scales that transmit data wirelessly to their health center. Algorithms analyze the data and flag patterns that indicate a high risk of readmission, alerting a physician.

Pathway: Right Care
  • Benefit: Patients get the most timely, appropriate treatment available.
  • Hadoop use case (Real-time Monitoring): Patient vital statistics are transmitted from wireless sensors every minute. If vital signs cross certain risk thresholds, staff can attend to the patient immediately.

Pathway: Right Provider
  • Benefit: Provider skill sets are matched to the complexity of the assignment, for instance nurses or physicians’ assistants performing tasks that do not require a doctor. This also covers selecting the specific provider with the best outcomes.
  • Hadoop use case (Historical EMR Analysis): Hadoop reduces the cost to store data on clinical operations, allowing longer retention of data on staffing decisions and clinical outcomes. Analysis of this data allows administrators to promote individuals and practices that achieve the best results.

Pathway: Right Value
  • Benefit: Ensure cost-effectiveness of care, such as tying provider reimbursement to patient outcomes, or eliminating fraud, waste, or abuse in the system.
  • Hadoop use case (Medical Device Management): For biomedical device maintenance, hospitals use geolocation and sensor data to manage their medical equipment. The biomedical team knows where all the equipment is, so they don’t waste time searching for an item. Over time, they can determine the usage of different devices and use this information to make rational decisions about when to repair or replace equipment.

Pathway: Right Innovation
  • Benefit: The identification of new therapies and approaches to delivering care, across all aspects of the system, as well as improving the innovation engines themselves.
  • Hadoop use case (Research Cohort Selection): Researchers at teaching hospitals can access patient data in Hadoop for cohort discovery, then present the anonymous sample cohort to their Institutional Review Board for approval, without ever having seen uniquely identifiable information.

Source: The ‘Big Data’ Revolution in Healthcare. McKinsey & Company, January 2013.
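
To make the Right Living and Right Care use cases more concrete, here is a minimal Python sketch of the kind of rule that could flag a heart patient's home weight readings or a vital sign crossing a risk threshold. Everything in it is an assumption for illustration: the names WeightReading, flag_weight_gain, and heart_rate_alert are hypothetical, and the threshold values are placeholders, not clinical guidance or anything taken from the UCIH case study.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical thresholds, chosen only for illustration (not clinical guidance).
WEIGHT_GAIN_KG = 2.0   # gain within the window that triggers a flag
WINDOW_DAYS = 3        # how many days apart two readings may be
HEART_RATE_MAX = 120   # beats per minute


@dataclass
class WeightReading:
    patient_id: str
    day: int           # day index since enrollment
    weight_kg: float


def flag_weight_gain(readings: List[WeightReading]) -> bool:
    """Return True if weight rose by at least WEIGHT_GAIN_KG within WINDOW_DAYS."""
    ordered = sorted(readings, key=lambda r: r.day)
    for i, later in enumerate(ordered):
        for earlier in ordered[:i]:
            within_window = later.day - earlier.day <= WINDOW_DAYS
            gained = later.weight_kg - earlier.weight_kg >= WEIGHT_GAIN_KG
            if within_window and gained:
                return True
    return False


def heart_rate_alert(bpm: int) -> bool:
    """Return True if a single heart-rate reading exceeds the threshold."""
    return bpm > HEART_RATE_MAX
```

In a real deployment the rule would more likely be a trained model than a fixed threshold, and it would run over data already landed in the cluster rather than over an in-memory list.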

At Hortonworks, we see our healthcare customers ingest and analyze data from many sources. The following reference architecture is an amalgam of Hadoop data patterns that we’ve seen with our customers’ use of Hortonworks Data Platform (HDP). Components shaded green are part of HDP.

[Figure: healthcare reference architecture (healthcare-mda)]

Sources of Healthcare Data

Source data comes from:

  • Legacy Electronic Medical Records (EMRs)
  • Transcriptions
  • PACS
  • Medication Administration
  • Financial
  • Laboratory (e.g. SunQuest, Cerner)
  • RTLS (for locating medical equipment & patient throughput)
  • Bio Repository
  • Device Integration (e.g. iSirona)
  • Home Devices (e.g. scales and heart monitors)
  • Clinical Trials
  • Genomics (e.g. 23andMe, Cancer Genomics Hub)
  • Radiology (e.g. RadNet)
  • Quantified Self Sensors (e.g. Fitbit, SmartSleep)
  • Social Media Streams (e.g. FourSquare, Twitter)

Loading Healthcare Data

Apache Sqoop is included in Hortonworks Data Platform as a tool to transfer data between external structured data stores (such as Teradata, Netezza, MySQL, or Oracle) and HDFS or related systems like Hive and HBase. We also see our customers using other tools or standards for loading healthcare data into Hadoop.

Processing Healthcare Data

Depending on the use case, healthcare organizations process data in batch (using Apache Hadoop MapReduce and Apache Pig), interactively (with Apache Hive), online (with Apache HBase), or as streams (with Apache Storm).
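
As a rough illustration of the batch path, the sketch below shows a MapReduce-style job in plain Python that averages heart-rate readings per patient. It follows the shape of a Hadoop Streaming job (a mapper that emits key/value pairs and a reducer that consumes them sorted by key), but it runs locally so the logic can be checked without a cluster. The input layout (patient_id,timestamp,heart_rate) and the function names mapper, reducer, and run_local are assumptions made for this example, not a format used by any particular customer.

```python
from itertools import groupby
from typing import Dict, Iterable, Iterator, Tuple


def mapper(lines: Iterable[str]) -> Iterator[Tuple[str, float]]:
    """Emit (patient_id, heart_rate) for each well-formed CSV line."""
    for line in lines:
        parts = line.strip().split(",")
        if len(parts) != 3:
            continue  # skip malformed records
        patient_id, _timestamp, heart_rate = parts
        try:
            yield patient_id, float(heart_rate)
        except ValueError:
            continue  # skip non-numeric readings


def reducer(pairs: Iterable[Tuple[str, float]]) -> Iterator[Tuple[str, float]]:
    """Average readings per patient; expects pairs sorted by key, as after a shuffle."""
    for patient_id, group in groupby(pairs, key=lambda kv: kv[0]):
        values = [value for _, value in group]
        yield patient_id, sum(values) / len(values)


def run_local(lines: Iterable[str]) -> Dict[str, float]:
    """Simulate the map, shuffle (sort by key), and reduce phases in memory."""
    shuffled = sorted(mapper(lines), key=lambda kv: kv[0])
    return dict(reducer(shuffled))
```

On a cluster, the same mapper and reducer logic would read from standard input and write tab-separated pairs to standard output, with Hadoop performing the sort between the two phases.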

Analyzing Healthcare Data

Once data is stored and processed in Hadoop, it can either be analyzed in the cluster or exported to relational data stores for analysis there. These data stores might include:

  • Enterprise data warehouse
  • Quality data mart
  • Surgical data mart
  • Clinical info data mart
  • Diagnosis data mart
  • Neo4j graph database

Many data analysis and visualization applications can also work with the data directly in Hadoop. Hortonworks healthcare customers typically use the following business intelligence and visualization tools to inform their decisions:

  • Microsoft Excel
  • Tableau
  • RESTful Web Services
  • EMR Real-time analytics
  • Metric Insights
  • Patient Scorecards
  • Research Portals
  • Operational Dashboard
  • Quality Dashboards

The following diagram shows how healthcare organizations can integrate Hadoop into their existing data architecture to create a modern data architecture that is interoperable and familiar, so that the same team of analysts and practitioners can use their existing skills in new ways:

[Figure: Healthcare Ecosystem]

As more and more healthcare organizations adopt Hadoop to disseminate data to their teams and partners, they empower caregivers to combine their training, intuition, and professional experience with big data to make data-driven decisions that cure patients and reduce costs.

Watch our blog in the coming weeks as we share reference architectures for other industry verticals.
