Project looks to improve cancer treatment with big data

The American Society of Clinical Oncology (ASCO) announced a plan on March 27 to collect and analyze data from cancer patients around the country in order to improve care. Clinical trial data currently covers only a small share of adult patients, but new big data tools such as Hadoop are increasingly making it possible to push healthcare research beyond these contained study groups to larger databases such as the one ASCO is introducing.

Around 1.6 million Americans are diagnosed with cancer each year, yet medical researchers have access to only a small portion of patient data, the Wall Street Journal reported. Moreover, that data is mostly stored on unconnected servers or in paper files, making it difficult to derive insights from it or spot broader trends.

"There is a treasure trove of information inside those cases if we simply bring them together," ASCO CEO Allen Lichter told the Wall Street Journal.

Additionally, a database of actual patient data will contain more typical cases than clinical trials, which are often populated with a carefully defined group of subjects, the Wall Street Journal noted. A full inventory of patients can include those with conditions such as heart failure or diabetes, a population much more reflective of what an oncologist is likely to actually see, cancer doctor W. Charles Penley explained.

The initial ASCO initiative includes anonymous data from around 100,000 breast cancer patients. As the system grows, doctors will be able to tap into the database to help develop treatment recommendations. Drawing on open source software tools, the ASCO CancerLinQ system includes real-time data collection and clinical decision support plus data mining and visualization features.

Open source tools such as Hadoop and HBase let organizations build and query comparable database architectures quickly and at relatively low cost, distributing both storage and analysis across clusters of commodity hardware. As Hadoop and big data technology continue to spread, new advances in medicine and other sectors will increasingly be possible.
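The kind of analysis such a system enables can be illustrated with a minimal sketch of the MapReduce pattern Hadoop popularized: map each record to a key, then reduce by aggregating per key. The records and field names below are purely hypothetical, not drawn from CancerLinQ or any real schema.

```python
from collections import defaultdict

# Hypothetical, anonymized patient records; the fields are illustrative
# assumptions, not the actual CancerLinQ data model.
records = [
    {"diagnosis": "breast cancer", "comorbidity": "diabetes", "treatment": "tamoxifen"},
    {"diagnosis": "breast cancer", "comorbidity": None, "treatment": "tamoxifen"},
    {"diagnosis": "breast cancer", "comorbidity": "heart failure", "treatment": "anastrozole"},
]

# Map phase: emit a (key, 1) pair per record, as a Hadoop mapper would.
mapped = [(r["treatment"], 1) for r in records]

# Reduce phase: sum the counts for each key.
counts = defaultdict(int)
for treatment, n in mapped:
    counts[treatment] += n

print(dict(counts))  # e.g. how often each treatment appears in the pool
```

At the scale of 100,000 patients this runs on a single machine, but the same map-then-reduce structure is what lets Hadoop spread the computation across a cluster as a data pool grows to millions of records.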
