According to a 2013 global data breach study by the Ponemon Institute, the average cost of data loss exceeds $5.4 million per breach, and the average cost of lost data in the United States approaches $200 per record. No industry is spared from this threat, and all of our data systems, including Hadoop, need to address this security concern. Protecting sensitive data in Hadoop is now imperative for both IT and the business.
Enterprises are adopting the Modern Data Architecture with Hadoop and Hortonworks Data Platform (HDP) to cost-effectively capture, store and process all data, structured and unstructured. And with the introduction of Apache Hadoop YARN, HDP can host different data applications and users with access to the same data simultaneously. This underscores the value of the joint HDP and DgSecure solution: comprehensive and coordinated security for enterprise Hadoop.
Understanding where the sensitive data is located is crucial to assessing and managing this risk. DgSecure for Hadoop scans your data in structured, semi-structured and unstructured formats, masks or encrypts the sensitive elements it finds, and provides a complete dashboard to track and report all sensitive data protections in your environment. Below are two examples of how customers are realizing the value Hadoop can bring while ensuring compliance and data protection.
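To make the discovery step concrete, here is a minimal sketch of pattern-based sensitive-data scanning over free text. The patterns and field names are hypothetical illustrations only; DgSecure's actual detection logic is proprietary and far more sophisticated than two regular expressions.

```python
import re

# Hypothetical patterns for illustration; real products detect many more
# element types (names, addresses, health record numbers, and so on).
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def discover(text):
    """Return (element_type, matched_value) pairs found in free text."""
    hits = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((name, match.group()))
    return hits

record = "Patient John Doe, SSN 123-45-6789, contact jdoe@example.com"
print(discover(record))
```

Once elements like these are located, each occurrence can be routed to a masking or encryption policy and surfaced on a tracking dashboard.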
Inaccuracies within the healthcare billing system are a major burden and cost. The American Medical Association notes that nearly 1 in 10 healthcare bills contains errors, and that $43 billion could have been saved since 2010 if commercial insurers had consistently paid claims accurately. One company helping to improve this statistic is an innovative healthcare analytics organization that combines clinical expertise and analytical technology inside Hadoop to identify and reclaim excessive and inaccurate healthcare charges. They use Dataguise to discover specific sensitive data elements (in flight and at rest) and to mask and encrypt them. These include Protected Health Information (PHI) such as names, health records, addresses and billing amounts. The ability to discover and protect specific sensitive elements in Hadoop is one of the unique differentiators of the Dataguise solution. Consistent and flexible masking preserves coding accuracy, enabling consistent data bindings between diagnoses and procedure costs. The organization then leverages the Hortonworks Data Platform and a leading analytics solution to validate billing consistency and report on billing inaccuracies, with a 99% success rate. The flexible and intelligent masking and encryption provided by Dataguise allows the organization to achieve compliance with both federal (HIPAA) and 38 state privacy laws, as well as identify new revenue streams by sharing de-identified medical records with clients, partners and government health agencies in a secure, private, HIPAA-compliant format.
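The "consistent data bindings" idea can be sketched with deterministic masking: the same input value always maps to the same token, so joins between, say, a diagnosis table and a billing table still line up after de-identification. This is only an illustrative HMAC-based tokenization scheme under an assumed secret key; it is not Dataguise's actual masking algorithm.

```python
import hashlib
import hmac

# Hypothetical key for illustration; a real deployment would manage
# keys securely and rotate them.
SECRET_KEY = b"rotate-me-in-production"

def mask(value: str) -> str:
    """Map a sensitive value to a stable pseudonym. The same input
    always yields the same token, so record linkage survives masking."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return "TKN-" + digest.hexdigest()[:12]

# The same patient name masks identically in the diagnosis table and
# the billing table, preserving the binding between the two records.
assert mask("John Doe") == mask("John Doe")
assert mask("John Doe") != mask("Jane Doe")
```

Because masking is keyed rather than a plain hash, an attacker who knows the scheme still cannot precompute tokens for guessed names without the key.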
A global smartphone manufacturer leverages the power of Hadoop to capture and aggregate phone log data (product, usage and user configuration information). This data is then de-identified by Dataguise DgSecure via the Flume agent: Dataguise encrypts and masks specific sensitive data elements within the larger volume of data, leaving the key product and usage data open for business users' analytical and reporting needs. This ensures compliance with U.S. and European privacy directives and results in a highly scalable, high-performance, on-demand (and secure) analytics platform that product teams can use to continuously improve products, add new features and enhance the overall user experience.
As we see more and more companies turning to Hadoop, we also see security considerations playing a bigger role. The recently announced Apache Argus incubator project provides central administration and coordinated enforcement of enterprise security policy for a Hadoop cluster. Dataguise plans to work with the Apache Argus community to provide an integrated approach for authorized decryption, wherein Dataguise decryptions can be authorized and controlled centrally from the Argus authorization framework, allowing clients to achieve maximum value and security within their Hadoop deployments.
Visit the Dataguise tutorial to try it in the Hortonworks Sandbox.