IBM InfoSphere Guardium Secures the Hortonworks Data Platform and Ecosystem
IBM InfoSphere Guardium is now certified on HDP 2.1. The Hortonworks Certified Technology Program simplifies big data planning by providing pre-built and validated integrations between leading enterprise technologies and HDP.
Kathryn Zeidenstein, InfoSphere Guardium Evangelist, is our guest blogger and describes security, Hadoop, and the Guardium solution.
Those of us in the data security and privacy space tend to worry a lot. With each new breaking story on the latest data breach, and with the subsequent fallout, people higher and higher up the food chain are also worrying a lot. Some of this worry may be translating into pushback at organizations that want to roll out Hadoop capabilities across the enterprise and open up its insights to more applications and users. Just because you use Hadoop doesn’t mean you get a pass from the requirements to protect private data and to provide proof of compliance with mandates such as HIPAA, SOX, and PCI.
I’m seeing more and more clients who absolutely must come up with a security and compliance strategy before they can move ahead with their implementations. They need confidence that they have a reasonable level of security and data privacy controls in place and they need a way to prove it. By combining native security and authentication controls that exist in Hortonworks today with an audit and compliance solution that scales across the enterprise, you have many of the tools you need to both secure data and prove compliance.
It’s been heartening to see more Apache projects and Hadoop vendors focusing on the areas of granular privileges, authentication, and encryption. And Hortonworks has been very upfront with their rollout of security features in their distribution, which is only goodness as far as I’m concerned.
InfoSphere Guardium Activity Monitoring for Hadoop provides the real-time monitoring, alerting, and reporting capabilities that are so critical to preventing or mitigating data breaches, as well as to meeting the reporting requirements of audit and compliance teams. So I’m very pleased that we’ve had the opportunity to work with the Hortonworks team to certify InfoSphere Guardium on HDP 2.1.
Let me briefly review the architecture and benefits. At a very high level, as shown below, InfoSphere Guardium fits into the Modern Data Architecture by providing real-time data protection capabilities across both traditional data sources and Hadoop environments. Because you are using exactly the same infrastructure across the enterprise, it is much easier to aggregate events across data systems and thus provide an enterprise-level view of events.
(Alerts and data activity can also be forwarded directly to a security information and event management (SIEM) system, such as IBM QRadar or HP ArcSight, and also to Splunk.)
InfoSphere Guardium supports a wide range of Hadoop, NoSQL, and relational systems with the exact same infrastructure.
Figure 1. InfoSphere Guardium in the Modern Data Architecture
Figure 2 shows the architecture in more detail. There are two main architectural components. The first is the software TAP (S-TAP), a lightweight kernel agent that resides on the appropriate nodes (HDFS name node, Hive server, Resource Manager, etc.), intercepts TCP traffic, and forwards it to the second component. The second is the Guardium collector appliance, which can be a hardware, software, or virtual appliance. This collector appliance is hardened to ensure that not even a privileged user can change audit data once it’s been collected.
The appliance executes appropriate actions based on the security policy that is in effect, such as issuing a real-time alert if the message traffic indicates access to a sensitive file directory, and logging the actual action in a secure and hardened repository for reporting, aggregating with other collected traffic and more.
Figure 2. InfoSphere Guardium architecture decreases overhead and enables segregation of duties
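To make the collector's policy step a little more concrete, here is a purely conceptual Python sketch of the flow described above: an intercepted access event arrives, it is always written to an audit log, and an alert is raised when the event touches a sensitive directory. This is not Guardium code or its policy syntax. The `AccessEvent` type, the `SENSITIVE_PREFIXES` paths, and the `apply_policy` function are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical record for one intercepted Hadoop request (illustrative only).
@dataclass
class AccessEvent:
    user: str
    command: str
    path: str

# Assumed example directories that a security policy might flag as sensitive.
SENSITIVE_PREFIXES = ("/data/pii/", "/data/finance/")

def is_sensitive(path: str) -> bool:
    """Return True when the path falls under a flagged directory."""
    return path.startswith(SENSITIVE_PREFIXES)

def apply_policy(event: AccessEvent,
                 audit_log: List[AccessEvent],
                 alerts: List[str]) -> None:
    """Log every event; additionally raise an alert for sensitive paths."""
    audit_log.append(event)  # every action is recorded for later reporting
    if is_sensitive(event.path):
        alerts.append(
            f"ALERT: {event.user} ran {event.command} on {event.path}"
        )

if __name__ == "__main__":
    log: List[AccessEvent] = []
    raised: List[str] = []
    apply_policy(AccessEvent("alice", "open", "/data/pii/customers.csv"), log, raised)
    apply_policy(AccessEvent("bob", "open", "/tmp/scratch.txt"), log, raised)
    print(len(log), "events logged;", len(raised), "alert(s) raised")
```

In this sketch the logging and alerting happen away from the monitored node, which mirrors the segregation of duties the architecture is meant to provide: the people who administer the Hadoop cluster are not the ones holding the audit trail.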
With InfoSphere Guardium, you don’t have to worry about post-processing audit data to produce reports or about performing complicated configuration. Out-of-the-box reports and policies, including those tailored for Hadoop, get you up and running quickly, and those reports and policies are easily customized to align with your audit requirements.
InfoSphere Guardium Activity Monitoring can be obtained as a separate product or as part of a more comprehensive solution for security and privacy, InfoSphere Data Privacy for Hadoop, which includes business glossary and data masking capabilities as well as the activity monitoring provided by Guardium.
To summarize, by augmenting a secure Hortonworks deployment with InfoSphere Guardium, you can:
- Address regulatory challenges
- Protect against data breaches
- Reduce the total cost of ownership compared with a roll-your-own solution
- Leverage the same solution and aggregated reporting across the entire data ecosystem. For a complete list of supported data sources, see the system requirements web site.
- Register for the Next InfoSphere Guardium Tech Talk on July 17th, 2014: What is this thing called Hadoop and how do I secure it?
- Planning a security and auditing deployment for Hadoop e-book
- Solution Brief: IBM InfoSphere Data Privacy for Hadoop