With the growing volumes of diverse data being stored in the Data Lake, any breach of this enterprise-wide data can be catastrophic: privacy violations, regulatory infractions, and lasting damage to corporate image and long-term shareholder value. Seshu Adunuthula, Head of Analytics Infrastructure at eBay and Track Chair for Governance and Security at Hadoop Summit San Jose, has put together 15 sessions focused on these challenges.
If you can't attend all 15, the committee recommends these three:
Speakers: Ancil McBarnett and Pardeep Kumar from Hortonworks
You've got your cluster installed and configured. You celebrate, until the party is ruined by your company's security officer stamping a big "Deny" on your Hadoop cluster. And oops! You cannot place any data onto the cluster until you can demonstrate it is secure. In this session you will learn the tips and tricks to fully secure your cluster for data at rest, data in motion, and all the apps including Spark. Your security officer can then join your Hadoop revelry (unless you don't authorize him to, with your newly acquired admin rights).
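To give a feel for what "data at rest" protection can look like in practice, here is a minimal sketch (not taken from the session itself) that authenticates to a Kerberized cluster and creates an HDFS encryption zone. The principal, keytab path, directory, and key name are hypothetical, and the key is assumed to already exist in the Hadoop KMS.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.client.HdfsAdmin;
import org.apache.hadoop.security.UserGroupInformation;

public class EncryptionZoneExample {
    public static void main(String[] args) throws Exception {
        // Assumes the cluster is already Kerberized
        // (hadoop.security.authentication=kerberos in core-site.xml).
        Configuration conf = new Configuration();
        UserGroupInformation.setConfiguration(conf);

        // Hypothetical principal and keytab path -- substitute your own.
        UserGroupInformation.loginUserFromKeytab(
                "hdfs-admin@EXAMPLE.COM", "/etc/security/keytabs/hdfs-admin.keytab");

        FileSystem fs = FileSystem.get(conf);
        Path zone = new Path("/secure/sensitive-data");
        fs.mkdirs(zone);

        // Mark the directory as an HDFS encryption zone so that files written
        // under it are transparently encrypted at rest. The key "sensitiveKey"
        // must already exist in the Hadoop KMS.
        HdfsAdmin admin = new HdfsAdmin(fs.getUri(), conf);
        admin.createEncryptionZone(zone, "sensitiveKey");
    }
}
```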
Speaker: Justin Leet from Hortonworks
Security in data, and particularly in big data, is increasingly a major concern for both companies and consumers. We want to find potential intrusions as soon as possible. At scale, this becomes a big data problem: enormous numbers of packets flow into a company's network every second, and applications generate copious amounts of logging. Apache Metron is an incubator project that offers advanced security analytics capability over big data. It captures and analyzes large-scale network and log data to support rules-based anomaly detection and analysis. Designed to run at scale, Metron aims to give organizations a comprehensive tool for understanding the flood of data they must monitor to maintain strong security. We'll dig into the details of what Metron currently offers, how it's been put together, and what it means for big data security.
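To make "rules-based anomaly detection" concrete, here is a toy illustration of the idea in plain Java. It is not Metron code and does not reflect how Metron is implemented; the record format and threshold are made up for this example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SimpleRuleDetector {
    // Hypothetical rule: flag any source IP with more than MAX_CONNECTIONS
    // connection events in the observed window of flow records.
    private static final int MAX_CONNECTIONS = 100;

    public static Map<String, Integer> flagNoisySources(List<String> sourceIps) {
        // Count connection events per source IP.
        Map<String, Integer> counts = new HashMap<>();
        for (String ip : sourceIps) {
            counts.merge(ip, 1, Integer::sum);
        }
        // Keep only the sources whose count violates the rule.
        counts.values().removeIf(count -> count <= MAX_CONNECTIONS);
        return counts;
    }
}
```

A real deployment would of course evaluate such rules continuously over streaming telemetry rather than over an in-memory list; the point here is only the shape of a rule: a condition over aggregated events that produces an alertable result.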
Speakers: Ray Harrison and Michael Fagan from Comcast
We will present the challenges and our approach to managing a large, multi-tenant, full-stack enterprise Hadoop Data Lake at Comcast supporting near-real-time, batch, and heavy-analytics use cases across a diverse range of data sets and business customers. We will cover our approach to governance, monitoring, architecture, and automation, as well as challenges such as high availability and multi-data-center network scenarios.
We hope to see you at these sessions, but remember that you need to register to attend Hadoop Summit San Jose.