Securing Hadoop with Knox Gateway


Back in the day, all you needed to secure a Hadoop cluster was a firewall that restricted network access to authorized users. This eventually evolved into a more robust security layer in Hadoop… a layer that could augment firewall access with strong authentication. Enter Kerberos. Around 2008, Owen O’Malley and a team of committers led this first foray into security, and today Kerberos is still the primary way to secure a Hadoop cluster.

Fast-forward to today: widespread adoption of Hadoop is upon us. The enterprise now requires the platform not only to provide perimeter security, but also to integrate with all types of authentication mechanisms. Oh yeah, and all the while, to be easy to manage and to integrate with the rest of the secured corporate infrastructure. Kerberos can still provide the core security technology, but with all the touch-points a user has with Hadoop, something more is needed.

The time has come for Knox.

The only path to security in Hadoop is the community


The Knox Gateway aims to provide perimeter security that integrates easily into existing security infrastructure. Delivering this key component of the Apache Hadoop ecosystem is a critical community effort. Security is not an afterthought: it needs to be woven into the very fabric of Hadoop in order to be effective, and being part of the community will allow Knox to accomplish just that.

The community has already rallied around the project, and the vote has been positive thus far. Tomorrow we should see the Apache Software Foundation approve Knox for incubation as a new project: a security layer for the Hadoop ecosystem. The initial mentor list includes people from Hortonworks, Microsoft, and NASA, among others.

What comprises the Knox Gateway?

The Knox Gateway (“Gateway” or “Knox”) is a system that provides a single point of authentication and access for Apache Hadoop services in a cluster. The goal is to simplify Hadoop security for both users (i.e. those who access cluster data and execute jobs) and operators (i.e. those who control access and manage the cluster). The Gateway runs as a server (or cluster of servers) that serves one or more Hadoop clusters. It has a few key functions:

  • Provide perimeter security to make Hadoop security setup easier
  • Support authentication and token verification security scenarios
  • Deliver users a single cluster end-point that aggregates capabilities for data and jobs
  • Enable integration with enterprise and cloud identity management environments
  • Manage security across multiple clusters and multiple versions of Hadoop

Knox will be able to provide a security layer for multiple clusters and multiple versions of Hadoop simultaneously, and it will deliver a simple, intuitive management interface. Playing nice with others is always a security imperative, so Knox will integrate with existing Active Directory/LDAP frameworks, and it will allow extensions for custom authentication mechanisms.
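To make the single end-point idea concrete, here is a small illustrative sketch (not from the post; the host name, port, topology name, and path below are all hypothetical). The point is that every cluster service is reached through one gateway URL pattern, so clients authenticate once against the gateway instead of addressing individual Hadoop daemons directly:

```python
# Hypothetical sketch of a gateway-style URL pattern: clients talk to a
# single gateway endpoint, which proxies requests to services inside the
# cluster. "knox.example.com", port 8443, and the "sandbox" topology name
# are made-up values for illustration.

def knox_url(gateway_host, topology, service_path, port=8443):
    """Build a single-end-point URL that fronts a cluster service."""
    return f"https://{gateway_host}:{port}/gateway/{topology}/{service_path}"

# A WebHDFS directory listing, addressed through the gateway rather than
# directly against a NameNode:
url = knox_url("knox.example.com", "sandbox", "webhdfs/v1/tmp?op=LISTSTATUS")
print(url)
# https://knox.example.com:8443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS
```

Because the gateway is the only address a client ever sees, swapping the cluster behind it (or its Hadoop version) does not change the client-facing URL, which is what makes multi-cluster, multi-version management tractable.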

Availability

The short-term plan for the Knox team is to deliver a solid, working release in late March so that early adopters can begin to evaluate it and provide valuable feedback. This critical step will ensure that the gateway fits nicely into customers’ infrastructure and makes Hadoop easier to use… and more secure.


Comments

July 31, 2014 at 7:03 am

Knox has graduated from the Apache incubator.

kminder
August 21, 2013 at 2:41 pm

This project has moved to Apache:
http://knox.incubator.apache.org/
There is some documentation there and we are working on more.

abvs
July 2, 2013 at 3:53 am

Is this project still maintained? Any chance to see some documentation? The GitHub repo looks deserted.

    July 31, 2014 at 7:04 am

    Knox is an Apache project and is no longer maintained on GitHub.
