Apache Knox Gateway

A single point of secure access for Hadoop clusters
The Knox Gateway (“Knox”) is a system that provides a single point of authentication and access for Apache™ Hadoop® services in a cluster. The goal of the project is to simplify Hadoop security both for users who access cluster data and execute jobs, and for operators who control access and manage the cluster. Knox runs as a server (or cluster of servers) that serves one or more Hadoop clusters.

What Knox Gateway Does

Knox Gateway provides security for multiple Hadoop clusters. It is designed to:

  • Provide perimeter security to make Hadoop security setup easier
  • Support authentication and token verification security scenarios
  • Give users a single cluster endpoint that aggregates capabilities for data and jobs
  • Enable integration with enterprise and cloud identity management environments
  • Manage security across multiple clusters and multiple versions of Hadoop

How Knox Gateway Works

Knox aims to provide perimeter security that integrates easily into existing security infrastructure. Securing the Hadoop ecosystem is a critical community effort: to be effective, Knox needs to be woven into the fabric of Hadoop, and being developed as part of the community is what allows it to do so.

Currently, a Hadoop cluster is presented to consumers as a loose collection of independent services. This makes it difficult for users to interact with Hadoop, since each service maintains its own method of access and security. Configuring and administering a secure Hadoop cluster is complex, so many Hadoop administrators are faced with the choice of slowing their Hadoop rollout or running Hadoop without security.

The goal of the project is to provide coverage for all existing Hadoop ecosystem projects. In addition, the project will be extensible, so that future proprietary Hadoop components can be supported without changes to the Knox source code. Knox is expected to run in a DMZ environment, where it will provide controlled access to multiple Hadoop services. In this way, Hadoop clusters can sit behind a firewall, with the gateway as the only controlled way in. The authentication components of the gateway will be modular and extensible, so that they integrate easily with existing security infrastructure.
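As a rough illustration of the single-endpoint idea described above, the sketch below builds the URLs a client might use to reach two different Hadoop services through one gateway address. The host name, port, context path, cluster name and service paths are assumptions chosen for illustration; an actual deployment defines its own values. No network call is made.

```python
from urllib.parse import quote, urlencode

# Assumed values for illustration only; a real deployment defines its own.
GATEWAY_HOST = "knox.example.com"  # hypothetical gateway host in the DMZ
GATEWAY_PORT = 8443                # assumed HTTPS port
GATEWAY_PATH = "gateway"           # assumed gateway context path
CLUSTER_NAME = "sandbox"           # assumed name of the cluster behind the gateway


def gateway_url(service_path, resource="", **params):
    """Build a URL for a service resource as exposed through the gateway.

    Every service shares the same scheme, host, port and base path, so
    the firewall only needs to expose this one endpoint.
    """
    base = f"https://{GATEWAY_HOST}:{GATEWAY_PORT}/{GATEWAY_PATH}/{CLUSTER_NAME}"
    url = f"{base}/{service_path.strip('/')}"
    if resource:
        url += "/" + quote(resource.strip("/"))
    if params:
        url += "?" + urlencode(params)
    return url


# Two different services, one point of access (service paths are assumed).
webhdfs_ls = gateway_url("webhdfs/v1", "/tmp", op="LISTSTATUS")
webhcat_status = gateway_url("templeton/v1", "status")

print(webhdfs_ls)
print(webhcat_status)
```

Under these assumptions, a client only ever needs the gateway address and one set of credentials, while the individual service hosts stay hidden behind the firewall.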

Apache Top-Level Project Since
February 2013

Try Knox Gateway with Sandbox

Hortonworks Sandbox is a self-contained virtual machine with HDP running alongside a set of hands-on, step-by-step Hadoop tutorials.


