Apache Knox Gateway
The Knox Gateway (“Knox”) is a system that provides a single point of authentication and access for Apache™ Hadoop® services in a cluster. The goal of the project is to simplify Hadoop security for users who access the cluster data and execute jobs, and for operators who control access and manage the cluster. Knox runs as a server (or cluster of servers) that serves one or more Hadoop clusters.
What Knox Gateway Does
Knox Gateway provides security for multiple Hadoop clusters. It is designed to:
- Provide perimeter security to make Hadoop security setup easier
- Support authentication and token verification security scenarios
- Deliver users a single cluster end-point that aggregates capabilities for data and jobs
- Enable integration with enterprise and cloud identity management environments
- Manage security across multiple clusters and multiple versions of Hadoop
How Knox Gateway Works
Knox aims to provide perimeter security that will integrate easily into existing security infrastructure. Delivering security to the Hadoop ecosystem is a critical community project. Knox needs to be woven into the very fabric of Hadoop in order to be effective, and being a part of the community will allow Knox to accomplish just that.
Currently, a Hadoop cluster is presented to consumers as a loose collection of independent services. This makes it difficult for users to interact with Hadoop, since each service maintains its own method of access and security. Configuring and administering a secure Hadoop cluster is complex, so many Hadoop administrators are faced with the choice of slowing their Hadoop rollout or running Hadoop without security.
The goal of the project is to provide coverage for all existing Hadoop ecosystem projects. In addition, the project will be extensible, so that future proprietary Hadoop components can be supported without changes to the Knox source code. Knox is expected to run in a DMZ environment where it will provide controlled access to multiple Hadoop services. In this way, Hadoop clusters can sit behind a firewall and be reached only through the gateway. The authentication components of the gateway will be modular and extensible so that they integrate easily with existing security infrastructure.
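To make the single-endpoint idea concrete, here is a minimal sketch of how a client might address a Hadoop REST service (WebHDFS in this case) through a gateway instead of contacting the service directly. It assumes a URL layout of the form https://{host}:{port}/{gateway-path}/{topology}/{service-path}. The port 8443, the "gateway" path, the "sandbox" topology name, the hostname, and both helper functions are illustrative assumptions, not values taken from this article; check your own Knox deployment for the real ones.

```python
import base64
from urllib.parse import urlencode


def knox_url(host, topology, service_path, port=8443,
             gateway_path="gateway", **params):
    """Build a URL that routes a REST call through the gateway.

    Hypothetical helper: the layout and the defaults are assumptions.
    """
    url = (f"https://{host}:{port}/{gateway_path}/{topology}/"
           f"{service_path.lstrip('/')}")
    if params:
        # Append query parameters such as op=LISTSTATUS for WebHDFS.
        url += "?" + urlencode(params)
    return url


def basic_auth_header(user, password):
    """Return an HTTP Basic Authorization header for the given credentials.

    Hypothetical helper: the gateway is assumed to authenticate the caller
    at the perimeter, so credentials are presented once, to the gateway.
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return {"Authorization": "Basic " + token.decode("ascii")}


if __name__ == "__main__":
    # Every service is reached via the same host and port; only the
    # trailing service path changes.
    print(knox_url("knox.example.com", "sandbox",
                   "webhdfs/v1/tmp", op="LISTSTATUS"))
```

The point of the sketch is that the client only needs to know one host, one port, and one set of credentials; the services behind the firewall are never addressed directly.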
Hortonworks provides the fastest path to innovation by working with the open source community to identify and develop enterprise requirements for Hadoop.
Business Value of Hadoop
Sources of Big Data are turning the conversation from “data analytics” to “big data analytics” because they hold significant business value.