December 09, 2014
Adding a Federation Provider to Apache Knox

The architecture of Hortonworks Data Platform (HDP) matches the blueprint for Enterprise Apache Hadoop, with data management, data access, governance, operations and security. This post focuses on one of those core components: security. Specifically, we will focus on Apache Knox Gateway for securing access to the Hadoop REST APIs.

Pseudo Federation Provider

This blog will walk through the process of adding a new provider for establishing the identity of a user. We will use the simple example of the Pseudo authentication mechanism in Hadoop to illustrate ideas for extending the pre-authenticated federation provider that is available out of the box in Apache Knox. This provider is not yet ready for use in a production environment, but the example will highlight the general programming model for adding pre-authenticated federation providers. There is also a companion GitHub project for this article.

Provider Types

Apache Knox has two types of providers for establishing the identity of the source of an incoming REST request. One is an Authentication Provider and the other is a Federation Provider.

Authentication Providers

Authentication providers are responsible for actually collecting credentials of some sort from the end user. One example is HTTP BASIC authentication, where a username and password are authenticated against LDAP or an RDBMS. Apache Knox ships with HTTP BASIC authentication against LDAP using Apache Shiro. The Shiro provider can be configured in multiple ways.

Authentication providers are sometimes less than ideal since many organizations only want their users to provide credentials to the specific trusted solutions and to use some sort of SSO or federation of that authentication event across all other applications.

Federation Providers

Federation providers, on the other hand, never see the user’s actual credentials. Instead, they validate a token that represents a prior authentication event. This allows for greater isolation and protection of user credentials while still providing some means to verify the trustworthiness of the incoming identity assertions. OAuth 2, SAML assertions, JWT/SWT tokens and header-based identity propagation are all examples of mechanisms that federation providers can build on.

Out of the box, Apache Knox enables the use of custom headers for propagating things like the user principal and group membership through the HeaderPreAuth federation provider. This is generally useful for solutions such as CA SiteMinder and IBM Tivoli Access Manager. In these sorts of deployments, all traffic to Hadoop would go through the solution gateway, which would then authenticate the user and can inject the request with identity propagation headers.

The use of network security and the identity management solution does not allow requests to bypass the authenticating solution gateway. This provides a level of trust for accepting the header-based identity assertions. Knox can be configured to provide additional validation through a pluggable mechanism and IP address validation. This ensures that the requests are coming from a configured set of trusted IP addresses, presumably those of the solution gateway.
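The IP address check described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Knox's actual validator: the class name TrustedAddressValidator and the comma-separated configuration format are hypothetical, and the sketch simply fails closed when no trusted addresses are configured.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Knox: accepts a request only when its
// remote address appears in a configured list of trusted addresses.
class TrustedAddressValidator {
    private final Set<String> trusted = new HashSet<>();

    // Assumed configuration format: a comma-separated list of IP addresses.
    TrustedAddressValidator(String commaSeparatedAddresses) {
        if (commaSeparatedAddresses != null) {
            for (String address : commaSeparatedAddresses.split(",")) {
                String trimmed = address.trim();
                if (!trimmed.isEmpty()) {
                    trusted.add(trimmed);
                }
            }
        }
    }

    // Fails closed: an unknown or missing address is never trusted.
    boolean isTrusted(String remoteAddress) {
        return remoteAddress != null && trusted.contains(remoteAddress);
    }
}
```

In a servlet filter, the value passed to isTrusted would typically come from HttpServletRequest.getRemoteAddr(), with the trusted list holding the addresses of the solution gateway.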

Let’s Add a Federation Provider

This blog will discuss how to add a new federation provider that will extend the abstract bases that were introduced in the PreAuth provider module. It will be a minimal provider that accepts a request parameter from the incoming request.

The module and dependencies


<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>net.minder</groupId>
  <artifactId>gateway-provider-security-pseudo</artifactId>
  <version>0.0.1</version>

  <repositories>
    <repository>
      <id>apache.releases</id>
      <url>https://repository.apache.org/content/repositories/releases/</url>
    </repository>
  </repositories>

  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>org.apache.knox</groupId>
        <artifactId>gateway-spi</artifactId>
        <version>0.5.0</version>
      </dependency>
    </dependencies>
  </dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.knox</groupId>
      <artifactId>gateway-spi</artifactId>
      <version>0.5.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.knox</groupId>
      <artifactId>gateway-util-common</artifactId>
      <version>0.5.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.knox</groupId>
      <artifactId>gateway-provider-security-preauth</artifactId>
      <version>0.5.0</version>
    </dependency>
    <dependency>
      <groupId>org.eclipse.jetty.orbit</groupId>
      <artifactId>javax.servlet</artifactId>
      <version>3.0.0.v201112011016</version>
    </dependency>

    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.11</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.easymock</groupId>
      <artifactId>easymock</artifactId>
      <version>3.0</version>
      <scope>test</scope>
    </dependency>

    <dependency>
      <groupId>org.apache.knox</groupId>
      <artifactId>gateway-test-utils</artifactId>
      <scope>test</scope>
      <version>0.5.0</version>
    </dependency>
  </dependencies>

</project>

Dependencies

NOTE: the “version” element must match the version indicated in the pom.xml of the Knox project. Otherwise, building will fail.

gateway-provider-security-preauth

This particular federation provider is going to extend the existing PreAuth module with the capability to accept the user.name request parameter as an assertion by a trusted party of the user’s identity. Knox will use the preauth module to leverage the base classes for things like IP address validation.

gateway-spi

The gateway-spi dependency pulls in the general interfaces, base classes and utilities that are expected by the Apache Knox gateway. The core GatewayServices and other elements of gateway development are available through the gateway-spi module.

gateway-util-common

The gateway-util-common module provides common utility facilities for developing the gateway product. This is where you find the auditing, JSON and URL utility classes for gateway development.

javax.servlet from org.eclipse.jetty.orbit

This module provides the specific classes needed to implement the provider filter.

junit, easymock and gateway-test-utils

JUnit, EasyMock and gateway-test-utils provide the basis for writing REST-based unit tests for the Apache Knox Gateway project. They are used in all of the existing unit tests for the various modules that make up the gateway offering.

Apache Knox Topologies

In Apache Knox, individual Apache Hadoop clusters are represented by descriptors called topologies. These topologies deploy specific endpoints that expose and protect access to the services of the associated Hadoop cluster. The topology descriptor describes the available services and their respective URLs within the actual cluster. It also describes the policy for protecting access to those services.

The policy is defined through the description of various Providers. Each provider and service within a Knox topology has a role, and provider roles consist of:

  • authentication,
  • federation,
  • authorization, and
  • identity assertion

In this blog we are concerned with a Provider of type “federation.”

The Pseudo provider makes two assumptions. First, that authentication has happened at the OS level or from within another piece of middleware. The second assumption is that credentials were exchanged with some party other than Knox. This other party will be trusted by the Knox federation provider. The typical provider configuration will look something like this:


<provider>
  <role>federation</role>
  <name>Pseudo</name>
  <enabled>true</enabled>
</provider>

Ultimately, an Apache Knox topology manifests as a web application deployed within the gateway process. It exposes and protects the URLs associated with the services of the underlying Hadoop components in each cluster.

Providers generally interject a ServletFilter into the processing path of the REST API requests that enter the gateway and are dispatched to the Hadoop cluster. The mechanism used to interject the filters, their related configuration and integration into the gateway is the ProviderDeploymentContributor.

ProviderDeploymentContributor


/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.gateway.preauth.deploy;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

import org.apache.hadoop.gateway.deploy.DeploymentContext;
import org.apache.hadoop.gateway.deploy.ProviderDeploymentContributorBase;
import org.apache.hadoop.gateway.descriptor.FilterParamDescriptor;
import org.apache.hadoop.gateway.descriptor.ResourceDescriptor;
import org.apache.hadoop.gateway.topology.Provider;
import org.apache.hadoop.gateway.topology.Service;

public class PseudoAuthContributor extends ProviderDeploymentContributorBase {
  private static final String ROLE = "federation";
  private static final String NAME = "Pseudo";
  private static final String PREAUTH_FILTER_CLASSNAME =
      "org.apache.hadoop.gateway.preauth.filter.PseudoAuthFederationFilter";

  @Override
  public String getRole() {
    return ROLE;
  }

  @Override
  public String getName() {
    return NAME;
  }

  @Override
  public void contributeFilter(DeploymentContext context, Provider provider, Service service,
      ResourceDescriptor resource, List<FilterParamDescriptor> params) {
    // blindly add all the provider params as filter init params
    if (params == null) {
      params = new ArrayList<FilterParamDescriptor>();
    }
    Map<String, String> providerParams = provider.getParams();
    for (Entry<String, String> entry : providerParams.entrySet()) {
      params.add(resource.createFilterParam().name(entry.getKey().toLowerCase()).value(entry.getValue()));
    }
    resource.addFilter().name(getName()).role(getRole()).impl(PREAUTH_FILTER_CLASSNAME).params(params);
  }
}

The topology descriptor indicates which DeploymentContributors are required for a given cluster through the role and the name of the providers. The topology deployment machinery within Knox first looks up the required DeploymentContributor by role. In this example, it identifies the provider as being a type of federation. It then looks for the federation provider with the name of Pseudo.
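The two-step lookup, first by role and then by name, can be pictured as a nested map. The following is only a sketch of that idea, assuming nothing about how Knox implements it internally; the ContributorRegistry class is hypothetical and stores class names as plain strings to stay self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry illustrating role-then-name resolution.
class ContributorRegistry {
    // role -> (provider name -> contributor class name)
    private final Map<String, Map<String, String>> byRole = new HashMap<>();

    void register(String role, String name, String contributorClassName) {
        byRole.computeIfAbsent(role, r -> new HashMap<>()).put(name, contributorClassName);
    }

    // Returns null when either the role or the name is unknown.
    String lookup(String role, String name) {
        Map<String, String> byName = byRole.get(role);
        return byName == null ? null : byName.get(name);
    }
}
```

One consequence worth noting: in this sketch the match is exact, so the name used in a topology descriptor has to be the same string that the contributor returns from getName(), which is "Pseudo" in this example.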

Once the providers have been resolved into the required set of DeploymentContributors, each contributor is given the opportunity to contribute to the construction of the topology web application that exposes and protects the service APIs within the Hadoop cluster.

This particular DeploymentContributor needs to add the PseudoAuthFederationFilter servlet filter implementation to the topology specific filter chain. It will also add each of the provider parameters from the topology descriptor as filterConfig parameters. This enables the configuration of the resulting servlet filters from within the topology descriptor while encapsulating the specific implementation details of the provider from the end user.
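To make the parameter handling concrete, here is a small stand-alone sketch of the same transformation using plain maps instead of the Knox descriptor types. The ProviderParamMapper class is hypothetical; the only behavior it mirrors from the contributor above is that every provider parameter is copied and its name is lowercased.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper mirroring the loop in contributeFilter with plain maps.
class ProviderParamMapper {
    // Copies every provider param into the filter init params,
    // lowercasing the name and leaving the value untouched.
    static Map<String, String> toFilterInitParams(Map<String, String> providerParams) {
        Map<String, String> initParams = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : providerParams.entrySet()) {
            initParams.put(entry.getKey().toLowerCase(), entry.getValue());
        }
        return initParams;
    }
}
```

Because the names are lowercased, a filter written against this contributor should look up its init params by their lowercase names, whatever casing the topology descriptor used.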

PseudoAuthFederationFilter


/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.gateway.preauth.filter;

import java.security.Principal;
import java.util.Set;

import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;

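public class PseudoAuthFederationFilter extends AbstractPreAuthFederationFilter {

  @Override
  public void init(FilterConfig filterConfig) throws ServletException {
    super.init(filterConfig);
  }

  /**
   * Returns the asserted user identity, taken from the user.name request parameter.
   *
   * @param httpRequest the incoming request
   */
  @Override
  protected String getPrimaryPrincipal(HttpServletRequest httpRequest) {
    return httpRequest.getParameter("user.name");
  }

  /**
   * Adds group principals for the user; intentionally a no-op for this provider.
   *
   * @param request the incoming request
   * @param principals the set of principals to add group principals to
   */
  @Override
  protected void addGroupPrincipals(HttpServletRequest request,
      Set<Principal> principals) {
    // pseudo auth currently has no assertion of group membership
  }
}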

The PseudoAuthFederationFilter above extends AbstractPreAuthFederationFilter. This base class takes care of a number of boilerplate aspects of pre-authenticated providers that would otherwise have to be repeated in every provider. The two abstract methods that are specific to each provider are getPrimaryPrincipal and addGroupPrincipals. The base class calls these methods to determine which principals should be created and added to the Java Subject that will become the effective user identity for processing the incoming request.

getPrimaryPrincipal

Implementing the abstract method getPrimaryPrincipal allows the new provider to extract the established identity from the incoming request and communicate it back to the AbstractPreAuthFederationFilter. The base class will then add it to the Java Subject being created to represent the user’s identity. For this particular provider, all we have to do is return the request parameter by the name of “user.name.”

addGroupPrincipals

Given a set of Principals, addGroupPrincipals is an opportunity to add group principals to the resulting Java Subject that will be used to represent the user’s identity. This is done by adding new org.apache.hadoop.gateway.security.GroupPrincipal instances to the set. For the Pseudo authentication mechanism in Hadoop, there really is no way to communicate the group membership through the request parameters. One could easily envision adding an additional request parameter for this though, “user.groups” for example.
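As a rough illustration of that idea, the sketch below parses a hypothetical “user.groups” parameter into group principals. Several things here are assumptions rather than Knox behavior: the comma-separated format, the UserGroupsParser class, and the SimpleGroupPrincipal type, which merely stands in for Knox's own group principal class so that the example compiles on its own.

```java
import java.security.Principal;
import java.util.Objects;
import java.util.Set;

// Hypothetical helper: turns a comma-separated "user.groups" value
// into group principals and adds them to the given set.
class UserGroupsParser {

    // Stand-in for Knox's group principal type, defined here only to
    // keep the sketch self-contained.
    static final class SimpleGroupPrincipal implements Principal {
        private final String name;

        SimpleGroupPrincipal(String name) {
            this.name = name;
        }

        @Override
        public String getName() {
            return name;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof SimpleGroupPrincipal
                && ((SimpleGroupPrincipal) other).name.equals(name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name);
        }
    }

    // A missing parameter adds nothing; blank entries are skipped.
    static void addGroupPrincipals(String userGroupsParam, Set<Principal> principals) {
        if (userGroupsParam == null) {
            return;
        }
        for (String group : userGroupsParam.split(",")) {
            String trimmed = group.trim();
            if (!trimmed.isEmpty()) {
                principals.add(new SimpleGroupPrincipal(trimmed));
            }
        }
    }
}
```

Inside the filter, the first argument would come from request.getParameter("user.groups"). Like user.name itself, such a parameter is only as trustworthy as the party asserting it.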

Configure as an Available Provider


resources/META-INF/services/org.apache.hadoop.gateway.deploy.ProviderDeploymentContributor
##########################################################################
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
##########################################################################

org.apache.hadoop.gateway.preauth.deploy.PseudoAuthContributor
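The META-INF/services layout is the convention used by the JDK's java.util.ServiceLoader. Assuming Knox discovers contributors through that standard mechanism, which the file location suggests but this article does not state, discovery would look roughly like the sketch below. The DemoContributor interface and ContributorDiscovery class are hypothetical stand-ins so the example compiles without Knox on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical stand-in for the ProviderDeploymentContributor interface.
interface DemoContributor {
    String getRole();

    String getName();
}

class ContributorDiscovery {
    // Collects every implementation listed in a
    // META-INF/services/<fully qualified interface name> file on the classpath.
    static List<DemoContributor> discover() {
        List<DemoContributor> found = new ArrayList<>();
        for (DemoContributor contributor : ServiceLoader.load(DemoContributor.class)) {
            found.add(contributor);
        }
        return found;
    }
}
```

With no matching services file on the classpath, discover() simply returns an empty list, which is why a provider whose registration file is missing or misnamed is silently unavailable rather than causing an error at load time.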

Add to Knox as a Gateway Module

At this point, the module should be able to be built as a standalone module with:
mvn clean install

However, we want to extend the Apache Knox Gateway build to include the new module in its build and release processes. In order to do this we will need to add it to a common pom.xml file.

At the root of the project source tree there is a pom.xml file that defines all of the modules that are official components of the gateway server release. You can find each of these modules in the “modules” element. We need to add our new module declaration there:


<modules>
  ...
  <module>gateway-provider-security-pseudo</module>
  ...
</modules>

Then later in the same file we have to add a fuller definition of our module to the dependencyManagement/dependencies element:


<dependencyManagement>
  <dependencies>
    ...
    <dependency>
      <groupId>${gateway-group}</groupId>
      <artifactId>gateway-provider-security-pseudo</artifactId>
      <version>${gateway-version}</version>
    </dependency>
    ...
  </dependencies>
</dependencyManagement>

Gateway Release Module Pom.xml

Now, our Pseudo federation provider is building with the gateway project but it isn’t yet included in the gateway server release artifacts. In order to include it in the release archives and make it available to the runtime, we need to add it as a dependency to the appropriate release module. In this case, we are adding it to the pom.xml file within the gateway-release module:


<dependencies>
  ...
  <dependency>
    <groupId>${gateway-group}</groupId>
    <artifactId>gateway-provider-security-pseudo</artifactId>
  </dependency>
  ...
</dependencies>

Note that this is basically the same definition that was added to the root level pom.xml but minus the “version” element.

Build, Test and Deploy

At this point, we should have an integrated custom component that can be described for use within the Apache Knox topology descriptor file and engaged in the authentication of incoming requests for resources of the protected Hadoop cluster.

building

You may use the same Maven command as before to build:

mvn clean install

This will build and run the gateway unit tests.

You may use the following to not only build and run the tests but to also package up the release artifacts. This is a great way to quickly set up a test instance to manually test your new Knox functionality.

ant package

testing

To install the newly packaged release archive in a GATEWAY_HOME environment:

ant install-test-home

This will unzip the release bits into a local ./install directory and do some initial setup tasks to ensure that it is actually runnable.

We can now start a test LDAP server that is seeded with a couple of test users:

ant start-test-ldap

The sample topology files are set up to authenticate against this LDAP server for convenience and can be used as is for a quick sanity test of the install.

At this point, we can choose to run a test Knox instance or a debug Knox instance. If you want to run a test instance without the ability to connect a debugger then:

ant start-test-gateway

You may now test the out-of-box authentication against LDAP using HTTP BASIC by using curl and one of the simpler APIs exposed by Apache Knox:

curl -ivk --user guest:guest-password 'https://localhost:8443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS'

Change Topology Descriptor

Once the server is up and running and you are able to authenticate with HTTP BASIC against the test LDAP server, you can now change the topology descriptor to leverage your new federation provider.

Find the sandbox.xml file in the install/conf/topologies directory and edit it to reflect your provider type, name and any provider specific parameters.


<provider>
  <role>federation</role>
  <name>Pseudo</name>
  <enabled>true</enabled>
  <param>
    <name>filter-init-param-name</name>
    <value>value</value>
  </param>
</provider>

Once your federation provider is configured, just save the topology descriptor. Apache Knox will notice that the file has changed and it will automatically redeploy that particular topology. Any provider params described in the provider element will be added to the PseudoAuthFederationFilter as servlet filter init params. These can be used to configure aspects of the filter’s behavior.

curl again

We are now ready to use curl again to test the new federation provider and ensure that it is working as expected:

curl -ivk 'https://localhost:8443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS&user.name=guest'

Conclusion

This blog illustrated a simplified example of implementing a federation provider that establishes a user’s identity from a previous authentication event and propagates it into the request processing for Hadoop REST APIs inside of Apache Knox.

The process to extend the pre-authenticated federation provider is a quick and simple way to extend certain SSO capabilities to provide authenticated access to Hadoop resources through Apache Knox.

The Knox community is growing and welcomes contributions from users interested in extending Knox’s capabilities with useful features.

NOTE: The provider illustrated in this example has limitations that preclude it from being used in production. Most notably, it cannot follow redirects, because the Location header of a redirect response does not carry the user.name parameter. We would need to add something like a cookie to be able to determine the user identity on the redirected request.
