Apache Ranger

Comprehensive security for Enterprise Hadoop

Apache Ranger delivers a comprehensive approach to security for a Hadoop cluster. It provides central security policy administration across the core enterprise security requirements of authorization, audit and data protection.

Apache Ranger extends baseline features for coordinated enforcement across Hadoop workloads, from batch and interactive SQL to real-time. It represents a major step forward for the Hadoop ecosystem by providing a comprehensive approach to security, all as open source.

What Ranger Does

Apache Ranger offers a centralized security framework to manage fine-grained access control over Hadoop data access components like Apache Hive and Apache HBase. Using the Apache Ranger console, security administrators can manage policies for access to files, folders, databases, tables, or columns. These policies can be set for individual users or groups and then enforced within Hadoop.
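To make the idea of per-user and per-group policies concrete, here is a minimal sketch of how a policy on a database, table or column might be modeled and checked. The Policy class, its field names, the slash-separated resource paths and the is_allowed function are all assumptions made for illustration; they are not Ranger's actual policy model or API.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Policy:
    """One hypothetical access policy; field names are illustrative only."""
    resource: str                    # pattern such as "sales/orders/*" (database/table/column)
    access_types: frozenset          # permitted actions, e.g. {"select"}
    users: frozenset = frozenset()   # individual users the policy applies to
    groups: frozenset = frozenset()  # groups the policy applies to


def is_allowed(policies, user, user_groups, resource, access_type):
    """Return True if any policy grants access_type on resource to the user or one of their groups."""
    for p in policies:
        if not fnmatchcase(resource, p.resource):
            continue
        if access_type not in p.access_types:
            continue
        if user in p.users or p.groups & set(user_groups):
            return True
    return False  # deny when no policy matches


# Example data (assumed): analysts may run select on every column of sales.orders.
POLICIES = [
    Policy(
        resource="sales/orders/*",
        access_types=frozenset({"select"}),
        groups=frozenset({"analysts"}),
    )
]
```

With these example policies, a member of the analysts group can select from sales/orders/amount, while an update by the same user or a select by a user outside the group is denied.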

Security administrators can also use Apache Ranger to manage audit tracking and policy analytics for deeper control of the environment. The solution also provides an option to delegate administration of certain data to other group owners, with the aim of securely decentralizing data ownership.

Apache Ranger currently supports authorization, auditing and security administration for the following HDP components:

How Ranger Works

Apache Ranger has a decentralized architecture with the following internal components:

Component Description
Ranger portal
The portal is the central interface for security administration. Users can create and update policies, which are then stored in a policy database. Plugins within each component poll these policies at regular intervals.

The portal also includes an audit server that receives audit data collected from the plugins and stores it in HDFS or in a relational database.

Ranger plugins
Plugins are lightweight Java programs that are embedded within the processes of each cluster component. For example, the Apache Ranger plugin for Apache Hive is embedded within HiveServer2.

These plugins pull policies from a central server and store them locally in a file. When a user request comes through the component, the plugin intercepts the request and evaluates it against the security policy. Plugins also collect data from the user request and use a separate thread to send this data back to the audit server.
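The plugin flow described above (pull policies, keep a local copy in a file, evaluate each request, send audit data on a separate thread) can be sketched roughly as follows. This is an illustrative Python model only: real Ranger plugins are Java programs, and the PluginSketch class, the JSON cache format, and the fetch_policies and send_audit callables are all hypothetical names introduced here. A real plugin would also refresh on a timer; the sketch calls refresh once.

```python
import json
import queue
import tempfile
import threading
from pathlib import Path


class PluginSketch:
    """Hypothetical stand-in for a plugin; not Ranger's actual classes or file format."""

    def __init__(self, cache_file, fetch_policies, send_audit):
        self.cache_file = Path(cache_file)
        self.fetch_policies = fetch_policies  # assumed callable returning a list of policy dicts
        self.send_audit = send_audit          # assumed callable that delivers one audit record
        self.policies = []
        self.audit_queue = queue.Queue()
        self.worker = threading.Thread(target=self._drain_audits, daemon=True)
        self.worker.start()

    def refresh(self):
        """Pull policies from the central server and keep a local copy in a file."""
        self.cache_file.write_text(json.dumps(self.fetch_policies()))
        self.policies = json.loads(self.cache_file.read_text())

    def authorize(self, user, resource, action):
        """Evaluate an intercepted request against the cached policies, then queue an audit record."""
        allowed = any(
            p["resource"] == resource and action in p["actions"] and user in p["users"]
            for p in self.policies
        )
        self.audit_queue.put(
            {"user": user, "resource": resource, "action": action, "allowed": allowed}
        )
        return allowed

    def _drain_audits(self):
        """Runs on a separate thread so that auditing does not block the request path."""
        while True:
            record = self.audit_queue.get()
            if record is None:  # sentinel: stop the worker
                break
            self.send_audit(record)

    def close(self):
        """Flush queued audit records and stop the worker thread."""
        self.audit_queue.put(None)
        self.worker.join()


def run_demo():
    """Exercise the sketch with made-up policies and an in-memory audit sink."""
    sent = []
    with tempfile.TemporaryDirectory() as tmp:
        plugin = PluginSketch(
            Path(tmp) / "policies.json",
            fetch_policies=lambda: [
                {"resource": "sales.orders", "actions": ["select"], "users": ["alice"]}
            ],
            send_audit=sent.append,
        )
        plugin.refresh()
        granted = plugin.authorize("alice", "sales.orders", "select")
        denied = plugin.authorize("bob", "sales.orders", "select")
        plugin.close()
    return granted, denied, sent
```

Putting audit records on a queue keeps the authorization path short: the request thread only enqueues a record, and the worker thread handles delivery to the audit sink.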

User group sync
Apache Ranger provides a user synchronization utility to pull users and groups from Unix, LDAP or Active Directory. The user and group information is stored within the Ranger portal and used for policy definition.
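As a rough illustration of what pulling users and groups from Unix involves, the sketch below builds a user-to-groups mapping from account records. It is not Ranger's synchronization utility: the build_membership and read_unix_accounts functions and the User and Group record types are hypothetical, and only Python's standard pwd and grp modules are used to read local accounts.

```python
from collections import namedtuple

# Minimal records mirroring the fields of the pwd/grp structs that the sketch reads.
User = namedtuple("User", "pw_name pw_gid")
Group = namedtuple("Group", "gr_name gr_gid gr_mem")


def build_membership(users, groups):
    """Map each user name to the sorted list of group names it belongs to."""
    by_gid = {g.gr_gid: g.gr_name for g in groups}
    membership = {u.pw_name: set() for u in users}
    for u in users:
        if u.pw_gid in by_gid:  # primary group
            membership[u.pw_name].add(by_gid[u.pw_gid])
    for g in groups:
        for name in g.gr_mem:  # supplementary members
            if name in membership:
                membership[name].add(g.gr_name)
    return {name: sorted(gs) for name, gs in membership.items()}


def read_unix_accounts():
    """Read local accounts with the standard pwd and grp modules (Unix only)."""
    import grp
    import pwd
    return pwd.getpwall(), grp.getgrall()


# Example with made-up accounts; on a Unix host, pass read_unix_accounts() output instead.
EXAMPLE_USERS = [User("alice", 100), User("bob", 101)]
EXAMPLE_GROUPS = [
    Group("staff", 100, []),
    Group("analysts", 200, ["alice", "bob"]),
    Group("ops", 101, []),
]
EXAMPLE_MEMBERSHIP = build_membership(EXAMPLE_USERS, EXAMPLE_GROUPS)
```

A mapping of this kind is the sort of information a policy definition needs in order to resolve a group-level policy to individual users.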

Deployment

Ranger can be deployed manually or, starting with Ambari 2.0, through Ambari.

Hortonworks Focus for Ranger

Focus Planned Enhancements
Extension of support
Additional investments extend administration of authorization and auditing to more Hadoop components:

  • resource management (YARN)
  • search (Apache Solr)
  • messaging (Apache Kafka)
Deeper integration
  • API integration with HDFS
  • Support for new permissions within cluster components
Enterprise readiness
  • Centralizing audit for the entire platform
  • Enabling interactive audit queries through Solr
  • Global tag-based policies
Encryption
  • Production-ready KMS to support HDFS Transparent Data Encryption

Recent Progress in Ranger

Version Prior Enhancements
Apache Ranger 0.4
  • Support for authorization and audit in Apache Storm and Apache Knox
  • Integration with Apache Hive API, support of local grant/revoke permissions
  • Support grant/revoke in Apache HBase
  • Audit storage in HDFS
  • Windows support
  • REST APIs for policy manager
  • Support for Oracle database as a policy and audit store
HDP Advanced Security 3.5
  • Centralized security administration
  • Fine-grained access control for Apache Hadoop, Hive and HBase
  • Detailed resource auditing
  • Delegated administration
  • Audit of policy updates

Simplified, Comprehensive Hadoop Security with Ambari

Ambari 2.0 helps provision, manage and monitor Hadoop security in two ways. First, Ambari simplifies the setup, configuration and maintenance of Kerberos for strong authentication in the cluster. Second, Ambari includes support for installing and configuring Apache Ranger for centralized security administration, authorization and audit. For additional details, see the Apache Ambari page.

Try these Tutorials

Try the New Hadoop Security Tutorial

Manage Security Policy for Hive and HBase with Knox and Ranger

Try Ranger with Sandbox

Hortonworks Sandbox is a self-contained virtual machine with HDP running alongside a set of hands-on, step-by-step Hadoop tutorials.

Get Sandbox

View Past Webinars

Discovering Patterns for Cyber Defense Using Linked Data Analysis