December 23, 2014

Apache Ranger Audit Framework

Introduction

Apache Ranger provides centralized security for the Enterprise Hadoop ecosystem, including fine-grained access control and a centralized audit mechanism. This blog covers the audit framework options available in Apache Ranger Release 0.4.0 in HDP 2.2 and how they can be configured.

The audit framework can be configured to send access audit logs generated by Apache Ranger plug-ins to one or more of the following destinations:

  • RDBMS: MySQL or Oracle
  • HDFS
  • Log4j appender

Logging to RDBMS

The Ranger audit framework supports saving audit logs to an RDBMS. Currently, MySQL and Oracle are the supported databases, with others such as Postgres on the roadmap. Interactive audit reporting in the Ranger Administration portal reads the audit logs from the RDBMS.

The database schema for audit logging is generated during the installation of Ranger Admin. Before running setup.sh to set up Ranger Admin, specify the audit database details in the install.properties file, as shown in the example below. (Please refer to the Ranger Admin installation documentation for details of install.properties and setup.sh.)


DB_FLAVOR=MYSQL
SQL_COMMAND_INVOKER=mysql
db_root_password=secretPa5$
db_host=mysqldb.example.com
audit_db_name=ranger_audit
audit_db_user=ranger_audit
audit_db_password=secretPa5$

During setup of each Ranger plug-in (HDFS/Hive/HBase/Knox/Storm), i.e. before running enable-*-plugin.sh, specify the audit database details in install.properties, as shown in the example below. (Please refer to the Ranger plug-in installation documentation for details of install.properties and enable-*-plugin.sh.) Be sure to provide the same database details that were used during Ranger Admin setup.


XAAUDIT.DB.IS_ENABLED=true
XAAUDIT.DB.FLAVOUR=MYSQL
XAAUDIT.DB.HOSTNAME=mysqldb.example.com
XAAUDIT.DB.DATABASE_NAME=ranger_audit
XAAUDIT.DB.USER_NAME=ranger_audit
XAAUDIT.DB.PASSWORD=secretPa5$
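
Because the plug-in must point at the same audit database as Ranger Admin, it can help to compare the two install.properties files before running the setup scripts. The following is a minimal sketch, not part of Ranger: the function names are hypothetical, and the pairing of Admin keys to plug-in keys is an assumption inferred from the two examples above.

```python
# Pairs of (Ranger Admin key, plug-in key) assumed to describe the same setting.
# This pairing is inferred from the example files, not taken from Ranger itself.
KEY_PAIRS = [
    ("DB_FLAVOR", "XAAUDIT.DB.FLAVOUR"),
    ("db_host", "XAAUDIT.DB.HOSTNAME"),
    ("audit_db_name", "XAAUDIT.DB.DATABASE_NAME"),
    ("audit_db_user", "XAAUDIT.DB.USER_NAME"),
    ("audit_db_password", "XAAUDIT.DB.PASSWORD"),
]


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def find_mismatches(admin_text, plugin_text):
    """Return the key pairs whose values differ between the two files."""
    admin = parse_properties(admin_text)
    plugin = parse_properties(plugin_text)
    return [(a, p) for a, p in KEY_PAIRS if admin.get(a) != plugin.get(p)]
```

Calling find_mismatches with the contents of both files returns an empty list when the audit database settings agree, and otherwise the pairs that need attention, for example a misspelled host name.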

Audit logging to RDBMS can be configured to be synchronous or asynchronous. In synchronous mode, a call to audit blocks the calling thread until the audit log is committed to the database. In asynchronous mode, a call to audit returns quickly after adding the audit log to an in-memory queue; another thread in the audit framework reads from this queue and saves to the RDBMS. In asynchronous mode, a single database commit can include a number of audit logs (batch commit), which can result in significant performance improvements. If the in-memory queue is full, the audit log is dropped; periodic log messages are written to the component log file with the count of dropped audit logs.
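
The asynchronous behaviour described above (bounded queue, drop when full, batch commit) can be illustrated with a small sketch. This is not Ranger's implementation: the class name, the commit_batch callback, and the max_batch_size parameter are hypothetical, chosen only to show the pattern.

```python
import queue
import threading


class AsyncAuditQueue:
    """Bounded in-memory queue that drops audit events when it is full."""

    def __init__(self, commit_batch, max_queue_size=10240, max_batch_size=100):
        self._queue = queue.Queue(maxsize=max_queue_size)
        self._commit_batch = commit_batch      # hypothetical callback: one commit per call
        self._max_batch_size = max_batch_size  # hypothetical limit, not a Ranger setting
        self._dropped = 0
        self._lock = threading.Lock()

    def log(self, event):
        """Called on the request thread; returns immediately, never blocks."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            with self._lock:
                self._dropped += 1  # counted so the drop total can be reported later
            return False

    def dropped_count(self):
        with self._lock:
            return self._dropped

    def drain_once(self):
        """Called on the writer thread; commits up to max_batch_size events together."""
        batch = []
        while len(batch) < self._max_batch_size:
            try:
                batch.append(self._queue.get_nowait())
            except queue.Empty:
                break
        if batch:
            self._commit_batch(batch)
        return len(batch)
```

In a real deployment a background thread would call drain_once on a flush interval and periodically write dropped_count to the component log; the sketch leaves the thread out so the behaviour stays easy to follow.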

The default mode for audit logging to RDBMS is asynchronous. To change the logging mode or other settings, such as the size of the in-memory queue, update xasecure-audit.xml in the CLASSPATH, which is typically in the component’s configuration directory (for example, /etc/hadoop/conf/xasecure-audit.xml). Restart the component for configuration changes to take effect. A list of available configurations is provided in the Configuration section below.

To handle a high rate and volume of audit logs in your environment, we suggest you plan for appropriate database sizing, partitioning, and an automated way of purging old logs.

Logging to HDFS

The Ranger audit framework can be configured to store audit logs in HDFS, in JSON format (example below). Audit logs in HDFS can later be queried and reported on by other applications, such as Apache Hive. Please note that the audit reporting functionality in the Ranger Administration Portal currently uses only the audit logs stored in RDBMS.

A sample Apache HBase access audit log in JSON format:


{
"resource":"tbl_xyz",
"resType":"table",
"reqUser":"user1",
"evtTime":"2014-11-25 22:40:33.946",
"access":"createTable",
"result":1,
"enforcer":"xasecure-acl",
"repoType":2,
"repo":"hbasedev",
"cliIP":"172.18.145.43",
"action":"createTable",
"agentHost":"host1",
"logType":"RangerAudit",
"id":"eb45f6e8-6737-4174-92f6-45a9beabf5e7"
}
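
As an illustration of processing these logs outside the Ranger portal, the sketch below counts audit events per user, access type, and result. It assumes each audit record is stored as one JSON object per line (the sample above is pretty-printed for readability); the function name is hypothetical, and the sample record is abbreviated from the one above.

```python
import json
from collections import Counter


def summarize_audit_lines(lines):
    """Count audit events per (user, access, result) from JSON lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        counts[(record["reqUser"], record["access"], record["result"])] += 1
    return counts


# Abbreviated copy of the sample record, written on a single line.
sample = (
    '{"resource":"tbl_xyz","resType":"table","reqUser":"user1",'
    '"evtTime":"2014-11-25 22:40:33.946","access":"createTable","result":1,'
    '"repo":"hbasedev","cliIP":"172.18.145.43"}'
)
summary = summarize_audit_lines([sample, "", sample])
```

The same grouping could be expressed as a Hive query over the HDFS files; the Python version is only meant to show which fields are useful as grouping keys.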

During setup of each Ranger plug-in (HDFS/Hive/HBase/Knox/Storm), i.e. before running enable-*-plugin.sh, specify the HDFS audit log properties in install.properties, as shown in the example below. Be sure to create the necessary HDFS, staging, and archive directories with read and write privileges for the plug-in’s user or owner.

XAAUDIT.HDFS.IS_ENABLED=true
XAAUDIT.HDFS.DESTINATION_DIRECTORY=hdfs://namenode.example.com:8020/ranger/audit/%app-type%/%time:yyyyMMdd%
XAAUDIT.HDFS.LOCAL_BUFFER_DIRECTORY=/var/log/hadoop/%app-type%/audit
XAAUDIT.HDFS.LOCAL_ARCHIVE_DIRECTORY=/var/log/hadoop/%app-type%/audit/archive

More details on the tags supported in the file/directory name specifications are provided later in this section.

To minimize the performance impact, calls that create an audit log write it to a staging file on the host where the component runs. The local staging file is rolled over periodically, every 10 minutes by default. After a rollover, another thread in the audit framework appends the staged file contents to an HDFS file. Depending on the rollover intervals configured for the HDFS file and the local staging file, multiple local staged files can be written to the same HDFS file.
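
As a rough worked example of the last point, the number of staged files that feed one HDFS file is approximately the ratio of the two rollover intervals. The sketch below uses the default values listed in the Configuration section (86400 seconds for the HDFS file, 600 seconds for the staging file); the function name is hypothetical and the result is an estimate, not a guarantee of how Ranger schedules writes.

```python
def staged_files_per_hdfs_file(hdfs_rollover_seconds=86400, local_rollover_seconds=600):
    """Estimate how many local staging files are appended to one HDFS file."""
    return hdfs_rollover_seconds // local_rollover_seconds


default_estimate = staged_files_per_hdfs_file()  # 86400 // 600 = 144
```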

Saving audit logs to the local staging file can be either synchronous or asynchronous. In synchronous mode, a call to audit blocks the thread until the log is written to the staging file. By contrast, in asynchronous mode, a call to audit returns quickly after adding the audit log to an in-memory queue; a separate thread in the audit framework reads from this queue and writes to the local staging file. If the in-memory queue is full when an audit call is made, the audit log is not recorded. To keep track of unrecorded audit logs, a count of them is periodically written to the component log.

As with logging to RDBMS, the default mode for audit logging to HDFS is asynchronous. The logging mode and other settings, such as the size of the in-memory queue and the rollover period, can be changed by updating xasecure-audit.xml in the CLASSPATH (typically in the component’s configuration directory, for example /etc/hadoop/conf/xasecure-audit.xml). Restart the component for changes to take effect. A list of available configurations is provided in the Configuration section below.

To help organize the audit logs in the file system, the Ranger audit framework supports various tags in file and directory names. At the time of file creation, the audit framework replaces these tags with the appropriate values. The supported tags are:

  1. %hostname%

    Name of the current host in which the audit framework is executing.

  2. %time:date-format-specification%

    Current time formatted using the given specification. For more details on the supported format specification, please refer to Java SimpleDateFormat documentation.

  3. %jvm-instance%

    Unique identifier of the JVM instance in which the audit framework is executing, generated using the Java VMID class.

  4. %property:system-property-name%

    Value of the given system property name in the JVM where audit framework is executing.

  5. %env:env-variable-name%

    Value of the given environment variable in the JVM where audit framework is executing.

  6. %app-type%

    Type of the application the audit framework runs in:
    hdfs, hiveServer2, hbaseMaster, hbaseRegional, knox, storm
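
The tag substitution described above can be sketched as follows. This is an illustration, not Ranger's code: the function names are hypothetical, only a handful of SimpleDateFormat letters (yyyy, MM, dd, HH, mm, ss) are translated, the %jvm-instance% and %property:...% tags are left untouched because they have no direct Python equivalent, and returning an empty string for an unset environment variable is an assumption.

```python
import os
import re
import socket
from datetime import datetime

# Minimal mapping from Java SimpleDateFormat tokens to strftime codes.
# The real SimpleDateFormat supports many more pattern letters.
_JAVA_TO_STRFTIME = {"yyyy": "%Y", "MM": "%m", "dd": "%d",
                     "HH": "%H", "mm": "%M", "ss": "%S"}
_JAVA_TOKEN = re.compile("|".join(_JAVA_TO_STRFTIME))
_TAG = re.compile(r"%([^%]+)%")


def _format_time(java_pattern, now):
    # Translate all tokens in a single pass so replacements cannot collide.
    fmt = _JAVA_TOKEN.sub(lambda m: _JAVA_TO_STRFTIME[m.group(0)], java_pattern)
    return now.strftime(fmt)


def expand_tags(template, app_type, now=None, hostname=None, environ=None):
    """Replace supported %tag% markers in a file or directory name."""
    now = now or datetime.now()
    hostname = hostname or socket.gethostname()
    environ = os.environ if environ is None else environ

    def replace(match):
        tag = match.group(1)
        if tag == "hostname":
            return hostname
        if tag == "app-type":
            return app_type
        if tag.startswith("time:"):
            return _format_time(tag[len("time:"):], now)
        if tag.startswith("env:"):
            return environ.get(tag[len("env:"):], "")  # assumption: empty if unset
        return match.group(0)  # unsupported tags are left as-is

    return _TAG.sub(replace, template)
```

For example, expanding the destination directory from the install.properties sample with app_type set to hbaseMaster and a fixed date of 2014-11-25 yields a path ending in /ranger/audit/hbaseMaster/20141125.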

Logging using Log4j

The Ranger audit framework supports sending audit logs to log4j appender(s). Using this mechanism, you can send Ranger audit logs to any destination that has a log4j appender. To receive audit logs in JSON format, update the component’s log4j configuration to specify the appender(s) in the following property:

  • log4j.logger.xaaudit=

Configuration

The Ranger audit framework reads its configuration from xasecure-audit.xml in the CLASSPATH, typically in the conf directory of the Hadoop component in which the Ranger plug-in runs. This file is populated with values provided by the user during Ranger plug-in installation, and it contains more configuration options than those available during installation. The configurations supported in xasecure-audit.xml, along with details of the values for each, are listed in the following table. Please note that the component must be restarted for changes to this file to take effect.

| Configuration Name | Default Value | Notes |
| --- | --- | --- |
| xasecure.audit.is.enabled | true | Setting to enable/disable audit logging in the Ranger plug-in. true: enable audit log; false: disable audit log |
| xasecure.audit.db.is.enabled | false | true: enable audit to RDBMS; false: disable audit to RDBMS |
| xasecure.audit.db.is.async | false | true: send audit logs to DB asynchronously; false: send audit logs to DB synchronously |
| xasecure.audit.db.async.max.queue.size | 10240 | Maximum number of audit logs to keep in the queue. An attempt to create an audit log when the queue is at its maximum results in the audit log being dropped. |
| xasecure.audit.db.async.max.flush.interval.ms | 5000 | Maximum interval between commits to the database. |
| xasecure.audit.db.config.retry.min.interval.ms | 15000 | Interval between attempts to connect to the database after a failure. |
| xasecure.audit.jpa.javax.persistence.jdbc.driver | None | JDBC driver to connect to the DB. Examples: MySQL: net.sf.log4jdbc.DriverSpy; Oracle: oracle.jdbc.OracleDriver |
| xasecure.audit.jpa.javax.persistence.jdbc.url | None | JDBC URL to connect to the DB. |
| xasecure.audit.jpa.javax.persistence.jdbc.password | None | Password to connect to the DB. |
| xasecure.audit.hdfs.is.enabled | false | true: enable audit to HDFS; false: disable audit to HDFS |
| xasecure.audit.hdfs.is.async | false | true: send audit logs asynchronously; false: send audit logs synchronously |
| xasecure.audit.hdfs.async.max.queue.size | 10240 | Maximum number of audit logs to keep in the queue. An attempt to create an audit log when the queue is at its maximum results in the audit log being dropped. |
| xasecure.audit.hdfs.config.destination.directroy | None | Absolute path to the HDFS directory in which audit logs should be stored. See the tags supported in file/directory names, described under Logging to HDFS. |
| xasecure.audit.hdfs.config.destination.file | None | Name of the HDFS file to which audit logs should be written. See the tags supported in file/directory names, described under Logging to HDFS. |
| xasecure.audit.hdfs.config.destination.flush.interval.seconds | 900 (15 minutes) | Interval between calls to hflush on the destination HDFS file. |
| xasecure.audit.hdfs.config.destination.rollover.interval.seconds | 86400 (1 day) | Interval between rollovers of the destination HDFS file. |
| xasecure.audit.hdfs.config.destination.open.retry.interval.seconds | 60 (1 minute) | Interval between calls to flush audit logs written to staging file. |
| xasecure.audit.hdfs.config.local.buffer.rollover.interval.seconds | 600 (10 minutes) | Interval between rollovers of the staging file. |
| | None | Absolute path to the local directory to store audit log files after sending to HDFS. See the tags supported in file/directory names, described under Logging to HDFS. |
| xasecure.audit.hdfs.config.local.archive.max.file.count | None | Maximum number of files to store in the archive directory. |
| xasecure.audit.log4j.is.enabled | false | true: enable audit to log4j; false: disable audit to log4j |
| xasecure.audit.log4j.is.async | false | true: send audit logs asynchronously; false: send audit logs synchronously |
| xasecure.audit.log4j.async.max.queue.size | 10240 | Maximum number of audit logs to keep in the queue. An attempt to create an audit log when the queue is at its maximum results in the audit log being dropped. |
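
To check which audit destinations a component is configured to use, the is.enabled properties listed above can be read from xasecure-audit.xml. The sketch below assumes the file follows the usual Hadoop configuration layout (a configuration element containing property elements with name and value children); the function names are hypothetical and the sample XML is illustrative only.

```python
import xml.etree.ElementTree as ET


def read_audit_config(xml_text):
    """Return a dict of property name -> value from Hadoop-style config XML."""
    root = ET.fromstring(xml_text)
    config = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            config[name.strip()] = (prop.findtext("value") or "").strip()
    return config


def enabled_destinations(config):
    """List the audit destinations whose is.enabled property is true."""
    keys = {
        "db": "xasecure.audit.db.is.enabled",
        "hdfs": "xasecure.audit.hdfs.is.enabled",
        "log4j": "xasecure.audit.log4j.is.enabled",
    }
    return [dest for dest, key in keys.items()
            if config.get(key, "false").lower() == "true"]


# Illustrative sample, not a complete xasecure-audit.xml.
SAMPLE_XML = """
<configuration>
  <property><name>xasecure.audit.is.enabled</name><value>true</value></property>
  <property><name>xasecure.audit.db.is.enabled</name><value>true</value></property>
  <property><name>xasecure.audit.hdfs.is.enabled</name><value>true</value></property>
  <property><name>xasecure.audit.log4j.is.enabled</name><value>false</value></property>
</configuration>
"""
destinations = enabled_destinations(read_audit_config(SAMPLE_XML))
```

In practice the XML text would be read from the component's configuration directory, for example /etc/hadoop/conf/xasecure-audit.xml.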