Hadoop GroupMapping – LDAP Integration

LDAP provides a central source for maintaining users and groups within an enterprise. There are two ways to use LDAP groups within Hadoop. The first is to use OS-level configuration to read LDAP groups. The second is to explicitly configure Hadoop to use LDAP-based group mapping.

Here is an overview of steps to configure Hadoop explicitly to use groups stored in LDAP.

  • Create Hadoop service accounts in LDAP
  • Shut down the HDFS NameNode & YARN ResourceManager
  • Modify core-site.xml to point to LDAP for group mapping
  • Restart the HDFS NameNode & YARN ResourceManager
  • Verify LDAP-based group mapping

Prerequisites: Access to LDAP and the connection details are available.

Step 1: Create Hadoop service accounts in LDAP

Here is an example services.ldif file that defines the Hadoop service accounts (hcat, mapred, hdfs, yarn, hbase, zookeeper, oozie, hive). It also defines the hadoop group and makes the service accounts members of that group. Add the accounts and groups in the LDIF file to your LDAP directory. Here is an example that uses the ldapadd command to do that:

ldapadd -f /vagrant/provision/services.ldif -D cn=manager,dc=hadoop,dc=apache,dc=org -w hadoop

Note: The file path, bind DN and password in this command are specific to your environment.
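The contents of services.ldif are not reproduced in this post, so here is a minimal sketch of what such a file might look like, trimmed to two of the eight service accounts. The ou=people and ou=groups containers are assumptions for illustration, not taken from the original file. The object classes and attributes are chosen to line up with the filters used in Step 3 (applicationProcess entries matched on cn, a groupOfNames entry with member attributes).

```shell
# Write a sample LDIF file; the file name can be overridden via LDIF_FILE.
LDIF_FILE="${LDIF_FILE:-services.ldif}"

cat > "$LDIF_FILE" <<'EOF'
# Containers for accounts and groups (hypothetical names).
dn: ou=people,dc=hadoop,dc=apache,dc=org
objectClass: organizationalUnit
ou: people

dn: ou=groups,dc=hadoop,dc=apache,dc=org
objectClass: organizationalUnit
ou: groups

# Service accounts; repeat the pattern for the remaining services.
dn: cn=hdfs,ou=people,dc=hadoop,dc=apache,dc=org
objectClass: applicationProcess
cn: hdfs

dn: cn=yarn,ou=people,dc=hadoop,dc=apache,dc=org
objectClass: applicationProcess
cn: yarn

# The hadoop group lists each service account DN as a member.
dn: cn=hadoop,ou=groups,dc=hadoop,dc=apache,dc=org
objectClass: groupOfNames
cn: hadoop
member: cn=hdfs,ou=people,dc=hadoop,dc=apache,dc=org
member: cn=yarn,ou=people,dc=hadoop,dc=apache,dc=org
EOF
```

The resulting file can then be loaded with the ldapadd command shown above, pointing -f at wherever the file was written.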

Step 2: Shut down Hadoop

See the Hortonworks Data Platform documentation for steps on shutting down HDFS NameNode & YARN ResourceManager.

Step 3: Modify core-site.xml to point to LDAP for group mapping

Back up your core-site.xml before modifying it. Below is a sample configuration that needs to be added to core-site.xml. You will need to provide values for the bind user, bind password and other properties specific to your LDAP, and make sure the object class, user filter and group filter match the values specified in services.ldif.

<property>
  <name>hadoop.security.group.mapping</name>
  <value>org.apache.hadoop.security.LdapGroupsMapping</value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.bind.user</name>
  <value>cn=Manager,dc=hadoop,dc=apache,dc=org</value>
</property>
<!--
<property>
  <name>hadoop.security.group.mapping.ldap.bind.password.file</name>
  <value>/etc/hadoop/conf/ldap-conn-pass.txt</value>
</property>
-->
<property>
  <name>hadoop.security.group.mapping.ldap.bind.password</name>
  <value>hadoop</value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.url</name>
  <value>ldap://localhost:389/dc=hadoop,dc=apache,dc=org</value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.base</name>
  <value></value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.search.filter.user</name>
  <value>(&amp;(|(objectclass=person)(objectclass=applicationProcess))(cn={0}))</value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.search.filter.group</name>
  <value>(objectclass=groupOfNames)</value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.search.attr.member</name>
  <value>member</value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.search.attr.group.name</name>
  <value>cn</value>
</property>
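
Pasting property blocks into core-site.xml makes it easy to end up with the same property twice. As a quick sanity check after editing, something along these lines could list property names that appear more than once. The function name duplicate_properties and the CORE_SITE variable are hypothetical, and the default path is assumed from the /etc/hadoop/conf directory used in the sample configuration.

```shell
# Print every property name that occurs more than once in a Hadoop XML config file.
# This is a plain text match, so properties inside XML comments are counted too.
duplicate_properties() {
  grep -o '<name>[^<]*</name>' "$1" \
    | sed -e 's/<name>//' -e 's,</name>,,' \
    | sort \
    | uniq -d
}

# Assumed location; override CORE_SITE for your installation.
CORE_SITE="${CORE_SITE:-/etc/hadoop/conf/core-site.xml}"
if [ -f "$CORE_SITE" ]; then
  duplicate_properties "$CORE_SITE"
fi
```

No output means each property name was found only once.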

While the group mapping configuration supports reading the bind password from a file, the relevant property is commented out in the sample core-site.xml configuration above because of a bug (HADOOP-10249).
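
If your Hadoop release includes a fix for that issue, the password file is the better option because it keeps the bind password out of core-site.xml. The following is a sketch of one way to create such a file, not a required procedure. The function name write_password_file and the LDAP_BIND_PASSWORD variable are hypothetical. The file is written without a trailing newline so that the result does not depend on how trailing whitespace in the file is handled.

```shell
# Write a password to a file with no trailing newline and owner-only read access.
write_password_file() {
  file="$1"
  password="$2"
  # The subshell keeps the restrictive umask from leaking into the caller.
  ( umask 077 && printf '%s' "$password" > "$file" )
  chmod 400 "$file"
}

# Example, using the path from the commented-out property above:
#   write_password_file /etc/hadoop/conf/ldap-conn-pass.txt "$LDAP_BIND_PASSWORD"
```

The file's ownership still has to allow the accounts that run the NameNode and ResourceManager to read it, so adjust the owner, group or mode to fit your setup.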

Step 4: Restart Hadoop

Follow the instructions in the Hortonworks Data Platform documentation to restart the HDFS NameNode & YARN ResourceManager.

Step 5: Verify LDAP group mapping

Run the hdfs groups command. It fetches the groups for the current user from LDAP. Note that with LDAP group mapping configured, HDFS permissions can use groups defined in LDAP for access control.
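
If the groups reported by Hadoop look wrong, it can help to run the equivalent lookups directly against LDAP. As far as I understand it, the LDAP group mapping first finds the user entry with the user filter, then searches for group entries that match the group filter and list that user's DN in the member attribute. The sketch below mirrors those two searches with ldapsearch. The helper names (user_filter, group_filter, ldap_query) and the ou=people DN in the example are hypothetical; the connection values come from the sample configuration in Step 3.

```shell
# Connection details from the sample core-site.xml; adjust for your LDAP.
LDAP_URL="ldap://localhost:389"
BASE_DN="dc=hadoop,dc=apache,dc=org"
BIND_DN="cn=Manager,dc=hadoop,dc=apache,dc=org"

# The search.filter.user value with {0} replaced by the user name.
user_filter() {
  printf '(&(|(objectclass=person)(objectclass=applicationProcess))(cn=%s))' "$1"
}

# The search.filter.group value combined with search.attr.member = user DN.
group_filter() {
  printf '(&(objectclass=groupOfNames)(member=%s))' "$1"
}

# Run a search with simple bind; -W prompts for the bind password.
ldap_query() {
  ldapsearch -x -H "$LDAP_URL" -D "$BIND_DN" -W -b "$BASE_DN" "$@"
}

# Example session (the DN is hypothetical):
#   hdfs groups hdfs
#   ldap_query "$(user_filter hdfs)" dn
#   ldap_query "$(group_filter 'cn=hdfs,ou=people,dc=hadoop,dc=apache,dc=org')" cn
```

The cn values returned by the last search should match the group names that hdfs groups prints for the same user.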

Conclusion

Since there are two ways in Hadoop to use groups in LDAP, a basic question is when to use each way. The OS based group mapping is a Linux/Unix method and won’t work on Windows. The explicit group mapping covered in this post will work on both Linux & Windows.

Let me know if you run into any issues with the steps in this post or have any comments on it. In the next post I will cover configuring the OS to read group information from LDAP.


Comments

Charles Slovak | April 19, 2014 at 10:57 am

How does this work with LDAP connecting to AD from Windows?
