August 14, 2013

How To Enable NFS Access to HDFS in Hortonworks Sandbox

In this blog we'll set up NFS access to HDFS with the Hortonworks Sandbox 1.3. This lets desktop users read and write files in Hadoop using methods they already know. The Sandbox is a great way to explore this type of access.

If you don’t have it already, then download the sandbox here. Got the download? Then let’s get started.

Start the Sandbox. Get to this screen.

We will now enable Ambari so that we can edit the configuration to enable NFS. Open an SSH session to the Sandbox and log in as root. The 'root' account password is 'hadoop'.

Install the NFS Server bits for the Linux OS.

yum install nfs* -y

You may have to enable an externally facing network adapter so that the yum command can resolve the correct repository. If this is not possible, you will need the package called nfs-utils for CentOS 6.

Start the Ambari Server

Navigate to the IP address shown when the sandbox starts; documentation on starting Ambari is provided there. The steps are summarized below. Be sure to reboot the virtual machine after you have run the start_ambari script.
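As a sketch, the sequence from the root SSH session looks like this (the script's exact path is an assumption based on Sandbox 1.3; check the sandbox's own startup documentation if your image differs):

```shell
# Run the Ambari start script from the root SSH session
# (path is an assumption; adjust for your sandbox image),
# then reboot so the change takes effect.
/root/start_ambari.sh
reboot
```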

Open the Ambari UI in the browser.

Sign in with username: admin, password: admin.

We will now update the HDFS configs to enable NFS. To do that, we'll need to stop the HDFS and MapReduce services, update the configs, and restart HDFS and MapReduce. MapReduce must be stopped first, followed by HDFS.

Go to Services tab on top, then select MapReduce, and click Stop.

Go to Services tab on top, then select HDFS on the left, and choose Configs sub tab.

Click the Stop button to stop the HDFS services.

A successful service stoppage will show this:

In the Configs tab, open the Advanced section and change the value of dfs.access.time.precision to 3600000. (From the command line, this property would be edited in hdfs-site.xml.)

In the same section, change the value for dfs.datanode.max.xcievers to 1024.
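If you would rather make the same two changes from the command line, they correspond to these entries in hdfs-site.xml (values as set above):

```xml
<!-- hdfs-site.xml: the two values changed through Ambari above -->
<property>
  <name>dfs.access.time.precision</name>
  <value>3600000</value>
</property>
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>1024</value>
</property>
```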

In Custom hdfs-site.xml section, add the following property:

This should then look like:
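The screenshot showing the added property did not survive in this copy of the post. Purely as an illustration, the Hadoop NFS gateway documentation of this period asks for a dump-directory property; whether this is the exact property the post intended is an assumption:

```xml
<!-- Assumption: dfs.nfs3.dump.dir is taken from the Hadoop NFS
     gateway docs of this era, not from the missing screenshot -->
<property>
  <name>dfs.nfs3.dump.dir</name>
  <value>/tmp/.hdfs-nfs</value>
</property>
```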

Then click the Save button.

Start the HDFS services, and then the MapReduce services

You need to stop the native Linux NFS services (nfs and rpcbind/portmap) and then start the Hadoop-enabled versions:

service nfs stop
service rpcbind stop

hadoop portmap
hadoop nfs3

To have these started each time you restart your sandbox, you can add a few lines to your rc.local startup script:

hadoop portmap &
hadoop nfs3 &

This will place logs for each service in /var/log/hadoop.

Verify the NFS server is up and running on the sandbox with the rpcinfo command. You may also run the showmount command, both on the sandbox and on the client machine. You should see output similar to the output below, stating that "/" is available to everyone.
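A minimal check, assuming the default export (replace SANDBOX_IP with your sandbox's address when running from the client):

```shell
# On the sandbox: confirm the portmap/nfs services are registered
rpcinfo -p localhost

# On the sandbox or the client: the export list should show
# "/" available to everyone
showmount -e SANDBOX_IP
```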

Create a user on your client machine that matches a user in the Sandbox HDP VM.

For example, hdfs is a user on the Sandbox VM. The UID for hdfs is 497.

On my client machine, which happens to be a Mac OS X machine, I’ll create a user hdfs with the same UID with the following commands:

sudo -i
mkdir /Users/hdfs
dscl . create /Users/hdfs
dscl . create /Users/hdfs RealName "hdfs"
dscl . create /Users/hdfs hint "Password Hint"
dscl . passwd /Users/hdfs hdfs
dscl . create /Users/hdfs UniqueID 497
dscl . create /Users/hdfs PrimaryGroupID 201
dscl . create /Users/hdfs UserShell /bin/bash
dscl . create /Users/hdfs NFSHomeDirectory /Users/hdfs
chown -R hdfs:guest /Users/hdfs

If you are on another operating system, create a user hdfs with the UID 497 to match the user on the sandbox VM. On Linux this is easily accomplished using the -u option to the adduser command. On Windows you will likely want to use a third-party NFS client; on Server and premium editions of Windows, you can instead add the Subsystem for UNIX-based Applications.
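On a Linux client, for example, the matching account can be created in one line (UID 497 as noted above; requires root):

```shell
# Create an hdfs user whose UID matches the sandbox's hdfs user
adduser -u 497 hdfs
```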

Mount HDFS as a file system on local client machine

mount -t nfs -o vers=3,proto=tcp,nolock HOSTIP:/  /PATH/TO/MOUNTPOINT

Now browse HDFS as if it were part of the local filesystem.

Load data off HDFS onto the local file system:
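For example, assuming the /Users/hdfs/mnt/ mount point used later in this post (the file name is hypothetical):

```shell
# Copy a file out of HDFS, through the NFS mount, to the local disk
cp /Users/hdfs/mnt/user/hdfs/sample.txt ~/Desktop/
```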

Delete data in HDFS:
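Through the mount this is an ordinary delete (file name hypothetical):

```shell
# Remove a file from HDFS via the NFS mount
rm /Users/hdfs/mnt/user/hdfs/sample.txt
```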

Load data into HDFS. Take a file from the local disk and load it into the hdfs user's directory on the HDFS file system. On this local machine, HDFS is mounted at /Users/hdfs/mnt/.
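A sketch of that copy (the source file name is hypothetical):

```shell
# Push a local file into the hdfs user's directory on HDFS
cp ~/Documents/data.csv /Users/hdfs/mnt/user/hdfs/
```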

Additionally, you can verify your files are in HDFS via the file browser in the Hue interface provided with the sandbox, or, back on the command line, switch to the hdfs user (su - hdfs) and use the standard hadoop command-line tools to query for your files.
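The command-line check looks like this on the sandbox:

```shell
# Switch to the hdfs user, then list the directory with the
# standard Hadoop shell
su - hdfs
hadoop fs -ls /user/hdfs
```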


Using this interface lets users of a Hadoop cluster rapidly push data to HDFS in a way that is familiar from their desktops. It also opens up the possibility of scripting data pushes into Hadoop from any networked machine, including upstream preprocessing of data from other systems.
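As a sketch of that scripting idea (both directory paths are assumptions, not from the post):

```shell
#!/bin/sh
# Push every CSV in a local staging directory into HDFS through
# the NFS mount; paths are hypothetical -- adjust to your setup.
SRC_DIR=/data/staging
HDFS_MOUNT=/Users/hdfs/mnt/user/hdfs

for f in "$SRC_DIR"/*.csv; do
    [ -e "$f" ] || continue      # no CSVs present
    cp "$f" "$HDFS_MOUNT"/ && echo "pushed: $f"
done
```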



Brandon Li says:

On my Mac OS X host, VirtualBox by default used the NAT network mode for the guest, and thus the NFS export was not visible to the host or to other machines.

After I changed the guest network mode to Bridged Adapter, I could access the NFS service from outside.

Preetham says:

How do I enable UNIX_AUTH, and what configuration changes need to be made? Can you please explain?

Tyler Mitchell says:

Good tutorial, thanks. Some changes may be needed to bring it up to date, but is much of this setup even needed now? All the changes it mentions were already in place by default in the sandbox. Thanks! 🙂

1. Looks like the “stop” buttons have moved to the “Service Actions” drop-down.
2. dfs.access.time.precision to 3600000 – was already set. Name of property is different now too? dfs.accesstime.precision?
3. dfs.datanode.max.xcievers setting is not under “advanced” – rather in “Custom hdfs-site.xml” section

Biswaranjan says:

If we have some files in the local mount point and the server is rebooted before those files are processed, what happens to them? Will they be deleted, or will they still exist in the local mount point?
Please suggest.
