
HDP on Linux – Installation Forum

HDFS space seems off

  • #33332
    Jeff Ferrell

    So I installed HDP 1.3.2 on a 3-node cluster, setting up the secondary NameNode and the third node as DataNodes. The secondary NameNode has three 2 TB drives on hardware RAID, so its total storage comes to about 4 TB; the third node has two 2 TB drives in a JBOD configuration. I mounted the sdb1 drive on the third node and added the mount location to the DataNode directories in the web UI. Before I added the sdb1 drive on the third node, it was reporting 5.x TB, which seemed correct; then when I added sdb1 it jumped to 10.7 TB, and the secondary is reporting 7+ TB, which is strange since it only has 4 TB of actual space. See the details below for more info. Any ideas? One thing I did notice that seemed weird: the secondary NameNode also had the mounted drive location “/mnt/disk1” that I mounted on the third node. Is that normal?

    [root@node2 disk1]# sudo -u hdfs hadoop dfsadmin -report
    Configured Capacity: 11774969516032 (10.71 TB)
    Present Capacity: 11158814830592 (10.15 TB)
    DFS Remaining: 11158459998208 (10.15 TB)
    DFS Used: 354832384 (338.39 MB)
    DFS Used%: 0%
    Under replicated blocks: 220
    Blocks with corrupt replicas: 0
    Missing blocks: 0

    Datanodes available: 2 (2 total, 0 dead)

    Decommission Status : Normal
    Configured Capacity: 7849935175680 (7.14 TB)
    DFS Used: 177410048 (169.19 MB)
    Non DFS Used: 413734744064 (385.32 GB)
    DFS Remaining: 7436023021568 (6.76 TB)
    DFS Used%: 0%
    DFS Remaining%: 94.73%
    Last contact: Mon Aug 26 21:11:14 EDT 2013

    Decommission Status : Normal
    Configured Capacity: 3925034340352 (3.57 TB)
    DFS Used: 177422336 (169.2 MB)
    Non DFS Used: 202419941376 (188.52 GB)
    DFS Remaining: 3722436976640 (3.39 TB)
    DFS Used%: 0%
    DFS Remaining%: 94.84%
    Last contact: Mon Aug 26 21:11:14 EDT 2013
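
    One pattern worth noting in this report: 7.14 TB is exactly twice 3.57 TB. HDFS computes each DataNode's Configured Capacity by summing the size of the filesystem behind every configured data directory, so two directories on the same filesystem count that volume twice. A quick way to check, using the paths from this thread, is to compare the device behind each directory on each DataNode:

    # If both directories report the same Filesystem/device, that
    # volume is being counted twice in Configured Capacity.
    df -h /hadoop/hdfs/data /mnt/disk1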

  • #33380

    Hi Jeff,

    This is not normal; it seems the capacity calculation is picking up that mount on the node. Please adjust the following configuration in hdfs-site.xml so that there is one partition per directory.
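
    As a sketch of that layout (assuming Hadoop 1.x property names, since HDP 1.3.2 ships Hadoop 1.x, and the paths mentioned in this thread), each node's hdfs-site.xml would list one data directory per physical partition:

    <!-- node3: one entry per partition, comma-separated, no spaces -->
    <property>
      <name>dfs.data.dir</name>
      <value>/hadoop/hdfs/data,/mnt/disk1</value>
    </property>

    <!-- node2: only the directory on its local RAID volume -->
    <property>
      <name>dfs.data.dir</name>
      <value>/hadoop/hdfs/data</value>
    </property>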


    Jeff Ferrell

    On my secondary NameNode I opened the hdfs-site.xml config file and removed /mnt/disk1, but when I restart HDFS it adds it right back in. The XML looks like the below. How do I get the config files to differ across nodes with the Ambari setup? For node2 I just want /hadoop/hdfs/data, and for node3, /hadoop/hdfs/data,/mnt/disk1.

    Current config below
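
    From the description, the shared entry that Ambari pushes to every DataNode presumably looks like the sketch below, i.e. both directories on both nodes:

    <property>
      <name>dfs.data.dir</name>
      <value>/hadoop/hdfs/data,/mnt/disk1</value>
    </property>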

    Jeff Ferrell

    Just as an update: I deleted the /mnt/disk1 directory on node2 and then restarted with /mnt/disk1 still in the web UI config, and I watched node2 recreate the /mnt/disk1 directory. I'm not sure what's going on.
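
    That would tie the two symptoms together: if /mnt/disk1 on node2 is recreated as a plain directory with no filesystem mounted on it, the DataNode uses it anyway and counts node2's underlying volume a second time. Whether it is a real mount point can be checked with standard tools on each node:

    # Is /mnt/disk1 backed by its own filesystem, or just a directory?
    mount | grep /mnt/disk1 || echo "/mnt/disk1 is not a mount point"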


    Hi Jeff,
    At the moment, Ambari does not support different config properties per node; this is a feature the Ambari team is looking into. As for your recent post, I was able to reproduce the issue on my single-node cluster: I stopped the HDFS services, moved my DataNode directory, then restarted the HDFS service via Ambari, and the folder was recreated. I have notified the product team of this behavior.


    Jeff Ferrell

    Just so you guys know, I resolved the issue by manually starting HDFS without Ambari. See my steps below.

    1. Stop all services in Ambari.
    2. Edit each DataNode's config file the way you want.
    3. Run the commands below on each node:

    name node:
    su hdfs
    /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start namenode

    secondary name node:
    su hdfs
    /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start secondarynamenode
    /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode

    third node:
    su hdfs
    /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode

    4. Start all other services back up in Ambari.
    5. View the HDFS space; it should now report the proper disk space (see the check below).
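
    To verify step 5, the same report used at the top of this thread can be rerun; each node's Configured Capacity should now match its physical disks:

    sudo -u hdfs hadoop dfsadmin -report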

    Keep in mind that if you stop HDFS in Ambari and start it again, you will lose your custom config. Hopefully they fix this in future releases; it's really annoying.
