HDFS Forum

HDFS and under-replicated blocks

  • #19163
    Francois BORIE


    I have a running cluster with 3 live datanodes and a default HDFS replication factor of 3. There are only 111 blocks in total on my datanodes, but 45 blocks are still reported as “under-replicated”, even after leaving the cluster running for several days.

    I don’t understand why, because the namenode should automatically handle this replication.

    But are these blocks really under-replicated?

    I’ve seen some threads on the web indicating that this can be a “display” bug in all Hadoop 0.20 versions (for example, this one: http://stackoverflow.com/questions/7997587/under-replicated-blocks-count-is-inaccurate-buy-why)

    Do you agree with that? Or should I always have 0 under-replicated blocks?

    Many thanks for your help,




  • #19164

    Hi François,

    How is your day so far? It is possible that this is a bug, but let us find out more about the issue first. From the command line, please run the following as the hdfs user on the namenode:
    # hadoop fsck / -locations -blocks -files | grep -i -C6 miss
    # hadoop version
    Please post the output of the commands in the forum.
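
To see what replication factor the under-replicated blocks are actually waiting for, the per-file lines of the same fsck run can be summarized. This is a minimal sketch, assuming fsck reports such blocks on lines containing "Target Replicas is N but found M replica(s)" (the exact wording may differ between Hadoop versions). The function name summarize_under_replicated is hypothetical.

```shell
# Hypothetical helper: reads `hadoop fsck` output on stdin and counts
# under-replicated blocks per (target, found) replication pair.
# Assumes lines of the form:
#   ... Under replicated blk_X. Target Replicas is N but found M replica(s).
summarize_under_replicated() {
  sed -n 's/.*Target Replicas is \([0-9][0-9]*\) but found \([0-9][0-9]*\) replica.*/\1 \2/p' |
    sort -n |
    uniq -c |
    awk '{ printf "%d block(s): target=%d found=%d\n", $1, $2, $3 }'
}

# Usage on the namenode, as the hdfs user (commented out so the sketch
# can be loaded without a cluster):
#   hadoop fsck / -files -blocks -locations | summarize_under_replicated
```

A target higher than the number of live datanodes would mean the namenode cannot place the remaining replicas, however long the cluster runs.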


    Francois BORIE

    Hi Abdelrhaman,

    Thanks for your answer.

    You will find below the output of the commands you asked for:

    -bash-4.1$ hadoop fsck / -locations -blocks -files | grep -i -C6 miss
    Over-replicated blocks: 0 (0.0 %)
    Under-replicated blocks: 45 (42.056076 %)
    Mis-replicated blocks: 0 (0.0 %)
    Default replication factor: 3
    Average block replication: 3.0
    Corrupt blocks: 0
    Missing replicas: 315 (98.130844 %)
    Number of data-nodes: 3
    Number of racks: 1
    FSCK ended at Thu Mar 28 11:09:08 CET 2013 in 1696 milliseconds

    The filesystem under path ‘/’ is HEALTHY

    -bash-4.1$ hadoop version
    Subversion -r
    Compiled by jenkins on Thu Jan 10 03:38:39 PST 2013
    From source with checksum ce0aa0de785f572347f1afee69c73861

    Many thanks,
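
The two summary counters in the fsck output above can be related with simple arithmetic. This is a back-of-the-envelope sketch, not something confirmed in the thread. It assumes every under-replicated block currently has 3 live replicas (one per datanode, in line with the reported average of 3.0), and the variable names are illustrative.

```shell
# Values copied from the fsck summary in this thread.
under_replicated_blocks=45
missing_replicas=315
live_replicas_per_block=3   # assumption: one replica on each of the 3 datanodes

# Average number of replicas each under-replicated block is still missing.
missing_per_block=$((missing_replicas / under_replicated_blocks))

# Replication target those blocks would need for the numbers to add up.
implied_target=$((live_replicas_per_block + missing_per_block))

echo "missing replicas per under-replicated block: $missing_per_block"
echo "implied target replication: $implied_target"
```

Under these assumptions each affected block is missing 7 replicas, which implies a target of 10. That is more than a 3-datanode cluster can hold, so it may be worth checking the per-file replication factor of the affected files and not only the cluster default.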



    Larry Liu

    Hi, Francois

    What is the topology of your cluster? If all 3 datanodes are in the same rack, the under-replication issue could happen. I recommend using a topology script to place the 3 datanodes logically in 2 racks.
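
A topology script of the kind suggested here might look like the sketch below. Hadoop calls the script with one or more datanode addresses as arguments and reads one rack path per argument from standard output. The IP addresses and rack names are made up for illustration. The script is assumed to be referenced from core-site.xml through the topology.script.file.name property (the Hadoop 1.x name; later releases use net.topology.script.file.name).

```shell
#!/bin/sh
# Hypothetical mapping of datanode addresses to logical racks.
# Replace the addresses and rack names with those of your own cluster.
resolve_rack() {
  case "$1" in
    192.168.1.11|192.168.1.12) echo "/rack1" ;;
    192.168.1.13)              echo "/rack2" ;;
    *)                         echo "/default-rack" ;;  # fallback for unknown hosts
  esac
}

# Print one rack path per address passed on the command line.
for host in "$@"; do
  resolve_rack "$host"
done
```

Saved as, say, topology.sh (a hypothetical file name), running `sh topology.sh 192.168.1.11 192.168.1.13` would print /rack1 and /rack2 on separate lines.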


    Francois BORIE

    Hi Larry,

    Thanks for that confirmation. You’re correct: my 3 datanodes are in the same rack (cf. the output of the hadoop fsck command I sent to Abdelrhaman).

    I think I will wait for Ambari to become rack-aware (I’ve seen it’s on your roadmap, AMBARI-645) before I start playing with those parameters.




The topic ‘HDFS and under-replicated blocks’ is closed to new replies.
