Data integrity and availability are important for Apache Hadoop, especially for enterprises that use Apache Hadoop to store critical data. This blog will focus on a few important questions about Apache Hadoop’s track record for data integrity and availability and provide a glimpse into what is coming in terms of automatic failover for HDFS NameNode.
In 2009, we examined HDFS’s data integrity at Yahoo! and found that HDFS lost 650 blocks out of 329 million blocks on 10 clusters with 20,000 nodes running Apache Hadoop 0.20.3. Of the 650 lost blocks:
The customer who lost those 19 blocks would not find HDFS’s 99.99999% data reliability very comforting. Hence we take losing 19 blocks out of 329 million blocks very seriously.
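To make the scale of these numbers concrete, here is a minimal sketch of the arithmetic, using only the figures quoted above (650 and 19 lost blocks out of 329 million). The function and constant names are hypothetical, and the sketch assumes the 99.99999% figure refers to the 19-block subset.

```python
# Figures quoted in this post for the 2009 Yahoo! study.
TOTAL_BLOCKS = 329_000_000
LOST_BLOCKS_ALL = 650
LOST_BLOCKS_SUBSET = 19  # the 19 blocks singled out in the text (assumed basis of 99.99999%)


def loss_fraction(lost: int, total: int) -> float:
    """Fraction of all blocks that were lost."""
    return lost / total


def reliability_percent(lost: int, total: int) -> float:
    """Percentage of blocks that were not lost."""
    return 100.0 * (1.0 - loss_fraction(lost, total))


if __name__ == "__main__":
    print(f"All losses: 1 block in {TOTAL_BLOCKS / LOST_BLOCKS_ALL:,.0f}")
    print(f"19-block subset: 1 block in {TOTAL_BLOCKS / LOST_BLOCKS_SUBSET:,.0f}")
    print(f"Reliability (19 blocks): {reliability_percent(LOST_BLOCKS_SUBSET, TOTAL_BLOCKS):.7f}%")
```

On these numbers, the 19 blocks correspond to roughly one block in 17 million, which is where a reliability figure above 99.99999% comes from.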
As described above, HDFS tolerates failures of storage servers (called DataNodes) and their disks. The NameNode stores its metadata on multiple disks, typically including a non-local file server, so when the HDFS NameNode is restarted it recovers its metadata. The HDFS file system is, however, temporarily unavailable whenever the HDFS NameNode is down.
We examined failures of the HDFS NameNode over the last 18 months and found 22 failures across 25 clusters, which equates to 0.58 failures per cluster per year. The Mean-Time-Between-Failures (MTBFs) are:
Of these, only 8 would have benefited from automatic failover of the NameNode. These 8 included both hardware failures (some due to memory errors) and software failures. Because the failure rate is low, the MTBFs improve only slightly with a 5-minute automatic failover:
Failures that would not have benefited from automatic failover included cluster power failures, failures of non-redundant switches, configuration errors, etc.
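The failure rate quoted above can be reproduced with a few lines of arithmetic. This is only a sketch under the assumption that "18 months" means exactly 1.5 years of observation on each of the 25 clusters; the names are hypothetical.

```python
# Figures quoted in this post for NameNode failures.
NAMENODE_FAILURES = 22        # all NameNode failures observed
FAILOVER_WOULD_HELP = 8       # subset that automatic failover would have addressed
CLUSTERS = 25
OBSERVATION_YEARS = 18 / 12   # assumption: 18 months of observation per cluster


def failures_per_cluster_year(failures: int, clusters: int, years: float) -> float:
    """Average number of failures one cluster sees in one year."""
    return failures / (clusters * years)


if __name__ == "__main__":
    overall = failures_per_cluster_year(NAMENODE_FAILURES, CLUSTERS, OBSERVATION_YEARS)
    helped = failures_per_cluster_year(FAILOVER_WOULD_HELP, CLUSTERS, OBSERVATION_YEARS)
    print(f"All failures: {overall:.3f} per cluster per year")
    print(f"Failures failover would address: {helped:.3f} per cluster per year")
```

The overall rate works out to about 0.587, which the post reports as 0.58. Only 8 of the 22 failures (roughly a third) fall into the category that failover would address, which is consistent with the modest MTBF improvement described above.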
More details are available in Robert Chansler’s Hadoop Summit 2011 presentation (which will soon be available as a separate blog).
Automatic failover for the HDFS NameNode is in active development. Suresh Srinivas (also from Hortonworks) and I have published a detailed design in JIRA HDFS-1623. The design has been well received by other active HDFS developers, and the feature is being built by engineers at Hortonworks, Cloudera, eBay and Facebook. Several sub-task JIRAs have been filed and are being addressed by developers from these organizations.
The design was recently presented at a Hadoop User Group meeting on July 27th, 2011 (HA NameNode). Be sure to stay tuned to the Hortonworks blog for more details.
Much of the data reported in this blog was collected at Yahoo! and reported by Robert Chansler at Hadoop Summit 2011.
— Sanjay Radia