Full stack HA in Hadoop 1: HBase’s Resilience to Namenode Failover
In this blog, I’ll cover how we tested Full Stack HA with NameNode HA in Hadooop 1 with Hadoop and HBase as components of the stack.
Yes, NameNode HA is finally available in the Hadoop 1 line. The test was done with Hadoop branch-1 and HBase-0.92.x on a cluster of roughly ten nodes. The aim was to try to keep a really busy HBase cluster up in the face of the cluster’s NameNode repeatedly going up and down. Note that, HBase would be functional during the time NameNode would be down. It’d only affect those operations that requires a trip to the NameNode (for example, rolling of the WAL, or compaction, or flush), and those would affect only the relevant end users (a user using the HBase get API may not be affected if that get didn’t require a new file open, for example).
HBase was kept busy by running a load test – LoadTestTool (available in 0.92 branch), with a set of arguments (number of reader/writer threads, sizes of rows, etc.) that were selected induced significant pressure on the HBase cluster. In turn, the configuration of HBase was artificially modified so that HBase would make lots of trips to the NameNode for file operations (low flush thresholds, very low major compactions frequency). For the test, the NameNode was repeatedly brought up and down (specifically, a loop of “bring down the namenode, let it remain down for a small period of time, bring up the namenode, let it remain up for another period of time”). This stop-start-pattern had some randomness built into it.
The cluster kept up reasonably well with the above load and the failure mode. But we also saw that we were losing HBase RegionServers somewhat randomly. Upon a close analysis of the logs on the NameNode & RegionServers, what seemed to be the case was that file lengths were not recorded correctly in the edit-logs. This issue turned out to be a known issue, that was addressed in HDFS-1108. The fix was backported to Hadoop-1.0.x line. It should be noted that HA team at Hortonworks had fixed other issues and as is the usual practice for us, these fixes were applied to Apache Hadoop trunk and back ported to Hadoop 1.x line and will also be back-pported to the 2-alpha.
With the above fix in HDFS, the tests were rerun. The cluster remained up without any RegionServer losses for more than 48 hours. No glitches!
Well to be precise, the cluster started behaving weirdly since the datanodes started running out of space since the HBase load generation has successfully filled up the HDFS capacity in spite of repeated NameNode restarts. (I should file some jiras to handle that more gracefully!). While my tests were not using the automated failover of the NameNode node one can now configure the NameNode in Hadoop 1 to automatically failover using industry proven solutions as described in Sanjay’s post; the HBase community can start deploying NameNode HA along with resilience as the Namenode fails over.
Sanjay’s blog gives more details on how to deploy NameNode HA. Please get in touch with me (firstname.lastname@example.org) or Sanjay (email@example.com) if you need more details on NameNode HA, Full Stack HA with respect to HBase or any part of the above tests.