As we have said here, Hortonworks has been steadily increasing its investment in HBase, and HBase adoption in the enterprise has been growing as well. To continue this trend, we feel HBase needs investment in the areas of:
Reliability and High Availability (all data always available, and recovery from failures is quick)
Snapshots and backups (be able to take periodic snapshots of certain/all tables and be able to restore them at a later point if required)
Monitoring and Diagnostics (which regionserver is hot or what caused an outage)
Significant work has happened in each of the areas outlined above in the 0.94 and 0.96 (currently trunk) branches. For example, the MTTR (mean time to recovery) work happening in HBASE-5843 will significantly improve data availability. HBASE-5305 addresses wire compatibility, and HBASE-6055 tracks the work underway on snapshots. We believe that by solving these problems, HBase will gain much wider adoption in the enterprise and will be considered a very viable option for the use cases it supports.
Doing the above would open HBase up to many more enterprise users, and going forward we envisage the need for:
Better and improved clients (asynchronous clients, and, in multiple languages)
Cell-level security (access control for every cell in a table)
Multi-tenancy (HBase becomes a viable shared platform for multiple applications using it)
Secondary indexing functionality
The above are some of the areas that Hortonworks is investing in as well. Stay tuned for further updates on these topics.
HBase is a critical component of the Apache Hadoop ecosystem and a core component of the Hortonworks Data Platform. HBase enables a host of low-latency Hadoop use cases: as a publishing platform, HBase exposes data refined in Hadoop to outside systems; as an online column store, HBase blends random-access reads and writes from application workloads with data that remains directly accessible to Hadoop MapReduce.
The HBase community is moving forward aggressively, improving HBase in many ways. We are in the process of integrating HBase 0.94 into our upcoming HDP 1.1 refresh. This “minor upgrade” includes nearly 200 bug fixes and quite a few performance improvements, and is wire-compatible with HBase 0.92 (in HDP 1.0). Here are some notable changes:
HBASE-4128 – Data Block Encoding of KeyValues (aka delta encoding / prefix compression) [PERFORMANCE]
HBASE-4465 – Lazy-seek optimization for StoreFile scanners [PERFORMANCE]
HBASE-5074 – support checksums in HBase block cache [PERFORMANCE]
HBASE-5128 – [uber hbck] Online automated repair of table integrity and region consistency problems [OPERABILITY]
HBASE-3584 – Allow atomic put/delete in one call [FEATURE]
HBASE-5229 – Provide basic building blocks for “multi-row” local transactions [FEATURE]
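To give a feel for what data block encoding (HBASE-4128) buys, here is a toy Python sketch of prefix (delta) encoding over sorted keys. This is a deliberate simplification of what HBase does inside its data blocks, not the HBase implementation or API; the function names are illustrative. Because adjacent keys in a sorted block usually share a long common prefix (row, column family), storing only the shared-prefix length plus the differing suffix can shrink blocks dramatically.

```python
def prefix_encode(sorted_keys):
    """Encode each key as (shared_prefix_len, suffix) relative to the previous key."""
    encoded = []
    prev = b""
    for key in sorted_keys:
        # Find the length of the prefix shared with the previous key.
        common = 0
        limit = min(len(prev), len(key))
        while common < limit and prev[common] == key[common]:
            common += 1
        encoded.append((common, key[common:]))
        prev = key
    return encoded

def prefix_decode(encoded):
    """Rebuild the original keys from (shared_prefix_len, suffix) pairs."""
    keys = []
    prev = b""
    for common, suffix in encoded:
        key = prev[:common] + suffix
        keys.append(key)
        prev = key
    return keys

# Adjacent HBase-style keys share long prefixes, so suffixes stay short.
keys = [b"row0001:cf:qual", b"row0001:cf:qual2", b"row0002:cf:qual"]
enc = prefix_encode(keys)
assert prefix_decode(enc) == keys
assert enc[1] == (15, b"2")  # only one new byte stored for the second key
```

The real encoder in HBASE-4128 works on full KeyValues (row, family, qualifier, timestamp, type) and offers several encoding variants, but the space-saving intuition is the same.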
And 0.94 is only the start. Expect to see a huge set of additional features, bug fixes, and performance and operational improvements to HBase in the coming months. As more of our customers have deployed HBase, it has become an increasingly important component of HDP 1. As a result, we’ve really been ramping up our investment in HBase this year, with a focus on enhancing HBase stability and operability. What follows is a summary of Hortonworkers’ recent HBase contributions.
1. Reliability improvements
We have established an automated test harness that exercises HBase on a nightly basis. The harness automatically deploys HBase with a ‘production-like’ configuration; after the cluster is set up, a few heavy-duty jobs are run against it. This has uncovered numerous bugs in the 0.92.x line.
Some of them are:
HBASE-5986: Clients can see holes in the META table when regions are being split
HBASE-6160: META entries from daughters can be deleted before parent entries
HBASE-6679: RegionServer aborts due to race between compaction and split
HBASE-6060: Regions’s in OPENING state from failed regionservers takes a long time to recover
HBASE-6758: The replication-executor should make sure the file that it is replicating is closed before declaring success on that file
2. Test Infrastructure Improvements
One of the biggest needs in the community is a good testing framework for HBase. As HBase is becoming more popular as a NoSQL data store, we need to make sure that the system is highly available and reliable in the face of common node failures, and that it is able to withstand the intense, high stress workloads users expect in production environments.
Towards this end, we have been building an automated test framework inspired by Netflix’s ChaosMonkey tool. It can run a series of tests while killing and restarting HBase servers, and validate that the test results are correct. This brings to the fore the availability and reliability aspects of the system. For example, if a RegionServer is killed, another RegionServer or set of RegionServers should pick up the regions that the killed RegionServer was serving.
Using the APIs provided by this testing framework, one can convert many of the tests in the HBase codebase to run in either unit test mode or in this new challenging “real cluster mode”. The test framework is part of the HBase codebase (via HBASE-6241), and many candidate tests have been identified that can be ported to use the new framework.
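The core pattern such a ChaosMonkey-style harness follows can be sketched in a few lines of Python. This is a conceptual sketch only: `FakeCluster` is a hypothetical stand-in for whatever interface actually stops and starts RegionServers (the real framework in HBASE-6241 drives real cluster processes), and the workload/validation step is elided to a comment.

```python
import random

class FakeCluster:
    """Hypothetical stand-in for a real cluster controller. A real harness
    would SSH to nodes or call cluster-manager APIs to kill/restart daemons."""
    def __init__(self, servers):
        self.all = set(servers)
        self.up = set(servers)

    def kill(self, server):
        self.up.discard(server)

    def restart(self, server):
        self.up.add(server)

def chaos_round(cluster, rng, min_alive=1):
    """Kill one randomly chosen live server (keeping at least min_alive up),
    then restart it. A real harness runs the test workload between these
    steps and validates that reads/writes still succeed."""
    candidates = sorted(cluster.up)
    if len(candidates) <= min_alive:
        return None  # refuse to take the cluster below the survival floor
    victim = rng.choice(candidates)
    cluster.kill(victim)
    # ... workload runs here; regions must be reassigned and served ...
    cluster.restart(victim)
    return victim

rng = random.Random(42)
cluster = FakeCluster(["rs1", "rs2", "rs3"])
for _ in range(10):
    chaos_round(cluster, rng)
assert cluster.up == cluster.all  # every killed server came back
```

The value of the pattern is that the same test body can run unchanged in a quiet unit-test mode or in this hostile “real cluster mode”; only the cluster controller behind it changes.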
3. HBase on Microsoft Windows
The Microsoft Windows port and certification of HBase is an ongoing joint development effort involving Hortonworks and Microsoft engineers. We recently reached an important milestone: all of the hbase-0.94 unit tests now pass on Windows. Work is underway to commit all the patches to the HBase mainline under the umbrella jira HBASE-6814. We are well on the way to our goal of having HBase run equally well on Windows and Unix, opening up the Apache HBase community to a whole new universe of potential users and contributors.
4. HBase with NameNode HA setup and validation
We’ve been working to validate that HBase runs well with the new Apache Hadoop 1.0 HA features. The HBase HA testing blog is here.
5. Wire-compatibility work targeted for the 0.96.x release
We have done substantial work to move all protocols in HBase including the RPC implementation to use Google’s Protocol Buffers. Most of the work is captured in this umbrella jira – HBASE-5305.
All of the above is just what we’ve been doing recently and Hortonworkers are only a small fraction of the HBase contributor base. When one factors in all the great contributions coming from across the Apache HBase community, we predict 2013 is going to be a great year for HBase. HBase is maturing fast, becoming both more operationally reliable and more feature rich.
Yes, NameNode HA is finally available in the Hadoop 1 line. The test was done with Hadoop branch-1 and HBase 0.92.x on a cluster of roughly ten nodes. The aim was to keep a really busy HBase cluster up in the face of the cluster’s NameNode repeatedly going up and down. Note that HBase remains functional while the NameNode is down; only operations that require a trip to the NameNode (for example, rolling the WAL, compactions, or flushes) are affected, and those affect only the relevant end users (a user issuing an HBase get may not be affected if that get doesn’t require a new file open, for example).
HBase was kept busy by running a load test – LoadTestTool (available in the 0.92 branch) – with a set of arguments (number of reader/writer threads, sizes of rows, etc.) selected to induce significant pressure on the HBase cluster. In addition, the configuration of HBase was artificially modified so that HBase would make lots of trips to the NameNode for file operations (low flush thresholds and a very short major compaction interval). For the test, the NameNode was repeatedly brought up and down (specifically, a loop of “bring down the NameNode, let it remain down for a small period of time, bring up the NameNode, let it remain up for another period of time”). This stop-start pattern had some randomness built into it.
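The randomized stop-start pattern can be sketched as a simple schedule generator. This is an illustrative Python sketch, not the actual test driver (which manipulated a real NameNode process); the duration ranges below are assumptions chosen for illustration, not the values used in the test.

```python
import random

def stop_start_schedule(rounds, rng, down_range=(30, 120), up_range=(300, 600)):
    """Produce a randomized (down_seconds, up_seconds) schedule of the kind the
    test followed: bring the NameNode down for a short random interval, then
    keep it up for a longer random interval. Ranges here are illustrative."""
    schedule = []
    for _ in range(rounds):
        down = rng.uniform(*down_range)
        up = rng.uniform(*up_range)
        schedule.append((down, up))
    return schedule

rng = random.Random(7)
for down, up in stop_start_schedule(5, rng):
    # A real driver would: stop the NameNode; sleep(down);
    # start the NameNode; sleep(up) - while the HBase load keeps running.
    assert 30 <= down <= 120 and 300 <= up <= 600
```

The randomness matters: fixed intervals can accidentally dodge (or always hit) the same phase of flush/compaction activity, while jittered intervals exercise many more interleavings of NameNode downtime with HBase file operations.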
The cluster kept up reasonably well under the above load and failure mode, but we also saw that we were losing HBase RegionServers somewhat randomly. Upon close analysis of the logs on the NameNode and RegionServers, it appeared that file lengths were not being recorded correctly in the edit logs. This turned out to be a known issue, addressed in HDFS-1108, and the fix was backported to the Hadoop 1.0.x line. It should be noted that the HA team at Hortonworks has fixed other issues as well; as is our usual practice, these fixes were applied to Apache Hadoop trunk, backported to the Hadoop 1.x line, and will also be backported to 2-alpha.
With the above fix in HDFS, the tests were rerun. The cluster remained up without any RegionServer losses for more than 48 hours. No glitches!
Well, to be precise, the cluster eventually started behaving strangely because the DataNodes ran out of space: the HBase load generation had successfully filled up the HDFS capacity in spite of repeated NameNode restarts. (I should file some jiras to handle that more gracefully!) While my tests did not use automated failover of the NameNode, one can now configure the NameNode in Hadoop 1 to fail over automatically using industry-proven solutions, as described in Sanjay’s post; the HBase community can start deploying NameNode HA and rely on its resilience as the NameNode fails over.
Sanjay’s blog gives more details on how to deploy NameNode HA. Please get in touch with me (email@example.com) or Sanjay (firstname.lastname@example.org) if you need more details on NameNode HA, full-stack HA with respect to HBase, or any part of the above tests.