The Hortonworks HBase team is excited to see HBase 0.96 released. It represents a broad community effort and a massive amount of work that has been building for more than a year.
HBase 0.96 closes out over 2000 issues (2134 Jira tickets, to be exact) and represents the collective work of a very active community. Kudos to everyone involved! As the authors of a recent Apache blog post alluded to, the HBase community is very healthy and includes developers from many companies, including Hortonworks, Yahoo!, Cloudera, Salesforce, eBay, Intel, and Facebook, to name just a few.
Some of the notable improvements found in HBase 0.96 are listed here.
Reduced Mean-Time-To-Recovery (MTTR)
In HBase 0.96, recovery time has improved significantly (less than a minute in our tests). This effort spanned the HBase, HDFS and ZooKeeper components of the Hadoop ecosystem stack. See three blog posts for more details:
Snapshots
One can now take snapshots of an HBase table, clone it, copy it to a different cluster and restore it. Here’s an excellent overview from the “Table Snapshots” presentation at HBaseCon 2013.
Support for Microsoft Windows™
HBase now runs natively on Windows without requiring Cygwin.
Compaction Improvements
To make reads more efficient, HBase 0.96 performs periodic compactions. Compactions merge data files, rewriting data into new files and removing the old ones. This topic is covered in the “Compaction Improvements” presentation from HBaseCon 2013.
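The merge step described above can be sketched in a few lines of Python. This is a simplified illustration, not HBase's implementation: the `compact` function and the `(row_key, timestamp, value)` cell layout are assumptions made for the example, and a real compaction also deals with deletes, multiple versions and on-disk file formats.

```python
import heapq


def compact(store_files):
    """Merge several sorted store files into a single sorted file.

    Assumption for this sketch: each store file is a list of
    (row_key, timestamp, value) tuples, sorted by row_key, with at most
    one cell per row_key. When a row_key appears in several files, only
    the cell with the highest timestamp survives.
    """
    # Order by row_key, and within a row_key by newest timestamp first.
    merged = heapq.merge(*store_files, key=lambda cell: (cell[0], -cell[1]))
    compacted = []
    for row_key, timestamp, value in merged:
        if compacted and compacted[-1][0] == row_key:
            continue  # older version of a key that was already written
        compacted.append((row_key, timestamp, value))
    return compacted


older = [("row-a", 1, "a1"), ("row-c", 1, "c1")]
newer = [("row-a", 2, "a2"), ("row-b", 2, "b2")]
print(compact([older, newer]))
# [('row-a', 2, 'a2'), ('row-b', 2, 'b2'), ('row-c', 1, 'c1')]
```

After the merge, a read for any row touches one file instead of two, which is where the read-efficiency gain comes from.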
Integration Testing Infrastructure
We have borrowed ideas from other projects, including the infamous Chaos Monkey from Netflix, to come up with a set of large-scale tests. The tests can run independently or alongside the ChaosMonkey tool. The new infrastructure has enabled Hortonworks and others in the community to identify several classes of bugs (which have since been fixed).
Future Proofing Via Wire Compatibility
One of the goals of the HBase 0.96 release was to make it easy to do rolling upgrades and transparent upgrades for client applications. To this end, we converted all the protocols in HBase to use the semantics and structure of Google’s Protocol Buffers.
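To see why Protocol Buffers help with rolling upgrades, here is a minimal Python sketch of the tagged wire format for two wire types (varint and length-delimited). It is an illustration only: the field numbers and names (`row`, `count`) are hypothetical and are not HBase's actual messages, and real code would use classes generated by `protoc` rather than a hand-written codec.

```python
VARINT, LENGTH_DELIMITED = 0, 2  # two of the Protocol Buffers wire types


def encode_varint(n):
    """Encode a non-negative int, 7 bits per byte, low-order group first."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf, pos):
    """Return (value, next_position) for the varint starting at pos."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def encode_field(number, value):
    """Encode one field: a tag (field number + wire type), then the payload."""
    if isinstance(value, int):
        return encode_varint((number << 3) | VARINT) + encode_varint(value)
    data = value.encode("utf-8")
    tag = encode_varint((number << 3) | LENGTH_DELIMITED)
    return tag + encode_varint(len(data)) + data


def decode_known(buf, known_fields):
    """Decode fields listed in known_fields; skip any others."""
    pos, message = 0, {}
    while pos < len(buf):
        tag, pos = decode_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type == VARINT:
            value, pos = decode_varint(buf, pos)
        elif wire_type == LENGTH_DELIMITED:
            length, pos = decode_varint(buf, pos)
            value = buf[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError("wire type not covered by this sketch")
        if number in known_fields:  # unknown field numbers are ignored
            message[known_fields[number]] = value
    return message


# A newer writer adds field 3; an older reader only knows fields 1 and 2.
new_message = (encode_field(1, "row-42") + encode_field(2, 300)
               + encode_field(3, "added-in-a-later-version"))
old_reader_fields = {1: "row", 2: "count"}
print(decode_known(new_message, old_reader_fields))
# {'row': 'row-42', 'count': 300}
```

Because every field carries its own number and wire type, an older reader can step over fields it does not recognize, which is what lets clients and servers on different versions keep talking during a rolling upgrade.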
Data Type Flexibility
HBase 0.96 includes a uniform representation of data, so applications can now reliably read and write data in HBase without concern about overlap between applications. An excellent blog post on this topic is “Data Types != Schema”.
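As a rough illustration of what a shared representation buys, the Python sketch below encodes signed 64-bit integers so that the byte-wise ordering HBase uses for keys matches numeric ordering. The helper names `encode_int64` and `decode_int64` are hypothetical, and the offset-binary technique shown is one common approach, not necessarily the exact encoding that ships in HBase.

```python
import struct

SIGN_OFFSET = 1 << 63  # shifts the signed 64-bit range onto 0 .. 2**64 - 1


def encode_int64(value):
    """Encode a signed 64-bit int as 8 big-endian bytes that sort numerically."""
    # Adding the offset is equivalent to flipping the two's-complement sign
    # bit, so negative numbers sort before positive ones byte-wise.
    return struct.pack(">Q", value + SIGN_OFFSET)


def decode_int64(data):
    """Invert encode_int64."""
    return struct.unpack(">Q", data)[0] - SIGN_OFFSET


values = [5, -1, 0, -300, 2**40]
by_bytes = sorted(values, key=encode_int64)
print(by_bytes)
# [-300, -1, 0, 5, 1099511627776]
```

If every application writing to a table agrees on an encoding like this, any of them can decode and range-scan values written by the others.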
Modularization
HBase has been broken into modules (server, client, etc.) for easy consumption in downstream projects. That means your applications that depend on the HBase client jar no longer bring along all the dependencies of the server module.
Overhaul of Metrics Framework
HBase migrated to a cleaner metrics framework with more meaningful metrics. This is good news for tools that have been written to monitor HBase in real-world deployments. HBase 0.96 also integrates better with management tools such as Apache Ambari.
Many, Many Bug Fixes
So far, including all the above enhancements, more than 2000 issues have been resolved in the 0.96 codebase (including the fixes in 0.95.x development releases).