In his post about balancing community innovation and enterprise stability, Shaun Connolly discussed the importance of an open source project forging ahead with big improvements that are expected to be buggy and functionally incomplete at first but then stabilize over time. In the case of Apache Hadoop 2.0, currently in community Alpha release, the big improvements have been underway for the past three years and include such things as:
In the case of high availability (HA), it can take many months or years to get these types of solutions rock solid. While Hadoop 2.0 contains important HA-related features such as HDFS hot standby, we want to make sure we give it time to complete its community release process and allow extra time after that for bugs to be found and fixed to harden it for broad enterprise production use.
Moreover, implementing HA for Hadoop’s NameNode service, for example, can’t be thought of in a vacuum. It’s important to take a holistic, full stack view of HA: from the underlying server, through the operating system layer, on up through the actual services that require HA, as well as the impact those services may have on any other clients or services that depend on them.
HA is inherently an enterprise “ility” that is focused on minimizing unplanned downtime and IT service disruption. It is therefore critical that full stack high availability be built on a rock-solid, proven foundation. We at Hortonworks are confident that in the Hadoop world, that stable foundation is Hadoop 1.0.
When discussing HA, we often get the following questions:
We are excited to say that we’ve been hard at work with virtualization and operating system vendors on a solution architecture for full stack high availability that
As a matter of fact, I discussed this Hadoop 1.0 HA solution architecture in my keynote at Hadoop Summit last week, and below is an illustration of the architecture that was demoed by the Hortonworks product team on the show floor:
The above diagram focuses mostly on HA as it relates to the NameNode and JobTracker services.
As we see it, key requirements for full stack high availability include:
At Hadoop Summit, we announced the jointly developed Hortonworks Data Platform High Availability (HA) Kit for VMware vSphere customers that enables full stack high availability for Hadoop 1.0 by eliminating the NameNode and JobTracker single points of failure. It is a flexible virtual machine-based high availability solution that integrates with the VMware vSphere™ platform’s HA functionality to monitor and automate failover for NameNode and JobTracker master services running within the Hortonworks Data Platform (HDP).
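To make the "monitor and automate failover" idea concrete, here is a minimal Python sketch of a generic health-check loop. It is not the HA Kit's or vSphere's actual mechanism: `ServiceMonitor`, `run_round`, the three-failure threshold, and the `failover` callback are all hypothetical names and values chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class ServiceMonitor:
    """Tracks consecutive failed health probes for one master service.

    Hypothetical illustration only; not part of HDP or vSphere.
    """
    name: str
    probe: Callable[[], bool]   # returns True when the service looks healthy
    max_failures: int = 3       # assumed threshold before failover is triggered
    failures: int = 0

    def check(self) -> bool:
        """Run one probe; return True if failover should be triggered."""
        if self.probe():
            self.failures = 0   # a healthy probe resets the counter
            return False
        self.failures += 1
        return self.failures >= self.max_failures


def run_round(monitors: Iterable[ServiceMonitor],
              failover: Callable[[str], None]) -> List[str]:
    """Probe every service once and invoke failover where the threshold is hit."""
    triggered = []
    for monitor in monitors:
        if monitor.check():
            failover(monitor.name)  # e.g. ask the platform to restart the VM
            monitor.failures = 0    # start counting afresh after failover
            triggered.append(monitor.name)
    return triggered
```

Requiring several consecutive failed probes, rather than acting on the first one, is a common way to avoid failing over on a transient hiccup. In a real deployment the probe and the restart action would be supplied by the platform rather than by hand-written callbacks.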
VMware customers can use their existing vSphere installations to deploy HA NameNode and JobTracker nodes as virtual machines in their HDP production cluster. Doing so provides the added benefits of automated restart of virtual machines in the event of server or OS failures, and smart resource management that confirms sufficient resources are available to restart virtual machines on different servers in the event of server failure. To learn more about the Hortonworks Data Platform High Availability (HA) Kit for VMware vSphere, register your interest in the HDP HA Kit for vSphere and work with our product team on trying out the new HA capabilities. Your input on this key feature area is important, so please sign up!
You should view our full stack high availability efforts with VMware as just a start. We are continuing our efforts not only to round out the VMware solution but also to introduce robust full stack HA solution architectures with other partners (stay tuned!). We are also firmly committed to continuing the Hadoop 2.0 HA work that we started and will roll that out widely when it stabilizes and is ready for broader enterprise use.
If you want to learn more about Hortonworks Data Platform, join us on June 26 (10am Pacific/1pm Eastern) for the live webinar “Apache Hadoop Just Got Simpler,” as we outline and demo the key features of the Hortonworks Data Platform (HDP).