October 14, 2013

NameNode High Availability in HDP 2.0

As part of a modern data architecture, Hadoop needs to be a good citizen and trusted as part of the heart of the business. This means it must provide for all the platform services and features that are expected of an enterprise data platform.

The Hadoop Distributed File System is the rock at the core of HDP and provides reliable, scalable access to data for all analytical processing needs. With HDP 2.0, built into the platform itself, HDFS now has automated failover with a hot standby, with full stack resiliency. We’ll break this down in this blog post and outline what this means for your HDP operations and cluster design.

HDP 2.0 NameNode High Availability: Feature Overview

Automated Failover: HDP proactively detects NameNode host and process failures and automatically switches to the Standby NameNode to maintain availability of the HDFS service. No human intervention is needed in the process – System Administrators can sleep in peace!

Hot Standby: Both the Active and Standby NameNodes have up-to-date HDFS metadata, ensuring seamless failover even for large clusters – which means no downtime for your HDP cluster!

Full Stack Resiliency: The entire HDP stack (MapReduce, Hive, Pig, HBase, Oozie) has been certified to handle a NameNode failure scenario without losing data or job progress. This is vital to ensure that long-running jobs which must complete on schedule are not adversely affected by a NameNode failure.

With HDP 2.0, this entire functionality is built into the HDP product. This means:

  • No need for shared storage
  • No need for external HA frameworks
  • Certified with a secure Kerberos configuration

High Availability Architecture Deep Dive

Let’s walk through the architecture. This will outline the different components that are built into HDP to provide this functionality.

In each cluster, two separate machines are configured as NameNodes. In a working cluster, one of the NameNode machines is in the Active state and the other is in the Standby state.

The Active NameNode is responsible for all client operations in the cluster, while the Standby NameNode maintains enough state to provide a fast failover. To keep the Standby's state synchronized with the Active node, both nodes communicate through a group of separate daemons called JournalNodes. The file system journal logged by the Active NameNode to the JournalNodes is consumed by the Standby NameNode to keep its file system namespace in sync with the Active.
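To make this concrete, an HA pair and its JournalNode quorum are declared in hdfs-site.xml along the following lines. This is an illustrative sketch only: the nameservice ID (mycluster) and all hostnames (nn1-host, nn2-host, jn1 through jn3) are placeholders, not values from this post.

```xml
<!-- hdfs-site.xml sketch: one logical nameservice backed by two NameNodes -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>nn1-host:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>nn2-host:8020</value>
</property>
<!-- the JournalNode quorum both NameNodes share the edit log through -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster</value>
</property>
```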

In order to provide a fast failover, it is also necessary that the Standby node have up-to-date information of the location of blocks in your cluster. DataNodes are configured with the location of both the NameNodes and send block location information and heartbeats to both NameNode machines.


The ZooKeeper Failover Controller (ZKFC) is responsible for monitoring the health of the NameNode service and for automatic failover when the Active NameNode becomes unavailable. There are two ZKFC processes, one on each NameNode machine. ZKFC uses the ZooKeeper service for coordination, both in determining which NameNode is Active and in deciding when to fail over to the Standby NameNode.
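Automatic failover through the ZKFCs is switched on with a pair of settings, sketched below. The ZooKeeper hostnames (zk1 through zk3) are placeholders:

```xml
<!-- hdfs-site.xml: let the ZKFC processes drive failover automatically -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>

<!-- core-site.xml: the ZooKeeper ensemble the ZKFCs coordinate through -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>zk1:2181,zk2:2181,zk3:2181</value>
</property>
```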

The Quorum Journal Manager (QJM) in the NameNode writes file system journal logs to the JournalNodes. A journal log is considered successfully written only when it is written to a majority of the JournalNodes, and only one of the NameNodes can achieve this quorum write. In the event of a split-brain scenario, this ensures that the file system metadata will not be corrupted by two active NameNodes.
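The fencing effect of quorum writes can be illustrated with a small sketch. This is not the actual QJM code, just a toy model of the idea: JournalNodes reject writers with a stale epoch number, so after a failover bumps the epoch, the old Active can no longer assemble a majority.

```python
class JournalNode:
    """Toy journal node: accepts edits only from the highest epoch seen so far."""
    def __init__(self):
        self.promised_epoch = 0
        self.edits = []

    def accept(self, epoch, edit):
        if epoch < self.promised_epoch:   # stale writer: reject (fencing)
            return False
        self.promised_epoch = epoch
        self.edits.append(edit)
        return True


class Writer:
    """Toy NameNode writing its edit log through a quorum of journal nodes."""
    def __init__(self, epoch, journals):
        self.epoch = epoch
        self.journals = journals

    def write(self, edit):
        acks = sum(j.accept(self.epoch, edit) for j in self.journals)
        return acks > len(self.journals) // 2   # commit requires a majority


journals = [JournalNode() for _ in range(3)]
old_active = Writer(epoch=1, journals=journals)
new_active = Writer(epoch=2, journals=journals)  # failover bumps the epoch

assert old_active.write("edit-1")       # succeeds before failover
assert new_active.write("edit-2")       # new Active fences with epoch 2
assert not old_active.write("edit-3")   # old Active can no longer get a quorum
```

Even if the old Active is still running and believes it is in charge, every write it attempts after the failover fails the majority test, so only one writer's edits ever reach the shared journal.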

In an HA setup, HDFS clients are configured with a logical name service URI and the two NameNodes corresponding to it. The clients perform client-side failover: when a client cannot connect to a NameNode, or when that NameNode is in standby mode, the client fails over to the other NameNode.
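On the client side, the logical name service is wired to a failover proxy provider; the nameservice ID mycluster below is again a placeholder:

```xml
<!-- hdfs-site.xml: how clients resolve the logical name service to a live NameNode -->
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

Clients then address the file system as hdfs://mycluster/path rather than a specific NameNode host, and the proxy provider tries each configured NameNode until it reaches the Active one.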

Go Get It

HDP 2.0 brings advancements in reliability to the core of the Hortonworks Data Platform.

Check out our in-depth documentation on enabling a NameNode HA cluster.

Download HDP 2.0 Beta and deploy today!



  • Why is the failover controller called ZKFC? It's not trying to fail over the ZooKeeper nodes; it's trying to fail over the NameNode. Shouldn't it be called a NameNode Failover Controller, or NnFC, instead?

    Another example where the Hadoop community has shown bad taste with naming conventions (the other being the Secondary NameNode) 😉

    Also, is it possible to have more than two NameNodes in the picture, for example three NameNodes? Just to be on the safer side when the active NameNode is down, there is a failover, and there is still a risk with a single active node.
    Theoretically, since the NameNode is integrated with the Quorum Journal Manager, it should be possible to horizontally scale the availability of NameNodes, no?

    When a client cannot connect to a NameNode or if the NameNode is in standby mode, it performs fail over to the other NameNode.

    How does the client know which NameNode to connect to during failover? Since heartbeats are sent to both NameNodes, what happens during the failover? And how does the client learn that a NameNode is available again during failback, or when the service is brought up on another node? What if another node with NameNode services has to be brought up during the failover (assuming a total hardware failure on one node)? Do clients contact the ZooKeeper service or the JournalNodes?


  • Hi team,

    Please correct my understanding….

    In this model, the Active NameNode will keep taking the edits from the JournalNodes (which may be on different physical boxes) and will apply the latest edits to its own FSImage every hour or at a size of 128 MB.

    There is no separate Secondary NameNode process in an HA cluster; only the pair of NameNodes is needed, because the Standby NameNode also has its own copy of the FSImage and keeps synchronizing with the edit logs from the JournalNodes.
    How will checkpointing work in this case?

    Thanks & Regards,
