August 29, 2018

Painless Disaster Recovery using Hortonworks Data Lifecycle Manager

In the age of big data, information is power. Enterprises are analyzing tremendous amounts of data in order to achieve significant competitive advantage, increase revenue or reduce risks.

For example, the success of the Human Genome Project has demonstrated how a global community of scientists can collectively produce and use data to benefit scientific progress and pave the way for many important new discoveries. The possibilities are limitless and enterprises understand that.

Big data volumes are, by nature, enormous, which makes them difficult to back up and protect from loss. Without a Disaster Recovery (DR) plan in place, the costs of data loss can be tremendous.

The Data Lifecycle Manager (DLM), a DataPlane service, takes the pain out of the important task of backing up mission-critical data. With a single pane of glass to monitor the movement of data, metadata, and the associated security policies, DLM makes data highly available in a separate location to protect against a disaster scenario and provides a seamless end-user experience (UX) to data consumers.

DR tools have been around for a long time, but many of them provide unintuitive user experiences. The traditional thinking has been that cluster administrators do not need well-designed, optimized experiences, and as a result big data administrators have been forced to reckon with poorly designed systems. Alerting, monitoring, and debugging functionality becomes especially important when a disaster strikes, and a poor user experience makes those moments painful. Clear information flow, built on proven UX standards, helps the infrastructure administrator identify and resolve issues in a timely manner. Busy screens cause admins to miss important information hidden in a sea of less relevant detail, and once an error occurs, many tools offer no easy recovery path and no context about the issue or how to fix it.

A well-thought-out user experience is essential for disaster recovery tools. Admins have long been the silent sufferers of the issues that plague their tooling, and in this case the repercussions of missing a key alert can be devastating. Market studies have shown that products designed around the user's needs outperform unintuitive, hard-to-use alternatives by a large margin.

DLM is powered by a simple, intuitive flow that allows admins to replicate data in a three-step process:

1- Cluster registration in DataPlane

2- Pairing of clusters

3- Creation of a replication policy

This makes the process of proactively creating a DR plan quick and easy. Data can be moved to compatible on-prem or cloud clusters seamlessly.
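For illustration, the three-step flow above could be scripted against a REST-style management API. Everything below — the endpoint paths, field names, and helper functions — is a hypothetical sketch, not the documented DLM API; it only shows the shape of the requests an admin's workflow would make.

```python
import json

# NOTE: the endpoints and payload fields here are illustrative
# assumptions, not the real DLM/DataPlane REST contract.

def register_cluster(name, ambari_url):
    """Step 1: register a cluster with DataPlane (hypothetical payload)."""
    return {"endpoint": "/clusters",
            "body": {"name": name, "ambariUrl": ambari_url}}

def pair_clusters(source, target):
    """Step 2: pair two registered clusters (hypothetical payload)."""
    return {"endpoint": "/pairs",
            "body": {"source": source, "target": target}}

def create_replication_policy(name, source, target, dataset, frequency_s):
    """Step 3: create a replication policy for a dataset (hypothetical payload)."""
    return {"endpoint": "/policies",
            "body": {"name": name, "source": source, "target": target,
                     "dataset": dataset, "frequencyInSec": frequency_s}}

# The full workflow: register both clusters, pair them, then protect a dataset.
steps = [
    register_cluster("primary", "https://ambari.primary.example.com:8443"),
    register_cluster("dr-site", "https://ambari.dr.example.com:8443"),
    pair_clusters("primary", "dr-site"),
    create_replication_policy("hive-sales-dr", "primary", "dr-site",
                              "/apps/hive/warehouse/sales", 3600),
]

for step in steps:
    print(json.dumps(step))
```

In a real deployment each payload would be POSTed to the management service, with the pairing handshake validating compatibility before any policy is accepted.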

DLM provides a single-pane dashboard to monitor the various policies and jobs in progress. It warns of errors in job completion and surfaces logs for troubleshooting. The goal of DLM is to present the most relevant information at the right time so that admins can act quickly. In case of an error, DLM raises proactive alerts to make recovery easy, and the most critical issues needing admin attention are presented prominently on screen.


A handshake between compatible clusters enables the admin to make sure the right clusters are paired. It also informs the admin if there are any issues in pairing so that they can be fixed before the actual data movement starts. Once the pairing is done, the admin need not worry about individual cluster configurations and compatibility. Focus can then shift to actually selecting the necessary datasets that need protection and creating the policies to replicate them.

The policies listing page is also designed with usability and visual experience in mind. Each policy row acts as a mini dashboard unto itself, a concept we call a dashrow (dashboard row). Expanding a section of the dashrow reveals more detailed information about it, providing a self-contained experience for browsing existing policies and their associated jobs. Because all the policy information lives on a single page, the user saves multiple clicks and avoids bouncing back and forth between screens.

The vision for DLM has always been to make the Big Data Administrator’s job easier. Through an intuitive experience that provides the most relevant information in a streamlined manner to the cluster administrator, the Data Lifecycle Manager is the painless approach to proactive Disaster Recovery.
