
Hortonworks Data Lifecycle Manager

Protect Your Enterprise Data On-Premises & In-Cloud Through Hadoop Replication


Data Lifecycle Manager (DLM) is a DataPlane application that uses replication to protect not only your data but also the security policies associated with it. It lets system administrators replicate HDFS and Hive data from an on-premises cluster to cloud storage, including replication of a Hive database from a cluster with underlying HDFS to another cluster with cloud storage. DLM protects data at rest (TDE) and data in motion (TLS), and supports multiple key management services (KMS) and encryption keys.



Safeguard critical data assets

DLM replicates HDFS and Hive data from an on-premises cluster to cloud storage, and provides a web UI that administrators can use to create and manage replication and disaster recovery policies and jobs. It avoids unnecessarily copying renamed files and directories, and protects data against accidental or intentional modification, to meet governance and disaster recovery (DR) requirements. DLM enables system administrators to:

  • Incrementally replicate Hive data and metadata
  • Replicate data between HDP clusters using HDFS snapshots
  • Provide support for data-at-rest (TDE) and data-in-motion (TLS) encryption
  • Prevent unauthorized access to data and support segregation of duties
  • Configure the destination cluster to serve as the new source, if the source cluster becomes unavailable
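
The point about not re-copying renamed files can be illustrated with a small sketch. This is not DLM's implementation. It assumes a hypothetical model in which each snapshot maps a stable file ID to a (path, content version) pair, and it compares two snapshots so that a renamed file becomes a rename on the destination instead of a fresh copy. The name `plan_incremental_sync` and the operation tuples are illustrative only.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot model: stable file ID -> (path, content version).
Snapshot = Dict[int, Tuple[str, int]]


def plan_incremental_sync(old: Snapshot, new: Snapshot) -> List[Tuple[str, ...]]:
    """Return the operations that bring a replica of `old` up to `new`."""
    ops: List[Tuple[str, ...]] = []
    # Files present only in the old snapshot are deleted on the destination.
    for file_id in sorted(old.keys() - new.keys()):
        ops.append(("delete", old[file_id][0]))
    for file_id in sorted(new.keys()):
        new_path, new_version = new[file_id]
        if file_id not in old:
            # A file ID never seen before has to be copied in full.
            ops.append(("copy", new_path))
            continue
        old_path, old_version = old[file_id]
        if old_path != new_path:
            # Same file under a new path: rename on the destination, no copy.
            ops.append(("rename", old_path, new_path))
        if old_version != new_version:
            # Content changed, so the file is copied again.
            ops.append(("copy", new_path))
    return ops


old_snapshot = {1: ("/data/a.csv", 1), 2: ("/data/b.csv", 1), 3: ("/data/c.csv", 1)}
new_snapshot = {1: ("/archive/a.csv", 1), 2: ("/data/b.csv", 2), 4: ("/data/d.csv", 1)}
plan = plan_incremental_sync(old_snapshot, new_snapshot)
```

In this example only `/data/b.csv` and `/data/d.csv` are copied; the file moved to `/archive/a.csv` is handled as a rename.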

Webinar: Global Data Management In A Multi-Cloud Hybrid World

Replicate security policies associated with data

Replicate not only data but also the metadata and security policies that have been associated with data. DLM enhances the productivity of system administrators by:

  • Exporting Apache Ranger policies for the HDFS directory from the source Ranger service and replicating them to the destination Ranger service
  • Replicating associated file metadata, table structures or schemas
  • Providing active/standby behavior or DR site using Ranger policies

Blog: Painless Disaster Recovery using Hortonworks Data Lifecycle Manager

Implement hybrid data replication

DLM supports replication of HDFS and Hive data between underlying HDFS and AWS S3 cloud storage. DLM provides administrators with:

  • Bi-directional data replication between cloud and on-premises environments
  • Flexibility to designate either cluster in a pair to serve as the source or as the destination in a replication policy
  • Native cloud storage replication to S3 buckets
  • Seamless integration between the AWS cloud and DLM for data and security policy replication

Blog: Data Replication in Hadoop

Get visibility into cluster status and automate replication tasks for enhanced productivity

Quickly identify any issues or verify the health of the clusters, policies, or jobs in DLM. View the total number of clusters enabled for DLM, the number for which all or some of the services are running, and the number of clusters for which remaining disk capacity is less than 10%. DLM provides system administrators with the flexibility to:

  • Create policies based on business rules
  • Replicate data based on data sets, day and time, the frequency of job execution and bandwidth restrictions
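
The cluster counts described above (total clusters, clusters with all or some services running, and clusters with less than 10% disk capacity remaining) can be sketched as a simple aggregation. This is only an illustration under stated assumptions: the `ClusterStatus` record, its fields, and the `summarize` function are hypothetical and do not reflect DLM's data model or API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ClusterStatus:
    # All fields are hypothetical; they are not DLM's data model.
    name: str
    services_running: int
    services_total: int
    disk_capacity_bytes: int
    disk_remaining_bytes: int


def summarize(clusters: List[ClusterStatus], low_disk_ratio: float = 0.10) -> Dict[str, int]:
    """Count clusters by service health and by low remaining disk capacity."""
    summary = {"total": len(clusters), "all_running": 0, "some_running": 0, "low_disk": 0}
    for c in clusters:
        if c.services_total > 0 and c.services_running == c.services_total:
            summary["all_running"] += 1
        elif c.services_running > 0:
            summary["some_running"] += 1
        # Flag clusters whose remaining capacity is below the threshold (10% by default).
        if c.disk_capacity_bytes > 0 and c.disk_remaining_bytes / c.disk_capacity_bytes < low_disk_ratio:
            summary["low_disk"] += 1
    return summary


clusters = [
    ClusterStatus("onprem-a", 3, 3, 1000, 500),
    ClusterStatus("onprem-b", 1, 3, 1000, 50),
    ClusterStatus("cloud-c", 0, 3, 1000, 99),
]
summary = summarize(clusters)
```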

Blog: A Step-by-Step Guide for HDFS Replication