October 28, 2013

How To Migrate Your Hadoop Cluster to Hortonworks Data Platform 2.0

Now that Hortonworks Data Platform 2.0 is GA, you may be looking to migrate your Hadoop stack from another version to take advantage of Hadoop 2’s YARN-based architecture. Fortunately, our Professional Services & Support teams are getting a lot of practice at migration from other distributions as more and more customers turn to 100% enterprise-hardened Apache Hadoop for their big data platform.

While any specific migration may have a few gotchas from a vendor lock-in or business integration perspective, this high-level process overview has been battle-tested on large-scale production clusters, and we hope it helps you plan your own migration.

Essential Migration Path

Depending on your existing distribution and your goals, these are some obvious candidates for migration.


Essential Migration Steps

A Hadoop distribution has multiple Apache components, and possibly some vendor-specific components. This graphic shows best practice for the order in which to migrate the various components. The Hortonworks services team has automated some of the migration steps to simplify the process.


Risks and Mitigations

There are risks associated with any data center migration. Here are two key risks and their essential mitigations.

RISK: Data Loss

HDFS is very stable and reliable, and we have not seen any data loss in actual migrations. Proactive measures such as backing up the configuration and the fsimage keep the risk to a minimum. To guard against data loss, use safemode for checks and balances at each step of the upgrade.
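As a rough illustration of the backup measures above, here is a minimal Python sketch that lists the commands one might run before starting an upgrade. It is not the Hortonworks services team's automation. The function names and all paths are hypothetical, and it assumes the `hadoop dfsadmin` safemode and saveNamespace subcommands are available on the source cluster (on Hadoop 2 the same subcommands are invoked as `hdfs dfsadmin`). By default it only prints the plan.

```python
import subprocess
from pathlib import PurePosixPath
from typing import List


def build_backup_plan(conf_dir: str, name_dir: str, backup_root: str) -> List[List[str]]:
    """Return the pre-upgrade backup commands, in the order they should run.

    conf_dir, name_dir and backup_root are illustrative paths supplied by the
    caller; nothing here is read from the cluster configuration.
    """
    backup = PurePosixPath(backup_root)
    return [
        # Enter safemode so the namespace cannot change during the backup.
        ["hadoop", "dfsadmin", "-safemode", "enter"],
        # Write the in-memory namespace out to a fresh fsimage on disk.
        ["hadoop", "dfsadmin", "-saveNamespace"],
        # Copy the configuration directory and the NameNode metadata directory.
        ["cp", "-a", conf_dir, str(backup / "conf")],
        ["cp", "-a", name_dir, str(backup / "namenode")],
        # Check the safemode state again before moving on to the next step.
        ["hadoop", "dfsadmin", "-safemode", "get"],
    ]


def run_plan(plan: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in plan:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical paths; substitute the values from your own cluster.
    plan = build_backup_plan("/etc/hadoop/conf", "/hadoop/hdfs/namenode", "/backup/pre-hdp2")
    run_plan(plan, dry_run=True)
```

Keeping the plan as plain data makes it easy to review the sequence on a test cluster before anything is executed against production.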

RISK: Application Regression Issues

Early testing in development and test environments can help identify and implement config and code changes, with a final series of tests prior to production migration. Also refer to this guide on running existing applications on Hadoop 2 YARN.

Here to Help

Hadoop migrations benefit from practice, and we're getting good at them as more and more customers turn to 100% enterprise-hardened Apache Hadoop for their big data platform. In fact, this essential guidance has already been used with many customers to migrate thousands of nodes this year. We anticipate that number will increase significantly with the availability of the YARN-based Hortonworks Data Platform 2.0.

We hope this brief guide helps you plan your own migration. For specific migration questions, feel free to contact Hortonworks, or visit our website to find out more about Hortonworks Data Platform 2.0.


  • Hi Team,
    We have an Apache Hadoop 1.x cluster. Can you please help us upgrade this cluster to HDP 2 with Ambari? If you can point me to docs or blog posts describing the steps, it would be greatly appreciated.
