Apache Hadoop 2.0 (Alpha) Released

As the release manager for Apache Hadoop 2.0, I am very pleased to share that the Apache Hadoop community has just released Apache Hadoop 2.0.0 (alpha)! Although it is only an alpha release (read: not ready to run in production), it is still an important step forward: it is the very first release that delivers new and important capabilities, including:

In addition to these new capabilities, several planned enhancements are on the way from the community, including HDFS Snapshots and auto-failover for the HA NameNode, along with further improvements to the stability and performance of the next generation of MapReduce (YARN). There are definitely good times ahead.

Again, please note that the Apache Hadoop community has decided to use the alpha moniker for this release since it is a preview release that is not yet ready for production deployments for the following reasons:

  • We still need to iterate over some of the APIs (especially with the switch to protobufs) before we declare them stable, i.e. something that can be supported over the long run in a compatible manner.
  • Several features, including HDFS HA and NextGen MapReduce, need a lot more testing and validation before they are ready for prime time.
  • While we are excited about the progress made for supporting HA for HDFS, auto-failover for HDFS NameNode and HA for NextGen MapReduce are still a work-in-progress.

Please visit the Apache Hadoop Releases page to download hadoop-2.0.0-alpha and visit the Documentation page for more information.

~ Arun C. Murthy (@acmurthy)



May 23, 2012 at 2:54 pm

Is 2.0 still going to be a Linux-mostly release?

Arun C. Murthy
May 23, 2012 at 4:30 pm

Just for now; expect that to change soon!

May 26, 2012 at 8:13 pm

Congratulations on what is surely a huge amount of work.

However, I am a computer professional who has done distributed computing for almost 20 years, and I don’t understand what’s important here.

Don’t know “HA”, “Manual Failover” (of what), “next gen of map reduce” (to do what?), Performance (improvement — of what tasks)?

Could you explain in simple English without Hadoop jargon what practical problems 2.0 solves? Thanks.


Weng Zhou
May 28, 2012 at 5:54 am

I am interested to know when a Visual Basic interface will become available for Hadoop.

July 18, 2012 at 9:24 pm


I installed the stable version of Hadoop successfully, but I am confused about installing hadoop-2.0.0.

I want to install hadoop-2.0.0-alpha on two nodes, using federation on both machines. The hostnames are "rsi-1" and "rsi-2".

What should the values of the properties below be to set up federation? Both machines are also used as DataNodes.


One more point: in the stable version of Hadoop, the configuration files are under the conf folder in the installation directory.

In the 2.0.0-alpha version, there is an etc/hadoop directory instead, and it doesn't contain mapred-site.xml or hadoop-env.sh. Do I need to copy the conf folder from the share folder into the hadoop-home directory, or do I need to copy these files from the share folder into the etc/hadoop directory?
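
For the federation question above, here is a minimal sketch of what a two-nameservice hdfs-site.xml could look like, generated from Python. Everything in it is an assumption rather than something confirmed in this post: the nameservice IDs `ns1` and `ns2` are made up, port 8020 is only a conventional choice, and the property names `dfs.nameservices` and `dfs.namenode.rpc-address.<id>` follow the later 2.x federation documentation. The alpha release may expect a different key for the nameservice list (for example `dfs.federation.nameservices`), so check the documentation shipped with your build.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: one NameNode (one nameservice) per host.
# "ns1"/"ns2" are illustrative IDs; "rsi-1"/"rsi-2" come from the question.
NAMESERVICES = {"ns1": "rsi-1", "ns2": "rsi-2"}
RPC_PORT = 8020  # assumed port; use whatever is free on your hosts


def federation_properties(nameservices, rpc_port=RPC_PORT):
    """Build the name -> value pairs for a federated HDFS setup.

    Property names are assumed from later 2.x docs, not verified
    against 2.0.0-alpha.
    """
    props = {"dfs.nameservices": ",".join(nameservices)}
    for ns_id, host in nameservices.items():
        props[f"dfs.namenode.rpc-address.{ns_id}"] = f"{host}:{rpc_port}"
    return props


def to_hdfs_site_xml(props):
    """Render the pairs in Hadoop's <configuration>/<property> layout."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_hdfs_site_xml(federation_properties(NAMESERVICES)))
```

The same generated file would be placed in the configuration directory on both hosts, since each DataNode needs to know about every NameNode in the federation.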

