May 23, 2012

Apache Hadoop 2.0 (Alpha) Released

As the release manager for the Apache Hadoop 2.0 release, it gives me great pleasure to share that the Apache Hadoop community has just released Apache Hadoop 2.0.0 (alpha)! While only an alpha release (read: not ready to run in production), it is still an important step forward as it represents the very first release that delivers new and important capabilities, including:

In addition to these new capabilities, the community has several enhancements planned, including HDFS Snapshots and auto-failover for the HA NameNode, along with further improvements to the stability and performance of the next generation of MapReduce (YARN). There are definitely good times ahead.

Again, please note that the Apache Hadoop community has decided to use the alpha moniker for this release since it is a preview release that is not yet ready for production deployments for the following reasons:

  • We still need to iterate over some of the APIs (especially with the switch to protobufs) before we declare them stable, i.e. something that can be supported over the long run in a compatible manner.
  • Several features, including HDFS HA and NextGen MapReduce, need a lot more testing and validation before they are ready for prime time.
  • While we are excited about the progress made for supporting HA for HDFS, auto-failover for HDFS NameNode and HA for NextGen MapReduce are still a work-in-progress.

Please visit the Apache Hadoop Releases page to download hadoop-2.0.0-alpha and visit the Documentation page for more information.

~ Arun C. Murthy (@acmurthy)


Comments

  • Congratulations on what is surely a huge amount of work.

    However, I am a computer professional who has done distributed computing for almost 20 years, and I don’t understand what’s important here.

    Don’t know “HA”, “Manual Failover” (of what), “next gen of map reduce” (to do what?), Performance (improvement — of what tasks)?

    Could you explain in simple English without Hadoop jargon what practical problems 2.0 solves? Thanks.

    Dan

  • Hi,

    I installed the stable version of Hadoop successfully, but I am confused about installing hadoop-2.0.0.

    I want to install hadoop-2.0.0-alpha on two nodes, using federation on both machines. The hostnames are “rsi-1” and “rsi-2”.

    What should the values of the properties below be to implement federation? Both machines are also used as DataNodes.

    fs.defaultFS
    dfs.federation.nameservices
    dfs.namenode.name.dir
    dfs.datanode.data.dir
    yarn.nodemanager.localizer.address
    yarn.resourcemanager.resource-tracker.address
    yarn.resourcemanager.scheduler.address
    yarn.resourcemanager.address

    One more point: in the stable version of Hadoop, the configuration files are under the conf folder in the installation directory.

    But in the 2.0.0-alpha version, there is an etc/hadoop directory, and it doesn't have mapred-site.xml or hadoop-env.sh. Do I need to copy the conf folder from the share folder into the hadoop-home directory? Or do I need to copy these files from the share folder into the etc/hadoop directory?

    Regards,
    Rashmi
