Hortonworks Data Platform 2.0 on OpenJDK

yum -y install openjdk-7

Apache Hadoop has always been very fussy about Java versions. It’s a big application running as tens of thousands of processes across thousands of machines in a single datacenter. This makes it almost inevitable that any race conditions and deadlock bugs in the code will eventually surface, be it in the JVM and its libraries, in Hadoop itself, or in one of the libraries on which it depends.

Hence the phrase “there are no corner cases in a datacenter”. It may be amusing, but it makes a point: over time, whatever bugs there are in the software stack of a datacenter will eventually surface.

Hadoop, the applications on top, and their dependency libraries are the core of what we qualify when our QA team does a release of the official Apache Hadoop binaries, as it has done on the core Hadoop projects for every production-quality release of Hadoop. It is also the core of what we test when making an HDP release, where we qualify the stack on top of those Apache releases.

Testing the JVM is an implicit part of this, which is why we always state which Java versions we have tested on and support. Usually these supported versions are behind the latest Sun/Oracle releases. For a long time Hadoop was only recommended “in production” on specific versions of Oracle Java 1.6. Indeed, HDP-1 is still only supported on these. Nowadays getting a supported Java 1.6 version is hard, as it’s hidden away in the Java Archive Download pages. When you do download the JDK, the installation process involves click-through licenses, which makes automating deployment and maintenance that much harder.

Which is why, for HDP-2, we are pleased to announce that not only is it tested and supported on the Oracle 1.7.0_21 JDK alongside the 1.6.0_31 version, we’ve also qualified it against openjdk-1.7.0_09-icedtea.

As a result, HDP-2 offers a new way to install a supported JDK:

yum -y install openjdk-7

Now you can install the OpenJDK JDK and have yum keep it up to date. That is not just for developer and proof-of-concept systems: it applies to production clusters of hundreds to thousands of nodes, which is the same scale at which we test HDP releases.
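Because yum can move the installed JDK forward over time, it may be useful for a node to report which JVM it is actually running. The following is a minimal sketch in Python, not part of HDP: the function names are hypothetical, treating the three versions named in this post as an exact allow-list is an assumption rather than a stated support policy, and any JVM whose banner does not mention OpenJDK is simply labelled "oracle".

```python
import re

# Versions this post names as tested for HDP-2. Using them as an exact
# allow-list is an assumption of this sketch, not a documented policy.
QUALIFIED = {
    ("oracle", "1.6.0_31"),
    ("oracle", "1.7.0_21"),
    ("openjdk", "1.7.0_09-icedtea"),
}

# Matches the first line of `java -version` output, e.g.
#   java version "1.7.0_09-icedtea"
VERSION_RE = re.compile(r'(?:java|openjdk) version "([^"]+)"')


def parse_java_version(banner):
    """Return (vendor, version) parsed from `java -version` text, or None."""
    match = VERSION_RE.search(banner)
    if match is None:
        return None
    # Assumption: anything that does not identify itself as OpenJDK
    # is treated as an Oracle JVM.
    vendor = "openjdk" if "OpenJDK" in banner else "oracle"
    return vendor, match.group(1)


def is_qualified(banner):
    """True if the banner matches one of the versions listed above."""
    return parse_java_version(banner) in QUALIFIED
```

Note that `java -version` writes its banner to standard error, so a caller would need to capture stderr rather than stdout before passing the text to `is_qualified`.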

Not only does this simplify deployment and other operations tasks, it also starts to pave the way for closer links between the OpenJDK team and the Hadoop developer community. The functionality and performance of the JVM are critical to Hadoop. If we can get better insight into how the open JVMs work, and if we can get the OpenJDK team to have Hadoop on their list of key applications to care about, we can become more confident that future OpenJDK releases will work even better with Hadoop.

Of course, this is all in the future. But maybe we can view that yum -y install openjdk-7 as the beginning.

