July 29, 2011

Takeaways from OSCON 2011

For the first time in its history, OSCON, the premier open-source conference, had a special OSCON Data sub-conference, and Apache Hadoop had a full track dedicated to it there. This was a clear indication of the interest in Big Data and of the central role Apache Hadoop plays in the space. A special shout-out to Bradford Stephens and Sarah Novotny, the program chairs, who did a fantastic job with OSCON Data.

Hortonworks was well represented at OSCON Data 2011. Owen O’Malley and I presented talks, and Alan Gates took a short break from his vacation to stop by.

Owen presented a very interesting talk on ‘Developing and Deploying Hadoop Security’. The presentation covered the goals of Hadoop Security and how users can apply the new features to secure their HDFS and MapReduce clusters. Owen also talked about Yahoo’s experiences deploying the back-ported Hadoop Security features on their science and production clusters, and described the several man-years of effort that the Hortonworks team (formerly at Yahoo!) put into developing this comprehensive and well-integrated security work.

I presented a talk on ‘Next Generation Apache Hadoop MapReduce’. The talk explained how the current Apache Hadoop MapReduce framework has hit a scalability limit at around 4,000 machines. We are developing the next generation of Apache Hadoop MapReduce, which factors the framework into a generic resource scheduler and a per-job, user-defined component that manages the application's execution. Since downtime is more expensive at scale, high availability is built in from the beginning, as are security and multi-tenancy to support many users on the larger clusters. The new architecture will also increase innovation, agility and hardware utilization.

I also had great fun attending various talks, such as OpenTSDB by Benoit Sigoure, which is a very interesting use of HBase as the backend for a time-series database, and Theory of Caching by Greg Luck. My personal highlight was the coming-out party for JDK7, along with more details on the plans for JDK8 from Joe Darcy.

Overall, it was a fantastic opportunity to meet folks and share ideas.

— Arun C. Murthy

