ApacheCon EU Day One Roundup – Part 1

Hackathon and Aeromuseum Reception

ApacheCon Europe kicked off yesterday with an all-day Hackathon followed by a committers’ reception at the Sinsheim Technik Museum, which has, among other large aircraft, a Concorde in Air France livery. My favorite was the diesel engine from a U-boat, with its enormous drive shaft and pistons.

Taking the Guesswork out of Hadoop Infrastructure

Winding a rented Opel through its gears along village roads for half an hour from my hotel-out-of-a-Black-Forest-fairy-tale, I made it to ApacheCon EU’s first day of sessions midway through the first talk, Steve Watt’s ‘Taking the Guesswork out of Hadoop Infrastructure.’ Steve summed up the harsh reality of fitting hardware to a given Hadoop workload with the quote: “We’ve profiled our Hadoop applications so we know what type of infrastructure we need,” said no one, ever. He covered ways to instrument your cluster and outlined practical ways to test and tune your Hadoop and HBase clusters.

He also discussed ‘System on a Chip and Hadoop,’ which brings to mind the recent debate about Hadoop-specific hardware solutions.

Discussions in the hallways centered on long-term trends and the shifting economics of cluster computing. With the PC rapidly being replaced by mobile devices and tablets, will the economies of scale for large clusters of PCs change? Will the growth of cloud computing replace the desktop PC and continue to drive economies of scale? Or will custom solutions start to make headway against commodity hardware over the next five years as the desktop and notebook PC disappear, driving up the cost of PC-based servers and making custom hardware more competitive? Will the economies of scale and power efficiency of mobile and tablet chips replace the PC processor in Hadoop clusters? Fun stuff to contemplate!



Chart from MobileRodie.

The chart suggests that PC nodes will remain competitive, but that mobile-derived hardware may get cheap enough to compete as well! Or perhaps I’m dreaming :)

Enabling Elastic, Multi-tenant, Highly Available Hadoop on Demand

Next up was Richard McDougall with ‘Enabling Elastic, Multi-tenant, Highly Available Hadoop on Demand,’ which covered the ins and outs of running Hadoop with virtualization. We’ve talked previously on the Hortonworks blog about virtualization as a part of Hadoop NameNode HA on Hadoop 1.

Virtualizing Hadoop data nodes on Amazon EC2 or VMware has carried a major performance penalty in the past, and VMware is hard at work getting that penalty down to 10% for VMware-virtualized Hadoop clusters. Project Serengeti was founded with this goal in mind.

Extending lifespan with Hadoop and R

Radek Maciaszek presented ‘Extending lifespan with Hadoop and R,’ which covered his project to identify aging-related genes using R and Hadoop at the UCL Institute of Healthy Aging.

Inside Hadoop Development

Hortonworks’ own Steve Loughran presented ‘Inside Hadoop Development.’

That’s it for now. I’ll summarize the rest of the day next!
