Although the Hadoop Summit San Jose 2014 has come and gone, the invaluable content—keynotes, sessions, and tracks—is available here. I’ve selected a few sessions below for Hadoop system administrators and dev-ops, curating them under a general Hadoop operations theme.
Dev-ops engineers and system administrators know best that ease of operations and deployments can make or break a large Hadoop production cluster, which is why they care about all of the following:
Falling short on any of these desirable outcomes can deprive them of sleep. In the case of Hadoop’s large-scale cluster operations and management, where the Enterprise Hadoop ecosystem comprises both traditional and modern infrastructure components, the operational tasks can be herculean. As @DevOps_Borat sanguinely satirizes:
The good news is that people at the helm, at the nerve center of operations, shared at the Hadoop Summit their best practices for addressing and managing these complex challenges. Here are a few:
| Session | Video | Slides |
|---|---|---|
| Lessons Learned from Building Big Data Platform From Ground Up | Video | Slides |
| Managing 2000 Node Cluster with Apache Ambari | Video | Slides |
| Hadoop 2 @ Twitter, Elephant Scale | Video | Slides |
| Lessons Learned – Monitoring the Data Pipeline at Hulu | Video | Slides |
| Collection of Small Tips on Stabilizing your Hadoop Cluster | Video | Slides |
| Hadoop and OpenStack | Video | Slides |
I cherry-picked these few sessions because they best addressed those topics, but you can always peruse all the tracks through the session descriptions on the schedule, in any time slot on any day, and pick whatever piques your curiosity.
For example, when you hover over and click on a session description, a popup appears in which you can choose either to watch the video or to view the slides.
In the next blog post, I’ll curate content on data access and management, in particular the role YARN plays as an architectural anchor and the center of the Modern Data Architecture (MDA).