Twitter Analytics Presents Hadoop and Pig at UC Berkeley

Twitter Analytics presented their distributed infrastructure, including Hadoop and Pig, at a UC Berkeley iSchool special course called INFO 290: Analyzing Big Data with Twitter. Twitter is a major contributor to many Apache projects. The course was over-subscribed and was a great success, as students got to learn from practicing data scientists using Hadoop on truly massive datasets. The entire lecture series is available here.

Bill Graham @billgraham, a Data Systems Engineer at Twitter Analytics and Apache Pig committer, presented an Introduction to Hadoop. His slides are available here. His presentation gives a comprehensive introduction to Apache Hadoop including its history, motivation, practice and operation.

Jonathan Coveney @jco, a Data Systems Engineer at Twitter Analytics and Apache Pig committer, presented Pig at Twitter. Slides for this presentation are available here. His presentation gives a comprehensive explanation of Apache Pig‘s philosophy, use and intricacies. It is one of the most thorough introductions to Pig I’ve seen and will serve as excellent documentation for beginners and intermediate Pig users alike.

Hats off to Twitter for their contribution to Apache open source and education. More Pig talks and papers are availableĀ on the Pig Confluence here.

Categorized by :
Apache Hadoop Hadoop Ecosystem Industry Happenings Pig

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *

If you have specific technical questions, please post them in the Forums

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

HDP 2.1 Webinar Series
Join us for a series of talks on some of the new enterprise functionality available in HDP 2.1 including data governance, security, operations and data access :
Integrate with existing systems
Hortonworks maintains and works with an extensive partner ecosystem from broad enterprise platform vendors to specialized solutions and systems integrators.
Hortonworks Data Platform
The Hortonworks Data Platform is a 100% open source distribution of Apache Hadoop that is truly enterprise grade having been built, tested and hardened with enterprise rigor.