Recap of the August Pig Hackathon at Hortonworks
The August Pig Hackathon brought Pig users from Hortonworks, Yahoo, Cloudera, Visa, Kaiser Permanente, and LinkedIn to Hortonworks HQ in Sunnyvale, CA to talk and work on Apache Pig.
Jonathan Coveney and Bill Graham from Twitter walked newer Pig users through how Pig translates a Pig Latin script into MapReduce jobs and how to read the output of the EXPLAIN command. For those interested, Hortonworks founder Alan Gates covers this in Chapter 1 of Programming Pig.
Thejas Nair walked through how to contribute patches to Pig and how to work with committers to get the patches in. You can learn more about this on the Pig Wiki.
The group talked through the proposal for a new EvalFunc interface that would make it much easier to write UDFs (user-defined functions) for Pig. Part of what makes Pig so powerful is its extensibility, and making UDFs easier to write would make Pig a better tool. A discussion is available in JIRA ticket PIG-2421 if you want to collaborate on improving Pig’s eval funcs.
Alan Gates presented some thoughts on building a generic DAG (directed acyclic graph) execution and optimization engine that could be used by Pig and Hive and that would take advantage of new features in Hadoop 2.0. This would reduce duplication between the projects as well as allow users to share UDFs between them. We covered using Pig and Hive together and via HCatalog in previous posts.
You don’t have to be a Pig expert to attend a Pig meetup – all levels of proficiency are invited. Committers love to meet new users who appreciate their work. One attendee said, “There were many pig commiters at the meetup. The Twitter and HortonWorks people were very helpful.”
To find out about more Pig meetups, join the Pig User group on Meetup. We can’t wait to see you there!


Would love to attend once I graduate