Hortonworks is on a mission to accelerate the development and adoption of Apache Hadoop. We will accomplish this one installation at a time, through our engineering work on open source Hadoop, our distribution, Hortonworks Data Platform (HDP), which is a 100% open source data management platform, and our partnerships with the likes of Microsoft, Teradata, Talend and others.
What makes this mission possible is our all-star team of Hadoop committers. In this series, we’re going to profile those committers, to show you the face of Hadoop.
Education is a key component of this mission. Helping companies gain a better understanding of the value of Hadoop through transparent communication about the work we’re doing is paramount. In addition to explaining core Hadoop projects (MapReduce and HDFS), we also highlight significant contributions to other ecosystem projects, including Apache Ambari, Apache HCatalog, Apache Pig and Apache ZooKeeper.
Alan Gates is a leader in our Hadoop education programs. That is why I’m incredibly excited to kick off the next phase of our “Future of Apache Hadoop” webinar series. We’re starting this segment, a four-webinar series, on September 12 with “Pig Out to Hadoop” with Alan Gates (twitter:@alanfgates). Alan is an original member of the engineering team that took Pig from a Yahoo! Labs research project to a successful Apache open source project. Alan is also a member of the Apache Software Foundation and a co-founder of Hortonworks.
Get to know Alan in this first installment of our “Meet the Committer” series.
Kim: Tell us about your current role and how you interact with Apache Hadoop projects.
Alan: I wear a number of different hats. I lead the team at Hortonworks that works on Pig, Hive, and HCatalog. I was one of the original committers on the Pig project when it started in Apache five years ago, and I am still an active member of the community. I am also an active member of the HCatalog project. As an Apache member and part of the Apache Incubator, I mentor HCatalog, Bigtop, and Oozie. This means I help those projects grow into top-level projects in Apache, mentoring them in the Apache way.
Kim: How did the Pig project come about?
Alan: Pig was started as a project in Yahoo! Research. It was originally referred to simply as “the language”. One day one of the researchers said, “We need a name for this” and someone said, “How about Pig?” It stuck. After Yahoo! users began using Pig, it was clear it was valuable. Yahoo! decided to invest in making it a production-quality project. That’s when Olga Natkovich and I were brought into the project. We open sourced the project via the Apache Incubator, beefed it up to production quality, and started adding new features.
Kim: Can you provide a sneak peek of your presentation? What do you expect will be the key takeaway for folks attending this webinar?
Alan: I want to focus on a couple of things in the presentation. One, Pig 0.10 has added some exciting features, like UDFs in JRuby and a Boolean data type, as well as many language enhancements and performance improvements. A lot of work is going into Pig now, especially with our six Google Summer of Code students pouring in new features. I will also talk about changes we would like to make in Pig to take advantage of new features available in Hadoop 2.0. I hope the key takeaway will be different for each listener; hopefully it will be something new they did not know about Pig that will help them use it more effectively.
Kim: Who would win in a fight? Piglet or Miss Piggy?
Alan: This one’s easy. While Piglet was busy trying to explain that he was a very small animal and hence not given to fighting, Miss Piggy would give him one of her feared karate chops and it would all be over.
I hope you will join us on September 12, 2012 at 10am PDT / 1pm EDT to “Pig Out to Hadoop” with Alan Gates.
In the next few weeks we will be joined by other committers and Hadoop experts, including: Matt Foley, Mahadev Konar, and Arun C. Murthy. For more information and to register, go here: http://info.hortonworks.com/FutureofHadoopSeries.html