Announcing Apache Hive 0.12: Stinger Phase Two… DELIVERED

Stinger is not a product.  Stinger is a broad, community-based initiative to bring interactive query at petabyte scale to Hadoop. And today, as representatives of this open, community-led effort, we are very proud to announce delivery of Apache Hive 0.12, which represents the critical second phase of this project!

Only five months in the making, Apache Hive 0.12 comprises over 420 closed JIRA tickets contributed by ten companies, with nearly 150 thousand lines of code!  This work is perfectly representative of our approach… it is a substantial release with major contributions from a wide group of talented engineers from Microsoft, Facebook, Yahoo and others.

Delivery of SQL-IN-Hadoop Marches 
The Stinger Initiative was announced in February and, as promised, we have seen consistent, regular delivery of new features and improvements as outlined in the Stinger plan.  There are three roadmap vectors for Stinger: Speed, Scale and SQL.  Each phase of the initiative advances on all three goals, and this release provides a significant increase in SQL semantics, adding the VARCHAR and DATE datatypes and improving the performance of ORDER BY and GROUP BY. Several features to optimize queries have also been added.

We also contributed numerous “under the hood” improvements, i.e., refactoring code and making it easier to build on top of Hive, which gets rid of some of the technical debt. This helps us deliver further optimizations in the long term, especially for the upcoming Apache Tez integration.

A complete list of the notable improvements included in the release is available here, and you can expect an updated performance benchmark soon!

Momentum

If you check out the release notes, be prepared to scroll for quite some time, as they extend over 420 JIRA tickets.  A lot of people have been involved and, as you can see from the chart below, Hive is a wildly active community.  The chart counts the number of emails sent per month to the Hive developer mailing list.  The momentum is building and the community is definitely engaged.

Many people need to be thanked, most of them listed here: Alan Gates, Aleksey Gorshkov, Anandha L Ranganathan, Arup Malakar, Ashutosh Chauhan, Azrael, Bing Li, Brock Noland, Caofangkun, Chaoyu Tang, Chris Drome, Chu Tong, Daniel Dai, Deepesh Khandelwal, Dheeraj Kumar Singh, Dilip Joseph, Edward Capriolo, Eli Reisman, Eugene Koifman, Gabriel Reid, Gopal V, Gunther Hagleitner, Guo Hongjie, Hari Sankar Sivarama Subramaniyan, Harish Butani, Ido Hadanny, Ivan A. Veselovsky, Jarek Jarcec Cecho, Jason Dere, Johnny Zhang, Jon Hartlaub, Kevin Wilfong, Laljo John Pullokkaran, Lefty Leverenz, Mark Grover, Mark Wagner, Matthew Weaver, Mikhail Bautin, Morgan Phillips, Namit Jain, Navis, Owen O’Malley, Prasad Mujumdar, Prasanth J, Rob Weltman, Robert Roland, Roshan Naik, Samuel Yuan, Sarvesh Sakalanaga, Sean Busbey, Sergey Shelukhin, Shreepadma Venugopalan, Shuaishuai Nie, Sushanth Sowmyan, Swarnim Kulkarni, Teddy Choi, Thejas M Nair, Vikram Dixit K, Xiu, Xuefu Zhang and Yin Huai.

Availability

You can download the release from the Apache Hive website today.  The full Release Notes are also available.
