The release of Hive 0.11 is exciting and represents a big step forward in the delivery of Project Stinger and SQL-IN-Hadoop. There is still some work to be done, however. We look forward to the delivery of Hadoop 2 with YARN and the Apache Tez project, which we expect to bring large gains in Hive performance, but performance is not the only goal of Stinger.
Today, HiveQL provides a fairly good set of SQL data types and semantics. While this (or a subset thereof) may be good enough for some of the “on” Hadoop solutions, we feel there needs to be more, especially if Hadoop and Hive are to meet the stringent requirements of enterprise-class business analytics. To this end, we have set a goal of compatibility with most of SQL-92, plus some SQL-2003 extensions.
The release of Apache Hive 0.11 pushes us further towards SQL compatibility: the decimal data type becomes more usable (JIRA HIVE-4271), and analytic functions for windowing and aggregation have been added. It also vastly improves joins, all while improving performance. Awesome.
There is a lot more work to be done, however, and we'll work with the community to get it done. Hive 0.11 had contributions from over 50 community members, who closed over 380 JIRA tickets. That is astounding, and a huge proof point of the open community and its unrivaled capability to innovate faster than any proprietary solution.
We will reach our goal soon. Here is what’s left to be done:
We look forward to providing updates to Hive all summer long!