SQL is the most popular use case for the Hadoop user community, and Apache Hive is still the de facto standard. Earlier this week, the Apache Hive community released Apache Hive 1.2.0.
This is already the third release this year, and the Hive developer community continues to improve Hive and grow its team, with 11 Hive contributors promoted to committers in the last three months. Dedicated to making Hive enterprise-ready, the community has made improvements in the following areas:
- Additional SQL functionality
- Security enhancements
- Performance gains
- Stability and usability
For the complete list of features, improvements, and bug fixes, see the release notes. Here are notable improvements:
- Support for SQL Union (Union Distinct) functionality (HIVE-9039)
- Support for specifying a column list in an insert statement, e.g. insert into target(y,z) select * from source (HIVE-9481)
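To make the two SQL items above concrete, here is a minimal sketch of the same statements. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in for Hive, so it illustrates only the general SQL semantics, not Hive's behavior or syntax in every detail. The table layouts and sample rows are hypothetical; only the names target, source, y, and z come from the example above.

```python
import sqlite3

# Assumption: SQLite stands in for Hive; the tables and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source (a INTEGER, b INTEGER);
    CREATE TABLE target (x INTEGER, y INTEGER, z INTEGER);
    INSERT INTO source VALUES (1, 2), (1, 2), (3, 4);
""")

# Insert with a column list: only y and z are filled from the select.
# Column x is not listed, so SQLite stores NULL for it.
conn.execute("INSERT INTO target (y, z) SELECT * FROM source")
inserted = conn.execute("SELECT x, y, z FROM target ORDER BY y").fetchall()

# UNION removes duplicate rows; UNION ALL keeps them.
# SQLite spells the distinct form as plain UNION.
distinct_rows = conn.execute(
    "SELECT a, b FROM source UNION SELECT a, b FROM source ORDER BY a"
).fetchall()
all_rows = conn.execute(
    "SELECT a, b FROM source UNION ALL SELECT a, b FROM source"
).fetchall()

print(inserted)
print(distinct_rows)
print(len(all_rows))
```

The column list lets the select supply fewer columns than the target table has, which is the point of the insert example above.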
Performance and Optimizer Improvements
- Grace hash join algorithm for Hive so that Map Joins use disk on overflow instead of failing (HIVE-9277)
- Predicate PushDown enhancements (HIVE-9069)
- Improvements in statistics for better MapJoin selection and reducer parallelism (HIVE-9392, HIVE-10107)
- Count Distinct Performance Improvements (HIVE-10568)
- CBO – Better Windowing support (HIVE-10627, HIVE-10686)
- Changes to comply with SQL:2011 standard for reserved/non-reserved keywords (HIVE-6617)
- Caching of statistics in HiveServer2 (HIVE-10382)
- Improve performance of Vector Map Join by using more vectorization techniques (HIVE-9937, HIVE-9824)
- Improvements to the Hive authorization plugin API, to allow implementations such as Ranger to filter the results of metadata operations such as show tables.
- Support for cookie based authentication in HiveServer2 HTTP transport mode (HIVE-9709, HIVE-9710)
- JDBC driver support for 2-way SSL and for passing additional HTTP headers through intermediate servers such as Knox (HIVE-10477, HIVE-10339)
- Cross-cluster warehouse replication support, in conjunction with Falcon.
- Improve HS2 logging; allow logging verbosity to be set at the session level (HIVE-10119)
- New explain plan output geared towards traditional RDBMS users (HIVE-9780)
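
The grace hash join item in the performance list above is easier to picture with a small example. What follows is a sketch of the classic textbook algorithm in Python, not Hive's implementation: it always writes every partition to disk, whereas the item above describes using disk only on overflow. The function name grace_hash_join, its parameters, and the sample rows are all hypothetical.

```python
import os
import pickle
import tempfile
from collections import defaultdict


def grace_hash_join(build_rows, probe_rows, key, num_partitions=4):
    """Join two iterables of dicts on `key`, one disk partition at a time."""
    with tempfile.TemporaryDirectory() as tmp:

        def partition(rows, side):
            # Phase 1: hash each row's join key to choose a partition file.
            paths = [os.path.join(tmp, f"{side}_{i}.bin")
                     for i in range(num_partitions)]
            files = [open(p, "wb") for p in paths]
            try:
                for row in rows:
                    pickle.dump(row, files[hash(row[key]) % num_partitions])
            finally:
                for f in files:
                    f.close()
            return paths

        def read(path):
            # Stream the pickled rows back from one partition file.
            with open(path, "rb") as f:
                while True:
                    try:
                        yield pickle.load(f)
                    except EOFError:
                        return

        build_paths = partition(build_rows, "build")
        probe_paths = partition(probe_rows, "probe")

        joined = []
        # Phase 2: rows with equal keys share a partition index, so each
        # pair of files can be joined with a small in-memory hash table.
        for build_path, probe_path in zip(build_paths, probe_paths):
            table = defaultdict(list)
            for row in read(build_path):
                table[row[key]].append(row)
            for row in read(probe_path):
                for match in table.get(row[key], []):
                    joined.append({**match, **row})
        return joined


# Hypothetical sample data: a small dimension table and a larger fact table.
small = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 5, "name": "e"}]
large = [{"id": 1, "amt": 10}, {"id": 1, "amt": 20},
         {"id": 2, "amt": 30}, {"id": 3, "amt": 40}]
result = grace_hash_join(small, large, "id", num_partitions=2)
print(sorted((r["id"], r["name"], r["amt"]) for r in result))
```

Because only one partition's hash table is held in memory at a time, the join can finish even when the whole build side would not fit, at the cost of extra disk I/O.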