In August 2009, the Facebook Data Infrastructure Team published a white paper that outlined a warehousing solution over Hadoop. They called it Hive. Since then, the project has not only emerged as the de facto standard for SQL in Hadoop, but, with the help of the Stinger initiative, has also progressed from a batch-only framework with a limited SQL interface to a nearly SQL:2011-compliant, fully interactive SQL query engine that delivers results at petabyte scale.
Thousands of engineers have contributed to its maturity and the entire community (past and present) should be proud of this moment.
Today, the Apache Hive community passed a vote designating the current release as 1.0. This marks a point of stability and maturity for Apache Hive: a reliable release that can now serve as a baseline for future advancements.
To understand the progression of Hive from science project to world-class SQL engine for Hadoop, it is worth reviewing some of the advances found in its major releases.
This is an impressive list of features that can be found in any of the most important and widespread SQL solutions on the planet. Hive today looks very much like the SQL solutions we have deployed for years.
A 1.0 release made sense at this time because the community has advanced the project to the point where it passes the sniff test as a reliable enterprise SQL option. It has grown up. It has graduated. Now that we have a stable and mature 1.0 version of Hive, it is time to extend its already extensive capabilities to new horizons. What unique SQL capabilities can be introduced with truly interactive, real-time query on Hadoop? The community has already outlined some of them in Stinger.next. In the near term we will see Hive on Spark and LLAP delivered. These are the foundations for extensions that are yet to be designed, created, and delivered by this great community.
Hundreds of companies and thousands of developers should be thanked for their contributions, but none of this should come as a surprise, as this is the crux of the open source development model. There are two reasons why developers take part in this awesome open community.
This is the difference between open source and open community.