February 06, 2015

Announcing Hive 1.0: A Stable Moment in Time

In August 2009, the Facebook Data Infrastructure Team published a white paper that outlined a warehousing solution over Hadoop. They called it Hive. Since then, the project has not only emerged as the de facto standard for SQL in Hadoop; with the help of the Stinger initiative, it has also progressed from a batch-only framework with a limited SQL interface to a near-SQL:2011-compliant, fully interactive SQL query engine that delivers results at petabyte scale.

Thousands of engineers have contributed to its maturity and the entire community (past and present) should be proud of this moment.

Today, the Apache Hive community passed a vote designating the current release as 1.0. This marks a point of stability and maturity for Apache Hive: a reliable release that can now serve as a baseline for future advancements.

How we got here: A quick look at the past

To understand the progression of Hive from science project to world-class SQL engine for Hadoop, it helps to review some of the advances found in its major releases.

[Three screenshots, captured 2015-02-06, appeared here in the original post; the images are not reproduced.]

This is an impressive list of features, the kind found in the most important and widespread SQL solutions on the planet. Hive today looks very much like the SQL solutions we have deployed for years.

The 1.0 moment is a glimpse into the future of Hive

A 1.0 release made sense at this time because the community has advanced the project to the point where it passes the sniff test as a reliable enterprise SQL option. It has grown up. It has graduated.

Now that we have a stable and mature 1.0 version of Hive, it is time to extend its already extensive capabilities to new horizons. What unique SQL capabilities can be introduced with truly interactive, real-time query on Hadoop? The community has already outlined some of them in Stinger.next. In the near term we will see Hive on Spark and LLAP delivered. These are the foundations for extensions that are yet to be designed, created and delivered by this great community.

Apache Hive: An Open Community at its best

Hundreds of companies and thousands of developers should be thanked for their contributions, but none of this should be a surprise, as this is the crux of the open source development model. There are two reasons why developers take part in this awesome open community.

  • End users contribute because they need the solution to work
  • Organizations contribute because they want to explore adjacent opportunities

This is the difference between open source and open community.

Thank you.

Comments

  • Just in case I forgot to mention this before 😉 … we are on the edge of our seats waiting for Hortonworks to deliver Hive LLAP for sub-second SQL-on-Hadoop !!!

    The current major alternatives are either proprietary or may as well be… and my company is reluctant to build a multi-year strategy on those since replacing Oracle with a new Oracle-like lock-in vendor is… well… pointless.

    ps. I think you meant to write “Apache Hive: *An* Open Community at its best”?

    Regards,

    Hari Sekhon
