Last week was an important milestone for Hortonworks: our one-year anniversary. Given all of the activity around Apache Hadoop and Hortonworks, it’s hard to believe it’s only been one year. In honor of our birthday, I thought I would look back to contrast our original intentions with what we delivered over the past year.
Hortonworks was officially announced at Hadoop Summit 2011. At that time, I published a blog on the Hortonworks Manifesto. This blog told our story, including where we came from, what motivated the original founders and what our plans were for the company. I wanted to address many of the important statements from this blog here:
Hortonworks was formed to “accelerate the development and adoption of Apache Hadoop”. I returned to this point often throughout the manifesto. We committed to working with the community to accelerate the development and adoption of Apache Hadoop and we absolutely delivered on this promise. Over the past year, Apache Hadoop released Hadoop-1.0, the most stable line of Apache Hadoop ever. Hadoop-2.0, including the next-generation architectures for both MapReduce and HDFS, was also released in alpha form. Apache Hadoop continues to gain momentum as proven by every important metric (downloads, web traffic, press & analyst coverage, conference and Meetup attendance, etc.). It was a banner year for Apache Hadoop and we are proud to have played an important role in making it happen.
We are “committed to open source” and commit that “all core code will remain open source”. This commitment is as solid today as it was a year ago. All code developed by Hortonworks has been contributed back to open source. In addition to our significant contributions to core Hadoop projects (MapReduce and HDFS), we have also made significant contributions to other Hadoop ecosystem projects including Ambari, HCatalog, Pig and ZooKeeper. We will continue to be a leader in the Hadoop community process and will continue to contribute all of our Hadoop development efforts back into the Apache community development process.
We will “make Apache Hadoop easier to install, manage and use”. This was a key focus for Hortonworks over the past year. We quickly learned that it would be beneficial to the market to offer a Hortonworks distribution of Apache Hadoop that delivered on this promise. Hortonworks Data Platform, which we recently made available to the entire ecosystem, addresses each of these areas. We have included an installer that greatly simplifies the installation process for Apache Hadoop. We included, for the first time, Apache Ambari, which allows organizations to manage and monitor their Hadoop clusters. We also tightly integrated Hortonworks Data Platform with Talend Open Studio for Big Data, which provides a visual design environment for connecting Hadoop with hundreds of enterprise data systems in order to make Hadoop easier to use. The result is a greatly simplified process for organizations that are looking for a pure Apache Hadoop distribution.
We will “make Apache Hadoop more robust”. Again, I’m pleased that we delivered on this promise. We were instrumental in the re-architectures of MapReduce and HDFS to address the enterprise needs of each of these core components. Our team has written a number of blogs and presentations on these topics that I strongly recommend you read if you haven’t already. Among the most significant are the following: NextGen MapReduce presentation, NextGen MapReduce Hits Mainline, Delivering on Hadoop .NEXT, Benchmarking Performance, Apache Hadoop 2.0 (Alpha) Released, Data Integrity and Availability in Apache Hadoop HDFS, An Introduction to HDFS Federation, NameNode HA Reaches an Important Milestone, Snapshots for HDFS and High Availability and Hadoop 1.0 – Perfect Together. The last post covers the ability to add new HA capabilities to the stable and proven Hadoop-1.0 line.
We will “make Apache Hadoop easier to integrate and extend”. We have made some important advancements in this area that may have gone unnoticed. Much of this work is related to HCatalog, an Apache project that provides a metadata and table management system for Hadoop. We feel strongly that HCatalog is the preferred path for simplifying data sharing between Hadoop and other enterprise data systems and have invested heavily in advancing this project and related APIs for HCatalog. By tightly integrating Talend Open Studio for Big Data, we have also made it much easier for a broader audience to integrate Hadoop with hundreds of existing data systems. We have also formed important partnerships with leaders such as Microsoft and Teradata to ensure that their platforms and applications are tightly integrated and optimized to work with Apache Hadoop.
We will “deliver an ever-increasing array of services aimed at improving the Hadoop experience and support in the growing needs of enterprises, systems integrators and technology vendors”. Over the past year, we have made available Hortonworks University, an exceptional Hadoop training program for developers, administrators and analysts; and Hortonworks Services, which leverages the deep domain experience of the Hortonworks technical staff to provide technical support to enterprises, systems integrators and technology vendors. Our training courses, in particular, have been very well received by students who have continually praised our hands-on lab exercises as the best in the industry. We have recently expanded our training schedule, so check it out.
There were certainly many other notable achievements over the past year including
As you can see, we are very proud of our accomplishments in our first year. We were also glad to be recognized by Forrester as a leader in the Forrester Wave on Enterprise Hadoop Solutions. Really, how often do companies get recognized as leaders by Forrester in their very first year of existence?
While this blog took a look back at last year, stay tuned for another blog that looks forward to what we have planned for year two.