Extending Apache Hadoop to Millions of New Microsoft Users

Today we announced that we are delivering on our earlier promise to help Microsoft bring Apache Hadoop to Windows. I’m pleased to share that Microsoft, with our collaboration and guidance, has now submitted a series of patches to Apache aimed at overcoming the challenges of running Apache Hadoop in Windows Server environments.

These patches, once vetted and approved by the community, will become part of the core Hadoop code base. They will also become available in the two major Apache Hadoop branches: hadoop-1.0 (the current stable branch, which is available as part of Hortonworks Data Platform v1.0) and hadoop-0.23 (the next generation of Apache Hadoop, which will be available as part of Hortonworks Data Platform v2.0).

In addition, Microsoft and Hortonworks are expanding our technical collaboration to include:

  1. A new JavaScript framework for Apache Hadoop that will enable JavaScript developers to perform iterative prototyping and interactive exploration of data in Hadoop, and
  2. An enhanced Hive ODBC driver that will enable Hadoop data to be analyzed using familiar tools such as Microsoft Excel and business intelligence (BI) clients such as PowerPivot for Excel.

Why is this announcement significant?

From a Hortonworks perspective, we are obviously pleased that Microsoft chose Hortonworks as their technology collaboration partner for Apache Hadoop. Microsoft identified the tremendous value that comes from working with the team that has been at the core of Apache Hadoop development since the beginning. This, however, is not what makes the announcement significant.

With this announcement, Microsoft is demonstrating their commitment to Apache Hadoop. They are embracing open source and giving back in a true open source tradition. Not only are they contributing code that allows Apache Hadoop to run effectively on Windows, they are also contributing a new JavaScript framework and an enhanced Hive ODBC driver. While Microsoft further embracing open source is a positive development, it’s also not what makes the announcement significant.

What makes this announcement significant is that Microsoft is opening up Apache Hadoop to literally millions of new users. There are millions of JavaScript developers who can now leverage the power of Apache Hadoop. There are many more millions of Excel and PowerPivot users who can also now derive value from Apache Hadoop using software that is already very familiar to them. Simply put, these contributions by Microsoft will extend Apache Hadoop to the most prolific data analysis tools in the world.

We have stated on many occasions our vision that Apache Hadoop will process half of the world’s data within the next five years (or less). The Microsoft contributions are a very important step in making that vision a reality.


