Earlier this week Microsoft announced via their blog that a new version of Windows Azure HDInsight is available in public preview.
Microsoft recognizes the importance of the technical innovation in and around YARN, as well as Hortonworks' leadership in this area, and we have worked collaboratively to bring Hadoop 2.2 to Azure via our Hortonworks Data Platform 2.0 for Windows release.
Apache Hadoop YARN is the data operating system for Hadoop. It greatly expands the range of applications possible with this emerging technology by allowing multiple processing frameworks, such as streaming or graph processing, to plug in natively. It also makes clusters more efficient, allowing them to better utilize resources and improve performance.
In their post, Microsoft describes the substantial performance improvements delivered with this latest release, as well as their collaboration on the Stinger initiative to bring these improvements to market.
“This release of HDInsight is important because it is engineered on the latest version of Apache Hadoop 2.2 to provide order [of] magnitude (up to 40x) improvements to query response times, data compression (up to 80%) for lower storage requirements, and leveraging the benefits of YARN (upgrading to the future ‘Data Operating System for Hadoop’).
The 40x improvements to query response times and up to 80% data compression are due to the collaboration between Microsoft, Hortonworks and other community contributors with the Stinger project. Microsoft leveraged the best practices developed in the optimization of SQL Server’s query execution engine to optimize Hadoop. We are pleased to bring enhancements to Hadoop that support such a dramatic performance improvement back to the open source community.”
Because HDInsight is built on HDP 2.0 for Windows, customers have an unprecedented choice of deployment options: on-premises, cloud, or hybrid. They can start off in one deployment mode and move seamlessly to another as their requirements evolve.
Microsoft and Hortonworks have a shared vision of open innovation in and around Apache Hadoop and a commitment to deliver that via a 100% open source platform. This significant new offering enables organizations looking to deploy Hadoop-based applications in the cloud to leverage the YARN-based architecture of Hadoop 2.0.