Category Archives: Microsoft


Hadoop, Hadoop, Hurrah! HDP for Windows is Now GA!

Today we are very excited to announce that Hortonworks Data Platform for Windows (HDP for Windows) is now generally available and ready to support the most demanding production workloads.

We have been blown away by the number and size of organizations that have downloaded the beta bits of this 100% open source, native-to-Windows distribution of Hadoop and engaged Hortonworks and Microsoft around evolving their data architecture to respond to the challenges of enterprise big data.

With this key milestone, HDP for Windows offers the millions of customers running their business on Microsoft technologies an ecosystem-friendly, Hadoop-based solution that is built for the enterprise and purpose-built for Windows. This release cements Apache Hadoop’s role as a key component of the next-generation enterprise data architecture across the broadest set of datacenter configurations, as HDP becomes the first production-ready Apache Hadoop distribution to run on both Windows and Linux.

Additionally, customers now have complete portability of their Hadoop applications between on-premise and cloud deployments via HDP for Windows and Microsoft’s HDInsight Service.

Enterprise Hadoop Momentum

Since its beta availability, we’ve been working with customers across a wide range of industries including automotive, manufacturing, financial services, retail and government. Here are just a few examples of the tremendous opportunity those customers are seeing:

  • Automotive – a major automotive company wants to use HDP on Windows to create a centralized repository for all of the sensor data collected from their cars. The refinement and exploration of the data trends and patterns found through driving habits, maintenance and repair data and myriad other signals will be used to further improve the quality of their cars.
  • Healthcare – a major healthcare applications provider is looking to build the next generation of healthcare apps that integrate patient health record data with clinical study and FDA data, to enrich the customer experience and provide a higher level of health care services at a lower cost.
  • Financial services – multiple major financial services organizations are looking to create centralized repositories across different divisions enabling them to explore and gain deeper insight into customer risk patterns.
  • Manufacturing – a major manufacturer of electronics will create a centralized repository of machine-generated data coming from the production lines and compare and analyze that data with part failure and return data, enabling them to identify and predict problems in production and increase the quality of their products.

This is just a small sample of the emerging use cases for HDP on Windows. You can explore how Hadoop fits into your data architecture here.

Availability & Training

Hortonworks Data Platform for Windows is now available for download at: http://hortonworks.com/download/.

We also have training specifically designed for HDP on Windows; you can get more information here: http://hortonworks.com/hadoop-training/hadoop-on-windows-for-developers/

Hadoop SDK and Tutorials for Microsoft .NET Developers

Microsoft has begun to treat its developer community to a number of Hadoop-y releases related to its HDInsight (Hadoop in the cloud) service, and it’s worth rounding up the material. It’s all Alpha and Preview so YMMV but looks like fun:

  • Microsoft .NET SDK for Hadoop. This kit provides .NET API access to aspects of HDInsight including HDFS, HCatalog, Oozie and Ambari, and also some PowerShell scripts for cluster management. There are also libraries for MapReduce and LINQ to Hive. The latter is really interesting as it builds on the established technology for .NET developers to access most data sources to deliver the capabilities of the de facto standard for Hadoop data query.
  • HDInsight Labs Preview. Up on Github, there is a series of 5 labs covering C#, JavaScript and F# coding for MapReduce jobs, using Hive, and then bringing that data into Excel. It also covers some Mahout use to build a recommendation engine.
  • Microsoft Hive ODBC Driver. The examples above use this preview driver to enable the connection from Hive to Excel.

If all of the above excites you, our Hadoop on Windows for Developers training course also covers similar content in a lot of depth.

You can read more about the partnership between Hortonworks and Microsoft here, and you can download a preview of HDP for Windows here, or sign up for HDInsight over here. And if you’re hungry for more Hadoop tutorials, grab our own Hortonworks Sandbox here.

Week in Review: Sandboxes, HDP 2.0 Alpha 2, Hive Performance and Summits

It’s almost time for that final drive home of the week, and what a week it has been with a few new releases, a summit, and a little bit of technical fun. Here’s what happened:

New Sandbox Release. Yes, your favorite Hadoop VM image just got even better. Cheryle took us through the new features which included Ambari integration and Russell followed up with a quick tour of Ambari. There’s still plenty of time to download Sandbox for a weekend of data crunching fun.

HDP 2.0 Alpha 2 was released. This preview release demonstrates some of the performance improvements in store for the final HDP 2.0 release via YARN, enhancements to Hive per the Stinger Initiative, and Apache Tez. Just before the release, we posted some early test results which showed a 45X (yes, that’s forty five) performance improvement for Hive interactive queries. But that’s just the beginning as we push to 100X, and Microsoft also talked about their contributions to the Stinger Initiative with the same aim in mind.

If you’ve downloaded Sandbox and are looking for some inspiration for a little fun, then Russell also posted a two part series on extracting, loading, querying and analyzing your own Twitter archive with Hive. Part 1 is here, and Part 2 is here.

And finally, there was just the small matter of the Hadoop Summit in Amsterdam. We had a great time and hope you did too. Thank you for attending, contributing to the conversation and supporting Hadoop. If you’re now really excited to learn Hadoop, we posted about available training we have in Europe and Palo Alto.

And that was the week that was. Has your Sandbox downloaded yet?

Microsoft’s Contributions to the Stinger Initiative and Apache Hive

Guest blog post from Eric Hanson, Principal Program Manager, Microsoft

Hadoop had a crazy and collaborative beginning as an OSS project, and that legacy continues. There have been over 1,200 contributors across 80 companies since its beginning. Microsoft has been contributing to Hadoop since October 2011, and we’re committed to giving back and keeping it open.

Our first wave of contributions, in collaboration with Hortonworks, has been to port Hadoop to Windows, to enable it both for our HDInsight service on Windows Azure and for on-premises Big Data installations on Windows. Now, we’re starting to contribute to the Stinger initiative to dramatically speed up Hive and make it more enterprise-ready.

Contribution to the core of Apache Hadoop through Stinger

Our main activity in Stinger right now is around Tez, and vectorized query execution. One of our developers, Mike Liddell, has experience with DAG-based computations in Microsoft’s internal Dryad-LINQ effort, and has just joined Tez as a founding committer. I kick-started and helped guide our project to introduce columnstore data formats and vectorized (a.k.a. “batch mode”) query execution into SQL Server 2012.  After moving to the SQL Server Big Data team, I’ve been collaborating with Hortonworks developers since late last fall regarding how to make Hive faster. We heard about the ORC project, led by Owen O’Malley of Hortonworks, to improve the RCFile columnstore format. I’ve had several productive design discussions with Owen about ORC, and we really like the way it’s shaping up.

Based on our experience, we knew that a great columnstore format is only part of the story about making data warehouse-style queries run really fast. Good process and communication architecture is another part, and Tez is a great step there. Yet another is fast query execution (QE): research and field experience with vectorized query execution have shown it can speed up queries on the order of 10X-100X.

Some people were saying that fast QE required a total-rewrite in C++. I didn’t buy that, and I prototyped vectorized scan and filter operators in Java and shared this with Hortonworks. For simple conditions like column = constant, we’ve seen the ability to filter about 150 million rows per second on one thread in Java. We now have a two-company team introducing vectorized QE to Hive, consisting of two Hortonworks folks (Jitendra Pandey and Owen) and several Microsoft engineers. We’re going to take it in small steps, adding vectorized scans over ORC, and basic filter operations first. Then we’ll move on to vectorized aggregates and joins.
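To make the batch-at-a-time idea concrete, here is a minimal Python sketch of a vectorized filter for a condition like column = constant. It is an illustration only, not the Hive or SQL Server code: the function names, the batch size of 1024 and the use of a "selection vector" of surviving row positions are all assumptions made for this example, and Python will not get anywhere near the Java throughput quoted above.

```python
from array import array

BATCH_SIZE = 1024  # assumed batch size, chosen only for illustration


def filter_equals(column, constant, selected, size):
    """Vectorized 'column = constant' filter over one batch.

    `selected` holds the positions of rows still alive in the batch and
    `size` says how many of them are valid. Surviving positions are
    compacted to the front of `selected`; the new count is returned.
    """
    kept = 0
    for j in range(size):
        row = selected[j]
        if column[row] == constant:
            selected[kept] = row
            kept += 1
    return kept


def count_matches(values, constant, batch_size=BATCH_SIZE):
    """Scan `values` one batch at a time and count rows equal to `constant`."""
    total = 0
    for start in range(0, len(values), batch_size):
        # One column vector per batch, stored as a packed array of 64-bit ints.
        column = array("q", values[start:start + batch_size])
        selected = list(range(len(column)))
        total += filter_equals(column, constant, selected, len(column))
    return total
```

The per-row work is a tight loop over a packed column rather than a call per row, and a second filter could reuse the same `selected` list so that it only looks at rows that survived the first.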

We think that the functional surface area of Hive, including its SQL query language, the open, extensible storage model over HDFS, and its easy programmer extensibility with Java UDFs, is quite compelling. It gives non-procedural access to Big Data, with the ability for programmers to create custom Java add-ins that let them do complex calculations more easily than they can with Map-Reduce programs. Hive also has a strong community of OSS developers and users. It works on ultra-scale clusters on data sets vastly bigger than total cluster memory. Stinger aims to boost the speed of Hive to complement its rich functionality in a way that users will love.

An active participant in the open community

We’ve been part of the OSS Big Data world for about a year and a half now. Through the combined efforts of the overall Hadoop community, Microsoft, and Hortonworks, Hadoop is now accessible on Windows Server and Windows Azure. We’ve gained so much from the community. Now we’re helping return the favor by contributing to Stinger, with our eye on 100X performance gains.

Week in Review: From Plastics to Windows

We’re wrapping up another busy week at Hortonworks towers. I say another, but actually this is my first week. So… it’s a hello from me, I’m Marc Holmes, Community Director. What have we been talking about this week?

Plastics and Hadoop: discuss! We started the week with a post from our VP of Products, Bob Page, comparing the growth of the plastics industry with the disruption to the database market driven by Hadoop, looking at the connections and differences to SQL and pointing out ‘what we don’t know yet’ on the evolution of use cases for Hadoop.

Hadoop and Windows sitting in a tree… Arun and Suresh highlighted the joint effort between Hortonworks and Microsoft to make Apache Hadoop run natively on Windows, and celebrated the community vote to move this work into the mainline trunk. We’re community-driven open source folk and we’re delighted not only by the code, but the spirit of community contribution throughout. Microsoft talked about this work over on their Port 25 blog.

Out there. Meantime, there was a LOT of discussion on a couple of articles including this one - Proprietary Hadoop is a Losing Strategy - and this one - One Hadoop Distribution To Rule Them All as a follow up. We believe, and Arun points out, that ‘ultimately the winners in Hadoop will be those investing most heavily in its success’.

But what do you think at a personal level? Do you want Hadoop skills, or Hadoop-a-like skills? Let us know.

And finally, talking of skills, Russell Jurney explained how to Install Hadoop on Windows. So now you know.

Next week… should be quiet. Only the Hadoop Summit in Amsterdam, and a bunch of exciting stuff we’ll tell you more about then. Stay out of trouble and enjoy the show!

Expanding the Apache Hadoop Community to Windows

This post was co-authored by Arun Murthy.

It’s been an exciting time for the Apache Hadoop community with new and innovative projects happening around performance (Apache Tez) — part of the Stinger initiative — and security (Apache Knox). In addition Hortonworks recently announced the availability of the beta version of Hortonworks Data Platform for Windows.

One of the things we believe strongly in here at Hortonworks is community-driven open source and, obviously, the bigger the community, the better. The community opens itself up to new members by the developmental choices it makes, and last week the Apache Hadoop community voted to significantly expand itself by agreeing to accept enhancements into the core trunk that make Apache Hadoop run natively on the Microsoft Windows platforms, including Windows Server and Windows Azure. These enhancements were the result of many, many months of joint engineering work from Microsoft and Hortonworks and we are glad to see the community accept and embrace them. As is common in the Apache Hadoop project, we developed these enhancements in a development branch for over a year, and once this work was complete, the community voted to incorporate the changes into the mainline trunk.

Here are the highlights of the work done:

  • Command-line scripts for the Hadoop surface area
  • Mapping the HDFS permissions model to Windows
  • Abstracted and reconciled mismatches around differences in path semantics in Java and Windows
  • Native Task Controller for Windows
  • Implementation of a Block Placement Policy to support cloud environments, more specifically Windows Azure
  • Implementation of Hadoop native libraries for Windows (compression codecs, native I/O)
  • Fixes for several reliability issues, including race conditions, intermittent test failures and resource leaks
  • Several new unit test cases written for the above changes

This is great news for the Apache Hadoop ecosystem because it enables a whole new swath of organizations using Microsoft Windows and, equally importantly, end-users to work with Apache Hadoop in their preferred environment. There is also the substantial ecosystem of technology vendors who build solutions for the Microsoft Windows platform who can now integrate their solutions on Windows. Additionally the system integrators who have invested and created expertise around the Windows platform will be able to extend their skills to Hadoop on Windows.

Of course, it is also a great demonstration of contributing back to the community so that anyone can benefit from this work. It is notable, too, that our collaborative efforts with Microsoft extend beyond core Apache Hadoop to projects like Apache Hive, Apache Pig, Apache Sqoop, Apache Oozie, Apache HCatalog and Apache HBase.

We at Hortonworks would like to extend our congratulations to Microsoft for giving back to the Apache Hadoop community and would like to extend a warm welcome; the community can look forward to seeing much more as we work together in the near future.

HOWTO install Hadoop on Windows

Installing the Hortonworks Data Platform for Windows couldn’t be easier. Let’s take a look at how to install a one-node cluster on your Windows Server 2012 machine. Let us know if you’d like more content like this.

To start, download the HDP for Windows MSI at http://hortonworks.com/thankyou-hdp11-win/. It is about 460MB, and will take a moment to download. Documentation for the download is available here.

As indicated in the documentation here, first we must install Microsoft Visual C++ 2010 Redistributable Package (x64), available here.

Download and install .NET from here if you haven’t already.

We need to set up Java, which you can get here. We also need to set JAVA_HOME, which Hadoop requires. Make sure to install Java somewhere without a space in the path: “Program Files” will not work!

To set up JAVA_HOME, open the file browser, right-click Computer -> Properties -> Advanced System Settings -> Environment variables. Then create a new System variable called JAVA_HOME that points to your Java install (in this case, C:\java\jdk1.6.0_31).


Finally, we need to download Python from here and set the Path environment variable as we did JAVA_HOME. Go to Computer -> Properties -> Advanced System Settings -> Environment variables. Then append the Python install path, for example C:\Python27, to the Path variable after a ‘;’.


Verify your path is set up by opening a new shell and typing: python, which should run the Python interpreter. Type quit() to exit. Now we’re ready for our configuration.
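Before moving on, you could sanity-check the Java setup from the Python you just installed. The sketch below is not part of the HDP installer; check_java_home is a hypothetical helper name, and it tests only the two things mentioned above: that JAVA_HOME is set and that its path contains no spaces.

```python
import os


def check_java_home(env=None):
    """Return a list of problems with JAVA_HOME (an empty list means it looks fine)."""
    # Allow a dict to be passed in so the check can be tried without
    # touching the real environment; default to the process environment.
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME", "")
    problems = []
    if not java_home:
        problems.append("JAVA_HOME is not set")
    elif " " in java_home:
        problems.append("JAVA_HOME contains a space: " + java_home)
    return problems
```

Calling check_java_home() in a new shell and getting back an empty list means both conditions hold. It does not confirm that the path actually contains a working JDK.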

Next, open the file clusterproperties.txt in Notepad. We will set it up for a simple, one-node cluster. Note: first we need to discover our hostname, and enter it into our config instead of something generic like ‘localhost’. Use the hostname command, for example:

hostname
WIN-4VLBRQK8FA8

We then place this hostname in our config. Be sure to replace the example value with your own hostname!

#Log directory
HDP_LOG_DIR=c:\hadoop\logs

#Data directory
HDP_DATA_DIR=c:\hadoop\data

#Hosts
NAMENODE_HOST=WIN-4VLBRQK8FA8
SECONDARY_NAMENODE_HOST=WIN-4VLBRQK8FA8
JOBTRACKER_HOST=WIN-4VLBRQK8FA8
HIVE_SERVER_HOST=WIN-4VLBRQK8FA8
OOZIE_SERVER_HOST=WIN-4VLBRQK8FA8
TEMPLETON_HOST=WIN-4VLBRQK8FA8
SLAVE_HOSTS=WIN-4VLBRQK8FA8

#Database host
DB_FLAVOR=derby
DB_HOSTNAME=WIN-4VLBRQK8FA8

#Hive properties
HIVE_DB_NAME=hive
HIVE_DB_USERNAME=hive
HIVE_DB_PASSWORD=hive

#Oozie properties
OOZIE_DB_NAME=oozie
OOZIE_DB_USERNAME=oozie
OOZIE_DB_PASSWORD=oozie
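If you would rather not paste the hostname in by hand, a few lines of Python can fill it in. This is only a sketch: render_cluster_properties and write_cluster_properties are hypothetical helpers, not something that ships with HDP, and the keys and values are simply copied from the example config above.

```python
import socket

# Every key in the example config that takes the machine's hostname.
HOST_KEYS = [
    "NAMENODE_HOST",
    "SECONDARY_NAMENODE_HOST",
    "JOBTRACKER_HOST",
    "HIVE_SERVER_HOST",
    "OOZIE_SERVER_HOST",
    "TEMPLETON_HOST",
    "SLAVE_HOSTS",
]


def render_cluster_properties(hostname):
    """Build the text of clusterproperties.txt for a one-node cluster."""
    lines = [
        "#Log directory",
        r"HDP_LOG_DIR=c:\hadoop\logs",
        "",
        "#Data directory",
        r"HDP_DATA_DIR=c:\hadoop\data",
        "",
        "#Hosts",
    ]
    lines += ["%s=%s" % (key, hostname) for key in HOST_KEYS]
    lines += [
        "",
        "#Database host",
        "DB_FLAVOR=derby",
        "DB_HOSTNAME=%s" % hostname,
        "",
        "#Hive properties",
        "HIVE_DB_NAME=hive",
        "HIVE_DB_USERNAME=hive",
        "HIVE_DB_PASSWORD=hive",
        "",
        "#Oozie properties",
        "OOZIE_DB_NAME=oozie",
        "OOZIE_DB_USERNAME=oozie",
        "OOZIE_DB_PASSWORD=oozie",
    ]
    return "\n".join(lines) + "\n"


def write_cluster_properties(path, hostname=None):
    """Write the config to `path`, defaulting to this machine's hostname."""
    hostname = hostname or socket.gethostname()
    with open(path, "w") as handle:
        handle.write(render_cluster_properties(hostname))
```

Running write_cluster_properties("clusterproperties.txt") from the directory where you keep the file would produce a config like the one above. socket.gethostname() should normally agree with the hostname command, but it is worth comparing the two before installing.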

And finally, install HDP for Windows:

msiexec.exe /i "hdp-1.1.0-160.winpkg.msi" /lv install.log HDP_LAYOUT=c:\Users\Administrator\Downloads\clusterproperties.txt HDP_DIR=c:\HDP DESTROY_DATA="yes"

This will bring up an MSI install window. When it is done, to verify your installation, check the HDP_DIR it was installed to:

dir c:\HDP

You should see files, such as ‘start_local_hdp_services.cmd’. Run this file:

.\start_local_hdp_services.cmd

With services up, you’re in good shape to run the SmokeTests.

Run-SmokeTests.cmd

This will fire off a MapReduce job right away. Congratulations, you’re Hadooping on Windows!


If you’d like to learn more about Hadoop, check out the Hortonworks Sandbox, a fully capable virtual machine for you to learn Hadoop with.

Putting the Elephant in the Window

 

For several years now Apache Hadoop has been fueling the fast-growing big data market and has become the de facto platform for Big Data deployments and the technology foundation for an explosion of new analytic applications. Many organizations turn to Hadoop to help tame the vast amounts of new data they are collecting, but in order to do so they have had to use servers running the Linux operating system. That left a large number of organizations who standardize on Windows (According to IDC, Windows Server owned 73 percent of the market in 2012 – IDC, Worldwide and Regional Server 2012–2016 Forecast, Doc # 234339, May 2012) without the ability to run Hadoop natively, until today.

We are very pleased to announce the availability of Hortonworks Data Platform for Windows, providing organizations with an enterprise-grade, production-tested platform for big data deployments on Windows. HDP is the first and only Hadoop-based platform available on both Windows and Linux and provides interoperability across Windows, Linux and Windows Azure. With this release we are enabling a massive expansion of the Hadoop ecosystem: new participants in the community of developers, data scientists, data management professionals and Hadoop fans can now build and run applications for Apache Hadoop natively on Windows. This is great news for Windows-focused enterprises, service providers, software vendors and developers, and in particular they can get going today with Hadoop simply by visiting our download page.

This release would not be possible without a strong partnership and close collaboration with Microsoft. Through the process of creating this release, we have remained true to our approach of community-driven enterprise Apache Hadoop by collecting enterprise requirements, developing them in open source and applying enterprise rigor to produce a 100-percent open source enterprise-grade Hadoop platform.

One of our goals at Hortonworks is to make Hadoop an enterprise-viable data platform available on as many platforms as possible. In fact, it is already available today in a range of deployment options including on-premise, virtual, cloud and appliance. Organizations looking to leverage Apache Hadoop now have even more choices of deployment options between Linux and Windows, giving them more freedom to meet their internal policies and standards. Microsoft Windows customers have complete portability of their Apache Hadoop applications between on-premise and cloud deployments, as Hortonworks Data Platform for Windows and HDInsight Service on Windows Azure are built on exactly the same code line.

If you are in the SF Bay Area this week, you can talk to us live about the power of the Hortonworks Data Platform for Windows at booth #316 at the Strata Conference, taking place February 26-28 at the Santa Clara Convention Center in Santa Clara, Calif.

 We will also be conducting the webinar, “Unlocking the Other Half: Introduction to Hortonworks Data Platform for Windows,” on Tuesday, March 12 at 10 a.m. PST / 1 p.m. EST.

To register for the webinar, please visit http://info.hortonworks.com/Hortonworks_HDPonWindows_webcast.html.

 

DINOSAURS ARE REAL: Microsoft WOWs audience with HDInsight at Strata NYC (Hortonworks Inside)

You don’t see many demos like the one given by Shawn Bice (Microsoft) today in the Regent Parlor of the New York Hilton, at Strata NYC. “Drive Smarter Decisions with Microsoft Big Data,” was different.

For starters – everything worked like clockwork. Live demos of new products are notorious for failing on-stage, even if they work in production. And although Microsoft was presenting about a Java-based platform at a largely open-source event… it was standing room only, with the crowd overflowing out the doors.

Shawn demonstrated working with Apache Hadoop from Excel, through Power Pivot, to Hive (with sampling-driven early results!?) and out to import third party data-sets. To get the full effect of what he did, you’re going to have to view a screencast or try it out but to give you the idea of what the first proper interface on Hadoop feels like…

There was a comedian who had a bit about… remember when you saw Jurassic Park for the first time? No matter how old you were, your child-like response was, “DINOSAURS ARE REAL!!!!!!$!!$##!” Our reaction to Jurassic Park was CGI technology disrupting cinema, provoking the same kind of reaction early cinema had on viewers who felt real concern that the horse or train approaching would run them over. At least that’s what I learned wasting a lottery-funded academic scholarship on film classes at a state university before having the good sense to fail out and use my time productively.

That feeling you got when you saw your first CGI raptor is what Microsoft’s demo was like, except it went… “HADOOP IS IN EXCEL!!$%!%!%!$????!!!”

This is a serious thing for me, because I hooked up Pig and Excel years ago:

Which is a crappy demo of Hadoop connecting to Excel, but which gives me mucho moral authority to state that Microsoft’s demo was the right way to hook data to Excel. Take it from someone that spent half of his twenties trying to build web applications that could compete against Excel: until data is in Excel… it ain’t real. With Microsoft’s new offering… big data just got real.

To put this into perspective:

And just so you know I’m not bullshitting you about Hadoop and Big Data and Raptors and next thing you know you’re checking for your wallet and nodding awkwardly and trying to find a pause in this lunatic rant to get the hell out of here, I’ll just come out and tell you:

I have a raptor named lame-o-saurus in a Cowboy Curtis hat permanently tattooed on my body. Again, we resort to visualization (mind the hair):

To summarize:

  1. I am the world’s primary authority on the wrong way to hook Hadoop to Excel.
  2. I have strange tattoos which affirm the validity of my metaphors.
  3. Microsoft has fundamentally altered Big Data with their HDInsight offering.
  4. Yesterday, a breakthrough happened in the Regent Parlor of the Hilton, NYC.

Visicalc… we’ve come such a long way.

Why Microsoft is committed to Hadoop and Hortonworks

This guest blog post is from Microsoft’s Dave Campbell, providing more details on why they chose Hortonworks for HDInsight.

Last February at Strata Conference in Santa Clara we shared Microsoft’s progress on Big Data, specifically working to broaden the adoption of Hadoop with the simplicity and manageability of Windows and enabling customers to easily derive insights from their structured and unstructured data through familiar tools like Excel.

Hortonworks is a recognized pioneer in the Hadoop Community and a leading contributor to the Apache Hadoop project, and that’s why we’re excited to announce our expanded partnership with Hortonworks to give customers access to an enterprise-ready distribution of Hadoop that is 100 percent compatible with Windows Server and Windows Azure.  To provide customers with access to this Hadoop compatibility, yesterday we also released new previews of Microsoft HDInsight Server for Windows and Windows Azure HDInsight Service, our Hadoop-based solutions for Windows Server and Windows Azure.

With this expanded partnership, the Hadoop community will reap the following benefits of Hadoop on Windows:

  • Insights to all users from all data: Analyze unstructured Hadoop data with familiar tools like Excel. Through integration with award-winning Microsoft BI tools such as PowerPivot and Power View, HDInsight enables analysis of all your data (structured or unstructured), including data on Linux.
  • Enterprise-ready Hadoop with HDInsight: Offering the most reliable, innovative and trusted distribution available.  Microsoft and Hortonworks together deliver tighter security through integration with Windows Server Active Directory, ease of management through System Center integration, and built-in high availability with Hortonworks Data Platform 1.1. Additionally, harness your existing .NET and JavaScript developers with rich developer frameworks that enable them to write and deploy MapReduce jobs.
  • Simplicity of Windows for Hadoop: Microsoft HDInsight Server for Windows Server significantly simplifies setup and provisioning of Hadoop through streamlined packaging, so you don’t need to choose and test the right Hadoop projects on your own. In the cloud, Windows Azure HDInsight Service simplifies deployment so much that you can now set up a 16-node Hadoop cluster in only 10 minutes! System Center simplifies management through integration with the Apache Ambari project. With this integration, IT Operators can manage their Hadoop clusters side-by-side with their databases, applications and other IT assets on a single pane of glass.
  • Extend your data warehouse with Hadoop: HDP 1.1 improves integration of Hadoop with relational Data Warehouses with HCatalog.  This provides SQL-like language access to Hadoop so that customers can enrich their analysis by including insights from Hadoop environments into the Enterprise Data Warehouse and BI systems.  Additionally, Microsoft enables customers to extend their Enterprise Data Warehouses with Hadoop connectors for SQL Server and Parallel Data Warehouse appliance.
  • Seamless Scale and Elasticity of the Cloud: Microsoft offers HDInsight both in the cloud and on-premise, with seamless migration across the two environments based on your needs. The cloud service offers elastic scalability, a simplified deployment and management experience and a low-cost way to experiment with Hadoop. Deploying Microsoft HDInsight Server on Windows Server provides enterprise-class security through integration with Active Directory, simplified management with System Center management and availability with a trusted and reliable Hadoop distribution.

This is a very exciting milestone, and we hope you’ll join us for the ride as we continue partnering with Hortonworks to democratize big data.  Download HDInsight today at Microsoft.com/BigData.

Enabling Big Data Insight for Millions of Windows Developers

At Hortonworks, we fundamentally believe that, in the not-so-distant future, Apache Hadoop will process over half the world’s data flowing through businesses. We realize this is a BOLD vision that will take a lot of hard work by not only Hortonworks and the open source community, but also software, hardware, and solution vendors focused on the Hadoop ecosystem, as well as end users deploying platforms powered by Hadoop.

If the vision is to be achieved, we need to accelerate the process of enabling the masses to benefit from the power and value of Apache Hadoop in ways where they are virtually oblivious to the fact that Hadoop is under the hood. Doing so will help ensure time and energy is spent on enabling insights to be derived from big data, rather than on the IT infrastructure details required to capture, process, exchange, and manage this multi-structured data.

So how can we accelerate the path to this vision? Simply put, we focus on enabling the largest communities of users interested in deriving value from big data.

Since one of the world’s most widely used business intelligence tools is Microsoft Excel, and since Microsoft is arguably one of the best companies at enabling and mobilizing large and vibrant developer communities, needless to say we at Hortonworks are excited and bullish on the expansion of our partnership with Microsoft.

Today Microsoft unveiled previews of Microsoft HDInsight Server and Windows Azure HDInsight Service, big data solutions that are built on Hortonworks Data Platform (HDP) for Windows Server and Windows Azure respectively. These new offerings aim to provide a simplified and consistent experience across on-premise and cloud deployment that is fully compatible with Apache Hadoop.

This news represents a significant inflection point for the big data market in general and for the importance of open source Apache Hadoop in particular. Unlocking the Windows Server and Windows Azure markets for Hadoop means more businesses will be able to tap into its benefits.

Moreover, these new offerings represent months of joint engineering work across both the Microsoft and Hortonworks engineering and product teams. Microsoft’s commitment to doing this work in a way that improves open source Apache Hadoop and related Apache projects has been unwavering, which translates into goodness for the open source community.

I encourage you to try out the fruits of our labors in one of two ways:

• Download Microsoft HDInsight Server and play with Hadoop on your own Windows machine.
• Access Windows Azure HDInsight Service and play with Hadoop in the cloud.

To learn more and get started, go to http://hortonworks.com/partners/microsoft/

Finally, check out Microsoft’s announcement for more information! http://blogs.technet.com/b/dataplatforminsider/archive/2012/10/22/simplifying-big-data-for-the-enterprise.aspx