Category Archives: Hadoop Ecosystem


Advanced Analytics: Making Decisions at the Speed of Business

Retailers today must address the new behaviors of an evolving customer base and make the most of a changing landscape and its new dynamics. Online, retail consumers are sharing, validating with friends, researching, learning and developing a point of view; offline, they are touching, comparing brands and forming brand associations. Retailers now more than ever have to think in terms of “integrated commerce” and leverage Big Data for big results in the marketplace.

Forward-thinking organizations are discovering the possibilities of unconstrained analytics and quickly realizing the potential of spreading analytics across the company, ultimately speeding up how they acquire new customers, respond to consumer and market change, and increase their “share of wallet”. Retail analysts want to spend more time in the analytic discovery process, and less time acquiring and preparing data, so they can uncover new market opportunities and reduce risks. Their goal is to create a sustainable competitive advantage that lets retailers predict consumer shopping patterns, increase market basket size by small percentages and better target new customers, gains that quickly translate into millions or billions of dollars.

Hortonworks partner ParAccel offers an Analytic Platform with parallel, bi-directional integration with Hortonworks Data Platform, enabling cooperative analytic processing that leverages the data and analytic functions of both systems. The ParAccel Analytic Platform is built to run deep “in-database” analytics on massive amounts of data across systems, extending Hortonworks Data Platform for big data analytics. Joint customers find the integrated platforms provide a powerful, cost-effective solution for big data management and advanced analytics.

The architecture creates an open environment where analysts can bring in data from data warehouses and leverage data in Hortonworks Data Platform before or in the middle of a query. ParAccel also recently added support for HCatalog, making this integration speedy and efficient. It’s a great solution for offloading analytics from traditional platforms or bringing in internet data, sensor data or normalized (structured) social media data. These out-of-the-box modules give analytic-driven retailers access to the full range of data needed to make the right marketing, merchandising, and store operations decisions every time at the speed of business.

Learn more – join Hortonworks in the upcoming ParAccel webinar “Advanced Analytics on Hadoop Data” May 21st at 10 am PT.

Enterprise Big Data Analytics with Hortonworks and Datameer

Today, 94% of Hadoop users perform analytics on volumes of data so large that the analysis was not possible before. How do they do it? Cool applications, that’s how.

You have seen various stats that indicate enterprises need better ways of making use of data, but they bear repeating: The volume of business data worldwide, across all companies, doubles every 1.2 years, according to a study published by eBay in May 2012. And market research firm IDC released a forecast showing the big data market may grow from $3.2 billion in 2010 to $16.9 billion in 2015. Clearly, enterprises need better ways of making use of all of this data, which contains innumerable insights for improving business processes and profitability.
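
To put those two figures in perspective, here is a quick back-of-the-envelope sketch in Python. It uses only the numbers quoted above; the function names are ours and purely illustrative, and it assumes smooth compound growth, which real markets rarely follow.

```python
def annual_rate_from_doubling(doubling_years: float) -> float:
    """Annual growth rate implied by a fixed doubling time."""
    return 2 ** (1 / doubling_years) - 1


def cagr(start: float, end: float, years: float) -> float:
    """Compound annual growth rate between a start and an end value."""
    return (end / start) ** (1 / years) - 1


# Figures quoted in the post
data_growth = annual_rate_from_doubling(1.2)   # data volume doubles every 1.2 years
market_growth = cagr(3.2, 16.9, 2015 - 2010)   # IDC forecast, in $ billions

print(f"Implied data growth per year: {data_growth:.0%}")
print(f"Implied market CAGR 2010-2015: {market_growth:.0%}")
```

Under that assumption, the data volume figure works out to a little under 80% growth per year, and the IDC forecast to just under 40% per year.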

Hortonworks partner Datameer has a horizontal application for big data discovery that includes self-service data integration, analytics and visualization on top of Hadoop, including pre-built analytic applications.

Beyond the core application, Datameer takes it one step further and offers pre-built analytic applications. Datameer’s Analytics App Market is the world’s first marketplace for buying and selling analytic applications that allows users to simply plug in their own data and see the final results visualized, without having to do the work of building the analysis.

The applications are downloaded with a single click, and range from broad, horizontal use cases that almost any organization could use, such as email analytics or social media brand sentiment analysis, to very specific use-case-driven applications such as Zendesk Forum analytics or JIRA ticket analyses. The best part is that the marketplace is constantly growing as data scientists and subject matter experts from around the world create and contribute new applications for virtually any structured or unstructured data source.

Betting on Hadoop

Joe Nicholson, VP of marketing at Datameer, explains that the idea of Big Data analytics has exploded in the past 5 years. Business intelligence is not new, he said. What changed is the rise of so-called unstructured data. “Today, companies want to track things like customer paths taken through a website, email network usage, comments posted on websites or collaborative tools, or find the useful information hidden in millions and millions of tweets,” said Nicholson.

“There’s no way to do it all in any sort of timely fashion, or without breaking the bank, without first getting all of your data in Hadoop. So first and foremost comes the need to be able to get that data in yourself, without relying on IT. Then you want to point-and-click your way through your analysis and get instant feedback so you can analyze the same way you think. When you’ve built your analysis, that’s when you want to run it against your entire dataset. And finally, you want to visualize your results with just a few clicks. Import corporate logos, add text, make the report your own. We do all of that in Datameer, and we couldn’t do it if we hadn’t made this fundamental bet on Hadoop.”

Datameer partnered with Hortonworks back in 2011, and the two companies have been working together to accelerate the development and adoption of big data analytic solutions that leverage or extend the Apache Hadoop platform, and allow users to tap into the massive amounts of unstructured data.

The joint webinar conducted earlier this year, “Big Data Analytics: Is Your Elephant Enterprise Ready?”, addressed critical project components such as data security, high availability, user training and use case development.

For more information on Datameer visit www.Datameer.com or @datameer

Hortonworks at Yahoo! Hack Europe

Some news from the UK: Yahoo! Hack Europe welcomed Hortonworks this past weekend in central London. This two-day event, sponsored by Yahoo!, focused on celebrating collaboration, learning and innovation using the world’s leading technologies. Chris Harris, our local EMEA Solution Engineer, was on hand to add to the discussions. Partnering with Microsoft, we were able to showcase our HDP on the Azure platform. This was a fantastic opportunity for the 350 delegates to be exposed to both Azure and enterprise-ready Hadoop provided as the HDInsight Service.

After an appearance by Yahoo’s larger-than-life Hack Robot (seriously, check it out…), who made sure that everyone was entertained, the hack started with a vengeance. Hyped up on the sweetie cart full of everyone’s favorites, most delegates were now officially up for the challenge. Inspired by the passion, Chris led a thought-provoking workshop, where a number of the hackers were able to try out real-life scenarios showing how Hadoop, as part of the HDInsight service, can and will impact business decisions. After a few more of the free donuts and sandwiches, a few more questions answered and a number of people inspired, Chris finally left the hackers to enjoy the rest of their weekend. Congratulations to everyone who took part, and to the winners! From what we gather, the whole weekend was a grand success, and we look forward to working with them on the next one and possibly seeing you there!

Chris’ decks can be found on Slideshare, and we’ve embedded them below too. Our thanks to everyone who attended!

How to Build a Hadoop Data Science Team

Data scientists are in high demand these days. Everyone seems to be hiring a team of data scientists, yet many are still not quite sure what data science is all about, and what skill set they need to look for in a data scientist to build a stellar Hadoop data science team. We at Hortonworks believe data science is an evolving discipline that will continue to grow in demand in the coming years, especially with the growth of Hadoop adoption. This role requires experience and knowledge in math, statistics and machine learning, programming and scripting, as well as visualization techniques.

Hadoop data scientists

We tend to think of the data scientist role as a continuum of skills:

Software engineers really enjoy crafting new production-grade software systems that are testable, maintainable and secure, and that scale well. Some of those software engineers specialize in working with data. They tend to be highly skilled in technologies like SQL, Hadoop, Hive/Pig and MapReduce, and excel at building production-quality data pipelines. We call those “data engineers”.

Research scientists focus on academic research in machine learning and statistical techniques, creating brand new algorithms like support vector machines and deep learning, and proving theoretical properties of such algorithms. Applied scientists are those research scientists who thrive on solving real-world problems with real data. They are very good at applying state-of-the-art algorithms and techniques to real-world data.

The data scientist role combines the skill set and experience of a data engineer with that of the applied scientist. It is quite difficult to find good data scientists, because the combination of all these skills and interests is rarely found in a single person.

“Okay, okay, I understand it’s hard to find good data scientists”, you may say, “but I still need to complete my data projects, what should I do?” One option might be to train data engineers to be experts in math, statistics and applied science. Or maybe hire applied scientists and train them to be good software engineers. In my experience that approach has limited success, because good software engineers may not be as good at applied science, or may not be interested in shifting their career in that direction. And vice versa.

Instead, simply build a Hadoop data science team that combines data engineers and applied scientists, working in tandem to build your data products. Back when I was at Yahoo!, that’s exactly the structure we had: applied scientists working together with data engineers to build large-scale computational advertising systems.

Apache Hadoop Patterns of Use: Refine, Enrich and Explore

“OK, Hadoop is pretty cool, but exactly where does it fit and how are other people using it?”  Here at Hortonworks, this has got to be the most common question we get from the community… well that and “what is the airspeed velocity of an unladen swallow?”

We think about this (where Hadoop fits) a lot and have gathered a fair amount of expertise on the topic.  The core team at Hortonworks includes the original architects, developers and operators of Apache Hadoop and its use at Yahoo, and through this experience and working within the larger community they have been privileged to see Hadoop emerge as the technological underpinning for so many big data projects. That has allowed us to observe certain patterns that we’ve found greatly simplify the concepts associated with Hadoop, and our aim is to share some of those patterns here.

As an organization laser-focused on developing, distributing and supporting Apache Hadoop for enterprise customers, we have been fortunate to have a unique vantage point.

With that, we’re delighted to share with you our new whitepaper ‘Apache Hadoop Patterns of Use’. The patterns discussed in the whitepaper are:

Refine: Collect data and apply a known algorithm to it in a trusted operational process.
Enrich: Collect data, analyze and present salient results for online apps.
Explore: Collect data and perform iterative investigation for value.

You can download it here, and we hope you enjoy it.

Where are Hortonworkers? Events and Meetups 8th April to 22nd April

Hortonworkers are out there – here is a rundown of events and meetups we’ll be at in the next couple of weeks, and we hope we’ll see you there. Did we miss any? Want us to attend your event? Let us know!

Big Data Innovation Summit

April 10-11, 2013, San Francisco, CA

http://theinnovationenterprise.com/summits/big-data-innovation-summit-april-2013-san-francisco

Spring into April and jump into Big Data! Be sure to meet us at the Big Data Innovation Summit by the bay. We’re excited to have Alan Gates, co-founder of Hortonworks, presenting a couple of really exciting talks, and we hope you can join us.

  •  April 11 @ 9:30am: Coordinating the Many Tools of Big Data in Hadoop
  •  April 11 @ 12:30pm: Hadoop Now, Next and Beyond
  •  April 11 @ 2:00pm: Roundtable Session: Use Case Patterns: Horizontal or Vertical

As a global sponsor, we’ll also be exhibiting. Look for us in the exhibit area and meet members of the Hortonworks team, who will be happy to discuss any questions you have on Hadoop and Hortonworks.

PASS Business Analytics Conference

April 10-12, 2013, Chicago, IL

http://www.passbaconference.com – booth S5

We’re excited to participate in the first PASS Business Analytics (BA) community-driven event. We will be speaking at three sessions: “Why Apache Hadoop for Data Science”, “The Future of Apache Hive and Hadoop 2.0”, and “Big Data: Threat or Opportunity?”

Teradata Universe Copenhagen 2013

April 14-17, 2013, Copenhagen, Denmark

http://www.teradataemea.com/

We’re delighted to be a Platinum sponsor at Teradata Universe. The conference gathers experts from internationally recognized companies and presenters from Teradata’s customer community to deliver insights on new trends driving the industry and on how Big Data Analytics are used to drive business value.

Chris Harris, Solutions Engineer at Hortonworks, will be speaking at the Solution Showcase on “Big Data: Making Sense of it all!” on Monday April 15 at 12:40 and Tuesday April 16 at 11:20.

More on the Hortonworks / Teradata partnership can be found at www.hortonworks.com/teradata

eMetrics Summit

April 14-18, 2013, San Francisco

http://www.emetrics.org/sanfrancisco/2013/

Hortonworks VP Products, Bob Page, will be speaking at two sessions at this analytics event.

OpenStack

April 15-18, 2013, Portland, Oregon

http://openstacksummitapril2013.sched.org/

We’re heading to our very first OpenStack Summit to talk about all things Apache Hadoop on OpenStack and we would love to meet you! A cloud deployment model makes perfect sense for Hadoop: (a) it allows for efficient infrastructure usage and (b) Hadoop is a net new workload for most organizations (awesome… far fewer legacy considerations). So Hadoop + OpenStack seems like a logical fit. If your organization is interested in combining these two mega technology trends, it would be great to connect with our team, who can share what others are doing!

There are many ways to meet the Hortonworks team!
We’ll be speaking:

And we’re exhibiting! Come by our Hortonworks booth, say hello, geek out to Hadoop and Big Data and pick up some awesome swag while you’re at it!

Charlotte Hadoop Users Group, 11th April 2013

http://www.meetup.com/CharlotteHUG/

Terry Padgett will present on the Stinger Initiative, Tez and Knox

Bay Area HUG, 17th April 2013

http://www.meetup.com/hadoop/events/63737062/

Owen O’Malley will present on the Stinger Initiative

Chicago HUG, 22nd April 2013

http://www.meetup.com/Chicago-area-Hadoop-User-Group-CHUG/events/106391622/

George Vetticaden will present on the Stinger Initiative, Tez and Knox.

Keynotes from Hadoop Summit Amsterdam 2013

The slides and videos from Hadoop Summit in Amsterdam have begun to flow so you can enjoy the sessions.

Whilst you’re thinking about which sessions to watch and read, we suggest taking a look at the keynotes for the event:
  • What is the point of Hadoop? (Video, Slides)
  • Matt Aslett, Research Director, Data Management and Analytics, 451 Research
  • Real-World insight into Hadoop in the Enterprise (Video)
  • Panel featuring HSBC, eBay, Neustar and More
We hope you enjoy these sessions, and the content from the tracks. Let us know in the comments! And don’t forget that there is plenty of time to register for Hadoop Summit San Jose 2013.

Hadoop Market Momentum and You

On 27th March, the Wall Street Journal published an article ‘VCs Bet Big Bucks on Hadoop’ and it seems clear that the market is going to be huge. But what does that mean to you and your personal skills investment? Here’s our view:

Hadoop is HOT

Hadoop is incredibly hot right now as the number of available jobs continues to grow enormously (hey – we even have a bunch of our own right here).

Indeed’s Job Trends shows Hadoop as the 7th hottest skill, and it’s in great company alongside app development skills such as iOS, Android and jQuery. I guess that’s to be expected, of course: insights from big data are the fuel for the smartest apps of the future.

The Hadoop trend itself is fairly clear. In growth terms, that is pretty explosive!

Indeed Job Trends

A quick search on LinkedIn will pull back around 1200 Hadoop jobs right now (it was 1281 when I checked). And you can also look at the Skills page to see the associated set of component technologies and their relative growth.

Hortonworks is HOT

Apart from the WSJ, just last week, MomentumIndex called out Hortonworks as the 2011 Startup with the most Momentum from a pool of 900 startups being tracked from that year.

We also know when we talk to customers that they’re excited about our approach to pure, community-driven, open source Hadoop. We know developers are excited to get hands-on with Hadoop via the Sandbox. And we see from great public responses, like those we saw at Hadoop Summit Amsterdam, that our approach is the right one.

Hadoop, Hortonworks and YOU are HOT

Hortonworks believes in Hadoop and we believe in the power of community-driven open source. We know that this is just the beginning for Hadoop and we back everyone investing their skills in Hadoop, and taking this journey with us. All the way.

Get Started: You can get started by downloading our Sandbox - it’s a VM package containing everything you need to run a single node cluster (I love that expression!) and is packed with tutorials and demos.

Get Connected: Stay in touch. When we say community we mean it – come follow us on Twitter, Facebook and LinkedIn. We want to hear from you about how we’re doing in providing you with the tools and capabilities to do what your business is demanding. Find a Hadoop User Group (HUG), and come along to the Hadoop Summit.

Get Certified: If you want to differentiate yourself and grab one of those jobs, then you can train and certify with us too. All of the details on that are here.

Dive in and enjoy.

Hadoop Summit North America 2013: Community Choice Results

And the voting is over and the results are in for the Community Choice program of the Hadoop Summit San Jose 2013.

With over 300 sessions, and around 6000 users casting more than 15000 votes there was a lot of excitement to participate and influence the results - thanks to everyone for your contribution. At the end of the process, the selectees are:

  • Application and Data Science Track: Watching Pigs Fly with the Netflix Hadoop Toolkit (Netflix)
  • Deployment and Operations Track: Continuous Integration for the Applications on top of Hadoop (Yahoo!)
  • Enterprise Data Architecture Track: Next Generation Analytics: A Reference Architecture (Mu Sigma)
  • Future of Apache Hadoop Track: Jubatus: Real-time and Highly-scalable Machine Learning Platform (Preferred Infrastructure, Inc.)
  • Hadoop (Disruptive) Economics Track: Move to Hadoop, Go Fast and Save Millions: Mainframe Legacy Modernization (Sears Holding Corp.)
  • Hadoop-driven Business / BI Track: Big Data, Easy BI (Yahoo!)
  • Reference Architecture Track: Genie – Hadoop Platformed as a Service at Netflix (Netflix)

Congratulations to the selectees for each track, and a further honorable mention to Sears for winning the ‘Longest Session Title So Far’ which was a surprisingly hard fought contest!

The content selection committee will now be working hard to select the remaining sessions for the tracks, and we’ll cover those participants in more depth later.

With the Community Choice program complete we’re one step closer to a great event! Thanks again to everyone for taking part and stand by for more updates.

Week in Review: Sandboxes, HDP 2.0 Alpha 2, Hive Performance and Summits

It’s almost time for that final drive home of the week, and what a week it has been with a few new releases, a summit, and a little bit of technical fun. Here’s what happened:

New Sandbox Release. Yes, your favorite Hadoop VM image just got even better. Cheryle took us through the new features which included Ambari integration and Russell followed up with a quick tour of Ambari. There’s still plenty of time to download Sandbox for a weekend of data crunching fun.

HDP 2.0 Alpha 2 was released. This preview release demonstrates some of the performance improvements in store for the final HDP 2.0 release via YARN, enhancements to Hive per the Stinger Initiative, and Apache Tez. Just before the release, we posted some early test results which showed a 45X (yes, that’s forty five) performance improvement for Hive interactive queries. But that’s just the beginning as we push to 100X, and Microsoft also talked about their contributions to the Stinger Initiative with the same aim in mind.

If you’ve downloaded Sandbox and are looking for some inspiration for a little fun, then Russell also posted a two part series on extracting, loading, querying and analyzing your own Twitter archive with Hive. Part 1 is here, and Part 2 is here.

And finally, there was just the small matter of the Hadoop Summit in Amsterdam. We had a great time and hope you did too. Thank you for attending, contributing to the conversation and supporting Hadoop. If you’re now really excited to learn Hadoop, we posted about available training we have in Europe and Palo Alto.

And that was the week that was. Has your Sandbox downloaded yet?

Hadoop Summit 2013 Amsterdam – It’s A Wrap!

We want to take a moment to thank everyone who attended the Hadoop Summit in Amsterdam - THANK YOU! With nearly 500 people registered for the event, we think we can safely say it was a big success. We’ve had overwhelming support to do it again next year – so watch this space.

The awesome Beurs Van Berlage venue set us up for a series of fantastic conversations and really well-attended sessions and talks as Hadoop continues to explode onto the enterprise scene. Outside of the main tracks, there was great attendance for NLHUG and BoF talks, and kudos to the 10 presenters who ran those lightning talks. Finally, the customer panel was also well received, with great practical advice on adopting Hadoop from HSBC, Neustar and eBay.

But of course it wouldn’t be an event without a party, and we had a great time at the Heineken Experience (from what we can remember).  We put some photos on our Facebook page, but @timoelliott did a much better job than us with this fantastic set on Flickr. This one shows the awesome venue:

hadoop summit exhibition hall

So did you enjoy the summit?  Head over to Facebook  and let us know your favorite part and why: keynotes, tracks, lightning talks, the sandbox experience in the dev cafe, or the party.

And here is a tiny selection of some of the most recent Tweets closing out the show:

Hadoop Summit Tweet

Community voting for Hadoop Summit San Jose is just about complete (you still have a few hours to take part), and we are barely 3 months away from a whole bunch of new content and connections. We hope you join us there too!

Thanks again!

Seamless Reporting & Analytics for Apache Hadoop & Big Data Users

Jaspersoft, a Hortonworks certified technology partner, recently completed a survey on the early use of Apache Hadoop in the enterprise. The company found 38% of respondents require real-time or near real-time analytics for their Big Data with Hadoop. Also, within the enterprise, there is a diverse group of people who use Hadoop for such insights: 63% are application developers, 15% are BI report developers and 10% are BI admins or casual business users. Register for a free webinar to hear more.

So, for Hadoop users, the partnership between Hortonworks and Jaspersoft provides a good combination: Jaspersoft provides the ideal complement for reporting and analysis of Hadoop-based Big Data systems through a full suite of ETL, Apache Hive, and native Apache HBase connectors for low-latency data exploration. Not only does the company have an open source model that empowers users to deploy Big Data reporting and analytics quickly and cost-effectively, but its pre-defined reports also make it easy for a wide group of users to gain and share immediate insight.

Jaspersoft joined the Hortonworks Technology Partner Program in 2012, extending advanced reporting capabilities to Hadoop users. The Hortonworks Technology Partner Program is designed to assist ISVs and other solution providers to integrate and extend their solutions for Hadoop, and includes a variety of technical enablement, technical support and training offerings. According to Hortonworks’ CTO Eric Baldeschwieler, “Jaspersoft’s industry-leading reporting, analysis, and dashboard products, together with the Hortonworks Data Platform, make it easy and cost-effective for customers to derive maximum insights and value from their largest data stores.”

Choosing the right analytical approach

As easy as this sounds, there are still several approaches to analyzing and reporting on Big Data and numerous use cases: web analytics, fraud detection, security monitoring and healthcare, just to name a few. Choosing the right approach depends on what insights you need and why you need them, and can make all the difference in how much value you extract from your data.

An upcoming webinar hosted by Hortonworks and Jaspersoft on March 13 will delve into the various architectural choices used in Hadoop reporting and analytics, and several use cases will be discussed. Register now.

Getting Ready for The Elephant Party in Europe

We are just under two weeks away from the start of the first ever Hadoop Summit Europe, and with all of the final preparations being made we thought we would highlight some of the not-to-be-missed activities in and around the event. The event is filling fast but you can still register here.

Here are 10 great reasons to attend!

1)   Great track content – there are 35 informative sessions on Apache Hadoop and related technologies for you to choose from, selected by the community and delivered by the experts themselves.

2)   Great keynotes – leading industry analyst Matt Aslett will present the opening keynote and we will also hear from open source veteran Shaun Connolly as well as Hortonworks CTO Eric Baldeschwieler.

3)   Hadoop in the Enterprise expert panel – We will have a live panel discussion with industry leaders including eBay, HSBC and Neustar discussing how and why they use Apache Hadoop.

4)   Meetups – the NLHUG and other communities will be meeting around the event.

5)   Lightning talks – we’ve got rapid-fire content coming to you in the form of community-selected lightning talks. These 5-minute sessions will give you a taste of a wide range of technologies and initiatives.

6)   It’s Amsterdam – historic, edgy and fun!

7)   Ecosystem – The conference has the support of the broader Hadoop ecosystem so you can come and discuss Hadoop and big data in the solutions showcase.

8)   Community – The Apache Hadoop community is big and getting bigger. Come meet and mingle with other community members to learn about the latest goings on and make new connections.

9)   Get Hadoop certified – Calling all Hadoop Experts! We’re bringing certification to you! If you are ready, you can take the exam to become a Hortonworks Certified Apache Hadoop Developer (HCAHD) or a Hortonworks Certified Apache Hadoop Administrator (HCAHA).

10)   Get trained on Hadoop – we’ve got a host of classes available during the event to help you learn or sharpen your Hadoop skills. This includes a newly added Applying Data Science class. Check out the classes.

11)  BONUS reason – have a beer on us at the Hadoop Summit Party at the Heineken Experience, a cool venue at a historic location.

Register now, don’t miss the party, and we hope to see you there!

Putting the Elephant in the Window

For several years now Apache Hadoop has been fueling the fast-growing big data market and has become the de facto platform for Big Data deployments and the technology foundation for an explosion of new analytic applications. Many organizations turn to Hadoop to help tame the vast amounts of new data they are collecting, but in order to do so with Hadoop they have had to use servers running the Linux operating system. That left a large number of organizations who standardize on Windows (According to IDC, Windows Server owned 73 percent of the market in 2012 – IDC, Worldwide and Regional Server 2012–2016 Forecast, Doc # 234339, May 2012) without the ability to run Hadoop natively, until today.

We are very pleased to announce the availability of Hortonworks Data Platform for Windows, providing organizations with an enterprise-grade, production-tested platform for big data deployments on Windows. HDP is the first and only Hadoop-based platform available on both Windows and Linux and provides interoperability across Windows, Linux and Windows Azure. With this release we are enabling a massive expansion of the Hadoop ecosystem: new participants in the community of developers, data scientists, data management professionals and Hadoop fans can now build and run applications for Apache Hadoop natively on Windows. This is great news for Windows-focused enterprises, service providers, software vendors and developers, who can get going today with Hadoop simply by visiting our download page.

This release would not be possible without a strong partnership and close collaboration with Microsoft. Through the process of creating this release, we have remained true to our approach of community-driven enterprise Apache Hadoop by collecting enterprise requirements, developing them in open source and applying enterprise rigor to produce a 100-percent open source enterprise-grade Hadoop platform.

One of our goals at Hortonworks is to make Hadoop an enterprise-viable data platform that is available on as many platforms as possible. In fact, it is already available today in a range of deployment options including on-premise, virtual, cloud and appliance. Organizations looking to leverage Apache Hadoop now have even more deployment choices between Linux and Windows, giving them more freedom to meet their internal policies and standards. Microsoft Windows customers have complete portability of their Apache Hadoop applications between on-premise and cloud deployments, as Hortonworks Data Platform for Windows and HDInsight Service on Windows Azure are built on exactly the same code line.

If you are in the SF Bay Area this week, you can talk to us live about the power of the Hortonworks Data Platform for Windows at booth #316 at the Strata Conference, taking place February 26-28 at the Santa Clara Convention Center in Santa Clara, Calif.

We will also be conducting the webinar, “Unlocking the Other Half: Introduction to Hortonworks Data Platform for Windows,” on Tuesday, March 12 at 10 a.m. PST / 1 p.m. EST.

To register for the webinar, please visit http://info.hortonworks.com/Hortonworks_HDPonWindows_webcast.html.

Apache Pig 0.11 Released!

Apache Pig version 0.11 was released last week. An Apache Pig blog post summarized the release. New features include:

  • A DateTime datatype, documentation here.
  • A RANK function, documentation here.
  • A CUBE operator, documentation here.
  • Groovy UDFs, documentation here.
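
For readers who have not met a CUBE-style operator before, the general idea is to aggregate over every combination of the chosen dimensions, including the grand total. The sketch below illustrates that idea in plain Python rather than Pig Latin; the sample rows, the `cube_sum` helper and the use of `None` to mark a rolled-up dimension are our own assumptions for illustration, not Pig’s syntax or implementation.

```python
from collections import defaultdict
from itertools import combinations


def cube_sum(rows, dims, measure):
    """Sum `measure` for every subset of `dims`; None marks a rolled-up dimension."""
    totals = defaultdict(int)
    # Every subset of the dimensions, from none (grand total) to all of them
    subsets = [
        set(c) for r in range(len(dims) + 1) for c in combinations(dims, r)
    ]
    for row in rows:
        for kept in subsets:
            key = tuple(row[d] if d in kept else None for d in dims)
            totals[key] += row[measure]
    return dict(totals)


# Hypothetical sample data, invented for this illustration
sales = [
    {"region": "EMEA", "product": "hat", "units": 3},
    {"region": "EMEA", "product": "mug", "units": 5},
    {"region": "APAC", "product": "hat", "units": 2},
]

result = cube_sum(sales, ["region", "product"], "units")
for key in sorted(result, key=str):
    print(key, result[key])
```

With two dimensions, each row contributes to four groups (both dimensions, each one alone, and the grand total), which is why cubes grow quickly as dimensions are added.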

And many improvements. Oink it up for Pig 0.11! Hortonworks’ Daniel Dai gave a talk on Pig 0.11 at Strata NY, check it out:
