Hadoop Ecosystem

Industry news, partner stories, buzz and happenings

We just released the second video in the Hortonworks Executive Series. This one features Matt Foley, Test and Release Engineering Manager for Hortonworks.

In this video, Matt provides an overview of Hortonworks Data Platform (HDP), including a summary of the Apache Hadoop components in the distribution and the testing that goes into each release. Matt also provides an overview of Apache Ambari, an open source project that is adding monitoring and management capabilities to Apache Hadoop.…

We are pleased to support today’s announcement from Citrix that they have contributed CloudStack to the Apache community. For those new to CloudStack, it is open source cloud computing software that helps organizations build and manage cloud infrastructures. It is similar to the Amazon Web Services EC2 environment, except that it enables organizations to build public, private or hybrid cloud environments using their own pooled computing resources.

Citrix announced today that they were reaffirming their commitment to open source by working with the Apache Software Foundation to make CloudStack 3 an Apache project, released under the Apache License 2.0.…

I’m pleased to announce the first in a series of videos featuring Hortonworks founders and executives sharing their thoughts on how Apache Hadoop is being extended to become the next generation enterprise data platform. Over the coming weeks and months, you will be hearing from folks such as Matt Foley, Arun Murthy, Sanjay Radia and Alan Gates, just to name a few.

The first video features Shaun Connolly, Hortonworks VP of Corporate Strategy, talking about the Hortonworks vision for Apache Hadoop.…

As I first mentioned when we announced Hadoop Summit 2012, we are focused on making Hadoop Summit the preeminent conference for the Apache Hadoop community. Today I’m pleased to tell you about Community Choice, a public online voting system that enables the entire Apache Hadoop community to have a say in the sessions chosen for Hadoop Summit. Anybody can vote, and the top vote-getters in each track will automatically be included in the Hadoop Summit agenda.…

Today we announced an important strategic partnership with Talend, provider of the world’s most popular open source data integration platform. This is another win for both Hortonworks customers and the larger Apache Hadoop community. There were two key aspects of the announcement that I wanted to highlight:

Talend releases Talend Open Studio for Big Data

Based upon Talend’s very popular open source data integration platform, Talend Open Studio for Big Data adds connectors for HDFS, HBase, Pig, Sqoop and Hive.…

Today we announced that we were delivering on our earlier promise to help Microsoft bring Apache Hadoop to Windows. I’m pleased to share that Microsoft, with our collaboration and guidance, has now submitted a series of patches to Apache aimed at overcoming the challenges of running Apache Hadoop in Windows Server environments.

These patches, once vetted and approved by the community, will become part of the core Hadoop code base. They will also become available in the two major Apache Hadoop branches: hadoop-1.0 (the current stable branch, which is available as part of Hortonworks Data Platform v1.0) and hadoop-0.23 (the next generation of Apache Hadoop, which will be available as part of Hortonworks Data Platform v2.0).…

Hortonworks and Teradata announced a strategic relationship today that includes joint go-to-market and development work to more closely integrate Hortonworks Data Platform with the Teradata Analytical Ecosystem. I wanted to take the opportunity to highlight this important partnership and share my thoughts on why this is an important milestone for Hortonworks and the larger Apache Hadoop community.

As somebody who has been heavily involved in the development of Apache Hadoop for six years and counting, it’s personally exciting to see Hadoop entering a new phase of adoption.…

Hadoop RPC is the primary communication mechanism between the nodes in an Apache Hadoop cluster. Maintaining wire compatibility, as new features are added to Apache Hadoop, has been a significant challenge with the current RPC architecture. In this blog, I highlight the architectural improvement in Hadoop RPC and how it enables wire compatibility and rolling upgrades.

Challenges for Wire Compatibility

Earlier, Hadoop RPC used Writable serialization, which made it difficult to evolve the protocols while maintaining wire compatibility.…
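To make the wire compatibility problem concrete, here is a minimal sketch of why a fixed, positional byte layout is hard to evolve, and how tagging each field lets an older reader skip what it does not understand. This is a toy illustration in Python, not Hadoop's Writable API or its RPC code; the record fields (`block_id`, `generation`, `length`) and the tag numbers are hypothetical names chosen for the example.

```python
import struct

# Positional encoding: fields are written back to back in a fixed order,
# so reader and writer must agree on the exact layout.
def write_v1(block_id, length):
    return struct.pack(">qi", block_id, length)

def write_v2(block_id, generation, length):
    # A newer writer adds a field in the middle of the record.
    return struct.pack(">qqi", block_id, generation, length)

def read_v1(data):
    # An older reader still assumes the v1 layout and misreads v2 bytes.
    return struct.unpack_from(">qi", data)

# Tagged encoding: every field is written as (1-byte tag, 8-byte value).
def write_tagged(fields):
    out = bytearray()
    for tag, value in fields.items():
        out += struct.pack(">Bq", tag, value)
    return bytes(out)

def read_tagged(data, known_tags):
    # Fields with unknown tags are skipped instead of being misread.
    result = {}
    for offset in range(0, len(data), 9):
        tag, value = struct.unpack_from(">Bq", data, offset)
        if tag in known_tags:
            result[tag] = value
    return result

# Hypothetical tag numbers: 1 = block_id, 2 = generation, 3 = length.
old_reader_tags = {1, 3}
positional = read_v1(write_v2(42, 7, 1024))
tagged = read_tagged(write_tagged({1: 42, 2: 7, 3: 1024}), old_reader_tags)
```

With the positional layout, the old reader silently takes part of the new field as the length. With tagged fields, it recovers the two values it knows about and ignores the rest, which is the property a rolling upgrade needs when old and new nodes talk to each other.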

In our previous blogs and webinars we have discussed the significant improvements and architectural changes coming to Apache Hadoop .Next (0.23). To recap, the major ones are:

  • Federation for Scaling HDFS – HDFS has undergone a transformation to separate Namespace management from the Block (storage) management to allow for significant scaling of the filesystem. In previous architectures, they were intertwined in the NameNode.
  • NextGen MapReduce (aka YARN) – MapReduce has undergone a complete overhaul in hadoop-0.23, including a fundamental change that splits the two major functions of the JobTracker, resource management and job scheduling/monitoring, into separate daemons.

Today we announced our plans to release a public preview of the Hortonworks Data Platform (HDP) version 2. HDP v2 will leverage Apache Hadoop 0.23, which is the first major update to Hadoop in more than three years. Among other advancements, HDP v2 will include the NextGen MapReduce architecture, HDFS NameNode HA and HDFS Federation. It will also include the most up-to-date stable components, including HCatalog, HBase, Hive and Pig, all fully integrated and tested at scale.…

We’ve been looking for the elephant in the room for some time. We knew he was there, but we just couldn’t find him. It’s clear that he is now here and his name is Hortonworks. As such, we are very excited to announce today that Index Ventures has made an investment in Hortonworks.

The elephant toy – Hadoop – has become a household name in the Big Data sector these days and we’ve been tracking it for some time at Index.…

I spent some time last week at ApacheCon NA 2011 in Vancouver, BC. It was a good experience and I enjoyed catching up with friends and colleagues involved in the Hadoop project and also meeting some of the executives of the Apache Software Foundation in person. It is clear that the Apache community is thriving and that interest in Hadoop remains very high.

Hortonworks is committed to supporting Apache and we are pleased to have been a gold sponsor of this event. …

As the framework architects and developers of Apache Hadoop MapReduce, we are always looking for ways to simplify the complex tasks associated with large-scale processing of data. We want users and organizations to spend their time on analyzing their growing data to gain valuable insights, not on menial tasks such as massaging their data for consumption or tediously parsing complex structures in their data. The Informatica HParser technology is extremely valuable in this regard.…

I just spent a day at the Apache Lucene Eurocon conference in Barcelona. I gave a keynote presentation on how the Apache Lucene & Solr communities had a lot to gain from Apache Hadoop and how Hadoop could also gain from their contributions and technology. It was a good show and it was great to have a chance to meet the Lucid Imagination folks and others in the Apache search community.…

If, when we started building an Apache Hadoop team at Yahoo!, someone had told me that in the future we would partner with Microsoft to improve Hadoop’s performance on Windows, I would have found the prediction hard to believe. The first time a Microsoft executive suggested that they would like to work with us to improve Apache Hadoop, I told them I found their proposal “mind-bending”. I also told them that if we could do it the right way, I liked the idea.…
