The Hortonworks Blog

Posts categorized by: Apache Hadoop

For those of you new to Apache ZooKeeper, it is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. To learn more about ZooKeeper, please visit the Apache ZooKeeper homepage.

As part of stabilizing the Apache ZooKeeper 3.4 branch, ZooKeeper 3.4.3 has just been released. It is a bug-fix release on the 3.4 branch that fixes 17 issues, one of which is critical and can cause data inconsistency (ZOOKEEPER-1367).…

In our previous blogs and webinars we have discussed the significant improvements and architectural changes coming to Apache Hadoop .Next (0.23). To recap, the major ones are:

  • Federation for Scaling HDFS – HDFS has undergone a transformation to separate Namespace management from the Block (storage) management to allow for significant scaling of the filesystem. In previous architectures, they were intertwined in the NameNode.
  • NextGen MapReduce (aka YARN) – MapReduce has undergone a complete overhaul in hadoop-0.23, including a fundamental change that splits the two major functions of the JobTracker, resource management and job scheduling/monitoring, into separate daemons.

Today we announced our plans to release a public preview of the Hortonworks Data Platform (HDP) version 2. HDP v2 will leverage Apache Hadoop 0.23, which is the first major update to Hadoop in more than three years. Among other advancements, HDP v2 will include the NextGen MapReduce architecture, HDFS NameNode HA and HDFS Federation. It will also include the most up-to-date stable components including HCatalog, HBase, Hive and Pig; all fully integrated and tested at scale.…

Congratulations! The Hadoop Community has given itself a big holiday present: Release 1.0.0! This release has been six years in the making, and has involved:

  • Hard work and cooperation from dozens of software developers and contributors from across the industry, including of course Doug Cutting and Mike Cafarella’s early work in Nutch and the founding Hadoop team at Yahoo, Doug, Owen O’Malley and many others, with leadership from Eric14.  Special thanks to all the Hadoop committers.

Motivation

Apache Hadoop provides a high-performance native protocol for accessing HDFS. While this is great for Hadoop applications running inside a Hadoop cluster, users often want to connect to HDFS from the outside. For example, some applications have to load data in and out of the cluster, or interact with the data stored in HDFS from the outside. Of course they can do this using the native HDFS protocol, but that means installing Hadoop and a Java binding with those applications.…

As the Release Manager, it’s my privilege to present Apache Hadoop 0.23:

Release: http://hadoop.apache.org/common/releases.html
Documentation: http://hadoop.apache.org/common/docs/r0.23.0/

I’ll present a short overview of the release in this post; more details are available in my recent talk on Apache Hadoop 0.23 at Hadoop World 2011.…

We’ve been looking for the elephant in the room for some time. We knew he was there, but we just couldn’t find him. It’s clear that he is now here and his name is Hortonworks. As such, we are very excited to announce today that Index Ventures has made an investment in Hortonworks.

The elephant toy – Hadoop – has become a household name in the Big Data sector these days and we’ve been tracking it for some time at Index.…

I spent some time last week at ApacheCon NA 2011 in Vancouver, BC. It was a good experience and I enjoyed catching up with friends and colleagues involved in the Hadoop project and also meeting some of the executives of the Apache Software Foundation in person. It is clear that the Apache community is thriving and that interest in Hadoop remains very high.

Hortonworks is committed to supporting Apache and we are pleased to have been a gold sponsor of this event. …

As the framework architects and developers of Apache Hadoop MapReduce, we are always looking for ways to simplify the complex tasks associated with large-scale processing of data. We want users and organizations to spend their time on analyzing their growing data to gain valuable insights, not on menial tasks such as massaging their data for consumption or tediously parsing complex structures in their data. The Informatica HParser technology is extremely valuable in this regard.…

I just spent a day at the Apache Lucene Eurocon conference in Barcelona. I gave a keynote presentation on how the Apache Lucene & Solr communities had a lot to gain from Apache Hadoop and how Hadoop could also gain from their contributions and technology. It was a good show and it was great to have a chance to meet the Lucid Imagination folks and others in the Apache search community.…

If, when we started building an Apache Hadoop team at Yahoo!, someone had told me that in the future we would partner with Microsoft to improve Hadoop’s performance on Windows, I would have found the prediction hard to believe. The first time a Microsoft executive suggested that they would like to work with us to improve Apache Hadoop, I told them I found their proposal “mind-bending”. I also told them that if we could do it the right way, I liked the idea.…

We are very excited to enter into a strategic relationship with Microsoft to help bring Apache Hadoop to Windows customers. We are equally pleased that Microsoft will also work closely with the Hadoop community and propose contributions back to the Apache Software Foundation and the Hadoop project.

Hortonworks will provide Microsoft with important Hadoop support and training that will help accelerate the delivery of Apache Hadoop for Windows Server and Windows Azure, including insight into feature roadmap and designs, feedback on code reviews and regression and acceptance testing.…

Several weeks ago, Hortonworks published a blog post that highlighted the tremendous contributions that Yahoo has made to Hadoop over the years. The point was two-fold: 1) to pay homage to our former employer, and 2) to clarify that Yahoo will continue to be a major contributor to Hadoop.

Earlier this week, Cloudera responded to our post, calling it a misleading story. While we generally don’t comment on another vendor’s blogs, even if they assert things that we find questionable, we felt we had to respond to this one.…

Oracle embraced Apache Hadoop this week with the announcement of the Oracle Big Data Appliance that includes an open source distribution of Apache Hadoop.

We welcome Oracle to the Apache Hadoop community and look forward to their participation in the growing Hadoop ecosystem. We hope that Oracle will commit to using the official releases of Hadoop from the Apache Foundation. We believe that such a commitment will allow their customers to extract the most possible value from their Hadoop Appliances and facilitate the rapid growth of the Hadoop ecosystem.…

I’m pleased to announce that we’ve become a sponsor of the Apache Software Foundation (ASF). The ASF has been fundamental to Apache Hadoop’s success and our team’s ability to meet our goals since our inception as the Yahoo! Hadoop team in 2006. This is why we convinced Yahoo! to become an Apache Platinum Sponsor back in 2007, which it remains to this day. Now that we are operating as an independent company and continuing to benefit from Apache’s support, we made it a priority to continue to sponsor Apache.…

