The Hortonworks Blog

Posts categorized by : Industry Happenings

We are very excited to announce the Alpha release of the Hortonworks Data Platform 2.0 (HDP 2.0 Alpha).

HDP 2.0 Alpha is built around Apache Hadoop 2.0, which improves availability of HDFS with High Availability for the NameNode along with several performance and reliability enhancements. Apache Hadoop 2.0 also significantly advances data processing in the Hadoop ecosystem with the introduction of YARN, a generic resource-management and application framework to support MapReduce and other paradigms such as real-time processing and graph processing.…

Back in June we joined Teradata Aster in a webcast “Back to the Future – MapReduce, Hadoop and the Data Scientist” to highlight the benefits of Apache Hadoop and the role that data scientists are playing in big data. You can check out the replay here. The discussion focused around how big data architectures could bring more value to businesses using relational DBMS technology and Hadoop, and how the two can coexist.…

Series Introduction

Packetloop CTO Michael Baker (@cloudjunky) made a big splash when he presented ‘Finding Needles in Haystacks (the Size of Countries)‘ at Blackhat Europe earlier this year. The paper outlines a toolkit based on Apache Pig, Packetpig @packetpig (available on github), for doing network security monitoring and intrusion detection analysis on full packet captures using Hadoop.

In this series of posts, we’re going to introduce Big Data Security and explore using Packetpig on real full packet captures to understand and analyze networks.…

We had another amazing turn out on our Ambari webinar with Matt Foley a couple of weeks back. This series was meant to educate Hadoop enthusiasts and help them gain better understanding of the value of Hadoop and I think we’re on the right track. If you missed or would like a refresher from our last two webinars (Pig and Ambari) you can find the recording here: https://hortonworks.com/webinars/

We’re starting the third installment of the “Future of Apache Hadoop” series next Wednesday on “Scaling Apache Zookeeper to the Next Generation Applications” with Mahadev Konar (@mahadevkonar) Hortonworks co-founder and core contributor and PMC member of the Apache Zookeeper.…

I spent some time at the first ever DataWeek in San Francisco last week.  It is a brand new show and it was very well-run, spread across a few cool spaces with an interesting mix of novice to experienced data professionals.  They had a good blend of labs, speakers, panels and great networking opportunities.  In all, it was great and a big thanks and kudos to the organizers.

I took part in a panel and also presented a three-hour overview of Hadoop. …

There will be a Pig meetup at Strata NYC/Hadoop World, at 6:30PM on Wed, Oct 24th in the Bryant Room of the Hilton New York. This will also be the inaugural meeting of the NYC Pig User Group, which Doug Daniels of Pig contributor Mortar Data was good enough to organize. We look forward to future Pig meetups in NYC!

Hortonworks’ own Daniel Dai @daijy, VP of Apache Pig, will present on new features in Pig 0.11.…

Hortonworks is hosting an Apache YARN Meetup on Friday, Oct 12, to solicit feedback on the YARN APIs. We’ve talked about YARN before in a four-part series on YARN, parts one, two, three and four.

YARN, or “Apache Hadoop NextGen MapReduce,” has come a long way this year. It is now a full-fledged sub-project of Apache Hadoop and has already been deployed on a massive 2,000 node cluster at Yahoo.…

Alan Gates presented HCatalog to the Chicago Hadoop User Group (CHUG) on 9/17/12. There was a great turnout, and the strength of CHUG is evidence that Chicago is a Hadoop city. Below are some kind words from the host, Mark Slusar.

On 9/17/12, the Chicago Hadoop User Group (CHUG) was delighted to host Hortonworks Co-Founder Alan Gates to give an overview of HCatalog. In addition to downtown Chicago meetups, Allstate Insurance Company in Northbrook, IL hosts regular Chicago Hadoop User Group Meetups.…

InfoQ has an article out today on HCatalog by Hortonworks’ own Alan Gates and Russell Jurney.

Apache Hadoop enables a revolution in how organization’s process data, with the freedom and scale Hadoop provides enabling new kinds of applications building new kinds of value and delivering results from big data on shorter timelines than ever before. The shift towards a Hadoop-centric mode of data processing in the enterprise has however posed a challenge: how do we collaborate in the context of the freedom that Hadoop provides us?…

As the Hadoop ecosystem has exploded into many projects, searching for the right answers when questions arise can be a challenge. Thats why I was thrilled to hear about search-hadoop.com, from Sematext. It has a sister site called search-lucene where you can… search lucene!

Search-Hadoop.com searches across projects – JIRAs, source code, mailing lists, wikis, etc. so you can see design and API docs, as well as questions, answers and general documentation.…

Hadoop featured prominently at Stanford’s annual XLDB conference last week, as representatives from academia and industry gathered to discuss Extremely Large Databases. The conference program, with slides are available: http://www-conf.slac.stanford.edu/xldb2012/ProgramC.asp. A highly technical lineup presented on Big Data in biology and physics, and cloud computing and Hive in particular were topic areas.

Hortonworks’ own Ashutosh Chauhan @ashutoshchauhan, an Apache Pig, Hive and HCatalog committer, presented ‘Hive vs Pig: Similarities and Differences‘ (slides).…

Partner Webinar Series

On September 18 at 10am PT/1pm ET we join our partner Datameer in a webcast aimed at providing answers to some common questions we hear in the industry. Specifically, what are some of the use cases that big data analytics is perfect for?

By looking at some common uses we are seeing, you’ll be able to envision how you can leverage the analytics results from your own data.…

Partner Webinar Series

Hortonworks boasts a rich and vibrant ecosystem of partners representing a huge array of solutions that leverage Hadoop, and specifically Hortonworks Data Platform, to provide big data insights for customers. The goal of our Partner Webinar Series is to help communicate the value and benefit of our partners’ solutions and how they connect and use Hortonworks Data Platform.

Look to the Clouds

Setting up a big data cluster can be difficult, especially considering the assembly of all the all the equipment, power, and space to make it happen.…

Twitter Analytics presented their distributed infrastructure, including Hadoop and Pig, at a UC Berkeley iSchool special course called INFO 290: Analyzing Big Data with Twitter. Twitter is a major contributor to many Apache projects. The course was over-subscribed and was a great success, as students got to learn from practicing data scientists using Hadoop on truly massive datasets. The entire lecture series is available here.

Bill Graham @billgraham, a Data Systems Engineer at Twitter Analytics and Apache Pig committer, presented an Introduction to Hadoop.…

The August Pig Hackathon brought Pig users from Hortonworks, Yahoo, Cloudera, Visa, Kaiser Permanente, and LinkedIn to Hortonworks HQ in Sunnyvale, CA to talk and work on Apache Pig.

Jonathan Coveney and Bill Graham from Twitter walked newer Pig users through how Pig translates a Pig Latin script to map reduce jobs and went over how to read the output of explain. For those interested, Hortonworks founder Alan Gates covers this in Chapter 1 of Programming Pig.…

Go to page:« First...910111213...Last »