The Hortonworks Blog

It was 10 years ago today (Feb 2) that my first patch (https://issues.apache.org/jira/browse/NUTCH-197) went into the code that two days later became Hadoop (https://issues.apache.org/jira/browse/HADOOP-1).

I had been working on Yahoo Search’s WebMap, which was the back end that analyzed the web for the search engine. We had been working on a C++ implementation of GFS and MapReduce, but after hiring Doug Cutting we decided that it would be easier to get Yahoo’s permission to contribute to code that was already open source rather than open source our C++ project.…

Hello everyone and welcome to the start of my blogging adventure. I’m Mike Schiebel, Cybersecurity Strategist at Hortonworks, where I focus on bringing enterprise-level security features into the Hadoop ecosystem and providing input into the Apache Metron open source project. I figured introductions are in order, to explain the where and why behind my blog series.

Who am I?

I’ve taken a long and twisting road before ending up at Hortonworks.…

Santa will be busy this year. On December 24th he’s scheduled to deliver presents to billions of children globally. Buddy and the Keeblers will be working overtime to meet the demand, and Santa has called in temp work from Legolas and Dobby.

There’s little doubt that Santa is a master of lean manufacturing, but there’s only so much muda you can cut from the factory floor. After all, his supply chain has been perfected over decades and his workforce is loyal and perfectly aligned with the mission.…

In September, Hortonworks partnered with ManTech and B23 to foster a vibrant open community to accelerate the development of OpenSOC. In December we additionally partnered with Rackspace Managed Security and submitted OpenSOC to the Apache Incubator as a podling under the name of Apache Metron. A decision to rename the project was made to represent the new direction and the new community. Now the process of graduating Metron to a top-level project (TLP) has begun.…

An interesting and atypical thing is happening in Healthcare. Leading data-driven organizations are not simply looking to share their Hadoop experiences, successes, use cases, and best practices … but more than ever before, they are embracing the opportunity to share their experiences outside their organizations, in a style that resembles the open source community on which Hadoop was built.

It all started at Hadoop Summit on June 10th 2015 when a simple breakfast meeting was organized to showcase the experiences of a couple of healthcare’s earliest adopters of Hadoop.…

Deploying a lock-in-free data platform is critical for an enterprise. By this, we mean using non-proprietary code and implementing interoperability to eliminate the risk of being dependent on a single vendor for your current or future needs.

Over two-thirds of respondents to our survey agree that maintaining freedom of choice was a key criterion when it came to selecting the Hortonworks Data Platform. (Source: TechValidate TVID 4A8-731-250.)

They didn’t want to be limited to what one vendor can offer – they wanted to have platform portability, industry-wide standards and choices on third party application support from a broader ecosystem.…

As the US nears its holiday of giving thanks, Hortonworks is reflecting on open and all the bounty it brings.

Hortonworks’ philosophy has always been predicated on an open approach.

Open innovation, open community, open development, open delivery…a fully open source business.

Why open? Because real innovation happens not in isolation but in collaboration where the best minds within the community are working towards a common goal.

It’s not always easy when we are pushing the proverbial open source boulder up the proprietary hill, but with every donation to the Apache Software Foundation (ASF), the hill gets easier to climb and we get closer to constant and nimble advancements for customers and the market as a whole.…

“It’s all about Hortonworks company vision, 100% open source and enterprise support.” Source: TechValidate TVID 8A7-EFF-21C

Hortonworks’ customer experience survey shows that our community innovation strategy is validated by our customers: more than two-thirds of those who responded to the survey said they value community innovation.

Hortonworks has been dedicated to 100% open community development since the very beginning because this strategy maximizes the value we can bring to our customers.…

Earning the prestigious Teradata EPIC award is no easy feat. Partners who would like to have a shot at winning the top recognition need to demonstrate how their solution provides a unified, high-performance big data analytics system for an enterprise and show measurable return on investment. After receiving Teradata’s EPIC award recognition for Big Data Intelligence in 2013 and 2014, Hortonworks, yet again, has been recognized as the leader by winning this award for the third year in a row.…

Since our founding in mid-2011, our vision for Hadoop has been that “half the world’s data will be processed by Hadoop”. With that long-term vision in mind, we focus on the mission to establish Hadoop as the foundational technology of the modern enterprise data architecture that unlocks a whole new class of data-driven applications that weren’t previously possible.

We use what we call the “Blueprint for Enterprise Hadoop” for guiding how we invest in Hadoop-related open source technologies as well as enabling the key integration points that are important for deploying Enterprise Hadoop within a modern data architecture, on-premises or in the cloud, in a way that enables the business and its users to maximize the value from their data.…

This article originally appeared at Opensource.com and is reproduced here.

Apache Hadoop and related Apache Software Foundation projects have a rapidly growing feature set, high commit rates, and code contributions coming in from across the globe. However, the number of women developers, committers, and Project Management Committee (PMC) members in this vast and diversified ecosystem is very small. For the Hadoop project alone, only 5% of 84 committers are women, and this has been the case for more than 2 years.…

I’d like to take a quick moment to welcome Julian Hyde as the latest addition to the Hortonworks engineering team. Julian has a long history of working on data platforms, including development of SQL engines at Oracle, Broadbase, and SQLstream. He was also the architect and primary developer of the Mondrian OLAP engine, part of the Pentaho BI suite.

Julian’s latest role has been as the author and architect of the Optiq project – an Apache-licensed open source framework.…

We’re continuing our series of quick interviews with Apache Hadoop project committers at Hortonworks.

This week Mahadev Konar discusses Apache Ambari, the open source Apache project to simplify management of a Hadoop cluster.

Mahadev was on the team at Yahoo! in 2006 that started developing what became Apache Hadoop. Since then, he has also held leadership positions in the Apache ZooKeeper and Apache Ambari projects. He is an architect and project management committee member for Apache Ambari, Apache ZooKeeper and Apache Hadoop.…

We’re continuing our series of quick interviews with Apache Hadoop project committers at Hortonworks.

This week – as Hadoop 2 goes GA – Arun Murthy discusses his journey with Hadoop. The journey has taken Arun from developing Hadoop, to founding Hortonworks, to this week’s release of Hadoop 2, with its YARN-based architecture.

Arun describes the difference between MapReduce and YARN, and how they are related in Hadoop 2 (and by extension in Hortonworks Data Platform v2).…

We’re continuing our series of quick interviews with Apache Hadoop project committers at Hortonworks.

This week Mahadev Konar discusses Apache ZooKeeper, the open source Apache project that is used to coordinate various processes on a Hadoop cluster (such as electing a leader between two processes).

Mahadev was on the team at Yahoo! in 2006 that started developing what became Apache Hadoop. He has been involved with Apache ZooKeeper since 2008, when the project was open sourced.…