The Hortonworks Blog

A Beginners Guide to Becoming an Apache Contributor. Venkatesh Sellappa, Teradata. My name is Venkatesh Sellappa. My background is primarily application of analytics in the Big Data Space, before either of them was called that. We used to just call it programming. My session is an account of my personal journey into the often contentious […]

Recently, Apache Spark set the world of Big Data on fire. With a promise of amazing performance and comfortable APIs, some thought that Spark was bound to replace Hadoop MapReduce. Or is it? On closer inspection, Spark appears rather to be a natural complement to Apache Hadoop YARN, the architectural center of Hadoop… Hadoop is already transforming […]

Advanced Execution Visualization of Spark Jobs. Authors: Zoltán Zvara, Márton Balassi, András Garzó, Hungarian Academy of Sciences, in collaboration with Ericsson. Understanding the physical plan of a big data application is often crucial for tracking down bottlenecks and faulty behavior. Apache Spark, although offering a useful Web UI component for monitoring and understanding the logical plan […]

Hello everyone, and welcome to the start of my blogging adventure. I’m Mike Schiebel, Cybersecurity Strategist at Hortonworks, where I focus on bringing enterprise-level security features into the Hadoop ecosystem and provide input into the Apache Metron open source project. I figured introductions are in order, to explain the where and why […]

A couple of months ago I joined Hortonworks. There was an undeniable pull to go into the fire of crazy fast innovation and growth. About four seconds in, I realized there was so much more than just the pace of execution and growth but rather a bigger opportunity to be a part of something game-changing. […]

Today we proudly announced that Arkena, one of Europe’s leading media services companies, is using Hortonworks Data Platform (HDP™) to provide its media customers with an advanced analytics platform to deliver content to OTT customers through its content delivery network (CDN). This is a guest post from Reda Benzair the Vice President of Technical Development […]

We take pride in producing valuable technical blogs and sharing them with a wider audience. Of all the blogs published in 2015 on our website, the following were most popular: Take a look at 5 techniques enabling Hive to support both batch and interactive workloads at speed and scale. 5 Ways to Make Your Hive Queries […]

HDFS is a core part of any Hadoop deployment, and to ensure that data is protected in the Hadoop platform, security needs to be baked into the HDFS layer. HDFS is protected using Kerberos authentication, and authorization using POSIX-style permissions/HDFS ACLs or using Apache Ranger. Apache Ranger (http://hortonworks.com/hadoop/ranger/) is a centralized security administration solution […]

FireKing offers best-in-class security products for asset protection in retail, commercial, and home office environments. With Hortonworks Data Platform (HDP®), FireKing now analyzes its operational data to accurately measure the productivity of service technicians, allowing the field service organization to deliver improved customer service while also reducing the costs of servicing safes, cash-management systems, and […]

Santa will be busy this year. On December 24th he’s scheduled to deliver presents to billions of children globally. Buddy and the Keeblers will be working overtime to meet the demand, and Santa has called in temp work from Legolas and Dobby. There’s little doubt that Santa is a master of lean manufacturing, but there’s […]

We are pleased to announce that the 2nd release of Hortonworks DataFlow is now available. Hortonworks DataFlow is a data-source agnostic, real time data collection and dataflow management platform designed to meet the practical challenges of collecting and moving data securely and efficiently. HDF 1.1 builds on the strength of the initial GA version of […]

In September, Hortonworks partnered with ManTech and B23 to foster a vibrant open community to accelerate the development of OpenSOC. In December we additionally partnered with Rackspace Managed Security and submitted OpenSOC to the Apache Incubator as a podling under the name of Apache Metron. A decision to rename the project was made to represent […]

An interesting and atypical thing is happening in Healthcare. Leading data driven organizations are not simply looking to share their Hadoop experiences, successes, use cases, and best practices … but more than ever before, they are embracing the opportunity to share their experiences outside their organizations, in a style that resembles the open source community […]

It’s our pleasure to host Ryan Peterson, Chief Solution Strategist at EMC, as a guest blogger to expand upon another great step in our partnership to deliver compelling customer solutions through joint engineering efforts. Follow Ryan @BigDataRyan. Object storage isn’t a new concept and EMC’s been innovating around it since the beginning. Take our Centera […]

Posted in partnership with Paige Schaefer, Product Marketing at Trifacta. The insurance industry is wrestling with the tremendous growth of data sources at its disposal. Traditional ETL processes are expensive, time-consuming, and complicated by the variety of data structures and formats. In contrast, Hadoop platforms provide a clean, safe, and manageable format for data wrangling, the […]