The Hortonworks Blog

A couple of months ago I joined Hortonworks. There was an undeniable pull to go into the fire of crazy-fast innovation and growth. About four seconds in, I realized there was much more to it than the pace of execution and growth: a bigger opportunity to be part of something game-changing. The opportunity to help blaze a trail in the world of data. The opportunity to offer a unique value proposition of truly 100% open technology.…

Today we proudly announced that Arkena, one of Europe’s leading media services companies, is using Hortonworks Data Platform (HDP™) to provide its media customers with an advanced analytics platform to deliver content to OTT customers through its content delivery network (CDN). This is a guest post from Reda Benzair, the Vice President of Technical Development at Arkena. You can also join Arkena and Hortonworks on February 16th for a live and on-demand webinar about their Advanced Analytics Platform.…

We take pride in producing valuable technical blogs and sharing them with a wider audience. Of all the blogs published in 2015 on our website, the following were most popular:

  • Learn how Zeppelin, Spark SQL and MLlib can be combined to simplify exploratory data science.

HDFS is a core part of any Hadoop deployment, and to ensure that data is protected in the Hadoop platform, security needs to be baked into the HDFS layer. HDFS is protected using Kerberos authentication, with authorization handled either by POSIX-style permissions and HDFS ACLs or by Apache Ranger.

Apache Ranger (http://hortonworks.com/hadoop/ranger/) is a centralized security administration solution for Hadoop that enables administrators to create and enforce security policies for HDFS and other Hadoop platform components.…

FireKing offers best-in-class security products for asset protection in retail, commercial, and home office environments.

With Hortonworks Data Platform (HDP®), FireKing now analyzes its operational data to accurately measure the productivity of service technicians, allowing the field service organization to deliver improved customer service while also reducing the costs of servicing safes, cash-management systems, and commercial locking hardware.

Read the Complete FireKing Case Study

Field Service Challenges Before HDP

FireKing employs over 80 field technicians spread across the continental US, performing nearly 50,000 service jobs per year.…

Santa will be busy this year. On December 24th he’s scheduled to deliver presents to billions of children globally. Buddy and the Keeblers will be working overtime to meet the demand, and Santa has called in temp work from Legolas and Dobby.

There’s little doubt that Santa is a master of lean manufacturing, but there’s only so much muda you can cut from the factory floor. After all, his supply chain has been perfected over decades and his workforce is loyal and perfectly aligned with the mission.…

We are pleased to announce that the second release of Hortonworks DataFlow is now available. Hortonworks DataFlow is a data-source-agnostic, real-time data collection and dataflow management platform designed to meet the practical challenges of collecting and moving data securely and efficiently.

HDF 1.1 builds on the strength of the initial GA version of HDF released in September 2015. HDF 1.1 supports additional security models, improves the user experience, and increases user options for accessing and delivering data.…

Hadoop Summit – Dublin takes place 13-14 April 2016: http://www.hadoopsummit.org/dublin

Unlike other conferences, Hadoop Summit is driven for the community by the community, and this year’s speaker submissions have been open for public viewing at http://hadoopsummit.uservoice.com/. The top vote-getting sessions are automatically selected for the conference. The competition was strong, the content was amazing and with over 13,000 votes tallied, we are happy to announce that the results are in!…

In September, Hortonworks partnered with ManTech and B23 to foster a vibrant open community to accelerate the development of OpenSOC. In December we additionally partnered with Rackspace Managed Security and submitted OpenSOC to the Apache Incubator as a podling under the name of Apache Metron. The project was renamed to represent the new direction and the new community. Now the process of graduating Metron to a top-level project (TLP) has begun.…

An interesting and atypical thing is happening in healthcare. Leading data-driven organizations are not simply looking to share their Hadoop experiences, successes, use cases, and best practices … but more than ever before, they are embracing the opportunity to share their experiences outside their organizations, in a style that resembles the open source community on which Hadoop was built.

It all started at Hadoop Summit on June 10th, 2015, when a simple breakfast meeting was organized to showcase the experiences of a couple of healthcare’s earliest adopters of Hadoop.…

It’s our pleasure to host Ryan Peterson, Chief Solution Strategist at EMC, as a guest blogger to expand upon another great step in our partnership to deliver compelling customer solutions through joint engineering efforts. Follow Ryan @BigDataRyan.

Object storage isn’t a new concept and EMC’s been innovating around it since the beginning. Take our Centera and Atmos products as key examples. The first Centera was created around the idea that objects could store much higher quantities of data than a file system in a single store. The other aspect of Centera was a rich set of security and compliance features that file systems had not been able to achieve.…

Posted in partnership with Paige Schaefer, Product Marketing at Trifacta.

The insurance industry is wrestling with the tremendous growth of data sources at its disposal. Traditional ETL processes are expensive, time-consuming, and complicated by the variety of data structures and formats. In contrast, Hadoop platforms provide a clean, safe, and manageable format for data wrangling, the critical first step of the data analysis process.

Forward-thinking insurance companies have embraced data wrangling as more than janitorial work.…

Deploying a lock-in-free data platform is critical for an enterprise. By this, we mean using non-proprietary code and implementing interoperability to eliminate the risk of being dependent on a single vendor for your current or future needs.

Over two-thirds of respondents to our survey agreed that maintaining freedom of choice was a key criterion when it came to selecting the Hortonworks Data Platform. (Source: TechValidate TVID 4A8-731-250.)

They didn’t want to be limited to what one vendor can offer – they wanted to have platform portability, industry-wide standards and choices on third-party application support from a broader ecosystem.…

We are in the midst of the third industrial revolution, driven by IoT and Big Data analytics. This is a fundamental blurring of boundaries between the physical and digital worlds, which has resulted in disruptive new business models.

Register now for the webinar on Thursday, December 10th, at 11:00 am PST, with guest speakers Frank Gillett from Forrester Research, Grant Bodley from Hortonworks, Sriram Jayaraman from HARMAN, and Darrell Swope from Landis+Gyr as they look into how IoT and analytics address the opportunities and challenges in the digital economy.…

Customers who are dealing with massive growth of large unstructured data sources need to evaluate their storage architecture for scalability, flexibility and ease of management. In a recent study performed by the Enterprise Strategy Group, IT decision makers were asked to identify their biggest storage challenges. In addition to the rapid rate of data growth, they also pointed to hardware costs, data protection and staffing costs. Object-based storage solutions aim to address a number of these challenges, and EMC’s answer comes with Elastic Cloud Storage (ECS).…