The Hortonworks Blog

Earning the prestigious Teradata EPIC award is no easy feat. Partners who would like to have a shot at winning the top recognition need to demonstrate how their solution provides a unified, high-performance big data analytics system for an enterprise and show measurable return on investment. After receiving Teradata’s EPIC award recognition for Big Data Intelligence in 2013 and 2014, Hortonworks, yet again, has been recognized as the leader by winning this award for the third year in a row.…

Apache Spark’s momentum continues to grow and throughout 2015 we saw customers across all industries get real value from using it with the Hortonworks Data Platform (HDP). Examples include:

Insurance: Optimize the claims reimbursement process by using Spark’s machine learning capabilities to process and analyze all claims.
Healthcare: Build a Patient Care System using Spark Core, Streaming and SQL.
Retail: Use Spark to analyze point-of-sale data and coupon usage.
Internet: Use Spark’s ML capability to identify fake profiles and improve the product matches shown to customers.…

When you count on your Hadoop environment to power business-critical applications, you can’t afford to let problems get in the way of performance. By getting ahead of issues before they lead to cluster degradation or downtime, you can deliver the Big Data insights your business relies on with the speed and reliability competitive markets demand. That’s the thinking that led Hortonworks to fundamentally change our support model with the introduction of Hortonworks SmartSense®, a collection of tools and services that’s quickly becoming part of the standard operating procedure of many of our customers—with impressive results.…

This post was co-authored by Eric Thorsen, Hortonworks GM for CPG and Retail, and Jeff Seibert, Analytics Strategist at Blue Granite.

More and more, consumers interact with their retailers through mobile devices. Because communication is easier, consumers tend to assume that customer service is similarly responsive. When it is not, they are more likely to become frustrated and abandon a shopping cart, whether online or in the store.

Retailers, Consumer Products Companies, and Quick Serve Restaurants (QSRs) are racing to bridge this gap between consumer expectations and their ability to provide quality service at the speed of mobile.…

We are very excited to announce that Grant Bodley will speak about Big Data, the Internet of Anything (IoAT) and the Connected Car at this year’s West Coast Automotive Data Event on Wednesday, October 27th, in San Diego.

Telematics West Coast is the premier event for exploring and developing the new ideas and business practices that Big Data is opening up. Hortonworks is thrilled to be a Gold Sponsor at the event.

Join Us at Telematics West Coast

Attend Grant’s session at 10am, “The Information Superhighway for Automotive Transformation,” to learn more about Big Data, the Internet of Anything (IoAT) and how the Connected Car has created a new Information Superhighway that fundamentally changes the relationship between automakers and car buyers.…

We are excited to announce that Arun Murthy will be one of the keynote speakers at Spark Summit Europe 2015 next week in Amsterdam, where he will talk about how to simplify Data Science with Spark and Hadoop.

Spark Summit is the premier event that brings the Apache Spark community together. This will be the first Spark Summit held in Europe, and we at Hortonworks are thrilled to serve as Gold Sponsors on its maiden voyage.…

Hackathons, Hackfests, and Codefests have an initial air of invincibility. They challenge participants, even veterans. But when the attendees work together and the community collaborates and innovates, that air of invincibility quickly dissipates.

Last Saturday, because of such camaraderie and collaboration, a harmony of innovative ideas flourished and came to fruition at an Ambari Hackfest.

Open Data Platform Initiative (ODPi) founding partners Hortonworks and Pivotal co-hosted and co-sponsored an Ambari Hackfest at the Pivotal site near the scenic Foothills in Palo Alto.…

Geospatial data is pervasive—in mobile devices, sensors, logs, and wearables. This data’s spatial context is an important variable in many predictive analytics applications.

To benefit from spatial context in a predictive analytics application, we need to be able to parse geospatial datasets at scale, join them with target datasets that contain point in space information, and answer geometrical queries efficiently.

Unfortunately, if you are working with geospatial data and big data sets that need spatial context, there are few open source tools that make it easy for you to parse and efficiently query spatial datasets at scale.…
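To make the excerpt's "join target points with geospatial datasets and answer geometrical queries" idea concrete, here is a minimal sketch in plain Python. It is not the tool discussed in the post: the function names (point_in_polygon, spatial_join), the region names, and the coordinates are all hypothetical, and the brute-force loop assumes simple, non-self-intersecting polygons that fit in memory.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]   # (x, y), e.g. (longitude, latitude)
Polygon = List[Point]         # vertices in order; first vertex is not repeated


def point_in_polygon(point: Point, polygon: Polygon) -> bool:
    """Ray-casting test: count the edges crossed by a horizontal ray from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which this edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_join(
    points: Dict[str, Point], regions: Dict[str, Polygon]
) -> Dict[str, Optional[str]]:
    """Tag each point with the first region that contains it, or None if no region does."""
    joined: Dict[str, Optional[str]] = {}
    for point_id, pt in points.items():
        joined[point_id] = next(
            (name for name, poly in regions.items() if point_in_polygon(pt, poly)),
            None,
        )
    return joined


# Hypothetical data: two regions and three points to locate.
regions = {
    "zone_a": [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)],
    "zone_b": [(5.0, 0.0), (9.0, 0.0), (7.0, 3.0)],
}
points = {"p1": (1.0, 1.0), "p2": (7.0, 1.0), "p3": (10.0, 10.0)}
joined = spatial_join(points, regions)
print(joined)
```

This version checks every point against every region, so its cost grows with the product of the two dataset sizes. Doing the same join at scale generally calls for a spatial index and a distributed engine, which is the gap the excerpt describes.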

Is a Lake Big Enough to House Your Ocean of Data?

Contrary to popular belief, Hadoop was not the elephant-in-the-china-shop that marauded and disrupted the data center. The real culprit is data and how it has exploded in volume. The past two or three years have seen a rise in the number of successful Hadoop projects in enterprises to tackle this explosion of big data. These large volumes of data, the emergence of the Hadoop technology and the need to store all the siloed data in one place have prompted the phenomenon called the Data Lake among enterprises.…

Our guest blogger today, Rob Rosen, Senior Director Partner Solutions at Platfora, describes how to help customers achieve strategic advantage through data discovery.

While many people have heard the notion of “known unknowns” and “unknown unknowns,” it may surprise you to discover that the concept was first popularized by a NASA scientist. In a presentation given at TEDx GeorgeMasonU, Dr. Kirk Borne described how he used the concept of “known unknowns” (things that we knew might exist, but hadn’t seen evidence of) and “unknown unknowns” (things that we knew nothing about but could discover, and that would truly surprise us), and how they relate to the concept of Big Data.…

The advent of connected manufacturing has ushered in an era where low-cost machine sensors take thousands of measurements per second at many points across the manufacturing process. This stream of sensor data enables manufacturers to quickly detect emerging anomalies and solve issues before they impact yield and quality.

Big Data insights enable predictive analytics for those rapid, proactive process adjustments. Manufacturers can capitalize on this opportunity by following an approach that combines the power of Teradata with Hortonworks Data Platform’s storage and compute efficiencies at extreme scale.…

Recent innovations in the Internet-enabled Connected Cars that we drive today have spawned a whole new set of opportunities and challenges for carmakers. The opportunities come from the ability to capture detailed, current data on how drivers actually operate their cars and how those cars respond to that use.

Register for the October 22 Webinar

That data can be extraordinarily valuable for uses such as preventative maintenance, product development, manufacturing optimization and recall avoidance.…

I recently had the pleasure of visiting with Arvind Battula, Sr. Data Scientist at Schlumberger. The following is a transcript of our conversation, in which we discussed his background as a chemical and mechanical engineer, his move onto the Data and Analytics team as a data scientist, his focus areas for data science in oil and gas, and the technologies that he believes will help transform the industry.…

Metro Transit of St. Louis (MTL) operates the public transportation system for the St. Louis metropolitan region. The organization’s mission is “Meeting the region’s transit needs by providing safe, reliable, accessible, customer-focused service in a fiscally responsible manner.”

Meeting the Challenge to Provide Safe, Reliable Public Transport

To ensure the safety of passengers and the proper use of public funds, MTL has always performed regular maintenance on its bus fleet. But lacking detailed data on how bus components were actually performing, the agency maintained vehicles reactively.…

The Personalized Medicine Initiative (PMI), based out of the Life Sciences Institute of the University of BC, has deployed HDP and PHEMI Central Big Data Warehouse to collect, store and manage genomic and clinical data for Molecular You (MY). 

PHEMI is a Hortonworks Technology partner. In this blog, Richard Proctor, General Manager, Global Healthcare at Hortonworks, interviews PHEMI’s Roy Wilds, Dir. of Product Management, along with PMI’s Chief Operating Officer and Co-founder of Molecular You, Rob Fraser, to discuss this groundbreaking work.…