From the Dev Team

In a world that creates 2.5 quintillion bytes of data every day, it is extremely cheap to collect, store and curate all the data you will ever care about. Data is de facto becoming the largest untapped asset. So how can organizations take advantage of unprecedented amounts of data? The answer is new innovations; and […]

We are excited to announce the general availability of Hortonworks Sandbox with HDP 2.3 on the Microsoft Azure Gallery. Hortonworks Sandbox is already a very popular environment for developers, data scientists and administrators to learn and experiment with the latest innovations in Hortonworks Data Platform. The hundreds of innovations span Apache Hadoop, Kafka, Storm, Spark, […]

Every day, more and more new devices (smartphones, sensors, wearables, tablets, home appliances) join the “Internet of Things.” Cisco predicts that by 2020, there will be 50 billion devices connected to the Internet of Things. Naturally, they will all emit streams of data at short intervals. Obviously, these data streams will have to be stored, will […]

Last week, on July 22nd, we announced the general availability of HDP 2.3. In this three-part blog series, the first blog summarized the key innovations in the release (ease of use and enterprise readiness) and how those are helping deliver transformational outcomes, while the second blog focused on data access innovation. In this final part, we […]

On August 4th at 10:00 am PST, Eric Thorsen, General Manager Retail/CP at Hortonworks, and Krishnan Parasuraman, VP Business Development at Splice Machine, will be talking about how Hadoop can be leveraged as a scale-out relational database to be the System of Record and power mission-critical applications. In this blog, they provide answers to […]

On July 22nd, we introduced the general availability of HDP 2.3. In part 2 of this blog series, we explore notable improvements and features related to Data Access: SQL on Hadoop; Spark 1.3.1; Stream Processing; Systems of Engagement that scale; and HDP Search. We are especially excited about what these data access improvements mean for our […]

We are very pleased to announce that Hortonworks Data Platform (HDP) Version 2.3 is now generally available for download. HDP 2.3 brings numerous enhancements across all elements of the platform spanning data access to security to governance. This version delivers a compelling new user experience, making it easier than ever before to “do Hadoop” and […]

Drink from Elephant’s Well Of Knowledge: Developer success starts with open and reusable code, and a community that allows for both consumption of code and contribution of updates to the code base. This success engenders a thriving and evolving community. To that end, today we are announcing the Hortonworks Gallery for developers. Located on GitHub, the […]

Early this year, Apache™ Falcon™ became a Top Level Project (TLP) in the Apache Software Foundation. The project continues to mature as a framework for simplifying and orchestrating data lifecycle management in Hadoop by offering out-of-the-box data management policies. The Apache Falcon 0.6.1 release builds on this foundation by providing simplified mirroring functionality and a […]

Hortonworks is always pleased to see new contributions come into the open-source community. We worked with our customer, Hotels.com, to help them develop libraries and utilities around Apache Hive, the Apache ORC format and Cascading. It’s great to see the results released for the community. In this guest blog, Adrian Woodhead, Big Data Engineering Team […]

As YARN drives Hadoop’s emergence as a business-critical data platform, the enterprise requires more stringent data security capabilities. Apache Ranger delivers a comprehensive approach to security for a Hadoop cluster. It provides a platform for centralized security policy administration across the core enterprise security requirements of authorization, audit and data protection. On June 10th, […]

In his blog, Tim Hall wrote, “Enterprises are embracing Apache Hadoop to enable their modern data architectures and power new analytic applications. The freedom to choose the on-premises or cloud environments for Hadoop that best meets the business needs is a critical requirement.” One of the choices in deploying Hadoop in the cloud environment is with Microsoft Azure using […]

Mayank Bansal, of eBay, is a guest contributing author of this collaborative blog. This is the 4th post in a series that explores the theme of enabling diverse workloads in YARN. See the introductory post to understand the context around all the new features for diverse workloads as part of Apache Hadoop YARN in HDP. Background: In Hadoop YARN’s […]

Introduction: Multihoming is the practice of connecting a host to more than a single network. This is frequently used to provide network-level fault tolerance – if hosts are able to communicate on more than one network, the failure of one network will not render the hosts inaccessible. There are other use cases for multihoming as […]

The Apache community released Apache Pig 0.15.0 last week. Although there are many new features in Apache Pig 0.15.0, we would like to highlight two major improvements: Pig on Tez enhancements, and using Hive UDFs inside Pig. Below are some details about these important features. For the complete list of features, improvements, and bug fixes, please […]