Hortonworks Data Platform

Completely open source Apache Hadoop data platform, architected for the enterprise

The latest community innovation

HDP 2.1 comprises the most recent innovation being delivered in the open Hadoop community. It delivers the latest releases across Hadoop and the key related projects into a single integrated and tested platform for the enterprise.

Key highlights of HDP 2.1 include:

Interactive query with Hive and Tez

HDP delivers on the commitments made last year with the final phase of the Stinger Initiative, a concerted effort to improve the performance of Hive and SQL query in Hadoop. The results of Stinger represent the hard work of hundreds of individuals who either contribute privately or represent one of the more than 45 companies that contribute to Hive.

Introducing Apache Tez for the fastest Hive ever!

Apache Tez reimagines the original MapReduce model to support interactive queries, meeting the needs of users of the most widely used data access engine for Hadoop: Apache Hive.

Vectorized Query

Through a deep engineering partnership with Microsoft, and with contributions from its engineers, Apache Hive can now take advantage of vectorized query execution and accelerate computations on data in memory by up to 100x.
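
To make the idea of vectorized execution more concrete, here is a minimal Python sketch. It is an illustration only, not Hive's implementation. It contrasts evaluating a filter and a sum one row at a time with evaluating them over column batches. The batch size of 1024 rows mirrors Hive's documented default; the function names, the column names and the use of integer cents for prices are assumptions made for this example.

```python
from typing import Iterator, List, Tuple

# Rows per batch. Hive's vectorized execution processes blocks of 1024 rows.
BATCH_SIZE = 1024

Row = Tuple[int, int]  # (quantity, price_in_cents): hypothetical columns


def sum_row_at_a_time(rows: List[Row], min_qty: int) -> int:
    """Evaluate the filter and the aggregate once per row."""
    total = 0
    for qty, price in rows:
        if qty >= min_qty:
            total += qty * price
    return total


def column_batches(rows: List[Row], size: int = BATCH_SIZE) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (quantity_column, price_column) pairs holding up to `size` values each."""
    for start in range(0, len(rows), size):
        chunk = rows[start:start + size]
        yield [r[0] for r in chunk], [r[1] for r in chunk]


def sum_vectorized(rows: List[Row], min_qty: int) -> int:
    """Evaluate the filter over a whole batch, then aggregate the selected rows."""
    total = 0
    for qtys, prices in column_batches(rows):
        # Step 1: apply the filter to the entire quantity column at once,
        # recording the positions of the rows that pass.
        selected = [i for i, qty in enumerate(qtys) if qty >= min_qty]
        # Step 2: aggregate using only the selected positions.
        total += sum(qtys[i] * prices[i] for i in selected)
    return total
```

Both functions return the same answer. The batch version simply organizes the work as tight loops over columns, which is the kind of access pattern that lets a real engine use CPU caches and pipelines more effectively.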

Community leads Innovation!

145 Developers from 45 Companies deliver interactive Apache Hive.

Hive is already the most widely used data access engine for Hadoop. Behind the significant enhancements to Hive is the story of how the open community can lead technology innovation. Through this collective effort, Hive is the most robust, mature and secure SQL solution for Hadoop.

New Processing Engines...

The benefits of YARN as the data-operating system are delivered in HDP 2.1 with the inclusion of new engines for processing stream data and search:

Stream Processing with Apache Storm

Apache Storm is a distributed real-time computation system for processing fast, large streams of data. Storm adds reliable real-time data processing capabilities to HDP 2.1. Storm in Hadoop helps capture new business opportunities with low-latency dashboards, security alerts, and operational enhancements integrated with other applications running in your Hadoop cluster.
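
As a rough illustration of the security-alert use case, the sketch below models a stream pipeline in plain Python. It does not use the Storm API (Storm topologies are typically written in Java); it only borrows Storm's terms "spout" for a source of events and "bolt" for a processing step. The event fields, the function names and the threshold are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple

Event = Dict[str, str]


def event_spout(events: Iterable[Event]) -> Iterator[Event]:
    """Source step: emit events one at a time as they arrive."""
    for event in events:
        yield event


def failed_login_bolt(stream: Iterable[Event], threshold: int) -> Iterator[Tuple[str, int]]:
    """Processing step: emit an alert when a user reaches `threshold` failed logins."""
    failures: Counter = Counter()
    for event in stream:
        if event["type"] != "login_failed":
            continue
        user = event["user"]
        failures[user] += 1
        # Alert exactly once, at the moment the threshold is reached.
        if failures[user] == threshold:
            yield (user, failures[user])
```

Because both steps are generators, each alert is produced as soon as the triggering event is read rather than after the whole input has been collected, which is the essential difference between stream and batch processing.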

Search with Apache Solr

Apache Solr introduces high-performance indexing and sub-second search times over billions of documents. Apache Solr provides powerful full-text search, hit highlighting, faceted search, near real-time indexing, dynamic clustering, database integration, management of rich documents (e.g., Word, PDF), and geospatial search. It is highly reliable, scalable and fault tolerant, providing distributed indexes, load-balanced queries, automated failover and recovery, centralized configuration and replication.
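
Full-text search, faceting and hit highlighting are all requested through query parameters on Solr's HTTP select handler. The sketch below only builds such a request URL, so it runs without a Solr server. The parameters `q`, `rows`, `wt`, `facet`, `facet.field`, `hl` and `hl.fl` are standard Solr query parameters; the host, the collection name `docs` and the field names are assumptions chosen for illustration.

```python
from urllib.parse import urlencode


def build_solr_query_url(base_url: str, collection: str, text: str,
                         facet_field: str, highlight_field: str,
                         rows: int = 10) -> str:
    """Build a select URL that asks for matches, facet counts and highlights."""
    params = [
        ("q", text),                  # full-text query
        ("rows", rows),               # number of documents to return
        ("wt", "json"),               # response format
        ("facet", "true"),            # enable faceted search
        ("facet.field", facet_field), # field to count facet values on
        ("hl", "true"),               # enable hit highlighting
        ("hl.fl", highlight_field),   # field to highlight matches in
    ]
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Hypothetical host, collection and field names.
url = build_solr_query_url("http://localhost:8983/solr", "docs",
                           "hadoop security", "category", "body")
```

The resulting URL can be fetched with any HTTP client against a running Solr instance that has a matching collection and schema.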

Operations with Apache Ambari

HDP 2.1 includes the latest version of Apache Ambari, which now supports Apache Storm, Apache Falcon and Apache Tez and provides extensibility, rolling restarts and other significant operational improvements.

... Expanded Enterprise Capabilities

Data Governance with Apache Falcon

Apache Falcon is a framework for simplifying data management and pipeline processing in Apache Hadoop®. It enables users to automate the movement and processing of datasets for ingest, pipelines, disaster recovery and data retention use cases. Instead of hard-coding complex dataset and pipeline processing logic, users can now rely on Apache Falcon for these functions, maximizing reuse and consistency across Hadoop applications.

Perimeter Security with Apache Knox

The Knox Gateway (“Knox”) is a system that provides a single point of authentication and access for Hadoop services in a cluster. It can secure multiple Hadoop clusters and is designed to:
  • Provide perimeter security to make Hadoop security setup easier
  • Support authentication and token verification security scenarios
  • Enable integration with enterprise identity management environments
  • Manage security across multiple clusters and multiple versions of Hadoop
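
In practice, a single point of access means clients call Hadoop REST services through a gateway URL instead of contacting cluster nodes directly. The sketch below builds a WebHDFS request URL routed through Knox, together with an HTTP Basic authentication header. The URL layout (`/gateway/{cluster}/webhdfs/v1/...`) and port 8443 follow Knox's documented defaults, but a given deployment may be configured differently; the host name, cluster name and credentials are placeholders.

```python
import base64
from typing import Dict
from urllib.parse import quote, urlencode


def knox_webhdfs_url(gateway_host: str, cluster: str, hdfs_path: str,
                     op: str = "LISTSTATUS", gateway_port: int = 8443,
                     gateway_path: str = "gateway") -> str:
    """Build a WebHDFS URL that goes through the Knox gateway."""
    path = quote(hdfs_path.lstrip("/"))  # keep "/" separators, escape the rest
    query = urlencode({"op": op})        # WebHDFS operation, e.g. LISTSTATUS
    return (f"https://{gateway_host}:{gateway_port}/{gateway_path}/"
            f"{cluster}/webhdfs/v1/{path}?{query}")


def basic_auth_header(user: str, password: str) -> Dict[str, str]:
    """Build an HTTP Basic Authorization header for the gateway to verify."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder host, cluster name and credentials.
url = knox_webhdfs_url("knox.example.com", "default", "/tmp/data")
headers = basic_auth_header("guest", "guest-password")
```

The client only needs to know the gateway address and its own enterprise credentials; the internal host names and ports of the cluster stay hidden behind the gateway.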
Hortonworks Data Platform
The Hortonworks Data Platform is a 100% open source distribution of Apache Hadoop that is truly enterprise grade, having been built, tested and hardened with enterprise rigor.
Modern Data Architecture
Tackle the challenges of big data. Hadoop integrates with existing EDW, RDBMS and MPP systems to deliver lower cost, higher capacity infrastructure.