Hortonworks Data Platform

The completely open source Apache Hadoop data platform, architected for the enterprise

HDP 2.3 - Another major advance for Open Enterprise Hadoop.

Hortonworks Data Platform 2.3 represents yet another major step forward for Hadoop as the enterprise data platform. This release incorporates the most recent innovations in Hadoop and its supporting ecosystem of projects. HDP 2.3 packages more than a hundred new features across all our existing projects. Every component has been updated, and we have added some key technologies and capabilities.

HDP 2.3 Asparagus Diagram

Key highlights of HDP 2.3 include:

Breakthrough Usability for Hadoop

HDP 2.3 eliminates much of the complexity of administering Hadoop and improves developer productivity.

HDP 2.3 leverages the Ambari Views Framework to deliver new user views and a breakthrough user experience for both cluster operators and developers.

For Hadoop Operators...

  • Smart Configuration: An entirely new user experience within Ambari that is guided, opinionated, and more digestible for configuring HDFS, YARN, HBase, and Hive.
  • YARN Capacity Scheduler: Configure shared access to large clusters through a much easier web interface to the YARN Capacity Scheduler.
  • Customized Dashboards: Create a tailored dashboard and keep an eye on the metrics you value most.
Apache Ambari - Dashboard
Ambari User Views

...and for developers

  • Fast and easy SQL Editor for Hive: An integrated experience for building SQL queries, displaying a visual “explain plan”, and debugging in greater depth when using the Tez execution engine.
  • Easy Pig editor and web-based HDFS browser: In addition to the SQL builder, a Pig Latin Editor brings a modern browser-based IDE experience to Pig. There is also a File Browser for HDFS.
  • An entirely new user experience for Apache Falcon: A web-forms-based approach allows for rapid development of feeds and processes. The new Falcon UI also allows you to search and browse processes that have executed, visualize lineage, and set up mirroring jobs to replicate files and databases between clusters or to cloud storage such as Microsoft Azure Storage.

Impressive improvements across all data access engines

Consolidating access to data with YARN as its architectural center

As organizations strive to efficiently store their data in a single repository and interact with it simultaneously in different ways, they need SQL, streaming, data science, batch and more, all in the same cluster. HDP 2.3 adds new engines including:

Enhanced SQL Semantics in Apache Hive

Hive adds support for time intervals and UNION semantics, delivers 2.5x performance improvements and improved query scheduling, and gains a more streamlined user interface within Ambari.

Solr on YARN

The Solr search engine is being built to run on YARN and is now in technical preview. This critical advancement allows customers to reduce their total cost of ownership by deploying Solr within the same cluster as other workloads – eliminating the need for a “side cluster” dedicated to indexing data and delivering search results.

New capabilities for feature-rich Spark applications

Apache Spark on YARN is enhanced with the new DataFrame API, machine learning algorithms such as clustering and frequent pattern mining, and a technology preview of SparkSQL.

Advances towards comprehensive security and governance

Centralized Authorization
Security administrators can now define and manage security policies and capture security audit information for HDFS, Hive, HBase, Knox and Storm, and now also for Solr, Kafka and YARN.
HDFS DARE (Data At Rest Encryption)
Encrypts data stored in HDFS files and gives security administrators the ability to manage keys and authorization policies for the key management store (KMS), using the open source Hadoop KMS embedded in Apache Ranger.
Audit Optimization and Scalable Storage
Provides a framework to optimize audit creation and storage, with interactive query powered by Solr. Users can now combine security audit information with data lineage in Apache Atlas for a comprehensive view of their data.

Introducing Apache Atlas

A common approach to Hadoop data governance from the open source community

As enterprises across all major industries deploy Hadoop into corporate data and processing environments, a common approach to working with metadata and data governance becomes a necessity.

Apache Atlas was created by a consortium of enterprises and Hortonworks to meet this need. Atlas enhances governance capabilities in Hadoop for both prescriptive and forensic models enriched by taxonomical metadata. Atlas, at its core, is designed to exchange metadata with other tools and processes within and outside of the Hadoop stack. Atlas enables platform-agnostic governance controls that effectively address enterprise compliance requirements.

Hortonworks SmartSense

Proactive monitoring and maintenance with your HDP Support Subscription

Deploy HDP with proactive and intelligent support. Hortonworks SmartSense gathers insight, provides recommendations, and helps optimize cluster utilization and health. Hortonworks SmartSense is included with every HDP Support Subscription.

Faster Support Case Resolution
By easily capturing log files and metrics for insight and resolution.
Proactive Cluster Configuration
Via an intelligent stream of cluster analytics and data-driven recommendations.
Long-Range Cluster Optimization
Through a proactive view into the customer’s cluster utilization that can be used to drive capacity planning.