Hortonworks Data Platform

The completely open source Apache Hadoop data platform, architected for the enterprise

HDP delivers Open Enterprise Hadoop

Architected, developed, and built completely in the open, Hortonworks Data Platform (HDP) provides an enterprise-ready data platform that enables organizations to adopt a Modern Data Architecture.

With YARN as its architectural center, HDP provides a data platform for multi-workload data processing across an array of processing methods, from batch through interactive to real-time, supported by the key capabilities required of an enterprise data platform: Governance, Security, and Operations.

Complete & Open

HDP is built in the open. Go with the standard and take advantage of the latest innovation being delivered by Hortonworks and the open-source community. All our work is contributed into the wide array of projects governed by the Apache Software Foundation.

Enterprise Ready

HDP is built for the enterprise. HDP includes rich data security, governance and operations functionality that works across component technologies and integrates with pre-existing EDW, RDBMS and MPP systems.

Fully Integrated

HDP integrates with, and augments, your existing applications and systems, so you can take advantage of Hadoop with only minimal change to existing data architectures and skill sets. Deploy HDP in the cloud, on-premises, or from an appliance, on either Linux or Windows.


An Open Enterprise Hadoop Data Platform

Hortonworks Data Platform enables the deployment of Open Enterprise Hadoop: it leverages 100% open source components, drives enterprise readiness requirements, and empowers the adoption of new innovations that come out of the Apache Software Foundation and key Apache projects.

This comprehensive set of capabilities is aligned to the following functional areas: Data Management, Data Access, Data Governance and Integration, Security, and Operations.

Viewed as a stack, these capabilities fit together as follows:

  • Presentation & Applications: enable both existing and new applications to provide value to the organization.
  • Enterprise Management & Security: empower existing operations and security tools to manage Hadoop.
  • Governance & Integration: data lifecycle and governance, plus data workflow.
  • Data Access: access your data simultaneously in multiple ways (batch, interactive, real-time), including ISV engines.
  • Data Management: store and process your corporate data assets in HDFS, the Hadoop Distributed File System.
  • Security: administration, authentication, authorization, auditing, and data protection.
  • Operations: deploy, manage, and monitor the platform (provisioning, managing, and monitoring).
  • Deployment choice: Linux and Windows, on-premises or cloud/hosted.

HDFS & YARN : The core of Hadoop


The core components of HDP are YARN and the Hadoop Distributed File System (HDFS). YARN is the architectural center of Hadoop that enables you to process data simultaneously in multiple ways: it provides the resource management and pluggable architecture for enabling a wide variety of data access methods. HDFS provides the scalable, fault-tolerant, cost-efficient storage for big data.

More info : HDFS | YARN
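For a concrete taste of the storage layer, HDFS also exposes its file operations over a REST interface (WebHDFS). The Python sketch below only builds the request URL; the hostname and path are placeholders, and 50070 is the default NameNode HTTP port in the Hadoop 2.x line:

```python
# Minimal sketch: addressing a file through HDFS's WebHDFS REST API.
# Host, port, and path below are placeholder values.
from urllib.parse import urlencode

def webhdfs_url(host, path, op, port=50070, **params):
    """Build a WebHDFS request URL of the form .../webhdfs/v1/<path>?op=OP."""
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{path}?{query}"

url = webhdfs_url("namenode.example.com", "/data/events.csv", "OPEN",
                  **{"user.name": "hdfs"})
print(url)
# http://namenode.example.com:50070/webhdfs/v1/data/events.csv?op=OPEN&user.name=hdfs

# Actually reading the file would be:
#   import urllib.request
#   data = urllib.request.urlopen(url).read()
# which requires a reachable cluster, so it is not executed here.
```

The same URL scheme covers directory listings (`op=LISTSTATUS`), metadata, and writes, which is what makes HDFS easy to script against from outside the cluster.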


Access data from a variety of engines


YARN provides the foundation for a versatile range of processing engines that empower you to interact with the same data in multiple ways, at the same time. This means applications can interact with the data in the way that suits them best: batch, interactive SQL, or low-latency access with NoSQL. Emerging use cases for data science, search, and streaming are also supported with Apache Spark, Solr, and Storm. Additionally, ecosystem partners provide even more specialized data access engines for YARN.

More info : Hive | Tez | Pig | Storm | Spark | HBase | Accumulo | Solr
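The idea of one dataset served by several engines can be illustrated with a small plain-Python sketch (no Hadoop involved; the records and field names are invented for the example). A full scan with aggregation stands in for a batch engine such as Hive on Tez, while a keyed index stands in for low-latency NoSQL access such as HBase:

```python
from collections import defaultdict

# Toy records standing in for data stored once in HDFS.
events = [
    {"user": "alice", "bytes": 120},
    {"user": "bob",   "bytes": 300},
    {"user": "alice", "bytes": 80},
]

# Batch-style access (think Hive/Tez or MapReduce): scan everything, aggregate.
totals = defaultdict(int)
for e in events:
    totals[e["user"]] += e["bytes"]

# Low-latency keyed access (think HBase): index once, then look up by key.
by_user = defaultdict(list)
for e in events:
    by_user[e["user"]].append(e)

print(totals["alice"])      # 200
print(len(by_user["bob"]))  # 1
```

In HDP the difference is that both access patterns run as YARN workloads against the same HDFS files, rather than against separate copies of the data.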


Load and manage data according to policy


HDP extends data access and management with powerful tools for data governance and integration. These tools provide a reliable, repeatable, and simple framework for managing the flow of data in and out of Hadoop. This control structure, along with tooling to ease and automate the application of schema or metadata on sources, is critical for successful integration of Hadoop into your modern data architecture.

Hortonworks has engineering relationships with leading data management providers to enable their tools to work and integrate with HDP.

More info : Falcon | Oozie | Sqoop | Flume | Kafka
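As a hedged illustration of the kind of policy such tooling encodes, here is a plain-Python sketch of a retention rule. This is not Falcon's actual API; the paths and the 7-day window are invented for the example:

```python
# Conceptual sketch of a retention policy, the sort of data-lifecycle rule
# Apache Falcon lets you declare instead of hand-coding. Plain Python only;
# the partition paths and retention window below are illustrative.
from datetime import datetime, timedelta

def expired(partitions, now, retention_days=7):
    """Return the partition paths older than the retention window."""
    cutoff = now - timedelta(days=retention_days)
    return [path for path, created in partitions if created < cutoff]

now = datetime(2015, 6, 15)
partitions = [
    ("/data/clicks/2015-06-01", datetime(2015, 6, 1)),
    ("/data/clicks/2015-06-14", datetime(2015, 6, 14)),
]
print(expired(partitions, now))
# ['/data/clicks/2015-06-01']
```

In practice the governance layer evaluates rules like this on a schedule and then drives the corresponding HDFS deletes, archival moves, or replication jobs.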


Authentication, Authorization, & Data Protection


Security is woven into HDP at multiple layers. Critical features for authentication, authorization, accountability, and data protection are in place so that you can secure HDP across these key requirements. Consistent with the approach taken throughout the enterprise Hadoop capabilities, HDP also ensures you can integrate and extend your current security solutions to provide a single, consistent, secure umbrella over your modern data architecture.

More info : Knox | Ranger


Provision, manage, monitor and operate Hadoop clusters at scale


Operations teams deploy, monitor, and manage a Hadoop cluster within their broader enterprise data ecosystem. HDP delivers a complete set of operational capabilities that provide both visibility into the health of your cluster and tooling to manage configuration and optimize performance across all data access methods. Apache Ambari provides APIs to integrate with existing management systems, such as Microsoft System Center and Teradata ViewPoint.

More info : Ambari | Zookeeper
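As a sketch of that integration point, the snippet below builds an authenticated GET request against Ambari's REST API. The host and the admin/admin credentials are placeholders (8080 is Ambari's default HTTP port), and the request is constructed but deliberately not sent:

```python
# Sketch of a read-only call to Ambari's REST API (/api/v1/...), the same
# surface that external management systems integrate with. Host and
# credentials are placeholder values.
import base64
import urllib.request

def ambari_request(host, resource, user, password, port=8080):
    """Build an authenticated GET request for an Ambari REST resource."""
    req = urllib.request.Request(f"http://{host}:{port}/api/v1/{resource}")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    return req

req = ambari_request("ambari.example.com", "clusters", "admin", "admin")
print(req.full_url)
# http://ambari.example.com:8080/api/v1/clusters

# Sending it (urllib.request.urlopen(req)) needs a live Ambari server,
# so the call itself is left out here.
```

A monitoring system would issue requests like this on a schedule and alert on the health states in the JSON responses.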


Integrates with your data analytics tools


Hortonworks has a thriving ecosystem of vendors providing additional capabilities and integration points. These partners contribute to and augment Hadoop with functionality across Business Intelligence and Analytics, Data Management Tools, and Infrastructure. Systems Integrators of all sizes are building skills to assist with integration and solution development.

More info : Hortonworks Technology Ecosystem


Deploy on-premises, in the cloud, and on Windows too.


HDP provides the broadest range of deployment options for Hadoop: from Windows Server or Linux to virtualized cloud deployments. It is the most portable Hadoop distribution, allowing you to easily and reliably migrate from one deployment type to another. We also provide automated capabilities for backup to Microsoft Azure and Amazon S3.

More info : Microsoft | RedHat | Rackspace | Teradata



Not only open-source – but built in the open.

HDP demonstrates our commitment to growing Hadoop and its sub-projects with the community and completely in the open. HDP is assembled entirely of projects built through the Apache Software Foundation. How is this different from open-source, and why is it so important?

Proprietary Hadoop extensions can be made open source simply by publishing them to GitHub. But compatibility issues will creep in, and as the extensions diverge from the trunk, reliance on the extension’s vendor deepens.

Community driven development is different. By combining the efforts of technologists across a diverse range of companies, the roadmap is stronger, and the quality deeper. In the long run we believe community driven innovation will outpace that of any single company.

The Hortonworks engineering team includes contributors and committers across the array of Hadoop projects. More about Hortonworks’ innovation on community projects.
Modern Data Architecture
Tackle the challenges of big data. Hadoop integrates with existing EDW, RDBMS and MPP systems to deliver lower cost, higher capacity infrastructure.
Join our Webinars
Join us for a series of talks on some of the new enterprise functionality available in HDP, including data governance, security, operations, and data access.
Contact Us
Hortonworks provides enterprise-grade support, services and training. Discuss how to leverage Hadoop in your business with our sales team.