Hortonworks Data Platform
HDP 2 builds on the massive-scale processing and storage of HDP 1.x with an entirely new data operating system, YARN, which lets multiple workloads, applications and processing engines share a single cluster with greater efficiency than ever before. It also includes the latest releases of the Apache Hadoop projects for management, data processing and core operations.
Developed in the open, for the enterprise:
Enables you to interact with all data in multiple ways simultaneously, making Hadoop a true multi-use data platform.
Provides a stable, tested and complete package of the key services required for the enterprise data architecture of tomorrow.
Interoperates with the best-of-breed tools you already use from your data center partners and allows you to reuse your team's existing skills.
HDP is built and supported by the original architects, builders and operators of Apache Hadoop.
HDP: A complete distribution
HDP 2.0 combines the latest releases of Hadoop and its key related projects into a single integrated and tested platform appropriate for mainstream use. A complete Hadoop data platform delivers the core, data and operational services required for Hadoop to be used as an enterprise data platform.
HDP contains all the baseline core services that allow you to implement Hadoop as a reliable, secure, multi-use enterprise data platform. It packages all the most recent innovations around YARN, HDFS2, Security and High Availability and offers the latest innovations from the open source community with the testing and quality you expect from enterprise software.
HDP is the only distribution to contain the most recent advancements of YARN, which provides a true multi-use platform that lets you store data once and access it simultaneously in multiple ways.
HDP not only packages native High Availability features but also integrates with best-of-breed HA solutions from Red Hat and continuous Hadoop from WANdisco to complement the native HA capabilities of Hadoop 2.0.
Tested at Scale
HDP combines the most useful and stable versions of Apache Hadoop and its related projects into a single, certified package tested at scale on hundreds of production nodes.
The Data Services found in HDP allow you to model, manipulate and access data in Hadoop. They include multiple tools for interacting with data in many ways: in batch, interactively with SQL, or in real time with NoSQL.
Interactive Query & SQL
HDP includes the most recent version of Apache Hive, the de facto standard for SQL-in-Hadoop. With nearly every BI and visualization tool already certified, Hive meets the speed and scale requirements of your SQL queries while providing the broadest range of semantics for SQL interaction with Hadoop.
Real-Time with NoSQL
HBase provides a tested, reliable and integrated tool for fault-tolerant columnar storage and quick access to large quantities of sparse data. HDP contains the most up-to-date version of HBase (0.96), which adds key features such as improved MTTR (mean time to recovery) and snapshots.
Hortonworks has led the way in metadata management within Hadoop with HCatalog. HDP packages this useful tool so that metadata can be shared within Hadoop and also exposed outside Hadoop via a RESTful interface, easing integration with existing tools.
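As a rough illustration of the RESTful metadata interface, the Python sketch below builds the URL that asks HCatalog's REST server (WebHCat, formerly Templeton) to describe one table. The host name, table name and user are hypothetical, and the port (50111) and path layout are assumptions based on WebHCat's usual defaults; check them against your own cluster before relying on them.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def webhcat_table_url(host, database, table, user, port=50111):
    """Build the WebHCat URL that describes one table's metadata.

    Assumes the default WebHCat port and the /templeton/v1/ddl path layout.
    """
    path = "/templeton/v1/ddl/database/{}/table/{}".format(
        quote(database, safe=""), quote(table, safe="")
    )
    query = urlencode({"user.name": user})
    return "http://{}:{}{}?{}".format(host, port, path, query)


def describe_table(host, database, table, user):
    """Fetch the table description as a dict (needs a reachable WebHCat server)."""
    with urlopen(webhcat_table_url(host, database, table, user)) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical host, table and user, for illustration only.
example_url = webhcat_table_url("hcat.example.com", "default", "web_logs", "analyst")
```

Because the interface is plain HTTP and JSON, a tool outside the cluster needs no Hadoop client libraries to read table definitions.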
HDP includes Apache Ambari for provisioning, management and operation of Hadoop clusters. Built around a robust set of APIs, Apache Ambari enables you to integrate with existing management tools to provide a single consistent enterprise operations experience.
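To give a sense of how an existing management tool might call those APIs, here is a minimal Python sketch that prepares an authenticated request against Ambari's REST interface. The server address and credentials are hypothetical; the `/api/v1` prefix, HTTP basic authentication and the `X-Requested-By` header are assumptions drawn from Ambari's commonly documented conventions, so verify them for your Ambari version.

```python
import base64
import json
from urllib.request import Request, urlopen


def ambari_request(base_url, path, user, password):
    """Prepare a GET request for an Ambari REST resource (no network I/O here)."""
    credentials = "{}:{}".format(user, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = Request(base_url.rstrip("/") + "/api/v1" + path)
    request.add_header("Authorization", "Basic " + token)
    # Assumed to be required by Ambari for modifying calls; harmless on reads.
    request.add_header("X-Requested-By", "ambari")
    return request


def list_clusters(base_url, user, password):
    """Return the parsed cluster listing (needs a reachable Ambari server)."""
    request = ambari_request(base_url, "/clusters", user, password)
    with urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical server and credentials, for illustration only.
example = ambari_request("http://ambari.example.com:8080/", "/clusters", "admin", "admin")
```

Keeping request construction separate from the network call makes the sketch easy to adapt to whatever HTTP client your monitoring system already uses.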
Ambari simplifies cluster deployment across anything from a few to thousands of Hadoop nodes and across all platforms: cloud, virtual and physical environments (Windows and Linux). The intuitive, graphical, wizard-based tool allows you to easily provision, configure and test all the Hadoop services and core components.
Gain insight into the state and performance across your cluster. The elegant UI of Ambari abstracts complex Hadoop settings so that your operations team can administer Hadoop services, change configurations and manage ongoing growth of a Hadoop cluster without advanced skills.
HDP helps provide a single pane of glass so that the ops team can manage a Hadoop cluster with existing tools, such as Microsoft System Center and Teradata Viewpoint. Ambari also leverages standard technologies and protocols with Nagios and Ganglia for deeper customization.
HDP provides the broadest range of deployment options for Hadoop, from Windows Server or Linux to virtualized cloud deployments. It is the most portable Hadoop distribution, allowing you to easily and reliably migrate from one deployment type to another.
Windows & Linux
Only HDP provides both a Windows and a Linux Hadoop distribution, so it fits your data center whatever your choice of platform.
HDP is available as Microsoft HDInsight on the Azure cloud and is the default distribution available for a private or public cloud at Rackspace. Our work with OpenStack also allows you to easily install and provision a Hadoop cluster in that environment.
Since the final package of HDP is the same across cloud and on-premises deployments, you can seamlessly migrate or port from one deployment method to another.