We are very pleased to announce that the Hortonworks Data Platform Version 2.2 (HDP) is now generally available for download. With thousands of enhancements across all elements of the platform spanning data access to security to governance, rolling upgrades and more, HDP 2.2 makes it even easier for our customers to incorporate HDP as a core component of Modern Data Architecture (MDA).
HDP 2.2 represents the very latest innovation from across the Hadoop ecosystem, where literally hundreds of developers have been collaborating with us to evolve each of the individual Apache Software Foundation (ASF) projects from the broader Apache Hadoop ecosystem. These projects have now been brought together into the complete and open Hortonworks Data Platform (HDP) delivering more than 100 new features and closing out thousands of issues across Apache Hadoop and its related projects.
These distinct ASF projects from across the Hadoop ecosystem span every aspect of the data platform and are easily categorized into:
A simple architectural rendering of those capabilities across the 5 elements of HDP is below:
Seen another way, the chart below captures the evolution of HDP over the past 2 years and illustrates the synchronization of the core ASF projects into a single enterprise data platform. While others choose to fork the work done in the community into their own proprietary versions that quickly diverge from the trunk, with HDP you can be sure you are always leveraging the very latest innovation from the Apache community rather than the capacity of any single vendor.
Every component in the HDP stack has been updated and we have added some key technologies and capabilities to HDP 2.2.
While YARN has allowed new engines to emerge for Hadoop, the most popular integration point with Hadoop continues to be SQL, and Apache Hive is still the de facto standard. This release delivers phase 1 of the Stinger.next initiative, a broad, open, community-based effort to improve speed, scale and SQL semantics.
Apache Spark is an elegant, attractive development API that lets developers rapidly iterate over data via machine learning and other data science techniques. In this release, we plan to deliver an integrated Spark on YARN experience, with improved Hive 0.14 integration and support for ORCFile, by year-end. These improvements make it easier to share and deliver data in and around Spark.
Included in HDP 2.2, Apache Kafka has quickly become the standard high-scale, fault-tolerant, publish-subscribe messaging system for Hadoop. It is often used with Storm and Spark to stream events into Hadoop in real time, and its potential in “internet of things” use cases is tremendous.
In HDP 2.2, the rolling upgrade feature takes advantage of versioned packages, investments at the core of many of the projects and the underlying HDFS High Availability configuration to enable you to upgrade your cluster software and restart upgraded services, without taking the entire cluster down.
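The idea behind a rolling upgrade can be illustrated with a small, purely conceptual sketch. It is not Ambari's or HDP's actual upgrade mechanism: the `Node` class and `rolling_upgrade` function are hypothetical names invented for this example, and a real upgrade also depends on HDFS High Availability and per-service ordering. The sketch only shows the core property that one node at a time is stopped, switched to the new version and restarted, so the rest of the cluster stays online throughout.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a cluster node running one service."""
    name: str
    version: str
    online: bool = True


def rolling_upgrade(nodes, new_version):
    """Upgrade nodes one at a time.

    Returns the lowest number of nodes that were online at any point,
    which shows the cluster as a whole never went down.
    """
    min_online = sum(n.online for n in nodes)
    for node in nodes:
        node.online = False          # stop the service on this node only
        min_online = min(min_online, sum(n.online for n in nodes))
        node.version = new_version   # switch to the new versioned package
        node.online = True           # restart the upgraded service
    return min_online


cluster = [Node(f"node{i}", "2.1") for i in range(1, 5)]
lowest = rolling_upgrade(cluster, "2.2")
print(lowest, {n.version for n in cluster})  # 3 {'2.2'}
```

In this toy four-node cluster, at least three nodes are serving at every step, and all nodes end up on the new version.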
Managing and monitoring a cluster continues to be a high priority for organizations adopting Hadoop. Our completely open approach via Apache Ambari is unique, and we are excited to have Pivotal and HP jump on board to support Ambari alongside other data center leaders such as Microsoft and Teradata. In HDP 2.2, over a dozen new features to manage Hadoop have been added, but some of the biggest include:
Data architects require Hadoop to act like other systems in the data center, and business continuity through replication across on-premises and cloud-based storage targets is a critical requirement. In HDP 2.2, we extend the capabilities of Apache Falcon to establish an automated policy for cloud backup to Microsoft Azure or Amazon S3. This is the first step in a broader vision to enable extensive heterogeneous deployment models for Hadoop spanning cloud-based and on-premises environments.
Hortonworks is 100% committed to open source and the value provided by an active and open community of developers. HDP is the ONLY 100% open source Hadoop distribution, and our code goes back into an open, ASF-governed project with a live and broad community. Over the past few weeks, and continuing into the next, we have been releasing a series of blog posts outlining in more detail some of the features that can be found within each of the projects that comprise the HDP stack. We invite you to explore these highlight blogs as well as a complete list of new features and Jira tickets closed.
HDP 2.2 is now available at hortonworks.com/hdp.