December 10, 2018

What’s new in Hortonworks DataFlow 3.3?

We are excited to announce the General Availability of Hortonworks DataFlow (HDF) 3.3. Throughout 2018, the HDF releases have focused on making the platform more robust for advanced streaming architectures. We delivered several innovations and key enhancements on both the operational and the development sides of the enterprise. We have made HDF a highly reliable platform for Kafka by keeping up with the latest release and by making sure that it is tightly integrated with the rest of our platform for comprehensive security and governance. We also introduced a brand new innovation, Hortonworks Streams Messaging Manager, which addresses Kafka blindness and strengthens our focus on Kafka and related streaming architecture models.

With HDF 3.3, our focus on this front remains unwavering. In past releases, we empowered the Operations, DevOps and Security/Governance teams. In this release, we focus on the Application Developer and BI Developer personas. Support for Kafka 2.0 is a key highlight of HDF 3.3, and we are also proud to announce support for Kafka Streams. With this addition, we become a unique solution provider, offering customers a choice of three stream processing engines for their streaming architectures: Spark Structured Streaming and Streaming Analytics Manager (SAM) for Storm, which we already supported, and now Kafka Streams. We recently published a detailed post on when to choose which streaming engine and for what purpose.

With the upcoming HDP 3.1 release, we are also bringing some exciting innovations that enhance our Kafka offering:

  1. New Hive Kafka Storage Handler (for SQL Analytics) – View Kafka topics as tables and execute SQL via Hive with full SQL Support for joins, windowing, aggregations, etc.
  2. New Druid Kafka Indexing Service (for OLAP Analytics) – View Kafka topics as cubes and perform OLAP style analytics on streaming events in Kafka using Druid.

Check out our recent blog post on how we have democratized analytics within Kafka with these new access patterns.

HDF 3.3 includes the following major innovations and enhancements:

Core HDF Enhancements
  • Support for Kafka 2.0, the latest Kafka release in the Apache community, with many enhancements to security, reliability and performance.
  • NiFi processors for Kafka 2.0
  • NiFi Connection load balancing – This feature allows bottleneck connections in a NiFi workflow to spread queued-up flow files across the NiFi cluster, which increases processing speed and lessens the effect of the bottleneck.
  • MQTT performance improvements including handling a higher velocity of messages streaming from field IoT devices
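
To make the connection load balancing idea above concrete, here is a minimal conceptual sketch in Python. It is not NiFi code and does not call any NiFi API. It only illustrates how a round-robin strategy could spread queued flow files across cluster nodes. The function name `round_robin_balance` and the node and flow file names are hypothetical, chosen for illustration.

```python
from collections import defaultdict
from itertools import cycle


def round_robin_balance(queued_flowfiles, nodes):
    """Assign each queued flow file to the next node in turn (illustrative only)."""
    if not nodes:
        raise ValueError("at least one node is required")
    assignment = defaultdict(list)
    node_cycle = cycle(nodes)
    for flowfile in queued_flowfiles:
        assignment[next(node_cycle)].append(flowfile)
    return dict(assignment)


# Hypothetical backlog sitting in one bottleneck connection.
queue = [f"flowfile-{i}" for i in range(7)]
balanced = round_robin_balance(queue, ["node-1", "node-2", "node-3"])

for node, files in balanced.items():
    print(node, len(files))
```

Instead of one node working through the entire backlog, each node receives a roughly equal share, which is the effect the feature aims for.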


Enhanced Streaming Support
  • Kafka Streams Support
    • Kafka Streams is now an officially supported component
    • Integration with Schema Registry, Ranger and Streams Messaging Manager (SMM)
    • Supports fully Kerberized/Rangerized Kafka clusters


Cross Platform Integrations
  • Kafka 2.0 – Ambari and Ranger
    • Ambari support to install, configure and manage Kafka 2.0 multi-node secure clusters.
    • Ranger support for new ACL-like permissions, with symmetric topic-level create and delete permissions
  • Kafka Streams – Schema Registry and Ranger
    • Manage security policies of streams apps in Ranger
    • Schema Registry Serializer/Deserializer support for Streams Apps
  • Knox SSO Support
    • Knox SSO support for Schema Registry and SAM


Enhanced Operations/Administrative Features
  • Site to Site (S2S) Reporting Task – Component name filtering
    • Allows administrators to efficiently capture the desired scope of provenance data from running systems.
  • Decommission NiFi nodes in a cluster – New API feature that allows an administrator to completely remove a NiFi node from a cluster. Rather than just shutting down the node, this feature ensures that all data (content and flow files) is offloaded from the node, pausing flow execution until the offload is complete. Once all of the data is offloaded, the framework gracefully removes the node from the cluster. This guarantees that no data is left on the local storage of the removed node.
  • Ambari support for Express and Rolling upgrades from HDF 3.2
  • SmartSense can be used in an HDF 3.3 installation without requiring HDP.
  • Explicit fine-grained Ranger permissions for starting/stopping processor groups within the NiFi canvas
  • Ranger integration with NiFi Registry
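
The node decommissioning sequence described in the list above can be sketched as a small simulation. This is a conceptual Python model only, not the NiFi API: the `Node` class, the `decommission` function and the state labels are hypothetical names used for illustration. It shows the ordering the feature guarantees: offload every flow file first, and only then remove the node.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    flowfiles: list = field(default_factory=list)
    state: str = "CONNECTED"  # illustrative state label


def decommission(cluster, name):
    """Offload all data from one node, then remove it from the cluster."""
    node = cluster[name]
    peers = [n for n in cluster.values()
             if n is not node and n.state == "CONNECTED"]
    if not peers:
        raise RuntimeError("no connected peers to receive offloaded data")

    node.state = "OFFLOADING"  # the node stops processing while it offloads
    i = 0
    while node.flowfiles:
        # Hand each queued flow file to a remaining peer in turn.
        peers[i % len(peers)].flowfiles.append(node.flowfiles.pop(0))
        i += 1

    node.state = "OFFLOADED"
    del cluster[name]  # removal happens only after the node is empty
    return node


cluster = {n.name: n for n in [
    Node("node-1", ["a"]),
    Node("node-2", ["b"]),
    Node("node-3", ["c", "d", "e"]),
]}
removed = decommission(cluster, "node-3")
print({n.name: n.flowfiles for n in cluster.values()}, removed.flowfiles)
```

Because removal is the last step, the decommissioned node ends up with nothing on its local storage and every flow file still lives somewhere in the cluster.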

The complete release notes for HDF 3.3 are available here.

Keep following our series of blog posts on Kafka and HDF 3.3. There are more exciting innovations to follow soon!
