We are thrilled to announce the general availability of Hortonworks DataFlow (HDF) version 3.1 – introducing powerful new data-in-motion capabilities for edge analytics, flow management and streaming analytics, all powered and managed by a set of enterprise management services for governance, security and monitoring.
The governing vision of HDF is to be the industry-leading data-in-motion platform that collects, curates, analyzes and acts on data in motion across the data center, cloud, edge devices, and sensors. Based on our work with real-world customer use cases, we strongly believe that a data-in-motion platform must have both flow management and streaming analytics capabilities, powered by a set of enterprise management services. With HDF 3.1, we continue to execute on this vision. The diagram below highlights some of the new features, so read on!
HDF Flow Management Features in HDF 3.1
In HDF 3.1, the flow management improvements focus on core enhancements and on cross-component integration and operability across the platform. New control and manageability features help customers navigate, store, secure, and consume data in motion more easily, with faster time-to-value, cross-component transparency, and lower operational overhead.
- Apache NiFi Registry – Powered by the Apache NiFi Registry project, this is a net-new component of the enterprise foundational services in the HDF platform. It abstracts Apache NiFi flows into versioned artifacts, enabling version control at a very granular level and allowing users to easily design and deploy flows across environments. This functionality significantly improves the storage, control, and management of versioned flows, shortening the software development life cycle (SDLC) and accelerating application deployment to achieve faster time-to-value.
- Productivity enhancement – By enabling integrations between NiFi and components such as Apache Atlas, Hortonworks SmartSense, and Apache Knox, HDF 3.1 provides better manageability of and access to data and toolsets across the platform. When NiFi in HDF is deployed alongside Hortonworks Data Platform (HDP) services, users get a comprehensive cross-component lineage view at the dataset level with Atlas, an easier data collection process with SmartSense for faster troubleshooting, and the convenience of single sign-on with Knox as a standard security gateway.
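To make the idea of versioned flows a little more concrete, here is a minimal sketch of what "version control for flows" means in practice. It is a toy in-memory model, not the NiFi Registry API: the `ToyFlowRegistry` class, its method names, and the bucket and flow names are all hypothetical, and the processor names are only used as illustrative strings.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ToyFlowRegistry:
    """Hypothetical in-memory stand-in for a flow registry (not the NiFi Registry API)."""

    # Maps (bucket, flow name) to an ordered list of saved versions.
    _versions: dict = field(default_factory=dict)

    def commit(self, bucket: str, flow: str, snapshot: dict, comment: str = "") -> int:
        """Store a new version of a flow and return its version number (starting at 1)."""
        history = self._versions.setdefault((bucket, flow), [])
        # Deep-copy so later edits to the caller's dict cannot change saved history.
        history.append({"snapshot": copy.deepcopy(snapshot), "comment": comment})
        return len(history)

    def latest_version(self, bucket: str, flow: str) -> int:
        """Return the newest version number, or 0 if the flow was never committed."""
        return len(self._versions.get((bucket, flow), []))

    def checkout(self, bucket: str, flow: str, version: int) -> dict:
        """Return a copy of the snapshot saved under the given version number."""
        history = self._versions.get((bucket, flow), [])
        if not 1 <= version <= len(history):
            raise ValueError(f"no version {version} for {bucket}/{flow}")
        return copy.deepcopy(history[version - 1]["snapshot"])


registry = ToyFlowRegistry()
v1 = registry.commit("ingest", "syslog-to-kafka",
                     {"processors": ["ListenSyslog", "PublishKafka"]},
                     "initial flow")
v2 = registry.commit("ingest", "syslog-to-kafka",
                     {"processors": ["ListenSyslog", "UpdateAttribute", "PublishKafka"]},
                     "tag events before publishing")

# A production environment can stay pinned to v1 while development moves on to v2.
prod_flow = registry.checkout("ingest", "syslog-to-kafka", v1)
```

The point of the sketch is the shape of the workflow: every save produces a new numbered version, and each environment chooses which version to deploy.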
The diagram below summarizes the key new features and improvements in HDF Flow.
HDF Stream Processing Features in HDF 3.1
Powerful new streaming features focus on making the tasks of developer, DevOps and architecture teams easier to perform in an enterprise environment. The top four new capabilities in streaming are the following:
- Apache Kafka 1.0 support with full integration with HDF Services – Kafka 1.0 provides important new features including more stringent message processing semantics with support for message headers and transactions, performance improvements and advanced security options.
  - Apache Ambari support for Kafka 1.0 – Install, configure, manage, upgrade, monitor, and secure Kafka 1.0 clusters with Ambari.
  - Apache Ranger support for Kafka 1.0 – Manage access control policies (ACLs) using resource or tag-based security for Kafka 1.0 clusters.
  - New NiFi and SAM processors for Kafka 1.0 – New processors in NiFi and Hortonworks Streaming Analytics Manager (SAM) support Kafka 1.0 features including message headers and transactions.
- SAM’s Test Mode and the new SAM Operations Module – SAM continues to make the jobs of developers and DevOps teams easier with two new capabilities:
  - SAM “Test Mode” – This new feature allows developers to test SAM apps by mocking out sources with test data, so that unit tests for SAM apps can be created and integrated into continuous integration and delivery (CI/CD) environments.
  - New SAM Operations Module – This new module provides DevOps tooling with rich visualization to monitor and troubleshoot app performance and failure issues.
- SAM extensibility improvements – In the previous release, SAM users could only consume events from Kafka and there was no way to build and use custom sources (e.g., Amazon Kinesis). With HDF 3.1, developers can now build and register custom sources and sinks integrated with Hortonworks Schema Registry. Developers can also use existing Storm code and wrap it with the SAM SDK and register it with SAM.
- Hortonworks Schema Registry’s new schema version lifecycle management – Schema Registry is popular with many of our customers, but a common request was support for lifecycle actions on schema versions. With HDF 3.1, developers and platform teams can manage schema version states, including archiving and disabling versions and defining custom states. Developers and governance teams can also branch schema versions and perform workflow lifecycle actions including fork, start review, finish review, enable, and merge.
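The schema version lifecycle described in the last item can be pictured as a small state machine. The sketch below is illustrative only: the `SchemaVersionLifecycle` class, the state names, and the allowed transitions are assumptions made for this example, not the Schema Registry API or its actual transition rules. Branching (fork and merge) is left out to keep it short.

```python
class SchemaVersionLifecycle:
    """Toy state machine for one schema version; states and transitions are illustrative."""

    def __init__(self):
        # Hypothetical rules: action name -> (required current state, resulting state).
        self.transitions = {
            "start_review": ("INITIATED", "IN_REVIEW"),
            "finish_review": ("IN_REVIEW", "REVIEWED"),
            "enable": ("REVIEWED", "ENABLED"),
            "disable": ("ENABLED", "DISABLED"),
            "archive": ("ENABLED", "ARCHIVED"),
        }
        self.state = "INITIATED"
        self.history = [self.state]

    def register_action(self, action, from_state, to_state):
        """Add a custom action; to_state may be a brand-new custom state."""
        self.transitions[action] = (from_state, to_state)

    def apply(self, action):
        """Perform a lifecycle action, rejecting it if the current state does not allow it."""
        if action not in self.transitions:
            raise ValueError(f"unknown action: {action}")
        required, target = self.transitions[action]
        if self.state != required:
            raise ValueError(f"cannot {action} from state {self.state}")
        self.state = target
        self.history.append(target)
        return target


version = SchemaVersionLifecycle()
for action in ("start_review", "finish_review", "enable"):
    version.apply(action)

# A platform team defines its own state on top of the built-in ones.
version.register_action("deprecate", "ENABLED", "DEPRECATED")
version.apply("deprecate")
```

Modeling the rules as a lookup table is a deliberate choice here: adding a custom state is then just one more table entry rather than new control flow.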
The diagram below summarizes the key new features and improvements in HDF Streaming.
What’s Next? HDF 3.1 Blog Series
The HDF product and engineering teams are excited to share more details on the new features in the HDF 3.1 release. Over the next few weeks, we will publish a set of blogs about them as part of the HDF 3.1 blog series. The next two blogs, slated for next week, cover Apache NiFi Registry and Kafka 1.0 and their integration with our HDF services.
Amazon Kinesis is a trademark or registered trademark of Amazon in the United States and/or other countries.
Druid is a registered trademark of Metamarkets Group, Inc.