August 10, 2016

Three Things To Know About HDF 2.0

Hortonworks DataFlow (HDF) offers a combination of Apache NiFi, Kafka, and Storm. HDF 2.0 adds significant architecture and enterprise productivity features that make it faster and easier to deploy, manage, and analyze streaming data. In the next few weeks we will go into more detail, but for now, here are the three highlights to take note of.

1) An integrated ecosystem of Apache NiFi, Kafka, and Storm

HDF 2.0 offers an enterprise-ready, integrated deployment and management option for streaming analytics, from the edge into the core, with:

  • Apache NiFi for dynamic, configurable data pipelines, through which all sources, systems and destinations communicate.
  • Apache Kafka for high-throughput distributed messaging with pub-sub semantics, operating at speed on big data volumes and adapting to differing rates of data creation and delivery.
  • Apache Storm for real-time streaming analytics to create immediate insights at massive scale, with performance that is 6-10X faster than any previous Storm release.
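To make "differing rates of data creation and delivery" concrete, here is a minimal sketch of pub-sub over an append-only log, where each subscriber tracks its own read position. It is a toy in-memory model for illustration only: the `TopicLog` class and its `publish` and `poll` methods are hypothetical names, not the Kafka client API.

```python
class TopicLog:
    """Toy in-memory topic (hypothetical, not Kafka): an append-only list
    plus one read offset per subscriber."""

    def __init__(self):
        self._messages = []   # messages in publish order
        self._offsets = {}    # subscriber name -> next index to read

    def publish(self, message):
        # Producers only append; they never wait for consumers.
        self._messages.append(message)

    def poll(self, subscriber, max_messages):
        # Each subscriber reads from its own offset, at its own pace.
        start = self._offsets.get(subscriber, 0)
        batch = self._messages[start:start + max_messages]
        self._offsets[subscriber] = start + len(batch)
        return batch


topic = TopicLog()
for i in range(5):
    topic.publish(f"event-{i}")

fast = topic.poll("fast-consumer", 5)       # reads everything at once
slow = topic.poll("slow-consumer", 2)       # reads two messages now...
slow_next = topic.poll("slow-consumer", 2)  # ...and two more later
```

Because the log keeps the messages and each subscriber keeps its own offset, a slow consumer neither blocks the producer nor affects a fast consumer.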

The new streaming analytics features of HDF 2.0 enable businesses to turn in-motion data into real-time insights, with the highlights below.

  • Storm Windowing and State Management
  • Improved Storm Topology Debugging, including Dynamic Worker Profiling, Topology Event Inspector, Dynamic Log Levels and Distributed Log Search
  • Improved Kafka SASL and Kafka Automated Replica Leader Election
  • Improved Storm Scalability with Pacemaker Daemon, Resource Aware Scheduling and Improved Nimbus HA
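As a rough illustration of what windowing means in stream processing, the sketch below groups timestamped events into fixed, non-overlapping (tumbling) windows and counts events per key. It is plain Python written for this post's readers, not Storm's windowing API; the function name, the event layout, and the sample sensor names are all assumptions.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per key in fixed, non-overlapping time windows.

    `events` is an iterable of (timestamp_seconds, key) pairs
    (an assumed layout for this sketch).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for timestamp, key in events:
        # Align each event to the start of the window that contains it.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[window_start][key] += 1
    # Return plain dicts, ordered by window start time.
    return {start: dict(per_key) for start, per_key in sorted(counts.items())}


events = [
    (0, "sensor-a"), (3, "sensor-b"), (9, "sensor-a"),
    (10, "sensor-a"), (14, "sensor-b"), (21, "sensor-a"),
]
result = tumbling_window_counts(events, 10)
```

A real streaming engine computes such windows incrementally over unbounded input and keeps the intermediate counts as managed state, which is where state management comes in.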

Many enterprises today deploy a combination of individual products for data movement, data collection, messaging bus and real-time streaming analytics to create an integrated in-house solution. HDF accelerates the on-ramp to streaming analytics with an integrated, enterprise-ready solution.

2) Enterprise readiness, integration with Ambari and Ranger

With the new enterprise readiness features of HDF 2.0, businesses can accelerate business value from data in motion through operational visibility and centralized security.

Operational Visibility Improvements of HDF 2.0

  • Integrated and comprehensive platform level monitoring and management
  • Integration with Ambari Views and Grafana provides improved metrics collection and sampling for more accurate and granular performance metrics, as well as time-series metrics visualization and configurable metrics dashboards

Centralized Security Improvements of HDF 2.0

  • Integrated governance and security for NiFi, Storm and Kafka
  • Centralized authorization management for all the components

3) Extending the reach towards the edge, support for Apache MiNiFi

HDF 2.0 supports Apache MiNiFi, a subproject of Apache NiFi designed to solve the difficulties of managing and transmitting data feeds to and from their source of origin. It enables edge intelligence to adjust dataflow behavior through bi-directional communication, out to the last mile of digital signal.

MiNiFi is designed to have a very small, lightweight footprint*, to generate the same level of data provenance as NiFi, which is vital to edge analytics and IoAT (Internet of Any Thing), and to integrate with NiFi for follow-on dataflow management and full chain of custody of information. (MiNiFi is pronounced “minify” [min-uh-fahy], and the Java agent is supported as part of HDF 2.0.)

*The MiNiFi Java agent code base is <40 MB, with a configurable memory footprint. For more information about MiNiFi, see the Apache MiNiFi project page. For a connected car example of MiNiFi, see here.

Those are the three things to know about HDF 2.0, which we will cover in further detail in upcoming blog posts. In the meantime, we recommend the following for further reading about how Hortonworks DataFlow is used in real-world environments.
