As part of the product management leadership team at Hortonworks, I have found nothing more valuable than talking directly with customers and learning about their successes, challenges, and struggles in implementing their big data and analytics use cases with HDP and HDF. These conversations provide more insight than any analyst report, white paper, or market study.
In my 4+ years at Hortonworks, I have had many opportunities for face time with our more than 1000 customers. These conversations have strongly influenced how we build enterprise software products that are easier to use.
There have been a handful of moments with customers that leave an indelible mark and reshape how one thinks about a problem. One of those moments occurred a few months ago with a customer who was using Apache NiFi, as part of the Hortonworks DataFlow (HDF) platform, to ingest, route, move, enrich, and transform data from edge devices such as cable modems, voice-over-IP phones, and home security systems. HDF was transformative for this customer, and they especially appreciated how NiFi’s compelling user experience greatly reduced the operational effort of data ingestion and flow management.
I posed the following question to the customer:
“Where did you experience pain when implementing this use case? Where can we continue to innovate in HDF to ease those pains?”
The response went something like this:
“Using NiFi with its rich UI has been a refreshingly delightful experience for us as we build flow management applications. However, we desperately need the same type of experience when building streaming analytics apps. Flow management only gets us halfway there. We need a rich UI to build analytical apps that operate on the stream.”
The above response has been echoed by almost every one of our customers, and it has strongly influenced the strategic direction, efforts, and investments in the Hortonworks data-in-motion platform, Hortonworks DataFlow (HDF). We gleaned two insights from the customer’s response: first, customers need both flow management and streaming analytics to be successful; second, they need to be able to build streaming analytics apps quickly and easily.
What is the difference between flow management and streaming analytics?
As the customer above noted, one needs both capabilities to be successful. This is why HDF was expanded in mid-2016, with the HDF 2.0 release, to offer stream processing via Apache Storm and Apache Kafka. The diagram below summarizes this expansion.
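The distinction between the two capabilities can be made concrete with a small sketch: flow management moves, routes, and enriches records, while streaming analytics computes results over the stream itself, for example a tumbling-window aggregation. The sketch below is illustrative Python only, not HDF or NiFi code; the event format, device names, and window size are hypothetical.

```python
from collections import defaultdict

def tumbling_window_counts(events, window_seconds):
    """Aggregate (timestamp, device_id) events into per-window counts.

    This is the kind of computation a streaming analytics app performs
    on data that a flow management layer has already ingested and routed.
    """
    windows = defaultdict(lambda: defaultdict(int))
    for ts, device in events:
        # Assign each event to the tumbling window containing its timestamp.
        window_start = (ts // window_seconds) * window_seconds
        windows[window_start][device] += 1
    return {w: dict(counts) for w, counts in sorted(windows.items())}

# Hypothetical edge-device events: (epoch seconds, device id)
events = [(0, "modem"), (3, "voip"), (7, "modem"), (12, "modem")]
result = tumbling_window_counts(events, window_seconds=10)
# result: {0: {"modem": 2, "voip": 1}, 10: {"modem": 1}}
```

In a real deployment this aggregation would run continuously inside a streaming engine such as Storm, consuming from Kafka, rather than over an in-memory list.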
Simply adding Apache Storm and Kafka to HDF does not address the second key point: building stream analytics quickly and easily. Customers often cite the following key challenges:
How do we address these challenges? Over the last six months, the Hortonworks Stream Processing engineering and product management teams have been working on a brand-new set of powerful components that address each of them. The sections below outline some of the fundamental principles driving this initiative.
Two design principles drove this effort. First, the new components should let users design, develop, deploy, and manage complex streaming analytics apps without knowing the complexities of the underlying streaming engine; developers should be able to build such apps while writing as little code as possible. Second, the toolsets need to cater to three important personas within the organization:
Over the next few weeks, the Hortonworks engineering and product management teams will publish a series of blog posts providing more details on these new tools and on the other enterprise management services this new and exciting technology requires. Stay tuned!
Read the next blog post in this series: HDF Series Part 2: A Shared Schema Registry – What is it and Why is it important?