Kafka in Trucking IoT on HDF

Explore Kafka in the Demo



While the demo application runs, you will see how Kafka receives data from a producer and stores it in specific topics.


Environment Setup

If you have the latest Hortonworks DataFlow (HDF) Sandbox installed, the demo comes preinstalled. If not, or if the demo is not yet set up, refer to Setup Demo on existing HDF/HDP.

Open a terminal on your local machine and access the sandbox through the shell-in-a-box method. Please visit Learning the Ropes of the Hortonworks Sandbox to review this method.

Before we can perform Kafka operations on the data, we must first have data in Kafka, so let’s run the NiFi DataFlow Application. Follow the steps in the module Run NiFi in the Trucking IoT Demo; after that, you will be ready to explore Kafka.

If the Kafka component is not already running, turn it on through Ambari.

Persist Data Into Kafka Topics

A NiFi simulator generates two types of data, TruckData and TrafficData, as CSV strings. The data is preprocessed so that it can be split and sent by NiFi’s Kafka producers to two separate Kafka Topics: trucking_data_truck and trucking_data_traffic.
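
The split into two topics can be summarized as a small lookup. This is only an illustrative shell sketch: in the demo the routing happens inside the NiFi flow, the pairing of each data type with a topic is inferred from the names, and topic_for is a hypothetical helper name.

```shell
# topic_for TYPE: prints the Kafka topic that receives the given data type.
# Hypothetical helper for illustration; the real routing is done by NiFi.
topic_for() {
  case "$1" in
    TruckData)   echo "trucking_data_truck" ;;
    TrafficData) echo "trucking_data_traffic" ;;
    *)           echo "unknown data type: $1" >&2; return 1 ;;
  esac
}
```

For example, `topic_for TruckData` prints the name of the topic that holds truck records, and any other type name returns a non-zero status.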

List Kafka Topics

From the terminal, we can see the two Kafka Topics that have been created:

/usr/hdf/current/kafka-broker/bin/ --list --zookeeper localhost:2181
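
To confirm that both topics exist without reading the listing by eye, the output of the list command can be filtered. The sketch below assumes the listing prints one topic name per line; missing_topics and EXPECTED_TOPICS are hypothetical names introduced here, not part of Kafka or HDF.

```shell
# Topic names taken from the tutorial text.
EXPECTED_TOPICS="trucking_data_truck trucking_data_traffic"

# missing_topics: reads a topic listing (one name per line) on stdin and
# prints every expected topic that does not appear in it.
# Hypothetical helper; prints nothing when all expected topics are present.
missing_topics() {
  listing=$(cat)
  for topic in $EXPECTED_TOPICS; do
    # -x matches the whole line, -F treats the name as a literal string
    if ! printf '%s\n' "$listing" | grep -qxF "$topic"; then
      printf '%s\n' "$topic"
    fi
  done
}
```

Pipe the output of the list command above into `missing_topics`. Empty output means both topics were found.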




View Data in Kafka Topics

As messages are persisted into the Kafka Topics from the producer, you can see them appear in each topic by running the following commands:

View Data for Kafka Topic: trucking_data_truck:

/usr/hdf/current/kafka-broker/bin/ --zookeeper localhost:2181 --topic trucking_data_truck --from-beginning

View Data for Kafka Topic: trucking_data_traffic:

/usr/hdf/current/kafka-broker/bin/ --zookeeper localhost:2181 --topic trucking_data_traffic --from-beginning

As you can see, Kafka acts as a robust queue that receives data and allows it to be transmitted to other systems.

Note: You may notice that the data is encoded in a format we cannot read. This format is required by Schema Registry, which we use because Streaming Analytics Manager needs it to pull data from Kafka.
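
Because the messages are binary-encoded, printing them straight to the terminal can produce garbled output. One way to inspect a few of them more safely is to pipe the consumer output through a byte dump. The sketch below assumes the standard od and head utilities are available on the sandbox; dump_bytes is a hypothetical helper name, not part of Kafka or HDF.

```shell
# dump_bytes [LINES]: reads raw bytes on stdin and prints the first LINES
# lines (default 20) of a character dump, so non-printable bytes appear as
# escapes or octal codes instead of being sent raw to the terminal.
# Hypothetical helper; not part of Kafka or HDF.
dump_bytes() {
  od -c | head -n "${1:-20}"
}
```

For example, append `| dump_bytes 10` to either consumer command above to see the first ten lines of the dump.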

Next: Learn Basic Operations of Kafka

You have already become familiar with some Kafka operations through the command line, so let’s explore basic operations to see how those topics were created, how they can be deleted and how we can use tools to monitor Kafka.