June 02, 2016

What’s New in Apache Storm 1.0 – Part 1 – Enhanced Debugging

Debugging distributed systems can be difficult, largely because they are designed to run on many (possibly thousands of) hosts in a cluster. Debugging typically involves monitoring and analyzing log files spread across the cluster, and if the necessary information is not being logged, service restarts and job redeployment may be required. Not only is this process tedious, it can also be disruptive for systems running in production.

The latest 1.0 release of Apache Storm includes a number of important new features that address this difficulty. In this post we’ll take a high-level look at what these features mean for Storm users and administrators.

[Image: The first computer bug, a moth that was stuck in a relay, discovered by Grace Hopper.]

Dynamic Log Levels

In previous versions of Storm, changing logging levels required manually editing configuration files on every node in the cluster. This was especially tedious in large clusters, and to make matters worse, once you were finished you had to repeat the process to revert those changes.

Storm 1.0 allows you to change any log level directly from the Storm UI or the command line, without having to log in remotely to machines in the cluster. What’s more, it also allows you to specify an expiration time after which the changes will be automatically reverted.
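
The expiring override described above can be pictured with a small model. This is only a sketch of the idea, not Storm's implementation: the class `TimedLogLevels` and its methods are hypothetical names, the default level of `INFO` is an assumption, and the current time is passed in explicitly so the behavior is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model only; these names are hypothetical and not part of Storm.
class TimedLogLevels {
    // A temporary level together with the time at which it stops applying.
    private static final class TimedLevel {
        final String level;
        final long expiresAtMillis;

        TimedLevel(String level, long expiresAtMillis) {
            this.level = level;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, String> configuredLevels = new HashMap<>();
    private final Map<String, TimedLevel> overrides = new HashMap<>();

    // The level a logger has in its configuration file.
    void setConfiguredLevel(String logger, String level) {
        configuredLevels.put(logger, level);
    }

    // Temporarily change a logger's level for timeoutSecs seconds.
    void overrideLevel(String logger, String level, long nowMillis, long timeoutSecs) {
        overrides.put(logger, new TimedLevel(level, nowMillis + timeoutSecs * 1000L));
    }

    // The level in effect at nowMillis; an expired override is discarded.
    String effectiveLevel(String logger, long nowMillis) {
        TimedLevel override = overrides.get(logger);
        if (override != null) {
            if (nowMillis < override.expiresAtMillis) {
                return override.level;
            }
            overrides.remove(logger);
        }
        // Assumed default when nothing is configured for this logger.
        return configuredLevels.getOrDefault(logger, "INFO");
    }

    public static void main(String[] args) {
        TimedLogLevels levels = new TimedLogLevels();
        levels.setConfiguredLevel("ROOT", "INFO");
        levels.overrideLevel("ROOT", "DEBUG", 0L, 30L);
        System.out.println(levels.effectiveLevel("ROOT", 10_000L)); // DEBUG
        System.out.println(levels.effectiveLevel("ROOT", 30_000L)); // INFO
    }
}
```

The point of the model is that nobody has to remember to undo the change: once the timeout passes, the configured level applies again.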

Distributed Log Search

The log file viewer added in the Apache Storm 0.9.1 release made accessing Storm’s log files significantly easier, but in some cases it still required examining individual log files one by one. In Storm 1.0 the UI now includes a powerful search feature that allows you to search a specific topology log file, or all topology log files in the cluster, including archived files.

When performing a topology-wide search, the UI will search across all supervisor nodes for a match. The search results include a link to the matching log file, as well as host and port information that allow you to quickly identify on which machine a specific log event occurred. This feature is particularly helpful when trying to track down when and where a particular error occurred.
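
To make the shape of a topology-wide search result concrete, here is a minimal in-memory sketch. It is not how the Storm UI implements search: `LogSearchSketch`, `Match`, and the `"host:port"` key format are all assumptions made for illustration, and the log lines live in a map rather than in files on supervisor nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only; names and data layout are hypothetical.
class LogSearchSketch {
    // One search hit: where it was found and the matching line.
    static final class Match {
        final String host;
        final int port;
        final int lineNumber;
        final String line;

        Match(String host, int port, int lineNumber, String line) {
            this.host = host;
            this.port = port;
            this.lineNumber = lineNumber;
            this.line = line;
        }

        @Override
        public String toString() {
            return host + ":" + port + " line " + lineNumber + ": " + line;
        }
    }

    // Scan every worker's log for the search string.
    // Keys are "host:port" strings; values are that worker's log lines.
    static List<Match> search(Map<String, List<String>> logsByWorker, String needle) {
        List<Match> matches = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : logsByWorker.entrySet()) {
            String[] hostAndPort = entry.getKey().split(":");
            String host = hostAndPort[0];
            int port = Integer.parseInt(hostAndPort[1]);
            List<String> lines = entry.getValue();
            for (int i = 0; i < lines.size(); i++) {
                if (lines.get(i).contains(needle)) {
                    matches.add(new Match(host, port, i + 1, lines.get(i)));
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Example hosts, ports, and log lines are made up for the demonstration.
        Map<String, List<String>> logs = new LinkedHashMap<>();
        logs.put("node1:6700", Arrays.asList("INFO worker started", "ERROR tuple timed out"));
        logs.put("node2:6701", Arrays.asList("INFO worker started"));
        for (Match match : search(logs, "ERROR")) {
            System.out.println(match);
        }
    }
}
```

Carrying the host and port with every hit is what lets a single search answer both "what happened" and "on which machine."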

Event Sampling

In the past, it was common practice for developers to insert “debug” bolts or Trident functions into their Storm topologies in order to trace the flow of data through a topology. The problem with this approach was that these “debug” components were usually not meant for production, and removing them necessitated repackaging and redeploying the topology.

The Event Sampling feature introduced in Storm 1.0 eliminates the need for this practice by allowing users to sample a percentage of live data as it flows through a topology, and to view and download it directly from the Storm UI. Users can sample data at the topology level, or drill down and sample data from individual spouts and bolts. When you are finished sampling, simply turn it off. There’s no need to stop or redeploy the topology.
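
Percentage-based sampling itself is a simple idea, sketched below. This is a generic illustration rather than Storm's event logger: `EventSamplerSketch` is a hypothetical class, events are plain strings, and a seeded `java.util.Random` decides whether each event is kept.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative model only; not Storm's event sampling implementation.
class EventSamplerSketch {
    private final double percent;
    private final Random random;
    private final List<String> sampled = new ArrayList<>();

    EventSamplerSketch(double percent, long seed) {
        if (percent < 0.0 || percent > 100.0) {
            throw new IllegalArgumentException("percent must be between 0 and 100");
        }
        this.percent = percent;
        this.random = new Random(seed);
    }

    // Called for every event; each one is kept with probability percent / 100.
    void observe(String event) {
        if (random.nextDouble() * 100.0 < percent) {
            sampled.add(event);
        }
    }

    List<String> sampled() {
        return sampled;
    }

    public static void main(String[] args) {
        EventSamplerSketch sampler = new EventSamplerSketch(10.0, 42L);
        for (int i = 0; i < 1000; i++) {
            sampler.observe("tuple-" + i);
        }
        System.out.println("kept " + sampler.sampled().size() + " of 1000 events");
    }
}
```

Because the sampler only observes events and never alters them, it can be switched on and off without changing what the rest of the pipeline sees.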

Dynamic Worker Profiling

When debugging or tuning JVM applications for performance and memory usage, a few utilities are invaluable:

  • Heap Dumps provide a snapshot of all the objects allocated in the JVM heap at a given point in time.
  • jstack provides stack traces of all threads in a JVM process.
  • Java Flight Recorder is a tool for collecting diagnostic and profiling information about a running JVM application.

Typically these tools are used from the command line on the machine where the JVM application is running. With a distributed system such as Storm, using these tools required logging into the specific machine, identifying the target process, and manually running the appropriate tool.

In Storm 1.0, access to these tools is integrated directly into the Storm UI. Getting a heap dump, jstack stack trace, or Java Flight Recorder recording is as easy as clicking a button and downloading the resulting file. Once downloaded, you can use the analysis and visualization tools of your choice to get an in-depth view into the JVM process.
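
For readers who have not used jstack, the sketch below shows the kind of information a thread dump contains. It does not reflect how Storm collects dumps from workers; it simply uses the standard `java.lang.management` API to dump the threads of the JVM it runs in, and the class name `ThreadDumpSketch` and the output layout are assumptions made for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Illustrative only: prints a jstack-like dump of the current JVM's threads.
class ThreadDumpSketch {
    static String dumpThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip locked monitor and synchronizer details.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            if (info == null) {
                continue;
            }
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpThreads());
    }
}
```

Each entry lists a thread's name, its state, and its current stack frames, which is usually enough to spot a worker that is blocked or spinning.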

Conclusion

While debugging a distributed system such as Storm may not fit everyone’s definition of “fun,” it is frequently unavoidable. These new enhancements in Storm 1.0 make that job significantly easier than it has been in the past.

For a more in-depth look at how to use these new features, check out this technical article by Apache Storm Committer and PMC Member Arun Mahadevan.

Happy debugging!
