Apache Storm and Hadoop
In February 2014, the Apache Storm community released Storm version 0.9.1. Storm is a distributed, fault-tolerant, high-performance real-time computation system that provides strong guarantees on the processing of data. Hortonworks already supports customers who use this important project.
Many organizations already use Storm, including our partner Yahoo! Apache Storm 0.9.1 is:
- Highly scalable. Like Hadoop, Storm scales linearly.
- Fault-tolerant. Storm automatically reassigns tasks if a node fails.
- Reliable. Storm supports “at least once” and “exactly once” processing semantics.
- An Apache project. This brings with it the brand, governance, and large community of the Apache Software Foundation.
Netty-based Messaging Transport
The biggest code change in version 0.9.1 was the replacement of 0MQ as the default transport with a pure-Java, Netty-based transport. Special thanks to the engineering team at Yahoo! for contributing that.
Previously, installing the 0MQ native binaries proved difficult for many users. The pure-Java solution cures that headache. Netty also improves Storm’s performance over 0MQ, allowing twice as many messages per second through the same cluster.
All this being said, the 0MQ transport is still an available and supported option for those who want to use it.
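As a rough illustration of how the transport is selected, the sketch below builds the relevant settings as a plain Java map, the same shape as a Storm topology configuration or the entries in storm.yaml. The class and method names are hypothetical, and the map is a stand-in for Storm's own configuration object so that the example runs without Storm on the classpath. The key names and the Netty context class are taken from the Storm 0.9.x configuration as we recall it, and the tuning values are illustrative only, so check them against the defaults shipped with your release.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class; not part of Storm.
public class NettyTransportSketch {

    // Builds the settings that select the Netty transport.
    // Key names are assumed from the Storm 0.9.x configuration;
    // the numeric values are illustrative, not recommendations.
    static Map<String, Object> nettyTransportConfig() {
        Map<String, Object> conf = new LinkedHashMap<>();
        // Selects the messaging implementation used between workers.
        conf.put("storm.messaging.transport",
                 "backtype.storm.messaging.netty.Context");
        // Netty send/receive buffer size in bytes.
        conf.put("storm.messaging.netty.buffer_size", 5242880);
        // Reconnect attempts and back-off bounds in milliseconds.
        conf.put("storm.messaging.netty.max_retries", 100);
        conf.put("storm.messaging.netty.min_wait_ms", 100);
        conf.put("storm.messaging.netty.max_wait_ms", 1000);
        return conf;
    }

    public static void main(String[] args) {
        // Print the settings as key: value pairs for inspection.
        for (Map.Entry<String, Object> e : nettyTransportConfig().entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

In a real deployment these entries would normally live in storm.yaml or be set on the topology configuration. Switching back to 0MQ would mean pointing the transport key at the 0MQ implementation instead.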
Windows Platform Support
This is the first release of Storm with built-in Windows support. It is an important step for those who have invested in a Windows-based infrastructure and want to use Storm for real-time stream processing.
Hortonworks Data Platform is the only Hadoop distribution that supports Windows. Now that Storm is part of HDP, it will also run on Windows.
Apache Maven for Storm Builds
From a developer perspective, we migrated our build tool from Leiningen to Apache Maven. This was the right thing to do for release management: Maven offers more options for integrating Storm’s build process with the ASF release infrastructure.
Now we’re in a much better position to release early and often.
Coming Next: Security, Multi-tenancy and Storm-On-YARN
Now that we have our first Apache release out, we’re in a better position to work on what matters most to our users: improving Storm and adding new features.
A focus for upcoming releases will be security and multi-tenancy. The engineering team at Yahoo! has contributed a tremendous amount of work in that regard, and we’ll be looking to get those features added to the main codebase.
There is also a lot of interest in running Storm on YARN. Again, Yahoo! has done a lot of work in this area and has open-sourced a preliminary implementation of Storm on YARN.
Storm Comes to the Apache Software Foundation (ASF)
This is Storm’s first release from the Apache Software Foundation. The ASF ensures that released software adheres to a stringent set of licensing and distribution rules that protect both the users of the software and the contributing developers.
Thanks to the Team
Many people worked hard to bring Storm into the ASF and to release version 0.9.1. Thanks to the following folks who made this release possible: Andy Feng, David Lao, Derek Dagit, Flip Kromer, James Xu, Jason Jackson, Nathan Marz, and Robert Evans.