Storm, Stream Data Processing
YARN opened Hadoop to data access by applications other than MapReduce. One of the most requested use cases was the antithesis of batch: stream processing in Hadoop. Apache Storm is a fully certified component of HDP 2.1, and our customers use stream processing for real-time analysis of newer types of data such as sensor and machine data.
The team at BackType/Twitter originally conceived Storm to analyze the tweet stream in real time. Storm entered the Apache Incubator in September 2013. Hortonworks engineering is deeply committed to integrating Storm with Hadoop.
Beginning with Hortonworks Data Platform version 2.1, Apache Storm is a fully certified component of HDP. The current version of Storm replaces the 0MQ data transport with a pure-Java, Netty-based transport, which eliminates the challenge of installing the 0MQ native binaries. Storm 0.9.1 also includes built-in support for Windows.
- Install, Start & Stop via Ambari
- Kafka, HBase & HDFS Connectors
- Ganglia & Nagios Monitoring
- Ingest & Notification for JMS
- Data Persistence: EDWs, RDBMS, Cassandra
- HA Management w/Ambari
- AD/LDAP Authentication Plugin
- Declarative “wiring”
- Hive Update Support
- Advanced Scheduler