Storm, Stream Data Processing
The general availability of YARN promises to widen the range of processing engines that can run in Hadoop. One of the most common use cases is in fact the antithesis of batch: stream processing in Hadoop. Early adopters are using Apache Storm to analyze, in real time, some of the most common new types of data, such as sensor and machine data.
Streams in HDP
Bringing stream data processing to enterprise Apache Hadoop and Hortonworks Data Platform
Storm on YARN
Use YARN, the Hadoop operating system, to allow multiple workloads to be applied to Hadoop data simultaneously
Bring baseline high availability, management, authentication and advanced scheduling to Storm
Originally conceived and built by the team at BackType/Twitter to analyze the tweet stream in real time, Storm became an official Apache Incubator project in September 2013. Hortonworks has made an engineering commitment to integrate Storm deeply with Hadoop, specifically as a supported component of the 100% open source Hortonworks Data Platform. Hortonworks will make a preview of Storm available in Q4 of 2013 and will include a fully certified version of Storm in the Hortonworks Data Platform in Q1 of 2014.
Hortonworks Data Platform
The Hortonworks Data Platform is a 100% open source distribution of Apache Hadoop that is truly enterprise grade, having been built, tested and hardened with enterprise rigor.
Modern Data Architecture
Find out how Hadoop integrates with your existing EDW, RDBMS and MPP systems to deliver lower cost, higher capacity infrastructure to tackle the challenges of big data.
- Installation with Ambari
- Ganglia & Nagios Monitoring
- Data ingest Spouts
- Bolts for notification and data persistence – HDFS, HBase
- AD/LDAP plugin for authentication
- HA management with Ambari
- Declarative “wiring”
- Hive update support
- Advanced scheduler
- OpenStack Savanna support
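The spout and bolt items in the list above refer to Storm's two building blocks: a spout emits a stream of tuples, and bolts consume those tuples to act on them. The sketch below illustrates that division of labor in plain Java. It does not use Storm's API: the `Spout` and `Bolt` interfaces, `ListSpout`, `AlertBolt`, `StoreBolt`, the sample readings, and the 30.0 threshold are all hypothetical names and values chosen for illustration, and an in-memory list stands in for a store such as HDFS or HBase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT Storm's API.
interface Spout<T> {
    // Returns the next tuple, or null when the source is exhausted.
    T nextTuple();
}

interface Bolt<T> {
    // Handles one tuple from the stream.
    void execute(T tuple);
}

// Data ingest spout: replays a fixed list of sensor readings.
class ListSpout implements Spout<Double> {
    private final Iterator<Double> it;

    ListSpout(List<Double> readings) {
        this.it = readings.iterator();
    }

    public Double nextTuple() {
        return it.hasNext() ? it.next() : null;
    }
}

// Notification bolt: records an alert for each reading above a threshold.
class AlertBolt implements Bolt<Double> {
    final List<String> alerts = new ArrayList<>();
    private final double threshold;

    AlertBolt(double threshold) {
        this.threshold = threshold;
    }

    public void execute(Double reading) {
        if (reading > threshold) {
            alerts.add("ALERT " + reading);
        }
    }
}

// Persistence bolt: an in-memory list stands in for HDFS or HBase (assumption).
class StoreBolt implements Bolt<Double> {
    final List<Double> stored = new ArrayList<>();

    public void execute(Double reading) {
        stored.add(reading);
    }
}

class StreamSketch {
    // Pulls tuples from the spout one at a time and hands each to every bolt.
    static <T> void run(Spout<T> spout, List<Bolt<T>> bolts) {
        T tuple;
        while ((tuple = spout.nextTuple()) != null) {
            for (Bolt<T> bolt : bolts) {
                bolt.execute(tuple);
            }
        }
    }

    public static void main(String[] args) {
        AlertBolt alerts = new AlertBolt(30.0);
        StoreBolt store = new StoreBolt();
        List<Bolt<Double>> bolts = new ArrayList<>();
        bolts.add(alerts);
        bolts.add(store);

        run(new ListSpout(Arrays.asList(21.5, 35.0, 28.1)), bolts);

        System.out.println(alerts.alerts);
        System.out.println(store.stored);
    }
}
```

Each reading is handled as it arrives rather than collected into a batch first, which is the contrast with batch processing drawn at the top of this page. A real Storm topology would distribute spouts and bolts across a cluster; this single-threaded loop only shows the data flow.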