June 03, 2016

The Evolution of the Modern Data Application

Businesses must adapt or die, and building compelling, data-aware applications is key in this evolving market. Jim Walker, Vice President of Marketing at EverString, led the Modern Data Track session selection for Hadoop Summit San Jose. His committee focused on submissions showcasing applications that derive business value and competitive advantage from large volumes of data. This set of sessions is ideal for both business and technical audiences, since it covers case studies and practical tips for exploring business data, visualization, and solutions.

The committee’s top three session recommendations are:

War on stealth cyberattacks that target unknown vulnerabilities

Speaker: George Vetticaden from Hortonworks

In large enterprise organizations, the Security Operations Center (SOC) team uses a number of different security tools, such as traditional Security Information and Event Management (SIEM) or endpoint security solutions. SOC teams are moving to next-generation defense techniques that raise alerts by processing streaming data in real time, so attacks can be stopped as fast as possible. The session will walk through a day in the life of an SOC analyst and show how automated evidence collection and response accelerate threat detection and response to real-world data breaches. Lessons learned from enterprises using Metron, as well as the experience of the Apache Metron community, will be shared.
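As a rough illustration of the kind of real-time, rule-based alerting described above (this is not Apache Metron's implementation; the event fields and thresholds are hypothetical), here is a minimal Python sketch that evaluates a stream of telemetry events against simple threshold rules:

```python
# Minimal sketch of threshold-based alerting over a stream of telemetry
# events. NOT Apache Metron's implementation; event fields and rule
# thresholds here are hypothetical, for illustration only.
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple

@dataclass
class TelemetryEvent:
    source_ip: str
    failed_logins: int   # count within the current window
    bytes_out: int       # outbound bytes within the current window

def alerts(events: Iterable[TelemetryEvent]) -> Iterator[Tuple[str, str]]:
    """Yield an (ip, reason) alert for every event that trips a rule."""
    for e in events:
        if e.failed_logins > 10:
            yield (e.source_ip, "possible brute-force login attempt")
        if e.bytes_out > 50_000_000:
            yield (e.source_ip, "unusually large outbound data volume")

# Example: feed a small batch of events through the rules.
stream = [
    TelemetryEvent("10.0.0.5", failed_logins=14, bytes_out=1_200),
    TelemetryEvent("10.0.0.9", failed_logins=1, bytes_out=90_000_000),
]
for ip, reason in alerts(stream):
    print(f"ALERT {ip}: {reason}")
```

In a real deployment the rules would run inside a streaming engine over windowed data rather than a Python loop, but the shape of the logic is the same: evaluate each event as it arrives and emit an alert as soon as a rule fires.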

Large Scale Health Telemetry and Analytics with MQTT, Hadoop and Machine Learning DSLs

Speakers: Murali Kaundinya and Gopi Janakiraman from Merck

Wearable devices, point-of-care devices, and health sensors and monitors generate a vast volume and variety of data that must be transported reliably for real-time health analytics. These devices range from resource- and bandwidth-constrained sensors and microcontrollers to smartphones and tablets, and they use HTTPS or other secure tunneling over wireless and IP networks to interact with large, Internet-scale health platforms. In this talk, we present an architecture, with several live demonstrations, of an integrated platform built with Node.js, MQTT, Apache Storm, Cascading with Pattern, and MongoDB. The analytics in the back end compute across context, location, longitudinal data, consent, EMR, prescription, and clinical data, along with other interconnected data streams and healthcare domain constraints. Continuous development and delivery of this large ecosystem demand agility with just-right tooling. We show architectural abstractions that hide complexity, and detail a frictionless development and deployment process with optimal tooling. We also share our experiences using tools like Xtext to develop and integrate pre-existing domain-specific languages (DSLs), specifically popular DSLs for Hadoop and machine learning. Through live demos, we show the types of real-time health analytics that are achievable.
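To make the device side of this architecture concrete, here is a minimal sketch of a constrained device publishing a health reading over MQTT, assuming the paho-mqtt 1.x client library; the broker host, topic, and message fields are hypothetical, not Merck's actual platform code:

```python
# Sketch of a device publishing health telemetry over MQTT.
# Assumes the paho-mqtt 1.x library (pip install "paho-mqtt<2");
# broker host, topic, and payload fields are hypothetical.
import json
import time
import paho.mqtt.client as mqtt

BROKER = "broker.example.com"        # hypothetical MQTT broker
TOPIC = "health/telemetry/device42"  # hypothetical topic

client = mqtt.Client()               # paho-mqtt 1.x constructor
client.connect(BROKER, 1883, keepalive=60)
client.loop_start()  # handle network traffic on a background thread

# Publish one reading; QoS 1 asks the broker to acknowledge delivery,
# a common choice when telemetry must arrive reliably.
reading = {"device_id": "device42", "heart_rate": 72, "ts": time.time()}
client.publish(TOPIC, json.dumps(reading), qos=1)

client.loop_stop()
client.disconnect()
```

On the platform side, a consumer (for example, an Apache Storm spout) would subscribe to the same topic and feed the readings into the real-time analytics pipeline.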

Lambda architecture: how we merged batch and real-time

Speakers: Sewook Wee and Sotos Matzanas from Trulia

When the Lambda Architecture was introduced in 2013, many of us got excited and attempted to implement it, but soon started scratching our heads over how to merge the batch view and the real-time view at the serving layer. It is even more challenging when merging the two views requires more than simply adding their values together. At Trulia, we built a real-time personalization platform that updates a user's browsing history and computes various preferences (e.g., location or price range) in real time. In our case, merging the batch layer and the speed layer was non-trivial. To keep a holistic view of the user, we merge browsing histories from multiple devices as soon as we capture the link between them through login or registration. Additionally, some of our preference-scoring algorithms didn't make sense to run solely on delta data; they needed the entire user history. Moreover, coordinating the transition from one epoch (batch + delta) to the next was complex. Finally, we wanted maintenance to be as simple as a retry when a failure occurs. In this talk, we will share how we architected our platform, and especially how we merged views and transitioned through epoch cycles.
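As a minimal sketch of the serving-layer merge the talk describes (the data shapes here are hypothetical; Trulia's actual platform is far more involved), a query can combine the batch view, recomputed each epoch, with the speed view's deltas captured since that epoch:

```python
# Minimal sketch of merging a batch view with a real-time (speed) view
# at the serving layer. Hypothetical data shapes, for illustration only.

# Batch view: full browsing history per user, recomputed each epoch.
batch_view = {
    "user_1": ["listing_a", "listing_b"],
}

# Speed view: deltas captured since the last batch epoch.
speed_view = {
    "user_1": ["listing_c"],
}

def serve_history(user_id):
    """Answer a query by combining the batch view with recent deltas."""
    merged = list(batch_view.get(user_id, []))
    for item in speed_view.get(user_id, []):
        if item not in merged:  # de-duplicate across the two views
            merged.append(item)
    return merged

print(serve_history("user_1"))  # ['listing_a', 'listing_b', 'listing_c']
```

The hard parts the talk covers start exactly where this sketch stops: merges that are not simple unions (e.g., preference scores that need the entire history), linking histories across devices, and atomically swapping in a new epoch's batch view while discarding the deltas it already absorbed.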

We hope to see you at these sessions; don't forget to register to attend Hadoop Summit San Jose.
