December 17, 2015

Hadoop Summit Dublin — Community Choice Winners Announced

Hadoop Summit – Dublin taking place 13-14 April 2016

Unlike other conferences, Hadoop Summit is driven for the community by the community, and this year’s speaker submissions have been open for public viewing. The top vote-getting sessions are automatically selected for the conference. The competition was strong, the content was amazing and with over 13,000 votes tallied, we are happy to announce that the results are in!


Before announcing the winners, we’d like to thank all of you who submitted abstracts, took part in tweeting, sharing, urging support and voting on sessions. This year’s event will be bigger and better than ever!


Also, now is the best time to register to see this content and much more. Super early bird registration kicks in with an all-access pass at €699. So register here:


Without further ado, the Community Choice Winners for Hadoop Summit 2016 are…


Apache Committer Insights

How To: A beginners guide to becoming a Apache Contributor

Venkatesh Sellappa, Teradata UK, Solution Architect

I am a new contributor to Apache NiFi and this talk takes a light-hearted look at my journey of how to become a contributor to an Apache project. It will outline the skills required, the steps to take for setting up a project, the correct etiquette in subscribing to a mailing list, the right way to ask a question, the way to engage with the community and the best practices for submitting a patch, documentation, etc.

Applications of Hadoop and the Data-Driven Business  

Crime Prediction using Hadoop framework

Romika Yadav, Research Scholar & Savita Kumari, Assistant Professor, Indira Gandhi University Meerpur Rewari India

Crime forecasting is the process of identifying how the crime rate changes from one year to the next and projecting those changes into the future. Crime is an offence against society, and it has been observed that criminals commit crimes at any place, at any time and in any form. Predicting such events can therefore save lives. One of the best-known crimes in the world is the attack on the World Trade Center on September 11, 2001. Crime affects not only individuals but the people of the whole country as well. Enforcement agencies and researchers therefore have a responsibility to analyze crime events from voluminous crime data sets. Crime analysis is crucial for police departments in predicting crime and its associated information, including types of crime, probable methods and locations. We propose a crime prediction model for crime detection, visualization, prevention and prediction that uses big data techniques to provide accurate visualization of data and to perform computation quickly using the MapReduce tool of the Hadoop framework.


Data Science Applications for Hadoop

Machine Learning in Big Data – Look Forward or Be Left Behind

Bill Porto, RedPoint Global Inc., Senior Engineering Analyst

Applying machine learning to Big Data is something many strive for and few achieve – yet. Creating models to predict customer response or to segment customer data into set categories are “predictable” use cases. Taking data, discovering what it can tell you, and creating a model and use for it sounds simple enough. It’s a start, but not enough to impact sustainable revenue or cost advantage for your enterprise.


This session will cover the mission critical questions related to model choice, viability horizon, practical design alternatives, learning from on-the-fence model factors, and opportunities for automating access to changing data and netting-out error and noise…


Hadoop and the Internet of Things

Hadoop Everywhere: Geo-Distributed Storage for Big Data

Nikhil Joshi, Consultant Product Manager & Priya Lakshminarayanan, Director Product Management, EMC

Traditionally, HDFS provides robust protection against disk failures, node failures and rack failures. The mechanisms to protect data against entire datacenter failures and outages leave much to be desired. Neither the storage substrate (HDFS), nor the applications on top (MapReduce, Hive, HBase) are capable of running across geographies/data-centers. With Hadoop’s increased enterprise adoption, there is greater need to protect business critical datasets in Hadoop clusters. This is motivated in large part by compliance, regulation, data protection and business continuity planning…

Hadoop Application Development: Dev Languages, Scripting, SQL and NoSQL

Cooperative data exploration with IPython notebook

Piotr Lusakowski, Senior Software Engineer

In this talk we’ll take a deep dive into how an IPython notebook can be connected to a running Spark application and how it can be used for data exploration and debugging. IPython notebook is an established tool in the Data Science community, and embedding the application’s Spark Context within it can speed up development and limit errors. One possible usage model is to share a set of precomputed RDDs cached in Spark’s cluster memory between multiple users. This approach reduces resource usage, since the precomputation happens only once and the cached data is not replicated for each user. Sharing SQL contexts allows instantaneous access to temporary results by other users…


Hadoop Governance, Security, Deployment and Operations

Advanced execution visualization of Spark jobs

Zoltán Zvara & Marton Balassi, Hungarian Academy of Sciences Researcher, Developer

Understanding the physical plan of a big data application is often crucial for tracking down bottlenecks and faulty behavior. Although Apache Spark offers a useful Web UI component for monitoring and understanding the logical plan of jobs, it lacks a tool that helps users understand the physical plan of the task scheduler and monitor execution at a very low level, along with the communication triggered by RDDs and remote block requests. We propose a tool that allows users to monitor job executions in real time, and later to replay and examine them, on any cluster currently supported by Spark….


The Future of Apache Hadoop  

Overview of Apache Flink: the 4G of Big Data Analytics Frameworks

Slim Baltagi, Capital One Financial Corporation, Director of Big Data Engineering & Fellow

This is an introductory-level talk about Apache Flink: a multi-purpose Big Data analytics framework leading a movement in open source towards the unification of batch and stream processing, or stream-first processing. With the many technical innovations it brings along with its unique vision and philosophy, it is considered the 4G (4th Generation) of Big Data Analytics frameworks, providing the only hybrid (Real-Time Streaming + Batch) open source distributed data processing engine supporting many use cases….

What’s next?

The winning sessions from the community vote (above) will be combined with a set of content that is being curated by the Hadoop experts and veterans on our content selection committees. Once the schedule has been created, we will post it on the website in early January! Hadoop Summit partner sponsorship opportunities are now open for business, and registration is open for attendees.


This is the eighth Hadoop Summit we’ve hosted and the content this year is very strong. No matter where you are in your journey with Apache Hadoop, whether just learning and exploring or in full production, there will be sessions you can learn from and take away practical, usable advice. You’ll also be able to interact with core engineers and architects from across the various Apache projects that make up an enterprise-grade Hadoop platform.


We hope to see you there!


Register for Super Early bird here


For the latest on all Hadoop Summit announcements, follow @hadoopsummit on Twitter and on Facebook.

