September 11, 2017

Strata Data Conference New York – Powering business outcomes with data science

The annual Strata Data Conference makes its next stop in New York the week of September 25-29. The core theme of the conference this year is driving business transformation through the power of data. In the world of data, few topics excite as much as data science, machine learning, and deep learning, and with good reason: bringing together the Hadoop big data world and the much older world of data science and machine learning is a perfect marriage.

The Big Data Journey

While Apache Hadoop has been around for a decade or more, adoption only really started taking off from 2011 onwards, once it was packaged into a platform.

[Figure: Big Data Innovation: The Big Data Journey]

Data at Rest: Hadoop brought us the concept of the Data Lake to manage the growing repositories of data at rest. But it remained a batch processing world until YARN became the Data Operating System of the Hadoop platform. YARN enabled multiple data engines or workloads to co-exist on the cluster, all accessing the same Data Lake rather than a copy. Now we had batch and SQL, and others quickly followed.

Data in Motion: The explosion in IoT devices and use cases required better ways to move data from the edge to our Data Lake, with full security, lineage and provenance. Apache NiFi came to the fore as that data transportation and logistics layer. But it wasn’t just about data movement. The emerging use cases for real-time analytics challenged the traditional concept of real-time, which was about how fast we could move the data from inception to our place of analytics. Think CDC (change data capture) and other approaches. The real answer came when we started pushing the analytics down to the edge, where the data is created: stream processing!

Connected Data Platforms: The next frontier was not just to manage Data at Rest and Data in Motion, but to do so on premises, in the cloud, on another cloud, and across all of them combined. Now I can run multiple workloads (batch, Hive, Spark) on all my data, at rest and in motion, and have the freedom to run them where I want.

Business Outcomes With Data Science At Scale

In a recent blog on Data Science on HDP, Vinay Shukla and Huzefa Hakim describe some of the benefits that data science disciplines gain when combined with big data. In my session on September 27th at 5:25pm (location 1E 17), I will talk about the next part of the journey: bringing the two worlds of Hadoop and Data Science together at scale to deliver business outcomes in a connected world.

Come Meet The Team From Hortonworks

Come visit the Hortonworks team at our booth in the exhibition hall (BOOTH #601, M148, M149) and get a chance to see our Connected Data Platforms in action. You will also get a chance to see the Hortonworks Data Science solution that is powered by HDP and IBM Data Science Experience.



Justin Germain says:

Excited to see everyone at the conference this year!

