
Enterprise Data Science at Scale

Data science holds tremendous potential for organizations to uncover new insights and new drivers of revenue and profitability. Big Data has brought enterprises the promise of doing data science at scale, but that promise comes with challenges: data scientists must continuously learn and collaborate. They have many tools at their disposal: notebooks such as Jupyter and Apache Zeppelin, IDEs such as RStudio, languages such as R, Python, and Scala, and frameworks such as Apache Spark. Given all these choices, how do you best collaborate to build your model and then work through the development lifecycle to deploy it from test into production?

Why Data Science on Big Data?

This session covers the attributes of a modern data science platform, one that empowers data scientists to build models using all the data in their data lake and fosters continuous learning and collaboration. We will show a demo of Apache Zeppelin, Apache Spark, Apache Livy, and Apache Hadoop, with a focus on integration, security, and model deployment and management.

Data Science at Scale Demo

The demo will cover the data science life cycle: developing a model in a team environment, training the model with all the data on a Hadoop cluster, and deploying the model into production. The model will be a Spark ML model.

Practical ML Topic: How to Build Recommendation Engines
How can online stores build personalized recommendations for their customers? Learn how to build a recommendation engine, using k-means clustering, and how you can deploy the machine learning model as an API.
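To give a flavor of the idea ahead of the session, here is a minimal sketch of a k-means-based recommender in plain Python. It is not the speaker's implementation. The purchase matrix, the item names, the fixed starting centroids, and the helper names (`kmeans`, `recommend`) are all illustrative assumptions. The sketch groups customers by purchase history, then suggests items that others in the same cluster bought and the customer has not.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point: Sequence[float], centroids: List[List[float]]) -> int:
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points: List[List[float]],
           init_centroids: List[List[float]],
           iterations: int = 10) -> List[int]:
    """Plain k-means with fixed starting centroids; returns a cluster label per point."""
    centroids = [list(c) for c in init_centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def recommend(user: int, points: List[List[float]], labels: List[int],
              item_names: List[str], top_n: int = 2) -> List[str]:
    """Suggest items bought by cluster peers that the user has not bought."""
    peers = [p for i, p in enumerate(points)
             if labels[i] == labels[user] and i != user]
    if not peers:
        return []
    scores = [sum(col) for col in zip(*peers)]  # purchases per item among peers
    candidates = [j for j in range(len(item_names))
                  if points[user][j] == 0 and scores[j] > 0]
    candidates.sort(key=lambda j: (-scores[j], j))  # most popular first
    return [item_names[j] for j in candidates[:top_n]]


# Hypothetical data: rows are customers, columns are items (1 = purchased).
items = ["laptop", "mouse", "keyboard", "novel", "cookbook"]
purchases = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
]

# Assumption: seed the two clusters with customers 0 and 3 so the run is deterministic.
labels = kmeans(purchases, [purchases[0], purchases[3]])

print(recommend(0, purchases, labels, items))  # ['keyboard']
print(recommend(4, purchases, labels, items))  # ['cookbook']
```

In practice a model like this would be trained on far more data and served behind an HTTP endpoint so a store front end can request recommendations. The deployment step is left out of this sketch.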


Networking and Pizza: 6:30-7:00 PM

Why Data Science on Big Data?

Data Science at Scale Demo

Practical ML Topic: How to Build Recommendation Engines



Polong Lin is a Data Scientist at IBM. He regularly teaches data science at conferences and meetups.

Thursday, November 9, 2017
IBM Centre for Solution Innovation, 120 Bloor Street East, Toronto, ON