FOD Meetup #3 : Data Science, Spark with RapidMiner and Serverless

Hello everyone,

The next Future Of Data Meetup will take place on June 8, starting at 6:30 pm, at D2SI. Thanks to D2SI for hosting us at their offices, and thanks to RapidMiner and Hortonworks for sponsoring the closing buffet.


The program includes three presentations in French, followed by networking drinks:

18h30 – 19h00 : Arrival and welcome

19h00 – 19h30 : What methodology to adopt for a Data Science project? (Amélie Groud – FastConnect)

19h30 – 20h00 : Apache Spark with RapidMiner demo (Jess Kilubu – RapidMiner and Dalila Messedi – Itecor)

20h00 – 20h30 : Ooso: MapReduce the Serverless way (Nicolas Monchy and Othmane Nahyl – D2SI)

20h30 – 21h00 : Networking drinks


What methodology to adopt for a Data Science project? – Amélie Groud – Data Scientist, FastConnect

The key to the success of a Data Science project is mastering the data source. It is important to know the structure of the data and the relationships between the different variables in order to build the best mathematical model. Many statistical tools are available for this, but you need the right methodology to avoid certain pitfalls.

Apache Spark with RapidMiner demo – Jess Kilubu – Inside Sales Rep, RapidMiner / Dalila Messedi – Senior Data Consultant, Itecor

Learn how SparkRM improves performance and increases productivity when working in Hadoop clusters, with parallel loops and the ability to bootstrap algorithms.

Ooso: MapReduce the Serverless way – Nicolas Monchy – Data engineering consultant, D2SI / Othmane Nahyl – Data engineering intern, D2SI

What if we could mix serverless and Big Data? Serverless gives us huge scalability and parallelization, which sounds like something we could use in Big Data. This is why Ooso was developed. It is a Java library that lets you do MapReduce in a serverless way, based on AWS Lambda and Amazon S3. All you need to implement is your map and reduce functions. What about performance and limitations? The talk will discuss both, accompanied by a demo.

Thursday, June 8, 2017
D2SI: 29 bis rue d’Astorg, 75008 Paris