Explore a Hadoop-based architecture that combines the streaming, ETL, and machine learning components of the stack to deliver real-time predictive analytics. We will walk through a real customer use case that processes and visualizes a real-time stream of events with Storm, trains a predictive model with Spark's machine learning libraries on YARN, and plugs the trained model back into Storm to make real-time predictions. Automating this pipeline yields a closed-loop, real-time predictive analytics solution that applies to several industries and use cases.
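To make the closed loop concrete, here is a minimal sketch in plain Python. It uses neither Storm nor Spark: a generator stands in for the event stream, a small hand-written logistic regression stands in for the Spark training job, and a scoring call stands in for the Storm prediction step. All function names, the synthetic events, and the retraining interval are assumptions made for illustration, not details of the customer pipeline.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_model(events, epochs=200, lr=0.5):
    """Batch step (stand-in for the Spark job): fit logistic regression with SGD."""
    n_features = len(events[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for features, label in events:
            pred = sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)
            err = pred - label
            weights = [w - lr * err * x for w, x in zip(weights, features)]
            bias -= lr * err
    return weights, bias


def predict(model, features):
    """Streaming step (stand-in for a Storm bolt): score a single event."""
    weights, bias = model
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def event_stream(n, seed=42):
    """Hypothetical synthetic events: one feature, label is 1 when it exceeds 0.5."""
    rng = random.Random(seed)
    for _ in range(n):
        x = rng.random()
        yield [x], 1 if x > 0.5 else 0


def run_closed_loop(n_events=400, retrain_every=100):
    """Score each event with the current model, store it, and retrain periodically."""
    model = None
    history = []
    predictions = []
    for features, label in event_stream(n_events):
        if model is not None:
            predictions.append((predict(model, features), label))
        history.append((features, label))
        if len(history) % retrain_every == 0:
            # Closing the loop: the retrained model replaces the one used for scoring.
            model = train_model(history)
    return model, predictions


if __name__ == "__main__":
    model, predictions = run_closed_loop()
    correct = sum((p > 0.5) == bool(y) for p, y in predictions)
    print(f"scored {len(predictions)} events, {correct} predicted correctly")
```

In the architecture described above, the history would live in the cluster rather than in memory, and the retraining trigger would be a scheduled job rather than an event count. The sketch only shows how the streaming and batch steps feed each other.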
As a Solutions Engineer at Hortonworks, Mac helps enterprises improve the scale, performance, and cost-effectiveness of their Big Data applications using various components from HDP, the industry-leading open source Hadoop distribution. Prior to joining Hortonworks, Mac served as a Solutions Architect with multiple vendors in the In-Memory/Big Data space and previously served as a Director of Information Systems in the higher-education sector. He has over 13 years of experience designing, developing, and integrating enterprise systems where performance and scalability are essential.
Shane’s relentless curiosity about distributed computing has led him to wear many hats throughout his career: Linux guru, manager of high-volume web properties, data specialist, and software developer. As a Solutions Engineer at Hortonworks, he helps customers get the most value out of the only 100% open source Hadoop distribution, the Hortonworks Data Platform. Through real-world experience, Shane has gained a strong foundation across many verticals, with experience ranging from Fortune 100 companies to startups. Shane has had the luxury of working with Hadoop full time since 2010, managing Hadoop clusters ranging from single-digit node counts to more than two hundred nodes, and participating in the evolution of the modern data architecture.