Intro to Alluxio and Spark

6:00 PM – 6:30 PM: drinks, mingling

6:30 PM – 8:30 PM: Intro to Alluxio and Spark

Brief Description:

Alluxio (formerly Tachyon) is a memory-speed virtual distributed storage system that uses memory to manage data across different storage systems. Many deployments pair Alluxio with Spark because Alluxio makes Spark more effective and further accelerates applications. We discuss how it does so and describe different types of production deployments, involving Mesos, cloud, and on-premises environments, where Alluxio and Spark work together.

Abstract:

Alluxio, formerly Tachyon, is a memory-speed virtual distributed storage system that uses memory to manage data and accelerate access to it across different storage systems. Alluxio has a quickly growing open source community of developers and users. Many deployments use Alluxio with Spark, and some of them scale out to petabytes of data.
While Spark is gaining great adoption in the big data ecosystem, Alluxio enables Spark to be even more effective. Alluxio provides a unified namespace over data from different storage systems, which is convenient for application developers. Alluxio also keeps hot data in memory so that applications can access important data quickly. Although Spark has its own in-memory cache, Alluxio's in-memory storage can further improve Spark applications.
In this talk, we introduce Alluxio, discuss how Alluxio helps Spark be more effective, show benchmark results with Spark RDDs and DataFrames, and describe production deployments, involving Mesos, cloud, and on-premises environments, where Alluxio and Spark work together. The demo will also include accessing storage platforms such as S3 and Hadoop (HDFS), with Spark jobs executed in Apache Zeppelin, powered by Alluxio.

Speaker:

Ancil McBarnett is the Sales Engineer Lead in the East for Alluxio, which produces a memory-speed virtual distributed file storage system, uniquely positioned to drive next-generation analytics. Prior to Alluxio, he was at Hortonworks as a Security and Hive SME, helping customers kickstart their Hadoop journey. Before that, he was the Architect Manager for a state agency responsible for sharing secure and sensitive data among first responder and justice systems, where security was a priority. Since joining Alluxio, he has worked mainly with financial services providers that are looking to use Alluxio as the platform to bridge compute frameworks, especially in containerized environments such as Mesos, with multiple storage systems such as S3, HDFS, and Ceph.

Thursday, September 21, 2017
1601 Market Street, 19th Floor (WeWork), Philadelphia, PA