HDP Developer: Enterprise Apache Spark I

Overview

This course is an entry point for developers who need to create applications that analyze Big Data stored in Apache Hadoop using Spark. Topics include an overview of the Hortonworks Data Platform (HDP), including HDFS and YARN; using Spark Core APIs for interactive data exploration; Spark SQL and DataFrame operations; Spark Streaming and DStream operations; data visualization, reporting, and collaboration; performance monitoring and tuning; building and deploying Spark applications; and an introduction to the Spark Machine Learning Library.

Prerequisites

Students should be familiar with programming principles and have previous experience in software development using either Python or Scala. Previous experience with data streaming, SQL, and HDP is also helpful, but not required.

Target Audience

Software engineers who want to develop in-memory applications for time-sensitive, highly iterative workloads in an enterprise HDP environment.


Day 1

An Introduction to Zeppelin and RDDs

Objectives

  • HDP Overview for Developers
  • Overview of Apache Zeppelin and Spark
  • Working with RDDs
  • Pair RDDs

Labs

  • Using HDFS Commands
  • Introduction to Spark REPLs and Zeppelin
  • Create and Manipulate RDDs
  • Create and Manipulate Pair RDDs

Spark Streaming

Working with Data Visualization

An Introduction to Machine Learning with Spark
