HDP Developer: Apache Pig and Hive

Overview

This 4-day training course is designed for developers who need to create applications that analyze Big Data stored in Apache Hadoop using Pig and Hive. Topics include Hadoop, YARN, HDFS, MapReduce, data ingestion, workflow definition, using Pig and Hive to perform data analytics on Big Data, and an introduction to Spark Core and Spark SQL.

Prerequisites

Students should be familiar with programming principles and have experience in software development. SQL knowledge is also helpful. No prior Hadoop knowledge is required.

Target Audience

Software developers who need to understand and develop applications for Hadoop.

Day 1

An Introduction to the Hadoop Distributed File System

Objectives

  • Understanding Hadoop
  • The Hadoop Distributed File System
  • Ingesting Data into HDFS
  • The MapReduce Framework

Labs

  • Starting an HDP Cluster
  • Demonstration: Understanding Block Storage
  • Using HDFS Commands
  • Importing RDBMS Data into HDFS
  • Exporting HDFS Data to an RDBMS
  • Importing Log Data into HDFS Using Flume
  • Demonstration: Understanding MapReduce
  • Running a MapReduce Job
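
As a rough illustration of the programming model behind "The MapReduce Framework" objective, the sketch below simulates a word count in plain Python, entirely in memory. It is not taken from the course materials and does not use Hadoop. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example. On a real cluster the framework distributes these phases across nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key, as the framework would."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Chain the three phases together (hypothetical helper for this sketch)."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["big data", "Big Hadoop"]
    print(word_count(sample))
```

The three functions are kept separate to mirror the map, shuffle, and reduce stages. In an actual job, the developer typically supplies only the map and reduce logic, and the framework handles the grouping step.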

An Introduction to Apache Pig

An Introduction to Apache Hive

Working with Spark Core, Spark SQL and Oozie
