HDP Developer: Windows

This course is designed for developers who create applications and analyze Big Data in Apache Hadoop on Windows using Pig and Hive. Topics include: Hadoop, YARN, the Hadoop Distributed File System (HDFS), MapReduce, Sqoop, and the Hive ODBC Driver.


4 days


Students should have programming experience, preferably with Visual Studio and SQL, as well as familiarity with the Windows Server operating system. No prior Hadoop knowledge is required.

Target Audience

Software developers who need to understand and develop applications for Hadoop 2.x on Windows.

Course Objectives

At the completion of the course students will be able to:

  • Describe Hadoop and YARN
  • Describe the Hadoop ecosystem 
  • List components and deployment options for HDP on Windows
  • Describe the HDFS architecture
  • Use the Hadoop client to input data into HDFS 
  • Transfer data between Hadoop and Microsoft SQL Server
  • Describe the MapReduce and YARN architecture
  • Run a MapReduce job on YARN 
  • Write a Pig script 
  • Define advanced Pig relations 
  • Use Pig to apply structure to unstructured Big Data 
  • Invoke a Pig User-Defined Function
  • Use Pig to organize and analyze Big Data 
  • Describe how Hive tables are defined and implemented 
  • Use Hive windowing functions 
  • Define and use Hive file formats 
  • Create Hive tables that use the ORC file format 
  • Use Hive to run SQL-like queries to perform data analysis
  • Use Hive to join datasets 
  • Create n-grams and context n-grams using Hive
  • Perform data analytics 
  • Use HCatalog with Pig and Hive 
  • Install and configure the Hive ODBC Driver for Windows
  • Import data from Hadoop into Microsoft Excel 
  • Define a workflow using Oozie 


Hortonworks offers a comprehensive certification program that identifies you as an expert in Apache Hadoop. Visit Certification for more information.

Hortonworks University
Hortonworks University is your expert source for Apache Hadoop training and certification. Public and private on-site courses are available for developers, administrators, data analysts and other IT professionals involved in implementing big data solutions. Classes combine presentation material with industry-leading hands-on labs that fully prepare students for real-world Hadoop scenarios.

Please contact us at trainingops@hortonworks.com with any questions about Apache Hadoop training courses or to discuss an on-site training course.


Upcoming Courses

See our Schedule