Developing Apache Hadoop Applications in Java

Hortonworks Certified Developer for Apache Hadoop

This four-day, hands-on training course takes a deep dive into developing Java MapReduce applications for Big Data stored in the Hadoop Distributed File System (HDFS). Students who attend this course will learn how to harness the power of Apache Hadoop™ and MapReduce to manipulate, analyze, and perform computations on their Big Data.


This course assumes students have experience developing Java applications and using a Java IDE. Labs are completed using the Eclipse IDE and Maven.

Target Audience

Experienced Java developers responsible for developing MapReduce applications and performing analysis of Big Data stored on Apache Hadoop.

Course Objectives

On completing the course, students will be able to:

  • Write a Java MapReduce application using Eclipse and Maven
  • Develop a Combiner to perform map-side aggregation
  • Customize input and output formats of a MapReduce job
  • Perform mathematical computations on your Big Data files
  • Use best practices to optimize MapReduce jobs
  • Create JUnit tests for a MapReduce job
  • Discover trends in your Big Data
  • Define an Oozie workflow
  • Access Apache HBase™ data from a Java MapReduce job
  • Write custom, user-defined functions for Apache Pig™ and Apache Hive™

Lab Content

Students will work through the following exercises using Eclipse, Maven and the Hortonworks Data Platform:

  • Configuring a Hadoop™ Development Environment
  • Word Count
  • Distributed Grep
  • Inverted Index
  • Using a Combiner
  • Computing an Average
  • Writing a Custom Partitioner
  • Using a TotalOrderPartitioner
  • Custom Sorting
  • Writing a Custom InputFormat
  • Customizing Output
  • Simple Moving Average
  • Using Data Compression
  • Defining a RawComparator
  • A Map-Side Join
  • Using a Bloom Filter
  • Unit Testing
  • Defining an Oozie Workflow
  • Term Frequency–Inverse Document Frequency (TF-IDF)
  • Accessing HBase from Java MapReduce
  • Writing a User-Defined Pig Function
  • Writing a User-Defined Hive Function


  • Price: $2795
  • Students who complete a paid reservation at least two weeks prior to the start of the course will enjoy a 10% discount
  • Note that discounts cannot be combined

Refer to the Developing Apache Hadoop Solutions for Java Developers data sheet for additional course details. Please contact us if you have any questions about Apache Hadoop training courses or would like to discuss a custom, on-site training course.


