Hadoop 2.0 Java Developer Certification

References for Certification Candidates

Intended Audience

This certification is intended for developers who design, develop and architect Hadoop-based solutions written in the Java programming language. Candidates for this exam are software developers and engineers who understand and can develop all aspects of a MapReduce application using the Hadoop Java API.

Exam Format

The Certified Apache Hadoop 2.x exam consists of 50 open response and multiple-choice questions. The exam is delivered in English.

Practice Exams

Certification candidates may take two practice exams at no charge. Register at the certification site.

Exam Scheduling

This exam is administered through Kryterion, Inc. The exam can be taken at an authorized testing center or via remote proctoring. For additional information and to register, please visit our certification site.

Time Limit

The time allotted for the exam is 90 minutes.

Passing Score

The passing score is 75%.

Retake Policy

If a candidate does not pass the exam on the first attempt, he or she may register and retake it as soon as the final score is delivered. After a second unsuccessful attempt, a candidate must wait 7 calendar days from the original appointment time before registering to retake the exam. Each subsequent retake requires a 10-day waiting period. Once a candidate passes the exam, no further attempts are permitted.

Exam Security

Hortonworks reserves the right to refuse to certify a candidate who violates exam security policies. Violations include copying or redistributing exam material, using any type of study material during the exam itself, attempting to photograph exam items, and taking an exam under a false identity.

Other Certifications

Hortonworks also offers certifications for Hadoop Administrators and for Hadoop Developers working in Hive and Pig.

Courses to Prepare

The following courses can help prepare a certification candidate for the Hadoop 2.0 Developer Certification. Course participation is encouraged but not required. Any student who attends one of these courses will receive a voucher covering the cost of one certification attempt:

Core Exam Topic Areas

Objective 1.1 – HDFS and MapReduce

  • Understand how the NameNode maintains the filesystem metadata
  • Understand how data is stored in HDFS
  • Understand the WebHDFS commands
  • Understand the hadoop fs command
  • Understand the relationship between NameNodes and DataNodes
  • Understand the relationship between NameNodes and namespaces in Hadoop 2.0
  • Understand how HDFS Federation works in Hadoop 2.0
  • Understand the various components of NameNode HA in Hadoop 2.0
  • Understand the architecture of MapReduce
  • Understand the various phases of a MapReduce job
  • Demonstrate how key/value pairs flow through a MapReduce job
  • Write Java Mapper and Reducer classes
  • Use the org.apache.hadoop.mapreduce.Job class to configure a MapReduce job
  • Use the TextInputFormat and TextOutputFormat classes
  • Write a custom InputFormat
  • Configure a Combiner
  • Define a custom Combiner
  • Define a custom Partitioner
  • Use the Distributed Cache
  • Use the CompositeInputFormat class
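The key/value flow objective above can be illustrated without a cluster. The sketch below is a plain-Java word count that mimics the three phases of a MapReduce job: map emits (word, 1) pairs, the shuffle groups values by sorted key, and reduce sums each group. The real Hadoop classes (Mapper, Reducer, org.apache.hadoop.mapreduce.Job) are deliberately omitted so the example compiles and runs standalone; method names here only mirror the phases, not the actual API.

```java
import java.util.*;
import java.util.stream.*;

public class KeyValueFlow {

    // Map phase: split each input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) pairs.add(Map.entry(word, 1));
        }
        return pairs;
    }

    // Shuffle/sort phase: group all values by key, keys in sorted order.
    static SortedMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        SortedMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values for each key.
    static Map<String, Integer> reduce(SortedMap<String, List<Integer>> grouped) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        grouped.forEach((k, vs) -> counts.put(k, vs.stream().mapToInt(Integer::intValue).sum()));
        return counts;
    }

    public static void main(String[] args) {
        List<String> input = List.of("the quick brown fox", "the lazy dog");
        List<Map.Entry<String, Integer>> mapped = input.stream()
                .flatMap(l -> map(l).stream()).collect(Collectors.toList());
        // "the" appears in both lines, so its reduced count is 2
        System.out.println(reduce(shuffle(mapped)).get("the"));
    }
}
```

In a real job, the same flow is wired up through a Mapper subclass, a Reducer subclass, and a Job object that configures input/output formats; a Combiner, if configured, runs the reduce logic early on each map task's local output.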

Objective 2.1 – YARN

  • Understand the architecture of YARN
  • Understand the components of the YARN ResourceManager
  • Demonstrate the relationship between NodeManagers and ApplicationMasters
  • Demonstrate the relationship between ResourceManagers and ApplicationMasters
  • Explain the relationship between Containers and ApplicationMasters
  • Explain how Container failure is handled for a YARN MapReduce job

Objective 3.1 – Pig and Hive

  • Differentiate between Pig data types, including the complex types bag, tuple and map
  • Define Pig relations
  • Write a User-Defined Pig Function
  • Invoke a Pig UDF
  • Explain how Hive tables are defined and implemented
  • Manage External vs. Hive-managed tables
  • Write a User-Defined Hive Function
  • Invoke a Hive UDF
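For the Hive UDF objectives, the essential piece is a public evaluate() method that Hive invokes by reflection once per row. The sketch below shows only that core; a real old-style UDF extends org.apache.hadoop.hive.ql.exec.UDF and typically uses Hadoop's Text type, both of which are omitted here (plain String instead) so the logic compiles without the hive-exec jar.

```java
// Core of a simple Hive UDF: capitalize the first letter of a string.
// In a real UDF this class would extend org.apache.hadoop.hive.ql.exec.UDF.
public class Capitalize {
    public String evaluate(String s) {
        if (s == null || s.isEmpty()) return s;  // Hive rows may carry NULLs
        return Character.toUpperCase(s.charAt(0)) + s.substring(1).toLowerCase();
    }
}
```

Once packaged in a jar, such a function is invoked from Hive with ADD JAR followed by CREATE TEMPORARY FUNCTION capitalize AS 'Capitalize', after which capitalize(col) can be used in queries like any built-in function.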

Objective 4.1 – Hadoop 2.0

  • Understand the relationship between NameNodes and DataNodes
  • Understand the relationship between NameNodes and namespaces in Hadoop 2.0
  • Explain how HDFS Federation works in Hadoop 2.0
  • Demonstrate understanding of the various components of NameNode HA in Hadoop 2.0

Objective 5.1 – HBase

  • Use the HBase API to add a row to, or delete a row from, an HBase table
  • Use the HBase API to retrieve data from an HBase table

Objective 6.1 – Workflow

  • Understand Oozie workflow actions
  • Deploy Oozie workflows
  • Use the org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl class to define a workflow
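What JobControl provides is dependency-ordered execution: each ControlledJob lists the jobs it depends on (via addDependingJob), and a job is only submitted once all of its dependencies have completed. The plain-Java sketch below reproduces just that scheduling behavior; the real classes in org.apache.hadoop.mapreduce.lib.jobcontrol are omitted so it runs standalone, and the method names are illustrative only.

```java
import java.util.*;

public class MiniJobControl {
    // job name -> names of the jobs it depends on
    private final Map<String, List<String>> deps = new LinkedHashMap<>();

    // Register a job with the names of the jobs it depends on
    // (analogous to ControlledJob.addDependingJob).
    public void addJob(String name, String... dependsOn) {
        deps.put(name, Arrays.asList(dependsOn));
    }

    // "Run" all jobs, submitting each only when its dependencies are done;
    // returns the submission order.
    public List<String> run() {
        List<String> order = new ArrayList<>();
        Set<String> done = new HashSet<>();
        while (done.size() < deps.size()) {
            boolean progressed = false;
            for (Map.Entry<String, List<String>> e : deps.entrySet()) {
                if (!done.contains(e.getKey()) && done.containsAll(e.getValue())) {
                    order.add(e.getKey());  // submit the job
                    done.add(e.getKey());
                    progressed = true;
                }
            }
            if (!progressed) throw new IllegalStateException("cyclic dependency");
        }
        return order;
    }
}
```

A three-stage workflow such as clean → aggregate → report would be registered with each stage naming the one before it; run() then submits the stages in that order, which is exactly the guarantee JobControl gives a chained MapReduce workflow.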