Apache Hadoop 2.0: Operations Management with the Hortonworks Data Platform

This four-day Apache Hadoop 2.0 training course is designed for administrators who deploy and manage Apache Hadoop 2.0 clusters. Through a combination of lecture and hands-on exercises, you will learn how to install, configure, maintain, and scale your Hadoop 2.0 environment. At the end of this course you will have a solid understanding of how Hadoop works with Big Data, and through the hands-on exercises you will have completed the full deployment lifecycle for a multi-node cluster.

After successfully completing this training course each student will receive one free voucher for the Hortonworks Certified Apache Hadoop Administrator Exam.

Duration

The Hadoop Administration course spans four days and provides a solid foundation for managing your Hadoop clusters. A full outline is below.

Objectives

After completing this course, students should be able to:

  • Describe various tools and frameworks in the Hadoop 2.0 ecosystem
  • Describe the Hadoop Distributed File System (HDFS) architecture
  • Install and configure an HDP 2.0 cluster
  • Use Apache Ambari to monitor and manage a cluster
  • Write and store files in HDFS (see the sketch following this list)
  • Perform a file system check
  • Configure a file replication factor
  • Mount HDFS on a local file system using the NFS Gateway
  • Deploy and configure YARN on a cluster
  • Configure MapReduce
  • Troubleshoot a MapReduce job
  • Schedule YARN jobs
  • Configure the capacity and fair schedulers of the ResourceManager
  • Move data between HDP clusters
  • Access a cluster over HTTP
  • Configure a Hive server
  • Use Sqoop to transfer data between Hadoop and relational databases
  • Use Flume to ingest streaming data into HDFS
  • Deploy and run an Oozie workflow
  • Commission and decommission worker nodes
  • Use the HDFS snapshot feature
  • Configure a cluster to be rack-aware
  • Implement and configure NameNode HA
  • Secure a Hadoop cluster
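
To give a sense of the hands-on exercises, the sketch below illustrates two of the objectives above (writing a file into HDFS and configuring its replication factor) using the Hadoop FileSystem Java API. It is a minimal example rather than course material: the class name, file path, and replication value are assumptions for illustration, and it presumes a client node with the Hadoop client libraries and the cluster configuration files (core-site.xml, hdfs-site.xml) on the classpath.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Minimal sketch: write a small file into HDFS, then override its replication factor.
    public class HdfsWriteSketch {
        public static void main(String[] args) throws Exception {
            // Reads fs.defaultFS and dfs.replication from the cluster configuration on the classpath.
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // Hypothetical path, used only for illustration.
            Path file = new Path("/user/student/hello.txt");
            try (FSDataOutputStream out = fs.create(file)) {
                out.writeUTF("Hello, HDFS!");
            }

            // Set the replication factor for this one file; the cluster-wide default
            // comes from dfs.replication in hdfs-site.xml.
            fs.setReplication(file, (short) 2);

            fs.close();
        }
    }

The same two operations are also available from the command line with hdfs dfs -put and hdfs dfs -setrep, which require no Java code.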

Prerequisites

This course uses a Linux environment. Attendees should know how to navigate and modify files in Linux. Existing knowledge of Hadoop is not required.

Audience

This course is designed for IT administrators and operators responsible for installing, configuring and supporting an Apache Hadoop 2.0 deployment in a Linux environment.

Course Outline

Day 1:

  • Introduction to HDP and Hadoop 2.0
  • HDFS Architecture
  • Installation Prerequisites and Planning
  • Configuring Hadoop
  • Ensuring Data Integrity

Day 2:

  • HDFS NFS Gateway
  • YARN Architecture and MapReduce
  • Job Schedulers
  • Enterprise Data Movement
  • HDFS Web Services

Day 3:

  • Hive Administration
  • Transferring data with Sqoop
  • Moving Log Data with Flume
  • Workflow Management: Oozie
  • Monitoring HDP 2.0 Services

Day 4:

  • Commissioning and Decommissioning Nodes
  • Backup & Recovery
  • Rack Awareness and Topology
  • NameNode High-Availability (HA) 
  • Securing HDP
  • Tuning & Benchmarking

Pricing

  • $2795

Please contact us if you have any questions about Operations Management with the Hortonworks Data Platform or would like to discuss a custom, on-site training course.
