HDP Operations: Install and Manage with Apache Ambari

Overview

This course is designed for administrators who will be managing the Hortonworks Data Platform (HDP). It covers installation, configuration, maintenance, security and performance topics.

Duration

4 days

Prerequisites

Attendees should be familiar with Hadoop and Linux environments.

Target Audience

IT administrators and operators responsible for installing, configuring and supporting an Apache Hadoop 2.0 deployment in a Linux environment.

Format

50% Instructor-led lecture/discussion, 50% hands-on labs.

Course Objectives

After completing this course, students should be able to:

  • Describe various tools and frameworks in the Hadoop 2.0 ecosystem
  • Describe the Hadoop Distributed File System (HDFS) architecture
  • Install and configure an HDP 2.0 cluster
  • Use Ambari to monitor and manage a cluster
  • Describe how files are written to and stored in HDFS
  • Perform a file system check using command line and browser-based tools
  • Configure the replication factor of a file (see the sketch after this list)
  • Mount HDFS to a local filesystem using the NFS Gateway
  • Deploy and configure YARN on a cluster
  • Configure and troubleshoot MapReduce jobs
  • Describe how YARN jobs are scheduled 
  • Configure the capacity and fair schedulers of the ResourceManager
  • Use WebHDFS to access a cluster over HTTP
  • Configure a Hive server
  • Describe how Hive tables are created and populated
  • Use Sqoop to transfer data between Hadoop and a relational database
  • Use Flume to ingest streaming data into HDFS
  • Deploy and run an Oozie workflow
  • Commission and decommission worker nodes
  • Configure a cluster to be rack-aware
  • Implement and configure NameNode HA
  • Secure a Hadoop cluster
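
As a taste of the replication-factor objective above, the following is a minimal sketch that changes a file's replication factor through the Hadoop FileSystem Java API. It assumes a reachable NameNode; the address hdfs://mycluster:8020 and the path /user/student/data.txt are placeholders, not values from this course.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SetReplication {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Point at the cluster's NameNode; hdfs://mycluster:8020 is a placeholder.
            conf.set("fs.defaultFS", "hdfs://mycluster:8020");
            FileSystem fs = FileSystem.get(conf);
            // Set the replication factor of an existing file to 2,
            // equivalent to "hdfs dfs -setrep 2 /user/student/data.txt".
            fs.setReplication(new Path("/user/student/data.txt"), (short) 2);
            fs.close();
        }
    }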

Lab Content

Students will work through the following lab exercises using the Hortonworks Data Platform:

  • Install HDP 2.X using Ambari
  • Add a new node to the cluster
  • Stop and start HDP services
  • Use HDFS commands
  • Verify data with block scanner and fsck
  • Mount HDFS to a local file system
  • Troubleshoot a MapReduce job
  • Configure the capacity scheduler
  • Use distcp to copy data from a remote cluster
  • Use WebHDFS (see the example after this list)
  • Use Hive tables
  • Use Sqoop to transfer data
  • Install and test Flume
  • Run an Oozie workflow
  • Commission and decommission a worker node
  • Use HDFS snapshots
  • Configure rack awareness
  • Implement NameNode HA
  • Secure an HDP cluster
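
The WebHDFS lab above exercises the NameNode's REST endpoint over HTTP. As a minimal sketch, the Java snippet below lists a directory via WebHDFS; the host name namenode-host, the port 50070 (the HDP 2.x default NameNode HTTP port), and the path /user/student are assumptions, not lab values.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class WebHdfsListStatus {
        public static void main(String[] args) throws Exception {
            // LISTSTATUS returns the directory listing as JSON over HTTP.
            URL url = new URL(
                "http://namenode-host:50070/webhdfs/v1/user/student?op=LISTSTATUS");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line); // JSON FileStatuses response
                }
            }
            conn.disconnect();
        }
    }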
