HDP Operations: Install and Manage with Apache Ambari

Overview

This course is designed for administrators who will be managing the Hortonworks Data Platform (HDP) 2.2. It covers installation, configuration, maintenance, security and performance topics.

Duration

4 days

Prerequisites

Attendees should be familiar with Hadoop and Linux environments.

Target Audience

IT administrators and operators responsible for installing, configuring and supporting an HDP 2.2 deployment in a Linux environment.

Format

50% Lecture/Discussion
50% Hands-on Labs

Course Objectives

After completing this course, students should be able to:

  • Describe various tools and frameworks in the Hadoop 2.x ecosystem
  • Understand support for various types of cluster deployments
  • Understand storage, network, processing, and memory needs for a Hadoop cluster
  • Understand provisioning and post-deployment requirements
  • Describe Ambari Stacks, Views, and Blueprints
  • Install and configure an HDP 2.2 cluster using Ambari
  • Understand the Hadoop Distributed File System (HDFS)
  • Describe how files are written to and stored in HDFS
  • Explain Heterogeneous Storage support for HDFS
  • Use HDFS commands
  • Perform a file system check using command line
  • Mount HDFS to a local file system using the NFS Gateway
  • Understand and configure YARN on a cluster
  • Configure and troubleshoot MapReduce jobs
  • Understand how to utilize Capacity Scheduler
  • Utilize cgroups and node labels
  • Understand how Slider, Kafka, Storm and Spark run on YARN
  • Use WebHDFS to access HDFS over HTTP
  • Understand how to optimize and configure Hive
  • Use Sqoop to transfer data between Hadoop and a relational database
  • Use Flume to ingest streaming data into HDFS
  • Understand how to use Oozie and Falcon
  • Commission and decommission worker nodes
  • Configure a cluster to be rack-aware
  • Understand NameNode HA and ResourceManager HA
  • Secure a Hadoop cluster

Hands-On Labs

Students will work through the following lab exercises using the Hortonworks Data Platform:

  • Install an HDP 2.2 cluster using Ambari
  • Add new hosts to the cluster
  • Manage HDP services
  • Use HDFS commands
  • Verify data with Block Scanner and fsck
  • Troubleshoot a MapReduce job
  • Configure the Capacity Scheduler
  • Use WebHDFS
  • Use Sqoop
  • Install and test Flume
  • Mount HDFS to a local file system
  • Use distcp to copy data from a remote cluster
  • Mirror datasets using Falcon
  • Commission and decommission services
  • Use HDFS snapshots
  • Configure rack awareness
  • Configure NameNode HA using Ambari
  • Set up the Knox Gateway
  • Secure an HDP cluster

Additional Information

Upcoming Courses

See our Schedule