HDP Operations: Install and Manage with Apache Ambari


This course is designed for administrators who will be managing the Hortonworks Data Platform (HDP). It covers installation, configuration, maintenance, security, and performance topics.


Duration

4 days


Prerequisites

Attendees should be familiar with Hadoop and Linux environments.

Target Audience

IT administrators and operators responsible for installing, configuring and supporting an Apache Hadoop 2.0 deployment in a Linux environment.


Format

50% instructor-led lecture/discussion, 50% hands-on labs.

Course Objectives

After completing this course, students should be able to:

  • Describe various tools and frameworks in the Hadoop 2.0 ecosystem
  • Describe the Hadoop Distributed File System (HDFS) architecture
  • Install and configure an HDP 2.0 cluster
  • Use Ambari to monitor and manage a cluster
  • Describe how files are written to and stored in HDFS
  • Perform a file system check using command line and browser-based tools
  • Configure the replication factor of a file (see the sketch after this list)
  • Mount HDFS to a local filesystem using the NFS Gateway
  • Deploy and configure YARN on a cluster
  • Configure and troubleshoot MapReduce jobs
  • Describe how YARN jobs are scheduled
  • Configure the capacity and fair schedulers of the ResourceManager
  • Use WebHDFS to access a cluster over HTTP
  • Configure a Hive server
  • Describe how Hive tables are created and populated
  • Use Sqoop to transfer data between Hadoop and a relational database
  • Use Flume to ingest streaming data into HDFS
  • Deploy and run an Oozie workflow
  • Commission and decommission worker nodes
  • Configure a cluster to be rack-aware
  • Implement and configure NameNode HA
  • Secure a Hadoop cluster
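
As one illustration of the replication-factor objective above, the following is a minimal sketch using Hadoop's Java FileSystem API. The file path and target factor are invented for the example; the same change can be made from the shell with hdfs dfs -setrep.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SetReplication {
        public static void main(String[] args) throws Exception {
            // Reads core-site.xml/hdfs-site.xml from the classpath;
            // fs.defaultFS must point at the cluster's NameNode.
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // Ask HDFS to re-replicate an existing file at factor 3.
            // The path is a placeholder for this sketch.
            Path file = new Path("/user/hdfs/data/sample.txt");
            boolean scheduled = fs.setReplication(file, (short) 3);
            System.out.println("Re-replication scheduled: " + scheduled);

            fs.close();
        }
    }

The command-line equivalent is hdfs dfs -setrep 3 /user/hdfs/data/sample.txt; either way, the NameNode schedules block replicas to be added or removed in the background.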

Lab Content

Students will work through the following lab exercises using the Hortonworks Data Platform:

  • Install HDP 2.X using Ambari
  • Add a new node to the cluster
  • Stop and start HDP services
  • Use HDFS commands
  • Verify data with block scanner and fsck
  • Mount HDFS to a local file system
  • Troubleshoot a MapReduce job
  • Configure the capacity scheduler
  • Use distcp to copy data from a remote cluster
  • Use WebHDFS (see the sketch after this list)
  • Use Hive tables
  • Use Sqoop to transfer data
  • Install and test Flume
  • Run an Oozie workflow
  • Commission and decommission a worker node
  • Use HDFS snapshots
  • Configure rack awareness
  • Implement NameNode HA
  • Secure an HDP cluster
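
For the WebHDFS lab noted above, the sketch below issues a raw REST call from Java. The host name and user are placeholders, and 50070 is the default NameNode HTTP port in HDP 2.x; an equivalent quick test is curl 'http://<namenode>:50070/webhdfs/v1/user/hdfs?op=LISTSTATUS'.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class WebHdfsListStatus {
        public static void main(String[] args) throws Exception {
            // LISTSTATUS against a hypothetical NameNode; user.name is
            // honored when the cluster runs with simple authentication.
            URL url = new URL("http://namenode.example.com:50070"
                    + "/webhdfs/v1/user/hdfs?op=LISTSTATUS&user.name=hdfs");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");

            // The response body is a JSON FileStatuses document.
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line);
                }
            }
            conn.disconnect();
        }
    }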

Upcoming Courses

See our Schedule