HDP Administration Fast Track

Overview

This 5-day training course is designed primarily for systems administrators and platform architects who need to understand the capabilities of HDP clusters and manage them. Topics include HDP capabilities, Apache Hadoop, Apache YARN, HDFS, and other Hadoop ecosystem components. Students will learn how to administer, manage, and monitor HDP clusters.
 

Target Audience

This course is for students ranging from those with an understanding of server software concepts to system administrators and platform architects who plan to administer HDP clusters.
 

Prerequisites

Students should be familiar with server or platform software concepts and have a basic understanding of system administration.
 

Course Outline

Day 1: An Introduction to Apache Hadoop and HDFS

  • Big Data, Hadoop and the Hortonworks Data Platform
  • Installing the Hortonworks Data Platform
  • Using HDFS Storage
  • Managing Apache Ambari Users and Groups
  • Managing Hadoop Services
Day 1 Labs:
  • Setting Up the Lab Environment
  • Installing HDP
  • Managing Apache Ambari Users and Groups
  • Managing Hadoop Services
Day 2: Working with HDFS
  • Using HDFS Storage
  • Managing HDFS Storage
  • Adding, Deleting, and Replacing Worker Nodes
  • Configuring Rack Awareness
Day 2 Labs:
  • Using Hadoop Storage
  • Using WebHDFS
  • Using HDFS Access Control Lists
  • Managing Hadoop Storage
  • Managing HDFS Quotas
  • Adding, Decommissioning, and Recommissioning Worker Nodes
  • Configuring Rack Awareness
Day 3: Working with Apache YARN
  • YARN Resource Management
  • YARN Applications
  • YARN Capacity Scheduler
Day 3 Labs:
  • Managing YARN Using Ambari
  • Managing YARN Using CLI
  • Running Sample YARN Applications
  • Setting Up for Capacity Scheduler
  • Managing YARN Containers and Queues
  • Managing YARN ACLs and User Limits
  • YARN Node Labels
Day 4: High Availability, Backups and Configuring Centralized Cache
  • HDFS and YARN High Availability
  • Monitoring a Cluster
  • Protecting a Cluster with Backups
  • Configuring Heterogeneous HDFS Storage
  • Managing the HDFS NFS Gateway
  • Configuring HDFS Centralized Cache
Day 4 Labs:
  • Configuring NameNode High Availability
  • Configuring ResourceManager High Availability
  • Managing Apache Ambari Alerts
  • Managing HDFS Snapshots
  • Using DistCP
  • Configuring HDFS Storage Policies
  • Configuring an NFS Gateway
  • Configuring HDFS Centralized Cache
Day 5: Performing a Rolling Upgrade
  • Apache Hive Tuning
  • Managing Workflows Using Apache Oozie
  • Integrating Ambari with LDAP
  • Automating Cluster Provisioning Using Ambari Blueprints
  • Performing an HDP Rolling Upgrade
Day 5 Labs:
  • Configuring Apache Hive High Availability
  • Managing Workflows Using Apache Oozie
  • Integrating Apache Ambari with AD/LDAP
  • Automating Cluster Provisioning Using Apache Ambari
  • Performing an HDP Upgrade
