INSTRUCTOR-LED COURSE

HDP Overview: Apache Hadoop Essentials

Course Information

Duration: 1 day

Version: HW HDP-123

Price: $700.00

ALL DATES GUARANTEED

Check out our full list of training locations and learning formats. Please note that the location you choose may be an Established HD-ILT location with a virtual live instructor.

COURSE DELIVERY OPTIONS

  • Live Classroom

Train face-to-face with the live instructor.

  • Established HD-ILT Location

Interact with a live, remote instructor from a specialized, HD-equipped classroom near you.

  • Virtual Remote

Attend the live class from the comfort of your home or office.

OVERVIEW

This course provides a technical overview of Apache Hadoop. It includes high-level information about the concepts, architecture, operation, and uses of the Hortonworks Data Platform (HDP) and the Hadoop ecosystem. The course serves as an optional primer for those who plan to attend a hands-on, instructor-led course.

Prerequisites:

No previous Hadoop or programming knowledge is required. Students need Internet access through a web browser.

Target Audience:

Data architects, data integration architects, managers, C-level executives, decision makers, technical infrastructure teams, and Hadoop administrators or developers who want to understand the fundamentals of Big Data and the Hadoop ecosystem.

Course Objectives:

  • Describe the use case for Hadoop
  • Identify Hadoop ecosystem architectural categories
    • Data Management
    • Data Access
    • Data Governance and Integration
    • Security
    • Operations
  • Detail the HDFS architecture
  • Describe data ingestion options and frameworks for batch and real-time streaming
  • Explain the fundamentals of parallel processing
  • See popular data transformation and processing engines in action
    • Apache Hive
    • Apache Pig
    • Apache Spark
  • Detail the architecture and features of YARN
  • Describe how to secure Hadoop

Course Outline:

Day 1: HDP Overview: Apache Hadoop Essentials


OBJECTIVES

  • The Case for Hadoop
  • The Hadoop Ecosystem
  • HDFS Architecture
  • Ingesting Data
  • Parallel Processing
  • Apache Hive Overview
  • Apache Pig Overview
  • Apache Spark Overview
  • YARN Architecture
  • Hadoop Security

DEMONSTRATIONS

  • Operational Overview with Ambari
  • Loading Data into HDFS
  • Streaming Data into HDFS
  • Processing with MapReduce
  • Data Manipulation with Hive
  • Risk Analysis with Pig
  • Risk Analysis with Spark
  • Securing Hive with Ranger