Big Data Administration Using Cloudera

Categories: Big Data
Overview:

This Big Data Administrator training course is based on the Cloudera distribution. Participants learn to install, configure, maintain, and monitor the frameworks bundled with the Cloudera distribution, including HDFS, YARN, Sqoop, Flume, Pig, Hive, Spark, Kafka, and Impala.

Course Structure

The program focuses on Cloudera Hadoop cluster administration using the 5.x distribution. The points below provide a high-level overview of the course:

  • Introduction to Cloudera Hadoop administration using Cloudera Manager
  • Understand how a Cloudera production deployment can be set up
  • Install, configure, manage, secure, test, and troubleshoot a Cloudera Hadoop cluster
  • Manage and secure a production-grade Cloudera Hadoop cluster using Kerberos and Sentry

The intended audience for this course:

  • Big Data Administrators
  • DevOps Engineers
  • Big Data Architects

Introduction to the Hadoop and Spark Ecosystem
  • Big Data Overview
  • Key Roles in Big Data Project
  • Key Business Use Cases
  • Hadoop and Spark Logical Architecture
  • Typical Big Data Project Pipeline
Introduction
    • Roles in Big Data Project
    • Types of Administrators
    • Responsibilities of Administrator
    • Why Hadoop and Spark?
    • Core Hadoop Components
    • Fundamental Concepts
    • Logical Architecture of Hadoop and Spark
    • Use Cases
HDFS
    • Introduction
    • HDFS Physical Architecture
    • Why HDFS?
    • Limitations
    • Using the Namenode Web UI
    • Using the Hadoop Commands on Shell
Data Acquisition
    • Ingesting Data from Relational Databases with Sqoop
    • Flume
Kafka
    • Overview
    • Ecosystem
    • Connect API
    • Integrating with HDFS
    • REST Interfaces
YARN and MapReduce
    • What Is MapReduce?
    • Basic MapReduce Concepts
    • YARN Overview
    • YARN Cluster Architecture
    • YARN Concepts
    • Resource Allocation
    • Failure Recovery
    • Using the YARN Web UI
Cloudera Manager
    • Why Cloudera Manager?
    • Cloudera Manager Features
    • Cloudera Manager Installation
    • Installing CDH Using Cloudera Manager
    • Performing Basic Administration Tasks Using Cloudera Manager
Capacity Planning
    • Things to Consider
    • Choosing the Right Hardware
    • Configuring Nodes
    • Planning for Cluster Management
Hadoop Installation and Initial Configuration
    • Deployment Types
    • Installing Hadoop
    • Specifying the Hadoop Configuration
    • Performing Initial HDFS Configuration
    • Performing Initial YARN and MapReduce Configuration
    • Hadoop Logging
Installing and Configuring Hive, Impala, and Pig
    • Hive
    • Impala
    • Pig
Hadoop Clients
    • What is a Hadoop Client?
    • Installing and Configuring Hadoop Clients
    • Installing and Configuring Hue
    • Hue Authentication and Authorization
Advanced Cluster Configuration
    • Advanced Configuration Parameters
    • Configuring Hadoop Ports
    • Explicitly Including and Excluding Hosts
    • Configuring HDFS for Rack Awareness
    • Configuring HDFS High Availability
Managing and Scheduling Jobs
    • Managing Running Jobs
    • Scheduling Hadoop Jobs
    • Configuring the FairScheduler
    • Impala Query Scheduling
Cluster Maintenance
    • Checking HDFS Status
    • Copying Data Between Clusters
    • Adding and Removing Cluster Nodes
    • Rebalancing the Cluster
    • Cluster Upgrading
Cluster Monitoring and Troubleshooting
    • General System Monitoring
    • Monitoring Hadoop Clusters
    • Troubleshooting Hadoop Clusters
    • Common Misconfigurations
Hadoop Security Overview
    • Basics of Security
    • Hadoop’s Security System Concepts
    • What Is Kerberos?
    • Securing a Hadoop Cluster with Kerberos
    • How Does Kerberos Work?
    • Installation
    • Configuration
    • Sentry Overview
    • Hands-on

Participants should preferably have prior software development experience, along with basic knowledge of SQL and Unix commands. Knowledge of Python or Scala is a plus.

Course Information

Duration

4 Days / 5 Days

Mode of Delivery

Instructor-led / Virtual

Level

Intermediate
