Administrator Training for Apache Hadoop Training Course

Audience:

The course is intended for IT specialists looking for a solution to store and process large data sets in a distributed system environment.

Goal:

Gain in-depth knowledge of Hadoop cluster administration.

This course is available as onsite live training in Slovakia or online live training.

Course Outline

1: HDFS (17%)

Describe the function of HDFS daemons
Describe the normal operation of an Apache Hadoop cluster, both in data storage and in data processing.
Identify current features of computing systems that motivate a system like Apache Hadoop.
Classify major goals of HDFS Design
Given a scenario, identify an appropriate use case for HDFS Federation
Identify the components and daemons of an HDFS HA-Quorum cluster
Analyze the role of HDFS security (Kerberos)
Determine the best data serialization choice for a given scenario
Describe file read and write paths
Identify the commands to manipulate files in the Hadoop File System Shell

2: YARN and MapReduce version 2 (MRv2) (17%)

Understand how upgrading a cluster from Hadoop 1 to Hadoop 2 affects cluster settings
Understand how to deploy MapReduce v2 (MRv2 / YARN), including all YARN daemons
Understand basic design strategy for MapReduce v2 (MRv2)
Determine how YARN handles resource allocations
Identify the workflow of a MapReduce job running on YARN
Determine which files you must change and how in order to migrate a cluster from MapReduce version 1 (MRv1) to MapReduce version 2 (MRv2) running on YARN.

3: Hadoop Cluster Planning (16%)

Identify the principal points to consider when choosing the hardware and operating systems to host an Apache Hadoop cluster
Analyze the choices in selecting an OS
Understand kernel tuning and disk swapping
Given a scenario and workload pattern, identify a hardware configuration appropriate to the scenario
Given a scenario, determine the ecosystem components your cluster needs to run in order to fulfill the SLA
Cluster sizing: given a scenario and frequency of execution, identify the specifics for the workload, including CPU, memory, storage, disk I/O
Disk Sizing and Configuration, including JBOD versus RAID, SANs, virtualization, and disk sizing requirements in a cluster
Network Topologies: understand network usage in Hadoop (for both HDFS and MapReduce) and propose or identify key network design components for a given scenario

4: Hadoop Cluster Installation and Administration (25%)

Given a scenario, identify how the cluster will handle disk and machine failures
Analyze a logging configuration and logging configuration file format
Understand the basics of Hadoop metrics and cluster health monitoring
Identify the function and purpose of available tools for cluster monitoring
Be able to install all the ecosystem components in CDH 5, including (but not limited to): Impala, Flume, Oozie, Hue, Cloudera Manager, Sqoop, Hive, and Pig
Identify the function and purpose of available tools for managing the Apache Hadoop file system

5: Resource Management (10%)

Understand the overall design goals of each of the Hadoop schedulers
Given a scenario, determine how the FIFO Scheduler allocates cluster resources
Given a scenario, determine how the Fair Scheduler allocates cluster resources under YARN
Given a scenario, determine how the Capacity Scheduler allocates cluster resources

6: Monitoring and Logging (15%)

Understand the functions and features of Hadoop’s metric collection abilities
Analyze the NameNode and JobTracker Web UIs
Understand how to monitor cluster daemons
Identify and monitor CPU usage on master nodes
Describe how to monitor swap and memory allocation on all nodes
Identify how to view and manage Hadoop’s log files
Interpret a log file

Requirements

Basic Linux administration skills
Basic programming skills

35 Hours

Open Training Courses require 5+ participants.

Dates are subject to availability and take place between 09:30 and 16:30.

Testimonials (3)

I genuinely enjoyed the many hands-on sessions.

Jacek Pieczatka

Course - Administrator Training for Apache Hadoop

I genuinely enjoyed the trainer's strong competence.

Grzegorz Gorski

Course - Administrator Training for Apache Hadoop

I mostly liked the trainer giving real-life examples.

Simon Hahn

Course - Administrator Training for Apache Hadoop

Upcoming Courses

Administrator Training for Apache Hadoop

2025-09-22 09:30

35 hours

Bratislava CBC

1940 EUR (Online)

2940 EUR (Classroom)

Administrator Training for Apache Hadoop

2025-10-06 09:30

35 hours

Bratislava CBC

1940 EUR (Online)

2940 EUR (Classroom)

Administrator Training for Apache Hadoop

2025-10-20 09:30

35 hours

Bratislava CBC

1940 EUR (Online)

2940 EUR (Classroom)

Administrator Training for Apache Hadoop

2025-11-03 09:30

35 hours

Bratislava CBC

1940 EUR (Online)

2940 EUR (Classroom)

Related Courses

Programming with Big Data in R

21 Hours

Big Data is a term that refers to solutions destined for storing and processing large data sets. Developed by Google initially, these Big Data solutions have evolved and inspired other similar projects, many of which are available as open-source. R is a popular programming language in the financial industry.

R Fundamentals

21 Hours

R is a free, open-source programming language for statistical computing, data analysis, and graphics. R is used by a growing number of managers and data analysts inside corporations and academia. R has also found followers among statisticians, engineers, and scientists without computer programming skills who find it easy to use. Its popularity is due to the increasing use of data mining for goals such as setting ad prices, finding new drugs more quickly, or fine-tuning financial models. R has a wide variety of packages for data mining.

Data Mining with R

14 Hours

Econometrics: Eviews and Risk Simulator

21 Hours

This instructor-led, live training in Slovakia (online or onsite) is aimed at anyone who wishes to learn and master the fundamentals of econometric analysis and modeling.

By the end of this training, participants will be able to:

Learn and understand the fundamentals of econometrics.
Utilize Eviews and risk simulators.

Forecasting with R

14 Hours

This instructor-led, live training in Slovakia (online or onsite) is aimed at intermediate-level data analysts and business professionals who wish to perform time series forecasting and automate data analysis workflows using R.

By the end of this training, participants will be able to:

Understand the fundamentals of forecasting techniques in R.
Apply exponential smoothing and ARIMA models for time series analysis.
Utilize the ‘forecast’ package to generate accurate forecasting models.
Automate forecasting workflows for business and research applications.

HR Analytics for Public Organisations

14 Hours

This instructor-led, live training (online or onsite) is aimed at HR professionals who wish to use analytical methods to improve organisational performance. This course covers qualitative as well as quantitative, empirical, and statistical approaches.

Format of the Course

Interactive lecture and discussion.
Lots of exercises and practice.

Course Customization Options

To request a customized training for this course, please contact us to arrange.

Marketing Analytics using R

21 Hours

Audience

Business owners (marketing managers, product managers, customer base managers) and their teams; customer insights professionals.

Overview

The course follows the customer life cycle from acquiring new customers, managing the existing customers for profitability, retaining good customers, and finally understanding which customers are leaving us and why. We will be working with real (if anonymous) data from a variety of industries including telecommunications, insurance, media, and high tech.

Format

Instructor-led training over the course of five half-day sessions with in-class exercises as well as homework. It can be delivered as a classroom or distance (online) course.

R for Data Analysis and Research

7 Hours

Audience

managers
developers
scientists
students

Format of the course

Online instruction and discussion, or face-to-face workshops

Introduction to R

21 Hours

This course covers the manipulation of objects in R including reading data, accessing R packages, writing R functions, and making informative graphs. It includes analyzing data using common statistical models. The course teaches how to use the R software (https://www.r-project.org) both on a command line and in a graphical user interface (GUI).

R

21 Hours

Neural Network in R

14 Hours

This course is an introduction to applying neural networks to real-world problems using R-project software.

Advanced R Programming

7 Hours

This course is for data scientists and statisticians who already have basic R and C++ coding skills and need advanced R coding skills.

The purpose is to give a practical advanced R programming course to participants interested in applying the methods at work.

Sector-specific examples are used to make the training relevant to the audience.

Statistical Analysis using SPSS

21 Hours

This instructor-led, live training in Slovakia (online or onsite) is aimed at beginner-level to intermediate-level professionals who wish to perform statistical analysis using SPSS to interpret data accurately, run complex statistical tests, and generate meaningful insights.

By the end of this training, participants will be able to:

Navigate the SPSS interface and manage datasets efficiently.
Perform descriptive and inferential statistical analyses.
Conduct t-tests, ANOVA, MANOVA, regression, and correlation analyses.
Apply non-parametric tests, principal component analysis, and factor analysis for advanced data interpretation.

Talent Acquisition Analytics

14 Hours

This instructor-led, live training (online or onsite) is aimed at HR professionals and recruitment specialists who wish to use analytical methods to improve organisational performance. This course covers qualitative as well as quantitative, empirical, and statistical approaches.

Format of the Course

Interactive lecture and discussion.
Lots of exercises and practice.

Course Customization Options

To request a customized training for this course, please contact us to arrange.

Introduction to Data Visualization with Tidyverse and R

7 Hours

In this instructor-led, live training, participants will learn how to manipulate and visualize data using the tools included in the Tidyverse.

The Tidyverse is a collection of versatile R packages for cleaning, processing, modeling, and visualizing data. Some of the packages included are: ggplot2, dplyr, tidyr, readr, purrr, and tibble.

Audience

Beginners to the R language
Beginners to data analysis and data visualization

Format of the course

Part lecture, part discussion, exercises and heavy hands-on practice

By the end of this training, participants will be able to:

Perform data analysis and create appealing visualizations
Draw useful conclusions from various datasets of sample data
Filter, sort and summarize data to answer exploratory questions
Turn processed data into informative line plots, bar plots, histograms
Import and filter data from diverse data sources, including Excel, CSV, and SPSS files
