Course Outline
Big Data Overview:
- Defining Big Data
- Reasons for the growing popularity of Big Data
- Real-world Big Data Case Studies
- Key Characteristics of Big Data
- Solutions for processing Big Data
Hadoop and Its Components:
- Introduction to Hadoop and its core components
- Hadoop Architecture and the types of data it can handle and process
- A brief history of Hadoop, key companies adopting it, and their motivations
- Detailed explanation of the Hadoop Framework and its components
- Understanding HDFS and the operations of reading from and writing to the Hadoop Distributed File System
- Steps to set up a Hadoop Cluster in various modes: Standalone, Pseudo-distributed, and Multi-node
(This section covers establishing a Hadoop cluster using VirtualBox, KVM, or VMware, configuring the necessary network settings, starting the Hadoop daemons, and testing cluster functionality.)
- Explanation of the MapReduce framework and its operational mechanics
- Executing MapReduce jobs on a Hadoop cluster
- Concepts of replication, mirroring, and rack awareness within Hadoop clusters
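As a rough illustration of the MapReduce mechanics listed above, the following is a minimal single-process sketch of the map, shuffle, and reduce phases using the classic word-count example. It is not Hadoop code and does not use the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict


def map_phase(line):
    # Map: emit a (word, 1) pair for every whitespace-separated word.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Shuffle: group all emitted values by key, as the framework
    # would do between the map and reduce phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Reduce: collapse the values for one key into a single count.
    return key, sum(values)


def word_count(lines):
    # Run the three phases in sequence within one process.
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big cluster"]))
```

On a real cluster the map and reduce functions run in parallel on different nodes and the framework handles the shuffle; the sketch only shows how data flows between the phases.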
Hadoop Cluster Planning:
- Strategies for planning your Hadoop cluster
- Evaluating hardware and software requirements for cluster planning
- Analyzing workloads to prevent failures and optimize cluster performance
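To make the hardware-sizing topic above more concrete, here is a back-of-the-envelope sketch of a storage estimate. It assumes a replication factor of 3 (the HDFS default) and a 25% allowance for temporary and intermediate data; the overhead figure, the function names, and the example numbers are illustrative assumptions, not recommendations from the course.

```python
import math


def raw_storage_tb(data_tb, replication=3, overhead=0.25):
    # Raw capacity = logical data x replication factor x (1 + overhead).
    # The 25% overhead is an assumed allowance, not a fixed rule.
    return data_tb * replication * (1 + overhead)


def nodes_needed(data_tb, disk_per_node_tb, replication=3, overhead=0.25):
    # Round up: a partially filled node is still a whole node.
    total = raw_storage_tb(data_tb, replication, overhead)
    return math.ceil(total / disk_per_node_tb)


if __name__ == "__main__":
    # Hypothetical example: 100 TB of data, 24 TB of usable disk per node.
    print(raw_storage_tb(100))    # 375.0
    print(nodes_needed(100, 24))  # 16
```

Storage is only one dimension of planning; CPU, memory, and network requirements depend on the workload and need to be evaluated separately.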
Introduction to MapR and Its Value:
- Overview of MapR and its architecture
- Exploring and working with MapR Control System, MapR Volumes, snapshots, and mirrors
- Planning a cluster specifically for the MapR environment
- Comparing MapR with other distributions and Apache Hadoop
- Installation and deployment of MapR clusters
Cluster Setup and Administration:
- Managing services, nodes, snapshots, mirrored volumes, and remote clusters
- Understanding and managing cluster nodes
- Gaining insight into Hadoop components and installing them alongside MapR services
- Accessing data on the cluster, including via NFS, while managing services and nodes
- Managing data with volumes
- Managing users and groups
- Assigning roles to nodes
- Commissioning and decommissioning nodes
- Cluster administration and performance monitoring
- Configuring and analyzing metrics
- MapR security administration
- Understanding and utilizing M7 Native storage for MapR tables
- Configuring and tuning the cluster for optimum performance
Cluster Upgrades and Integration:
- Upgrading MapR software versions and understanding upgrade types
- Configuring the MapR cluster to access an HDFS cluster
- Setting up a MapR cluster on Amazon Elastic MapReduce
All the above topics include demonstrations and practice sessions to provide learners with hands-on experience of the technology.
Requirements
- Foundational knowledge of the Linux File System
- Basic understanding of Java
- Familiarity with Apache Hadoop (recommended)
28 Hours
Testimonials (1)
The practical exercises; the theory was also presented well by Ajay.