Course Outline
Introduction
- Why and how project teams adopt Hadoop
- How it all started
- The Project Manager's role in Hadoop projects
Understanding Hadoop's Architecture and Key Concepts
- HDFS
- MapReduce
- Other pieces of the Hadoop ecosystem
What Constitutes Big Data?
Different Approaches to Storing Big Data
HDFS (Hadoop Distributed File System) as the Foundation
How Big Data is Processed
- The power of distributed processing
Processing Data with MapReduce
- How data is picked apart step by step
The Role of Clustering in Large-Scale Distributed Processing
- Architectural overview
- Clustering approaches
Clustering Your Data and Processes with YARN
The Role of Non-Relational Databases in Big Data Storage
Working with Hadoop's Non-Relational Database: HBase
Data Warehousing Architectural Overview
Managing Your Data Warehouse with Hive
Running Hadoop from Shell Scripts
Working with Hadoop Streaming
Other Hadoop Tools and Utilities
Getting Started on a Hadoop Project
- Demystifying complexity
Migrating an Existing Project to Hadoop
- Infrastructure considerations
- Scaling beyond your allocated resources
Hadoop Project Stakeholders and Their Toolkits
- Developers, data scientists, business analysts and project managers
Hadoop as a Foundation for New Technologies and Approaches
Closing Remarks
Requirements
- A general understanding of programming
- An understanding of databases
- Basic knowledge of Linux
Testimonials
The trainer's preparation and organization, and the quality of the materials provided on GitHub.
Mateusz Rek - MicroStrategy Poland Sp. z o.o.
Course - Impala for Business Intelligence
I liked the VM very much. The teacher was very knowledgeable about the topic as well as other topics, and he was very nice and friendly. I also liked the facility in Dubai.
Safar Alqahtani - Elm Information Security
Course - Big Data Analytics in Health
I thought he did a great job of tailoring the experience to the audience. This class is mostly designed to cover data analysis with Hive, but my co-worker and I are doing Hive administration with no real data analytics responsibilities.
Ian Reif - Franchise Tax Board
Course - Data Analysis with Hive/HiveQL
I genuinely enjoyed the many hands-on sessions.
Jacek Pieczątka
Course - Administrator Training for Apache Hadoop
The fact that all the data and software were ready to use on an already prepared VM, provided by the trainer on external disks.