Python, Spark, and Hadoop for Big Data Training Course

Python is a flexible, widely used programming language for data science and machine learning. Spark is a data processing engine for querying, analyzing, and transforming big data, while Hadoop is an open-source software framework for distributed storage and processing of large data sets.

This instructor-led, live training (online or onsite) is aimed at developers who wish to use and integrate Spark, Hadoop, and Python to process, analyze, and transform large and complex data sets.

By the end of this training, participants will be able to:

Set up the necessary environment to start processing big data with Spark, Hadoop, and Python.
Understand the features, core components, and architecture of Spark and Hadoop.
Integrate Spark, Hadoop, and Python for big data processing.
Explore the tools in the Spark ecosystem (Spark MLlib, Spark Streaming, Kafka, Sqoop, and Flume).
Build collaborative filtering recommendation systems similar to Netflix, YouTube, Amazon, Spotify, and Google.
Use Apache Mahout to scale machine learning algorithms.
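
As a rough illustration of the collaborative filtering objective above, here is a minimal sketch of user-based collaborative filtering in plain Python. It is not the course's implementation, which presumably relies on Spark and its libraries. The ratings data and the function names (`cosine_similarity`, `predict`, `recommend`) are hypothetical and chosen only for this example.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating}. Illustrative data only.
ratings = {
    "ana": {"A": 5.0, "B": 3.0, "C": 4.0},
    "ben": {"A": 4.0, "B": 3.0, "C": 5.0, "D": 4.0},
    "cy": {"A": 1.0, "B": 5.0, "D": 2.0},
}


def cosine_similarity(u, v):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)


def predict(user, item, ratings):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        sim = cosine_similarity(ratings[user], their)
        num += sim * their[item]
        den += abs(sim)
    return num / den if den else None


def recommend(user, ratings, top_n=3):
    """Rank the items the user has not rated by predicted rating."""
    seen = set(ratings[user])
    candidates = {i for r in ratings.values() for i in r} - seen
    scored = [(i, predict(user, i, ratings)) for i in candidates]
    scored = [(i, s) for i, s in scored if s is not None]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


if __name__ == "__main__":
    print(recommend("ana", ratings))
```

The sketch keeps everything in memory on one machine. At big data scale the same idea is usually expressed with a distributed library instead, which is where Spark comes in.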

Format of the Course

Interactive lecture and discussion.
Lots of exercises and practice.
Hands-on implementation in a live-lab environment.

Course Customization Options

To request customized training for this course, please contact us to arrange it.

Testimonials (4)

Examples and exercises perfectly adapted to our domain

Luc - CS Group

Course - Scaling Data Analysis with Python and Dask

Having more practical exercises that use data similar to what we work with in our projects (satellite images in raster format)

Matthieu - CS Group

Course - Scaling Data Analysis with Python and Dask

A lot of practical examples, different ways to approach the same problem, and sometimes not-so-obvious tricks for improving the current solution

Rafał - Nordea

Course - Apache Spark MLlib

Caterina - Stamtech

Course - Developing APIs with Python and FastAPI

Related Courses

Python and Spark for Big Data (PySpark)

Introduction to Graph Computing

Artificial Intelligence - the most applied stuff - Data Analysis + Distributed AI + NLP

Apache Spark MLlib

Data Analysis with Python, Pandas and Numpy

Accelerating Python Pandas Workflows with Modin

Machine Learning with Python and Pandas

Scaling Data Analysis with Python and Dask

FARM (FastAPI, React, and MongoDB) Full Stack Development

Developing APIs with Python and FastAPI

Scientific Computing with Python SciPy

Game Development with PyGame

Web application development with Flask

Advanced Flask

Build REST APIs with Python and Flask
