🐍 Python Data Engineering Mastery

A comprehensive curriculum from Python basics to production data pipelines.

📚 Curriculum Overview

| Module | Topic | Sub-Topics |
|--------|-------|------------|
| 00 | Getting Started | Setup, REPL, Environment |
| 01 | Python Fundamentals | Variables, Numbers, Strings, Lists, Tuples, Dicts, Sets, Operators |
| 02 | Control Flow | Conditionals, Loops, Comprehensions, Iterators/Generators |
| 03 | Functions & Modules | Basics, Args/Kwargs, Decorators, Closures/Scope |
| 04 | Intermediate Python | Exceptions, Files, Regex, Datetime, OOP |
| 05 | Data Engineering Essentials | NumPy, Pandas Basics, Pandas Advanced |
| 06 | Data Formats | CSV, JSON, Parquet |
| 07 | APIs & Databases | REST APIs, Flask, Web Scraping |
| 08 | Cloud Pipelines | S3, Lambda, Pipeline Patterns |

🚀 Quick Start

# From the cloned repository root: create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt

# Launch notebooks
jupyter notebook
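
After installing, it can help to confirm that the environment is usable before opening a notebook. The following is a minimal sketch, not part of the repository: the package names are an assumption inferred from the module topics (NumPy, Pandas, Jupyter), the `check_packages` helper is hypothetical, and `requirements.txt` remains the authoritative list.

```python
"""Check that packages the curriculum is likely to need are importable."""
from importlib.util import find_spec

# Assumed package list, inferred from the module topics.
# requirements.txt is the authoritative source; adjust as needed.
ASSUMED_PACKAGES = ["numpy", "pandas", "notebook"]


def check_packages(names):
    """Return a dict mapping each top-level package name to True if importable."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, ok in check_packages(ASSUMED_PACKAGES).items():
        print(f"{name:<10} {'ok' if ok else 'MISSING'}")
```

Run it with the virtual environment activated. Any package reported as missing most likely means `pip install -r requirements.txt` did not complete, or the list assumed here differs from the real one.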

📁 Structure

├── 00_getting_started/
│   ├── README.md          ← Core concepts & cheatsheets
│   └── demo.ipynb         ← Hands-on notebook
│
├── 01_python_fundamentals/
│   ├── README.md          ← Module overview
│   ├── demo.ipynb         ← Interactive demos
│   ├── 01_variables_types/
│   │   └── README.md      ← Deep dive
│   ├── 02_numbers/
│   ├── 03_strings/
│   ├── 04_lists/
│   ├── 05_tuples/
│   ├── 06_dictionaries/
│   ├── 07_sets/
│   └── 08_operators/
│
├── 02_control_flow/        (4 sub-topics)
├── 03_functions_modules/   (4 sub-topics)
├── 04_intermediate_python/ (5 sub-topics)
├── 05_data_engineering_essentials/ (3 sub-topics)
├── 06_data_formats/        (3 sub-topics)
├── 07_apis_databases/      (3 sub-topics)
├── 08_cloud_pipelines/     (3 sub-topics)
│
├── projects/               ← Capstone projects
├── data/                   ← Sample data files
├── exercises/              ← Practice exercises
└── requirements.txt
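
Because every module directory in the tree above starts with a two-digit prefix, the modules can be listed in curriculum order with a few lines of Python. This is a sketch under that naming assumption; `list_modules` is a hypothetical helper and is not shipped with the repository.

```python
"""List the numbered module directories of the curriculum in order."""
import re
from pathlib import Path

# Matches names such as "00_getting_started" or "08_cloud_pipelines".
MODULE_DIR = re.compile(r"^\d{2}_\w+$")


def list_modules(root):
    """Return the names of numbered module directories under root, sorted."""
    return sorted(
        p.name
        for p in Path(root).iterdir()
        if p.is_dir() and MODULE_DIR.match(p.name)
    )


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for name in list_modules("."):
        print(name)
```

Unnumbered directories such as `projects/`, `data/`, and `exercises/` are skipped because they do not match the prefix pattern.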

🎯 Learning Paths

Beginner (Modules 00-03)

Python fundamentals: syntax, data types, control flow, functions

Intermediate (Modules 04-06)

File handling, data manipulation with Pandas, data formats

Advanced (Modules 07-08)

APIs, databases, cloud services, production pipelines

📊 Features

  • Visual Cheatsheets - ASCII diagrams for quick reference
  • Deep Dive Sub-Topics - Detailed coverage of each concept
  • Hands-On Notebooks - Runnable code in Jupyter
  • Data Engineering Focus - Real-world patterns and best practices
  • Practice Exercises - Build skills with guided challenges
