Sunbelt Computer Software

📡 Telecom Data Platform

This project demonstrates an end-to-end data engineering pipeline combining:

- Batch processing
- Real-time streaming (Kafka)
- ELT transformations (dbt)
- Orchestration (Airflow)

🛠 Tech Stack

- Python – ETL scripts & Airflow DAGs
- PostgreSQL – OLTP and raw data storage
- Kafka – Real-time event streaming
- dbt – Transformations, incremental models, and tests
- Airflow – Orchestration of batch & streaming pipelines
- Docker – Containerized infrastructure

📁 Project Structure

🚦 Pipeline Status & Architecture

Architecture Overview

┌─────────────────────────┐
│ Source DB               │  ← optional initial batch ingestion
└────────────┬────────────┘
             │
             ▼
┌─────────────────────────┐
│ Batch ETL               │  ← Airflow DAG task: batch_ingestion
│ raw_user_activity table │
└────────────┬────────────┘
             │
             ▼
┌─────────────────────────┐
│ Kafka                   │  ← Producer generates events
│ Topic: user_activity    │
└────────────┬────────────┘
             │
             ▼
┌─────────────────────────┐
│ Kafka Consumer          │  ← Consumer reads from topic
│ Writes to               │
│ raw_user_activity table │
└────────────┬────────────┘
             │
             ▼
┌─────────────────────────┐
│ dbt Models              │
│ staging → intermediate  │
│ → marts                 │
└────────────┬────────────┘
             │
             ▼
┌─────────────────────────┐
│ dbt Tests               │
│ Data quality            │
└─────────────────────────┘

🧩 Workflow (Airflow DAG)

DAG Name: telecom_pipeline

Execution Order:

batch_ingestion → kafka_producer → kafka_consumer → dbt_run → dbt_test
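
The chain above is strictly linear, so each task starts only after the one before it succeeds. Under Airflow's default trigger rule (`all_success`), a failed task leaves everything downstream unrun. The following is a plain-Python sketch of that behaviour, not the project's DAG file; the `ok` and `broken` callables are hypothetical placeholders for the real tasks.

```python
from typing import Callable, Dict, List, Tuple


def run_chain(tasks: List[Tuple[str, Callable[[], None]]]) -> Dict[str, str]:
    """Run tasks in order; once one fails, mark the rest as not run."""
    status: Dict[str, str] = {}
    failed = False
    for name, fn in tasks:
        if failed:
            # Mirrors the state Airflow assigns to tasks below a failure.
            status[name] = "upstream_failed"
            continue
        try:
            fn()
            status[name] = "success"
        except Exception:
            status[name] = "failed"
            failed = True
    return status


def ok() -> None:
    """Placeholder for a task that completes normally."""


def broken() -> None:
    """Placeholder for a task that raises."""
    raise RuntimeError("consumer could not reach the broker")


# Hypothetical run in which the consumer step fails.
result = run_chain([
    ("batch_ingestion", ok),
    ("kafka_producer", ok),
    ("kafka_consumer", broken),
    ("dbt_run", ok),
    ("dbt_test", ok),
])
```

In this run the dbt steps never execute, which is the point of placing them last: transformations and tests only see data from a fully successful ingestion.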

Tasks:

Batch Ingestion (batch_ingestion)
- Pulls initial data from source or CSV
- Inserts into raw_user_activity table in PostgreSQL
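
As a rough illustration of the CSV path, the sketch below reads a file in fixed-size batches so that each batch can be inserted in one round trip. The column names in `COLUMNS` and the `read_batches` helper are assumptions made for the example; the real `raw_user_activity` layout may differ.

```python
import csv
import io
from typing import Iterator, List, TextIO, Tuple

# Hypothetical column layout; the real raw_user_activity columns may differ.
COLUMNS = ("user_id", "event_type", "event_ts")


def read_batches(
    source: TextIO, batch_size: int = 1000
) -> Iterator[List[Tuple[str, ...]]]:
    """Yield lists of row tuples, each list holding at most batch_size rows."""
    reader = csv.DictReader(source)
    batch: List[Tuple[str, ...]] = []
    for record in reader:
        batch.append(tuple(record[col] for col in COLUMNS))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


# In-memory stand-in for a CSV file on disk.
sample = io.StringIO(
    "user_id,event_type,event_ts\n"
    "1,call,2024-01-01T10:00:00\n"
    "2,sms,2024-01-01T10:05:00\n"
    "3,internet,2024-01-01T10:07:00\n"
)
batches = list(read_batches(sample, batch_size=2))
```

Each batch could then be handed to a DB-API `cursor.executemany` call with a parameterized `INSERT` statement.
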
Kafka Producer (kafka_producer)
- Generates synthetic user events (calls, SMS, internet usage)
- Publishes messages to Kafka topic user_activity
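
One way the synthetic events might be generated is sketched below. Only the three event types come from the description above; every field name (`user_id`, `event_ts`, `duration_sec`, `data_mb`) and both helper functions are hypothetical.

```python
import json
import random
from datetime import datetime, timezone
from typing import Any, Dict, Optional

EVENT_TYPES = ("call", "sms", "internet")  # taken from the task description


def make_event(rng: random.Random, now: Optional[datetime] = None) -> Dict[str, Any]:
    """Build one synthetic event; every field name here is hypothetical."""
    now = now or datetime.now(timezone.utc)
    event_type = rng.choice(EVENT_TYPES)
    event: Dict[str, Any] = {
        "user_id": rng.randint(1, 1000),
        "event_type": event_type,
        "event_ts": now.isoformat(),
    }
    # Attach a type-specific measure where one makes sense.
    if event_type == "call":
        event["duration_sec"] = rng.randint(1, 3600)
    elif event_type == "internet":
        event["data_mb"] = round(rng.uniform(0.1, 500.0), 2)
    return event


def serialize(event: Dict[str, Any]) -> bytes:
    """Encode an event as UTF-8 JSON, a common Kafka message format."""
    return json.dumps(event).encode("utf-8")


rng = random.Random(42)  # seeded so that runs are repeatable
payloads = [serialize(make_event(rng)) for _ in range(5)]
# With the kafka-python client, each payload could then be published via
# KafkaProducer.send("user_activity", value=payload).
```

Passing the random generator in as an argument keeps the event builder deterministic under test while still allowing unseeded output in a real run.
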
Kafka Consumer (kafka_consumer)
- Reads messages from Kafka topic user_activity
- Inserts events into raw_user_activity table in PostgreSQL
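
The consumer's core job is turning each message value into a row for `raw_user_activity`. A minimal sketch of that step follows, assuming JSON-encoded messages and a hypothetical three-column layout; `to_row`, `REQUIRED`, and `INSERT_SQL` are illustrative names, not the project's code.

```python
import json
from typing import Optional, Tuple

# psycopg2-style placeholders; the column names are hypothetical.
INSERT_SQL = (
    "INSERT INTO raw_user_activity (user_id, event_type, event_ts) "
    "VALUES (%s, %s, %s)"
)
REQUIRED = ("user_id", "event_type", "event_ts")


def to_row(message: bytes) -> Optional[Tuple[object, ...]]:
    """Decode one Kafka message value; return None if it is malformed."""
    try:
        event = json.loads(message.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None
    if not isinstance(event, dict) or any(key not in event for key in REQUIRED):
        return None
    return tuple(event[key] for key in REQUIRED)


good = to_row(
    b'{"user_id": 7, "event_type": "sms", "event_ts": "2024-01-01T10:00:00+00:00"}'
)
bad = to_row(b"not json")
# A consumer loop would call cursor.execute(INSERT_SQL, row) for each
# row that is not None, and log or dead-letter the rest.
```

Returning `None` for malformed input keeps one bad message from stopping the loop; whether to skip, log, or reroute such messages is a design choice the source does not specify.
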
dbt Transformations (dbt_run)
- Staging: stg_user_activity
- Intermediate: int_user_activity
- Marts: mrt_user_metrics
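
A task like `dbt_run` typically shells out to the dbt CLI. The helper below is a sketch of how the command line might be assembled; `dbt_command` is a hypothetical function and `telecom_dbt` is an assumed project directory name, while `--project-dir` and `--select` are standard dbt flags.

```python
from typing import List, Optional


def dbt_command(
    action: str, project_dir: str, select: Optional[str] = None
) -> List[str]:
    """Build an argument list for the dbt CLI ('run' or 'test')."""
    if action not in ("run", "test"):
        raise ValueError(f"unsupported dbt action: {action}")
    cmd = ["dbt", action, "--project-dir", project_dir]
    if select:
        # Restrict the invocation to the named model(s).
        cmd += ["--select", select]
    return cmd


# "telecom_dbt" is an assumed project directory name.
run_cmd = dbt_command("run", "telecom_dbt")
staging_only = dbt_command("run", "telecom_dbt", select="stg_user_activity")
# A task could execute the result with subprocess.run(run_cmd, check=True).
```

Building the command as a list rather than a single string avoids shell quoting issues when it is passed to `subprocess.run`.
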
dbt Tests (dbt_test)
- Validates not-null and unique constraints
- Ensures transformed data quality
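
dbt's built-in `not_null` and `unique` tests are SQL queries that return the offending rows, and a test passes when its query returns nothing. The plain-Python sketch below mirrors those semantics on in-memory rows purely for illustration; the column names `event_id` and `user_id` are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, Hashable, List

Row = Dict[str, Any]


def not_null_failures(rows: List[Row], column: str) -> List[Row]:
    """Return the rows where the column is missing or None."""
    return [row for row in rows if row.get(column) is None]


def unique_failures(rows: List[Row], column: str) -> List[Hashable]:
    """Return non-null values that appear more than once in the column."""
    counts = Counter(
        row[column] for row in rows if row.get(column) is not None
    )
    return [value for value, n in counts.items() if n > 1]


rows = [
    {"event_id": 1, "user_id": 10},
    {"event_id": 2, "user_id": None},
    {"event_id": 2, "user_id": 11},
]
null_rows = not_null_failures(rows, "user_id")
dupes = unique_failures(rows, "event_id")
```

Here `null_rows` holds the one row with a missing `user_id` and `dupes` holds the repeated `event_id`, so both checks would fail on this sample.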

⚙️ Notes & Best Practices

Database Setup

- PostgreSQL schemas: raw, staging, intermediate, marts
- Airflow metadata DB uses a separate Postgres instance: airflow_postgres
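
If the four schemas need to be created on a fresh database, the statements could be generated as in the sketch below. The `schema_ddl` helper is hypothetical; how the project actually bootstraps its schemas is not stated here.

```python
from typing import List, Sequence

SCHEMAS = ("raw", "staging", "intermediate", "marts")  # as listed above


def schema_ddl(schemas: Sequence[str] = SCHEMAS) -> List[str]:
    """Return idempotent CREATE SCHEMA statements for PostgreSQL."""
    # Interpolating identifiers is acceptable only because the names
    # are fixed constants, never user input.
    return [f"CREATE SCHEMA IF NOT EXISTS {name};" for name in schemas]


statements = schema_ddl()
```

`IF NOT EXISTS` makes the statements safe to rerun, which suits an init script that may execute on every container start.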

Kafka Setup

- Topic: user_activity
- ZooKeeper and Kafka run in Docker alongside Airflow

Environment Variables

- Configuration values are defined in a .env file
- The DAG and ETL scripts load them via python-dotenv
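
Once the variables are in the process environment, collecting them in one typed object keeps the DAG and the ETL scripts consistent. The sketch below assumes hypothetical variable names (`POSTGRES_HOST`, `POSTGRES_PORT`, `KAFKA_BOOTSTRAP_SERVERS`, `KAFKA_TOPIC`) and defaults; the project's actual .env keys are not listed here.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    postgres_host: str
    postgres_port: int
    kafka_bootstrap: str
    kafka_topic: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from a mapping, defaulting to os.environ."""
    env = os.environ if env is None else env
    return Settings(
        # All variable names and defaults below are hypothetical.
        postgres_host=env.get("POSTGRES_HOST", "localhost"),
        postgres_port=int(env.get("POSTGRES_PORT", "5432")),
        kafka_bootstrap=env.get("KAFKA_BOOTSTRAP_SERVERS", "localhost:9092"),
        kafka_topic=env.get("KAFKA_TOPIC", "user_activity"),
    )


# In the project, dotenv.load_dotenv() would populate os.environ first;
# an explicit mapping is passed here so the example runs on its own.
settings = load_settings({"POSTGRES_HOST": "postgres", "POSTGRES_PORT": "5433"})
```

Accepting a mapping argument makes the loader easy to test without touching the real environment.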

Persistence

- The named volumes pgdata and airflow_pgdata persist PostgreSQL data, so schemas and tables survive Docker restarts
- Do not run docker-compose down -v if you want to preserve data: the -v flag also removes these named volumes

| Folder | Purpose |
|---|---|
| `ingestion/` | Data ingestion scripts (batch + streaming) |
| `dbt/` | dbt transformations (staging, intermediate, marts) |
| `airflow/` | DAGs, plugins, and configuration |
| `docker/` | Dockerfiles and Compose configuration |
