GitHub - bashoori/bashoori · GitHub

Hi, I'm Bita 👋

Data Engineer · Microsoft Stack · Big Data · Cloud-Native ETL
Building scalable, reliable data platforms with Azure, Fabric, and Spark.

LinkedIn · Portfolio · Email


🎯 About

Data Engineer with 5+ years of experience building scalable, reliable cloud data platforms using Microsoft Azure, Fabric, and Apache Spark. I specialize in turning fragmented, inconsistent data into trustworthy systems where small data quality issues don't become business problems.

My focus:

  • Cloud Data Platforms — Microsoft Azure, Fabric, medallion architecture, cloud-native design
  • Big Data Processing — Apache Spark, distributed computing, performance optimization
  • Analytics Engineering — Dimensional modeling, analytics-ready data warehouses, Power BI
  • Data Quality & Reliability — Validation, monitoring, observability, production patterns

📍 Vancouver, Canada · 🇨🇦 Open to Data Engineer / Analytics Engineer roles · Public sector focus (TransLink, health authorities, government)


🛠 Tech Stack

Cloud & Big Data Platforms

Data Warehousing & Analytics

Orchestration & Processing

Infrastructure & Tools

Architecture & Patterns
Medallion Architecture · Dimensional Modeling · Real-time Pipelines · Cloud-Native Design · Data Quality Validation · Observability


📌 Featured Projects

Enterprise-scale medallion lakehouse on Microsoft Fabric. Multi-region retail data platform (27 countries, 10B+ annual transactions).

  • Incremental processing, data quality validation, analytics-ready models
  • Unified customer / product / sales dimensional models
  • Power BI integration for real-time reporting

Stack: Microsoft Fabric · OneLake · PySpark · Power BI · Big Data at scale
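
Incremental processing with a data-quality gate can be illustrated in plain Python. This is a minimal sketch, not the project's code: it assumes each record carries an `updated_at` timestamp and that the high-watermark from the previous run is stored somewhere. The field names, the two quality rules, and the `incremental_batch` helper are hypothetical; on Fabric the same idea would normally be written in PySpark.

```python
from datetime import datetime
from typing import Iterable


def incremental_batch(
    rows: Iterable[dict], watermark: datetime
) -> tuple[list[dict], list[dict], datetime]:
    """Split rows newer than the watermark into valid and rejected sets.

    Returns (valid_rows, rejected_rows, new_watermark).
    """
    valid, rejected = [], []
    new_watermark = watermark
    for row in rows:
        if row["updated_at"] <= watermark:
            continue  # already processed in an earlier run
        new_watermark = max(new_watermark, row["updated_at"])
        # Hypothetical quality rules: a key must exist, amounts cannot be negative.
        if row.get("order_id") is None or row.get("amount", 0) < 0:
            rejected.append(row)
        else:
            valid.append(row)
    return valid, rejected, new_watermark
```

In this sketch the watermark advances past rejected rows as well, so a bad record is quarantined once rather than re-read on every run. Whether that is the right trade-off depends on how rejected rows are repaired and replayed.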

End-to-end data warehouse on real TransLink GTFS transit data using medallion architecture.

  • Bronze → Silver → Gold layers with embedded data-quality checks
  • Handled domain edge cases (GTFS times beyond 24:00)
  • Dimensional models for ridership analysis and operational insights
  • Public sector project (TransLink / transit authority relevance)

Stack: Python · SQL Server · Medallion · Dimensional Modeling
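
The GTFS time edge case is worth a short illustration. GTFS allows stop times past 24:00:00 for trips that continue after midnight of the service day, so standard time parsers reject values such as `25:30:00`. A minimal sketch, assuming times arrive as `HH:MM:SS` strings; the function names are hypothetical and not taken from the project.

```python
from datetime import date, datetime, timedelta


def gtfs_time_to_seconds(value: str) -> int:
    """Convert a GTFS HH:MM:SS string (hours may exceed 23) to seconds
    since the start of the service day."""
    hours, minutes, seconds = (int(part) for part in value.strip().split(":"))
    if hours < 0 or not (0 <= minutes < 60) or not (0 <= seconds < 60):
        raise ValueError(f"invalid GTFS time: {value!r}")
    return hours * 3600 + minutes * 60 + seconds


def gtfs_time_to_datetime(service_date: date, value: str) -> datetime:
    """Resolve a GTFS time against its service date, rolling into the
    next calendar day when the hour is 24 or later.

    Simplification: treats the service day as starting at local midnight
    and ignores daylight-saving transitions.
    """
    midnight = datetime.combine(service_date, datetime.min.time())
    return midnight + timedelta(seconds=gtfs_time_to_seconds(value))
```

Storing the offset in seconds alongside the service date keeps late-night trips attached to the correct service day, which matters when aggregating by day.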

Production-grade ETL pipeline demonstrating enterprise reliability patterns.

  • Apache Airflow orchestration, Spark distributed processing, AWS infrastructure
  • Idempotency, failure handling, comprehensive monitoring and observability
  • Real-world migration case study: legacy batch jobs → cloud-native DAGs

Stack: Apache Airflow · Spark · AWS · Production Patterns
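
Idempotency and failure handling can be sketched without any orchestration framework. The example below is plain Python rather than Airflow code, and everything in it is a hypothetical stand-in: `retry` mimics task-level retries with exponential backoff, and `load_partition` uses an in-memory dict in place of a real warehouse. It assumes each load is keyed by a run date.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(task: Callable[[], T], attempts: int = 3, base_delay: float = 1.0) -> T:
    """Run task, retrying on any exception with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == attempts:
                raise  # out of retries: surface the failure to the caller
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")


def load_partition(
    warehouse: dict[str, list[dict]], run_date: str, rows: list[dict]
) -> None:
    """Idempotent load: overwrite the whole partition for run_date
    instead of appending, so a rerun cannot duplicate rows."""
    warehouse[run_date] = list(rows)
```

Because the load replaces the partition rather than appending to it, retrying a failed run or backfilling the same date leaves the target in the same state as a single successful run.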

Medallion-based lakehouse using Delta Lake + Unity Catalog.

  • Governed data access, scalable PySpark transformations, reusable logic

Stack: Databricks · Delta Lake · Unity Catalog · PySpark

Health data integration using FHIR standards.

  • Multi-source healthcare data consolidation, compliance-focused design
  • Public sector angle (health authority data solutions)

Stack: FHIR · Healthcare APIs · Python · Data Integration
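
Consolidating FHIR data usually starts by flattening nested resources into tabular rows. A minimal sketch, assuming a FHIR `Patient` resource already parsed from JSON into a dict; the `flatten_patient` helper and the output column names are hypothetical, and only a handful of standard Patient fields are handled.

```python
from typing import Any


def flatten_patient(resource: dict[str, Any]) -> dict[str, Any]:
    """Flatten a FHIR Patient resource into one analytics-friendly row."""
    if resource.get("resourceType") != "Patient":
        raise ValueError("expected a FHIR Patient resource")
    # Patient.name is a list of HumanName entries; take the first for simplicity.
    name = (resource.get("name") or [{}])[0]
    return {
        "patient_id": resource.get("id"),
        "family_name": name.get("family"),
        "given_names": " ".join(name.get("given", [])),
        "gender": resource.get("gender"),
        "birth_date": resource.get("birthDate"),
    }
```

Missing optional fields come through as `None` (or an empty string for given names) rather than raising, which leaves the decision about rejecting incomplete records to a later validation step.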


🎯 What I Focus On

Medallion Architecture — Bronze/Silver/Gold layering, clean separation of concerns
Big Data at Scale — PySpark, distributed processing, performance optimization
Cloud-Native Platforms — Microsoft Azure, Fabric, modern data lakehouse design
Data Quality — Validation gates, anomaly detection, observability
Analytics Engineering — Dimensional models, fact/dimension tables, Power BI
Production Reliability — Failure handling, retries, monitoring, operational maturity
Public Sector Data — TransLink, healthcare, government-relevant skills


🎓 Certifications & Learning

  • Microsoft Certified: Azure Data Fundamentals (DP-900)
  • 📘 In progress: Microsoft Fabric Data Engineer (DP-700)
  • 📚 Building hands-on lakehouse projects on Microsoft Fabric & Databricks

💼 Currently Looking For

Data Engineer / Analytics Engineer roles focused on:

  • ✓ Microsoft Azure / Fabric cloud platforms
  • ✓ Big data processing (Spark, distributed systems)
  • ✓ Medallion / lakehouse architectures
  • ✓ Analytics-ready data warehouse design
  • ✓ Public sector (TransLink, BC Public Service, health authorities, municipalities)

📍 Vancouver, Canada · Open to on-site / hybrid / remote


📫 Let's Connect

I'm open to collaborating on data platform projects, exploring new roles, or discussing pipelines and data architecture.

LinkedIn · Portfolio · Email

Build systems that remain reliable as complexity grows.
