Football Data Pipeline
End-to-end pipeline that ingests, transforms, and loads football match data for downstream analytics.
- API Source: Match data
- Airflow: Orchestration
- GCS: Raw Lake
- Spark: Transform
- BigQuery: Warehouse
- dbt: Mart
- Analytics: Consumption
Challenge
Orchestrating multi-step ingestion and transformation reliably across 6 dependent stages.
Approach
Designed a 6-task Airflow DAG (daily schedule); used Spark for heavy transforms; modelled marts with dbt backed by 10 data-quality tests.
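As a rough illustration of how six dependent tasks chain together, the sketch below models the ordering with Python's standard-library graphlib rather than Airflow itself. The task names are hypothetical; the description above only states that the DAG has six tasks and runs daily. In an actual Airflow DAG the same ordering would typically be declared with operators joined by `>>`.

```python
from graphlib import TopologicalSorter

# Hypothetical task names: the project description only says there are six tasks.
# Each key maps to the set of tasks that must finish before it can start.
DEPENDENCIES = {
    "extract_matches_from_api": set(),
    "upload_raw_to_gcs": {"extract_matches_from_api"},
    "spark_transform": {"upload_raw_to_gcs"},
    "load_to_bigquery": {"spark_transform"},
    "dbt_run_marts": {"load_to_bigquery"},
    "dbt_test_marts": {"dbt_run_marts"},
}


def execution_order(dependencies):
    """Return one valid run order; raises graphlib.CycleError if there is a cycle."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for position, task in enumerate(execution_order(DEPENDENCIES), start=1):
        print(position, task)
```

Because every task here has exactly one upstream task, there is only one valid order, and a failure at any step blocks everything downstream of it.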
Impact
Fully automated daily pipeline: zero manual steps from API source to analytics-ready mart, guarded by 10 dbt tests.
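To give a feel for what such tests guard against, here is a minimal Python sketch of the logic behind two common dbt generic tests, `not_null` and `unique`. It is not the project's actual test suite: in dbt these checks are declared in YAML and compiled to SQL, and the column names and sample rows below are made up for illustration.

```python
from collections import Counter


def not_null_failures(rows, column):
    """Return the rows where `column` is missing or None."""
    return [row for row in rows if row.get(column) is None]


def unique_failures(rows, column):
    """Return the non-null values of `column` that occur more than once."""
    counts = Counter(row[column] for row in rows if row.get(column) is not None)
    return sorted(value for value, count in counts.items() if count > 1)


# Hypothetical mart rows; the real schema is not given in the description.
sample_rows = [
    {"match_id": 1, "home_team": "A"},
    {"match_id": 2, "home_team": None},
    {"match_id": 2, "home_team": "C"},
]

if __name__ == "__main__":
    print(not_null_failures(sample_rows, "home_team"))
    print(unique_failures(sample_rows, "match_id"))
```

A check passes when its failure list is empty, which mirrors how a dbt test passes when its query returns no rows.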