HighviewOne/DataEngineeringZoomcamp2026


🛠️ Data Engineering Zoomcamp 2026

My homework solutions and projects for the Data Engineering Zoomcamp 2026, a free course by DataTalks.Club covering the modern data engineering stack.



Python · Docker · PostgreSQL · BigQuery · dbt · Apache Spark · Apache Flink · Terraform


📚 Homework Solutions

| # | Module | Topic | Tech Stack |
|---|--------|-------|------------|
| 1 | Module1 | Docker & SQL | Docker, PostgreSQL, Terraform, GCP |
| 2 | Module2 | Workflow Orchestration | Kestra |
| 3 | Module3 | Data Warehouse | BigQuery, dlt |
| 4 | Module4 | Analytics Engineering | dbt, dimensional modeling |
| 5 | Module5 | Data Platforms | Bruin |
| 6 | Module6 | Batch Processing | Apache Spark (PySpark) |
| 7 | Module7 | Streaming | PyFlink, Redpanda |
| 🧪 | Workshop | dlt Workshop | dlt, DuckDB, REST API |

Each module folder contains a homework.md with the questions, my answers, and the code/queries used to derive them.

🚀 Capstone Project

SoCal NOD Tracker: Foreclosure Early-Warning Pipeline

An end-to-end pipeline that tracks Notice of Default (NOD) filings across 6 Southern California counties. NOD filings are a leading indicator of foreclosure activity.

Daily CSVs → Kestra DAG → GCS (data lake) → BigQuery (raw) → dbt (marts) → Looker Studio

Stack: Terraform · Kestra · Google Cloud Storage · BigQuery · dbt · Looker Studio

See the full project README for architecture, reproduction steps, and the dashboard.
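
The stage order above can be sketched in a few lines of Python to make the dependencies explicit. This is only an illustration, not the project's actual code: the stage names are copied from the flow line, while `downstream_of`, `lake_object_key`, and the date-partitioned key layout are hypothetical and assumed purely for the example.

```python
from datetime import date

# Stage order copied from the pipeline flow line above.
STAGES = [
    "Daily CSVs",
    "Kestra DAG",
    "GCS (data lake)",
    "BigQuery (raw)",
    "dbt (marts)",
    "Looker Studio",
]


def downstream_of(stage: str) -> list[str]:
    """Return every stage that runs after the given one."""
    idx = STAGES.index(stage)  # raises ValueError for an unknown stage
    return STAGES[idx + 1:]


def lake_object_key(county: str, filing_date: date) -> str:
    """Build a date-partitioned object key for one county's daily CSV.

    The layout is an assumption for illustration, not the project's real one.
    """
    slug = county.lower().replace(" ", "_")
    return f"raw/nod/county={slug}/dt={filing_date.isoformat()}/filings.csv"


if __name__ == "__main__":
    print(downstream_of("BigQuery (raw)"))
    print(lake_object_key("Los Angeles", date(2026, 1, 15)))
```

Keying each daily file by county and date is one common way to let a daily orchestrated run reload a single day without touching the others; the real layout is documented in the project README.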

🧰 Tools & Technologies

| Layer | Tools |
|-------|-------|
| Containerization | Docker, Docker Compose |
| Orchestration | Kestra |
| Data Lake | Google Cloud Storage |
| Data Warehouse | BigQuery, PostgreSQL, DuckDB |
| Ingestion | dlt |
| Transformation | dbt, Bruin |
| Batch Processing | Apache Spark / PySpark |
| Streaming | Apache Flink (PyFlink), Redpanda |
| Infrastructure as Code | Terraform |
| Visualization | Looker Studio |

📂 Repository Structure

.
├── Module1/        # Docker & SQL
├── Module2/        # Workflow Orchestration (Kestra)
├── Module3/        # Data Warehouse (BigQuery, dlt)
├── Module4/        # Analytics Engineering (dbt)
├── Module5/        # Data Platforms (Bruin)
├── Module6/        # Batch Processing (Spark)
├── Module7/        # Streaming (PyFlink, Redpanda)
├── Workshop/       # dlt Workshop
└── project1/       # Capstone: SoCal NOD Tracker

🎓 About the Course

The Data Engineering Zoomcamp is a free, hands-on course covering data engineering fundamentals: containerization, workflow orchestration, data warehousing, analytics engineering, batch processing, and streaming.

👤 Author

Michael (@HighviewOne)

📄 License

Released under the MIT License.

⭐ If you find this useful, consider giving it a star!
