This repository contains the code for the AI for Industry workshop series at KTH AI Society. The workshop series is organized by Timothy Lindblom and aimed at teaching students how to work with data at scale.
We focus on data wrangling in Python using DuckDB. DuckDB is an embeddable SQL OLAP database management system. It is designed to be easy to install, use, and extend, while remaining fast and reliable.
The workshop also examines how Docker is used for cloud applications and for standardizing workplace environments.
A real-life example is shown through a RAG (Retrieval-Augmented Generation) pipeline that uses OpenAI's API and is hosted on Google Cloud.
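
The retrieval step of a RAG pipeline can be sketched in a few lines. This is a toy illustration only, not the workshop's pipeline: it swaps a real embedding model for bag-of-words counts, makes no call to OpenAI's API, and every function name here (`embed`, `cosine`, `retrieve`, `build_prompt`) is a hypothetical helper.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    query = embed(question)
    ranked = sorted(
        documents, key=lambda doc: cosine(query, embed(doc)), reverse=True
    )
    return ranked[:k]


def build_prompt(question, documents, k=1):
    """Combine retrieved context and the question into one prompt string."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


documents = [
    "DuckDB is an embeddable SQL OLAP database.",
    "Docker packages applications into containers.",
]
prompt = build_prompt("What is DuckDB?", documents)
print(prompt)
```

In a full pipeline, the resulting prompt would be sent to a language model, which generates the answer from the retrieved context.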