This project turns Jupyter notebooks into services that can be called through a REST API. Instead of running a notebook by hand, a client sends a request, the notebook is executed in the cloud, and the result comes back in a structured form.
We built this around FastAPI, Docker, Papermill, and AWS. A request comes in through the API, the notebook is started as an AWS Batch job, and the executed notebook is stored in S3.
We made this in collaboration with a client from LifeWatch, based on the need to make notebook-based workflows easier to deploy and reuse without rebuilding everything by hand after every change.
The main goal is to make notebooks easier to use as services instead of standalone files. In our setup, that means:
- extracting notebook parameters automatically,
- exposing notebooks through an API,
- running them inside containers,
- and storing the output after execution.
The idea is simple: push a notebook and get a matching REST-accessible service instead of a manual notebook workflow.
The project has a few main parts:
`pipeline.py` looks through notebooks in the current working directory and collects variables that start with `param_`. Those are written into `paramdump.json`, which is then used by the API to know which notebooks exist and which parameters they accept.
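The extraction step can be sketched roughly like this; the regex-based scan and the function name are illustrative, not the repo's exact code:

```python
import json
import re

# Rough sketch of the param_ scan: find top-level assignments like
# `param_example = 42` in code cells of an nbformat-style notebook
# and record their default values.
PARAM_RE = re.compile(r"^(param_\w+)\s*=\s*(.+)$")

def extract_params(notebook_json: dict) -> dict:
    """Collect defaults for param_-prefixed variables from code cells."""
    params = {}
    for cell in notebook_json.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        for line in "".join(cell.get("source", [])).splitlines():
            match = PARAM_RE.match(line)
            if match:
                name, literal = match.groups()
                try:
                    params[name] = json.loads(literal)  # numbers, strings, lists
                except json.JSONDecodeError:
                    params[name] = literal.strip()  # keep raw text otherwise
    return params
```

In the real pipeline the collected defaults for all notebooks would then be written to `paramdump.json` and uploaded to S3.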
`app.py` contains the FastAPI app. This is the part that handles requests such as:
- starting a notebook run,
- checking the status of a job,
- listing jobs,
- listing available notebooks,
- viewing notebook parameter defaults,
- and downloading the executed output notebook.
`docker/run_from_json.py` handles the actual notebook run inside the container. It reads the given parameters, patches the notebook, and executes it with Papermill.
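Inside the container, that step might look roughly like this; the function split and file handling are illustrative, with Papermill doing the actual execution:

```python
import json

def load_params(path: str) -> dict:
    """Read the JSON parameter file passed to the container."""
    with open(path) as f:
        params = json.load(f)
    # Only param_-prefixed keys are meant to override notebook defaults.
    return {k: v for k, v in params.items() if k.startswith("param_")}

def run(notebook_in: str, notebook_out: str, params_path: str) -> None:
    """Execute the notebook with Papermill, injecting the given parameters."""
    import papermill as pm  # imported lazily; only needed inside the container
    pm.execute_notebook(notebook_in, notebook_out,
                        parameters=load_params(params_path))
```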
The Docker image is defined in `docker/dockerfile`.
`lambda/aws_batch.py` connects the API to AWS Batch and S3. It submits jobs, maps AWS Batch states to simpler API states, lists jobs, and fetches output notebooks after execution.
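The Batch glue might look something like this; the queue and job definition names are assumptions, and the state mapping reflects the four simplified states the API exposes:

```python
import json

# Collapse AWS Batch's detailed job states into the API's simplified ones.
STATE_MAP = {
    "SUBMITTED": "queued", "PENDING": "queued", "RUNNABLE": "queued",
    "STARTING": "running", "RUNNING": "running",
    "SUCCEEDED": "succeeded", "FAILED": "failed",
}

def simplify_state(batch_state: str) -> str:
    """Map a raw AWS Batch state onto the API's four states."""
    return STATE_MAP.get(batch_state, "queued")

def submit_notebook_job(notebook: str, params: dict) -> str:
    """Submit a notebook run to AWS Batch; returns the Batch job id."""
    import boto3  # imported lazily so the mapping above stays dependency-free
    batch = boto3.client("batch", region_name="eu-west-1")
    response = batch.submit_job(
        jobName=f"notebook2rest-{notebook}",
        jobQueue="notebook2rest-queue",        # assumed queue name
        jobDefinition="notebook2rest-jobdef",  # assumed definition name
        containerOverrides={"environment": [
            {"name": "NOTEBOOK", "value": notebook},
            {"name": "PARAMS_JSON", "value": json.dumps(params)},
        ]},
    )
    return response["jobId"]
```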
This is the basic flow:
- A client sends a `POST` request for a notebook.
- The API loads `paramdump.json` from S3 and checks the notebook and its default parameters.
- If extra parameters are included, they are validated first.
- The API submits an AWS Batch job.
- The notebook runs inside a container with Papermill.
- The output notebook is written to S3.
- The client can later check the status or download the executed notebook output.
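From the client's side, the flow above can be exercised with plain HTTP; the base URL here is an assumption about a local deployment:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed local deployment of the API

def build_start_request(notebook: str, params: dict) -> urllib.request.Request:
    """Build the POST /jobs/{notebook} call that kicks off a run."""
    return urllib.request.Request(
        f"{BASE_URL}/jobs/{notebook}",
        data=json.dumps(params).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def start_job(notebook: str, params: dict) -> dict:
    """Send the request and return the API's JSON response."""
    with urllib.request.urlopen(build_start_request(notebook, params)) as resp:
        return json.load(resp)
```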
Right now the API has these routes:
- `POST /jobs/{notebook}`
- `GET /jobs`
- `GET /jobs/notebooks`
- `GET /jobs/notebooks/{name}`
- `GET /jobs/{job_id}`
- `GET /jobs/{job_id}/output`
Example body for starting a notebook:
```json
{
  "param_example": 42
}
```

The API returns simplified job states instead of raw AWS Batch states:

- `queued`
- `running`
- `succeeded`
- `failed`
- `app.py` - FastAPI app
- `pipeline.py` - notebook parameter extraction
- `lambda/aws_batch.py` - AWS Batch and S3 helper functions
- `lambda/config.py` - AWS config
- `api_repo/pipeline.sh` - older helper script for notebook conversion
- `docker/run_from_json.py` - notebook execution script
- `docker/dockerfile` - Docker image definition
There are a few things the current code expects:
- the AWS resources already exist,
- the S3 bucket is called
notebook2rest, - the AWS region is
eu-west-1, paramdump.jsonhas been generated and uploaded before the API is used,- and notebook parameters use the
param_prefix.
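These expectations would typically be centralized in something like `lambda/config.py`; the constant names below are assumptions, but the values match the ones listed above:

```python
# Hypothetical shape of the AWS configuration module; constant names are
# illustrative, values follow the assumptions the code makes.
S3_BUCKET = "notebook2rest"
AWS_REGION = "eu-west-1"
PARAMDUMP_KEY = "paramdump.json"  # must be generated and uploaded beforehand
PARAM_PREFIX = "param_"
```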
Some of the main dependencies used in this repo are:
- FastAPI
- Uvicorn
- Papermill
- nbformat / nbconvert / Jupyter
- boto3
The Docker setup also separates common runtime dependencies from notebook-specific dependencies through:
- `docker/requirements.txt`
- `docker/requirements_notebook.txt`
- Jona Aalten
- Mike van der Deure
- Theo Gatea
- Colin de Koning