jay-zhao1/task-forge

mini-lambda-rust

mini-lambda-rust is a small serverless-style job execution platform built with Rust, Tokio, Axum, and SQLite. It exposes an HTTP API for submitting jobs, persists metadata in SQLite, processes work asynchronously with a Tokio worker pool, and tracks retries, failures, and timeouts with structured logs.

The project is intentionally scoped as a local MVP, but the structure is production-oriented:

  • Axum HTTP API
  • Tokio async runtime
  • SQLx + SQLite persistence
  • In-memory Tokio mpsc queue
  • Configurable worker pool
  • Structured tracing logs
  • JSON error responses
  • Graceful shutdown with queue draining
  • Basic integration tests

Features

  • POST /jobs submits a new job and queues it for background execution.
  • GET /jobs/:id fetches one job with current metadata and status.
  • GET /jobs lists jobs ordered by creation time descending.
  • POST /jobs/:id/retry requeues failed or timed-out jobs.
  • DELETE /jobs/:id deletes a job record.
  • GET /health checks API and database availability.
  • Failed jobs are retried automatically while retry_count is below max_retries.
  • Timed-out jobs are marked timeout and can be retried manually.

Architecture

The codebase is split into small modules:

  • src/main.rs: binary entrypoint, tracing setup, HTTP server, graceful shutdown
  • src/lib.rs: runtime bootstrap
  • src/api/: Axum router and handlers
  • src/models/: API and domain models
  • src/db/: SQLite connection, schema initialization, and job queries
  • src/scheduler/: job submission and retry orchestration
  • src/worker/: worker pool and simulated job execution
  • src/state/: shared application state
  • src/error/: centralized error handling and JSON responses

Runtime flow:

  1. A client submits a job to POST /jobs.
  2. The API validates the request, stores the job in SQLite, and pushes the job ID onto a Tokio mpsc queue.
  3. A worker picks up the job, marks it running, and simulates async execution.
  4. The worker updates the database to success, failed, or timeout.
  5. If a job fails and retry_count < max_retries, it is requeued automatically.
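
The flow above can be sketched with the standard library alone. This is a simplified illustration, not the project's code: it swaps the Tokio mpsc queue for std::sync::mpsc with OS threads, and the SQLite table for an in-memory map. The names run_jobs and Store are hypothetical. It also shows one way queue draining can work: once the sender is dropped, workers finish the remaining IDs and then exit.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical in-memory stand-in for the SQLite jobs table: job ID -> status.
type Store = Arc<Mutex<HashMap<u32, &'static str>>>;

/// Stores each job as "pending", queues its ID, lets `worker_count` workers
/// process the queue, and returns the final statuses.
fn run_jobs(job_ids: &[u32], worker_count: usize) -> HashMap<u32, &'static str> {
    let store: Store = Arc::new(Mutex::new(HashMap::new()));
    let (tx, rx) = mpsc::channel::<u32>();
    // std's Receiver is single-consumer, so the workers share it behind a mutex.
    let rx = Arc::new(Mutex::new(rx));

    let workers: Vec<_> = (0..worker_count)
        .map(|_| {
            let rx = Arc::clone(&rx);
            let store = Arc::clone(&store);
            thread::spawn(move || loop {
                // The lock is released at the end of this statement.
                let next = rx.lock().unwrap().recv();
                match next {
                    Ok(id) => {
                        store.lock().unwrap().insert(id, "running");
                        // The simulated work would run here.
                        store.lock().unwrap().insert(id, "success");
                    }
                    // Sender dropped and queue empty: everything is drained.
                    Err(_) => break,
                }
            })
        })
        .collect();

    for &id in job_ids {
        // Persist the job first, then enqueue only its ID.
        store.lock().unwrap().insert(id, "pending");
        tx.send(id).unwrap();
    }
    // Closing the queue lets the workers drain what is left and stop.
    drop(tx);

    for worker in workers {
        worker.join().unwrap();
    }
    let statuses = store.lock().unwrap().clone();
    statuses
}
```

Calling run_jobs(&[1, 2, 3], 2) returns a map in which every ID has reached "success". Persisting before enqueueing means a worker never sees an ID that is missing from the store.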

Job Model

Each job stores:

  • id
  • payload
  • status
  • created_at
  • started_at
  • finished_at
  • retry_count
  • max_retries
  • timeout_seconds
  • result
  • error_message

Statuses:

  • pending
  • running
  • success
  • failed
  • timeout
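
As a rough sketch, the statuses and the two retry rules described in this README could be modeled as below. The type and function names (JobStatus, is_manually_retryable, should_auto_retry) are illustrative assumptions, not identifiers taken from the codebase.

```rust
/// The five job statuses listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JobStatus {
    Pending,
    Running,
    Success,
    Failed,
    Timeout,
}

impl JobStatus {
    /// Lowercase name as it appears in API responses.
    fn as_str(self) -> &'static str {
        match self {
            JobStatus::Pending => "pending",
            JobStatus::Running => "running",
            JobStatus::Success => "success",
            JobStatus::Failed => "failed",
            JobStatus::Timeout => "timeout",
        }
    }

    /// Parses a stored status string; unknown values yield None.
    fn parse(s: &str) -> Option<JobStatus> {
        match s {
            "pending" => Some(JobStatus::Pending),
            "running" => Some(JobStatus::Running),
            "success" => Some(JobStatus::Success),
            "failed" => Some(JobStatus::Failed),
            "timeout" => Some(JobStatus::Timeout),
            _ => None,
        }
    }

    /// POST /jobs/:id/retry applies to failed or timed-out jobs.
    fn is_manually_retryable(self) -> bool {
        matches!(self, JobStatus::Failed | JobStatus::Timeout)
    }
}

/// Automatic retry rule from the runtime flow: only failed jobs,
/// and only while retry_count < max_retries.
fn should_auto_retry(status: JobStatus, retry_count: u32, max_retries: u32) -> bool {
    status == JobStatus::Failed && retry_count < max_retries
}
```

With max_retries set to 2, a failed job with retry_count 1 is requeued, while one with retry_count 2 stays failed until it is retried manually.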

Simulated Execution Rules

The worker uses a small deterministic simulation layer so the service is easy to demo locally.

If the submitted payload is a JSON object, these optional keys are recognized:

  • duration_ms: simulated processing duration, defaults to 750
  • should_fail: if true, the worker returns an error
  • result: custom success message
  • error_message: custom failure message

Example payloads:

{
  "task": "resize-image",
  "duration_ms": 250,
  "result": "image resized successfully"
}

{
  "task": "failing-job",
  "duration_ms": 50,
  "should_fail": true,
  "error_message": "simulated downstream failure"
}

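To trigger a timeout, submit a duration_ms that exceeds the job's timeout. Because duration_ms is in milliseconds and timeout_seconds is in seconds, this means a duration_ms greater than timeout_seconds × 1000 (for example, duration_ms 6000 with timeout_seconds 5).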
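
The rules in this section can be sketched as a pure function that decides the outcome without sleeping. This is an assumption-laden illustration: SimulationSpec, Outcome, and simulate are hypothetical names, the fallback messages "ok" and "simulated failure" are invented placeholders, and treating a duration exactly equal to the timeout as not timed out is a guess.

```rust
use std::time::Duration;

/// The optional payload keys recognized by the simulation layer.
struct SimulationSpec {
    duration_ms: u64,
    should_fail: bool,
    result: Option<String>,
    error_message: Option<String>,
}

impl Default for SimulationSpec {
    fn default() -> Self {
        SimulationSpec {
            duration_ms: 750, // documented default
            should_fail: false,
            result: None,
            error_message: None,
        }
    }
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Success(String),
    Failed(String),
    Timeout,
}

/// Decides what a simulated run would produce for the given timeout.
fn simulate(spec: &SimulationSpec, timeout_seconds: u64) -> Outcome {
    // Compare in one unit: duration_ms is milliseconds, the timeout is seconds.
    let duration = Duration::from_millis(spec.duration_ms);
    let timeout = Duration::from_secs(timeout_seconds);
    if duration > timeout {
        return Outcome::Timeout;
    }
    if spec.should_fail {
        // Placeholder default message (assumption).
        let message = spec.error_message.clone();
        Outcome::Failed(message.unwrap_or_else(|| "simulated failure".to_string()))
    } else {
        // Placeholder default result (assumption).
        let result = spec.result.clone();
        Outcome::Success(result.unwrap_or_else(|| "ok".to_string()))
    }
}
```

For the resize-image payload above with timeout_seconds 5, simulate returns Success("image resized successfully"). Raising duration_ms to 6000 with the same timeout returns Timeout.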

Configuration

Environment variables:

  • SERVER_ADDR: HTTP bind address, default 127.0.0.1:3000
  • DATABASE_URL: SQLite connection string, default sqlite://mini_lambda.db
  • WORKER_COUNT: number of workers, default 4
  • DEFAULT_TIMEOUT_SECONDS: fallback timeout for jobs, default 30
  • DEFAULT_MAX_RETRIES: fallback retry count, default 3
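
One way to load these variables with their documented defaults is sketched below. The Config struct, its field types, and the from_lookup helper are assumptions made for illustration. Silently falling back to the default when a value fails to parse is also an assumption; the real service may reject invalid values instead.

```rust
use std::net::SocketAddr;
use std::str::FromStr;

#[derive(Debug, Clone, PartialEq)]
struct Config {
    server_addr: SocketAddr,
    database_url: String,
    worker_count: usize,
    default_timeout_seconds: u64,
    default_max_retries: u32,
}

/// Parses the value if present and valid, otherwise returns the default.
fn parse_or<T: FromStr>(value: Option<String>, default: T) -> T {
    value.and_then(|v| v.trim().parse().ok()).unwrap_or(default)
}

impl Config {
    /// Builds a Config from any key lookup, which keeps the logic testable
    /// without touching the process environment.
    fn from_lookup(get: impl Fn(&str) -> Option<String>) -> Config {
        Config {
            server_addr: parse_or(
                get("SERVER_ADDR"),
                SocketAddr::from(([127, 0, 0, 1], 3000)),
            ),
            database_url: get("DATABASE_URL")
                .unwrap_or_else(|| "sqlite://mini_lambda.db".to_string()),
            worker_count: parse_or(get("WORKER_COUNT"), 4),
            default_timeout_seconds: parse_or(get("DEFAULT_TIMEOUT_SECONDS"), 30),
            default_max_retries: parse_or(get("DEFAULT_MAX_RETRIES"), 3),
        }
    }

    /// Reads the real environment variables.
    fn from_env() -> Config {
        Config::from_lookup(|key| std::env::var(key).ok())
    }
}
```

Config::from_env() with no variables set yields the defaults listed above. Passing a closure to from_lookup makes it easy to check overrides in tests.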

Database Initialization

The application initializes the SQLite schema automatically at startup with CREATE TABLE IF NOT EXISTS statements. No external migration command is required for the MVP.

The jobs table includes indexes on:

  • created_at
  • status

How To Run

Prerequisites

  • Rust stable
  • SQLite support through SQLx

Start the service

cargo run

Or with custom settings:

SERVER_ADDR=127.0.0.1:4000 \
DATABASE_URL=sqlite://mini_lambda.db \
WORKER_COUNT=4 \
DEFAULT_TIMEOUT_SECONDS=10 \
DEFAULT_MAX_RETRIES=2 \
cargo run

Run tests

cargo test

API Examples

Health check

curl http://127.0.0.1:3000/health

Submit a successful job

curl -X POST http://127.0.0.1:3000/jobs \
  -H "Content-Type: application/json" \
  -d '{
    "payload": {
      "task": "send-email",
      "duration_ms": 200,
      "result": "email delivered"
    },
    "timeout_seconds": 5,
    "max_retries": 1
  }'

Example response:

{
  "job_id": "8ee65756-65ca-48db-bf59-f1d8271f18ce",
  "status": "pending"
}

Fetch a job

curl http://127.0.0.1:3000/jobs/8ee65756-65ca-48db-bf59-f1d8271f18ce

Example response:

{
  "id": "8ee65756-65ca-48db-bf59-f1d8271f18ce",
  "payload": {
    "task": "send-email",
    "duration_ms": 200,
    "result": "email delivered"
  },
  "status": "success",
  "created_at": "2026-03-13T04:00:00.000000Z",
  "started_at": "2026-03-13T04:00:00.010000Z",
  "finished_at": "2026-03-13T04:00:00.210000Z",
  "retry_count": 0,
  "max_retries": 1,
  "timeout_seconds": 5,
  "result": "email delivered",
  "error_message": null
}

List jobs

curl http://127.0.0.1:3000/jobs

Submit a failing job

curl -X POST http://127.0.0.1:3000/jobs \
  -H "Content-Type: application/json" \
  -d '{
    "payload": {
      "task": "failing-job",
      "duration_ms": 50,
      "should_fail": true,
      "error_message": "simulated downstream failure"
    },
    "timeout_seconds": 5,
    "max_retries": 2
  }'

Retry a failed or timed-out job

curl -X POST http://127.0.0.1:3000/jobs/8ee65756-65ca-48db-bf59-f1d8271f18ce/retry

Delete a job

curl -X DELETE http://127.0.0.1:3000/jobs/8ee65756-65ca-48db-bf59-f1d8271f18ce

Logging

The service logs key lifecycle events with tracing:

  • job submitted
  • job started
  • job succeeded
  • job failed
  • job timed out
  • job retried

Set RUST_LOG to adjust verbosity:

RUST_LOG=mini_lambda_rust=debug cargo run

Error Responses

Errors are returned as JSON:

{
  "error": {
    "code": "validation_error",
    "message": "payload cannot be null"
  }
}
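
A minimal sketch of how one error type could produce this envelope is shown below. It uses only the standard library with hand-rolled JSON escaping. The ApiError enum is hypothetical, and only the validation_error code comes from this README; the not_found code is an assumed example.

```rust
/// Hypothetical error type; only "validation_error" is documented above.
#[derive(Debug)]
enum ApiError {
    Validation(String),
    NotFound(String),
}

/// Escapes a string for use inside a JSON string literal.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

impl ApiError {
    /// Machine-readable code placed in the "code" field.
    fn code(&self) -> &'static str {
        match self {
            ApiError::Validation(_) => "validation_error",
            ApiError::NotFound(_) => "not_found", // assumed code
        }
    }

    /// Human-readable message placed in the "message" field.
    fn message(&self) -> &str {
        match self {
            ApiError::Validation(m) | ApiError::NotFound(m) => m,
        }
    }

    /// Renders the {"error": {"code": ..., "message": ...}} envelope.
    fn to_json(&self) -> String {
        format!(
            "{{\"error\":{{\"code\":\"{}\",\"message\":\"{}\"}}}}",
            self.code(),
            escape_json(self.message())
        )
    }
}
```

Keeping the envelope in one place means every handler returns the same error shape. In an Axum service, a type like this would typically implement IntoResponse so handlers can return it directly.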

Future Improvements

  • durable queueing so pending work survives process restarts
  • per-job execution history instead of single-attempt timestamps
  • authentication and rate limiting
  • richer job payload contracts
  • metrics and OpenTelemetry export
  • SQLx migrations directory for schema evolution
  • job cancellation support
  • dead-letter queue for repeated failures
