Add Scotty: Haskell web framework on Warp (first Haskell entry!) by BennyFranciscus · Pull Request #233 · MDA2AV/HttpArena

BennyFranciscus · 2026-03-28T22:04:37Z

Scotty

Adds Scotty — a lightweight Haskell web framework inspired by Ruby's Sinatra, running on the Warp HTTP server.

This is the first Haskell entry in HttpArena! 🎉

Details


Language	Haskell (GHC 9.8)
Framework	Scotty 0.30
Engine	Warp 3.4
Type	Framework

Subscribed Tests

baseline, pipelined, noisy, limited-conn, json, upload, compression, mixed, async-db, static

Implementation Notes

Compiled with -O2 -threaded -rtsopts and runtime flags -N -A64m -I0 for maximum throughput
Dataset and large compression payload pre-loaded into memory at startup
Static files cached in a Map at startup with correct MIME types
Manual gzip/deflate compression using zlib (compression level 1 for speed)
SQLite via sqlite-simple, PostgreSQL via postgresql-simple (connection-per-request with bracket)
Multi-stage Docker build: haskell:9.8-slim builder → debian:bookworm-slim runtime
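
The connection-per-request note above relies on `bracket` from `Control.Exception`, which runs the release action whether the handler returns normally or throws. A minimal sketch of that pattern follows, assuming a hypothetical `FakeConn` in place of a real postgresql-simple connection; the names `openConn`, `closeConn`, `withConn`, and `demoOpenCount` are illustrative and not taken from this PR:

```haskell
import Control.Exception (ErrorCall (..), bracket, throwIO, try)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for a database connection; it only tracks how
-- many connections are currently open.
newtype FakeConn = FakeConn (IORef Int)

openConn :: IORef Int -> IO FakeConn
openConn counter = do
  modifyIORef' counter (+ 1)
  pure (FakeConn counter)

closeConn :: FakeConn -> IO ()
closeConn (FakeConn counter) = modifyIORef' counter (subtract 1)

-- Acquire a connection, run the handler, and always release it,
-- even if the handler throws.
withConn :: IORef Int -> (FakeConn -> IO a) -> IO a
withConn counter = bracket (openConn counter) closeConn

-- Runs one succeeding and one failing handler, then reports how many
-- connections are still open afterwards.
demoOpenCount :: IO Int
demoOpenCount = do
  counter <- newIORef 0
  withConn counter (\_ -> pure ())
  _ <- try (withConn counter (\_ -> throwIO (ErrorCall "boom"))) :: IO (Either ErrorCall ())
  readIORef counter
```

With a real driver, the acquire and release actions would be the library's connect and close calls; the trade-off of this design is one connection setup per request in exchange for never leaking a connection.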

Validation

All 29 validation checks pass locally ✅

cc @scotty-web — would love to see Scotty's numbers on the leaderboard!

/validate

Scotty is a lightweight Haskell web framework inspired by Ruby's Sinatra, built on top of the high-performance Warp HTTP server.

- Language: Haskell (GHC 9.8, compiled with -O2 -threaded)
- Engine: Warp
- Tests: baseline, pipelined, noisy, limited-conn, json, upload, compression, mixed, async-db, static
- All 29 validation checks pass

Implementation notes:
- Dataset and large payload pre-loaded into memory at startup
- Static files cached in a Map at startup with correct MIME types
- Manual gzip/deflate compression using zlib (level 1 for speed)
- SQLite via sqlite-simple, PostgreSQL via postgresql-simple
- Multi-stage Docker build with bookworm-slim runtime

MDA2AV · 2026-03-28T22:12:22Z

/benchmark

github-actions · 2026-03-28T22:12:44Z

🚀 Benchmark run triggered for scotty (all profiles). Results will be posted here when done.

github-actions · 2026-03-28T22:27:20Z

Benchmark Results

Framework: scotty | Profile: all profiles

scotty / baseline / 512c (p=1, r=0, cpu=64)
  Best: 11479 req/s (CPU: 272.5%, Mem: 658.8MiB)

scotty / baseline / 4096c (p=1, r=0, cpu=64)
  Best: 12871 req/s (CPU: 262.0%, Mem: 2.6GiB)

scotty / baseline / 16384c (p=1, r=0, cpu=64)
  Best: 11927 req/s (CPU: 337.7%, Mem: 632.8MiB)

scotty / pipelined / 512c (p=16, r=0, cpu=unlimited)
  Best: 14055 req/s (CPU: 249.7%, Mem: 540.4MiB)

scotty / pipelined / 4096c (p=16, r=0, cpu=unlimited)
  Best: 15110 req/s (CPU: 273.7%, Mem: 2.3GiB)

scotty / pipelined / 16384c (p=16, r=0, cpu=unlimited)
  Best: 13811 req/s (CPU: 280.6%, Mem: 595.6MiB)

scotty / limited-conn / 512c (p=1, r=10, cpu=unlimited)
  Best: 11441 req/s (CPU: 265.1%, Mem: 1.1GiB)

scotty / limited-conn / 4096c (p=1, r=10, cpu=unlimited)
  Best: 11190 req/s (CPU: 309.2%, Mem: 2.0GiB)

scotty / json / 4096c (p=1, r=0, cpu=unlimited)
  Best: 14748 req/s (CPU: 286.7%, Mem: 2.2GiB)

scotty / json / 16384c (p=1, r=0, cpu=unlimited)
  Best: 13506 req/s (CPU: 304.1%, Mem: 2.6GiB)

scotty / upload / 64c (p=1, r=0, cpu=unlimited)
  Best: 51 req/s (CPU: 443.4%, Mem: 6.3GiB)

scotty / upload / 256c (p=1, r=0, cpu=unlimited)
  Best: 51 req/s (CPU: 448.8%, Mem: 6.7GiB)

scotty / upload / 512c (p=1, r=0, cpu=unlimited)
  Best: 0 req/s (CPU: 0%, Mem: 0MiB)

Full log

[run 2/3]
gcannon — io_uring HTTP load generator
  Target:    localhost:8080/
  Threads:   64
  Conns:     64 (1/thread)
  Pipeline:  1
  Req/conn:  unlimited (keep-alive)
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency    1.07s    1.07s    1.11s    1.15s    1.17s

  256 requests in 5.00s, 256 responses
  Throughput: 51 req/s
  Bandwidth:  8.10KB/s
  Status codes: 2xx=256, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 256 / 256 responses (100.0%)
  CPU: 523.5% | Mem: 11.8GiB

[run 3/3]
gcannon — io_uring HTTP load generator
  Target:    localhost:8080/
  Threads:   64
  Conns:     64 (1/thread)
  Pipeline:  1
  Req/conn:  unlimited (keep-alive)
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency    1.08s    1.09s    1.13s    1.15s    1.50s

  257 requests in 5.00s, 257 responses
  Throughput: 51 req/s
  Bandwidth:  8.13KB/s
  Status codes: 2xx=257, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 257 / 257 responses (100.0%)
  CPU: 507.9% | Mem: 20.6GiB

=== Best: 51 req/s (CPU: 443.4%, Mem: 6.3GiB) ===
  Input BW: 1020.00MB/s (avg template: 20971593 bytes)
[dry-run] Results not saved (use --save to persist)
httparena-bench-scotty
httparena-bench-scotty

==============================================
=== scotty / upload / 256c (p=1, r=0, cpu=unlimited) ===
==============================================
530e164a79e329b920f60c39e0b3963f259090c3d504692543d224b88060ecf3
[wait] Waiting for server...
[ready] Server is up

[run 1/3]
gcannon — io_uring HTTP load generator
  Target:    localhost:8080/
  Threads:   64
  Conns:     256 (4/thread)
  Pipeline:  1
  Req/conn:  unlimited (keep-alive)
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency    4.48s    4.49s    4.49s    4.50s    4.50s

  256 requests in 5.00s, 256 responses
  Throughput: 51 req/s
  Bandwidth:  8.10KB/s
  Status codes: 2xx=256, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 256 / 256 responses (100.0%)
  CPU: 448.8% | Mem: 6.7GiB

[run 2/3]
gcannon — io_uring HTTP load generator
  Target:    localhost:8080/
  Threads:   64
  Conns:     256 (4/thread)
  Pipeline:  1
  Req/conn:  unlimited (keep-alive)
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency    4.51s    4.52s    4.53s    4.53s    4.53s

  256 requests in 5.00s, 256 responses
  Throughput: 51 req/s
  Bandwidth:  8.10KB/s
  Status codes: 2xx=256, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 256 / 256 responses (100.0%)
  CPU: 436.2% | Mem: 12.7GiB

[run 3/3]
gcannon — io_uring HTTP load generator
  Target:    localhost:8080/
  Threads:   64
  Conns:     256 (4/thread)
  Pipeline:  1
  Req/conn:  unlimited (keep-alive)
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency    4.56s    4.57s    4.57s    4.58s    4.58s

  256 requests in 5.00s, 256 responses
  Throughput: 51 req/s
  Bandwidth:  8.10KB/s
  Status codes: 2xx=256, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 256 / 256 responses (100.0%)
  CPU: 431.0% | Mem: 21.7GiB

=== Best: 51 req/s (CPU: 448.8%, Mem: 6.7GiB) ===
  Input BW: 1020.00MB/s (avg template: 20971593 bytes)
[dry-run] Results not saved (use --save to persist)
httparena-bench-scotty
httparena-bench-scotty

==============================================
=== scotty / upload / 512c (p=1, r=0, cpu=unlimited) ===
==============================================
86ab59f33af750929dde9b786754f4ae42b49c4d01ea3719f39f75b43bf82556
[wait] Waiting for server...
[ready] Server is up

[run 1/3]
gcannon — io_uring HTTP load generator
  Target:    localhost:8080/
  Threads:   64
  Conns:     512 (8/thread)
  Pipeline:  1
  Req/conn:  unlimited (keep-alive)
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency      0us      0us      0us      0us      0us

  0 requests in 5.00s, 0 responses
  Throughput: 0 req/s
  Bandwidth:  0B/s
  Status codes: 2xx=0, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 0 / 0 responses (0.0%)
  CPU: 422.3% | Mem: 6.6GiB

[run 2/3]
gcannon — io_uring HTTP load generator
  Target:    localhost:8080/
  Threads:   64
  Conns:     512 (8/thread)
  Pipeline:  1
  Req/conn:  unlimited (keep-alive)
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency      0us      0us      0us      0us      0us

  0 requests in 5.00s, 0 responses
  Throughput: 0 req/s
  Bandwidth:  0B/s
  Status codes: 2xx=0, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 0 / 0 responses (0.0%)
  CPU: 429.8% | Mem: 13.7GiB

[run 3/3]
gcannon — io_uring HTTP load generator
  Target:    localhost:8080/
  Threads:   64
  Conns:     512 (8/thread)
  Pipeline:  1
  Req/conn:  unlimited (keep-alive)
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency      0us      0us      0us      0us      0us

  0 requests in 5.00s, 0 responses
  Throughput: 0 req/s
  Bandwidth:  0B/s
  Status codes: 2xx=0, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 0 / 0 responses (0.0%)
  CPU: 430.2% | Mem: 24.0GiB

=== Best: 0 req/s (CPU: 0%, Mem: 0MiB) ===
httparena-bench-scotty
httparena-bench-scotty
httparena-postgres
httparena-postgres
[restore] Restoring CPU governor to performance...

BennyFranciscus · 2026-03-28T22:30:18Z

Benchmark results are in — nice!

Baseline/JSON/Pipelined look solid: 11-15K req/s across concurrency levels. Not bad at all for Haskell, especially considering Scotty is a high-level framework on top of Warp.

Upload is the problem: 51 req/s at 64c/256c, then drops to 0 req/s at 512c. Memory balloons to 24 GiB. Classic sign that Wai/Warp is buffering the entire 20MB upload body in memory before passing it to the handler. At 512 concurrent connections that's ~10GB of request bodies in flight, which likely OOMs or stalls the GHC runtime.
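
As a rough check of the figures above, the arithmetic can be written out using the 20971593-byte average upload size that the load generator reports in the log. This is only a back-of-the-envelope sketch; the function names are illustrative:

```haskell
-- Average upload size reported by the load generator ("avg template").
uploadBytes :: Integer
uploadBytes = 20971593

-- Bytes held in memory if every connection buffers one full body.
bufferedBytes :: Integer -> Integer
bufferedBytes conns = conns * uploadBytes

-- Convert a byte count to GiB (2^30 bytes).
toGiB :: Integer -> Double
toGiB b = fromIntegral b / 1073741824

-- Input bandwidth in MiB/s for a given request rate.
inputMiBPerSec :: Integer -> Double
inputMiBPerSec reqPerSec = fromIntegral (reqPerSec * uploadBytes) / 1048576
```

`toGiB (bufferedBytes 512)` comes out at about 10, matching the ~10GB estimate, and `inputMiBPerSec 51` comes out at about 1020, which lines up with the `Input BW: 1020.00MB/s` line in the log.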

I'll look into whether Scotty/Wai supports lazy or streaming request body consumption — if we can process chunks incrementally instead of strictRequestBody, that should fix it. Might need getRequestBodyChunk from the Wai Request type.

…emory

Upload was using Scotty's 'body' (strictRequestBody), which reads the entire request body into memory before processing. At 512 concurrent connections with 20MB payloads, that is ~10GB of request bodies in flight, which likely exhausts memory or stalls the GHC runtime (0 req/s at 512c, with memory reaching 24GiB). Switch to streaming via Wai's getRequestBodyChunk, which reads and counts bytes incrementally with constant memory overhead.

BennyFranciscus · 2026-03-28T22:32:29Z

Pushed a fix — switched upload handler from body (which calls strictRequestBody and buffers everything in memory) to streaming via Wai's getRequestBodyChunk. Now reads and counts bytes incrementally with constant memory overhead.
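
The incremental counting described above can be sketched as a loop over a chunk-producing action. This is a minimal illustration, not the code from this PR: `countBodyBytes` takes any `IO ByteString` action that returns an empty chunk once the body is exhausted, which is the shape of Wai's `getRequestBodyChunk request`. The `chunkSource` helper and `demoCount` are hypothetical and exist only to exercise the loop without a server:

```haskell
import qualified Data.ByteString as B
import Data.IORef (atomicModifyIORef', newIORef)

-- Count the bytes of a body by pulling one chunk at a time. Only the
-- current chunk and the running total are live, so memory use does not
-- grow with the size of the body.
countBodyBytes :: IO B.ByteString -> IO Int
countBodyBytes nextChunk = go 0
  where
    go total = do
      chunk <- nextChunk
      if B.null chunk
        then pure total
        else
          let total' = total + B.length chunk
           in total' `seq` go total'

-- Hypothetical chunk source for testing: hands out the given chunks in
-- order, then empty ByteStrings forever.
chunkSource :: [B.ByteString] -> IO (IO B.ByteString)
chunkSource chunks = do
  ref <- newIORef chunks
  pure $ atomicModifyIORef' ref $ \cs -> case cs of
    [] -> ([], B.empty)
    (c : rest) -> (rest, c)

-- Feeds three chunks (4096 + 4096 + 100 bytes) through the counter.
demoCount :: IO Int
demoCount = do
  next <- chunkSource [B.replicate 4096 0, B.replicate 4096 0, B.replicate 100 0]
  countBodyBytes next
```

In a Scotty handler the wiring would presumably be along the lines of fetching the Wai request with `request` and running `countBodyBytes (getRequestBodyChunk req)` under `liftIO`; that part is an assumption and has not been checked against the handler in this PR.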

This should fix the 0 req/s at 512c and the 24 GiB memory balloon.

MDA2AV · 2026-03-28T22:36:36Z

very low cpu usage

BennyFranciscus · 2026-03-28T22:38:10Z

Yeah, really bad CPU utilization — only ~270% out of 6400% available. GHC's IO manager with -N (all cores) creates 64 capabilities but they end up contending on each other instead of doing useful work. Warp doesn't scale linearly past ~4-8 cores on the traditional IO manager.

Going to try -N4 or -N8 in the RTS options — GHC typically hits peak throughput around that range for network-heavy workloads. Less scheduler overhead, less lock contention in the green thread runtime.

GHC's IO manager doesn't scale well past a few cores for network workloads — -N (all 64 cores) was only hitting ~270% CPU due to scheduler and lock contention. -N4 with -qn4 (4 parallel GC threads) should give better throughput by reducing overhead.

BennyFranciscus · 2026-03-28T22:38:34Z

Pushed a fix — limited RTS to -N4 -qn4 instead of -N (all cores). GHC's thread scheduler creates one capability per core with -N, but for IO-bound network work, all that buys you is contention. 4 capabilities with 4 GC threads should be the sweet spot.
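
The PR applies this limit through RTS flags. For illustration only, a programmatic near-equivalent of `-N4` (it has no counterpart for `-qn4`) can be sketched with `setNumCapabilities` from `Control.Concurrent`. The helper names `cappedCapabilities` and `limitCapabilities` are hypothetical:

```haskell
import Control.Concurrent (getNumCapabilities, setNumCapabilities)
import GHC.Conc (getNumProcessors)

-- Pick a capability count: never more than the requested cap, never
-- more than the number of processors, and at least one.
cappedCapabilities :: Int -> Int -> Int
cappedCapabilities cap processors = max 1 (min cap processors)

-- Apply the cap at startup and return the count now in effect.
-- More than one capability requires linking with -threaded.
limitCapabilities :: Int -> IO Int
limitCapabilities cap = do
  processors <- getNumProcessors
  setNumCapabilities (cappedCapabilities cap processors)
  getNumCapabilities
```

On the 64-core benchmark machine, `cappedCapabilities 4 64` gives 4; on a 2-core machine the same call gives 2, so the cap never requests more capabilities than there are processors.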

If throughput is still low after re-benchmark, could try -N8 or experiment with the newer threaded IO manager (-with-rtsopts=--io-manager=native) available in GHC 9.8+.

BennyFranciscus · 2026-03-29T10:25:57Z

@MDA2AV Could we get a re-benchmark when you get a chance? The -N4 -qn4 RTS fix + streaming upload should improve CPU utilization significantly over the previous run.

MDA2AV · 2026-03-29T17:32:08Z

/benchmark --save

github-actions · 2026-03-29T17:32:30Z

🚀 Benchmark run triggered for scotty (all tests) with --save. Results will be posted here when done.

BennyFranciscus requested review from Kaliumhexacyanoferrat and MDA2AV as code owners March 28, 2026 22:04

BennyFranciscus wants to merge 3 commits into MDA2AV:main from BennyFranciscus:add-scotty
