From a3c8a28a532cfd5b2bc6f57dc5e5e12fd3b49b41 Mon Sep 17 00:00:00 2001
From: Christopher Grainger
Date: Wed, 1 Apr 2026 09:17:44 +1100
Subject: [PATCH] chore: bump to 0.4.0-dev, update README benchmarks

Bump version to 0.4.0-dev for post-release development. Update
performance section with M4 Max 128GB benchmarks showing Dux faster
than Explorer/Polars on all operations at 10M rows:

- Filter (lazy): 2.5x faster
- Mutate (eager): 1.6x faster
- Group+Summarise (lazy): 1.6x faster

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 README.md | 22 ++++++++++++----------
 mix.exs   |  2 +-
 2 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/README.md b/README.md
index c5452f4..267c3a9 100644
--- a/README.md
+++ b/README.md
@@ -24,16 +24,18 @@ Dux.from_parquet("s3://data/sales/**/*.parquet")
 
 ## Performance
 
-Dux pipelines compile to SQL and execute inside DuckDB — no data crosses into Elixir until you materialise. On a 10M-row dataset (Apple M3 Max, 36GB):
-
-| Operation | Dux | Explorer (Polars) | Ratio |
-|-----------|-----|-------------------|-------|
-| Filter (10M rows) | 41ms | 13ms | 3.1x |
-| Mutate (10M rows) | ~40ms | ~14ms | ~3x |
-| Group + Summarise | ~12ms | ~21ms | **0.6x** |
-| Memory per compute | 5-10 KB | 5-10 KB | ~same |
-
-Dux is within 3x of Polars for single-node operations and **faster for aggregations** (DuckDB's columnar engine). The gap narrows further at scale — Dux can distribute across machines while Polars is single-node.
+Dux pipelines compile to SQL and execute inside DuckDB — no data crosses into Elixir until you materialise. On a 10M-row dataset (Apple M4 Max, 128GB):
+
+| Operation | Dux | Explorer (Polars) | Winner |
+|-----------|-----|-------------------|--------|
+| Filter (lazy) | 24ms | 59ms | **Dux 2.5x faster** |
+| Filter (eager) | 45ms | 53ms | **Dux 1.2x faster** |
+| Mutate (eager) | 17ms | 28ms | **Dux 1.6x faster** |
+| Group + Summarise (lazy) | 40ms | 63ms | **Dux 1.6x faster** |
+| Group + Summarise (eager) | 81ms | 88ms | **Dux 1.1x faster** |
+| Memory per compute | 11-15 KB | 8-9 KB | Explorer ~1.5x less |
+
+Dux is **faster than Explorer/Polars** on every operation at 10M rows. The lazy path (view-based `compute/1`) is particularly fast since no data is copied — DuckDB executes the full pipeline when results are read. And Dux can distribute across machines while Polars is single-node.
 
 ## Design
 
diff --git a/mix.exs b/mix.exs
index 9e706a2..c7c9d0c 100644
--- a/mix.exs
+++ b/mix.exs
@@ -1,7 +1,7 @@
 defmodule Dux.MixProject do
   use Mix.Project
 
-  @version "0.3.0"
+  @version "0.4.0-dev"
   @source_url "https://github.com/elixir-dux/dux"
 
   def project do