ggsql

A SQL extension for declarative data visualization based on the Grammar of Graphics. Queries combine SQL data retrieval with a visualization spec in one composable syntax:

SELECT date, revenue, region FROM sales WHERE year = 2024
VISUALISE date AS x, revenue AS y, region AS color
DRAW line
LABEL title => 'Sales by Region'

The user-facing site is at https://ggsql.org. The README at README.md is the public introduction.

Authoritative docs

Anything about ggsql syntax or semantics belongs in doc/, not in any CLAUDE.md file. That includes clause behaviour (VISUALISE, DRAW, PLACE, SCALE, FACET, PROJECT, LABEL), layer types, scales, aesthetics, and coordinate systems. CLAUDE.md files describe the implementation around those features — they should link to doc/syntax/ rather than restate.

Writing ggsql queries: when you need to author or modify a ggsql query, use the vendored skill at doc/vendor/SKILL.md. It is the source of truth for the syntax Claude should produce; do not invent clauses, settings, aesthetics, or layer types beyond what it documents.

Workspace layout

| Folder | Role | Type | Per-folder CLAUDE.md |
|---|---|---|---|
| src/ | Core Rust library (crate ggsql) | Cargo workspace member | src/CLAUDE.md |
| ggsql-cli/ | ggsql command-line binary | Cargo workspace member | ggsql-cli/CLAUDE.md |
| tree-sitter-ggsql/ | Tree-sitter grammar + multi-language bindings | Cargo workspace member (also npm + PyPI) | tree-sitter-ggsql/CLAUDE.md |
| ggsql-jupyter/ | Jupyter kernel | Cargo workspace member (also PyPI via maturin) | ggsql-jupyter/CLAUDE.md |
| ggsql-wasm/ | WebAssembly bindings + browser playground | Cargo workspace member | ggsql-wasm/CLAUDE.md |
| ggsql-vscode/ | VS Code / Positron extension | Standalone TypeScript / npm | ggsql-vscode/CLAUDE.md |
| doc/ | Quarto documentation site (ggsql.org) | Quarto project | doc/CLAUDE.md |

The Cargo workspace (/Cargo.toml) has five members: tree-sitter-ggsql, src, ggsql-cli, ggsql-jupyter, ggsql-wasm. Default workspace members exclude ggsql-wasm (it needs the wasm32 target and is built separately).

High-level pipeline

ggsql query  ──►  parser  ──►  Plot AST  ──►  executor  ──►  Spec  ──►  writer  ──►  output
                  (tree-sitter)              (Reader runs SQL,            (Vega-Lite JSON)
                                              applies stats,
                                              resolves scales)
  • The parser splits the query at the VISUALISE boundary. SQL goes to a pluggable Reader (DuckDB, SQLite, ODBC); the VISUALISE part becomes a typed Plot.
  • The executor ties the two together: SQL → DataFrame, AST resolved against actual schema, stats and scales applied per layer.
  • The writer renders the resolved Spec to an output format (today: Vega-Lite JSON).
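To make the first step above concrete, here is a minimal sketch of splitting a query at the VISUALISE boundary. It is an illustration only: the real parser is tree-sitter based, and the function name `split_at_visualise` is hypothetical, not part of the ggsql crate. The sketch assumes a plain case-insensitive, whole-word keyword search and ignores string literals and comments, which a real grammar would handle.

```rust
/// Naive illustration (NOT the ggsql parser): split a query at the first
/// whole-word, case-insensitive occurrence of `VISUALISE`.
/// Assumption: the keyword never appears inside a string literal or comment.
fn split_at_visualise(query: &str) -> Option<(&str, &str)> {
    const KEYWORD: &str = "VISUALISE";
    let bytes = query.as_bytes();
    // Bytes that can be part of an identifier; used for word-boundary checks.
    let is_ident = |b: u8| b.is_ascii_alphanumeric() || b == b'_';

    let mut i = 0;
    while i + KEYWORD.len() <= bytes.len() {
        let end = i + KEYWORD.len();
        let matches = bytes[i..end].eq_ignore_ascii_case(KEYWORD.as_bytes());
        let starts_word = i == 0 || !is_ident(bytes[i - 1]);
        let ends_word = end == bytes.len() || !is_ident(bytes[end]);
        if matches && starts_word && ends_word {
            // A match starts on an ASCII byte, so `i` is a valid char boundary.
            return Some((query[..i].trim(), query[i..].trim()));
        }
        i += 1;
    }
    None
}

fn main() {
    let query = "SELECT date, revenue, region FROM sales WHERE year = 2024\n\
                 VISUALISE date AS x, revenue AS y, region AS color\n\
                 DRAW line";
    let (sql, viz) = split_at_visualise(query).expect("query has a VISUALISE clause");
    println!("SQL part (would go to a Reader): {sql}");
    println!("Visualisation part (would become a Plot): {viz}");
}
```

The word-boundary check keeps an identifier such as `visualise_count` in the SQL half from being mistaken for the clause keyword. For the actual behaviour, see src/CLAUDE.md and the grammar in tree-sitter-ggsql/.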

For details — module layout, traits, where extension points live — see src/CLAUDE.md. For the Vega-Lite renderer specifically, src/writer/vegalite/CLAUDE.md. For the AST types, src/plot/CLAUDE.md.

Building

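# Rust workspace. --workspace selects every member, ggsql-wasm included;
# plain cargo build at the root builds only the default members
# (tree-sitter-ggsql, src, ggsql-cli, ggsql-jupyter).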
cargo build --workspace
cargo build --release --workspace

# Just the library
cargo build --package ggsql

# Just the CLI binary
cargo build --package ggsql-cli

# Wasm build (separate, not in default workspace members)
cd ggsql-wasm && ./build-wasm.sh

# VS Code extension
cd ggsql-vscode && npm install && npm run package

# Tree-sitter parser (regenerate after editing grammar.js)
cd tree-sitter-ggsql && npx tree-sitter generate

Cross-platform installers (NSIS / MSI / DMG / Deb): see INSTALLERS.md. Releases are tag-driven via .github/workflows/.

Testing

# Whole Rust workspace
cargo test --workspace

# A single crate
cargo test --package ggsql
cargo test --package ggsql-jupyter

# Tree-sitter corpus
cd tree-sitter-ggsql && npm test

# Jupyter kernel protocol tests (Python)
cd ggsql-jupyter/tests && pip install -r requirements.txt && pytest

Per-folder CLAUDE.md files cover component-specific test guidance.

Coding style

  • Reuse existing infrastructure and architectural choices. When adding new code, prefer extending or adapting what is already there over introducing a parallel implementation. If reuse requires changes elsewhere to accommodate the new caller, that is more palatable than implementing the same thing twice.
  • Comments describe the current state of the code. Do not reference past states, how something used to work, what was changed, or why an earlier approach was abandoned — that history belongs in commit messages and CHANGELOG.md.
  • CHANGELOG.md is the record of user-visible change over time. Consult it when you need to know when something landed or how behaviour evolved. Update it when adding a feature, changing behaviour, or removing something — but write one entry per feature, added when the feature is complete. Don't gradually accrete bullets during development.

Where to ask which question