fimod

the data shaper CLI

🏗️ Mold your data, shape your CI, play with your pipelines

🪶 Python-powered molding without Python installed

💡 DRY your pipelines · Slim your container images · Tame your configs


fimod (Flexible Input, Mold Output Data) is a single Rust binary (~2.3 MB, UPX-compressed) with an embedded Python runtime (Monty). It reads JSON, YAML, TOML, CSV, NDJSON, and plain text - from files or directly from HTTP URLs - lets you transform data with Python expressions, and writes the result in any of those formats. No system Python, no pip install, no dependencies.

# Filter, reshape, convert - in one command
fimod s -i users.json -e '[u for u in data if u["active"]]' -o active.csv

# ⛓️ Chain transforms like Unix pipes, inside a single process, with built-in helpers available
fimod s -i data.json -e '[u for u in data if u["age"] > 30]' -e 'it_sort_by(data, "name")'

# 📦 Batch-process entire directories
fimod s -i logs/*.json -m normalize.py -o cleaned/

📦 Install

Linux / macOS

curl -fsSL https://raw.githubusercontent.com/pytgaen/fimod/main/install.sh | sh

The script downloads the right binary, installs it, then runs fimod registry setup to configure the examples mold catalog.

💡 Options via env vars: FIMOD_VARIANT=slim · FIMOD_INSTALL=~/.local/bin · FIMOD_VERSION=0.1.0

Windows

Option 1 - via ubi (no script, antivirus-friendly)

ubi is a universal binary installer available through winget, which is pre-installed on Windows 10/11:

# 📦 1. Install ubi (one-time, uses winget which is built into Windows)
winget install houseabsolute.ubi

# 🔄 Then restart PowerShell so ubi is found in PATH

# ⬇️ 2. Install fimod (classic - includes HTTP support)
ubi --project pytgaen/fimod --matching "fimod-v" --in "$env:USERPROFILE\.local\bin"

# Or install the slim variant (no HTTP support, smaller binary)
# ubi --project pytgaen/fimod --matching "fimod-slim-v" --in "$env:USERPROFILE\.local\bin"

# 🛤️ 3. Add to PATH (if not already present)
$BinDir = "$env:USERPROFILE\.local\bin"
$UserPath = [Environment]::GetEnvironmentVariable('PATH', 'User')
if ($UserPath -notlike "*$BinDir*") {
    [Environment]::SetEnvironmentVariable('PATH', "$BinDir;$UserPath", 'User')
    $env:PATH = "$BinDir;$env:PATH"
}

# 🗂️ 4. Set up the examples mold catalog
fimod registry setup

Option 2 - PowerShell script (execution policy / antivirus may block)

⚠️ If your antivirus blocks this script, use Option 1 (ubi) instead: it downloads a signed binary directly from GitHub Releases with no script execution.

Download first, then run:

Invoke-RestMethod https://raw.githubusercontent.com/pytgaen/fimod/main/install.ps1 -OutFile "$env:TEMP\fimod-install.ps1"
& "$env:TEMP\fimod-install.ps1"

💡 Same env var options as Linux: $env:FIMOD_VARIANT, $env:FIMOD_INSTALL, $env:FIMOD_VERSION

⚠️ VCRUNTIME140.dll not found?

fimod requires the Microsoft Visual C++ Redistributable, pre-installed on most Windows systems but missing in minimal environments (Windows Sandbox, fresh server installs).

winget install Microsoft.VCRedist.2015+.x64

Or download directly from Microsoft: https://aka.ms/vs/17/release/vc_redist.x64.exe

From source

git clone https://github.com/pytgaen/fimod && cd fimod
cargo build --release   # → target/release/fimod

🤔 Why not jq / yq / awk / sed?

You already know Python. Why learn another DSL?

jq / yq - powerful but you need to learn a custom query language:

# jq: filter users older than 30
jq '[.[] | select(.age > 30)]' users.json

# fimod: same thing, it's just Python
fimod s -i users.json -e '[u for u in data if u["age"] > 30]'

# jq: project + sort + deduplicate
jq '[.[] | {id, name}] | sort_by(.name) | unique_by(.id)' data.json

# fimod: chain expressions, each feeds the next
fimod s -i data.json -e '[{"id": u["id"], "name": u["name"]} for u in data]' \
  -e 'it_unique_by(it_sort_by(data, "name"), "id")'

Python one-liner - works but painful boilerplate:

python3 -c "
import json, sys
data = json.load(sys.stdin)
print(json.dumps([u for u in data if u['active']]))
" < users.json

# fimod: same logic, zero boilerplate, no Python install
fimod s -i users.json -e '[u for u in data if u["active"]]'

👉 See the full feature comparison against jq, yq, and Python

👀 A taste of what fimod can do

🐍 Pure Python transforms - Rust-powered I/O, serialization & builtins:

# YAML to JSON, filter active users, sort by name
fimod s -i users.yaml -e '[u for u in data if u["active"]]' -e 'it_sort_by(data, "name")' -o result.json

# Filter active users, then group by role - Unix pipes just work
fimod s -i users.json -e '[u for u in data if u["active"]]' | fimod s -e 'it_group_by(data, "role")'

# Enrich records with Python string methods - try this in jq...
fimod s -i users.json -e '[{**u, "slug": u["name"].lower().replace(" ", "-"), "domain": u["email"].split("@")[1]} for u in data]'
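To see what the enrichment expression yields, the same comprehension can be run in plain Python on a small in-memory list. The sample records below are made up for illustration; fimod would bind data to the parsed input file instead.

```python
# Hypothetical sample records standing in for the parsed users.json
data = [
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "Alan Turing", "email": "alan@example.org"},
]

# Same comprehension as the fimod -e expression above
enriched = [
    {
        **u,
        "slug": u["name"].lower().replace(" ", "-"),
        "domain": u["email"].split("@")[1],
    }
    for u in data
]

for row in enriched:
    print(row["slug"], row["domain"])
# prints:
# ada-lovelace example.com
# alan-turing example.org
```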

📦 Registry molds - reusable recipes, one @name away:

# 🔀 Patch a YAML config with dot-path assignments
fimod s -i deployment.yaml -m @yaml_merge --arg set="spec.replicas=3,metadata.labels.env=prod" -o deployment.yaml

# Anonymize PII fields with SHA-256
fimod s -i users.json -m @anonymize_pii --arg fields=email,phone -o users_anon.json

# 📊 Deduplicate records by a field
fimod s -i data.json -m @dedup_by --arg field=email

📦 More molds in the fimod-powered registry:

| Mold | Description |
| --- | --- |
| @gh_latest | GitHub release resolver |
| @download | wget-like fetch |
| @poetry_migrate | Poetry → uv/Poetry 2 |
| @skylos_to_gitlab | dead code → GitLab Code Quality |

fimod registry add fimod-powered https://github.com/pytgaen/fimod-powered

🍿 Even more taste... (in-place, regex, log parsing, env templating)

# 🔒 Anonymize emails in-place - replace with SHA-256 hashes
fimod s -i customers.csv -e '[{**r, "email": hs_sha256(r["email"])} for r in data]' --in-place

# 🕵️ Mask IPs with regex - 192.168.1.42 → 192.168.x.x
fimod s -i logs.json -e '[{**r, "ip": re_sub(r"\d+\.\d+$", "x.x", r["ip"])} for r in data]'

# 📊 Raw log lines → structured JSON records
fimod s -i server.log -m @log_parse \
  --arg regex='(\S+) \[(.+?)\] "(.+?)" (\d+)' \
  --arg fields=ip,timestamp,request,status

# 🔀 Inject environment variables into ${VAR} placeholders
fimod s -i config.json --env 'DB_*' -e '{k: env_subst(v, env) for k, v in data.items()}'
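The two regex-based examples above can be approximated with Python's standard re module. This is only a sketch of the idea: the log line is invented, and pairing capture groups with field names in order is a guess at how @log_parse behaves, not its actual implementation.

```python
import re

# Hypothetical log line shaped to match the pattern passed to @log_parse above
line = '192.168.1.42 [10/Oct/2024:13:55:36 +0000] "GET /index.html HTTP/1.1" 200'
pattern = r'(\S+) \[(.+?)\] "(.+?)" (\d+)'
fields = "ip,timestamp,request,status".split(",")

match = re.match(pattern, line)
# Assumption: each capture group is paired with its field name, in order
record = dict(zip(fields, match.groups())) if match else None

# Same substitution as the IP-masking example: replace the last two octets
masked_ip = re.sub(r"\d+\.\d+$", "x.x", record["ip"])

print(record)
print(masked_ip)  # 192.168.x.x
```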

Run fimod mold list to browse all built-in molds.

🔋 Batteries included

🗂️ Multi-file slurp

The classic yq/jq slurp use case (merging a base config with environment overrides), but across any mix of formats:

# Merge base.yaml with prod overrides in TOML - impossible with yq
fimod s -i base.yaml -i prod.toml -s -e '
def transform(data):
    data[0].update(data[1])
    return data[0]
'

data is an array ordered like the -i flags; later entries win on conflict.
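A plain-Python sketch of the merge above, using made-up dictionaries in place of the parsed files. It also shows one consequence of dict.update worth knowing: the merge is shallow, so a nested table from the later file replaces the earlier one wholesale instead of being merged key by key.

```python
# Hypothetical contents of base.yaml and prod.toml after parsing
base = {"name": "api", "replicas": 1, "db": {"host": "localhost", "port": 5432}}
prod = {"replicas": 3, "db": {"host": "db.internal"}}

data = [base, prod]  # ordered like the -i flags


def transform(data):
    data[0].update(data[1])  # later entries win on conflict
    return data[0]


merged = transform(data)
print(merged)
# {'name': 'api', 'replicas': 3, 'db': {'host': 'db.internal'}}
```

Note that db.port from the base file is gone after the merge; a recursive merge would be needed to keep it.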

Named mode - append : to get a dict keyed by filename stem, which is clearer than an index when files have distinct roles:

# Merge base with prod overrides - role is explicit, no need to count -i flags
fimod s -i base.yaml: -i prod.yaml: -s -e '
def transform(data):
    data["base"].update(data["prod"])
    return data["base"]
'

Explicit aliases - when two files share the same name:

# Merge configs from sibling directories
fimod s -i eu/limits.toml:eu -i us/limits.toml:us -s \
  -e '{ region: v["max_requests"] for region, v in data.items() }'

The mold runs once on the combined result. Works across formats (JSON + YAML + TOML + CSV...).
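As a rough illustration of the alias form, the expression can be tried in plain Python against a dict shaped the way the text describes, keyed by alias. The values are invented sample data.

```python
# Hypothetical parsed contents, keyed by the aliases given after ':' on the command line
data = {
    "eu": {"max_requests": 500, "burst": 50},
    "us": {"max_requests": 1000, "burst": 100},
}

# Same comprehension as the fimod -e expression above
limits = {region: v["max_requests"] for region, v in data.items()}
print(limits)  # {'eu': 500, 'us': 1000}
```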

⛓️ Chaining

Multiple -e expressions form an in-process pipeline - each step feeds data to the next:

fimod s -i data.json \
  -e '[u for u in data if u["age"] > 18]' \
  -e 'it_sort_by(data, "name")' \
  -e '[{"name": u["name"], "hash": hs_sha256(u["email"])} for u in data]'
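The chaining model can be sketched in plain Python: each expression is evaluated with the previous result bound to data. The two helper functions below are stand-ins written for this example; they assume it_sort_by sorts dicts by one field and hs_sha256 returns a hex digest, which may not match fimod's Rust implementations exactly. The sample records are invented.

```python
import hashlib


def it_sort_by(items, key):
    # Stand-in for fimod's it_sort_by: assumed to sort dicts by one field
    return sorted(items, key=lambda item: item[key])


def hs_sha256(text):
    # Stand-in for fimod's hs_sha256: assumed to return a hex digest of UTF-8 text
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


steps = [
    '[u for u in data if u["age"] > 18]',
    'it_sort_by(data, "name")',
    '[{"name": u["name"], "hash": hs_sha256(u["email"])} for u in data]',
]

# Hypothetical sample input
data = [
    {"name": "Zoe", "age": 34, "email": "zoe@example.com"},
    {"name": "Max", "age": 16, "email": "max@example.com"},
    {"name": "Amy", "age": 27, "email": "amy@example.com"},
]

helpers = {"it_sort_by": it_sort_by, "hs_sha256": hs_sha256}
for expr in steps:
    # Each step sees the previous result under the name `data`
    data = eval(expr, {**helpers, "data": data})

print([row["name"] for row in data])  # ['Amy', 'Zoe']
```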

🧰 Built-in helpers - no import needed

| Family | Functions | Example |
| --- | --- | --- |
| re_* | search, match, findall, sub, split | re_sub(r"(\w+)@(\w+)", r"\2/\1", text) |
| re_*_fancy | same + fancy-regex $1/${name} syntax | re_sub_fancy(r"(\w+)@(\w+)", "$2/$1", text) |
| dp_* | get, set (nested dotpath) | dp_set(data, "server.port", 8080) |
| it_* | sort_by, group_by, unique, flatten, ... | it_group_by(data, "status") |
| hs_* | md5, sha1, sha256 | hs_sha256(data["email"]) |
| msg_* | print, info, warn, error (to stderr) | msg_warn("low coverage") |
| gk_* | fail, assert, warn (validation gates) | gk_assert(data.get("version"), "missing version") |
| env_subst | ${VAR} substitution in templates | env_subst("Hello ${NAME}", env) |
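For readers unfamiliar with dot-path access, here is a minimal pure-Python sketch of what nested get/set by dotted path means. These functions are hypothetical stand-ins written for this example: the dp_get signature, the default argument, and the in-place mutation are assumptions, and the real dp_* helpers are implemented in Rust and may behave differently.

```python
def dp_get(obj, path, default=None):
    # Hypothetical stand-in: walk nested dicts one dotted segment at a time
    current = obj
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def dp_set(obj, path, value):
    # Hypothetical stand-in: create intermediate dicts as needed, then assign
    *parents, last = path.split(".")
    current = obj
    for key in parents:
        current = current.setdefault(key, {})
    current[last] = value
    return obj


config = {"server": {"host": "localhost"}}
dp_set(config, "server.port", 8080)
print(dp_get(config, "server.port"))      # 8080
print(dp_get(config, "server.tls.cert"))  # None
```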

Helpers are implemented in Rust. Regex patterns use fancy-regex (PCRE2). re_sub accepts Python \1/\g<name> syntax; re_sub_fancy uses $1/${name}.
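The Python-style replacement syntax mentioned above is the one used by the standard library's re.sub, so a quick CPython example shows what \1 and \g<name> refer to. This runs Python's own re engine, not fancy-regex, so pattern features may differ; the input string is made up.

```python
import re

text = "alice@example"

# Numbered backreferences: swap the two captured groups
swapped = re.sub(r"(\w+)@(\w+)", r"\2/\1", text)
print(swapped)  # example/alice

# Named groups referenced with \g<name>
named = re.sub(r"(?P<user>\w+)@(?P<host>\w+)", r"\g<host>/\g<user>", text)
print(named)  # example/alice
```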

📦 Reusable molds & registries

A mold is a Python file with a transform(data, args, env, headers) function. Inline -e expressions are great for one-liners; molds are for transforms you want to name, test, and share.

# normalize.py
def transform(data, args, env, headers):
    return [{"name": u["name"].strip().title(), "email": u["email"].lower()} for u in data]

# Use a local mold
fimod s -i users.json -m normalize.py

# Use a remote mold - fetched and executed on the fly
fimod s -i users.json -m https://example.com/transforms/normalize.py
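Because a mold is an ordinary Python function, one way to check its logic before running it through fimod is to call it directly in a regular Python interpreter. The sketch below adds default values for args, env, and headers purely so the function can be called with one argument; that is an assumption for local testing, and CPython may not behave identically to the embedded Monty runtime. The sample input is invented.

```python
# normalize.py logic, with defaults added for local testing only (assumption)
def transform(data, args=None, env=None, headers=None):
    return [{"name": u["name"].strip().title(), "email": u["email"].lower()} for u in data]


# Hypothetical sample input
users = [{"name": "  ada lovelace ", "email": "Ada@Example.COM"}]

result = transform(users)
assert result == [{"name": "Ada Lovelace", "email": "ada@example.com"}]
print(result)
```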

Registries are named collections of molds (local directories or GitHub/GitLab repos). The @ prefix resolves molds from registries:

fimod registry add team https://github.com/myorg/molds --default
fimod s -i data.csv -m @clean_csv          # from default registry
fimod s -i data.csv -m @team/clean_csv     # explicit registry
fimod mold list                           # browse available molds
fimod mold show @clean_csv               # inspect metadata & defaults

Private registry with token

For private GitHub/GitLab repos, fimod automatically uses $GITHUB_TOKEN or $GITLAB_TOKEN:

# 1. Export your token (add to .bashrc/.zshrc for persistence)
export GITHUB_TOKEN=ghp_xxx

# 2. Add a private registry
fimod registry add corp https://github.com/myorg/private-molds --default

# 3. Use molds - token is picked up automatically
fimod s -i data.json -m @corp/sanitize

# Verify token is detected
fimod registry show corp
#   Token:   $GITHUB_TOKEN (auto) - set ✓

You can also use a custom env var per registry:

fimod registry add corp https://github.com/myorg/private-molds --token-env CORP_TOKEN
export CORP_TOKEN=ghp_yyy

CI/ephemeral environments - use FIMOD_REGISTRY instead of fimod registry add:

FIMOD_REGISTRY=./molds fimod s -i data.json -m @clean
FIMOD_REGISTRY="ci=./molds,staging=https://github.com/org/molds" fimod s -i data.json -m @ci/clean

fimod ships with a built-in mold catalog covering common tasks (CSV stats, JSON schema extraction, key renaming, PII anonymization, and more).

🔥 HTTP input (goodbye curl | jq)

The -i flag accepts URLs just like file paths. No curl, no wget, no pipes. Fimod fetches, parses, and transforms in a single command.

# Fetch and transform in one shot - replaces curl | jq
fimod s -i https://api.github.com/repos/pytgaen/fimod -e 'data["name"] + ": " + str(data["stargazers_count"]) + " stars"' --output-format txt

# Hit authenticated APIs with custom headers
fimod s -i https://api.github.com/user/repos \
    --http-header "Authorization: Bearer $GITHUB_TOKEN" \
    -e '[r["full_name"] for r in data]'

# 👀 Download binaries - bypass the transform pipeline entirely
fimod s -i https://example.com/archive.tar.gz --output-format raw -O

Powered by reqwest with rustls - proxy-aware out of the box (HTTP_PROXY / HTTPS_PROXY / NO_PROXY). Smart format detection reads Content-Type headers automatically. Use --input-format http for full access to status codes and response headers.

Requires the full build variant (default). Use FIMOD_VARIANT=slim to exclude HTTP support.

🛡️ Security model

Mold scripts are pure functions - they receive data and return a result. They cannot:

  • Read/write files, access the network, or call the OS
  • Import external libraries

All I/O stays in Rust. You can safely run molds from remote URLs without sandboxing concerns.

⚙️ How it works

 Input                  Python transform           Output
┌──────────────┐       ┌───────────────────┐      ┌──────────────┐
│ file / stdin │       │                   │      │ JSON / YAML  │
│ https://...  │─────▶│  your transform   │─────▶│ TOML / CSV   │
│ JSON / YAML  │ Rust  │  runs in Monty    │ Rust │ NDJSON / TXT │
│ TOML / CSV   │ parse │  (embedded Python)│ ser. └──────────────┘
│ NDJSON / TXT │       └───────────────────┘
└──────────────┘

📖 Documentation

| 📚 Guides | 🔧 Reference |
| --- | --- |
| Quick Start | Formats - JSON, YAML, TOML, CSV, TXT, Lines, NDJSON, HTTP |
| Concepts | Built-ins - re_*, dp_*, it_*, hs_*, msg_*, gk_*, env_subst |
| Mold Scripting | Mold Defaults - # fimod: directives |
| CLI Reference | Exit Codes - --check and set_exit() |
| Authoring Molds | Cookbook 🍳 |
| AI Integration & Agents 🤖 | Agent Skill ✨ |

⚠️ Project Status

fimod is young software - built with AI-assisted development ("vibe coding").

  • Monty (the embedded Python runtime) is an early-stage project by Pydantic. Its API is unstable and may change between releases.
  • fimod depends directly on Monty and inherits that instability. Expect breaking changes as both projects mature.
  • Versioning follows Semantic Release - breaking changes bump the major version.
  • Built-in helpers (re_*, dp_*, it_*, hs_*, msg_*, gk_*, env_subst) are implemented in Rust to complement Monty's limited stdlib. In particular, regex functions use fancy-regex syntax (Rust/PCRE2 flavour), not Python's re module - see Built-ins Reference.

Note

Regex: Fimod built-ins vs Monty's re module

Fimod was originally built on Monty v0.0.6, which had no regex support. We introduced re_search, re_sub, re_findall, etc. as Fimod built-in functions to fill that gap, a good example of the challenges of moving fast alongside a young runtime.

Since Monty v0.0.8, import re works: Monty implements a subset of Python's re module. Both approaches now work side by side:

  • Fimod's re_* built-ins - direct access to fancy-regex, including advanced features like variable-length lookbehind/lookahead
  • import re - familiar Python API, but only partially implemented in Monty (also backed by fancy-regex under the hood)

The re_* built-ins are here to stay for the foreseeable future (at least until late 2027). As Monty's re module matures, we'll reconsider.

Since import re is already well-known to Python developers, the documentation focuses on the re_* built-ins which are specific to Fimod.

📄 License

GNU Lesser General Public License v3.0 - see LICENSE.txt.
