sorta is a lightweight CLI tool that automatically sorts PDF documents into folders based on their content using keyword rules.
Built for Linux (Arch-first), but fully portable.

- Keyword-based PDF classification
- Hierarchical root tree routing (`children` roots)
- Rule templates with per-root overrides
- Safe moving with collision handling (`file_1.pdf`, `file_2.pdf`)
- Watch mode: auto-sort new files instantly
- Structured logs stored in `~/.cache/sorta`
- Built-in config validator (`sorta doctor`)
- Helpers for adding roots and rules (`add-root`, `add-rule`)
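
The collision handling mentioned above can be pictured with a small Python sketch. This is not sorta's actual code: the helper names `unique_destination` and `safe_move` are hypothetical, and the `_1`, `_2` suffix scheme is inferred only from the `file_1.pdf`, `file_2.pdf` example in the feature list.

```python
import shutil
from pathlib import Path


def unique_destination(dest_dir: Path, filename: str) -> Path:
    """Return a path inside dest_dir that does not exist yet.

    Assumption: a numeric suffix (_1, _2, ...) is inserted before the
    extension until a free name is found.
    """
    candidate = dest_dir / filename
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    while candidate.exists():
        candidate = dest_dir / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate


def safe_move(src: Path, dest_dir: Path) -> Path:
    """Move src into dest_dir without overwriting an existing file."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = unique_destination(dest_dir, src.name)
    shutil.move(str(src), str(target))
    return target
```

With `file.pdf` and `file_1.pdf` already present in the destination, moving another `file.pdf` in this sketch yields `file_2.pdf`.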

sorta sorts documents in two steps, with a safe fallback:

- Root selection (category level): The tool scans the PDF text and chooses the best-matching root using the root's keywords.
- Rule selection (type level): Inside that root, it applies the root's `rule_set` (NOTES, QB, SLIDES, etc.) to decide the final subfolder.
- Safe fallback:
  - If no match is found, the file goes into `UNSORTED`.
  - If multiple rules tie, the file is marked AMBIGUOUS and also moved to `UNSORTED`.
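
The two-step routing described above can be sketched in Python as follows. This is an illustration under assumptions, not sorta's implementation: scoring by counting keyword hits, treating a tie between roots as ambiguous, the `NO_MATCH` label, and building the destination as `<root path>/<RULE>` are all guesses made for the example.

```python
def score(text: str, keywords: list[str]) -> int:
    """Count how many keywords occur in the text (case-insensitive)."""
    lowered = text.lower()
    return sum(1 for kw in keywords if kw.lower() in lowered)


def pick_best(text: str, candidates: dict[str, list[str]]):
    """Return (name, status) for the best-scoring candidate.

    status is "MATCHED", "NO_MATCH" (hypothetical label) or "AMBIGUOUS".
    """
    scores = {name: score(text, kws) for name, kws in candidates.items()}
    best = max(scores.values(), default=0)
    if best == 0:
        return None, "NO_MATCH"
    winners = [name for name, s in scores.items() if s == best]
    if len(winners) > 1:
        return None, "AMBIGUOUS"  # refuse to guess on a tie
    return winners[0], "MATCHED"


def classify(text: str, roots: dict, rules: dict[str, list[str]]) -> dict:
    """Step 1: choose a root. Step 2: choose a rule from that root's rule_set."""
    root, status = pick_best(text, {n: r["keywords"] for n, r in roots.items()})
    if root is None:
        return {"root": None, "type": status, "dest": "UNSORTED"}
    allowed = {name: rules[name] for name in roots[root]["rule_set"]}
    rule, status = pick_best(text, allowed)
    if rule is None:
        return {"root": root, "type": status, "dest": "UNSORTED"}
    # Assumption: the final subfolder is named after the matching rule.
    return {"root": root, "type": rule, "dest": f"{roots[root]['path']}/{rule}"}
```

Here `roots` maps a root name to a dict with `path`, `keywords`, and `rule_set`, and `rules` maps a rule name to its keyword list, mirroring the shape of the config shown later in this document.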

Install from AUR:

    yay -S sorta
    sorta init

This creates `~/.config/sorta/config.toml`.

    sorta edit

Default dropbox: `~/SortaDrop/`

    sorta run

Flags:

- `-f`: Folder to scan (defaults to the dropbox)
- `--here`: Scan the current working directory

To watch a folder continuously:

    sorta watch -f ~/SortaDrop

or, to watch the current working directory:

    sorta watch --here

This command can be run at autostart to move files out of a folder automatically as they are added.

| Command | Description |
|---|---|
| `sorta run` | Sort all PDFs in dropbox once |
| `sorta watch` | Continuously watch folder and auto-sort |
| `sorta sort <file>` | Sort one file immediately |

| Command | Description |
|---|---|
| `sorta classify <file>` | Show predicted destination (no move) |
| `sorta list` | Show configured roots + rules |
| `sorta doctor` | Validate config + detect common problems |

`sorta add-rule` adds a new global rule template under `[rules.*]`.

    sorta add-rule NAME --keywords "kw1,kw2,kw3"
    sorta add-rule SLIDES -k "slides,ppt,deck,lecture"

`sorta add-root` adds a new leaf root folder and automatically attaches it to the routing tree.

    sorta add-root NAME \
      --parent PARENT \
      --path "DESTINATION_PATH" \
      --keywords "kw1,kw2,kw3" \
      --rules "RULE1,RULE2" \
      [--override RULE=kw1,kw2]

| Option | Value |
|---|---|
| `NAME` | Root name (stored lowercase) |
| `--parent` | Parent root to attach under |
| `--path` | Folder destination |
| `-k`, `--keywords` | Root matching keywords |
| `-r`, `--rules` | Rule set allowed inside this root |
| `--override` (optional) | Override rule keywords for this root |

For example:

    sorta add-root physics \
      --parent Documents \
      --path "~/Documents/Sorta/Physics" \
      -k "physics,quantum,optics" \
      -r "NOTES,QB" \
      --override NOTES="lecture slides,class notes"

Config lives in `~/.config/sorta/config.toml`.

Rules define document types:

    [rules.NOTES]
    keywords = ["notes", "lecture"]

    [rules.QB]
    keywords = ["question bank", "previous year"]

Roots define folder destinations:

    [roots.documents]
    path = "~/Documents/Sorta"
    children = ["study", "finance"]

    [roots.study]
    path = "~/Documents/Sorta/Study"
    keywords = ["unit", "lecture"]
    rule_set = ["NOTES", "QB"]

You can override rule keywords for one root:

    [roots.study.rule_overrides.NOTES]
    keywords = ["lecture slides", "class notes"]

Overrides apply only inside that particular root.
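
To make the override behaviour concrete, here is a minimal Python sketch of how the effective keywords for one root could be resolved. It is an assumption-laden illustration: `effective_keywords` is a hypothetical helper, `CONFIG` is a plain dict mirroring the TOML above rather than a parsed file, and the sketch assumes an override replaces the global keyword list instead of extending it.

```python
# Plain-dict mirror of the TOML config shown above (illustrative only).
CONFIG = {
    "rules": {
        "NOTES": {"keywords": ["notes", "lecture"]},
        "QB": {"keywords": ["question bank", "previous year"]},
    },
    "roots": {
        "study": {
            "path": "~/Documents/Sorta/Study",
            "keywords": ["unit", "lecture"],
            "rule_set": ["NOTES", "QB"],
            "rule_overrides": {
                "NOTES": {"keywords": ["lecture slides", "class notes"]},
            },
        },
    },
}


def effective_keywords(config: dict, root_name: str) -> dict[str, list[str]]:
    """Return rule name -> keywords as seen from inside one root.

    Assumption: a per-root override replaces the global template's keywords.
    """
    root = config["roots"][root_name]
    overrides = root.get("rule_overrides", {})
    resolved = {}
    for rule_name in root.get("rule_set", []):
        if rule_name in overrides:
            resolved[rule_name] = overrides[rule_name]["keywords"]
        else:
            resolved[rule_name] = config["rules"][rule_name]["keywords"]
    return resolved
```

In this sketch, `effective_keywords(CONFIG, "study")` uses the overridden keywords for `NOTES` and the global template for `QB`; any other root that lists `NOTES` without an override would still see the global keywords.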

If multiple rules tie with the same score, sorta refuses to guess:

    {
      "type": "AMBIGUOUS",
      "dest": "~/SortaDrop/UNSORTED"
    }

This prevents silent misfiling. All ambiguous files are placed in `~/SortaDrop/UNSORTED`.

Logs are stored in:

    ~/.cache/sorta/sorta.log

Each entry is a JSON object on its own line (JSONL):

    {"status":"MOVED","file":"unit2.pdf","dest":"...","timestamp":"..."}

Run the following to validate your rules and roots and to catch common mistakes:

    sorta doctor

Clone and install locally:

    git clone https://github.com/woterr/sorta
    cd sorta
    uv venv
    source .venv/bin/activate
    uv pip install -e .

Run:

    sorta --help
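
Because the log at `~/.cache/sorta/sorta.log` is JSONL, it can be inspected with a few lines of Python. The sketch below is not part of sorta; `summarize_log` is a hypothetical helper that counts entries per `status` value. Only the `MOVED` status appears in this document, so any other value you see comes from the tool itself.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_log(lines) -> Counter:
    """Count log entries per "status" field, skipping blank or malformed lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate partially written or corrupt lines
        if isinstance(entry, dict):
            counts[entry.get("status", "UNKNOWN")] += 1
    return counts


if __name__ == "__main__":
    log_path = Path("~/.cache/sorta/sorta.log").expanduser()
    if log_path.exists():
        with log_path.open(encoding="utf-8") as handle:
            print(summarize_log(handle))
```

The function accepts any iterable of lines, so it works on an open file handle as well as on a list of strings.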