Most analysis APIs are used from Python (import metainformant...). The metainformant entry point (src/metainformant/__main__.py) exposes a small set of commands today.
Entry: uv run python -m metainformant or uv run metainformant.
Global flags:
--version— print package version--modules— list domain module names--help— usage (default when no arguments)
Subcommands:
uv run metainformant protein taxon-ids --file tests/data/protein/taxon_id_list.txt
uv run metainformant protein comp --fasta data/protein/example.faa
uv run metainformant protein rmsd-ca --pdb-a file1.pdb --pdb-b file2.pdb
uv run metainformant quality batch-detect --data samples.csv --batches batches.txt
uv run metainformant quality batch-detect --data samples.csv --batches batches.txt --alpha 0.01
uv run metainformant rna info
uv run metainformant gwas info
| Command | Purpose |
|---|---|
| protein taxon-ids | Read and print taxon IDs from a file (Protein Proteomes) |
| protein comp | Amino acid composition per sequence from FASTA |
| protein rmsd-ca | Kabsch RMSD between CA atoms of two PDB files |
| quality batch-detect | Batch-effect report from a numeric matrix (CSV) and per-sample batch labels file |
| rna info | Prints RNA sub-package summary (use Python API for workflows) |
| gwas info | Prints GWAS sub-package summary (use Python API or scripts for full runs) |
The main metainformant CLI does not implement rna run, rna plan, gwas run, dna fetch, or setup. Use one of:
Python API (see RNA Workflow, GWAS Workflow):
from pathlib import Path
from metainformant.rna.engine.workflow import load_workflow_config, plan_workflow, execute_workflow
config = load_workflow_config(Path("config/amalgkit/amalgkit_pogonomyrmex_barbatus.yaml"))
steps = plan_workflow(config)
result = execute_workflow(config, check=False)Module entry (amalgkit):
uv run python -m metainformant.rna.amalgkit --helpScript orchestrators (recommended for config-driven runs):
python3 scripts/rna/run_workflow.py --config config/amalgkit/amalgkit_pogonomyrmex_barbatus.yamlTUI variants: TUI Monitoring Tools below.
Run the test suite with pytest, not via metainformant:
bash scripts/package/test.shSee Testing.
sequenceDiagram
participant U as User
participant Py as Python_API
participant Eng as rna_engine_workflow
participant Am as amalgkit_cli
U->>Py: load_workflow_config(path)
U->>Py: plan_workflow(config)
Py->>Eng: plan_workflow
Eng-->>Py: list_of_steps
U->>Py: execute_workflow(config, check=True)
Py->>Eng: execute_workflow
Eng->>Am: run_amalgkit(step, params)
Am-->>Eng: subprocess_results
Eng-->>Py: WorkflowExecutionResult
See: RNA Workflow, DNA, GWAS Workflow, Testing.
Interactive terminal tools (separate from metainformant):
-
scripts/rna/monitor_tui.py— Dashboard for pipeline progress and system metrics (default refresh ~5s).python scripts/rna/monitor_tui.py
-
scripts/rna/run_workflow_tui.py— Workflow runner with TUI.python scripts/rna/run_workflow_tui.py --config config/amalgkit/amalgkit_pogonomyrmex_barbatus.yaml --threads 5
output/amalgkit/<species>/
├── work/ # Amalgkit intermediate files (metadata, selected samples, merge outputs)
│ ├── metadata/ # Sample metadata from NCBI (metadata_selected.tsv)
│ ├── getfastq/ # Symlinks to downloaded FASTQ files
│ ├── quant/ # Kallisto quantification results per sample
│ ├── merge/ # Combined abundance matrices
│ └── logs/ # Step-level log files
├── fastq/ # Raw downloaded FASTQ files (ENA .fastq.gz or SRA extracts)
│ └── getfastq/ # Per-sample subdirectories with FASTQ pairs
└── genome/ # Reference genome FASTA + Kallisto index
work/getfastq/ holds symlinks to fastq/getfastq/ where data files live.
- RNA:
config/amalgkit/*.yaml(RNA Workflow) - GWAS:
config/gwas/*.yaml(GWAS Workflow) - Networks:
config/networks/networks_template.yaml - Multi-omics:
config/multiomics/multiomics_template.yaml - Single-cell:
config/singlecell/singlecell_template.yaml
See Configuration Management for environment overrides.