Runyoro is a comprehensive AI platform designed to bridge the digital divide for the Shona language. This project implements state-of-the-art Natural Language Processing (NLP) techniques, including Transformer-based Neural Machine Translation (NMT), Autoregressive Text Generation, and Automatic Speech Recognition (ASR). The system features a modern web interface that allows users to translate text between Shona and English, generate creative Shona text, and transcribe spoken Shona-accented English.
- Python 3.8+
- Node.js 16+
- Supabase Account (for authentication and database)
- Navigate to the backend directory:
  cd web_app/backend
- Activate the project environment:
  conda activate transformer
  If you haven't set up the environment yet, refer to ENVIRONMENT.md.
- Navigate to the frontend directory:
  cd web_app/frontend
- Install dependencies:
  npm install
The project is organized to keep source code, data, models, and the web application clearly separated.
.
├── src/ # Core Source Code
│ ├── data/ # Data loaders (AfriSpeech, Text)
│ ├── models/ # Model definitions (Transformer, etc.)
│ ├── training/ # Training scripts for all modalities
│ ├── inference/ # Inference scripts for generation/translation
│ ├── evaluation/ # Evaluation metrics and tools
│ └── utils/ # Common utilities (seed, checkpoints)
├── data/ # Dataset storage
│ ├── Train/ # Training Datasets
│ │ ├── AfriSpeech/ # ASR Data (Shona)
│ │ └── Flores-200/ # Translation/Gen Data (Shona/English)
│ ├── Test/ # Test Datasets
│ │ ├── AfriSpeech/ # ASR Data (Shona)
│ │ └── Flores-200/ # Translation/Gen Data (Shona/English)
├── saved_models/ # Trained model artifacts and checkpoints
│ ├── checkpoints/ # Training checkpoints
│ └── whisper-*/ # Fine-tuned Whisper models
├── scripts/ # Utility and Runner Scripts
│ ├── train/ # Scripts to launch training jobs
│ ├── evaluation/ # Scripts to generate deliverables
│ └── utils/ # Log checkers and maintenance scripts
├── web_app/ # Full Stack Web Application
│ ├── backend/ # FastAPI Backend
│ └── frontend/ # Next.js Frontend
├── results/ # Evaluation outputs (PDFs, TSVs)
└── legacy/ # Archived previous versions and grading files
All training scripts are located in src/training/ and should be executed from the project root.
1. Neural Machine Translation (NMT)
python src/training/train_nmt.py --epochs 10 --batch_size 16
2. Text Generation (Shona)
python src/training/train_gen.py --epochs 10 --batch_size 16
3. Automatic Speech Recognition (ASR - Whisper)
python src/training/train_asr.py --use_lora
Or use the runner script for background execution:
python scripts/train/run_training_job.py
To generate the final deliverables (PDF report, Transcriptions TSV, Ground Truth TXT):
python scripts/evaluation/generate_deliverables.py
Outputs will be saved in the project root or specified output directory.
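For the background execution of training jobs mentioned above, the following is a minimal sketch of how a training script could be launched as a detached process with its output redirected to a log file. It is not the contents of scripts/train/run_training_job.py, which is not shown here; the helper names `build_command` and `launch_in_background` and the log file name are hypothetical. Only the script paths and flags (`--epochs`, `--batch_size`, `--use_lora`) come from the commands above.

```python
import subprocess
import sys
from typing import List


def build_command(script: str, **flags) -> List[str]:
    """Turn keyword arguments into a command line for a training script.

    True becomes a bare switch (--use_lora); False/None are skipped;
    anything else becomes "--name value".
    """
    cmd = [sys.executable, script]
    for name, value in flags.items():
        if value is True:
            cmd.append(f"--{name}")
        elif value is not False and value is not None:
            cmd.extend([f"--{name}", str(value)])
    return cmd


def launch_in_background(cmd: List[str], log_path: str) -> subprocess.Popen:
    """Start cmd without waiting for it; stdout and stderr go to log_path."""
    with open(log_path, "w") as log_file:
        # The child process keeps its own copy of the file handle,
        # so closing ours when the with-block ends is safe.
        process = subprocess.Popen(cmd, stdout=log_file, stderr=subprocess.STDOUT)
    return process


# Example (run from the project root; the log file name is an assumption):
#   cmd = build_command("src/training/train_nmt.py", epochs=10, batch_size=16)
#   job = launch_in_background(cmd, "nmt_training.log")
#   print("started training with PID", job.pid)
```

Calling `job.poll()` later returns `None` while the job is still running and the exit code once it has finished.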
Generate Text:
python src/inference/generate_samples.py --checkpoint saved_models/checkpoints/gen-run-1_best.pth.tar
We trained a Transformer model from scratch for Shona-English translation.
- Architecture: Transformer (d_model=256, n_layers=3, heads=4)
- Performance: BLEU: 32.82 (0-100 scale), WER: 0.697 (69.7%)
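To make the WER figure concrete: word error rate is the word-level edit distance (substitutions, deletions, insertions) between a hypothesis and its reference, divided by the number of reference words. The sketch below is a plain-Python illustration of that definition, not the project's evaluation code; the function name `word_error_rate` is hypothetical, and tokenization is assumed to be simple whitespace splitting.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against a 4-word reference:
print(word_error_rate("a b c d", "a x c"))  # 0.5
```

Because insertions count as errors, WER is not bounded by 1: a hypothesis much longer than its reference can score above 100%.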
We trained autoregressive models for Shona text generation.
- Large Model: Trained on 100,000 sentences.
- Result: Coherent Shona text generation with low validation loss.
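As an illustration of what "autoregressive" means here, the sketch below generates one token at a time, feeding each chosen token back in as context for the next step. It is a toy under stated assumptions: the bigram table `TOY_BIGRAMS`, the placeholder tokens, and the function names are all hypothetical stand-ins for the trained model and its tokenizer, and the project's actual decoding strategy is not specified above.

```python
import random
from typing import Callable, Dict, List

NextTokenFn = Callable[[List[str]], Dict[str, float]]


def generate(next_token_probs: NextTokenFn, prompt: List[str],
             max_new_tokens: int = 20, eos: str = "<eos>",
             greedy: bool = True, seed: int = 0) -> List[str]:
    """Extend prompt token by token until eos or max_new_tokens is reached."""
    rng = random.Random(seed)
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = next_token_probs(tokens)  # distribution given everything so far
        if greedy:
            token = max(probs, key=probs.get)
        else:
            candidates = list(probs)
            weights = [probs[c] for c in candidates]
            token = rng.choices(candidates, weights=weights, k=1)[0]
        if token == eos:
            break
        tokens.append(token)  # the new token becomes part of the context
    return tokens


# Toy stand-in for a trained model: the next token depends only on the last one.
TOY_BIGRAMS = {
    "<bos>": {"a": 0.9, "b": 0.1},
    "a": {"b": 0.8, "<eos>": 0.2},
    "b": {"c": 0.7, "<eos>": 0.3},
    "c": {"<eos>": 1.0},
}


def toy_model(tokens: List[str]) -> Dict[str, float]:
    return TOY_BIGRAMS[tokens[-1]]


print(generate(toy_model, ["<bos>"]))  # ['<bos>', 'a', 'b', 'c']
```

Passing `greedy=False` samples from the distribution instead of taking the most likely token, which trades determinism for variety.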
We fine-tuned OpenAI's Whisper models using LoRA on the AfriSpeech-200 Shona dataset.
- Performance: WER: 33.13% (a significant improvement over the baseline).
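For context on why LoRA is a practical way to fine-tune a large model: it freezes each adapted weight matrix and trains only two small low-rank factors alongside it, so the number of trainable parameters per matrix drops from d_in × d_out to rank × (d_in + d_out). The arithmetic below is a sketch with illustrative numbers; the layer size (768) and rank (8) are assumptions, not the settings used in this project.

```python
from typing import Tuple


def lora_param_counts(d_in: int, d_out: int, rank: int) -> Tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    full: every entry of the d_out x d_in matrix is trained.
    lora: the matrix is frozen and only the factors B (d_out x rank)
          and A (rank x d_in) are trained.
    """
    full = d_in * d_out
    lora = rank * (d_in + d_out)
    return full, lora


# Assumed, illustrative sizes: a 768 x 768 projection adapted with rank 8.
full, lora = lora_param_counts(d_in=768, d_out=768, rank=8)
print(full, lora)                      # 589824 12288
print(f"{100 * lora / full:.2f}%")     # 2.08%
```

Under these assumed sizes, the adapter trains roughly 2% of the parameters of the matrix it modifies.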