BoDmagh dataset is a Supervised Fine-Tuning (SFT) dataset for the Darija language
Updated May 4, 2025 - Jupyter Notebook
Chrome extension to translate Darija
NLP pipeline for toxicity detection in Moroccan Darija and Arabizi. Powered by a fine-tuned ArabERT model, it robustly detects bad words, insults, hate speech, and evasion attempts.