ml-evaluation

Here are 16 public repositories matching this topic...

OlivierBinette / er-evaluation

An End-to-End Evaluation Framework for Entity Resolution Systems

data-science statistics matching record-linkage entity-resolution evaluation fuzzy-matching disambiguation deduplication duplicate-detection author-name-disambiguation ml-testing ml-evaluation inventor-name-disambiguation

Updated Dec 3, 2023
Python

supermodeltools / mcpbr

Model Context Protocol Benchmark Runner

python benchmarking machine-learning mcp ml-evaluation llm-evaluation model-context-protocol swe-bench

Updated Feb 27, 2026
Python

AmeyaWagh / robometric-frame

A metrics library for evaluating vision-language models within the PyTorch ecosystem.

robotics policy-evaluation evaluation-metrics ml-evaluation torchmetrics diffusion-policy lerobot vision-language-action-model policy-lear

Updated Feb 8, 2026
Python

Giskard-AI / community-content

✍️ Collaborate on writing technical content for the Giskard Community

testing content machine-learning ai ml tutorials artificial-intelligence tutorial-code ml-testing giskard ml-evaluation

Updated Nov 11, 2022

pareshrnayak / confusion-matrix-generator

An open-source Streamlit web app to generate beautiful confusion matrices for multi-class machine learning models. Supports numeric and string labels, CSV upload, manual label entry, custom color maps, and displays evaluation metrics like Accuracy, Precision, Recall, and F1-score. Users can download the confusion matrix as an image.

python open-source data-science machine-learning data-visualization confusion-matrix model-evaluation multiclass-classification streamlit streamlit-webapp classification-metrics ml-evaluation confusion-matrix-generator

Updated Jan 18, 2026
Python

Comrade-1729 / lex-brief-ai

Safety-first legal NLP system with hierarchical long-document processing, deterministic inference, clause extraction, and rule-based risk engine — built for traceability and deployment constraints.

nlp django transformers pytorch production-ml ml-evaluation rule-based-systems deterministic-inference

Updated Feb 10, 2026
Python

rodrigoguedes09 / model-observability-system

Enterprise-grade machine learning observability platform that detects data drift, concept drift, and performance degradation in production models. Features statistical drift detection (KS test, PSI), real-time alerting, Redis caching, and FastAPI backend.

python machine-learning machine-learning-algorithms ml observability ml-observability ml-evaluation

Updated Jan 15, 2026
Python

johnsonhk88 / Data-Science-Challenge-Coursera-Project-Loan-Default-Prediction

Data Science Challenge from a Coursera project: Loan Default Prediction

data-science machine-learning ai deep-learning random-forest exploratory-data-analysis coursera data-cleaning loan-default-prediction xgboost-classifier ml-evaluation

Updated Oct 16, 2024
Jupyter Notebook

kmock930 / Drug-Consumption-Machine-Learning-analysis

This project contains code and written coursework for the course CSI5155 at the University of Ottawa (taught by Professor Dr. Herna Viktor).

machine-learning random-forest svm supervised-learning semi-supervised-learning mlp unsupervised-learning knn decision-tree ensemble-model gradient-boosting boosting receiver-operating-characteristic bagging xai area-under-curve ml-pipeline shap-analysis ml-evaluation

Updated Dec 9, 2024
Jupyter Notebook

greynewell / swe-bench-pro-action

GitHub Action for SWE-bench Pro evaluation powered by mcpbr

python benchmarking mcp ai-agents github-actions ml-evaluation llm-evaluation swe-bench

Updated Feb 26, 2026
Shell

SvetLuna-Lab / Mini-rag-eval-demo

A small educational project that shows how to build a minimal RAG pipeline with a simple evaluation loop.

python nlp machine-learning information-retrieval text-mining evaluation tfidf educational-project rag qa-system ml-evaluation retrieval-augmented-generation

Updated Nov 10, 2025
Python

praveendecode / Docker_ML_NLP_Projects

Collection of Machine Learning (ML) and Natural Language Processing (NLP) projects showcasing a range of applications, algorithms, and techniques.

python machine-learning natural-language-processing machine-learning-algorithms streamlit ml-evaluation

Updated Dec 2, 2023

kirtis111 / e-commerce-recommendation-system

End-to-end E-Commerce Recommendation System using implicit feedback, featuring Popularity, Item-Item CF, ALS (Matrix Factorization), and a Hybrid model, with offline evaluation and online serving via FastAPI + Streamlit.

data-science ecommerce personalization ranking recall recommendation-engine als recommender-systems hybrid-model implicit-feedback ndcg model-serving fastapi ml-pipeline streamlit ml-evaluation

Updated Jan 26, 2026
Jupyter Notebook

ivy-mainaa / System-Risk-in-Policy-Driven-AI-Systems

Evaluation of system-level risks in content moderation models using policy-driven metrics, identity-based analysis, and governance-aligned datasets.

fairness content-moderation responsible-ai ai-governance ml-evaluation

Updated Jan 4, 2026
Jupyter Notebook

victoropp / naive-bayes-spam-detection

A MATLAB-based machine learning project that implements a Naive Bayes spam email classifier using the UCI Spambase dataset. Includes feature selection, model tuning, performance evaluation, and deployment-ready model export.

data-science machine-learning deployment matlab naive-bayes academic-project roc-curve spam-classification precision-recall classification-models uci-dataset ml-evaluation bootstrap-accuracy

Updated May 23, 2025
MATLAB

TakundaVito / confusion-matrix-generator

📊 Generate and visualize confusion matrices for multi-class models with ease, including metrics and custom formats in an open-source Streamlit app.

python open-source data-science machine-learning data-visualization confusion-matrix model-evaluation multiclass-classification github-config streamlit streamlit-webapp classification-metrics ml-evaluation confusion-matrix-generator

Updated Mar 2, 2026
