A modular ecosystem under this. namespace.
Updated Mar 18, 2026 - HTML
🩺 Machine learning diabetes prediction model using a Support Vector Machine (SVM) classifier. Analyzes 8 medical features (glucose, BMI, age, etc.) from the Pima Indians Diabetes dataset to predict diabetes risk with 75-80% accuracy. Built with Python, scikit-learn, and pandas. Includes data preprocessing, model training, and a prediction system.
Example code accompanying the Sternberg concept cell data release for Kyzar et al. (2024)
A digital transformation of cyber assessment and authorization data with a relational schema
Feature Engineering with Python
Prepare and check data to comply with Darwin Core Standard in R
Unifying Biotic Interactions Data: Terminology, Data Analysis, Standardization, and Proposal of a Data Schema for Plant-Pollinator Interactions
Highlighting expertise in data migration, data normalization and standardization, this project demonstrates successful data transfer from Snowflake to Databricks. It emphasizes optimized data flow and enhanced accessibility through standardization, showcasing a commitment to ethical data practices.
A Python-based data cleaning project to streamline Quickbooks invoice data for analysis, paving the way for improved insights into sales, pricing, and inventory management.
A new package processes textual descriptions of drone designs to extract structured summaries of their operational capabilities. It focuses on identifying and categorizing key features such as locomot
Building a modern data warehouse with SQL Server, including ETL Processes, Data Modeling and Analytics
This project is about cleaning and preparing a global layoffs dataset for analysis, focusing on handling null values, correcting data types, and ensuring data integrity for more accurate insights.
This data analytics project focuses on understanding the career preferences and motivations of Generation Z. Through survey data and analysis, it aims to identify key trends and factors influencing their career choices, providing insights for employers, educators, and recruiters looking to engage with this new generation of talent.
vuln-structure is a package that extracts vulnerability details from raw text and outputs standardized, structured data for security teams.
csv-managed is a Rust command-line utility for high‑performance exploration and transformation of CSV data at scale, emphasizing streaming, typed operations, and reproducible workflows via schema and index files.
Hi folks. During my internship at KultureHire, I completed an end-to-end data analytics project. I created an executive and functional dashboard using pivot tables, conducted a thorough analysis, and provided actionable recommendations. I'm excited to share my work and the insights I discovered.
🌟 Data Cleaning and Processing 🌟 Handled missing values, removed duplicates, standardized salary formats, and treated outliers for consistency. Revealed trends in company performance, job roles, and salary distributions after refining the dataset. This project highlights the power of data preprocessing as the backbone of reliable analytics.
This repository contains a SQL-based data cleaning project where raw layoffs data was transformed into a clean and structured dataset. The project showcases practical SQL techniques such as duplicate removal, data standardization, null handling, and schema optimization, following real-world data preparation best practices.
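🧹 Excel data standardization and cleaning tool | 100+ smart rules · two-stage safe processing · formulas left untouched · item-by-item review · change log export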
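Several of the projects listed above describe the same core steps: handling missing values, removing duplicates, and standardizing formats such as salaries. The following is a minimal sketch of those steps using only the Python standard library. It is not taken from any of the listed repositories; the field names `role` and `salary`, the accepted salary formats, and the helper names `parse_salary` and `clean_records` are all assumptions made for illustration. Outlier treatment is left out.

```python
import re


def parse_salary(raw):
    """Convert strings like '$85,000', '85k', or '85000.0' to a float.

    Returns None when the value is missing or not in an expected format.
    The accepted formats are an assumption for this sketch.
    """
    if raw is None:
        return None
    text = str(raw).strip().lower().replace(",", "").replace("$", "")
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(k?)", text)
    if not match:
        return None
    value = float(match.group(1))
    # A trailing 'k' means thousands.
    return value * 1000 if match.group(2) else value


def clean_records(records):
    """Drop rows with an unusable salary, standardize values, remove duplicates.

    'role' and 'salary' are hypothetical field names.
    """
    seen = set()
    cleaned = []
    for row in records:
        salary = parse_salary(row.get("salary"))
        if salary is None:
            continue  # missing or unparseable salary: skip the row
        role = (row.get("role") or "").strip().lower()
        key = (role, salary)
        if key in seen:
            continue  # duplicate after standardization
        seen.add(key)
        cleaned.append({"role": role, "salary": salary})
    return cleaned
```

Called on rows such as `{"role": " Analyst ", "salary": "$85,000"}` and `{"role": "analyst", "salary": "85k"}`, `clean_records` treats the two as the same record once both fields are standardized, which is why deduplication runs after normalization rather than before it.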