10 Apr 13:24

github-actions

v0.3.1

1414207

v0.3.1 Latest

Full Changelog: v0.3.0...v0.3.1

10 Apr 12:04

SamoraHunter

v0.3.0

ed5de44

Release v0.3.0: Elasticsearch Testing & Data Safety

🚀 What's New in v0.3.0

This release focuses on hardening the testing pipeline and improving data safety when interacting with Elasticsearch.

✨ Highlights

🔍 Integrated Elasticsearch Testing

Developers can now validate their clinical pipelines against a real, temporary Elasticsearch instance inside Docker. This replaces static mocks with actual search behavior.
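
As a rough illustration of how a test harness might wait for a temporary Elasticsearch container to come up before running tests, the sketch below polls an HTTP endpoint until it answers. The function name, the URL in the usage note, and the timeout values are assumptions for illustration and are not part of pat2vec.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url, timeout_s=60.0, interval_s=1.0):
    """Poll `url` until it returns HTTP 200 or `timeout_s` elapses.

    Returns True if the service answered in time, False otherwise.
    Hypothetical helper: not a pat2vec function.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval_s) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused, timeout, or HTTP error: not ready yet.
            pass
        time.sleep(interval_s)
    return False
```

A test setup step could call `wait_for_http("http://localhost:9200")` (assuming the container publishes Elasticsearch's default HTTP port) and abort the run if it returns False.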

🧪 Automated Synthetic Data Seeding

Adds utilities that generate realistic patient timelines and seed them into test clusters, with automated schema management via elastic_schemas.json.
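
To make the idea of seeding concrete, here is a minimal sketch of generating one synthetic patient timeline and wrapping it in bulk-style actions. Only client_idcode comes from these release notes; the function names, the index name, the other field names, and the placeholder values are hypothetical and do not reflect pat2vec's actual utilities or the contents of elastic_schemas.json.

```python
import random
from datetime import datetime, timedelta


def make_patient_timeline(client_idcode, n_events, start, rng):
    """Return n_events synthetic observation documents in chronological order."""
    docs = []
    current = start
    for _ in range(n_events):
        # Space events 1 to 30 days apart so the timeline is irregular.
        current += timedelta(days=rng.randint(1, 30))
        docs.append({
            "client_idcode": client_idcode,
            "observation_time": current.isoformat(),  # hypothetical field name
            "test_name": rng.choice(["haemoglobin", "creatinine", "sodium"]),  # hypothetical
            "value": round(rng.uniform(0.0, 200.0), 1),  # placeholder, not clinically calibrated
        })
    return docs


def to_bulk_actions(index_name, docs):
    """Wrap documents as {"_index", "_source"} action dicts for a bulk indexing helper."""
    return [{"_index": index_name, "_source": doc} for doc in docs]


rng = random.Random(42)  # fixed seed keeps test data reproducible
timeline = make_patient_timeline("P0001", 5, datetime(2020, 1, 1), rng)
actions = to_bulk_actions("synthetic_observations", timeline)
```

Seeding the random generator keeps the generated timelines stable between test runs, which makes failures easier to reproduce.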

🛡️ Data Ingestion Safety

Introduces strict guardrails that prevent accidental writes to production Elasticsearch clusters during testing or development runs.
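
One way such a guardrail could work is an allow-list check that runs before any write. The sketch below is an assumption about the general technique, not pat2vec's implementation; SAFE_HOSTS, UnsafeWriteError, and assert_safe_to_write are hypothetical names.

```python
from urllib.parse import urlparse

# Hypothetical allow-list: only local test clusters may be written to.
SAFE_HOSTS = frozenset({"localhost", "127.0.0.1", "::1"})


class UnsafeWriteError(RuntimeError):
    """Raised when a write targets a host outside the allow-list."""


def assert_safe_to_write(es_url, allow_hosts=SAFE_HOSTS):
    """Return the host if writes are allowed, otherwise raise UnsafeWriteError.

    URLs that cannot be parsed into a host fail closed (they are rejected).
    """
    host = urlparse(es_url).hostname
    if host not in allow_hosts:
        raise UnsafeWriteError(f"Refusing to write to non-test host: {host!r}")
    return host
```

Calling `assert_safe_to_write(url)` at the top of every seeding or indexing function means a misconfigured URL stops the run before any document is sent.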

🤖 CI/CD Enhancements

Adds full support for running GitHub Actions workflows locally (via act), making it easier to debug complex notebook-based tests before pushing.

Full Changelog: v0.2.0...v0.3.0

23 Mar 10:22

SamoraHunter

v0.2.0

92c96f3

Release v0.2.0

Database Backend Implementation

This release introduces a robust database backend using SQLAlchemy, which replaces the legacy file-based system as the default storage mechanism.

New Features

- Database Support: Added support for SQLite (default) and PostgreSQL.
  - Defaults to a local {project_name}.db SQLite database if no connection string is provided.
  - Supports in-memory SQLite for testing.
- Schema Management: The pipeline now handles automatic table creation and schema updates for:
  - Raw Data: raw_data tables (e.g., raw_data.raw_bloods).
  - Annotations: MedCAT annotation tables.
  - Features: Feature vectors with JSON serialization for sparse/high-dimensional data.
- Migration Utility: Added pat2vec/util/migrate_to_db.py to migrate existing file-based projects to the new database structure.
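
To show what JSON serialization of sparse feature vectors can look like, the sketch below stores only the non-zero entries of a feature dictionary in a text column. It uses the standard-library sqlite3 module as a stand-in for the SQLAlchemy backend; the table layout and the save_features and load_features helpers are hypothetical.

```python
import json
import sqlite3

# In-memory SQLite database, as is convenient for tests.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (client_idcode TEXT, feature_json TEXT)")


def save_features(conn, client_idcode, features):
    """Store a feature dict as JSON, keeping only non-zero entries."""
    sparse = {name: value for name, value in features.items() if value}
    conn.execute(
        "INSERT INTO features VALUES (?, ?)",
        (client_idcode, json.dumps(sparse, sort_keys=True)),
    )


def load_features(conn, client_idcode):
    """Load the stored feature dict; absent features are implicitly zero."""
    row = conn.execute(
        "SELECT feature_json FROM features WHERE client_idcode = ?",
        (client_idcode,),
    ).fetchone()
    return json.loads(row[0]) if row else {}


save_features(conn, "P0001", {"hb_mean": 13.2, "rare_concept_count": 0, "n_visits": 4})
restored = load_features(conn, "P0001")
```

Dropping zero-valued entries keeps each row small when a patient has values for only a few of many possible features.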

Configuration Changes

- Added storage_backend option to config_class (values: 'database', 'file').
- Added db_connection_string option to config_class.
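
The interaction between these two options and the SQLite default described above can be sketched as follows. StorageConfig is a hypothetical stand-in and not pat2vec's config_class; only the option names, their allowed values, and the {project_name}.db default come from these notes. The returned strings follow the SQLAlchemy database URL format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StorageConfig:
    """Hypothetical stand-in for the storage-related options of config_class."""

    project_name: str
    storage_backend: str = "database"  # 'database' or 'file'
    db_connection_string: Optional[str] = None

    def resolved_connection_string(self) -> Optional[str]:
        """Return the database URL to use, or None for the file backend."""
        if self.storage_backend not in ("database", "file"):
            raise ValueError(f"Unknown storage_backend: {self.storage_backend!r}")
        if self.storage_backend == "file":
            return None
        # Fall back to a local SQLite file named after the project.
        return self.db_connection_string or f"sqlite:///{self.project_name}.db"


default_url = StorageConfig("my_project").resolved_connection_string()
memory_url = StorageConfig(
    "my_project", db_connection_string="sqlite:///:memory:"
).resolved_connection_string()
```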

Technical Improvements

- Centralized Data Retrieval: Implemented get_df_from_db and updated retrieve_patient_data to abstract data access.
- Performance: Implemented batch insertion and automatic index creation on primary keys (e.g., client_idcode, timestamps) to improve query performance.
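
For readers unfamiliar with the two performance techniques named above, here is a minimal sketch using the standard-library sqlite3 module rather than SQLAlchemy. The table name echoes raw_bloods from these notes, but the columns other than client_idcode, the index name, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw_bloods (client_idcode TEXT, observed_at TEXT, value REAL)"
)

# Build 1000 sample rows spread across 100 hypothetical patients.
rows = [
    (f"P{i % 100:04d}", f"2020-01-{(i % 28) + 1:02d}", float(i))
    for i in range(1000)
]

# Batch insertion: one executemany call inside a single transaction
# instead of 1000 separately committed INSERT statements.
with conn:
    conn.executemany("INSERT INTO raw_bloods VALUES (?, ?, ?)", rows)

# Index the columns most lookups filter on.
conn.execute(
    "CREATE INDEX IF NOT EXISTS ix_raw_bloods_client_time "
    "ON raw_bloods (client_idcode, observed_at)"
)

patient_rows = conn.execute(
    "SELECT * FROM raw_bloods WHERE client_idcode = ?", ("P0001",)
).fetchall()
```

With the index in place, per-patient lookups no longer need to scan the whole table, which matters more as the raw data tables grow.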

22 Sep 21:33

SamoraHunter

v0.1.1

41e4a98

First public release of pat2vec

Full Changelog: https://github.com/SamoraHunter/pat2vec/commits/v0.1.1

Releases: SamoraHunter/pat2vec
