Skip to content

Conversation

@Ulthran
Copy link
Contributor

@Ulthran Ulthran commented Jan 15, 2026

Motivation

  • Production assemblies were often joining to isolates with blank or mismatched IDs and suspected_organism values, causing the UI to collapse many entries into an "Unknown" bucket instead of the real species.
  • Normalize and trim values at the API layer to make the /api/assemblies/metrics results robust to whitespace and empty-string differences between assemblies.isolate_id and isolates.sample_id.

Description

  • Add a labeled column suspected_organism = func.nullif(func.trim(Isolate.suspected_organism), "") and return it in the assembly metrics payload in app/app.py.
  • Change the outerjoin to match trim(Assembly.isolate_id) with trim(Isolate.sample_id) so joins succeed even when IDs have surrounding whitespace.
  • The changes are contained in app/app.py and only affect the /api/assemblies/metrics query projection.

Testing

  • Ran formatting with black . which reformatted app/app.py and completed successfully.
  • Ran pytest tests/ which collected 9 tests and produced 3 passed, 6 failed with failures due to sqlite3.OperationalError: no such table in the test environment, indicating missing DB tables rather than a regression in the changed code.
  • Manual verification steps to reproduce/verify against a real DB are described in code review notes (counts of empty suspected_organism and joins using trimmed IDs).

Codex Task

Copilot AI review requested due to automatic review settings January 15, 2026 19:24
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR normalizes the join condition and suspected_organism field in the /api/assemblies/metrics endpoint to handle whitespace and empty strings in isolate identifiers and organism names. This prevents production assemblies from incorrectly joining to isolates or displaying "Unknown" instead of actual species names.

Changes:

  • Wraps suspected_organism with func.nullif(func.trim(...), "") to normalize whitespace and convert empty strings to NULL
  • Updates the outerjoin condition to use func.trim() on both Assembly.isolate_id and Isolate.sample_id for robust matching

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +230 to +232
.outerjoin(
Isolate, func.trim(Assembly.isolate_id) == func.trim(Isolate.sample_id)
)
Copy link

Copilot AI Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The trimming logic applied to the join condition here is inconsistent with other places in the codebase. Line 167 performs a direct join without trimming (Assembly.isolate_id == isolate.sample_id). This inconsistency could lead to different behavior across endpoints. Consider either applying the same normalization in other join locations (like line 167) or documenting why this endpoint requires special handling.

Suggested change
.outerjoin(
Isolate, func.trim(Assembly.isolate_id) == func.trim(Isolate.sample_id)
)
.outerjoin(Isolate, Assembly.isolate_id == Isolate.sample_id)

Copilot uses AI. Check for mistakes.
Comment on lines +215 to +217
suspected_organism = func.nullif(func.trim(Isolate.suspected_organism), "").label(
"suspected_organism"
)
Copy link

Copilot AI Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new normalization logic for suspected_organism (trimming and converting empty strings to NULL) lacks test coverage. The existing test suite in tests/test_api.py does not include any tests for the /api/assemblies/metrics endpoint. Consider adding tests that verify the behavior when suspected_organism contains whitespace, empty strings, or valid values.

Copilot uses AI. Check for mistakes.
@Ulthran Ulthran closed this Jan 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants