Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -413,4 +413,6 @@ skips = ["*/domain/*"]
[dependency-groups]
dev = [
"bioetl[dev]",
"lxml>=6.0.2",
"pyyaml>=6.0.3",
Comment on lines +416 to +417
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟑 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check for lxml imports or usage in the codebase
echo "=== Searching for lxml imports ==="
rg -n "import lxml|from lxml" --type py

echo -e "\n=== Searching for lxml in requirements/configs ==="
rg -n "lxml" -g "*.toml" -g "*.txt" -g "*.cfg"

echo -e "\n=== Checking if any dev tool lists lxml as dependency ==="
# Check if mypy, basedpyright, or other type checkers mention lxml
rg -n "lxml" -g "*.md" -g "*.rst"

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 264


🏁 Script executed:

# First, check the size of pyproject.toml
wc -l pyproject.toml

# Then view the relevant sections
echo "=== Lines 10-30 (Python version and main dependencies) ==="
sed -n '10,30p' pyproject.toml

echo -e "\n=== Lines 410-425 (dev dependency-groups) ==="
sed -n '410,425p' pyproject.toml

echo -e "\n=== Check if 'bioetl' is the package name (line 1-5) ==="
head -5 pyproject.toml

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 1181


pyyaml is already a runtime dependency; addition here is redundant.

Line 23 declares pyyaml>=6.0 in the main dependencies section. Since bioetl[dev] (line 415) includes all runtime dependencies, re-specifying pyyaml>=6.0.3 in [dependency-groups].dev is unnecessary duplication.

lxml>=6.0.2 has no usage in the codebase. If it's required for type stubs or a specific dev tool, add a comment explaining its purpose. Otherwise, remove it.

Proposed fix
 [dependency-groups]
 dev = [
     "bioetl[dev]",
-    "lxml>=6.0.2",
-    "pyyaml>=6.0.3",
 ]
πŸ“ Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"lxml>=6.0.2",
"pyyaml>=6.0.3",
[dependency-groups]
dev = [
"bioetl[dev]",
]
πŸ€– Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` around lines 416 - 417, Remove the redundant pyyaml entry
from the [dependency-groups].dev list and either remove lxml if unused or add an
inline comment explaining why lxml>=6.0.2 is present for dev-only tooling;
specifically, delete "pyyaml>=6.0.3" (duplicate of the main
dependencies/pyyaml>=6.0) and either remove "lxml>=6.0.2" or annotate it in the
same dev group to state the intended dev-purpose (e.g., type stubs or a specific
linter/tool) so future readers know why the dev-only dependency exists.

]
2 changes: 1 addition & 1 deletion src/bioetl/domain/serialization.py
Original file line number Diff line number Diff line change
Expand Up @@ -266,7 +266,7 @@ def is_complex_type(field_type: pa.DataType) -> bool:
def serialize_column_to_json(col: pa.ChunkedArray) -> pa.Array:
"""Serialize a complex Arrow column into stringified JSON values."""
vals = [
serialize_to_json(v.as_py()) if v.as_py() is not None else None for v in col
serialize_to_json(val) if (val := v.as_py()) is not None else None for v in col
]
return pa.array(vals, type=pa.string())

Expand Down
5 changes: 5 additions & 0 deletions tests/architecture/test_config_ci_invariants.py
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,11 @@
from bioetl.infrastructure.config.contract_policy_loader import (
load_pipeline_contract_policy,
)

import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).resolve().parent.parent.parent))
Comment on lines +46 to +48
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether path management is already centralized and how widespread per-file mutations are.
fd -i 'pyproject.toml' -i 'pytest.ini' -i 'tox.ini' -i 'conftest.py'
rg -n --type=py 'sys\.path\.insert\(' tests
rg -n 'pythonpath|pytest\.ini_options|tool\.pytest' pyproject.toml pytest.ini tox.ini 2>/dev/null || true

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 1007


Remove module-level sys.path mutation; centralized path configuration exists in pyproject.toml.

Line 48 mutates global import resolution at import time via sys.path.insert(). This pattern is fragile, order-dependent, and unnecessaryβ€”pyproject.toml already configures pythonpath = ["src"] in [tool.pytest.ini_options] (line 151). Relying on per-file path mutations instead of the centralized pytest configuration undermines the benefit of having it.

This pattern appears across the test suite in multiple files:

  • tests/unit/scripts/qa/test_report_duplication_baseline.py:7
  • tests/unit/scripts/ci/test_quality_integral_gate.py:7
  • tests/architecture/test_diagram_bundle_generator_contracts.py:16
  • tests/architecture/test_diagram_description_index_contracts.py:16

Remove the import manipulation and rely on pytest's centralized path configuration.

Cleanup
-import sys
-from pathlib import Path
-sys.path.insert(0, str(Path(__file__).resolve().parent.parent.parent))
πŸ“ Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).resolve().parent.parent.parent))
πŸ€– Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/architecture/test_config_ci_invariants.py` around lines 46 - 48, Remove
the module-level sys.path mutation: delete the imports and the
sys.path.insert(...) call (the lines importing sys and Path and the
sys.path.insert(0, str(Path(__file__).resolve().parent.parent.parent))
statement) from this test file and the other listed tests; also remove any
now-unused Path or sys imports. Rely on the centralized pytest pythonpath
setting in pyproject.toml so the tests use normal absolute imports instead of
modifying sys.path at import time.


from scripts.schema import check_config_invariants as invariant_script
from scripts.schema.validate_pipeline_configs import _canonical_script

Expand Down
4 changes: 4 additions & 0 deletions tests/unit/scripts/ci/test_quality_integral_gate.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,10 @@

from __future__ import annotations

import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).resolve().parent.parent.parent.parent.parent))

Comment on lines +5 to +8
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, explore the repository structure to understand the layout
find . -type f -name "conftest.py" | head -20

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 479


🏁 Script executed:

# Check if the target file exists and read its content
if [ -f "tests/unit/scripts/ci/test_quality_integral_gate.py" ]; then
  head -20 "tests/unit/scripts/ci/test_quality_integral_gate.py"
else
  echo "File not found at expected path"
  # Try to find the file
  find . -name "test_quality_integral_gate.py" -type f
fi

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 826


🏁 Script executed:

# Search for other sys.path.insert patterns in test files
rg "sys\.path\.insert\|sys\.path\.append" tests/ --type py -n

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 63


🏁 Script executed:

# Check the root conftest.py
cat tests/conftest.py

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 12687


🏁 Script executed:

# Check for pytest configuration files
ls -la | grep -E "pytest\.ini|pyproject\.toml|setup\.cfg"

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 147


🏁 Script executed:

# Check the directory structure relative to the test file to understand the import path
ls -R tests/unit/scripts/ci/ | head -30

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 134


🏁 Script executed:

# Check where the scripts.ci module is located
find . -path "*scripts/ci*" -type f | grep -v __pycache__ | head -20

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 577


🏁 Script executed:

# Check pyproject.toml for pytest configuration
cat pyproject.toml | grep -A 20 "\[tool.pytest"

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 715


🏁 Script executed:

# Also check what the actual import structure looks like - where is scripts relative to src?
ls -la | head -20

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 1305


🏁 Script executed:

# Verify the exact import path the test needs
head -20 scripts/ci/quality_integral_gate.py

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 587


Remove the per-module sys.path injection and use centralized pytest configuration instead.

The sys.path.insert(0, ...) on line 7 mutates the interpreter-wide import precedence during module load, which is brittle and can cause cross-test interference. The repository's pyproject.toml already configures pythonpath = ["src"], so add the repository root to this list to make scripts.ci importable from all tests:

pythonpath = ["src", "."]

Alternatively, configure the path in the root tests/conftest.py fixture to avoid per-file duplication.

πŸ€– Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/scripts/ci/test_quality_integral_gate.py` around lines 5 - 8, The
test file test_quality_integral_gate.py mutates interpreter import precedence
via sys.path.insert(0, str(Path(...))) β€” remove that per-module injection and
rely on centralized configuration: delete the sys.path.insert call from the
module and either add the repository root to the existing pythonpath array in
pyproject.toml (e.g., pythonpath = ["src", "."]) so scripts.ci is importable, or
add a root-level tests/conftest.py fixture that appends the repo root to
sys.path for all tests; ensure no other tests keep per-file sys.path hacks and
run the test suite to verify imports resolve.

from scripts.ci.quality_integral_gate import ArchitectureTestStats
from scripts.ci.quality_integral_gate import (
TestHealthClassification as HealthClassification,
Expand Down
4 changes: 4 additions & 0 deletions tests/unit/scripts/qa/test_report_duplication_baseline.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,10 @@
from __future__ import annotations

import re
import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).resolve().parent.parent.parent.parent.parent))
Comment on lines +4 to +7
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -name "test_report_duplication_baseline.py" -type f

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 140


🏁 Script executed:

fd -type f -name "conftest.py"

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 251


🏁 Script executed:

rg "sys\.path\.insert" tests/

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 682


🏁 Script executed:

git ls-files tests/ | head -20

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 1014


🏁 Script executed:

cat -n ./tests/unit/scripts/qa/test_report_duplication_baseline.py | head -20

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 1119


🏁 Script executed:

find tests/unit -name "conftest.py"

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 187


🏁 Script executed:

cat -n tests/conftest.py 2>/dev/null || cat -n tests/__init__.py 2>/dev/null || echo "No conftest or __init__ at tests/ level"

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 15949


🏁 Script executed:

ls -la | grep -E "pytest|pyproject|setup"

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 211


🏁 Script executed:

cat pyproject.toml 2>/dev/null | grep -A 10 "\[tool.pytest"

Repository: SatoryKono/BioactivityDataAcquisition

Length of output: 384


Remove the sys.path.insert() callβ€”pytest's pythonpath = ["src"] in pyproject.toml already provides centralized import resolution.

The sys.path mutation at line 7 is redundant because pytest is already configured with pythonpath = ["src"] to handle imports from the source directory. The hardcoded parent traversal (5 levels up) is brittle and tightly couples the test to the directory structure. This pattern is repeated across multiple test files (test_quality_integral_gate.py, test_config_ci_invariants.py, etc.), indicating a systematic issue. Remove this code and rely on pytest's centralized configuration instead.

πŸ€– Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/scripts/qa/test_report_duplication_baseline.py` around lines 4 -
7, Remove the hardcoded test import hack: delete the sys.path.insert(0,
str(Path(__file__).resolve().parent.parent.parent.parent.parent)) call (and the
now-unused import sys and/or Path if they become unused) from the test (the
symbols to remove are sys.path.insert and the Path(__file__).resolve().parent...
expression); rely on pytest's pythonpath = ["src"] in pyproject.toml for import
resolution instead.


from scripts.qa.report_duplication_baseline import _build_payload
from scripts.qa.report_duplication_baseline import _build_trend_summary
Expand Down
Loading
Loading