Describe the problem
There is no reusable, infrastructure-agnostic Python module for computing the difference between two GTFS datasets. Any diff logic built inside `mobility-feed-api` risks being tightly coupled to MobilityData-specific concepts (feed IDs, GCS paths, database schemas), preventing reuse across projects or publication as a standalone community tool.
Proposed solution
Create a generic `gtfs_diff` Python package in a standalone GitHub repository. The package takes two GTFS dataset sources (URLs or local paths) and produces a structured changelog. It must have zero knowledge of MobilityData-specific concepts (no feed IDs, no GCS paths, no database).
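As a rough illustration of the row-level comparison this package would perform, here is a minimal sketch that uses only the standard library. The function names `read_table` and `diff_rows`, the single-column key, and the shape of the returned dictionary are assumptions made for this example, not the proposed package API. URL sources and composite keys (as in `stop_times.txt`) are left out.

```python
import csv
import io
import zipfile


def read_table(source, filename, key):
    """Read one file from a GTFS zip into a {key value: row} mapping.

    `source` is a local path or a binary file-like object. Downloading
    from a URL is intentionally omitted from this sketch.
    """
    with zipfile.ZipFile(source) as archive:
        if filename not in archive.namelist():
            # A file absent from the dataset is treated as having no rows.
            return {}
        with archive.open(filename) as raw:
            # utf-8-sig strips the byte-order mark that some feeds include.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return {row[key]: row for row in csv.DictReader(text)}


def diff_rows(old, new):
    """Classify rows as added, removed, or modified by comparing keys."""
    return {
        "added": [new[k] for k in new if k not in old],
        "removed": [old[k] for k in old if k not in new],
        "modified": [
            {"before": old[k], "after": new[k]}
            for k in old
            if k in new and old[k] != new[k]
        ],
    }
```

Calling `read_table` once per dataset for a given file (for example `stops.txt` keyed on `stop_id`) and passing both results to `diff_rows` yields the changes for that file. Nothing in the sketch refers to feed IDs, storage paths, or a database, which is the constraint the proposal sets.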
Acceptance criteria:
- `GtfsDiff.compute()` correctly identifies added, removed, and modified rows across all supported GTFS files
- The package is installable with `pip install -e .`
- `README.md` includes install instructions and a usage example
Alternatives considered
- Inline the diff logic inside `gtfs-change-tracker`: creates tight coupling and prevents reuse as a standalone package.
- Keep the module inside `mobility-feed-api`: still couples the logic to MobilityData infrastructure and makes PyPI publication awkward.