diff --git a/.github/workflows/coverage.yml b/.github/workflows/coverage.yml
index 7af5fa5..d01b071 100644
--- a/.github/workflows/coverage.yml
+++ b/.github/workflows/coverage.yml
@@ -31,6 +31,7 @@ jobs:

       - name: Upload coverage to Codecov
         uses: codecov/codecov-action@v4
+        if: github.event_name != 'pull_request' || github.event.pull_request.head.repo.full_name == github.repository
         with:
           files: lcov.info
           token: ${{ secrets.CODECOV_TOKEN }}
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 34d849a..4a88f73 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -292,9 +292,8 @@ When working on path calculation features (`tp-core/src/path/`):
 The path calculation module consists of several submodules:

 - **`candidate.rs`**: Find candidate netelements for each GNSS position
-- **`probability.rs`**: Calculate probabilities using distance and heading
-- **`construction.rs`**: Build paths forward and backward through network
-- **`selection.rs`**: Select best path from candidates
+- **`probability.rs`**: Calculate HMM-related probabilities (e.g., emission/transition) using distance, heading, and network context
+- **`viterbi.rs`**: Run the HMM/Viterbi algorithm to compute the most likely train path from the candidate sequences
 - **`graph.rs`**: Network topology graph operations
 - **`spacing.rs`**: GNSS resampling for consistent spacing

diff --git a/README.md b/README.md
index d6b51f2..7b3a896 100644
--- a/README.md
+++ b/README.md
@@ -9,7 +9,7 @@

 **Status**: Under construction

-Train positioning library excels in post-processing the GNSS positions of your measurement train to achieve an unambiguous network location. This library is your a map matching assistant specifically for railway.
+Train positioning library excels in post-processing the GNSS positions of your measurement train to achieve an unambiguous network location. This library is your map matching assistant specifically for railway.

 ## Features

@@ -52,7 +52,7 @@ tp-cli --gnss positions.csv --network network.geojson --train-path path.csv --ou
 ### Debug Output

 Pass `--debug` to write intermediate HMM calculation results as GeoJSON files to a `debug/` subdirectory next to the output file.
-See **[DEBUG.md](DEBUG.md)** for a full description of the four output files, their properties, and a typical debugging workflow.
+See **[DEBUG.md](DEBUG.md)** for a full description of the debug output files, their properties, and a typical debugging workflow.

 ### Algorithm Parameters

diff --git a/specs/002-train-path-calculation/algorithm.md b/specs/002-train-path-calculation/algorithm.md
index cda84a2..31b86fc 100644
--- a/specs/002-train-path-calculation/algorithm.md
+++ b/specs/002-train-path-calculation/algorithm.md
@@ -242,9 +242,9 @@ When **all** transition scores at a time-step `t` are `-∞` (no feasible transi
 3. For each current candidate `j` with non-zero emission: `log_V[t][j] = carry_score + ln(P_emission(t, j))`
 4. Set `backptr[t][j] = i*` so the backtrace follows the best previous state

-This produces a **single unbroken subsequence** for all GNSS input (the GNSS data represents one continuous drive). The heavy penalty ensures that carry-forward transitions are strongly disfavoured relative to genuine topological transitions, but the chain is never severed.
+This produces a **single unbroken subsequence** within a Viterbi processing window. The heavy penalty ensures that carry-forward transitions are strongly disfavoured relative to genuine topological transitions, but within a window the chain is never severed.

-**Important**: Because carry-forward preserves chain continuity, the backtrace always yields exactly one subsequence covering the entire GNSS timeline.
+**Important**: Because carry-forward preserves chain continuity *within a window*, the backtrace for that window always yields exactly one subsequence covering the entire GNSS timeline of the window. Requirement **FR-027** (Viterbi break detection and subsequence reinitialization) is satisfied by higher-level control logic, which may terminate the current window and start a new one when configured break conditions are met; in that case, multiple subsequences exist across windows, while each individual window still uses the no-break penalty carry-forward scheme described here.

 ### Backtrace

diff --git a/specs/002-train-path-calculation/contracts/cli.md b/specs/002-train-path-calculation/contracts/cli.md
index 2c81840..e3fbdce 100644
--- a/specs/002-train-path-calculation/contracts/cli.md
+++ b/specs/002-train-path-calculation/contracts/cli.md
@@ -147,6 +147,12 @@ This is the **primary workflow** for path-based GNSS projection. It combines pat
 | `--format ` | auto | Output format: `csv`, `geojson`, or `auto` (detect from extension) |
 | `--save-path ` | None | Optionally save calculated path to file (in addition to projected coordinates) |

+**Input Options:**
+
+| Option | Default | Description |
+|--------|---------|-------------|
+| `--crs ` | None | Coordinate Reference System for GNSS input (e.g., `EPSG:4326`). Overrides CRS column in CSV file. |
+
 **General Options:**

 | Option | Description |
diff --git a/specs/002-train-path-calculation/contracts/lib-api.md b/specs/002-train-path-calculation/contracts/lib-api.md
index 623941a..b8196b1 100644
--- a/specs/002-train-path-calculation/contracts/lib-api.md
+++ b/specs/002-train-path-calculation/contracts/lib-api.md
@@ -46,7 +46,7 @@ pub struct PathConfig {
     /// Heading scale for exponential decay (default: 2.0 degrees)
     pub heading_scale: f64,

-    /// Maximum distance for candidate selection (default: 50.0 meters)
+    /// Maximum distance for candidate selection (default: 500.0 meters)
     pub cutoff_distance: f64,

     /// Maximum heading difference before rejection (default: 10.0 degrees)
diff --git a/specs/002-train-path-calculation/spec.md b/specs/002-train-path-calculation/spec.md
index 3779d95..cc4f843 100644
--- a/specs/002-train-path-calculation/spec.md
+++ b/specs/002-train-path-calculation/spec.md
@@ -152,7 +152,7 @@ A developer troubleshooting path calculation issues exports intermediate results

 ### Session 2026-01-08

-- Q: When GNSS coordinates fall outside the configured cutoff distance (default 50m) from all track segments during the projection phase (after path is calculated), how should the system handle these outliers? → A: Exclude outlier coordinates from output entirely (omit from results file). Future feature will address better handling.
+- Q: When GNSS coordinates fall outside the configured cutoff distance (default 500m) from all track segments during the projection phase (after path is calculated), how should the system handle these outliers? → A: Exclude outlier coordinates from output entirely (omit from results file). Future feature will address better handling.
 - Q: How should the distance between a GNSS coordinate and a candidate netelement be factored into the probability calculation? → A: Inverse exponential decay based on both distance (e.g., e^(-distance/scale)) and heading difference (e.g., e^(-heading_diff/scale))
 - Q: When multiple candidate paths have identical probability scores (after forward/backward averaging), which path should be selected? → A: Select the first path found during calculation (arbitrary but deterministic)
 - Q: When a pre-calculated train path is provided as input (FR-041), what format should the system expect? → A: Same format as path-only export: CSV or GeoJSON with ordered AssociatedNetElements
@@ -173,7 +173,7 @@ A developer troubleshooting path calculation issues exports intermediate results

 ### Edge Cases

-- GNSS coordinates more than the configured cutoff distance (default 50m) from any track segment are excluded from output (omitted from results)
+- GNSS coordinates more than the configured cutoff distance (default 500m) from any track segment are excluded from output (omitted from results)
 - NetRelations where elementA equals elementB (self-referencing) are skipped with warnings logged
 - NetRelations referencing non-existent netelement IDs are skipped with warnings logged; segments with only invalid netrelations are treated as isolated
 - What happens when a track segment has no netrelations connecting it to other segments (isolated segment)?
@@ -216,7 +216,7 @@ A developer troubleshooting path calculation issues exports intermediate results
 - **FR-015**: The calculated train path MUST be continuous (each segment connects to the next via a netrelation)
 - **FR-016**: All netrelations between consecutive segments in the path MUST have navigability in the direction of travel (not "none" or opposing direction)
 - **FR-017**: System MUST find at most N nearest netelements for each GNSS coordinate (where N is configurable, default 3)
-- **FR-018**: System MUST only consider netelements within a configurable cutoff distance (default 50 meters) from each GNSS coordinate
+- **FR-018**: System MUST only consider netelements within a configurable cutoff distance (default 500 meters) from each GNSS coordinate
 - **FR-018a**: System MUST exclude from output any GNSS coordinates that are more than the cutoff distance from all track segments in the calculated path
 - **FR-019**: System MUST calculate probability for each candidate netelement using inverse exponential decay for both distance (e.g., e^(-distance/distance_scale)) and heading alignment (e.g., e^(-heading_difference/heading_scale)), with the overall probability being the product of distance and heading probability factors
 - **FR-020**: System MUST set probability to 0 when heading difference between GNSS coordinate and netelement exceeds configurable cutoff (default 10 degrees), overriding exponential decay calculation
@@ -324,7 +324,7 @@ A developer troubleshooting path calculation issues exports intermediate results
 The following configuration parameters are referenced in the requirements with default values:

 - **Max nearest netelements**: Default 3 — maximum number of candidate track segments considered for each GNSS coordinate
-- **Distance cutoff**: Default 50 meters — maximum distance from GNSS coordinate to consider a track segment as candidate
+- **Distance cutoff**: Default 500 meters — maximum distance from GNSS coordinate to consider a track segment as candidate
 - **Heading difference cutoff**: Default 10 degrees — maximum heading misalignment before emission probability is set to 0
 - **Minimum probability threshold**: Default 2% — minimum emission probability for segment inclusion
 - **Resampling distance**: Default 10 meters — target spacing between GNSS coordinates used for path calculation
diff --git a/specs/002-train-path-calculation/tasks.md b/specs/002-train-path-calculation/tasks.md
index 63fe7ef..a8b541c 100644
--- a/specs/002-train-path-calculation/tasks.md
+++ b/specs/002-train-path-calculation/tasks.md
@@ -133,18 +133,17 @@
 - [X] T063 [US1] Write unit test for consecutive position identification in tests/unit/path_probability_test.rs
 - [X] T064 [US1] Write unit test for coverage factor calculation in tests/unit/path_probability_test.rs

-#### Phase 4: Path Construction (Bidirectional)
+#### Phase 4: Path Decoding (HMM / Viterbi)

 - [X] T065 [P] [US1] Create tp-core/src/path/construction.rs module file
-- [X] T066 [P] [US1] Implement construct_forward_path() starting from highest probability netelement at first position
-- [X] T067 [P] [US1] Implement construct_backward_path() starting from highest probability netelement at last position
-- [X] T068 [US1] Implement graph traversal with navigability constraints using petgraph neighbors()
-- [X] T069 [US1] Implement probability threshold filtering (default 25%, except when only navigable option)
-- [X] T070 [US1] Implement path reversal for backward path (reverse segment order + swap intrinsic coordinates)
-- [X] T071 [US1] Implement bidirectional validation comparing forward and reversed backward paths
-- [X] T072 [US1] Write unit test for forward path construction in tests/unit/path_construction_test.rs
-- [X] T073 [US1] Write unit test for backward path construction and reversal in tests/unit/path_construction_test.rs
-- [X] T074 [US1] Write unit test for bidirectional agreement detection in tests/unit/path_construction_test.rs
+- [ ] T066 [P] [US1] Implement candidate_netelements_for_positions() to select candidate netelements per GNSS position (top-N by probability with navigability constraints)
+- [ ] T067 [P] [US1] Implement emission_probability() functions using per-position/per-netelements probability components (position, heading, coverage)
+- [ ] T068 [US1] Implement transition_probability() modeling between consecutive candidate netelements using topology connectivity and direction of travel
+- [ ] T069 [US1] Implement viterbi_decode_path() HMM decoder over the candidate lattice to select the most probable netelement sequence
+- [ ] T070 [US1] Implement insert_bridge_segments() to add required connecting segments between chosen netelements based on network topology
+- [ ] T071 [US1] Implement calculate_train_path() orchestrator calling candidate selection, emission/transition probability, Viterbi decoding, and bridge insertion
+- [ ] T072 [US1] Write unit tests for viterbi_decode_path() in tests/unit/path_construction_test.rs
+- [ ] T073 [US1] Write unit tests for insert_bridge_segments() and calculate_train_path() in tests/unit/path_construction_test.rs

 #### Phase 5: Path Selection

diff --git a/test-data/README.md b/test-data/README.md
index cc93e4f..dcfc0e4 100644
--- a/test-data/README.md
+++ b/test-data/README.md
@@ -543,7 +543,7 @@ Good result:
 target/release/tp-cli.exe --gnss test-data/log_29584/log_29584_L36-A_to_L36C-A_to_L25N-B.csv --crs EPSG:4326 --network test-data/network_airport.geojson --output test-data/log_29584/log_29584_L36-A_to_L36C-A_to_L25N-B-path-projection.geojson
 ```

-Good result (and proves the need to have longitudal redistribution of the gnss positions):
+Good result (and proves the need to have longitudinal redistribution of the gnss positions):

 ![L36-A to L36C-A to L25N-B (log_29584) - Path projection](log_29584/log_29584_L36-A_to_L36C-A_to_L25N-B-path-projection.png)

@@ -640,7 +640,7 @@ Zoom on detail:
 target/release/tp-cli.exe --gnss test-data/log_28586/log_28586_L36-A_to_L36C-A_to_L25N-B-very-bad.csv --crs EPSG:4326 --network test-data/network_airport.geojson --output test-data/log_28586/log_28586_L36-A_to_L36C-A_to_L25N-B-very-bad-path-projection.geojson
 ```

-Expected result, again showing the need to also perform longitudal post processing:
+Expected result, again showing the need to also perform longitudinal post processing:

 ![L36-A to L36C-A to L25N-B very bad GNSS - Path projection](log_28586/log_28586_L36-A_to_L36C-A_to_L25N-B-path-projection.png)
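
Review note: the emission rule stated in FR-019 and FR-020 of `spec.md` can be sketched in Rust as below. This is a minimal illustration under stated assumptions, not the code in `tp-core`. The struct `EmissionParams`, the function signature, and the field names `distance_scale` and `heading_cutoff` are hypothetical (only `heading_scale` and `cutoff_distance` appear in the `PathConfig` excerpt of this diff). The candidate filter of FR-018 is assumed to have run beforehand, and the coverage component mentioned in T067 is left out.

```rust
/// Hypothetical parameter bundle for this sketch only.
/// `distance_scale` and `heading_cutoff` are assumed field names.
#[derive(Debug, Clone, Copy)]
pub struct EmissionParams {
    /// Distance scale for exponential decay, in meters (no default stated in this diff).
    pub distance_scale: f64,
    /// Heading scale for exponential decay, in degrees (default 2.0 per lib-api.md).
    pub heading_scale: f64,
    /// Maximum heading difference before rejection, in degrees (default 10.0).
    pub heading_cutoff: f64,
}

/// Emission probability for one candidate netelement.
///
/// FR-020: the probability is 0 when the heading difference exceeds the cutoff.
/// FR-019: otherwise it is the product of two inverse exponential decay factors,
/// e^(-distance/distance_scale) * e^(-heading_difference/heading_scale).
pub fn emission_probability(distance_m: f64, heading_diff_deg: f64, p: &EmissionParams) -> f64 {
    // Treat the heading difference as a magnitude so the sign convention does not matter.
    let heading = heading_diff_deg.abs();
    if heading > p.heading_cutoff {
        return 0.0;
    }
    let distance_factor = (-distance_m / p.distance_scale).exp();
    let heading_factor = (-heading / p.heading_scale).exp();
    distance_factor * heading_factor
}
```

With an illustrative `distance_scale` of 10.0 and the documented `heading_scale` of 2.0, a candidate 10 m away with a 2 degree heading difference scores e^-1 · e^-1 = e^-2, roughly 0.135, while any candidate more than 10 degrees off scores exactly 0.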