Skip to content

Conversation

@missinglink
Copy link
Member

Looking over the code today I noticed that the query 'Am Falkplatz' (and friends) are relying on a StopWordClassification for matching.

This seems brittle, this PR replaces it with the more robust DirectionalClassification.

I've also removed am from the /de/stopwords.txt file as Am Falkplatz is a valid streetname in Germany, the preposition shouldn't be ignored.

@missinglink missinglink requested a review from Joxit September 9, 2024 09:52
@missinglink
Copy link
Member Author

missinglink commented Sep 9, 2024

In fact, it seems as though Am Falkplatz wasn't even relying on the scheme/street.js classification which bore its name.

Removing that blocks only makes one test fail: N FISKE AVE Portlan

@missinglink
Copy link
Member Author

missinglink commented Sep 9, 2024

Worth noting this regression I spotted which wasn't covered by tests:

node bin/cli.js 10 A Main Street

(0.98) ➜ [ { housenumber: '10' }, { street: 'A Main Street' } ]

After this PR there are no results returned.

The previous parse wasn't correct anyway, so I'm not sure where I stand on it 🤔

The derived PR #180 seems to fix it (the reason is the TokenDistanceFilter penalises it because the housenumber and street are too far apart).

Copy link
Member

@Joxit Joxit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@missinglink missinglink force-pushed the stopword-classification branch from 14c5be7 to bc2f73a Compare June 30, 2025 10:13
@missinglink missinglink merged commit 85314bf into master Jun 30, 2025
4 checks passed
@missinglink missinglink deleted the stopword-classification branch June 30, 2025 10:14
@missinglink
Copy link
Member Author

missinglink commented Jun 30, 2025

I merged this today, afterwards I confirmed that the query 10 A Main Street returns zero results (which wasn't covered by tests).

As I intended to merge #180 immediately afterwards, at that PR restores this query, and actually improves the parse, I'm hoping that it didn't break anything.

We need to keep an eye out on this one for any other regressions not covered by unit tests and consider reverting it, or fixing it, if other issues surface.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants