fix: line-count score configured null-language extensions instead of skipping them #1275
Open
MkDev11 wants to merge 1 commit into
…skipping them

Extensions in programming_languages.json with no language field (no tree-sitter parser) were silently scored as skipped-unsupported because the routing gate in calculate_token_score_from_file_changes only checked the NON_CODE_EXTENSIONS constant, not the JSON config.

Affected extensions: env, gitattributes, gitignore, gql, graphql, ipynb, move, nim, pde, puml, tex.

Fix:
- Add TokenConfig.is_line_count_extension() that returns True when an extension is in NON_CODE_EXTENSIONS OR is configured in programming_languages.json with language=None.
- Replace the "ext in NON_CODE_EXTENSIONS" gate with weights.is_line_count_extension(ext).
- Remove the now-unused NON_CODE_EXTENSIONS import from tree_sitter_scoring.

Tests:
- Add TestNullLanguageLineCountScoring with regressions for graphql and gitignore (positive), an unknown extension (negative control), a graphql weight-accuracy pin, and a config-coverage guard that iterates every null-language entry in programming_languages.json and asserts it reaches line-count, catching future config/code drift before it zeroes miner scores.
Summary
Extensions configured in `programming_languages.json` with no `language` field receive `score=0.0` with `scoring_method='skipped-unsupported'`. The scorer only routed a file to line-count scoring when its extension appeared in the `NON_CODE_EXTENSIONS` constant; it never consulted the JSON config. The 11 affected extensions are: `env`, `gitattributes`, `gitignore`, `gql`, `graphql`, `ipynb`, `move`, `nim`, `pde`, `puml`, `tex`.

Root cause:
`calculate_token_score_from_file_changes` gated line-count scoring on `ext in NON_CODE_EXTENSIONS`. A configured null-language extension not in that list fell through to `skipped-unsupported`. Adding a weight to the JSON had no effect unless `NON_CODE_EXTENSIONS` was also updated, which made it a silent maintenance trap.

Fix:
- Add `TokenConfig.is_line_count_extension()`, which returns `True` when the extension is in `NON_CODE_EXTENSIONS` or is configured in `programming_languages.json` with `language=None`.
- Replace the `ext in NON_CODE_EXTENSIONS` gate with `weights.is_line_count_extension(ext)`.
- Remove the now-unused `NON_CODE_EXTENSIONS` import from `tree_sitter_scoring.py`.

This is the null-language complement of #986, which fixed extensionless configured files that do have a tree-sitter language.
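For reviewers, the routing change can be illustrated with a minimal, self-contained sketch. It is not the code in this PR: `LanguageEntry`, the `scoring_method` helper, the contents of `NON_CODE_EXTENSIONS`, the `py` entry, and the `'tree-sitter'` label are all hypothetical stand-ins. Only the `is_line_count_extension` name, the `line-count` and `skipped-unsupported` labels, and the "constant OR configured with `language=None`" rule come from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical stand-in; the real constant's contents are not shown in this PR.
NON_CODE_EXTENSIONS = frozenset({"md", "txt"})


@dataclass(frozen=True)
class LanguageEntry:
    """Hypothetical model of one programming_languages.json entry."""
    weight: float
    language: Optional[str] = None  # None means no tree-sitter parser


@dataclass
class TokenConfig:
    """Simplified stand-in for the real TokenConfig."""
    languages: Dict[str, LanguageEntry] = field(default_factory=dict)

    def is_line_count_extension(self, ext: str) -> bool:
        # Route to line-count when the extension is in the constant OR is
        # configured in the JSON with language=None.
        if ext in NON_CODE_EXTENSIONS:
            return True
        entry = self.languages.get(ext)
        return entry is not None and entry.language is None


def scoring_method(weights: TokenConfig, ext: str) -> str:
    """Hypothetical helper that mirrors only the routing decision."""
    if weights.is_line_count_extension(ext):
        return "line-count"
    entry = weights.languages.get(ext)
    if entry is not None and entry.language is not None:
        return "tree-sitter"  # placeholder label, not taken from the PR
    return "skipped-unsupported"


# Assumed config: graphql has a weight but no language; py has both.
weights = TokenConfig(languages={
    "graphql": LanguageEntry(weight=1.0),
    "py": LanguageEntry(weight=1.0, language="python"),
})

print(scoring_method(weights, "graphql"))
print(scoring_method(weights, "xyz"))
```

Under these assumptions, `graphql` is routed to line-count scoring even though it is absent from the constant, while an unconfigured extension such as `xyz` still falls through to `skipped-unsupported`.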
Related Issues
Closes #1274
Type of Change
Testing
`TestNullLanguageLineCountScoring` in `tests/validator/test_token_scoring_integration.py`:
- `test_graphql_scores_line_count`: primary reproduction case; asserts `scoring_method='line-count'` and `score > 0`
- `test_gitignore_scores_line_count`: dotfile extension path
- `test_unknown_extension_stays_skipped_unsupported`: negative control; confirms the fix is bounded to configured extensions
- `test_graphql_score_uses_configured_weight`: pins the exact value: `graphql` weight 1.0 × 1 line = 1.0
- `test_all_configured_null_language_extensions_are_line_count_reachable`: config-coverage guard; iterates every null-language entry in the JSON and fails if any future addition silently produces `skipped-unsupported`

Checklist