refactor(extractors): parser abstraction layer (Phase 7.1) #700
carlos-alm merged 3 commits into main
Extract shared patterns from 9 language extractors into 4 reusable
helpers in helpers.ts, reducing per-language boilerplate by ~30 lines:
- findParentNode: replaces 6 findParent*/findCurrentImpl functions
- extractBodyMembers: replaces 5 body-iteration patterns for enums/structs
- stripQuotes: replaces inline .replace(/"/g,'') across 3 extractors
- lastPathSegment: replaces inline .split('.').pop() across 6 extractors
Net: +77 helper lines, -159 extractor lines = -82 lines total.
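As a rough illustration of the path-splitting helper listed above, lastPathSegment could be sketched as follows. Only the name and the (path, separator?) shape appear in this PR; the default separator of '.' and the fallback to the full input are assumptions made for the example, not the merged implementation.

```typescript
// Hypothetical sketch of lastPathSegment; the default separator is an assumption.
function lastPathSegment(path: string, separator: string = '.'): string {
  const segments = path.split(separator);
  // split() always yields at least one element; the fallback only satisfies strict index checks.
  return segments[segments.length - 1] ?? path;
}

// Stand-ins for the inline .split('.').pop() calls the PR description mentions.
const javaImport = lastPathSegment('java.util.List');                  // "List"
const rustUse = lastPathSegment('std::collections::HashMap', '::');    // "HashMap"
const goImport = lastPathSegment('github.com/user/repo', '/');         // "repo"
console.log(javaImport, rustUse, goImport);
```

Passing the separator explicitly lets one helper cover dotted, '::'-separated, and slash-separated paths, which is presumably why the signature exposes it.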
Claude finished @carlos-alm's task

PR Review: refactor(extractors): parser abstraction layer (Phase 7.1) ⭐⭐⭐⭐☆

Root Cause Analysis
Problem Being Solved: Code duplication across language extractors, including 6 separate findParent*/findCurrentImpl functions.
Why This Approach: Consolidating into 4 reusable helpers reduces boilerplate from ~200 to ~80 lines per extractor and directly implements ROADMAP Phase 7.1 to prepare for adding 23 new languages.
Risk Assessment: ✅ LOW - Pure refactoring with no functional changes.

Backlog Compliance
Configuration & Test Integrity ✅
Code Quality Assessment ✅
Excellent abstraction design.
Comprehensive refactoring: Found 25 references to old patterns across 7 files, which confirms complete scope.

Technical Analysis
Net Impact: -82 lines (+77 helper, -159 boilerplate)
Quality Indicators:

Final Recommendation
This is exactly the type of foundational work needed before adding 23 new languages. Well executed.
Greptile Summary
This PR implements ROADMAP Phase 7.1, extracting four reusable helpers (findParentNode, extractBodyMembers, stripQuotes, lastPathSegment) into src/extractors/helpers.ts.
Confidence Score: 5/5
Safe to merge: clean refactor with no regressions, all 272 tests pass, and prior review concerns are resolved.
No P0 or P1 issues found. The single P2 comment is a documentation suggestion for extractBodyMembers's double-dispatch body lookup; the behaviour itself is harmless and fully covered by the existing test suite. All previously raised concerns (anchored stripQuotes semantics, Rust self_parameter comment) have been addressed.
No files require special attention; src/extractors/helpers.ts is the logical hub but is straightforward and well-documented.
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
subgraph helpers["src/extractors/helpers.ts (new exports)"]
FPN["findParentNode(node, typeNames, nameField?)"]
EBM["extractBodyMembers(containerNode, bodyFields, memberType, kind, ...)"]
SQ["stripQuotes(text)"]
LPS["lastPathSegment(path, separator?)"]
end
subgraph extractors["Language Extractors (consumers)"]
JS["javascript.ts\nfindParentClass → findParentNode"]
PY["python.ts\nfindPythonParentClass → findParentNode"]
JAVA["java.ts\nfindJavaParentClass → findParentNode\nextractEnumConstants → extractBodyMembers\nlastPathSegment (import paths)"]
CS["csharp.ts\nfindCSharpParentType → findParentNode\nextractCSharpEnumMembers → extractBodyMembers\nlastPathSegment (using directives)"]
RUBY["ruby.ts\nfindRubyParentClass → findParentNode\nstripQuotes + lastPathSegment (require)"]
RUST["rust.ts\nfindCurrentImpl → findParentNode\nextractStructFields → extractBodyMembers\nextractEnumVariants → extractBodyMembers\nlastPathSegment (use paths)"]
PHP["php.ts\nextractPhpEnumCases → extractBodyMembers\nlastPathSegment (namespace use)"]
GO["go.ts\nstripQuotes + lastPathSegment (imports)"]
HCL["hcl.ts\nstripQuotes (block names, module source)"]
end
FPN --> JS
FPN --> PY
FPN --> JAVA
FPN --> CS
FPN --> RUBY
FPN --> RUST
EBM --> JAVA
EBM --> CS
EBM --> RUST
EBM --> PHP
SQ --> RUBY
SQ --> GO
SQ --> HCL
LPS --> JAVA
LPS --> CS
LPS --> RUBY
LPS --> RUST
LPS --> PHP
LPS --> GO
Reviews (2): Last reviewed commit: "fix: address Greptile review comments (#..."
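To make the findParentNode(node, typeNames, nameField?) box in the flowchart more concrete, here is a minimal sketch of how such a helper could walk up a syntax tree. The SyntaxNodeLike interface, the makeNode test helper, the default nameField of 'name', and the choice to return the parent's name as a string are all assumptions for illustration; the real helper in src/extractors/helpers.ts may differ.

```typescript
// Hypothetical minimal node shape, loosely modelled on a Tree-sitter syntax node.
interface SyntaxNodeLike {
  type: string;
  text: string;
  parent: SyntaxNodeLike | null;
  childForFieldName(name: string): SyntaxNodeLike | null;
}

// Sketch: climb the ancestors until one matches typeNames, then return its name field's text.
function findParentNode(
  node: SyntaxNodeLike,
  typeNames: readonly string[],
  nameField: string = 'name', // assumed default
): string | null {
  let current = node.parent;
  while (current) {
    if (typeNames.includes(current.type)) {
      const nameNode = current.childForFieldName(nameField);
      return nameNode ? nameNode.text : null;
    }
    current = current.parent;
  }
  return null; // no enclosing node of the requested types
}

// Hypothetical helper for building a tiny tree by hand.
function makeNode(
  type: string,
  text: string,
  parent: SyntaxNodeLike | null = null,
  fields: Record<string, SyntaxNodeLike> = {},
): SyntaxNodeLike {
  return { type, text, parent, childForFieldName: (name) => fields[name] ?? null };
}

const classNameNode = makeNode('identifier', 'Widget');
const classNode = makeNode('class_declaration', 'class Widget { render() {} }', null, { name: classNameNode });
const bodyNode = makeNode('class_body', '{ render() {} }', classNode);
const methodNode = makeNode('method_definition', 'render() {}', bodyNode);

console.log(findParentNode(methodNode, ['class_declaration'])); // "Widget"
```

Under these assumptions, each language extractor would only need to supply its own list of container node types, which matches the one-line replacements shown in the flowchart.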
if (param.type === 'self_parameter') {
  // Skip self parameters — matches native engine behaviour
  continue;
} else if (param.type === 'parameter') {

Empty if block obscures intent
The comment // Skip self parameters — matches native engine behaviour and continue; were removed together, leaving an empty if branch. While this is functionally correct (nothing else follows in the loop body, so the else if is simply skipped), the empty block is a code smell and a future reader has no signal that skipping self_parameter is deliberate. Consider either keeping a comment or inverting to a guard clause:
Suggested change:
- if (param.type === 'self_parameter') {
-   // Skip self parameters — matches native engine behaviour
-   continue;
- } else if (param.type === 'parameter') {
+ if (param.type === 'self_parameter') {
+   // Skip self — matches native engine behaviour
+ } else if (param.type === 'parameter') {
Fixed — restored the explanatory comment in the empty self_parameter if-block: // Skip self — matches native engine behaviour.
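For comparison, the guard-clause alternative mentioned in the review comment could look roughly like the sketch below. The ParamNode shape and the collectParameterNames function are hypothetical stand-ins; the actual loop in rust.ts is not shown in this thread.

```typescript
// Hypothetical parameter node shape for illustration only.
interface ParamNode {
  type: string;
  name: string;
}

// Sketch: collect names of ordinary parameters, skipping Rust's self receiver.
function collectParameterNames(params: ParamNode[]): string[] {
  const names: string[] = [];
  for (const param of params) {
    // Guard clause: self parameters are skipped deliberately, up front.
    if (param.type === 'self_parameter') continue;
    if (param.type === 'parameter') names.push(param.name);
  }
  return names;
}

const names = collectParameterNames([
  { type: 'self_parameter', name: 'self' },
  { type: 'parameter', name: 'width' },
  { type: 'parameter', name: 'height' },
]);
console.log(names); // ["width", "height"]
```

Either form behaves the same here; the guard clause simply states the skip before any other branch, while the merged fix keeps the empty block and documents it with a comment.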
export function stripQuotes(text: string): string {
  return text.replace(/^['"`]|['"`]$/g, '');
}

stripQuotes semantics differ subtly from replaced HCL regex
The HCL extractor previously used .replace(/"/g, '') — a global replace that removes every double-quote character in the text. stripQuotes uses anchored replacements (^ / $) that strip only the leading and trailing character. For well-formed Tree-sitter string literal nodes (always delimited with a single pair of outer quotes) the result is identical, so this is not a bug in practice.
Worth noting as a potential surprise if someone later passes a node whose .text value contains interior quote characters. A short inline comment on the "anchored, not global" choice would clarify the intentional scope:
Suggested change:
- export function stripQuotes(text: string): string {
-   return text.replace(/^['"`]|['"`]$/g, '');
- }
+ export function stripQuotes(text: string): string {
+   // Strips only the leading/trailing delimiter; interior quotes are untouched.
+   return text.replace(/^['"`]|['"`]$/g, '');
+ }
Fixed — added clarifying JSDoc line: Strips only the leading/trailing delimiter; interior quotes are untouched.
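The anchored-versus-global difference discussed in this thread can be seen in a small side-by-side sketch. stripQuotes is copied from the diff above; stripAllDoubleQuotes is a hypothetical name for the old inline .replace(/"/g, '') call, and the sample strings are invented for the example.

```typescript
// Anchored version, as shown in the PR diff.
function stripQuotes(text: string): string {
  // Strips only the leading/trailing delimiter; interior quotes are untouched.
  return text.replace(/^['"`]|['"`]$/g, '');
}

// Hypothetical wrapper around the previous inline HCL pattern (global replace).
function stripAllDoubleQuotes(text: string): string {
  return text.replace(/"/g, '');
}

// Well-formed string literal: both approaches agree.
const wellFormed = '"aws_instance"';
console.log(stripQuotes(wellFormed));          // aws_instance
console.log(stripAllDoubleQuotes(wellFormed)); // aws_instance

// Interior quotes: the two approaches diverge.
const withInterior = '"say "hi""';
console.log(stripQuotes(withInterior));          // say "hi"
console.log(stripAllDoubleQuotes(withInterior)); // say hi
```

For the single-pair-delimited literals described in the review, the results match, so the change in semantics only matters for inputs with quotes inside the text.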
Add clarifying comment to empty self_parameter if-block in rust.ts and document anchored-vs-global semantics in stripQuotes JSDoc.
Summary
- Adds four shared helpers to src/extractors/helpers.ts: findParentNode, extractBodyMembers, stripQuotes, lastPathSegment
- Replaces 6 findParent* functions, 5 body-iteration patterns, and inline quote-stripping/path-splitting across 7 files
- Implements ROADMAP Phase 7.1: reduces per-language extractor boilerplate to prepare for Batch 1 language additions (C, C++, Kotlin, etc.)
Test plan
- Type check (tsc --noEmit)