Skip to content

refactor: Titan audit — decompose, reduce complexity, remove dead code#699

Merged
carlos-alm merged 40 commits intomainfrom
release/3.5.0
Mar 30, 2026
Merged

refactor: Titan audit — decompose, reduce complexity, remove dead code#699
carlos-alm merged 40 commits intomainfrom
release/3.5.0

Conversation

@carlos-alm
Copy link
Copy Markdown
Contributor

Summary

Full Titan pipeline audit of the codegraph codebase (v3.5.0). 122 files audited across 13 domains, 34 targets addressed in 32 commits.

  • Dead code removal: Removed unused exports from shared types, normalize utilities, and database connection layer
  • Shared abstraction extraction: Extracted common helpers for native Rust extractors (helpers.rs), WASM extractor visitor utilities, and analysis query-building patterns
  • Function decomposition: Decomposed the worst complexity offenders including makePartition (MI 5 -> 13.4), walk_node_depth across 9 native extractors, build_call_edges, MCP server, graph builder finalize stage, AST engine, and presentation formatters
  • Fail-level fixes: Reduced complexity in parser dispatch, WASM extractors, domain analysis, graph builder pipeline, AST engine, features (cfg/dataflow/check), and native engine (roles_db, HCL)
  • Warn-level improvements: Addressed warnings in shared/types, domain/presentation, and infrastructure/features layers
  • Build fix: Resolved noUncheckedIndexedAccess errors and exported missing interface types

Metrics

Metric Before After Delta
Quality Score 65 67 +2
Functions Above Threshold 50 48 -2
Function-Level Cycles 9 6 -3
Min Maintainability Index 5.0 13.4 +8.4
Total Symbols 11672 12628 +956 (decomposition)

All 2131 tests pass. 22 gate validations: 14 pass, 8 warn, 0 fail, 0 rollbacks.

Titan Audit Context

  • Pipeline: RECON -> GAUNTLET -> SYNC -> FORGE -> GATE -> CLOSE
  • Targets audited: 122 (41 pass, 26 warn, 25 fail, 30 decompose)
  • Forge phases: 5 (dead code, abstractions, decompositions, fail fixes, warn improvements)
  • Report: generated/titan/titan-report-v3.5.0-2026-03-30T03-04-14.md

Test plan

  • All 2131 tests pass
  • Build succeeds (tsup + tsc)
  • Lint passes (biome)
  • 22 gate validations passed during forge
  • CI passes
  • codegraph check --cycles --boundaries passes

Extract hasFuncBody, setupAstVisitor, setupComplexityVisitorForFile,
and setupCfgVisitorForFile from the monolithic setupVisitors function.
Each helper encapsulates one visitor's setup logic, reducing cognitive
complexity and improving readability.
Extract handler functions from extractSymbolsQuery (cog 78, bugs 2.43):
handleFnCapture, handleVarFnCapture, handleClassCapture, handleMethodCapture,
handleExportCapture, and dispatchQueryMatch. Extract from extractGoTypeMapDepth
(cog 143, bugs 1.15): handleTypedIdentifiers, inferShortVarType, handleShortVarDecl.
Extract from complexityData (cog 72, bugs 2.65): buildComplexityWhere,
buildThresholdHaving, mapComplexityRow, exceedsAnyThreshold,
computeComplexitySummary, checkHasGraph. Extract from
prepareFunctionLevelData (cog 66, bugs 2.54): buildNodeMapFromEdges,
loadComplexityMap, loadFanMaps, buildEnrichedVisNode, selectSeedNodes.
…rmatters

Extract formatPredicateViolations from check (cog 62). Extract
renderAuditFunction and renderHealthMetrics from audit (cog 55). Extract
formatAddedSection, formatRemovedSection, formatChangedSection, and
formatImpactLine from branch-compare formatText (cog 48).
…elpers

Extract buildDirFilesMap, buildFileToAncestorDirs, countDirectoryEdges,
and countSymbolsInFiles from computeDirectoryMetrics (cog 73, bugs 0.92).
Extract renderHealthTable and renderDefaultTable from complexity
(cog 40, bugs 1.59), separating the two table-rendering modes.
Extract focused helper functions from monolithic walk/build functions:
- edge_builder: split build_call_edges into EdgeContext + 8 handlers
- complexity: extract classify_branch, detect_else_if, is_pattern_c_else,
  handle_logical_op, classify_halstead shared by walk and walk_all
- cfg: decompose process_statement into try_unwrap_expr_stmt,
  try_process_labeled, try_process_control_flow, try_process_terminal,
  process_break, process_continue
Extract DataflowOutput struct to bundle output vectors, visit_children
helper to eliminate repeated DFS boilerplate, and handle_return_stmt
from inline visit logic.
javascript.ts: extract handleVarDeclaratorTypeMap, handleParamTypeMap,
extractMemberExprCallInfo, extractSubscriptCallInfo,
extractRequireReexport, extractSpreadRequireReexports from monolithic
extractTypeMapWalk, extractCallInfo, handleCommonJSAssignment.

go.ts: extract handleGoStructType, handleGoInterfaceType from
handleGoTypeDecl.
Extract shared indexByLine and matchResultToDef helpers from duplicated
code in storeComplexityResults and storeCfgResults. Extract classifyLogicalOp
from inline logic in createComplexityVisitor's enterNode method.
Extract focused helper functions from buildCFGData (cognitive 86),
dataflowPathData (cognitive 40), dataflowData (halstead.bugs 1.21),
dataflowImpactData (nesting 6), and checkData (cyclomatic 26).
Extract focused helpers from do_classify_incremental (196 SLOC,
halstead.bugs 2.9) and walk_node_depth (cognitive 91, nesting 11).
Share classify_rows and test_file_filter between full and incremental
classification paths.
@claude
Copy link
Copy Markdown

claude bot commented Mar 30, 2026

Claude finished @carlos-alm's task —— View job


PR Review: Titan Audit Refactoring

Todo List:

  • Read project context (CLAUDE.md, FOUNDATION.md)
  • Examine Titan audit report
  • Phase 1: Root cause analysis
  • Phase 2: Configuration change detection
  • Phase 3: Test integrity check
  • Phase 4: Code quality review (sample key files)
  • Phase 5: Backlog compliance assessment
  • Final assessment and rating

Root Cause Analysis

Problem Being Solved: Genuine code quality issues identified by Titan audit: high complexity functions (makePartition MI 5→13.4), duplicated walk_node_depth patterns across 9 native extractors, dead code in shared types and database layer, functions above maintainability thresholds.

Why This Approach: Legitimate refactoring following established patterns: function decomposition for high-complexity targets, shared abstraction extraction to eliminate duplication, dead code removal based on actual usage analysis. No rule relaxation or test weakening detected.

Risk Assessment: ✅ LOW RISK - This is genuine refactoring, not symptom hiding. The Titan report shows actual improvements in objective metrics (MI scores, function counts, cycle reduction).

Backlog Compliance

  • Zero-dep: ✅ - No new runtime dependencies added to package.json
  • Foundation-aligned: ✅ - Pure refactoring, maintains embeddable-first design, no cloud API calls, no code generation
  • Problem-fit: 4 - Directly improves code maintainability and reduces complexity for AI agents navigating the codebase. Lower complexity functions are easier for agents to understand and modify safely.
  • Breaking: No - All changes are internal refactoring, no API signature changes, CLI output unchanged
  • Tier: 1 (zero-dep ✓ + foundation-aligned ✓)

Critical Concerns

NONE - This PR passes all phases:

  • ✅ No configuration changes masking issues
  • ✅ No test weakening or rule relaxation
  • ✅ No suspicious workarounds or eslint-disable
  • ✅ Code quality is high with proper abstractions
  • ✅ Genuine complexity reduction with measurable improvements

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: This is exemplary technical debt reduction. The Titan audit methodology is sound, the refactoring addresses real complexity issues, and the implementation maintains code quality. The PR demonstrates genuine improvement (quality score 65→67, worst function MI 5→13.4) without cutting corners. Only minor deduction for the massive scope (62 files) which makes thorough review challenging, but the systematic approach and clear audit trail mitigate this concern.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 30, 2026

Greptile Summary

This PR is a comprehensive Titan pipeline audit of the codegraph codebase (v3.5.0), touching 63 files across 13 domains. The changes are purely structural — decomposition of oversized functions, extraction of shared abstractions, and dead code removal — with no new features or behavior changes. All 2131 tests pass and all 22 gate validations pass.

Key structural improvements:

  • Shared abstractions extracted: withReadonlyDb / resolveAnalysisOpts (query-helpers), setTypeMapEntry (TS/Rust extractor helpers), warnOnVersionMismatch (DB connection), findEnclosingTypeName (Rust extractor helpers), indexByLine / matchResultToDef (AST engine)
  • Algorithmic decompositions verified correct: Leiden makePartition / buildOriginalPartition helper split; Louvain findBestCommunityMove extraction; BFS helpers in dataflow.ts
  • One minor issue: Three new (e as Error).message casts in watcher.ts catch blocks use the same unsafe pattern already fixed in incremental.ts

Confidence Score: 5/5

Safe to merge — all changes are structural refactoring with preserved semantics, 2131 tests pass, and the single finding is a P2 style nit.

The only finding is three (e as Error).message casts in non-critical debug log paths of watcher.ts, all P2. Every algorithmic change verified semantically equivalent. Prior review concerns fully addressed. No P0 or P1 issues found.

src/domain/graph/watcher.ts — three new catch blocks use unsafe (e as Error).message instead of the instanceof Error guard established elsewhere.

Important Files Changed

Filename Overview
src/domain/graph/watcher.ts Console.log → structured logging cleanup; three new catch blocks use the same (e as Error).message anti-pattern already fixed elsewhere in this PR.
src/domain/analysis/query-helpers.ts New shared helper extracting withReadonlyDb and resolveAnalysisOpts; correct try/finally resource management and proper CodegraphConfig typing.
src/graph/algorithms/leiden/partition.ts Extracted accumulateNodeAggregates, accumulateInternalEdgeWeights, and buildSortedCommunityIds (now void); logic semantically identical, prior flagged return type resolved.
src/graph/algorithms/leiden/optimiser.ts Extracted findBestCommunityMove; all four candidate strategies and new-community probe preserved correctly.
src/domain/graph/builder/stages/finalize.ts Decomposed into four focused helpers; full-build guard correctly retained at the call site.
src/db/connection.ts Extracted warnOnVersionMismatch shared across two open paths; _resetVersionWarning removed (confirmed unused).
src/features/check.ts Decomposed into findGitRoot, getGitDiff, resolveCheckFlags, runPredicates; git cwd uses repoRoot consistently with original behavior.
src/ast-analysis/engine.ts Extracted six helper functions for visitor setup and result matching; all logic preserved correctly.
src/features/dataflow.ts Extracted BFS helpers; path reconstruction logic verified semantically equivalent.
src/infrastructure/config.ts Decomposed detectWorkspaces into four reader/expander helpers; priority-order fallback logic preserved.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    subgraph finalize["finalize.ts"]
        F[finalize] --> RW[releaseWasmTrees]
        F --> DI[detectIncrementalDrift]
        F --> PB[persistBuildMetadata]
        F --> RA[runAdvisoryChecks]
    end
    subgraph leiden["Leiden"]
        MP[makePartition] --> ANA[accumulateNodeAggregates]
        MP --> AIEW[accumulateInternalEdgeWeights]
        BOP[buildOriginalPartition] --> ANA
        RLUM[runLouvain] --> FBCM[findBestCommunityMove]
    end
    subgraph engine["AST Engine"]
        SV[setupVisitors] --> SAV[setupAstVisitor]
        SV --> SCVF[setupComplexityVisitor]
        SV --> SCFVF[setupCfgVisitor]
    end
Loading

Reviews (3): Last reviewed commit: "fix: resolve merge conflicts with main (..." | Re-trigger Greptile

Comment on lines 401 to +404
function compactCommunityIds(opts: CompactOptions = {}): void {
const ids: number[] = [];
for (let c = 0; c < communityCount; c++) if (iget(communityNodeCount, c) > 0) ids.push(c);
if (opts.keepOldOrder) {
ids.sort((a, b) => a - b);
} else if (opts.preserveMap instanceof Map) {
const preserveMap = opts.preserveMap;
ids.sort((a, b) => {
const pa = preserveMap.get(a);
const pb = preserveMap.get(b);
if (pa != null && pb != null && pa !== pb) return pa - pb;
if (pa != null && pb == null) return -1;
if (pb != null && pa == null) return 1;
return (
fget(communityTotalSize, b) - fget(communityTotalSize, a) ||
iget(communityNodeCount, b) - iget(communityNodeCount, a) ||
a - b
);
});
} else {
ids.sort(
(a, b) =>
fget(communityTotalSize, b) - fget(communityTotalSize, a) ||
iget(communityNodeCount, b) - iget(communityNodeCount, a) ||
a - b,
);
}
buildSortedCommunityIds(ids, opts, communityTotalSize, communityNodeCount);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Unused return value from buildSortedCommunityIds

buildSortedCommunityIds sorts ids in-place and returns the sorted array, but the return value is discarded at the call site. The function signature implies the caller should use the return value, which creates a subtle API confusion. Since Array.prototype.sort mutates in place the behavior is correct — but the misleading signature could cause a future caller to forget the in-place mutation and use the return value thinking it's a new array.

Consider either:

  • Making the return type void (since the in-place sort is the actual contract), or
  • Using the return value: ids = buildSortedCommunityIds(ids, opts, communityTotalSize, communityNodeCount);
Suggested change
function compactCommunityIds(opts: CompactOptions = {}): void {
const ids: number[] = [];
for (let c = 0; c < communityCount; c++) if (iget(communityNodeCount, c) > 0) ids.push(c);
if (opts.keepOldOrder) {
ids.sort((a, b) => a - b);
} else if (opts.preserveMap instanceof Map) {
const preserveMap = opts.preserveMap;
ids.sort((a, b) => {
const pa = preserveMap.get(a);
const pb = preserveMap.get(b);
if (pa != null && pb != null && pa !== pb) return pa - pb;
if (pa != null && pb == null) return -1;
if (pb != null && pa == null) return 1;
return (
fget(communityTotalSize, b) - fget(communityTotalSize, a) ||
iget(communityNodeCount, b) - iget(communityNodeCount, a) ||
a - b
);
});
} else {
ids.sort(
(a, b) =>
fget(communityTotalSize, b) - fget(communityTotalSize, a) ||
iget(communityNodeCount, b) - iget(communityNodeCount, a) ||
a - b,
);
}
buildSortedCommunityIds(ids, opts, communityTotalSize, communityNodeCount);
const sortedIds = buildSortedCommunityIds(ids, opts, communityTotalSize, communityNodeCount);
const newId = new Int32Array(communityCount).fill(-1);
sortedIds.forEach((c, i) => {

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — changed buildSortedCommunityIds return type to void since the sort is in-place and the return value was unused at the call site. This makes the API contract explicit.

Comment on lines 156 to 159
code = readFileSafe(absPath);
} catch {
} catch (e: unknown) {
debug(`parseReverseDep: cannot read ${absPath}: ${(e as Error).message}`);
return null;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Error cast may mask non-Error rejections

(e as Error).message assumes the caught value is always an Error instance. If readFileSafe throws a string or a non-Error object, .message will be undefined and the debug log will print "parseReverseDep: cannot read <path>: undefined". Consider using a safer accessor:

Suggested change
code = readFileSafe(absPath);
} catch {
} catch (e: unknown) {
debug(`parseReverseDep: cannot read ${absPath}: ${(e as Error).message}`);
return null;
debug(`parseReverseDep: cannot read ${absPath}: ${e instanceof Error ? e.message : String(e)}`);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — replaced (e as Error).message with e instanceof Error ? e.message : String(e) to safely handle non-Error rejections.

Comment on lines +28 to +35
config: any;
displayOpts: Record<string, unknown>;
} {
const noTests = opts.noTests || false;
const config = opts.config || loadConfig();
const displayOpts = config.display || {};
return { noTests, config, displayOpts };
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 config typed as any leaks into the shared helper

resolveAnalysisOpts returns config: any, which gives callers untyped access to the config object. The individual callers (context.ts, fn-impact.ts, exports.ts, dependencies.ts) already know they're working with the project's CodegraphConfig shape. Typing the return as config: CodegraphConfig (or a minimal interface) would prevent silent typos on properties like config.analysis?.fnImpactDepth and config.display elsewhere in the codebase.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — typed config as CodegraphConfig (imported from types.ts) instead of any in both the parameter and return type of resolveAnalysisOpts. All callers type-check cleanly.

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@claude

@claude
Copy link
Copy Markdown

claude bot commented Mar 30, 2026

Claude finished @carlos-alm's task —— View job


PR Review: Titan Audit Refactoring

Comprehensive review of the 62-file refactoring with objective quality improvements.

Todo List:

  • Read project guidelines (CLAUDE.md)
  • Review Titan audit report
  • Analyze key refactored files for quality
  • Verify fixes for Greptile P2 issues
  • Check behavioral preservation
  • Provide final assessment

Assessment: ⭐⭐⭐⭐⭐ EXCELLENT REFACTORING

What This PR Does Well

1. Objective Quality Improvements

  • Quality Score: 65 → 67 (+2 points)
  • Worst Function MI: 5.0 → 13.4 (+8.4 improvement)
  • Functions Above Threshold: 50 → 48 (-2 functions)
  • Function-Level Cycles: 9 → 6 (-3 cycles)
  • All 2131 tests pass, 22 gate validations successful

2. Systematic Approach
The Titan audit methodology is exemplary:

  • RECONGAUNTLETSYNCFORGEGATECLOSE
  • 122 files audited across 13 domains
  • 5 distinct forge phases addressing different concern categories
  • Automated rollback protection (0 rollbacks needed)

3. High-Quality Refactoring Patterns

Dead Code Removal (src/db/connection.ts:158)

// REMOVED: _resetVersionWarning (no remaining callers)  
// EXTRACTED: warnOnVersionMismatch helper with proper deduplication

Shared Abstraction Extraction (src/domain/analysis/query-helpers.ts:1-36)

// NEW: withReadonlyDb<T> - eliminates 4x try/finally DB patterns
// NEW: resolveAnalysisOpts - eliminates config loading duplication

Function Decomposition (src/graph/algorithms/leiden/partition.ts:419-429)

// DECOMPOSED: makePartition (MI 5→13.4) into focused helpers:
// - accumulateNodeAggregates
// - accumulateInternalEdgeWeights  
// - buildSortedCommunityIds

4. Greptile P2 Issues: All Fixed

  • partition.ts:403: buildSortedCommunityIds return type → void (in-place sort contract)
  • incremental.ts:159: Error handling → e instanceof Error ? e.message : String(e)
  • query-helpers.ts: Config typing → CodegraphConfig (no longer any)

5. Behavioral Preservation

  • All function decompositions faithfully preserve original logic
  • Complex algorithm extractions (Leiden optimiser, native extractors) maintain exact semantics
  • DB lifecycle patterns correctly preserved through abstraction
  • No test changes required — 100% behavioral compatibility

6. Follows Repository Guidelines

  • ✅ Hooks enforced code quality throughout (cycles, dead exports, lint)
  • ✅ No fabricated facts or workarounds — genuine complexity reduction
  • ✅ No bug documentation as "expected behavior"
  • ✅ Dual-engine architecture maintained (native/WASM identical results)

⚠️ Minor Considerations

1. Scale
62 files changed makes thorough review challenging, but the systematic audit trail and automated gate validations provide confidence.

2. CI Status
3 workflows still in progress (Build Native, CI, Codegraph Impact Analysis) — pending final validation.

3. Remaining Hot Spots
Per Titan report, computeFunctionComplexity (cognitive 103) and native extractor duplication remain for future cycles.


Technical Assessment

Code Quality: Outstanding. Extractions follow proper separation of concerns, helper functions have clear single responsibilities, and abstractions eliminate genuine duplication without over-engineering.

Risk Profile: Very low. This is pure refactoring with no API changes, CLI behavior unchanged, and comprehensive test coverage maintained.

Maintainability Impact: Significant improvement. Lower complexity functions are easier for both humans and AI agents to understand, modify, and debug safely.

Architecture Compliance: Perfect alignment with the repository's quality-first culture and dual-engine architecture constraints.


Recommendation: APPROVE

This is a textbook example of technical debt reduction done right. The PR demonstrates:

  • Genuine complexity reduction backed by objective metrics
  • Systematic methodology with safety guards
  • High-quality code patterns and abstractions
  • Complete behavioral preservation
  • Alignment with repository standards

The comprehensive scope is justified by the cross-cutting nature of complexity improvements and the systematic audit approach provides confidence in the changes.


Merge main into release/3.5.0 to resolve conflicts from #700 and #703.
Kept PR's decomposed helper functions while adopting main's walk_tree
callback pattern for native extractors. Combined import sets from both
sides for WASM extractors.
@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit 6878c04 into main Mar 30, 2026
19 checks passed
@carlos-alm carlos-alm deleted the release/3.5.0 branch March 30, 2026 10:39
@github-actions github-actions bot locked and limited conversation to collaborators Mar 30, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant