fix(extraction): convert remaining fixed-size stacks in extract_channels.c to growable TSNodeStack #339

Open
jjoos wants to merge 2 commits into DeusData:main from jjoos:fix/remaining-fixed-stacks

@jjoos jjoos commented May 11, 2026

Problem

PR #217 converted most AST traversal functions to use the new growable `TSNodeStack` (arena-allocated, no hard cap), but extract_channels.c still had 8 functions using fixed-size `TSNode stack[CHAN_STACK_CAP]` arrays:

  • scan_string_consts_python
  • extract_channels_python
  • extract_channels_go
  • extract_channels_java
  • extract_channels_csharp
  • extract_channels_ruby
  • extract_channels_elixir
  • extract_channels_rust

When indexing large repositories with deep AST nesting (e.g. 4096+ nodes on a single DFS path), these arrays fail silently. The capacity guard `&& top < CHAN_STACK_CAP` drops children without warning, and when the stack itself overflows, the out-of-bounds writes corrupt adjacent heap memory. The corruption surfaces as a segmentation fault in a later pass (observed in the lsp_cross and semantic_edges passes on large repos).
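The silent truncation can be reproduced in miniature. The sketch below is a toy, not code from extract_channels.c: `ToyNode`, `TOY_STACK_CAP`, `toy_count_visited_fixed`, and `toy_make_wide_root` are hypothetical stand-ins, and the cap is shrunk to 4 so the effect shows up on a tiny tree. It only illustrates the guarded push loop, assuming the real functions follow the same shape as the "Before" pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: a toy node type and a tiny cap,
 * not the project's TSNode or CHAN_STACK_CAP. */
#define TOY_STACK_CAP 4

typedef struct ToyNode {
    const struct ToyNode *children;
    size_t child_count;
} ToyNode;

/* Iterative DFS with a fixed-size stack and the "&& top < CAP" guard.
 * Returns the number of nodes visited; children that do not fit on the
 * stack are skipped without any error being reported. */
static size_t toy_count_visited_fixed(const ToyNode *root) {
    const ToyNode *stack[TOY_STACK_CAP];
    int top = 0;
    size_t visited = 0;

    assert(root != NULL);
    stack[top++] = root;
    while (top > 0) {
        const ToyNode *node = stack[--top];
        visited++;
        for (size_t i = 0; i < node->child_count && top < TOY_STACK_CAP; i++) {
            stack[top++] = &node->children[i];
        }
    }
    return visited;
}

/* Build a root with n leaf children in caller-provided storage. */
static ToyNode toy_make_wide_root(ToyNode *leaves, size_t n) {
    for (size_t i = 0; i < n; i++) {
        leaves[i].children = NULL;
        leaves[i].child_count = 0;
    }
    ToyNode root = { leaves, n };
    return root;
}
```

For a root with 10 leaf children, the traversal reports 5 visited nodes out of 11: the root plus the four leaves that fit before the guard ended the push loop. Nothing signals that the other six were skipped.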

Fix

Apply the same mechanical conversion used in PR #217 to the 8 remaining functions:

| Before | After |
| --- | --- |
| `TSNode stack[CHAN_STACK_CAP]; int top = 0;` | `TSNodeStack stack; ts_nstack_init(&stack, ctx->arena, CHAN_STACK_CAP);` |
| `stack[top++] = ctx->root` | `ts_nstack_push(&stack, ctx->arena, ctx->root)` |
| `while (top > 0)` | `while (stack.count > 0)` |
| `TSNode node = stack[--top]` | `TSNode node = ts_nstack_pop(&stack)` |
| `&& top < CHAN_STACK_CAP` | capacity guard removed (arena grows automatically) |
| `stack[top++] = ts_node_child(...)` | `ts_nstack_push(&stack, ctx->arena, ts_node_child(...))` |
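To show why the capacity guard becomes unnecessary once pushes can grow the buffer, here is a minimal sketch of a growable stack. It is not the project's `TSNodeStack`: `DemoStack` and the `demo_stack_*` functions are hypothetical names, the elements are plain `int`s, and the storage comes from `malloc`/`realloc` rather than an arena, since the arena API is not shown in this PR.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable stack of ints; illustrative only,
 * not the project's TSNodeStack or its arena allocator. */
typedef struct {
    int *items;
    size_t count;
    size_t cap;
} DemoStack;

/* Allocate the initial buffer. Returns 0 on success, -1 on failure. */
static int demo_stack_init(DemoStack *s, size_t initial_cap) {
    if (initial_cap == 0) {
        initial_cap = 1;
    }
    s->items = malloc(initial_cap * sizeof *s->items);
    s->count = 0;
    s->cap = s->items ? initial_cap : 0;
    return s->items ? 0 : -1;
}

/* Push a value, doubling the buffer when it is full so that no
 * element is dropped. Returns 0 on success, -1 on failure. */
static int demo_stack_push(DemoStack *s, int value) {
    if (s->count == s->cap) {
        size_t new_cap = s->cap ? s->cap * 2 : 1;
        int *grown = realloc(s->items, new_cap * sizeof *s->items);
        if (!grown) {
            return -1; /* the old buffer is still valid */
        }
        s->items = grown;
        s->cap = new_cap;
    }
    s->items[s->count++] = value;
    return 0;
}

/* Pop the most recently pushed value; the stack must not be empty. */
static int demo_stack_pop(DemoStack *s) {
    assert(s->count > 0);
    return s->items[--s->count];
}

/* Release the buffer and reset the stack to an empty state. */
static void demo_stack_free(DemoStack *s) {
    free(s->items);
    s->items = NULL;
    s->count = 0;
    s->cap = 0;
}
```

With this shape, the initial capacity is only a starting size. A traversal loop can push every child unconditionally and test `count > 0` to decide when it is done, which mirrors the "After" column.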

Testing

Verified by indexing large repositories (1290 files, 13,735 nodes; 3482+ functions). These repos previously caused segfaults; with this fix, all of them index cleanly.

jjoos added 2 commits May 11, 2026 22:52
…els.c to growable TSNodeStack

PR DeusData#217 converted most extractors to growable arena-allocated TSNodeStack,
but extract_channels.c still had 8 functions using fixed-size
`TSNode stack[CHAN_STACK_CAP]` arrays. On large codebases with 4096+ AST
nodes on a single DFS path, these arrays overflow silently, corrupting
adjacent heap memory and causing segfaults in later passes.

Apply the same mechanical conversion used in PR DeusData#217:
  - `TSNode stack[CHAN_STACK_CAP]; int top = 0;` -> `TSNodeStack stack; ts_nstack_init(...);`
  - `stack[top++] = root` -> `ts_nstack_push(&stack, ctx->arena, root)`
  - `while (top > 0)` -> `while (stack.count > 0)`
  - `TSNode node = stack[--top]` -> `TSNode node = ts_nstack_pop(&stack)`
  - Remove `&& top < CHAN_STACK_CAP` capacity guard from push loops

Affected functions: scan_string_consts_python, extract_channels_python,
extract_channels_go, extract_channels_java, extract_channels_csharp,
extract_channels_ruby, extract_channels_elixir, extract_channels_rust.

Fixes segmentation faults when indexing large repositories.
…nsts_python

The while condition in scan_string_consts_python had an extra
`&& tbl->count < CHAN_CONST_CAP` clause that wasn't caught by the
previous conversion pass. Convert it properly.