⚡️ Speed up function _find_type_node by 18% in PR #1199 (omni-java)#1292

Closed
codeflash-ai[bot] wants to merge 1 commit into omni-java from codeflash/optimize-pr1199-2026-02-03T07.19.45

Contributor

@codeflash-ai codeflash-ai bot commented Feb 3, 2026

⚡️ This pull request contains optimizations for PR #1199

If you approve this dependent PR, these changes will be merged into the original PR branch omni-java.

This PR will be automatically closed if the original PR is merged.


📄 18% (0.18x) speedup for _find_type_node in codeflash/languages/java/context.py

⏱️ Runtime : 25.3 microseconds → 21.4 microseconds (best of 34 runs)

📝 Explanation and details

The optimized code achieves an 18% runtime improvement by eliminating repeated dictionary creation overhead in a recursive function.

Key Optimization:
The critical change is moving the type_declarations dictionary from inside the function to module-level as _TYPE_DECLARATIONS. In the original code, this dictionary was recreated on every function call, including all recursive calls. The line profiler shows this dictionary construction consumed ~27% of the function's time (lines allocating "class_declaration", "interface_declaration", and "enum_declaration").

Why This Improves Performance:

  • Eliminates allocation overhead: Dictionary creation, even for small dicts, involves memory allocation and hashing operations on each call
  • Critical in recursive contexts: Since _find_type_node recursively traverses a tree structure, the dictionary was being recreated multiple times per search operation (25 hits in the profiler)
  • One lookup instead of construction: The module-level constant is built once at import time, so each call needs only a single LOAD_GLOBAL lookup instead of rebuilding the dictionary
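One way to see the difference is to inspect the bytecode with the standard `dis` module. This is a minimal sketch: the function names `lookup_local` and `lookup_global` and the constant `_KINDS` are hypothetical stand-ins, and exact opcode names vary between CPython versions, so the check only looks for any dict-building `BUILD_*MAP` instruction.

```python
import dis

# Hypothetical stand-in for the hoisted module-level mapping.
_KINDS = {
    "class_declaration": "class",
    "interface_declaration": "interface",
    "enum_declaration": "enum",
}


def lookup_local(node_type):
    # The dict literal is compiled into instructions that run on every call.
    kinds = {
        "class_declaration": "class",
        "interface_declaration": "interface",
        "enum_declaration": "enum",
    }
    return kinds.get(node_type)


def lookup_global(node_type):
    # Only a global name lookup happens here; the dict already exists.
    return _KINDS.get(node_type)


def opnames(func):
    """Return the opcode names of a function's bytecode, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


def builds_dict(func):
    """True if the bytecode contains a dict-building instruction."""
    return any(n.startswith("BUILD_") and "MAP" in n for n in opnames(func))


if __name__ == "__main__":
    print("local builds a dict per call: ", builds_dict(lookup_local))
    print("global builds a dict per call:", builds_dict(lookup_global))
```

Note that a global lookup is not cheaper than a local variable read in general. The saving comes from skipping the dict construction, not from the lookup itself.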

Test Results Analysis:
Every generated test case runs faster, with per-test gains ranging from about 8% to 23%:

  • Deep nesting scenarios (19% faster): Maximum benefit when recursion depth is high, as dictionary recreation is avoided on each level
  • Multiple type scenarios (18-23% faster): When traversing multiple sibling nodes, the savings compound
  • Early termination cases (20% faster): Even when a match is found quickly, avoiding the dictionary creation overhead provides measurable gains

The profiler confirms the improvement: total function time decreased from 140.23μs to 115.17μs, with the dictionary construction lines completely eliminated from the optimized version.
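The percentages in this report can be reproduced from the quoted timings. The sketch below assumes the headline figure uses the "before / after minus one" convention, which is consistent with 25.3μs → 21.4μs being reported as 18%. The helper names are hypothetical. It also shows the alternative "share of time removed" reading, which gives a smaller number for the same pair of timings.

```python
def speedup_pct(before: float, after: float) -> float:
    """Speedup expressed as how much more work fits in the same time."""
    return (before / after - 1.0) * 100.0


def time_reduction_pct(before: float, after: float) -> float:
    """Share of the original runtime that was removed."""
    return (1.0 - after / before) * 100.0


if __name__ == "__main__":
    # Timings quoted in this report, in microseconds.
    for label, before, after in [
        ("headline runtime", 25.3, 21.4),
        ("line profiler total", 140.23, 115.17),
    ]:
        print(
            f"{label}: speedup {speedup_pct(before, after):.1f}%, "
            f"time reduction {time_reduction_pct(before, after):.1f}%"
        )
```

The two conventions differ by a few percentage points, so it helps to state which one a quoted figure uses.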

This optimization is particularly valuable when parsing large Java ASTs with deep nesting or when this function is called frequently in a hot path, as the per-call overhead reduction scales with usage frequency.

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 10 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests
from __future__ import annotations

# imports
import pytest  # used for our unit tests
from codeflash.languages.java.context import _find_type_node
from tree_sitter import Node

# helper classes for tests
class _NameNode:
    """
    Minimal object that exposes start_byte and end_byte attributes
    so that _find_type_node can slice source_bytes[start_byte:end_byte].
    """
    def __init__(self, start_byte: int, end_byte: int):
        self.start_byte = start_byte
        self.end_byte = end_byte

class _SimpleNode:
    """
    Lightweight test double that mimics the parts of tree_sitter.Node used by
    _find_type_node:
      - .type (a string)
      - .children (an iterable of child nodes)
      - .child_by_field_name(field) -> returns a node for 'name' or None
    This is not replacing tree_sitter.Node in the code under test; it only
    provides the attributes accessed by the function. Using a real tree-sitter
    Node in tests would require compiled language grammars, which is brittle in
    CI environments. The function does not perform isinstance checks, only attribute access.
    """
    def __init__(self, type_: str, children: list | None = None, name_node: _NameNode | None = None):
        self.type = type_
        # children must be an iterable the function can loop over
        self.children = list(children) if children else []
        # store a mapping for child_by_field_name
        self._field_map = {}
        if name_node is not None:
            # emulate that child_by_field_name("name") returns this node
            self._field_map["name"] = name_node

    def child_by_field_name(self, field: str):
        # return the node mapped to the requested field name, or None
        return self._field_map.get(field, None)

def test_basic_find_class_at_root():
    # Basic scenario: the root node itself is a class declaration with the expected name.
    source = b'public class MyClass { }'
    # find byte indices for "MyClass" in the source bytes
    start = source.find(b'MyClass')
    end = start + len(b'MyClass')
    # create a name node pointing to "MyClass"
    name_node = _NameNode(start, end)
    # make the root a class_declaration with the name node attached
    root = _SimpleNode('class_declaration', children=[], name_node=name_node)

    # call the function under test; it should return the root node and "class"
    found_node, kind = _find_type_node(root, 'MyClass', source) # 2.46μs -> 2.14μs (14.9% faster)

def test_find_interface_nested_deep():
    # Basic scenario: an interface declaration nested multiple levels deep should be found.
    source = b'// wrapper\ninterface InnerIntf {}\n'
    start = source.find(b'InnerIntf')
    end = start + len(b'InnerIntf')
    name_node = _NameNode(start, end)
    # interface node deep inside
    interface_node = _SimpleNode('interface_declaration', children=[], name_node=name_node)
    # intermediate wrapper nodes (non-type nodes)
    lvl2 = _SimpleNode('some_block', children=[interface_node])
    lvl1 = _SimpleNode('compilation_unit', children=[lvl2])
    root = _SimpleNode('root', children=[lvl1])

    # should find the interface node and return kind 'interface'
    found_node, kind = _find_type_node(root, 'InnerIntf', source) # 3.28μs -> 2.75μs (19.0% faster)

def test_enum_and_multiple_types_pick_exact_name():
    # Multiple type declarations at same level; ensure the one with exact name is returned.
    source = b'class Alpha {}\nclass Beta {}\nenum Gamma {}\n'
    # create name nodes for each
    a_start = source.find(b'Alpha'); a_end = a_start + len(b'Alpha')
    b_start = source.find(b'Beta'); b_end = b_start + len(b'Beta')
    g_start = source.find(b'Gamma'); g_end = g_start + len(b'Gamma')

    alpha = _SimpleNode('class_declaration', children=[], name_node=_NameNode(a_start, a_end))
    beta = _SimpleNode('class_declaration', children=[], name_node=_NameNode(b_start, b_end))
    gamma = _SimpleNode('enum_declaration', children=[], name_node=_NameNode(g_start, g_end))
    root = _SimpleNode('root', children=[alpha, beta, gamma])

    # find Beta
    found_node, kind = _find_type_node(root, 'Beta', source) # 3.04μs -> 2.56μs (18.4% faster)

    # find Gamma which is an enum
    found_node2, kind2 = _find_type_node(root, 'Gamma', source) # 1.96μs -> 1.60μs (22.6% faster)

def test_name_not_found_returns_none_and_empty_string():
    # Edge case: no type has the requested name -> should return (None, "")
    source = b'class One {}\n'
    start = source.find(b'One'); end = start + len(b'One')
    one = _SimpleNode('class_declaration', children=[], name_node=_NameNode(start, end))
    root = _SimpleNode('prog', children=[one])

    # search for a non-existing name
    found_node, kind = _find_type_node(root, 'DoesNotExist', source) # 2.12μs -> 1.79μs (18.1% faster)

def test_type_node_without_name_is_skipped():
    # Edge case: a type declaration exists but lacks a 'name' child -> should be ignored
    source = b'class Anonymous {}\nclass Named {}\n'
    # anonymous class node without name_node (simulate malformed AST)
    anon = _SimpleNode('class_declaration', children=[], name_node=None)
    # named class node
    n_start = source.find(b'Named'); n_end = n_start + len(b'Named')
    named = _SimpleNode('class_declaration', children=[], name_node=_NameNode(n_start, n_end))
    root = _SimpleNode('root', children=[anon, named])

    # search for 'Named' should still find the named class even though 'anon' exists
    found_node, kind = _find_type_node(root, 'Named', source) # 2.75μs -> 2.34μs (17.6% faster)

def test_partial_name_does_not_match_exact_equality_required():
    # Edge case: ensure substring matches do not count (exact equality required)
    source = b'class MyClassPlus {}\n'
    start = source.find(b'MyClassPlus'); end = start + len(b'MyClassPlus')
    node = _SimpleNode('class_declaration', children=[], name_node=_NameNode(start, end))
    root = _SimpleNode('root', children=[node])

    # searching for 'MyClass' (a substring) must not match
    found_node, kind = _find_type_node(root, 'MyClass', source) # 1.96μs -> 1.62μs (20.9% faster)

def test_name_with_utf8_characters_decodes_properly():
    # Edge case: ensure names with non-ascii characters decode correctly from source_bytes
    name = 'Café'  # contains non-ascii e accent
    source = f'public class {name} {{}}'.encode('utf8')
    start = source.find(name.encode('utf8'))
    end = start + len(name.encode('utf8'))
    node = _SimpleNode('class_declaration', children=[], name_node=_NameNode(start, end))
    root = _SimpleNode('root', children=[node])

    # searching for the unicode name must succeed
    found_node, kind = _find_type_node(root, 'Café', source) # 2.70μs -> 2.49μs (8.35% faster)

def test_early_match_stops_searching_remaining_children():
    # Basic property test: when a match is found early, the function should return it and not continue.
    # We'll instrument by placing a node after the match that, if visited, would be recognizable.
    source = b'class First {}\nclass Second {}\n'
    f_start = source.find(b'First'); f_end = f_start + len(b'First')
    s_start = source.find(b'Second'); s_end = s_start + len(b'Second')

    first = _SimpleNode('class_declaration', children=[], name_node=_NameNode(f_start, f_end))
    # The following node is given a sentinel type that we'd detect if visited. We'll ensure it isn't necessary.
    sentinel = _SimpleNode('SENTINEL_TYPE', children=[], name_node=_NameNode(s_start, s_end))
    # place first before sentinel
    root = _SimpleNode('root', children=[first, sentinel])

    found_node, kind = _find_type_node(root, 'First', source) # 2.76μs -> 2.29μs (20.5% faster)

def test_empty_source_bytes_with_zero_length_name():
    # Edge case: name node with zero-length span should decode to an empty string.
    # If the desired type_name is also an empty string, that should match according to code logic.
    source = b''  # empty source bytes
    # create a name node whose start and end are both 0
    zero_name = _NameNode(0, 0)
    node = _SimpleNode('class_declaration', children=[], name_node=zero_name)
    root = _SimpleNode('root', children=[node])

    # searching for empty string should match the zero-length name
    found_node, kind = _find_type_node(root, '', source) # 2.25μs -> 1.85μs (21.1% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-pr1199-2026-02-03T07.19.45 and push.

@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Feb 3, 2026
@codeflash-ai codeflash-ai bot mentioned this pull request Feb 3, 2026
Collaborator

KRRT7 commented Feb 19, 2026

Closing stale bot PR.

@KRRT7 KRRT7 closed this Feb 19, 2026
@KRRT7 KRRT7 deleted the codeflash/optimize-pr1199-2026-02-03T07.19.45 branch February 19, 2026 13:03