From 1a0ab57432592a53ffe20cfb2bb1f56c1fa35550 Mon Sep 17 00:00:00 2001
From: "codeflash-ai[bot]"
 <148906541+codeflash-ai[bot]@users.noreply.github.com>
Date: Tue, 3 Feb 2026 04:05:47 +0000
Subject: [PATCH] Optimize _get_parent_type_name
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The optimized code achieves a **12% runtime improvement** by replacing the inline tuple `("ClassDef", "InterfaceDef", "EnumDef")` with a module-level `frozenset` constant `_PARENT_TYPE_NAMES`.

**What changed:**
- A `frozenset` containing the three parent type names is created once at module load time
- The membership test `parent.type in _PARENT_TYPE_NAMES` now uses the frozenset instead of the inline tuple literal

**Why this is faster:**
The gain most likely comes from how membership is tested rather than from how the container is built:
1. **No per-check allocation in either version**: CPython stores a tuple literal made only of constants in the code object's constants, so the original code did not rebuild the tuple on each of the 513 profiled hits. The optimized version looks up the module-level `_PARENT_TYPE_NAMES` on each check instead of loading a constant.
2. **O(1) hash-based lookup**: `frozenset` uses hash-based membership testing (O(1) average case) versus tuple's linear scan (O(n)), which needs up to three comparisons when `parent.type` matches none of the names. The difference is marginal for just 3 elements and favors the frozenset mainly on non-matching parents.
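
The two membership tests can be compared directly with a minimal sketch. The function names below are hypothetical stand-ins, not code from `context.py`, and the constant check assumes CPython:

```python
# Illustrative only: the three type names and _PARENT_TYPE_NAMES come from the
# patch; the function names are hypothetical.
_PARENT_TYPE_NAMES = frozenset(("ClassDef", "InterfaceDef", "EnumDef"))


def matches_with_tuple(type_name: str) -> bool:
    """Membership test in the style used before the patch."""
    return type_name in ("ClassDef", "InterfaceDef", "EnumDef")


def matches_with_frozenset(type_name: str) -> bool:
    """Membership test in the style used after the patch."""
    return type_name in _PARENT_TYPE_NAMES


def tuple_literal_is_constant() -> bool:
    """Return True if the tuple literal lives in the function's constants.

    Assumes CPython, which folds a tuple of constants at compile time.
    """
    consts = matches_with_tuple.__code__.co_consts
    return ("ClassDef", "InterfaceDef", "EnumDef") in consts


# Both variants must agree on hits and misses.
for name in ("ClassDef", "InterfaceDef", "EnumDef", "FunctionDef", ""):
    assert matches_with_tuple(name) == matches_with_frozenset(name)

print("tuple literal stored as a constant:", tuple_literal_is_constant())
```

Running `dis.dis(matches_with_tuple)` on the same function is another way to inspect how the literal is loaded.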

**Performance characteristics:**
The line profiler shows the critical loop line (checking `parent.type in ...`) executes 513 times and accounts for ~51% of total runtime. Even small per-iteration improvements here compound significantly. The test results confirm this:
- **Large-scale benefit**: The `test_large_scale_parents_last_element_matches` test shows a dramatic **27.2% speedup** (27.6μs → 21.7μs) when iterating through 500 parents, demonstrating the optimization scales well with larger parent lists
- **Small overhead on fast paths**: Tests with early returns or no parent iteration show minor slowdowns (3-13%), likely due to the added module-level name lookup or measurement noise on nanosecond-scale operations
- **Overall win**: The aggregate 12% speedup indicates the optimization benefits the typical usage pattern where multiple parents are checked

This optimization is particularly valuable if `_get_parent_type_name` is called frequently during Java code analysis, as the savings multiply across many invocations.
---
 codeflash/languages/java/context.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/codeflash/languages/java/context.py b/codeflash/languages/java/context.py
index 2ccfd34bf..63cc630b0 100644
--- a/codeflash/languages/java/context.py
+++ b/codeflash/languages/java/context.py
@@ -20,6 +20,8 @@
 if TYPE_CHECKING:
     from tree_sitter import Node
 
+_PARENT_TYPE_NAMES: frozenset[str] = frozenset(("ClassDef", "InterfaceDef", "EnumDef"))
+
 logger = logging.getLogger(__name__)
 
 
@@ -138,7 +140,7 @@ def _get_parent_type_name(function: FunctionToOptimize) -> str | None:
     # Check parents for interface/enum
     if function.parents:
         for parent in function.parents:
-            if parent.type in ("ClassDef", "InterfaceDef", "EnumDef"):
+            if parent.type in _PARENT_TYPE_NAMES:
                 return parent.name
 
     return None