⚡️ Speed up function `_add_behavior_instrumentation` by 197% in PR #1580 (`fix/java-direct-jvm-and-bugs`) #1596
The optimized code achieves a **196% speedup** (from 13.3ms to 4.49ms) primarily through two focused optimizations that target the hottest paths identified by the line profiler:
## Key Optimizations
### 1. Early Exit in `wrap_target_calls_with_treesitter` (Primary Driver)
The profiler shows that in the original code, 55.5% of `wrap_target_calls_with_treesitter`'s time (9.7ms out of 17.5ms) was spent in `_collect_calls`, which parses Java code with tree-sitter. The optimization adds:
```python
body_text = "\n".join(body_lines)
if func_name not in body_text:
    return list(body_lines), 0
```
This string membership check skips tree-sitter parsing entirely when the target function's name does not appear in the test method body. Since many test methods don't call the function being instrumented, the savings are large. The annotated tests confirm this pattern: tests whose bodies never reference the target function show the largest speedups, up to 639% for large methods and 1018% for complex expressions.
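A minimal, self-contained sketch of the early-exit pattern follows. It is not the code from the PR: `wrap_with_early_exit` is an illustrative name, and `slow_wrap` is a hypothetical stand-in for the tree-sitter path that counts call sites with a regex instead of a real parser.

```python
import re


def slow_wrap(body_lines, func_name):
    """Hypothetical stand-in for the expensive tree-sitter path.

    Counts apparent call sites with a regex; the real code parses Java.
    """
    pattern = re.compile(r"\b" + re.escape(func_name) + r"\s*\(")
    count = sum(len(pattern.findall(line)) for line in body_lines)
    return list(body_lines), count


def wrap_with_early_exit(body_lines, func_name, slow_path=slow_wrap):
    """Skip the slow path when the name cannot possibly be called."""
    body_text = "\n".join(body_lines)
    if func_name not in body_text:
        # Cheap substring test: no occurrence means no call to wrap.
        return list(body_lines), 0
    return slow_path(body_lines, func_name)


no_call = ["int x = 1;", "assertEquals(1, x);"]
with_call = ["int y = add(1, 2);", "assertEquals(3, y);"]

print(wrap_with_early_exit(no_call, "add"))
print(wrap_with_early_exit(with_call, "add"))
```

The first call returns without ever touching the slow path; the second contains the name and falls through to it.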
### 2. Optimized `_is_test_annotation` (Secondary Improvement)
The profiler shows `_is_test_annotation` being called 1,950 times, spending 100% of its time (1.21ms) on regex matching. The optimization replaces the regex with direct string checks:
```python
if not stripped_line.startswith("@Test"):
    return False
if len(stripped_line) == 5:  # exactly "@Test"
    return True
next_char = stripped_line[5]
return next_char == " " or next_char == "("
```
This avoids regex overhead for the 1,737 non-`@Test` annotations, which `startswith()` rejects immediately. The profiler shows the time spent in this function dropping from 1.21ms to 0.91ms (about 25% less).
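The sketch below compares a string-based check with a regex-based one on a few sample lines. The regex is an assumption, since the PR text does not show the original pattern, and both function names are illustrative. The check targets JUnit's case-sensitive `@Test` annotation while rejecting longer names such as `@TestFactory`.

```python
import re

# Assumed pattern; the original regex is not shown in the PR description.
_TEST_RE = re.compile(r"^@Test(?:$|[ (])")


def is_test_annotation_regex(stripped_line):
    """Regex version: '@Test' followed by end of line, a space, or '('."""
    return _TEST_RE.match(stripped_line) is not None


def is_test_annotation_fast(stripped_line):
    """String version: reject most lines with a single startswith() call."""
    if not stripped_line.startswith("@Test"):
        return False
    if len(stripped_line) == 5:  # exactly "@Test"
        return True
    next_char = stripped_line[5]
    return next_char == " " or next_char == "("


samples = [
    "@Test",
    "@Test(timeout = 100)",
    "@Test public void t() {}",
    "@TestFactory",
    "@Override",
    "@Tes",
    "",
]

for line in samples:
    print(repr(line), is_test_annotation_fast(line), is_test_annotation_regex(line))
```

On these samples the two versions agree; they are only equivalent for input that has already been stripped of surrounding whitespace.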
## Performance Impact by Test Type
The annotated tests reveal optimization effectiveness varies by workload:
- **Empty/simple methods**: 107-154% faster (early exit dominates)
- **Methods with complex expressions**: 396-1018% faster (avoids parsing large expression trees)
- **Large methods with many statements**: 510-639% faster (early exit + reduced AST traversal)
- **Methods with actual function calls**: 111-152% faster (smaller benefit since tree-sitter must run)
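To read these percentages, it helps to separate "speedup" from "time reduction". The sketch below assumes the headline speedup is the runtime saved divided by the optimized runtime; that formula is an assumption here, not something the report states. The helper names are illustrative.

```python
def speedup_pct(original, optimized):
    """Relative gain over the optimized runtime (assumed headline formula)."""
    return (original - optimized) / optimized * 100


def time_reduction_pct(original, optimized):
    """Share of the original runtime that was eliminated."""
    return (original - optimized) / original * 100


# Overall runtime reported for _add_behavior_instrumentation, in ms.
overall = speedup_pct(13.3, 4.49)

# Time reported for _is_test_annotation, in ms.
annotation_reduction = time_reduction_pct(1.21, 0.91)
annotation_speedup = speedup_pct(1.21, 0.91)

print(f"overall speedup: {overall:.1f}%")
print(f"_is_test_annotation time reduction: {annotation_reduction:.1f}%")
print(f"_is_test_annotation speedup: {annotation_speedup:.1f}%")
```

Under this assumption the rounded runtimes give roughly 196%, so the 197% in the title presumably comes from unrounded measurements. The 25% quoted for `_is_test_annotation` matches the time-reduction formula rather than the speedup formula.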
## Context and Production Impact
Based on `function_references`, this function is called from test discovery in `test_instrumentation.py`, specifically for behavior instrumentation that captures return values. The early exit optimization is particularly valuable here because:
1. Test discovery processes many test methods, but typically only a subset call the target function
2. The function operates on the hot path during test suite instrumentation
3. Large test suites with 100+ test methods (see test case showing 154% speedup for 150 methods) benefit significantly
The optimization maintains correctness - all test cases pass with identical output, confirming the early exit safely bypasses work that produces no changes when the function isn't present.
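The safety argument rests on the substring test being conservative: it can say "maybe" when no call exists, but it cannot say "no" when a call does exist. A small sketch with a hypothetical helper, `may_contain_call`, and made-up Java snippets illustrates this.

```python
def may_contain_call(body_lines, func_name):
    """Cheap prefilter: False guarantees that no call to func_name exists."""
    return func_name in "\n".join(body_lines)


cases = {
    "real call": ["int r = add(1, 2);"],
    "longer identifier": ["list.addAll(other);"],  # false positive
    "comment only": ["// add is tested elsewhere"],  # false positive
    "name absent": ["int r = 3;"],
}

for label, lines in cases.items():
    verdict = "run full parse" if may_contain_call(lines, "add") else "skip parse"
    print(f"{label}: {verdict}")
```

The two false positives cost only a parse that would have run anyway without the prefilter, so the output of the instrumentation is unchanged in every case.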
## PR Review Summary
### Prek Checks
Fixed 1 issue:
mypy: No type errors found.
### Code Review
No critical issues found. The optimization makes two targeted changes:
Both changes are correctness-preserving: the early exit cannot produce false negatives (if a function is called, its name must appear in the text).
### Test Coverage
Optimized lines coverage:
Note: This is a new file (relative to
Pre-existing test failure:
### Optimization PRs
24 open codeflash optimization PRs checked; all have CI failures (mostly JS e2e tests and Snyk). None eligible for merge.
Last updated: 2026-02-20
8c3a2b0 into `fix/java-direct-jvm-and-bugs`
⚡️ This pull request contains optimizations for PR #1580
If you approve this dependent PR, these changes will be merged into the original PR branch `fix/java-direct-jvm-and-bugs`.
📄 197% (1.97x) speedup for `_add_behavior_instrumentation` in `codeflash/languages/java/instrumentation.py`
⏱️ Runtime: 13.3 milliseconds → 4.49 milliseconds (best of 118 runs)
To edit these changes, `git checkout codeflash/optimize-pr1580-2026-02-20T10.00.27` and push.