⚡️ Speed up function discover_functions_from_source by 12% in PR #1199 (omni-java) #1293
Closed
codeflash-ai[bot] wants to merge 1 commit into omni-java
This optimization achieves an **11% runtime reduction** (from 21.3ms to 19.0ms, which is the 12% speedup quoted in the title, since 21.3 / 19.0 ≈ 1.12) through several targeted changes to the Java code discovery pipeline:
## Key Optimizations
### 1. **Module-Level Import Hoisting**
The `fnmatch` module is now imported once at the module level instead of being conditionally imported inside `_should_include_method` on every pattern check. This eliminates the per-call cost of re-executing the `import` statement (a `sys.modules` lookup plus a local name binding) when filtering methods with include/exclude patterns. In the line profiler, pattern matching checks were consuming ~5-11% of total time in the original code.
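As a rough illustration of the hoisting described above, the sketch below contrasts the two shapes. The helper names `matches_any_local_import` and `matches_any` are hypothetical and are not taken from `discovery.py`; only the use of `fnmatch` comes from the PR description.

```python
import fnmatch  # hoisted: the import statement runs once, when the module loads


def matches_any_local_import(name: str, patterns: list[str]) -> bool:
    # "Before" shape (hypothetical): the import statement executes on every call.
    import fnmatch as _fnmatch

    return any(_fnmatch.fnmatch(name, pattern) for pattern in patterns)


def matches_any(name: str, patterns: list[str]) -> bool:
    # "After" shape (hypothetical): uses the module-level name directly.
    return any(fnmatch.fnmatch(name, pattern) for pattern in patterns)
```

Both helpers return the same results; the only difference is where the `import` statement is executed.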
### 2. **Default Path Pre-computation**
The fallback `Path("unknown.java")` is now computed once before the loop (`default_file_path = file_path or Path("unknown.java")`) rather than 1,224 times inside the loop. The line profiler shows this change reduced time spent on the file_path assignment from **12.2%** to **0.3%** of total function time - a critical improvement since this line was the second-most expensive operation in the original code.
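A minimal sketch of the pre-computation, assuming a simplified loop over method names. The functions `label_methods_before` and `label_methods_after` are illustrative stand-ins, not the real discovery loop; only `default_file_path = file_path or Path("unknown.java")` is quoted from the PR.

```python
from pathlib import Path
from typing import Optional


def label_methods_before(names: list[str], file_path: Optional[Path] = None):
    # "Before" shape (hypothetical): the fallback expression is evaluated on
    # every iteration, building a new Path each time file_path is None.
    return [(name, file_path or Path("unknown.java")) for name in names]


def label_methods_after(names: list[str], file_path: Optional[Path] = None):
    # "After" shape: resolve the fallback once, then reuse the same object.
    default_file_path = file_path or Path("unknown.java")
    return [(name, default_file_path) for name in names]
```

The results compare equal either way; the second version simply allocates at most one fallback `Path` per call instead of one per method.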
### 3. **Early Exit Reordering in Filters**
The `include_methods` check is moved earlier in `_should_include_method`, before the more expensive pattern matching operations. This allows the function to exit early for methods that should be excluded due to being class methods, avoiding unnecessary fnmatch calls. The line count calculation is also made conditional - only computed when `min_lines` or `max_lines` criteria are actually set, reducing unnecessary arithmetic for 1,022 out of 1,341 invocations.
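The reordering can be sketched as follows. This is not the actual `_should_include_method`: the `Criteria` and `Method` dataclasses and the fields `include_patterns`, `exclude_patterns`, `class_name`, `start_line`, and `end_line` are assumptions made for illustration. Only `include_methods`, `min_lines`, and `max_lines` are named in the PR description.

```python
import fnmatch
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Criteria:  # hypothetical stand-in for the real filter criteria
    include_methods: bool = True
    include_patterns: list[str] = field(default_factory=list)
    exclude_patterns: list[str] = field(default_factory=list)
    min_lines: Optional[int] = None
    max_lines: Optional[int] = None


@dataclass
class Method:  # hypothetical stand-in for a discovered Java method
    name: str
    start_line: int
    end_line: int
    class_name: Optional[str] = None


def should_include(method: Method, criteria: Criteria) -> bool:
    # Cheapest check first: a boolean test that can reject class methods
    # before any fnmatch call is made.
    if not criteria.include_methods and method.class_name is not None:
        return False
    if criteria.include_patterns and not any(
        fnmatch.fnmatch(method.name, p) for p in criteria.include_patterns
    ):
        return False
    if any(fnmatch.fnmatch(method.name, p) for p in criteria.exclude_patterns):
        return False
    # Compute the line count only when a line-based criterion is set.
    if criteria.min_lines is not None or criteria.max_lines is not None:
        line_count = method.end_line - method.start_line + 1
        if criteria.min_lines is not None and line_count < criteria.min_lines:
            return False
        if criteria.max_lines is not None and line_count > criteria.max_lines:
            return False
    return True
```

Ordering the checks from cheapest to most expensive does not change which methods pass the filter; it only changes how much work is done before a rejection.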
## Performance Impact by Test Case
The optimizations particularly benefit scenarios with:
- **Multiple methods with patterns**: 9-31% faster (e.g., `test_large_scale_many_methods_under_limit` shows 29.9% improvement)
- **File path handling**: Tests that provide explicit paths see consistent 3-18% improvements
- **Line count filtering**: 18.6% faster when min/max line criteria are active
Tests that regressed slightly (showing slower times) are edge cases with very few methods where the overhead of the additional conditional check (`if criteria.min_lines is not None or criteria.max_lines is not None`) marginally exceeds savings, but these represent atypical usage with only 1-2 methods.
## Why This Matters
While individual micro-optimizations are small, they compound significantly in the hot loop that processes all discovered methods. With 1,650+ method invocations in typical runs, eliminating repeated imports, reducing object allocations, and enabling early exits creates measurable aggregate savings. The 11% runtime improvement demonstrates how loop-level optimizations scale effectively for Java codebases with many methods.
Collaborator comment: Closing stale bot PR.
⚡️ This pull request contains optimizations for PR #1199

If you approve this dependent PR, these changes will be merged into the original PR branch `omni-java`.

📄 12% (0.12x) speedup for `discover_functions_from_source` in `codeflash/languages/java/discovery.py`

⏱️ Runtime: 21.3 milliseconds → 19.0 milliseconds (best of 19 runs)
To edit these changes, `git checkout codeflash/optimize-pr1199-2026-02-03T07.34.56` and push.