⚡️ Speed up function extract_init_stub_from_class by 70% in PR #1524 (fixes-for-core-unstructured-experimental) #1529
Merged
KRRT7 merged 2 commits into fixes-for-core-unstructured-experimental on Feb 18, 2026
The optimized code achieves a **70% runtime speedup** (from 7.02ms to 4.13ms) through three key improvements:

## 1. **Faster Class Discovery via Deque-Based BFS (Primary Speedup)**

The original code uses `ast.walk()`, which recursively traverses the entire AST tree even after finding the target class. The line profiler shows this taking 20.5ms (71% of time).

The optimized version replaces this with an explicit BFS using `collections.deque`, which stops immediately upon finding the target class. The profiler shows this reduces traversal time to 9.95ms, **cutting the search overhead by >50%**.

This is especially impactful when:

- The target class appears early in the module (eliminates unnecessary traversal)
- The module contains many classes (test shows 7-10% faster on modules with 100-1000 classes)
- The function is called frequently (shown by the 108% speedup on 1000 repeated calls)

## 2. **Explicit Loops Replace Generator Overhead**

The original code uses `any()` with a generator expression and `min()` with a generator to check decorators and find minimum line numbers. These create function call and generator overhead.

The optimized version uses explicit `for` loops with early breaks:

- Decorator checking: Directly iterates and breaks on first match
- Min line number: Uses explicit comparison instead of `min()` generator

The profiler shows decorator processing time reduced from ~1.4ms to ~0.3ms, and min line calculation from 69μs to 28μs.

## 3. **Conditional Flag Pattern for Relevance Checking**

Instead of evaluating both conditions in a compound expression, the optimized version uses an `is_relevant` flag with early exits, reducing redundant checks.
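The first two changes can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper names `find_class_bfs`, `has_decorator`, and `first_line` are hypothetical, and the sample source is made up for the example.

```python
import ast
from collections import deque
from typing import Optional


def find_class_bfs(module: ast.AST, class_name: str) -> Optional[ast.ClassDef]:
    """Breadth-first search that returns as soon as the class is found (hypothetical helper)."""
    queue = deque([module])
    while queue:
        node = queue.popleft()
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            return node  # early exit: nodes still in the queue are never visited
        queue.extend(ast.iter_child_nodes(node))
    return None


def has_decorator(func: ast.FunctionDef, name: str) -> bool:
    """Explicit loop with an early return, in place of any(<generator>)."""
    for dec in func.decorator_list:
        if isinstance(dec, ast.Name) and dec.id == name:
            return True
    return False


def first_line(func: ast.FunctionDef) -> int:
    """Smallest line number among the def and its decorators, without min()."""
    start = func.lineno
    for dec in func.decorator_list:
        if dec.lineno < start:
            start = dec.lineno
    return start


# Made-up input: a nested class with one decorated method.
source = '''
class Outer:
    class Inner:
        @property
        def value(self):
            return 1
'''
tree = ast.parse(source)
inner = find_class_bfs(tree, "Inner")
method = inner.body[0]
```

The search returns from inside the loop, so the cost depends on how early the class appears in breadth-first order rather than on the size of the whole module.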
## Impact on Workloads

Based on `function_references`, this function is called from:

- `enrich_testgen_context`: Used in test generation workflows where it may process many classes
- Benchmark tests: Indicates this is in a performance-critical path

The optimization particularly benefits:

- **Large codebases**: 89-90% faster on classes with 100+ methods or 50+ properties
- **Repeated calls**: 108% faster when called 1000 times in sequence
- **Early matches**: Up to 88% faster when target class is found quickly
- **Deep nesting**: 57% faster for nested classes

The annotated tests show consistent 50-108% speedups across most scenarios, with minimal gains (6-10%) only when processing very large files where string slicing dominates runtime.
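A repeated-call measurement of this kind could be reproduced roughly as below. This is only a sketch under stated assumptions: `parse_and_find` is a hypothetical stand-in for the real function, the synthetic module of 200 classes is made up, and the run and call counts are kept small so the example finishes quickly.

```python
import ast
import timeit

# Made-up workload: a module containing 200 small classes.
SOURCE = "\n".join(
    f"class C{i}:\n    def __init__(self):\n        self.x = {i}" for i in range(200)
)


def parse_and_find(name: str) -> bool:
    """Hypothetical stand-in: parse the module and look for a class by name."""
    tree = ast.parse(SOURCE)
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and node.name == name:
            return True
    return False


def best_of(runs: int, calls: int) -> float:
    """Return the fastest total time (seconds) for `calls` calls across `runs` runs."""
    timings = timeit.repeat(lambda: parse_and_find("C199"), number=calls, repeat=runs)
    return min(timings)


best = best_of(runs=3, calls=20)
print(f"best of 3 runs, 20 calls each: {best * 1000:.2f} ms")
```

Taking the minimum over several runs reduces the influence of scheduler noise, which is presumably why the reported runtime is a best-of-N figure.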
Merged commit 2364096 into fixes-for-core-unstructured-experimental
26 of 27 checks passed
## PR Review Summary

### Prek Checks

✅ Fixed (2 issues auto-fixed and committed):

Additionally fixed 6 mypy type errors in

All prek and mypy checks pass after fixes.

### Code Review

No critical bugs or security vulnerabilities found. Notable observations:

### Test Coverage

Coverage improved from 85% → 91% for the main source file. Tests were updated to match the new API (removed functions replaced with new test cases). 8 pre-existing test failures in

Last updated: 2026-02-18T14:57Z
⚡️ This pull request contains optimizations for PR #1524

If you approve this dependent PR, these changes will be merged into the original PR branch `fixes-for-core-unstructured-experimental`.

📄 70% (0.70x) speedup for `extract_init_stub_from_class` in `codeflash/languages/python/context/code_context_extractor.py`

⏱️ Runtime: 7.02 milliseconds → 4.13 milliseconds (best of 41 runs)
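The headline figure is consistent with a speedup measured relative to the optimized runtime. That definition is an assumption inferred from the numbers, not something the PR states; the sketch below checks the arithmetic and contrasts it with the fraction of wall-clock time saved.

```python
def speedup(original_ms: float, optimized_ms: float) -> float:
    """Gain relative to the optimized runtime (assumed definition)."""
    return (original_ms - optimized_ms) / optimized_ms


def time_saved(original_ms: float, optimized_ms: float) -> float:
    """Fraction of the original runtime that was eliminated."""
    return (original_ms - optimized_ms) / original_ms


gain = speedup(7.02, 4.13)
saved = time_saved(7.02, 4.13)
print(f"speedup: {gain:.2f}x ({gain:.0%}), time saved: {saved:.0%}")
```

Under this reading, a 70% speedup corresponds to roughly 41% less runtime per call.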
✅ Correctness verification report:
To edit these changes, `git checkout codeflash/optimize-pr1524-2026-02-18T14.38.26` and push.