⚡️ Speed up function find_last_node by 13,074%#271
Closed
codeflash-ai[bot] wants to merge 1 commit into `optimize` from `codeflash/optimize-find_last_node-mldg7yhf`
Conversation
The optimized code achieves a **130.74x speedup** (13,074% faster) by fundamentally restructuring the algorithm to eliminate nested iteration. The original implementation uses a nested generator expression with O(n×m) complexity: for each node, it scans every edge. The optimized version reduces this to O(n+m) by pre-computing a set of source IDs.
**Key optimization techniques:**
1. **Set-based lookup instead of nested iteration**: The original code calls `all(e["source"] != n["id"] for e in edges)` for each node, which means for every node, it iterates through all edges. The optimized version builds `sources = {e["source"] for e in edges}` once upfront, creating a hash set that enables O(1) membership checks via `n["id"] not in sources`.
2. **Early return path for empty edges**: When there are no edges, the optimized code immediately returns the first node without any ID lookups, matching the original's lazy evaluation behavior while being more explicit and faster.
3. **Explicit loop instead of a generator expression**: Replacing the generator expression with a straightforward for-loop eliminates the overhead of the generator machinery and keeps the control flow simple.
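The diff itself isn't rendered on this page, but from the description the before-and-after shapes are roughly as follows (a sketch: the exact signature, field names, and return convention in `src/algorithms/graph.py` are assumptions):

```python
from typing import Optional


def find_last_node_original(nodes: list[dict], edges: list[dict]) -> Optional[dict]:
    # O(n*m): for every candidate node, scan the full edge list.
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)), None
    )


def find_last_node_optimized(nodes: list[dict], edges: list[dict]) -> Optional[dict]:
    # Early return: with no edges, every node is terminal; take the first,
    # matching the lazy behavior of the original generator expression.
    if not edges:
        return nodes[0] if nodes else None
    # O(m) once: a hash set of source IDs enables O(1) membership tests.
    sources = {e["source"] for e in edges}
    # O(n) scan: the first node that never appears as a source is the last node.
    for n in nodes:
        if n["id"] not in sources:
            return n
    return None
```

On a simple chain graph both versions agree on the terminal node, and both return `None` when the graph is a cycle (every node is a source somewhere).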
**Performance across test cases:**
The optimization shows dramatic improvements across all scenarios:
- **Small graphs** (2-5 nodes): 71-129% faster
- **Medium graphs** (100-300 nodes): 100-6,038% faster
- **Large graphs** (500+ nodes): 16,450-23,607% faster
The speedup scales particularly well with graph size. For example:
- `test_large_scale_flow_find_last_node_performance` (500 nodes): 4.50ms → 27.2μs (16,450% faster)
- `test_large_scale_grid_graph` (900 nodes): 15.0ms → 63.5μs (23,607% faster)
- `test_large_scale_density_edge_case` (100 nodes, dense): 874μs → 14.2μs (6,038% faster)
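The scaling behavior behind these numbers can be reproduced with a small self-contained harness (illustrative only: the function names and the chain-shaped graph below are not the project's actual benchmark suite):

```python
import timeit


def last_node_nested(nodes, edges):
    # original O(n*m) shape: re-scan the edge list for every node
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)), None
    )


def last_node_set(nodes, edges):
    # optimized O(n+m) shape: build the source set once, then one pass
    sources = {e["source"] for e in edges}
    return next((n for n in nodes if n["id"] not in sources), None)


# 500-node chain 0 -> 1 -> ... -> 499; only the last node is never a source,
# so the nested version does a lot of wasted edge scanning before finding it.
nodes = [{"id": i} for i in range(500)]
edges = [{"source": i, "target": i + 1} for i in range(499)]

t_nested = timeit.timeit(lambda: last_node_nested(nodes, edges), number=20)
t_set = timeit.timeit(lambda: last_node_set(nodes, edges), number=20)
print(f"nested: {t_nested:.4f}s  set-based: {t_set:.4f}s  "
      f"speedup: {t_nested / t_set:.0f}x")
```

Absolute timings will differ by machine, but the gap widens as node and edge counts grow, consistent with the O(n×m) vs. O(n+m) analysis above.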
**Why this matters:**
This optimization is particularly valuable for graph algorithms that need to identify terminal nodes in flows or DAGs. The O(n+m) complexity means performance remains predictable even as graph size grows, whereas the original O(n×m) approach degrades rapidly with larger edge counts. The optimization maintains correctness across all edge cases including empty inputs, cycles, and type mismatches.
📄 13,074% (130.74x) speedup for `find_last_node` in `src/algorithms/graph.py`

⏱️ Runtime: 25.8 milliseconds → 196 microseconds (best of 250 runs)
✅ Correctness verification report:
To edit these changes, run `git checkout codeflash/optimize-find_last_node-mldg7yhf` and push.