⚡ Bolt: Optimize /stats endpoint with single aggregate query #600
base: main
@@ -53,21 +53,26 @@ def get_stats(db: Session = Depends(get_db)):
     if cached_stats:
         return JSONResponse(content=cached_stats)
 
-    # Optimized: Single aggregate query to calculate total and resolved issues
-    stats = db.query(
-        func.count(Issue.id).label("total"),
-        func.sum(case((Issue.status.in_(['resolved', 'verified']), 1), else_=0)).label("resolved")
-    ).first()
-
-    total = stats.total or 0
-    resolved = int(stats.resolved or 0)
+    # ⚡ Bolt Optimization: Consolidate multiple aggregate queries into a single database roundtrip
+    # by grouping by category and accumulating system-wide totals in Python.
+    results = db.query(
+        Issue.category,
+        func.count(Issue.id).label('count'),
+        func.sum(case((Issue.status.in_(['resolved', 'verified']), 1), else_=0)).label('resolved_count')
+    ).group_by(Issue.category).all()
+
+    total = 0
+    resolved = 0
+    issues_by_category = {}
+
+    for cat, count, res_count in results:
+        total += count
+        resolved += int(res_count or 0)
+        issues_by_category[cat] = count
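A minimal sketch of the consolidation in the diff above, written against the standard-library `sqlite3` module rather than the project's SQLAlchemy session. The `issues` table, its sample rows, and the two function names are assumptions made for illustration; only the query shape (group by category, conditional sum for resolved statuses, totals accumulated in Python) comes from the diff.

```python
import sqlite3

# Hypothetical in-memory table standing in for the Issue model.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issues (id INTEGER PRIMARY KEY, category TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO issues (category, status) VALUES (?, ?)",
    [
        ("pothole", "open"),
        ("pothole", "resolved"),
        ("garbage", "verified"),
        ("garbage", "open"),
        ("garbage", "open"),
    ],
)


def stats_single_query(conn):
    """One grouped query; system-wide totals are accumulated in Python."""
    rows = conn.execute(
        """
        SELECT category,
               COUNT(id),
               SUM(CASE WHEN status IN ('resolved', 'verified') THEN 1 ELSE 0 END)
        FROM issues
        GROUP BY category
        """
    ).fetchall()
    total, resolved, by_category = 0, 0, {}
    for category, issue_count, resolved_count in rows:
        total += issue_count
        resolved += int(resolved_count or 0)
        by_category[category] = issue_count
    return total, resolved, by_category


def stats_separate_queries(conn):
    """Baseline for comparison: one query per figure (three round trips)."""
    total = conn.execute("SELECT COUNT(id) FROM issues").fetchone()[0]
    resolved = conn.execute(
        "SELECT COUNT(id) FROM issues WHERE status IN ('resolved', 'verified')"
    ).fetchone()[0]
    by_category = dict(
        conn.execute("SELECT category, COUNT(id) FROM issues GROUP BY category")
    )
    return total, resolved, by_category


single_result = stats_single_query(conn)
baseline_result = stats_separate_queries(conn)
```

Both functions return the same tuple for this sample data, which is the property the optimization relies on: the per-category rows carry enough information to rebuild the system-wide totals.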
Comment on lines +68 to +71
-    for cat, count, res_count in results:
-        total += count
-        resolved += int(res_count or 0)
-        issues_by_category[cat] = count
+    for cat, issue_count, res_count in results:
+        total += issue_count
+        resolved += int(res_count or 0)
+        issues_by_category[cat] = issue_count
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if there are any NULL categories in the codebase that would trigger this issue
# Also verify how the original implementation handled NULL categories
# Search for how category is used/defaulted when creating issues
rg -n -A3 -B3 'category\s*=' --type=py -g '!*test*'
Repository: RohanExploit/VishwaGuru
Length of output: 10428
🏁 Script executed:
# Get the full context of the function around lines 68-71 in utility.py
head -n 85 backend/routers/utility.py | tail -n 30
Repository: RohanExploit/VishwaGuru
Length of output: 1014
🏁 Script executed:
# Search for the function that contains this code and see the full query
ast-grep --pattern $'func $_($_) {
  $$$
  for cat, count, res_count in results:
    $$$
  $$$
}'
Repository: RohanExploit/VishwaGuru
Length of output: 49
🏁 Script executed:
# Find the exact function and get broader context
cat -n backend/routers/utility.py | head -n 85
Repository: RohanExploit/VishwaGuru
Length of output: 3264
🏁 Script executed:
# Find the StatsResponse schema definition
rg -n "class StatsResponse" -A 10 --type=py
Repository: RohanExploit/VishwaGuru
Length of output: 928
🏁 Script executed:
# Also check if there's any validation or filtering of None categories
rg -n "issues_by_category" -B 3 -A 3 --type=py
Repository: RohanExploit/VishwaGuru
Length of output: 1665
🏁 Script executed:
# Check if there are any records being created with NULL category
rg -n "Issue(" -A 5 --type=py | grep -A 5 "Issue("
Repository: RohanExploit/VishwaGuru
Length of output: 129
🏁 Script executed:
# Verify Pydantic behavior with None keys - test what happens when None is used as dict key
python3 << 'EOF'
from typing import Dict
from pydantic import BaseModel
import json

class TestResponse(BaseModel):
    items: Dict[str, int]

# Test 1: None key in dict
test_dict = {None: 5, "other": 10}
try:
    response = TestResponse(items=test_dict)
    print("Pydantic accepts None key")
    print(f"model_dump(): {response.model_dump()}")
    print(f"model_dump(mode='json'): {response.model_dump(mode='json')}")
    print(f"JSON serialized: {json.dumps(response.model_dump(mode='json'))}")
except Exception as e:
    print(f"Pydantic error with None key: {type(e).__name__}: {e}")

# Test 2: What if we try to pass it directly
print("\n--- Direct JSON handling ---")
raw_dict = {None: 5, "other": 10}
try:
    json_str = json.dumps(raw_dict)
    print(f"json.dumps with None key: {json_str}")
except Exception as e:
    print(f"json.dumps error: {type(e).__name__}: {e}")
EOF
Repository: RohanExploit/VishwaGuru
Length of output: 419
🏁 Script executed:
# Check if there are any Issues being created without a category in the codebase
rg -n "Issue\(" -A 5 --type=py -g '!*test*' | head -100
Repository: RohanExploit/VishwaGuru
Length of output: 1787
🏁 Script executed:
# Check if there's a default value or validation that ensures category is never NULL
cat -n backend/models.py | grep -A 10 "class Issue"
Repository: RohanExploit/VishwaGuru
Length of output: 561
🏁 Script executed:
# Check if category is optional when creating Issues in routers
rg -n "category" backend/routers/issues.py -B 2 -A 2 | head -80
Repository: RohanExploit/VishwaGuru
Length of output: 3492
🏁 Script executed:
# Check the Issue creation endpoints to see if category is required
ast-grep --pattern 'def $_($_: $_Request, $_: Session):
    $$$
    Issue($$$)
    $$$'
Repository: RohanExploit/VishwaGuru
Length of output: 49
🏁 Script executed:
# Check if there are any actual issues with NULL category in the database
# by looking at any migrations or data seeding
fd -e "py" -e "sql" | xargs grep -l "category" | grep -i migrat
Repository: RohanExploit/VishwaGuru
Length of output: 49
🏁 Script executed:
# Verify if model_dump with mode='json' on Pydantic would raise an error with None keys
python3 << 'EOF'
from typing import Dict
from pydantic import BaseModel

class StatsResponse(BaseModel):
    total_issues: int
    resolved_issues: int
    pending_issues: int
    issues_by_category: Dict[str, int]

# Simulate what happens in the code
issues_by_category = {}
results = [(None, 5, 0), ("pothole", 10, 3)]  # Simulating GROUP BY with NULL category
total = 0
resolved = 0
for cat, count, res_count in results:
    total += count
    resolved += int(res_count or 0)
    issues_by_category[cat] = count
print(f"Dictionary before Pydantic: {issues_by_category}")
try:
    response = StatsResponse(
        total_issues=total,
        resolved_issues=resolved,
        pending_issues=total - resolved,
        issues_by_category=issues_by_category
    )
    print(f"Pydantic validation passed")
    data = response.model_dump(mode='json')
    print(f"model_dump(mode='json') result: {data}")
except Exception as e:
    print(f"ERROR: {type(e).__name__}: {e}")
EOF
Repository: RohanExploit/VishwaGuru
Length of output: 382
🏁 Script executed:
# Check if category is required in all Issue creation endpoints
rg -n "Form\(" backend/routers/issues.py | grep category
Repository: RohanExploit/VishwaGuru
Length of output: 165
🏁 Script executed:
# Check voice router category requirement
rg -n "issue_category" backend/routers/voice.py -B 3 -A 3
Repository: RohanExploit/VishwaGuru
Length of output: 604
🏁 Script executed:
# Check if there's any code path that could create Issues without category
rg -n "new_issue = Issue" -A 3 --type=py
Repository: RohanExploit/VishwaGuru
Length of output: 1054
🏁 Script executed:
# Check if there's a database constraint or validator that prevents NULL category
rg -n "nullable\|NOT NULL\|constraint" backend/models.py | grep -i "issue\|category"
Repository: RohanExploit/VishwaGuru
Length of output: 49
🏁 Script executed:
# Check if there's any validation in the Issue model itself
cat -n backend/models.py | sed -n '131,160p'
Repository: RohanExploit/VishwaGuru
Length of output: 1868
🏁 Script executed:
# Check if bot.py validates category before creating Issue
rg -n "category" backend/bot.py -B 5 -A 2 | head -40
Repository: RohanExploit/VishwaGuru
Length of output: 1503
🏁 Script executed:
# Final verification: Check if StatsResponse is used in any validation or response handling
rg -n "StatsResponse" --type=py -B 2 -A 2
Repository: RohanExploit/VishwaGuru
Length of output: 4294
NULL category values can cause a validation error in the /stats endpoint.
`Issue.category` is nullable (there is no `nullable=False` constraint in models.py), so if any issue has a NULL category, the grouped query returns a row whose `cat` is `None`. That `None` then becomes a dictionary key, and Pydantic validation fails because `StatsResponse.issues_by_category` is typed as `Dict[str, int]` (string keys only). The result is a `ValidationError` at runtime.
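To make the failure mode concrete, here is a small sketch using the standard-library `sqlite3` module instead of the project's SQLAlchemy model. The `issues` table, the sample rows, and the `non_string_keys` helper are hypothetical; the point is only that `GROUP BY` puts NULL categories in their own group, which arrives in Python as a `None` key.

```python
import sqlite3

# Hypothetical table with a nullable category column, mirroring the finding.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issues (id INTEGER PRIMARY KEY, category TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO issues (category, status) VALUES (?, ?)",
    [(None, "open"), (None, "resolved"), ("pothole", "open")],
)

# NULL categories form a single group; sqlite3 returns that group's key as None.
rows = conn.execute(
    "SELECT category, COUNT(id) FROM issues GROUP BY category"
).fetchall()
issues_by_category = {category: issue_count for category, issue_count in rows}


def non_string_keys(mapping):
    """Return the keys that a Dict[str, int] annotation would not accept."""
    return [key for key in mapping if not isinstance(key, str)]


bad_keys = non_string_keys(issues_by_category)
```

Here `bad_keys` contains `None`, which is the key a `Dict[str, int]` field is expected to reject.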
Coalesce None to a meaningful label:
     for cat, count, res_count in results:
         total += count
         resolved += int(res_count or 0)
-        issues_by_category[cat] = count
+        category_key = cat if cat else "Uncategorized"
+        issues_by_category[category_key] = count
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-    for cat, count, res_count in results:
-        total += count
-        resolved += int(res_count or 0)
-        issues_by_category[cat] = count
+    for cat, count, res_count in results:
+        total += count
+        resolved += int(res_count or 0)
+        category_key = cat if cat else "Uncategorized"
+        issues_by_category[category_key] = count
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/routers/utility.py` around lines 68 - 71, The loop building
issues_by_category uses raw cat values which can be None and later break
Pydantic validation for StatsResponse.issues_by_category (Dict[str,int]); update
the loop in the stats-building code that iterates "for cat, count, res_count in
results" to coalesce None to a string label (e.g. label = cat if cat is not None
else "Uncategorized"), use that label as the dictionary key when assigning
issues_by_category[label] = count, and keep resolved/total logic the same (use
int(res_count or 0)). Ensure the key type is str so StatsResponse accepts it.
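The committable suggestion and the prompt above coalesce slightly differently: one tests truthiness (`cat if cat else ...`), the other tests `is not None`. The sketch below compares the two on made-up rows; the helper names and sample data are hypothetical. It accumulates with `+=` rather than assigning, which is an assumption added here so that two source groups mapping to the same label do not overwrite each other.

```python
def label_if_truthy(category, fallback="Uncategorized"):
    """Treats None and the empty string alike."""
    return category if category else fallback


def label_if_not_none(category, fallback="Uncategorized"):
    """Replaces only None; an empty-string category keeps its own bucket."""
    return category if category is not None else fallback


def build_category_counts(rows, label):
    """Accumulate counts per label so merged groups are summed, not overwritten."""
    counts = {}
    for category, issue_count, _resolved_count in rows:
        key = label(category)
        counts[key] = counts.get(key, 0) + issue_count
    return counts


# Hypothetical grouped rows: (category, count, resolved_count).
rows = [(None, 5, 0), ("", 2, 1), ("pothole", 10, 3)]
truthy_counts = build_category_counts(rows, label_if_truthy)
strict_counts = build_category_counts(rows, label_if_not_none)
```

With the truthiness check, the NULL and empty-string groups collapse into one label, so a plain assignment would keep only the last count seen. Whether empty strings should be merged this way is a product decision the thread does not settle.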
`Issue.category` is nullable in the model, but `StatsResponse.issues_by_category` is typed as `Dict[str, int]`. If any rows have `NULL` categories, this loop will produce a `None` key and the `StatsResponse(...)` construction can fail validation. Consider coalescing `Issue.category` to a non-null string in the query (e.g., an explicit "uncategorized" bucket) so the response shape is always valid and stable.
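Coalescing in the query, as suggested here, could look like the following sketch. It uses the standard-library `sqlite3` module with a hypothetical `issues` table and sample rows. In the project's SQLAlchemy code the equivalent would presumably select and group by `func.coalesce(Issue.category, "uncategorized")`, but that has not been checked against the repository.

```python
import sqlite3

# Hypothetical table with a nullable category column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issues (id INTEGER PRIMARY KEY, category TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO issues (category, status) VALUES (?, ?)",
    [(None, "open"), (None, "resolved"), ("pothole", "verified"), ("pothole", "open")],
)

# COALESCE replaces NULL before grouping, so every group key is a string.
rows = conn.execute(
    """
    SELECT COALESCE(category, 'uncategorized'),
           COUNT(id),
           SUM(CASE WHEN status IN ('resolved', 'verified') THEN 1 ELSE 0 END)
    FROM issues
    GROUP BY COALESCE(category, 'uncategorized')
    """
).fetchall()

total = 0
resolved = 0
issues_by_category = {}
for category_key, issue_count, resolved_count in rows:
    total += issue_count
    resolved += int(resolved_count or 0)
    issues_by_category[category_key] = issue_count

all_keys_are_strings = all(isinstance(key, str) for key in issues_by_category)
```

Doing the replacement in SQL keeps the Python loop unchanged and guarantees string keys regardless of how rows were inserted. One trade-off is that a real category literally named "uncategorized" would share the bucket.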