feat: Proactive contributor assistance with lightweight pattern detection #283
mahek2016 wants to merge 3 commits into AOSSIE-Org:main
Conversation
📝 Walkthrough
Replaces LLM-driven triage with lightweight rule-based and service-backed flows across the devrel stack: adds private pattern/fallback triage helpers, swaps Discord queue/agent execution for direct GitHubToolkit calls, introduces IssueSuggestionService and new GitHub API endpoints/routes, and updates the main app and requirements accordingly.
Sequence Diagram(s)
sequenceDiagram
participant User as User (Discord)
participant Bot as Discord Bot
participant Toolkit as GitHubToolkit
participant Service as IssueSuggestionService
participant GitHub as GitHub API
User->>Bot: message (question / request)
Bot->>Toolkit: classify_intent / execute(message)
Toolkit->>Toolkit: rule-based intent classification
alt find_good_first_issues
Toolkit->>Service: fetch_global_beginner_issues(query)
Service->>GitHub: GET /search/issues (Bearer token)
GitHub-->>Service: search results (JSON)
Service-->>Toolkit: list of simplified issues
Toolkit-->>Bot: structured result with issues
else other intents
Toolkit->>Toolkit: dispatch to appropriate handler (web_search, contributor_recommendation, etc.)
Toolkit-->>Bot: structured result or placeholder
end
Bot->>User: send formatted response (issues or handler output)
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 3
❌ Failed checks (2 warnings, 1 inconclusive)
✅ Passed checks (2 passed)
Actionable comments posted: 3
🧹 Nitpick comments (5)
backend/app/classification/classification_router.py (3)
104-104: Move `import json` to the top of the file. Importing inside a function body on every LLM-response parse is a minor code smell. Standard library imports belong at the module level.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/classification/classification_router.py` at line 104, Move the "import json" statement out of the function body and place it with the other module-level imports at the top of classification_router.py; remove the inline "import json" (currently inside the LLM-response parsing code) so the standard library is imported once at module import time rather than on every call.
28-69: Substring matching is prone to false positives.
`any(k in msg for k in ...)` performs naive substring containment. Examples:
- `"beginner"` matches `"I'm not a beginner"` or `"the beginner issue is too hard"`
- `"thanks"` matches `"no thanks"`, `"thanksgiving"`
- `"new here"` matches `"nothing new here to report"`
Meanwhile, greetings use exact match (`msg in greetings`), so `"hi!"` or `"hello there"` won't trigger: inconsistent sensitivity. Consider adding word-boundary checks (e.g., regex `\b` anchors) or at minimum tightening the keyword lists to reduce false triggers. This is acceptable for an MVP but worth flagging.
💡 Example: use regex word boundaries for more robust matching
+import re
+
 def _simple_pattern_match(self, message: str):
     msg = message.lower().strip()
-    greetings = ["hi", "hello", "hey"]
-    thanks = ["thanks", "thank you"]
-    onboarding_keywords = ["new here", "how to start", "beginner", "first time"]
-    issue_keywords = ["good first issue", "beginner issue", "start contributing"]
+    greeting_pattern = re.compile(r"^(hi|hello|hey)[\s!?.]*$")
+    thanks_pattern = re.compile(r"\b(thanks|thank you)\b")
+    onboarding_pattern = re.compile(r"\b(new here|how to start|beginner|first time)\b")
+    issue_pattern = re.compile(r"\b(good first issue|beginner issue|start contributing)\b")

-    if msg in greetings:
+    if greeting_pattern.match(msg):
         ...
-    if any(k in msg for k in onboarding_keywords):
+    if onboarding_pattern.search(msg):
         ...
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/classification/classification_router.py` around lines 28 - 69, Current substring checks on msg (using any(k in msg for k in ...)) cause false positives and greetings use an inconsistent exact-match approach; replace these with word-boundary or tokenized matching. Update checks for greetings, onboarding_keywords, issue_keywords, and thanks to use regex searches with \b anchors (e.g., compile patterns for greetings like r'^(hi|hello|hey)\b' and for lists use r'\b(keyword)\b' or join list items into a single alternation), or alternatively tokenize msg into words and check set intersections against the keyword sets; ensure msg is normalized (lower/stripped) before matching and precompile patterns for performance.
76-76: Use explicit `Optional` for the `context` parameter.
Per PEP 484, `Dict[str, Any] = None` should be `Optional[Dict[str, Any]] = None` (or `Dict[str, Any] | None = None`). Ruff RUF013 flags this correctly.
Proposed fix
-from typing import Dict, Any
+from typing import Dict, Any, Optional
 ...
 async def should_process_message(
     self,
     message: str,
-    context: Dict[str, Any] = None
+    context: Optional[Dict[str, Any]] = None
 ) -> Dict[str, Any]:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/classification/classification_router.py` at line 76, The parameter annotation "context: Dict[str, Any] = None" is not explicit about optionality; update it to "context: Optional[Dict[str, Any]] = None" (or "context: Dict[str, Any] | None = None" if using 3.10+), and add the corresponding import for Optional from typing (or ensure the union syntax is supported by the project's Python version); apply this change where the "context" parameter is declared so static checkers (Ruff RUF013) no longer flag it.
backend/integrations/discord/bot.py (2)
80-112: Extract proactive response templates into constants or a config map.
Four inline multiline strings couple presentation with control flow. As this feature grows (more proactive types, i18n, A/B testing), maintaining these inline will be painful. Consider a dictionary mapping `proactive_type → template string` at the module or class level, reducing the handler to a simple lookup + send.
💡 Sketch
PROACTIVE_RESPONSES = {
    "greeting": (
        "Hi {mention}! 👋\n"
        "Welcome to the community!\n"
        "If you're new, I can guide you on how to start contributing 🚀"
    ),
    "onboarding": (
        "Awesome {mention}! 🎉\n"
        "Here's how you can start:\n"
        "1️⃣ Look for `good first issue`\n"
        "2️⃣ Set up the project locally\n"
        "3️⃣ Read CONTRIBUTING.md\n\n"
        "Would you like me to suggest beginner-friendly issues?"
    ),
    "issue_suggestion": (
        "{mention} 🔍\n"
        "You can check open issues labeled `good first issue`.\n"
        "Would you like me to fetch some right now?"
    ),
}
Then in the handler:
template = PROACTIVE_RESPONSES.get(proactive_type)
if template:
    await message.channel.send(template.format(mention=message.author.mention))
    return
if proactive_type == "acknowledgment":
    return
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/integrations/discord/bot.py` around lines 80 - 112, Extract the inline proactive reply strings into a module- or class-level mapping (e.g., PROACTIVE_RESPONSES) and replace the multiple inline message.channel.send calls in the triage handling block (where triage_result and proactive_type are used) with a single lookup + send flow: lookup template = PROACTIVE_RESPONSES.get(proactive_type), if template then await message.channel.send(template.format(mention=message.author.mention)) and return; keep the explicit early return for the "acknowledgment" proactive_type. This centralizes templates for easier maintenance, i18n and testing while preserving the existing control flow in the handler.
84-109: Proactive responses bypass thread creation, with no rate-limiting or dedup.
Every matching message (e.g., a user saying "hi" repeatedly) will trigger a new channel-level response with no cooldown. Consider adding a lightweight per-user cooldown (e.g., a TTL dict keyed by `(user_id, proactive_type)`) to avoid spamming the channel; a sketch of such a cooldown follows the prompt below.
Verify each finding against the current code and only fix it if needed. In `@backend/integrations/discord/bot.py` around lines 84 - 109, The proactive handlers in backend/integrations/discord/bot.py (the blocks checking proactive_type and calling message.channel.send) currently send channel-level replies every match with no dedup or rate-limit; add a lightweight per-user cooldown before each send by introducing a process-wide TTL cache (e.g., a dict mapping (message.author.id, proactive_type) to expiry timestamp), check the cache at the top of the proactive path and skip sending if the entry exists and hasn't expired, and after sending set/update the cache with a short cooldown (e.g., 60–300s) so subsequent identical triggers are suppressed; ensure the keys reference message.author.id and the proactive_type used in those if-blocks so the check applies to greeting/onboarding/issue_suggestion separately.
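A minimal sketch of such a cooldown, written as a module-level helper (the name `cooldown_active` and the 120-second window are illustrative assumptions, not part of the PR):
import time

PROACTIVE_COOLDOWN_SECONDS = 120  # illustrative window
_proactive_cooldown: dict = {}  # (user_id, proactive_type) -> last-reply timestamp

def cooldown_active(user_id: int, proactive_type: str) -> bool:
    """Return True if this (user, type) pair was answered recently; otherwise record now."""
    now = time.monotonic()
    key = (user_id, proactive_type)
    last = _proactive_cooldown.get(key)
    if last is not None and now - last < PROACTIVE_COOLDOWN_SECONDS:
        return True
    _proactive_cooldown[key] = now
    return False
The handler would then check `if cooldown_active(message.author.id, proactive_type): return` before sending, so each proactive type is throttled separately per user.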
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/classification/classification_router.py`:
- Around line 62-69: The acknowledgment branch in classification_router.py
currently returns needs_devrel: False which makes bot._handle_devrel_message
(and the acknowledgment handler in bot.py) unreachable; either change the
acknowledgment return to needs_devrel: True so the triage gate if
triage_result.get("needs_devrel", False) will invoke _handle_devrel_message (and
let the existing acknowledgment handler run), or remove the dead acknowledgment
handler from bot.py (lines ~111-112) and keep needs_devrel: False to silently
drop "thanks" messages—update the JSON return in the acknowledgment branch (the
dict with "proactive_type": "acknowledgment") or remove the handler in bot.py
accordingly (a minimal sketch of the first option appears after this list).
In `@backend/integrations/discord/bot.py`:
- Around line 111-112: The if-branch checking proactive_type == "acknowledgment"
in bot.py is dead because acknowledgment results have needs_devrel: False and
thus on_message never calls _handle_devrel_message for them; remove this
unreachable branch or adjust the caller logic to pass acknowledgment cases if
you intended to handle them here. Locate the conditional around proactive_type
in _handle_devrel_message (or the surrounding devrel handling block referenced
by on_message) and either delete the acknowledgment branch or update
on_message/_handle_devrel_message to ensure acknowledgment results are routed
here before keeping the branch.
- Around line 92-101: The string passed to message.channel.send inside the
proactive_type == "onboarding" branch contains a Unicode RIGHT SINGLE QUOTATION
MARK in "Here’s"; replace it with the ASCII apostrophe so the literal becomes
"Here's" to satisfy Ruff RUF001 and maintain consistent ASCII punctuation in the
onboarding message; update the message in the message.channel.send call
accordingly and run the linter to verify the warning is cleared.
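A minimal sketch of the first option from the acknowledgment comment above (the dict keys follow the comment; the priority and reasoning values are illustrative):
# In the acknowledgment branch of the triage helper:
return {
    "needs_devrel": True,  # was False, which left the bot-side acknowledgment handler unreachable
    "priority": "low",
    "reasoning": "Acknowledgment message - lightweight handling only",
    "original_message": message,
    "proactive_type": "acknowledgment",
}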
@coderabbitai Thanks for the review!
All review suggestions have been addressed. Please let me know if anything else needs improvement.
…itHub API integration
Actionable comments posted: 19
🧹 Nitpick comments (12)
backend/app/agents/devrel/github/github_toolkit.py (2)
120-125: `IssueSuggestionService` is instantiated on every `execute()` call.
A new service instance (and implicitly a new HTTP client lifecycle) is created for each query. Move the instantiation to `__init__` to reuse the service across calls.
Proposed fix
 class GitHubToolkit:
     def __init__(self):
+        self.issue_service = IssueSuggestionService(settings.github_token)
         self.tools = [ ... ]
Then in `execute`:
-        service = IssueSuggestionService(settings.github_token)
-        issues = await service.fetch_global_beginner_issues(query)
+        issues = await self.issue_service.fetch_global_beginner_issues(query)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/agents/devrel/github/github_toolkit.py` around lines 120 - 125, The code currently instantiates IssueSuggestionService(settings.github_token) inside execute(), creating a new HTTP client per call; move the instantiation into the class __init__ (store it as self.issue_suggestion_service) and update execute() to call self.issue_suggestion_service.fetch_global_beginner_issues(query) so the same IssueSuggestionService instance (and its underlying HTTP client) is reused across calls.
47-76: Rule-based classifier uses overly broad single-keyword matches.
Keywords like `"repo"` and `"search"` are very common words. For example:
- `"I want to report a bug"` contains `"repo"` → classified as `repo_support` (false positive from substring match of "repo" in "report")
- `"I'm searching for help"` → classified as `web_search`
Consider using multi-word phrases, word boundaries (`\b` regex), or at minimum checking for whole-word matches.
Example: tighten keyword matching
-    elif "repo" in query_lower:
+    elif "repository" in query_lower or " repo " in f" {query_lower} ":
         classification = "repo_support"
-    elif "search" in query_lower:
+    elif "web search" in query_lower or query_lower.startswith("search "):
         classification = "web_search"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/agents/devrel/github/github_toolkit.py` around lines 47 - 76, The rule-based classifier in classify_intent uses broad substring checks (e.g., checking "repo" or "search" in query_lower) causing false positives; change those checks to use whole-word or phrase matching (e.g., regex with word boundaries or split/token-based matching) and prefer multi-word phrases for intents like "good first issue" and "github support" to avoid matching substrings (e.g., "report" -> "repo"); update the conditions in classify_intent to perform re.search(r"\brepo\b", ...) or equivalent token checks and add tests for cases like "report" and "searching" to verify correct classification.
backend/routes.py (1)
122-127: Chain the original exception with `raise ... from e`.
Per Ruff B904, re-raising without `from` loses the original traceback context.
Fix
 except Exception as e:
     logging.error(f"Error fetching beginner issues: {e}")
     raise HTTPException(
         status_code=500,
         detail="Failed to fetch beginner issues"
-    )
+    ) from e
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes.py` around lines 122 - 127, The except Exception as e block that logs and re-raises an HTTPException when failing to fetch beginner issues should chain the original exception to preserve traceback; update the re-raise of HTTPException(status_code=500, detail="Failed to fetch beginner issues") to use "from e" (i.e., raise HTTPException(...) from e) while keeping the logging call intact so the original exception context is preserved for debugging.
backend/app/classification/classification_router.py (3)
30-35: Exact-match greetings won't catch common variations.
`msg in greetings` requires an exact match after `lower().strip()`, so inputs like "hi there", "hello!", or "hey everyone" will miss. If the goal is proactive onboarding on greetings, consider `msg.startswith(...)` or `any(msg.startswith(g) for g in greetings)`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/classification/classification_router.py` around lines 30 - 35, The current exact-match check (if msg in greetings) in classification_router.py will miss variants like "hi there" or "hello!"; update the condition that uses the greetings list to use a starts-with or substring test (e.g., any(msg.startswith(g) or msg.startswith(g + " ") or g in msg for g in greetings)) after msg = msg.lower().strip(), and consider stripping trailing punctuation before matching; locate the greetings variable and the conditional that checks "if msg in greetings" and replace it with the more permissive any(...) test so greetings like "hi there", "hello!", and "hey everyone" are correctly detected.
73-77: PEP 484: Use explicit `Optional` instead of implicit `None` default.
`context: Dict[str, Any] = None` should be `context: Dict[str, Any] | None = None` (or `Optional[Dict[str, Any]] = None`). This is flagged by Ruff RUF013.
Fix
 async def should_process_message(
     self,
     message: str,
-    context: Dict[str, Any] = None
+    context: Dict[str, Any] | None = None
 ) -> Dict[str, Any]:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/classification/classification_router.py` around lines 73 - 77, The type annotation for the method should_process_message uses an implicit None default (context: Dict[str, Any] = None) which flags Ruff RUF013; change the parameter to an explicit optional type such as context: Dict[str, Any] | None = None (or use Optional[Dict[str, Any]] = None if you prefer typing.Optional) so the signature clearly expresses that context may be None while keeping the default None value; update the import if you choose Optional.
81-85: `if False and pattern_result:` permanently disables pattern matching (dead code).
The `if False` condition means `_simple_pattern_match` is called but its result is always discarded. The entire "Step 1" branch is unreachable. If this is a feature flag for future use, consider using a configuration setting or removing the call until it's ready; a bare `if False` is confusing and easy to miss in review.
Suggestion: use a config flag or remove
-    # Step 1: Lightweight proactive pattern check
-    pattern_result = self._simple_pattern_match(message)
-    if False and pattern_result:
-        logger.info("Pattern-based proactive classification triggered")
-        return pattern_result
+    # Step 1: Lightweight proactive pattern check (disabled; enable via config)
+    if getattr(settings, 'enable_pattern_matching', False):
+        pattern_result = self._simple_pattern_match(message)
+        if pattern_result:
+            logger.info("Pattern-based proactive classification triggered")
+            return pattern_result
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/classification/classification_router.py` around lines 81 - 85, The branch using "if False and pattern_result" permanently disables the proactive pattern logic while still calling _simple_pattern_match; replace this dead flag with a real configuration toggle (e.g., self.config.enable_proactive_pattern or a module-level setting) and guard the call so _simple_pattern_match(message) is only invoked when the flag is true; if you prefer to remove the feature, delete the call to _simple_pattern_match and the unreachable if-block entirely (and keep the logger/info and return behavior only when the config flag is enabled).backend/integrations/discord/bot.py (2)
backend/integrations/discord/bot.py (2)
58-60: `GitHubToolkit()` is instantiated on every incoming message.
Each message creates a new `GitHubToolkit` instance (and downstream, a new `IssueSuggestionService`). Move toolkit initialization to `__init__` so it's reused across messages.
Proposed fix
 def __init__(self, **kwargs):
     ...
     self.active_threads: Dict[str, str] = {}
+    self.toolkit = GitHubToolkit()

-    # 🔥 Direct Toolkit Execution
-    toolkit = GitHubToolkit()
-    result = await toolkit.execute(message.content)
+    # 🔥 Direct Toolkit Execution
+    result = await self.toolkit.execute(message.content)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/integrations/discord/bot.py` around lines 58 - 60, Currently GitHubToolkit() (and thus IssueSuggestionService) is instantiated per message; move the creation to the bot class constructor by adding self.toolkit = GitHubToolkit() in __init__ and update the message handler to call await self.toolkit.execute(message.content) instead of creating a new GitHubToolkit; ensure any other places that constructed IssueSuggestionService are updated to reuse self.toolkit so toolkit and downstream services are reused across messages.
32-32: `active_threads` dict grows without bound; no eviction or cleanup.
Every new user gets an entry that's only removed if the thread is found to be archived. There's no TTL, max-size, or periodic cleanup. Over time this will consume memory. A sketch of a bounded TTL map follows the prompt below.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/integrations/discord/bot.py` at line 32, active_threads is an unbounded dict and will leak memory; implement bounded/TTL eviction and cleanup: replace plain Dict[str,str] active_threads with a small helper (e.g., an LRU or TTL cache) or wrap accesses to enforce a max size and timestamps, remove entries when threads are archived/closed, and run a periodic cleanup task. Update any code that writes to active_threads (places that create threads or handle thread events such as thread creation/updates/archival handlers, e.g., the functions that currently add/remove entries) to use the new helper APIs (put/get/remove) so entries are removed on archive and expired entries are purged by the periodic task or on insert when max size is reached. Ensure thread-safety if accessed from async handlers.backend/app/agents/devrel/github/services/issue_suggestion_service.py (3)
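A minimal sketch of a bounded TTL map matching the put/get/remove API the prompt describes (the class name, size limit, and TTL are illustrative assumptions):
import time
from collections import OrderedDict
from typing import Optional, Tuple

class TTLMap:
    """Insertion-ordered str->str map with a max size and per-entry TTL."""

    def __init__(self, max_size: int = 1000, ttl_seconds: float = 86400.0):
        self.max_size = max_size
        self.ttl = ttl_seconds
        self._data: "OrderedDict[str, Tuple[str, float]]" = OrderedDict()

    def put(self, key: str, value: str) -> None:
        self._purge_expired()
        self._data[key] = (value, time.monotonic() + self.ttl)
        self._data.move_to_end(key)
        while len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict the oldest entry

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() > expires_at:
            del self._data[key]
            return None
        return value

    def remove(self, key: str) -> None:
        self._data.pop(key, None)

    def _purge_expired(self) -> None:
        now = time.monotonic()
        for k in [k for k, (_, exp) in self._data.items() if now > exp]:
            del self._data[k]
Swapping `self.active_threads: Dict[str, str] = {}` for an instance of such a map keeps call sites close to dict semantics while bounding memory.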
backend/app/agents/devrel/github/services/issue_suggestion_service.py (3)
7-63: Duplicate `IssueSuggestionService` class with a divergent interface.
This class is a second, incompatible definition of `IssueSuggestionService` that already exists at `backend/services/github/issue_suggestion_service.py`. The two copies have different method signatures (`fetch_global_beginner_issues(user_query, limit)` here vs. `fetch_global_beginner_issues(language, limit)` there) and different filtering logic. This split will cause confusion over which one to import and will inevitably diverge further.
Consider consolidating into a single service (e.g. merging the `user_query` parsing approach from this file with the parameterised approach in the other, or moving all GitHub service logic under one canonical path); one possible merged interface is sketched after the prompt below.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/agents/devrel/github/services/issue_suggestion_service.py` around lines 7 - 63, There are two incompatible definitions of IssueSuggestionService; consolidate them by removing the duplicate and merging logic into one canonical class (IssueSuggestionService) so imports are unambiguous: pick the canonical file to keep (prefer backend/services/github/issue_suggestion_service.py) and update its fetch_global_beginner_issues signature to support both a typed parameterized interface (e.g., language: Optional[str], org: Optional[str], limit: int) and the freeform user_query parsing used here; migrate the filtering logic from this file (language detection from user_query and org checks like "django") into the single retained fetch_global_beginner_issues implementation, adjust callers to the unified signature, and delete this duplicate class to avoid divergence.
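A minimal sketch of one possible consolidated interface, assuming the merge direction the prompt suggests (the combined signature, the `GITHUB_API_BASE` constant, and the simple language detection are illustrative, not the PR's code):
import httpx
from typing import Any, Dict, List, Optional

GITHUB_API_BASE = "https://api.github.com"

class IssueSuggestionService:
    """Single canonical service accepting both freeform and structured input."""

    def __init__(self, github_token: str):
        self.github_token = github_token

    async def fetch_global_beginner_issues(
        self,
        user_query: Optional[str] = None,
        language: Optional[str] = None,
        limit: int = 5,
    ) -> List[Dict[str, Any]]:
        # Freeform path: derive the structured filter from the raw query text.
        if user_query and not language and "python" in user_query.lower():
            language = "python"
        q = 'label:"good first issue" is:issue state:open'
        if language:
            q += f" language:{language}"
        headers = {"Authorization": f"Bearer {self.github_token}"}
        async with httpx.AsyncClient(timeout=10.0) as client:
            resp = await client.get(
                f"{GITHUB_API_BASE}/search/issues",
                params={"q": q, "per_page": limit},  # httpx encodes the query
                headers=headers,
            )
        resp.raise_for_status()
        return resp.json().get("items", [])
Both the toolkit (freeform `user_query`) and the API route (explicit `language`) could then share this one method.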
29-34: Hardcoded language/org detection is too narrow and not extensible.
Only "python" and "django" are recognised, meaning any other language or organisation query passes through unfiltered. This makes the service's value-add effectively a no-op for the vast majority of inputs and will be confusing for contributors asking about JavaScript, Rust, etc. At minimum, a configurable mapping (or a parameter-driven design) should replace the hard-coded strings, or the filtering logic should be clearly documented as a stub; a sketch of such a mapping follows the prompt below.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/agents/devrel/github/services/issue_suggestion_service.py` around lines 29 - 34, The current hardcoded detection in issue_suggestion_service.py (the search_query assembly using query_lower) only checks for "python" and "django"; replace this with a configurable mapping approach: add language and org maps (e.g., LANG_KEYWORDS and ORG_KEYWORDS) that map keyword sets to search qualifiers and use a loop to detect any matching keyword in query_lower and append the corresponding "language:..." or "org:..." to search_query; expose these maps as constructor parameters or module-level config so they can be extended without code changes and update the logic in the function that builds search_query to iterate the maps instead of checking only "python" and "django".
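A minimal sketch of the configurable mapping described above (the keyword-to-qualifier tables are illustrative defaults):
# Map detection keywords to GitHub search qualifiers; extend via config.
LANG_KEYWORDS = {
    "python": "language:python",
    "javascript": "language:javascript",
    "rust": "language:rust",
}
ORG_KEYWORDS = {
    "django": "org:django",
}

def build_search_query(user_query: str) -> str:
    query_lower = user_query.lower()
    search_query = 'label:"good first issue" is:issue state:open'
    for keyword, qualifier in {**LANG_KEYWORDS, **ORG_KEYWORDS}.items():
        if keyword in query_lower:
            search_query += f" {qualifier}"
    return search_query
For example, `build_search_query("any beginner rust issues?")` appends `language:rust` without touching the service code.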
38-39: Replace `print()` debug statements with structured logging.
Lines 38–39, 45–46, and 61 use raw `print()` calls with emoji, which are not appropriate for production code: they bypass log level controls, are invisible in log aggregators, and the status/error lines on 45–46 silently swallow failures without any re-raise or propagation.
♻️ Proposed refactor
+import logging
+
+logger = logging.getLogger(__name__)
 ...
-    print("🔍 GitHub Search Query:", search_query)
-    print("🔗 GitHub URL:", url)
+    logger.debug("GitHub search query: %s", search_query)

     async with httpx.AsyncClient() as client:
         response = await client.get(url, headers=headers)

     if response.status_code != 200:
-        print("❌ GitHub API Error:", response.status_code)
-        print("❌ Response Body:", response.text)
+        logger.error("GitHub API error %s: %s", response.status_code, response.text)
         return []
 ...
-    print(f"✅ Found {len(results)} issues")
+    logger.debug("Found %d beginner issues", len(results))
Also applies to: 45-46, 61-61
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/agents/devrel/github/services/issue_suggestion_service.py` around lines 38 - 39, Replace the raw print() debug statements for "search_query" and "url" (and the other print calls at the noted locations) with structured logging: create or use a module logger (logging.getLogger(__name__>) and use appropriate log levels (logger.debug or logger.info for query/URL diagnostics, logger.error for failures). For the prints that currently swallow errors (the status/error prints around lines 45–46 and 61), log the error with logger.error including the exception/info and either re-raise the exception or return/propagate a clear error value instead of silently continuing; update the surrounding code paths that reference search_query and url to use these logger calls (identify the usages of variables search_query and url and the exception-handling block in issue_suggestion_service.py) so logs are captured by aggregators and obey log-level configuration.backend/app/api/v1/github.py (1)
backend/app/api/v1/github.py (1)
30-34: Three Ruff-flagged issues in the exception handler.
- BLE001 – Bare `except Exception` masks unexpected errors; prefer a more specific exception type or at minimum document why the catch-all is intentional.
- B904 – `raise HTTPException(...)` inside an `except` block should use `raise ... from e` to preserve the exception chain.
- RUF010 – Use the `!s` conversion flag instead of `str(e)` in the f-string.
🔧 Proposed fix
-    except Exception as e:
-        raise HTTPException(
+    except Exception as e:
+        raise HTTPException(
             status_code=500,
-            detail=f"Failed to fetch issues: {str(e)}"
-        )
+            detail=f"Failed to fetch issues: {e!s}"
+        ) from e
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/api/v1/github.py` around lines 30 - 34, The current bare "except Exception as e" should be narrowed to the specific exceptions thrown when fetching GitHub issues (e.g., requests.exceptions.RequestException, PyGithub's GithubException, or whatever client-specific exceptions your fetch function raises) or, if a catch-all is intentional, add a comment explaining why; preserve the exception chain by re-raising the HTTPException with "raise HTTPException(...) from e" and use the f-string conversion flag for the error message (e.g., f"Failed to fetch issues: {e!s}") while keeping the HTTPException symbol and the surrounding except block unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/agents/devrel/github/github_toolkit.py`:
- Around line 152-153: The fallback call passing None to
handle_general_github_help causes an AttributeError because
handle_general_github_help calls _extract_search_query which uses llm. Fix by
supplying a valid LLM instance instead of None: update the fallback to pass the
toolkit's LLM (e.g., self.llm or the LLM field on GitHubToolkit) or ensure
GitHubToolkit initializes a default LLM during construction and then call
handle_general_github_help(query, self.llm); alternatively implement a safe
non-LLM fallback in handle_general_github_help that does not call
_extract_search_query when llm is None.
In `@backend/app/agents/devrel/github/services/issue_suggestion_service.py`:
- Around line 53-59: The loop that builds results from data.get("items", [])
uses direct key access on item["repository_url"], item["number"], item["title"],
and item["html_url"], which can raise KeyError on partial GitHub API responses;
update the loop in issue_suggestion_service.py (where results is appended) to
safely read fields via item.get(...) and validate that required fields
(repository_url, number, title, html_url) are present before appending, skip
(and optionally log) any incomplete entries, and extract the repo name from
repository_url defensively (e.g., handle None or empty string before splitting).
- Around line 36-42: The code builds the GitHub search URL by interpolating
search_query into an f-string (see the url variable and the client.get call in
issue_suggestion_service.py) which sends unencoded spaces/quotes; change the
request to call client.get(GITHUB_API_BASE + "/search/issues", params={"q":
search_query, "per_page": limit}, headers=headers) so httpx handles URL
encoding, remove the manual url f-string, and delete the debug print()
statements present around the search_query/url and response logging.
In `@backend/app/api/router.py`:
- Around line 1-13: The repo exposes the same beginner-issues endpoint twice
(api_router includes github_router at /v1/github exposing
/v1/github/beginner-issues while backend/routes.py separately registers
/github/beginner-issues), so remove the duplicate by consolidating to a single
router: choose the canonical implementation (either the endpoint in .v1.github's
router or the one in backend/routes.py), delete the other duplicate route
registration, and update imports/usages accordingly; ensure the remaining
endpoint still calls fetch_beginner_issues(owner=GITHUB_ORG, repo=repo) on
IssueSuggestionService and returns the same response shape, and remove or adjust
any tests/config that expect the removed path.
In `@backend/app/api/v1/github.py`:
- Around line 7-16: IssueSuggestionService is instantiated at module import with
GITHUB_TOKEN which can be None, so outbound calls may silently use an invalid
token; change to either validate GITHUB_TOKEN at import and raise (fail-fast) or
lazily create the service inside the request handler (e.g., move creation of
IssueSuggestionService into get_beginner_issues and construct it after checking
GITHUB_TOKEN), referencing IssueSuggestionService and get_beginner_issues to
locate the code and ensure the token check precedes any service construction or
API calls.
- Line 2: The handler is calling a non-existent method
issue_service.fetch_beginner_issues which will raise AttributeError; update the
call to use the existing IssueSuggestionService.fetch_global_beginner_issues (or
add a new method fetch_beginner_issues to IssueSuggestionService) so signatures
match: either replace issue_service.fetch_beginner_issues(owner=GITHUB_ORG,
repo=repo) with issue_service.fetch_global_beginner_issues(language, limit)
passing appropriate language/limit values, or implement
fetch_beginner_issues(owner, repo, ...) in IssueSuggestionService that delegates
to fetch_global_beginner_issues and returns the expected shape.
In `@backend/app/classification/classification_router.py`:
- Around line 62-69: The acknowledgment branch currently returns needs_devrel:
True while saying "no processing needed"; change needs_devrel to False and
update the reasoning string to match (e.g., "Acknowledgment message - no devrel
processing needed") in the returned dict that contains keys needs_devrel,
priority, reasoning, original_message, proactive_type so the boolean and message
are consistent.
In `@backend/integrations/discord/bot.py`:
- Around line 67-69: The except block currently only logs errors (except
Exception as e / logger.error(...)) but doesn't notify the user; update the
exception handler to, after logging the error, send a user-facing reply in the
same context (e.g., await message.reply(...) or await message.channel.send(...)
/ post into message.thread if thread-aware) with a short apology like "Sorry,
something went wrong while processing your message." Ensure the send is awaited
and wrapped to avoid raising on failure (catch/send fallback) so the bot won't
crash when attempting to notify the user.
In `@backend/main.py`:
- Around line 61-67: The CORS setup using api.add_middleware(CORSMiddleware)
currently sets allow_origins=["*"] together with allow_credentials=True which is
invalid; update the CORSMiddleware configuration in main.py (the
api.add_middleware call) to either (A) replace allow_origins=["*"] with an
explicit list of allowed origins (e.g., read from an ALLOWED_ORIGINS env var or
config and pass that list) while keeping allow_credentials=True, or (B) if you
truly need wildcard origins, set allow_credentials=False; make the change where
CORSMiddleware is configured so the Access-Control-Allow-Origin header will not
be "*" when credentials are allowed.
- Line 75: Import the router object from routes.py (e.g., from routes import
router as routes_router) and register it on the FastAPI app alongside api_router
by calling api.include_router(routes_router) so the /github/webhook and
/github/beginner-issues endpoints become reachable; place the import near the
other router imports and add the include_router call after the existing
api.include_router(api_router) in main.py.
- Around line 33-35: The fire-and-forget asyncio.create_task call for starting
the bot can be garbage-collected; store the Task on the instance so it has a
strong reference. Replace the bare
asyncio.create_task(self.discord_bot.start(settings.discord_bot_token)) with
assigning the returned Task to an instance attribute (e.g., self.discord_task)
and use that attribute when you need to await, cancel, or check the bot
lifecycle; ensure the attribute is created on the same object where create_task
is invoked so references to self.discord_task prevent silent cancellation (a minimal sketch follows this list).
In `@backend/requirements.txt`:
- Around line 1-19: requirements.txt is missing runtime packages used by the
code; add the following packages to backend/requirements.txt so imports succeed:
discord.py (used in backend/integrations/discord/bot.py), langchain-google-genai
and langchain-core (used in
backend/app/classification/classification_router.py), duckduckgo-search or ddgs
(used in backend/app/agents/devrel/tools/search_tool/ddg.py), and langsmith
(used in backend/app/agents/devrel/tools/search_tool/ddg.py); pin versions if
required by your environment and run pip install -r requirements.txt to verify
imports succeed.
In `@backend/routes.py`:
- Around line 2-3: The file defines imports (IssueSuggestionService,
GITHUB_TOKEN, GITHUB_ORG) and a FastAPI router but the router is never mounted
so its endpoints are unreachable; either mount this router into the app (e.g.,
include the router on the global FastAPI instance or add it to api_router in
main.py) or remove/merge these routes if they are superseded by the /v1/github
router—update main.py (or api_router registration) to include the router symbol
from this module or delete the duplicate endpoints and their imports
(IssueSuggestionService, GITHUB_TOKEN, GITHUB_ORG) to avoid stale/invalid
imports.
- Around line 110-114: The route calls a non-existent
IssueSuggestionService.fetch_beginner_issues with owner/repo kwargs causing
AttributeError; fix by either (A) changing the call in routes.py to the existing
method IssueSuggestionService.fetch_global_beginner_issues(language, limit) and
pass the correct parameters, or (B) add/rename a method on
IssueSuggestionService (e.g., fetch_beginner_issues(self, owner, repo, ...))
that implements repo-scoped behavior and accept owner/repo kwargs; update the
route to call the matching method name and signature so the symbols
IssueSuggestionService, fetch_beginner_issues and fetch_global_beginner_issues
align.
In `@backend/services/github/issue_suggestion_service.py`:
- Around line 1-46: This file defines an
IssueSuggestionService.fetch_global_beginner_issues(language, limit) that
conflicts with the other service variant
fetch_global_beginner_issues(user_query, limit); consolidate into a single
service API (pick one canonical signature — e.g., accept user_query and optional
language) and merge filtering logic so both callers (including github_toolkit
import) use the same class/method, updating import sites as needed; also replace
the print("GitHub search failed:", response.text) in IssueSuggestionService with
the project logger (use the existing logger instance or create one) and log the
response status and body for diagnostics.
- Around line 38-44: The loop building results from GitHub response items uses
direct indexing (issue["number"], issue["title"], issue["html_url"],
issue["repository_url"]) which can raise KeyError; update the block that appends
to results to use issue.get(...) with sensible defaults (e.g.,
issue.get("number"), issue.get("title", ""), issue.get("html_url", "") ) and
derive repo safely by reading repo_url = issue.get("repository_url") and then
splitting only if repo_url is truthy (fallback to an empty string or None).
Ensure you update the same symbols shown (items, issue, results,
repository_url/html_url keys) so missing fields don't crash the service.
- Around line 25-26: The code opens httpx.AsyncClient() and calls
client.get(url, headers=headers) with no timeout; add a request timeout to avoid
hanging the event loop by passing a timeout value (or an httpx.Timeout object)
either when constructing httpx.AsyncClient(timeout=...) or on the client.get
call (timeout=...) in the function in issue_suggestion_service.py that performs
the GitHub request; choose a reasonable timeout (e.g., 5–30s) and ensure the
call is wrapped so timeouts raise and can be handled/logged appropriately.
- Around line 11-15: The method is named fetch_global_beginner_issues but
callers invoke issue_service.fetch_beginner_issues(), causing an AttributeError;
fix by either renaming the service method fetch_global_beginner_issues to
fetch_beginner_issues or updating all call sites that call
issue_service.fetch_beginner_issues() to call fetch_global_beginner_issues()
instead, and ensure the function signature (language: str = "python", limit: int
= 5) and return type List[Dict] remain unchanged so no other call expectations
break.
- Around line 22-23: The constructed query string (variable query) is
interpolating language into the URL (url) without URL-encoding, which breaks the
GitHub API for languages with spaces/special characters; update the code in
issue_suggestion_service.py to either URL-encode the language (via
urllib.parse.quote) before building query or, better, stop manual string
interpolation and pass the search as an httpx params dict (e.g. params={'q':
f'label:"good first issue" language:{language} state:open', 'per_page': limit})
when calling the GitHub API base (GITHUB_API_BASE) so httpx handles encoding for
you and avoid embedding raw unencoded values into url.
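For the fire-and-forget task item above, a minimal sketch of holding a strong reference to the bot task (the class shape and attribute names are illustrative assumptions):
import asyncio
from typing import Optional

class AppLifecycle:
    def __init__(self, discord_bot, token: str):
        self.discord_bot = discord_bot
        self.token = token
        self.discord_task: Optional[asyncio.Task] = None

    async def startup(self) -> None:
        # Keep a strong reference: a bare create_task(...) result can be
        # garbage-collected, silently cancelling the running bot.
        self.discord_task = asyncio.create_task(self.discord_bot.start(self.token))

    async def shutdown(self) -> None:
        if self.discord_task is not None:
            self.discord_task.cancel()
            try:
                await self.discord_task
            except asyncio.CancelledError:
                pass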
 else:
-    result = await handle_general_github_help(query, self.llm)
+    result = await handle_general_github_help(query, None)
🧩 Analysis chain
🏁 Script executed:
# First, locate and inspect the general_github_help.py file
find . -name "general_github_help.py" -type f
Repository: AOSSIE-Org/Devr.AI
Length of output: 191
🏁 Script executed:
# Also look at the github_toolkit.py file around lines 152-153 to see the full context
cat -n backend/app/agents/devrel/github/github_toolkit.py | sed -n '140,160p'
Repository: AOSSIE-Org/Devr.AI
Length of output: 919
🏁 Script executed:
# Search for handle_general_github_help function definition
rg "def handle_general_github_help|async def handle_general_github_help" -A 20Repository: AOSSIE-Org/Devr.AI
Length of output: 2239
🏁 Script executed:
rg "_extract_search_query" -A 10 backend/app/agents/devrel/github/tools/general_github_help.pyRepository: AOSSIE-Org/Devr.AI
Length of output: 1027
🏁 Script executed:
# Also check where llm is injected/initialized in github_toolkit.py
rg "self.llm|self\.llm" backend/app/agents/devrel/github/github_toolkit.pyRepository: AOSSIE-Org/Devr.AI
Length of output: 44
🏁 Script executed:
rg "_extract_search_query" -B 5 -A 15 backend/app/agents/devrel/nodes/handlers/web_search.pyRepository: AOSSIE-Org/Devr.AI
Length of output: 1628
🏁 Script executed:
# Also check all calls to handle_general_github_help to see if None is always passed
rg "handle_general_github_help" backend/app/agents/devrel/github/github_toolkit.pyRepository: AOSSIE-Org/Devr.AI
Length of output: 198
handle_general_github_help(query, None) will crash — llm cannot be None.
The fallback path at line 153 passes None as the llm argument. The function immediately calls _extract_search_query(query, llm), which invokes await llm.ainvoke(...). This will raise AttributeError: 'NoneType' object has no attribute 'ainvoke'. Every unclassified query will hit this path and crash.
Either initialize an LLM instance in GitHubToolkit to pass here, or provide a non-LLM fallback for the general_github_help path; a sketch of the latter follows.
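A minimal sketch of the second option, guarding inside the handler (assumes the module's existing `_extract_search_query` helper; the fallback payload shape is illustrative):
async def handle_general_github_help(query: str, llm=None):
    if llm is None:
        # Non-LLM fallback: never reach _extract_search_query / llm.ainvoke.
        return {
            "status": "fallback",
            "message": (
                "I can't analyze that request in depth right now. "
                "Try asking for good first issues, or rephrase your question."
            ),
        }
    search_query = await _extract_search_query(query, llm)  # existing LLM path
    ...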
url = f"{GITHUB_API_BASE}/search/issues?q={search_query}&per_page={limit}"

print("🔍 GitHub Search Query:", search_query)
print("🔗 GitHub URL:", url)

async with httpx.AsyncClient() as client:
    response = await client.get(url, headers=headers)
🧩 Analysis chain
🏁 Script executed:
cat -n backend/app/agents/devrel/github/services/issue_suggestion_service.py | head -70
Repository: AOSSIE-Org/Devr.AI
Length of output: 2196
🏁 Script executed:
# Check if there are any other similar files that might show patterns
find . -name "*issue_suggestion_service*" -type fRepository: AOSSIE-Org/Devr.AI
Length of output: 187
🌐 Web query:
httpx AsyncClient URL encoding f-string vs params kwarg query string
💡 Result:
In httpx (including AsyncClient), prefer `params=` (or `httpx.URL(..., params=...)`) over building a query string with an f-string:
- `params=` is the supported API for query strings and accepts `dict`, `str`, or `[(k, v), ...]`. httpx will serialize/encode the parameters for you. [1][2]
- `httpx.QueryParams` / `params=` encoding is form-style (e.g., spaces become `+`, `@` becomes `%40`, `!` becomes `%21`). [1]
- Some characters may be treated as "safe" and not percent-encoded (notably `/`), which can surprise people coming from `requests`; this is discussed as an intentional behavior change in httpx. [3]
Recommended
import httpx

async with httpx.AsyncClient() as client:
    r = await client.get(
        "https://api.example.com/search",
        params={"q": "How HTTP works!", "email": "user@example.com"},
    )
If you use an f-string/manual URL
You must encode yourself (and avoid double-encoding):
from urllib.parse import urlencode
q = urlencode({"q": "How HTTP works!", "email": "user@example.com"})
url = f"https://api.example.com/search?{q}"If an API requires / to be encoded inside a parameter value
You may need to pre-quote that specific value before passing it (because httpx may leave / unescaped), e.g. for q="http://example.com"; this exact issue is reported/discussed. [3]
Sources: httpx URL/QueryParams docs [1], AsyncClient params API [2], httpx issue on query param encoding and / handling [3].
Search query is not URL-encoded — spaces and quotes will be sent raw to GitHub API
The search_query string (e.g. label:"good first issue" is:issue state:open language:python) contains spaces and double-quotes. Embedding it directly in an f-string URL bypasses httpx's query parameter encoding, sending unencoded characters to the GitHub API. This will result in malformed requests and unpredictable behavior.
Use the params= keyword argument instead, which httpx will properly URL-encode:
🔧 Proposed fix
- url = f"{GITHUB_API_BASE}/search/issues?q={search_query}&per_page={limit}"
-
- print("🔍 GitHub Search Query:", search_query)
- print("🔗 GitHub URL:", url)
-
async with httpx.AsyncClient() as client:
- response = await client.get(url, headers=headers)
+ response = await client.get(
+ f"{GITHUB_API_BASE}/search/issues",
+ headers=headers,
+ params={"q": search_query, "per_page": limit},
+ )Additionally, remove debug print() statements (lines 38–39, 45–46, 61) from production code.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| url = f"{GITHUB_API_BASE}/search/issues?q={search_query}&per_page={limit}" | |
| print("🔍 GitHub Search Query:", search_query) | |
| print("🔗 GitHub URL:", url) | |
| async with httpx.AsyncClient() as client: | |
| response = await client.get(url, headers=headers) | |
| async with httpx.AsyncClient() as client: | |
| response = await client.get( | |
| f"{GITHUB_API_BASE}/search/issues", | |
| headers=headers, | |
| params={"q": search_query, "per_page": limit}, | |
| ) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/agents/devrel/github/services/issue_suggestion_service.py` around
lines 36 - 42, The code builds the GitHub search URL by interpolating
search_query into an f-string (see the url variable and the client.get call in
issue_suggestion_service.py) which sends unencoded spaces/quotes; change the
request to call client.get(GITHUB_API_BASE + "/search/issues", params={"q":
search_query, "per_page": limit}, headers=headers) so httpx handles URL
encoding, remove the manual url f-string, and delete the debug print()
statements present around the search_query/url and response logging.
```python
for item in data.get("items", []):
    results.append({
        "repo": item["repository_url"].split("/")[-1],
        "number": item["number"],
        "title": item["title"],
        "url": item["html_url"]
    })
```
Direct key access will raise KeyError on partial or unexpected GitHub API responses
All four field accesses on the item dict (`repository_url`, `number`, `title`, `html_url`) will raise a KeyError if a response item omits any field, aborting the entire result set instead of skipping the malformed entry.
🔧 Proposed fix — use `.get()` and skip incomplete entries

```diff
 for item in data.get("items", []):
-    results.append({
-        "repo": item["repository_url"].split("/")[-1],
-        "number": item["number"],
-        "title": item["title"],
-        "url": item["html_url"]
-    })
+    repo_url = item.get("repository_url", "")
+    number = item.get("number")
+    title = item.get("title")
+    url = item.get("html_url")
+    if not all([repo_url, number, title, url]):
+        continue
+    results.append({
+        "repo": repo_url.split("/")[-1],
+        "number": number,
+        "title": title,
+        "url": url,
+    })
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
 for item in data.get("items", []):
-    results.append({
-        "repo": item["repository_url"].split("/")[-1],
-        "number": item["number"],
-        "title": item["title"],
-        "url": item["html_url"]
-    })
+    repo_url = item.get("repository_url", "")
+    number = item.get("number")
+    title = item.get("title")
+    url = item.get("html_url")
+    if not all([repo_url, number, title, url]):
+        continue
+    results.append({
+        "repo": repo_url.split("/")[-1],
+        "number": number,
+        "title": title,
+        "url": url,
+    })
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/agents/devrel/github/services/issue_suggestion_service.py` around
lines 53 - 59, The loop that builds results from data.get("items", []) uses
direct key access on item["repository_url"], item["number"], item["title"], and
item["html_url"], which can raise KeyError on partial GitHub API responses;
update the loop in issue_suggestion_service.py (where results is appended) to
safely read fields via item.get(...) and validate that required fields
(repository_url, number, title, html_url) are present before appending, skip
(and optionally log) any incomplete entries, and extract the repo name from
repository_url defensively (e.g., handle None or empty string before splitting).
```python
from .v1.github import router as github_router
from fastapi import APIRouter
from .v1.auth import router as auth_router
from .v1.health import router as health_router
from .v1.integrations import router as integrations_router

api_router = APIRouter()

api_router.include_router(
    github_router,
    prefix="/v1/github",
    tags=["GitHub"]
)
```
🧩 Analysis chain

🏁 Script executed:

```shell
#!/bin/bash
# Check if files exist
echo "=== Checking file existence ==="
ls -la backend/app/api/v1/github.py 2>&1 || echo "File not found"
ls -la backend/routes.py 2>&1 || echo "File not found"
echo ""
echo "=== Routes in backend/app/api/v1/github.py ==="
if [ -f backend/app/api/v1/github.py ]; then
  rg -n 'beginner|@router|def ' backend/app/api/v1/github.py | head -50
fi
echo ""
echo "=== Routes in backend/routes.py ==="
if [ -f backend/routes.py ]; then
  rg -n 'beginner|@router|def ' backend/routes.py | head -50
fi
```

Repository: AOSSIE-Org/Devr.AI
Length of output: 904

🏁 Script executed:

```shell
#!/bin/bash
echo "=== Full content of backend/app/api/v1/github.py ==="
cat -n backend/app/api/v1/github.py
echo ""
echo "=== Full content of backend/routes.py ==="
cat -n backend/routes.py
```

Repository: AOSSIE-Org/Devr.AI
Length of output: 6730
Duplicate beginner-issues endpoints across two routers.
This wires github_router at /v1/github (exposing /v1/github/beginner-issues), while backend/routes.py also defines /github/beginner-issues on a separate router. Both endpoints are functionally identical—they call fetch_beginner_issues(owner=GITHUB_ORG, repo=repo) on separate IssueSuggestionService instances and return the same response structure. Consolidate to a single endpoint to avoid confusion and redundancy.
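As a sketch of the consolidated wiring — assuming the versioned /v1/github path is kept as the canonical one, which is a choice this PR has not made explicit:

```python
# backend/app/api/router.py — single registration point (sketch)
from fastapi import APIRouter

from .v1.github import router as github_router

api_router = APIRouter()

# Register the beginner-issues endpoint exactly once, under the versioned
# prefix; the duplicate registration in backend/routes.py would be removed.
api_router.include_router(
    github_router,
    prefix="/v1/github",
    tags=["GitHub"],
)
```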
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/api/router.py` around lines 1 - 13, The repo exposes the same
beginner-issues endpoint twice (api_router includes github_router at /v1/github
exposing /v1/github/beginner-issues while backend/routes.py separately registers
/github/beginner-issues), so remove the duplicate by consolidating to a single
router: choose the canonical implementation (either the endpoint in .v1.github's
router or the one in backend/routes.py), delete the other duplicate route
registration, and update imports/usages accordingly; ensure the remaining
endpoint still calls fetch_beginner_issues(owner=GITHUB_ORG, repo=repo) on
IssueSuggestionService and returns the same response shape, and remove or adjust
any tests/config that expect the removed path.
```diff
@@ -0,0 +1,34 @@
+from fastapi import APIRouter, HTTPException
+from services.github.issue_suggestion_service import IssueSuggestionService
```
🧩 Analysis chain

🏁 Script executed:

```shell
#!/bin/bash
# Search for fetch_beginner_issues definition
echo "=== Searching for fetch_beginner_issues definition ==="
rg -n "def fetch_beginner_issues" --type py
echo ""
echo "=== Methods on service classes (issue_suggestion_service) ==="
rg -n "def fetch" --type py -g "*issue_suggestion_service*" -A 2
echo ""
echo "=== Locating issue_suggestion_service.py ==="
fd "issue_suggestion_service.py"
echo ""
echo "=== Checking github.py lines 15-25 for actual method call ==="
fd "github.py" -g "*/api/v1/*" -x cat -n {} | sed -n '15,25p'
```

Repository: AOSSIE-Org/Devr.AI
Length of output: 1087

🏁 Script executed:

```shell
#!/bin/bash
# Find the github.py file in api/v1
find . -path "*/api/v1/github.py" -type f
echo ""
echo "=== Reading backend/app/api/v1/github.py ==="
cat -n backend/app/api/v1/github.py | head -30
```

Repository: AOSSIE-Org/Devr.AI
Length of output: 1084

🏁 Script executed:

```shell
#!/bin/bash
cat -n backend/services/github/issue_suggestion_service.py
```

Repository: AOSSIE-Org/Devr.AI
Length of output: 1619
fetch_beginner_issues does not exist on the imported service — AttributeError at runtime
Line 19 calls issue_service.fetch_beginner_issues(owner=GITHUB_ORG, repo=repo), but the imported IssueSuggestionService only defines fetch_global_beginner_issues(language: str, limit: int). This method does not exist and will raise AttributeError on every request to the /beginner-issues endpoint.
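If renaming every call site is undesirable, one option is a thin compatibility shim on the service. This is a sketch only; accepting and ignoring owner/repo in a global search is an assumption about the intended behavior, not something the PR specifies:

```python
from typing import Dict, List


class IssueSuggestionService:
    async def fetch_global_beginner_issues(
        self, language: str = "python", limit: int = 5
    ) -> List[Dict]:
        ...  # existing implementation elided

    async def fetch_beginner_issues(
        self, owner: str = "", repo: str = "", limit: int = 5
    ) -> List[Dict]:
        # Accept owner/repo so existing call sites keep working unchanged,
        # then delegate to the global search, which does not use them.
        return await self.fetch_global_beginner_issues(limit=limit)
```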
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/api/v1/github.py` at line 2, The handler is calling a
non-existent method issue_service.fetch_beginner_issues which will raise
AttributeError; update the call to use the existing
IssueSuggestionService.fetch_global_beginner_issues (or add a new method
fetch_beginner_issues to IssueSuggestionService) so signatures match: either
replace issue_service.fetch_beginner_issues(owner=GITHUB_ORG, repo=repo) with
issue_service.fetch_global_beginner_issues(language, limit) passing appropriate
language/limit values, or implement fetch_beginner_issues(owner, repo, ...) in
IssueSuggestionService that delegates to fetch_global_beginner_issues and
returns the expected shape.
```python
import httpx
from typing import List, Dict

GITHUB_API_BASE = "https://api.github.com"


class IssueSuggestionService:
    def __init__(self, token: str):
        self.token = token

    async def fetch_global_beginner_issues(
        self,
        language: str = "python",
        limit: int = 5
    ) -> List[Dict]:

        headers = {
            "Authorization": f"Bearer {self.token}",
            "Accept": "application/vnd.github+json"
        }

        query = f'label:"good first issue" language:{language} state:open'
        url = f"{GITHUB_API_BASE}/search/issues?q={query}&per_page={limit}"

        async with httpx.AsyncClient() as client:
            response = await client.get(url, headers=headers)

        if response.status_code != 200:
            print("GitHub search failed:", response.text)
            return []

        data = response.json()

        items = data.get("items", [])

        results = []

        for issue in items:
            results.append({
                "number": issue["number"],
                "title": issue["title"],
                "url": issue["html_url"],
                "repo": issue["repository_url"].split("/")[-1]
            })

        return results
```
🛠️ Refactor suggestion | 🟠 Major
Duplicate IssueSuggestionService with divergent APIs — consolidate into one.
There are now two IssueSuggestionService classes:
- `backend/services/github/issue_suggestion_service.py` (this file) — `fetch_global_beginner_issues(language, limit)`
- `backend/app/agents/devrel/github/services/issue_suggestion_service.py` — `fetch_global_beginner_issues(user_query, limit)`
They have different signatures and different filtering logic. github_toolkit.py imports from the second path. Consider consolidating into a single service to avoid confusion and drift.
Also, print() on Line 29 should be replaced with logger for consistency with the rest of the codebase.
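A minimal sketch of that logging swap, assuming the codebase's usual `logging.getLogger(__name__)` pattern; the helper name here is hypothetical:

```python
import logging

import httpx

logger = logging.getLogger(__name__)


def log_search_failure(response: httpx.Response) -> None:
    # Replaces: print("GitHub search failed:", response.text)
    logger.error(
        "GitHub search failed: status=%s body=%s",
        response.status_code,
        response.text,
    )
```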
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/services/github/issue_suggestion_service.py` around lines 1 - 46,
This file defines an
IssueSuggestionService.fetch_global_beginner_issues(language, limit) that
conflicts with the other service variant
fetch_global_beginner_issues(user_query, limit); consolidate into a single
service API (pick one canonical signature — e.g., accept user_query and optional
language) and merge filtering logic so both callers (including github_toolkit
import) use the same class/method, updating import sites as needed; also replace
the print("GitHub search failed:", response.text) in IssueSuggestionService with
the project logger (use the existing logger instance or create one) and log the
response status and body for diagnostics.
```python
async def fetch_global_beginner_issues(
    self,
    language: str = "python",
    limit: int = 5
) -> List[Dict]:
```
🧩 Analysis chain

🏁 Script executed:

```shell
#!/bin/bash
# Verify the method name mismatch between the service and its call site
echo "=== Methods defined in this service ==="
rg -n 'def fetch' backend/services/github/issue_suggestion_service.py
echo ""
echo "=== Call sites referencing issue_service.fetch ==="
rg -n 'issue_service\.fetch' backend/routes.py
echo ""
echo "=== Call sites in v1/github.py ==="
rg -n 'issue_service\.fetch' backend/app/api/v1/github.py
```

Repository: AOSSIE-Org/Devr.AI
Length of output: 363
Method name mismatch causes AttributeError at runtime.
The service defines fetch_global_beginner_issues() but both backend/routes.py (line 111) and backend/app/api/v1/github.py (line 19) call issue_service.fetch_beginner_issues(), which does not exist. This will raise an AttributeError when either call site executes. Rename the method to fetch_beginner_issues() or update all call sites to use fetch_global_beginner_issues().
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/services/github/issue_suggestion_service.py` around lines 11 - 15,
The method is named fetch_global_beginner_issues but callers invoke
issue_service.fetch_beginner_issues(), causing an AttributeError; fix by either
renaming the service method fetch_global_beginner_issues to
fetch_beginner_issues or updating all call sites that call
issue_service.fetch_beginner_issues() to call fetch_global_beginner_issues()
instead, and ensure the function signature (language: str = "python", limit: int
= 5) and return type List[Dict] remain unchanged so no other call expectations
break.
```python
query = f'label:"good first issue" language:{language} state:open'
url = f"{GITHUB_API_BASE}/search/issues?q={query}&per_page={limit}"
```
Query string is not URL-encoded — special characters will break the GitHub API call.
The language parameter is interpolated directly into the URL without encoding. If it contains spaces or special characters, the request will either fail or return unexpected results. Use urllib.parse.quote or pass the query as a params dict to httpx.
Proposed fix using httpx params

```diff
-    query = f'label:"good first issue" language:{language} state:open'
-    url = f"{GITHUB_API_BASE}/search/issues?q={query}&per_page={limit}"
-
-    async with httpx.AsyncClient() as client:
-        response = await client.get(url, headers=headers)
+    query = f'label:"good first issue" language:{language} state:open'
+    url = f"{GITHUB_API_BASE}/search/issues"
+
+    async with httpx.AsyncClient(timeout=15.0) as client:
+        response = await client.get(url, headers=headers, params={"q": query, "per_page": limit})
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-query = f'label:"good first issue" language:{language} state:open'
-url = f"{GITHUB_API_BASE}/search/issues?q={query}&per_page={limit}"
+query = f'label:"good first issue" language:{language} state:open'
+url = f"{GITHUB_API_BASE}/search/issues"
+
+async with httpx.AsyncClient(timeout=15.0) as client:
+    response = await client.get(url, headers=headers, params={"q": query, "per_page": limit})
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/services/github/issue_suggestion_service.py` around lines 22 - 23,
The constructed query string (variable query) is interpolating language into the
URL (url) without URL-encoding, which breaks the GitHub API for languages with
spaces/special characters; update the code in issue_suggestion_service.py to
either URL-encode the language (via urllib.parse.quote) before building query
or, better, stop manual string interpolation and pass the search as an httpx
params dict (e.g. params={'q': f'label:"good first issue" language:{language}
state:open', 'per_page': limit}) when calling the GitHub API base
(GITHUB_API_BASE) so httpx handles encoding for you and avoid embedding raw
unencoded values into url.
```python
async with httpx.AsyncClient() as client:
    response = await client.get(url, headers=headers)
```
No timeout on httpx.AsyncClient — calls to GitHub API can hang indefinitely.
If the GitHub API is slow or unresponsive, this will block the async event loop without bound. Add a timeout.
Proposed fix

```diff
-    async with httpx.AsyncClient() as client:
+    async with httpx.AsyncClient(timeout=15.0) as client:
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-async with httpx.AsyncClient() as client:
-    response = await client.get(url, headers=headers)
+async with httpx.AsyncClient(timeout=15.0) as client:
+    response = await client.get(url, headers=headers)
```
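If a single float is too coarse, httpx also accepts an `httpx.Timeout` object with per-phase limits; the values below are illustrative, not a recommendation from this review:

```python
import httpx

# Fail fast on connect; allow a longer window for reading the search response.
timeout = httpx.Timeout(connect=5.0, read=15.0, write=5.0, pool=5.0)

async def search_issues(url: str, headers: dict) -> httpx.Response:
    async with httpx.AsyncClient(timeout=timeout) as client:
        return await client.get(url, headers=headers)
```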
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/services/github/issue_suggestion_service.py` around lines 25 - 26,
The code opens httpx.AsyncClient() and calls client.get(url, headers=headers)
with no timeout; add a request timeout to avoid hanging the event loop by
passing a timeout value (or an httpx.Timeout object) either when constructing
httpx.AsyncClient(timeout=...) or on the client.get call (timeout=...) in the
function in issue_suggestion_service.py that performs the GitHub request; choose
a reasonable timeout (e.g., 5–30s) and ensure the call is wrapped so timeouts
raise and can be handled/logged appropriately.
```python
for issue in items:
    results.append({
        "number": issue["number"],
        "title": issue["title"],
        "url": issue["html_url"],
        "repo": issue["repository_url"].split("/")[-1]
    })
```
KeyError risk when accessing issue fields without .get().
Lines 40-43 use direct dictionary indexing (issue["number"], issue["title"], etc.) on GitHub API response items. If an item is missing an expected field, this will crash. Use .get() with defaults for resilience, especially for html_url and repository_url.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/services/github/issue_suggestion_service.py` around lines 38 - 44,
The loop building results from GitHub response items uses direct indexing
(issue["number"], issue["title"], issue["html_url"], issue["repository_url"])
which can raise KeyError; update the block that appends to results to use
issue.get(...) with sensible defaults (e.g., issue.get("number"),
issue.get("title", ""), issue.get("html_url", "") ) and derive repo safely by
reading repo_url = issue.get("repository_url") and then splitting only if
repo_url is truthy (fallback to an empty string or None). Ensure you update the
same symbols shown (items, issue, results, repository_url/html_url keys) so
missing fields don't crash the service.
Update: Phase 2 enhancements added. This PR now includes:

Ready for review. 🚀
Closes #258
📝 Description
This PR introduces a proactive contributor assistance layer to Devr.AI.
Before calling the LLM for every message, the system now performs lightweight pattern-based detection to identify common contributor intents such as greetings, onboarding questions, and issue discovery queries.
This improves responsiveness, reduces unnecessary LLM API calls, and provides a smoother onboarding experience for new contributors.
The implementation is modular and can be extended later with caching, fuzzy matching, or advanced intent detection.
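A minimal sketch of the fast-path flow described above — the intent names, canned replies, and `llm.generate` call are illustrative stand-ins, not the project's actual API:

```python
from typing import Optional

# Hypothetical canned replies keyed by detected intent.
CANNED_REPLIES = {
    "greeting": "Hey! How can I help you get started?",
    "issue_discovery": "Let me look up some good first issues for you...",
}

def detect_intent(message: str) -> Optional[str]:
    msg = message.lower().strip()
    if msg in ("hi", "hello", "hey"):
        return "greeting"
    if "good first issue" in msg:
        return "issue_discovery"
    return None  # no cheap match; defer to the LLM

async def respond(message: str, llm) -> str:
    intent = detect_intent(message)
    if intent is not None:
        return CANNED_REPLIES[intent]   # fast path: no LLM call
    return await llm.generate(message)  # fallback: full LLM triage
```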
🔧 Changes Made
- `_simple_pattern_match()` in `classification_router.py`
- `discord/bot.py`

📷 Screenshots or Visual Changes (if applicable)
N/A – Backend and bot behavior enhancement (no UI changes)
🤝 Collaboration
Collaborated with: N/A
✅ Checklist
Summary by CodeRabbit
New Features
Improvements