feat: Gitea support + multi-forge URL parsing groundwork #9

Open
grandfso wants to merge 1 commit into AsyncFuncAI:main from grandfso:upstream-pr-multi-forge

@grandfso grandfso commented Feb 16, 2026

Summary

Adds working Gitea support and lays groundwork for other forges (GitLab, Bitbucket) at the URL parsing level.

Relates to #8

What works end-to-end

Gitea — full support: PR fetch, diff, commits, comments, review posting. Tested on Gitea 1.25.3 (self-hosted). Gitea mirrors GitHub's API, so the existing API layer works as-is with these fixes:

  • _get_login() helper: falls back from user.login to user.username (Gitea uses username)
  • Defensive .get() for fields that differ: avatar_url, html_url, head.sha, base.ref, etc.
  • Search API: handles both items (GitHub) and data (Gitea) response keys
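
The search-key handling in the last bullet could look roughly like the sketch below. The helper name `extract_search_hits` and the sample payloads are hypothetical; only the `items` (GitHub) and `data` (Gitea) key names come from this PR.

```python
from typing import Any


def extract_search_hits(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Return search hits from a GitHub-style or Gitea-style response.

    Hypothetical helper: GitHub puts results under "items",
    Gitea puts them under "data".
    """
    hits = payload.get("items")
    if hits is None:
        hits = payload.get("data")
    # Treat a missing or null key as "no results" instead of raising.
    return hits or []


# Made-up payloads, shaped only as far as this PR describes them.
github_like = {"total_count": 1, "items": [{"path": "a.py"}]}
gitea_like = {"ok": True, "data": [{"path": "b.py"}]}
```

Checking `items` first leaves the GitHub path unchanged and only consults `data` when `items` is absent.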

URL parsing groundwork

PR/MR URL patterns now accept all major forges (for future API adapter work):

  • GitHub: /owner/repo/pull/123
  • Gitea: /owner/repo/pulls/123
  • GitLab: /owner/repo/-/merge_requests/123 (parsing only — API adapter not yet implemented)
  • Bitbucket: /owner/repo/pull-requests/123 (parsing only — API adapter not yet implemented)

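The regex is anchored at both ends and allows an optional trailing suffix that starts with /, ?, or #, so tab URLs (such as /pull/123/files) and comment-anchor URLs still parse.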
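
As a quick illustration, the pattern from this PR's `parse_pr_url` diff can be exercised against each forge format. This is a trimmed copy for demonstration; the host names in the usage note are placeholders.

```python
import re

# Pattern as it appears in this PR's diff of parse_pr_url.
PR_URL_PATTERN = re.compile(
    r"^https?://[^/]+/([^/]+)/([^/]+)/(?:-/)?"
    r"(?:pulls?|merge_requests|pull-requests)/(\d+)(?:[/?#].*)?$"
)


def parse_pr_url(url: str) -> tuple[str, str, int]:
    """Parse a PR/MR URL into (owner, repo, number)."""
    match = PR_URL_PATTERN.search(url)
    if not match:
        raise ValueError(f"Invalid PR URL: {url}")
    return match.group(1), match.group(2), int(match.group(3))
```

For example, `parse_pr_url("https://gitea.example.com/owner/repo/pulls/12#issuecomment-3")` yields `("owner", "repo", 12)`. Because the pattern fixes exactly two path segments before the PR keyword, a GitLab URL with nested groups (group/subgroup/repo) does not match and raises `ValueError`.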

Other improvements

Model provider passthrough

  • Provider-prefixed model names (e.g. openai/gpt-4) pass through unchanged instead of forcing gemini/ prefix
  • Enables use with LiteLLM, OpenAI-compatible proxies, and other providers
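
A minimal sketch of the passthrough rule, assuming a provider prefix is detected by the presence of a `/` in the model name. The function name `resolve_model_name` and the `default_provider` parameter are hypothetical and are not taken from cli/virtual_runner.py.

```python
def resolve_model_name(model: str, default_provider: str = "gemini") -> str:
    """Return the model name to hand to the LLM client.

    Assumption: a "/" in the name means a provider is already specified.
    """
    if "/" in model:
        # Already provider-prefixed (e.g. "openai/gpt-4"): pass through unchanged.
        return model
    # Bare model name: fall back to the default provider prefix.
    return f"{default_provider}/{model}"
```

With this rule, `resolve_model_name("openai/gpt-4")` returns the name unchanged, while a bare name such as `"some-model"` becomes `"gemini/some-model"`.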

Console output hardening

  • User-generated content rendered via Rich now uses rich.markup.escape() to prevent markup injection crashes

Tests

  • 24 regression tests covering URL parsing for all forge formats

Tested with

  • Gitea 1.25.3 (self-hosted) — full end-to-end: PR fetch → review → comment posted
  • LiteLLM proxy → Azure OpenAI
  • Gitea Actions workflow

Add support for multiple Git forge URL formats in PR/issue parsing:
- GitHub: /owner/repo/pull/123
- Gitea: /owner/repo/pulls/123
- GitLab: /owner/repo/-/merge_requests/123
- Bitbucket: /owner/repo/pull-requests/123

Also includes:
- Defensive .get() chains for API response fields that differ across
  forges (user.login vs user.username, optional avatar_url, html_url)
- _get_login() helper with fallback for Gitea's 'username' field
- Rich markup escaping for all user-generated console output
- Gitea search API compatibility (data vs items response keys)
- Provider-prefixed model passthrough (e.g. openai/gpt-4) instead of
  forcing gemini/ prefix — enables use with LiteLLM and other proxies
- 24 regression tests covering all forge URL formats

Closes AsyncFuncAI#8
@gemini-code-assist

Summary of Changes

Hello @grandfso, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the compatibility of AsyncReview by introducing support for multiple Git forge platforms, including Gitea, GitLab, and Bitbucket, alongside the existing GitHub integration. This enhancement allows the tool to seamlessly process pull/merge request URLs and API responses from diverse environments, making it more versatile for users operating across different version control systems. Additionally, it improves the flexibility of model configuration and strengthens the security of console output.

Highlights

  • Multi-Forge URL Parsing: Implemented robust regex patterns to parse PR/MR URLs from GitHub, Gitea, GitLab, and Bitbucket, including optional trailing characters.
  • API Response Compatibility: Enhanced data fetching logic with a _get_login helper and defensive .get() calls to gracefully handle variations in API responses across different Git forges.
  • Flexible Model Provider Passthrough: Modified model configuration to allow provider-prefixed model names (e.g., openai/gpt-4) to be used directly, supporting LiteLLM and other OpenAI-compatible services.
  • Console Output Hardening: Integrated rich.markup.escape() to prevent markup injection vulnerabilities when rendering user-generated content in the console.
  • Comprehensive Testing: Added 24 new regression tests to validate URL parsing for all supported forge formats.

Changelog
  • cli/github_fetcher.py
    • Added _get_login helper function for user login compatibility across forges.
    • Updated parse_github_url to support multi-forge URL patterns (GitHub, Gitea, GitLab, Bitbucket) and expanded examples.
    • Modified fetch_pr and fetch_issue to use _get_login and defensive .get() for user and branch data.
  • cli/main.py
    • Imported rich.markup.escape and applied it to print_info, print_error, and Panel output to prevent markup injection.
  • cli/virtual_runner.py
    • Adjusted model configuration logic to allow provider-prefixed model names for both main and sub-models, removing the forced gemini/ prefix if a provider is already specified.
  • cr/github.py
    • Added _get_login helper function for user login compatibility.
    • Updated parse_pr_url to support multi-forge PR/MR URL patterns and improved error messages.
    • Modified load_pr to use _get_login and defensive .get() for user, commit, and comment data fields.
  • cr/render.py
    • Imported rich.markup.escape and applied it to print_error, print_info, and history outputs to prevent markup injection.
  • npx/python/cli/github_fetcher.py
    • Added _get_login helper function for user login compatibility across forges.
    • Updated parse_github_url to support multi-forge URL patterns (GitHub, Gitea, GitLab, Bitbucket) and expanded examples.
    • Modified fetch_pr and fetch_issue to use _get_login and defensive .get() for user and branch data.
  • npx/python/cli/main.py
    • Imported rich.markup.escape and applied it to print_step (reasoning, output), print_info, print_error, and Panel output to prevent markup injection.
  • npx/python/cli/repo_tools.py
    • Modified search_code to handle both "items" (GitHub) and "data" (Gitea) keys in search API responses.
  • npx/python/cli/virtual_runner.py
    • Adjusted model configuration logic to allow provider-prefixed model names for both main and sub-models, removing the forced gemini/ prefix if a provider is already specified.
  • npx/python/cr/github.py
    • Added _get_login helper function for user login compatibility.
    • Updated parse_pr_url to support multi-forge PR/MR URL patterns and improved error messages.
    • Modified load_pr to use _get_login and defensive .get() for user, commit, and comment data fields.
  • npx/python/cr/render.py
    • Imported rich.markup.escape and applied it to print_error, print_info, and history outputs to prevent markup injection.
  • tests/test_url_parsing.py
    • Added a new test file containing parameterized regression tests for parse_github_url and parse_pr_url functions, covering various forge URL formats and invalid cases.
Activity
  • The author, grandfso, has tested the changes with Gitea 1.25.3 (self-hosted) for end-to-end PR fetching, review, and comment posting.
  • The author also tested with LiteLLM proxy connected to Azure OpenAI.
  • Gitea Actions workflow was used for testing.
Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request does a great job of extending support to multiple Git forges like Gitea, GitLab, and Bitbucket. The changes to URL parsing are robust and well-tested, and the defensive API handling will improve compatibility. However, there is a critical issue with code duplication. Entire files and helper functions are copied across different directories (cli/, cr/, and their npx/python/ counterparts). This will make the codebase very difficult to maintain. I've added specific comments highlighting these duplications, and I strongly recommend refactoring to create shared utility modules for this common logic before merging.

Comment on lines +16 to 47
+def _get_login(user: dict) -> str:
+    """Get user login, falling back to username (Gitea compat)."""
+    return user.get("login") or user.get("username") or "unknown"
+
+
 # In-memory store for loaded PRs (MVP - no persistence)
 _pr_cache: dict[str, PRInfo] = {}


 def parse_pr_url(url: str) -> tuple[str, str, int]:
-    """Parse a GitHub PR URL into (owner, repo, number).
+    """Parse a PR/MR URL into (owner, repo, number).

+    Supports GitHub, Gitea, GitLab, and Bitbucket URL formats.
+
     Args:
-        url: GitHub PR URL like https://github.com/owner/repo/pull/123
+        url: PR/MR URL (e.g. /pull/1, /pulls/1, /-/merge_requests/1, /pull-requests/1)

     Returns:
         Tuple of (owner, repo, pr_number)

     Raises:
         ValueError: If URL format is invalid
     """
-    pattern = r"github\.com/([^/]+)/([^/]+)/pull/(\d+)"
+    # GitHub /pull/, Gitea /pulls/, GitLab /-/merge_requests/, Bitbucket /pull-requests/
+    pattern = r"^https?://[^/]+/([^/]+)/([^/]+)/(?:-/)?(?:pulls?|merge_requests|pull-requests)/(\d+)(?:[/?#].*)?$"
     match = re.search(pattern, url)
     if not match:
-        raise ValueError(f"Invalid GitHub PR URL: {url}")
+        raise ValueError(
+            f"Invalid PR URL: {url}\n"
+            "Expected: https://host/owner/repo/{pull,pulls,merge_requests,pull-requests}/123"
+        )
     return match.group(1), match.group(2), int(match.group(3))

critical

This file is a complete duplicate of cr/github.py. This is a critical maintenance issue. The same problem exists for other files in the npx/python/ directory, such as npx/python/cli/github_fetcher.py being a copy of cli/github_fetcher.py. The project structure should be refactored to have a single source for this code that can be shared, rather than maintaining separate, identical copies.

Comment on lines +16 to +18
+def _get_login(user: dict) -> str:
+    """Get user login, falling back to username (Gitea compat)."""
+    return user.get("login") or user.get("username") or "unknown"

high

This _get_login function is also defined in cli/github_fetcher.py. Similarly, the parse_pr_url function on line 25 duplicates logic and regular expressions from cli/github_fetcher.py. To improve maintainability and avoid inconsistencies, these shared helper functions should be extracted into a common utility module and imported where needed.

Comment on lines +160 to +168
     base_sha=pr_data.get("base", {}).get("sha", "HEAD"),
     head_sha=pr_data.get("head", {}).get("sha", "HEAD"),
     files=files,
     created_at=datetime.now(),
-    user={"login": pr_data["user"]["login"], "avatar_url": pr_data["user"]["avatar_url"]},
+    user={"login": _get_login(pr_data["user"]), "avatar_url": pr_data["user"].get("avatar_url", "")},
     state=pr_data.get("state", "open"),
     draft=pr_data.get("draft", False),
-    head_ref=pr_data["head"]["ref"],
-    base_ref=pr_data["base"]["ref"],
+    head_ref=pr_data.get("head", {}).get("ref", "unknown"),
+    base_ref=pr_data.get("base", {}).get("ref", "main"),

medium

Hardcoding default values like "HEAD" for SHAs and "main" for the base branch can be brittle. For example, many repositories still use master as their default branch. If the API response is missing these fields for an unexpected reason, it could lead to incorrect behavior. It would be more robust to either fail explicitly if this critical information is missing, or use None as a fallback and ensure downstream code handles it gracefully.
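
The two alternatives suggested here (a None fallback, or failing explicitly) might be sketched as follows. The helper names `nested_get` and `require_field` are hypothetical and do not exist in the PR.

```python
from typing import Any, Optional


def nested_get(data: dict[str, Any], *keys: str) -> Optional[Any]:
    """Walk nested dicts, returning None as soon as a key is missing."""
    current: Any = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def require_field(data: dict[str, Any], *keys: str) -> Any:
    """Like nested_get, but raise if the field is absent or null."""
    value = nested_get(data, *keys)
    if value is None:
        raise ValueError(f"PR payload is missing required field: {'.'.join(keys)}")
    return value
```

Under this sketch, critical fields such as the head SHA would go through `require_field(pr_data, "head", "sha")`, while optional ones use `nested_get` and leave None handling to downstream code.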

Comment on lines +327 to +328
total = data.get("total_count", data.get("total", "?"))
print(f"[DEBUG-SEARCH] No results (total_count={total}). Search may be disabled on this instance.")

medium

This print statement is useful for debugging, but it's better to use the logging module. This allows consumers of the library to control log verbosity and redirect output, which is more flexible than hardcoded print statements. This applies to the other [DEBUG-SEARCH] prints in this file as well.
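
A minimal sketch of the same message routed through the standard `logging` module instead of `print`. The logger name and the function `log_empty_search` are hypothetical; the message text mirrors the diff above.

```python
import logging

# Hypothetical logger name; a module would typically use logging.getLogger(__name__).
logger = logging.getLogger("repo_tools.search")


def log_empty_search(data: dict):
    """Log an empty search result at DEBUG level and return the reported total."""
    total = data.get("total_count", data.get("total", "?"))
    # Lazy %-style arguments avoid building the string when DEBUG is disabled.
    logger.debug(
        "No results (total_count=%s). Search may be disabled on this instance.",
        total,
    )
    return total
```

Callers can then opt in to the output with `logging.basicConfig(level=logging.DEBUG)` or silence it entirely, without changes to the library code.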

@grandfso grandfso changed the title from "feat: multi-forge support (GitHub, Gitea, GitLab, Bitbucket)" to "feat: Gitea support + multi-forge URL parsing groundwork" on Feb 16, 2026