feat(provider): add Qwen3-ASR-Flash STT provider & fix STT not triggering bug #6789

muchstarlight wants to merge 2 commits into AstrBotDevs:master
Conversation
Add a new speech-to-text provider using DashScope's Qwen3-ASR-Flash model with base64 encoded audio support.
The Record component stores the audio URL in the `file` attribute, not the `url` attribute. This caused STT to never trigger, since `component.url` was always empty when checking for voice messages.
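The fix described above can be sketched as follows. The `Record` class here is a minimal stand-in mirroring the attributes named in this description, and `get_audio_source` is an illustrative helper, not the actual `PreProcessStage` code:

```python
# Minimal sketch of the voice-message check described above.
# `Record` mirrors the message component from the PR description;
# the real class lives in AstrBot's message component module.
from dataclasses import dataclass


@dataclass
class Record:
    file: str = ""  # Record.fromURL() stores the audio URL here
    url: str = ""   # previously (and wrongly) checked by PreProcessStage


def get_audio_source(component: Record) -> str:
    """Return the audio path/URL, preferring `file` over the legacy `url`."""
    return component.file or component.url


# Before the fix, the check used component.url and always saw "".
record = Record(file="https://example.com/voice.amr")
assert get_audio_source(record) == "https://example.com/voice.amr"
```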
Summary of Changes (Gemini Code Assist): This pull request introduces a new Speech-to-Text (STT) provider, Qwen3-ASR-Flash, enhancing the system's capability to transcribe audio. It also addresses a critical bug that prevented the STT functionality from activating, ensuring that audio messages are now correctly processed for transcription.
Hey - I've found 3 issues, and left some high-level feedback:

- The `timeout` value from the provider config is stored but never used when calling `MultiModalConversation.call`; consider passing it through (if supported) or removing the setting to avoid confusion.
- In `_prepare_audio`, temporary files created when downloading from a URL (the initial `path` before any conversion) are never cleaned up, which can leak files in the temp directory; consider tracking and removing the downloaded temp file in `finally`.
- `_get_audio_format` is defined as `async` but only performs synchronous file I/O and is awaited in one place; consider making it a regular synchronous method to simplify usage and avoid unnecessary `await`.
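The cleanup concern in the second point can be addressed with a `try`/`finally` around the download. A minimal, self-contained sketch — here `download_file` is a stub standing in for AstrBot's real downloader, and `transcribe_url` is illustrative, not the provider's actual method:

```python
import asyncio
import os
import tempfile


async def download_file(url: str, path: str) -> None:
    """Stand-in for AstrBot's download helper: just writes placeholder bytes."""
    with open(path, "wb") as f:
        f.write(b"fake audio payload")


async def transcribe_url(audio_url: str) -> str:
    """Download a remote audio file, transcribe it, and always clean up."""
    downloaded_path = None
    try:
        if audio_url.startswith("http"):
            fd, downloaded_path = tempfile.mkstemp(prefix="qwen_asr_")
            os.close(fd)
            await download_file(audio_url, downloaded_path)
            audio_url = downloaded_path
        # ... real code would convert/encode and call the ASR API here ...
        return f"transcribed:{os.path.basename(audio_url)}"
    finally:
        # Remove the downloaded temp file even if transcription raised.
        if downloaded_path and os.path.exists(downloaded_path):
            os.remove(downloaded_path)
```

Because the removal sits in `finally`, the temp file is deleted on both the success and the exception path, which is exactly the leak the review points at.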
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `timeout` value from the provider config is stored but never used when calling `MultiModalConversation.call`; consider passing it through (if supported) or removing the setting to avoid confusion.
- In `_prepare_audio`, temporary files created when downloading from a URL (the initial `path` before any conversion) are never cleaned up, which can leak files in the temp directory; consider tracking and removing the downloaded temp file in `finally`.
- `_get_audio_format` is defined as `async` but only performs synchronous file I/O and is awaited in one place; consider making it a regular synchronous method to simplify usage and avoid unnecessary `await`.
## Individual Comments
### Comment 1
<location path="astrbot/core/provider/sources/qwen_asr_flash_source.py" line_range="59-68" />
<code_context>
+ self.timeout = provider_config.get("timeout", 30)
</code_context>
<issue_to_address>
**issue (bug_risk):** The configured timeout value is stored but not applied to the DashScope API call.
`timeout` is read from `provider_config`, but `MultiModalConversation.call` is not given any timeout-related argument, so this value is never used. If the DashScope client exposes a timeout option (e.g., `request_timeout`), pass `self.timeout` through so the config actually controls request duration and avoids indefinitely hanging calls.
</issue_to_address>
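If the DashScope SDK turns out not to expose a timeout parameter, the configured value can still be enforced at the call site using only the standard library. A sketch, where `slow_api_call` is a stand-in for the blocking `MultiModalConversation.call`:

```python
import asyncio
import time


def slow_api_call() -> str:
    """Stand-in for the blocking MultiModalConversation.call."""
    time.sleep(0.1)
    return "ok"


async def call_with_timeout(timeout: float) -> str:
    # Run the blocking SDK call in a worker thread and bound its duration,
    # so a hung request cannot stall the event loop indefinitely.
    return await asyncio.wait_for(asyncio.to_thread(slow_api_call), timeout)


result = asyncio.run(call_with_timeout(timeout=30))
assert result == "ok"
```

One caveat: on timeout, `asyncio.wait_for` stops awaiting the call but cannot kill the worker thread, so the underlying request may still run to completion in the background.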
### Comment 2
<location path="astrbot/core/provider/sources/qwen_asr_flash_source.py" line_range="110-119" />
<code_context>
+ output_path = None
+
+ # Download from URL if needed
+ if audio_url.startswith("http"):
+ if "multimedia.nt.qq.com.cn" in audio_url:
+ is_tencent = True
+
+ temp_dir = get_astrbot_temp_path()
+ path = os.path.join(
+ temp_dir,
+ f"qwen_asr_{os.urandom(4).hex()}.input",
+ )
+ await download_file(audio_url, path)
+ audio_url = path
+
</code_context>
<issue_to_address>
**suggestion:** Preserving the original file extension when downloading could improve MIME detection for base64 encoding.
Because downloaded files are saved with a `.input` suffix, `_get_mime_type` always falls back to the default `audio/mpeg` for these URLs. If you derive and reuse the original extension from the remote URL (e.g., `.wav`, `.mp3`, `.ogg`) in the temp filename, `_get_mime_type` can infer a more accurate MIME type and better match what the API expects.
Suggested implementation:
```python
import base64
import os
import pathlib
from urllib.parse import urlparse
import dashscope
from dashscope import MultiModalConversation
```
```python
is_tencent = False
output_path = None

# Download from URL if needed
if audio_url.startswith("http"):
    if "multimedia.nt.qq.com.cn" in audio_url:
        is_tencent = True

    temp_dir = get_astrbot_temp_path()
    # Preserve original file extension (if present) to improve MIME detection
    parsed_url = urlparse(audio_url)
    _, ext = os.path.splitext(parsed_url.path)
    if not ext:
        ext = ".input"
    path = os.path.join(
        temp_dir,
        f"qwen_asr_{os.urandom(4).hex()}{ext}",
    )
    await download_file(audio_url, path)
    audio_url = path
```
</issue_to_address>
### Comment 3
<location path="astrbot/core/provider/sources/qwen_asr_flash_source.py" line_range="166" />
<code_context>
+ base64_str = base64.b64encode(file_path_obj.read_bytes()).decode()
+ return f"data:{mime_type};base64,{base64_str}"
+
+ async def get_text(self, audio_url: str) -> str:
+ """Transcribe audio file to text using Qwen3-ASR-Flash API.
+
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting response parsing into a helper, simplifying audio format detection, and optionally splitting out URL downloading to make `get_text` and `_prepare_audio` easier to follow without changing behaviour.
You can reduce complexity in a few focused spots without changing behaviour.
### 1. Flatten response parsing in `get_text`
The current parsing is quite nested and defensive. You can make it easier to follow by:
- Using `getattr(...)` with sane defaults.
- Normalising `content` into a list.
- Using a single dict branch and `.get()` calls.
Example:
```python
def _extract_text_from_response(self, response) -> str:
    if response.status_code != 200:
        error_msg = getattr(response, "message", None) or f"API error: {response.status_code}"
        logger.error(f"Qwen3-ASR-Flash API error: {error_msg}")
        raise Exception(f"Qwen3-ASR-Flash API error: {error_msg}")

    output = getattr(response, "output", None)
    choices = getattr(output, "choices", []) or []
    if not choices:
        return ""

    message = getattr(choices[0], "message", None)
    content = getattr(message, "content", "")

    # normalise to list
    if isinstance(content, str):
        return content.strip()
    if not isinstance(content, list):
        return ""

    parts: list[str] = []
    for item in content:
        if isinstance(item, dict):
            if "text" in item:
                parts.append(item.get("text", ""))
            elif "audio" in item:
                parts.append(item.get("audio", ""))
    return "".join(parts).strip()
```
Then `get_text` becomes:
```python
async def get_text(self, audio_url: str) -> str:
    output_path = None
    try:
        audio_path, output_path = await self._prepare_audio(audio_url)
        data_uri = self._encode_audio_base64(audio_path)

        messages = [{"role": "user", "content": [{"audio": data_uri}]}]
        asr_options = {"enable_itn": self.enable_itn}
        if self.language != "auto":
            asr_options["language"] = self.language

        response = MultiModalConversation.call(
            api_key=self.api_key,
            model=self.model,
            messages=messages,
            result_format="message",
            asr_options=asr_options,
        )
        text = self._extract_text_from_response(response)
        logger.debug(f"Qwen3-ASR-Flash transcription: {text}")
        return text
    except Exception as e:
        logger.error(f"Qwen3-ASR-Flash transcription error: {e}")
        raise
    finally:
        if output_path and os.path.exists(output_path):
            try:
                os.remove(output_path)
            except Exception as e:
                logger.error(f"Failed to remove temp file {output_path}: {e}")
```
This keeps all behaviour but separates orchestration from parsing and removes nested `hasattr` checks.
### 2. Make `_get_audio_format` synchronous
The method does only synchronous file I/O and header checks; making it async forces callers to `await` with no real benefit.
```python
def _get_audio_format(self, file_path: str) -> str | None:
    silk_header = b"SILK"
    amr_header = b"#!AMR"
    try:
        with open(file_path, "rb") as f:
            file_header = f.read(8)
    except FileNotFoundError:
        return None
    if silk_header in file_header:
        return "silk"
    if amr_header in file_header:
        return "amr"
    return None
```
And update `_prepare_audio` accordingly:
```python
elif (
    lower_audio_url.endswith(".amr")
    or lower_audio_url.endswith(".silk")
    or is_tencent
):
    file_format = self._get_audio_format(audio_url)
    if file_format in ["silk", "amr"]:
        temp_dir = get_astrbot_temp_path()
        output_path = os.path.join(temp_dir, f"qwen_asr_{os.urandom(4).hex()}.wav")
        if file_format == "silk":
            logger.info("Converting silk file to wav...")
            await tencent_silk_to_wav(audio_url, output_path)
        elif file_format == "amr":
            logger.info("Converting amr file to wav...")
            await convert_to_pcm_wav(audio_url, output_path)
        audio_url = output_path
```
This removes unnecessary async surface area while preserving functionality.
### 3. Optional: small helpers to clarify `_prepare_audio`
Without changing the logic, you can pull out URL handling to reduce branching inside `_prepare_audio`:
```python
async def _download_if_url(self, audio_url: str) -> tuple[str, bool]:
    if not audio_url.startswith("http"):
        return audio_url, False
    is_tencent = "multimedia.nt.qq.com.cn" in audio_url
    temp_dir = get_astrbot_temp_path()
    path = os.path.join(temp_dir, f"qwen_asr_{os.urandom(4).hex()}.input")
    await download_file(audio_url, path)
    return path, is_tencent
```
Then in `_prepare_audio`:
```python
async def _prepare_audio(self, audio_url: str) -> tuple[str, str | None]:
    audio_url, is_tencent = await self._download_if_url(audio_url)
    output_path = None

    if not os.path.exists(audio_url):
        raise FileNotFoundError(f"File not found: {audio_url}")

    lower_audio_url = audio_url.lower()
    if lower_audio_url.endswith(".opus"):
        ...
    elif lower_audio_url.endswith((".amr", ".silk")) or is_tencent:
        ...
    return audio_url, output_path
```
This keeps behaviour intact but makes the flow (download → conversion decisions) easier to follow.
</issue_to_address>
Code Review
This pull request introduces the Qwen3-ASR-Flash speech-to-text provider and resolves a bug that was preventing STT from triggering. The bug fix appears correct and addresses the issue described. However, the implementation of the new provider has several areas for improvement, primarily concerning synchronous (blocking) operations within an asynchronous context. These blocking calls for file I/O and network requests can significantly degrade application performance by holding up the event loop. Additionally, there's an issue with modifying global state for the API endpoint, which could lead to incorrect behavior if multiple provider instances are used. I've provided specific comments and code suggestions to address these points by using `asyncio.to_thread` for blocking calls and ensuring thread-safe configuration.
```python
self.timeout = provider_config.get("timeout", 30)

# Set the DashScope API base URL
dashscope.base_http_api_url = self.api_base
```

Setting the global `dashscope.base_http_api_url` in the initializer can lead to incorrect behavior when multiple provider instances with different `api_base` URLs are used. The last provider to initialize would set the URL for all of them. This line should be removed, and the `api_base` should be set just-in-time within the `get_text` method before making the API call, to ensure each request uses its correct endpoint.
```python
# Call API
response = MultiModalConversation.call(
    api_key=self.api_key,
    model=self.model,
    messages=messages,
    result_format="message",
    asr_options=asr_options,
)
```

`MultiModalConversation.call` is a synchronous (blocking) network call. Invoking it directly within an `async` method will block the entire event loop, severely impacting application performance and responsiveness. You should use `asyncio.to_thread` to run this blocking operation in a separate thread. Additionally, to fix the issue of using a global `api_base` set in `__init__`, the `api_base` should be configured here, just before making the API call.

```python
# Set API base for this call and execute the blocking call in a thread
def _blocking_call():
    dashscope.base_http_api_url = self.api_base
    return MultiModalConversation.call(
        api_key=self.api_key,
        model=self.model,
        messages=messages,
        result_format="message",
        asr_options=asr_options,
    )

response = await asyncio.to_thread(_blocking_call)
```
```python
with open(file_path, "rb") as f:
    file_header = f.read(8)
```

The `open()` and `read()` calls are synchronous file operations that will block the asyncio event loop. This can degrade performance, especially with slow storage. These operations should be performed asynchronously by running them in a separate thread using `asyncio.to_thread`:

```python
def _read_header(p):
    with open(p, "rb") as f:
        return f.read(8)

file_header = await asyncio.to_thread(_read_header, file_path)
```
Summary

This PR adds a new speech-to-text provider and fixes a bug that prevented STT from being triggered.

Modifications

feat(provider): Add Qwen3-ASR-Flash STT Provider

- New provider source: `astrbot/core/provider/sources/qwen_asr_flash_source.py`

fix(stt): Fix STT not triggering bug

- `PreProcessStage` was checking `Record.url` for the audio path
- `Record.fromURL()` stores the URL in the `Record.file` attribute, not `url`
- This caused `component.url` to always be empty, making STT never trigger
- Fixed by checking `Record.file` instead
- Affected file: `astrbot/core/pipeline/preprocess_stage/stage.py`

Verification Steps

- Enable STT: `provider_stt_settings.enable = true`

Screenshots or Test Results
(Please add screenshots showing the voice message being transcribed successfully)
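For local testing, the provider's constructor reads its settings from `provider_config` (the reviewed source calls `provider_config.get(...)` for `timeout`, and references `api_key`, `model`, `language`, and `enable_itn`). A hypothetical config dict consistent with those reads — the exact AstrBot schema may differ, so treat every key here as illustrative:

```python
# Hypothetical provider_config for the new STT provider; keys mirror the
# provider_config.get(...) calls and attributes visible in the reviewed
# source. The exact schema AstrBot expects may differ.
provider_config = {
    "id": "qwen3-asr-flash",
    "type": "qwen_asr_flash",  # assumed provider type name
    "api_key": "sk-...",       # DashScope API key
    "model": "qwen3-asr-flash",
    "timeout": 30,             # seconds; currently stored but unused, per review
    "language": "auto",        # or e.g. "zh", "en"
    "enable_itn": True,        # inverse text normalization
}

# Mirrors the constructor's read, including its default.
timeout = provider_config.get("timeout", 30)
assert timeout == 30
```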
Summary by Sourcery
Add a new DashScope-based Qwen3-ASR-Flash speech-to-text provider and fix speech-to-text not triggering due to incorrect record path handling.
New Features:

- Add a Qwen3-ASR-Flash speech-to-text provider backed by DashScope, with base64-encoded audio support.

Bug Fixes:

- Read the audio path from `Record.file` instead of `Record.url` in the preprocess stage, so voice messages actually trigger STT.
Enhancements: