
Conversation

@ChrisQlasty
Contributor

Problem:
The input transcript text from Live Gemini models (tested on gemini-live-2.5-flash-preview-native-audio-09-2025) is not properly stitched. Words are often glued together because new chunks are not guaranteed to include whitespace at their start or end.

Solution:
Added logic that properly handles word stitching for the final transcript, which arrives with the finished=True flag.
It is not the most elegant approach, but it solves the most common issues with the final input transcript.
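
In essence, the stitching rule works like this (a condensed sketch; PUNCTUATION_CHARS here is an illustrative stand-in for the set defined in the actual change):

PUNCTUATION_CHARS = {'.', ',', '!', '?', ';', ':'}

def stitch(existing: str, chunk: str) -> str:
    if not chunk:
        return existing
    # Join with a space, unless the new chunk opens with punctuation that
    # should attach to the previous token, or the existing text ends with
    # an apostrophe (e.g. "don'" + "t" -> "don't").
    needs_space = existing and not (
        chunk[0] in PUNCTUATION_CHARS or existing.endswith("'")
    )
    return f"{existing}{' ' if needs_space else ''}{chunk.strip()}".strip()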

Testing Plan

Added a unit test with several cases that were observed to fail with ADK 1.20.

Unit Tests:

  • I have added or updated unit tests for my change.
  • All unit tests pass locally.

Manual End-to-End (E2E) Tests:
Configure a live Gemini model to return input_transcription. Talk with the model and print event.input_transcription.text when event.input_transcription.finished=True. Talk at length and, without this fix, observe badly stitched fragments.
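
A rough sketch of that manual check (the runner setup and run_live arguments here are assumed placeholders, not the exact ADK API):

# Assumes an ADK Runner, session, and LiveRequestQueue created elsewhere,
# with a run config that enables input audio transcription; treat the names
# and arguments below as placeholders.
async for event in runner.run_live(
    user_id='user', session_id=session.id, live_request_queue=queue
):
    t = event.input_transcription
    if t and t.finished:
        print(t.text)  # without the fix, fragments come out badly stitched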

Checklist

  • I have read the CONTRIBUTING.md document.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have added tests that prove my fix is effective or that my feature works.
  • New and existing unit tests pass locally with my changes.
  • I have manually tested my changes end-to-end.
  • Any dependent changes have been merged and published in downstream modules.

@gemini-code-assist
Contributor

Summary of Changes

Hello @ChrisQlasty, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue where transcription text from Live Gemini models was not being correctly assembled, leading to improperly stitched words. The changes introduce a refined algorithm for concatenating transcription fragments, ensuring that spaces are inserted appropriately while respecting punctuation and apostrophes. This significantly improves the readability and accuracy of the final transcribed text.

Highlights

  • Improved Transcription Stitching: Implemented logic to correctly join transcription fragments, preventing words from being glued together due to missing or incorrect spacing, especially around punctuation and apostrophes, for both input and output transcripts.
  • Punctuation Handling: Introduced a PUNCTUATION_CHARS set to intelligently decide whether to insert a space when concatenating transcription chunks based on leading punctuation or trailing apostrophes.
  • Comprehensive Unit Tests: Added a new parameterized unit test to validate the transcription stitching logic for both input and output, covering various scenarios with punctuation and apostrophes to ensure robustness (see the illustrative cases below).
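
For instance, cases along these lines follow from the stitching rule; these are illustrative, not necessarily the exact parametrized cases in the PR's test:

# Fragments on the left, expected stitched transcript on the right.
cases = [
    (['Hello', ', world', '!'], 'Hello, world!'),  # punctuation attaches left
    (["don'", 't', 'stop'], "don't stop"),         # trailing apostrophe joins
    (['one', 'two'], 'one two'),                   # plain words get a space
]
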
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature              | Command             | Description
Code Review          | /gemini review      | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary     | Provides a summary of the current pull request in its current state.
Comment              | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help                 | /gemini help        | Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder at the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in your IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check its output and use code with caution.

adk-bot added the live [Component: live, voice and video chat] label on Dec 8, 2025
@adk-bot
Collaborator

adk-bot commented Dec 8, 2025

Response from ADK Triaging Agent

Hello @ChrisQlasty, thank you for creating this PR!

This PR is a bug fix. Could you please associate a GitHub issue with this PR? If there is no existing issue, could you please create one?

This information will help reviewers to review your PR more efficiently. Thanks!

Contributor

gemini-code-assist bot left a comment


Code Review

This pull request addresses an issue with stitching transcription fragments from Live Gemini models by introducing logic to correctly handle whitespace and punctuation. The solution is effective and is supported by a comprehensive set of new unit tests that cover various edge cases. My review focuses on improving code maintainability by reducing duplication. I've suggested refactoring the duplicated transcription stitching logic into a helper method in gemini_llm_connection.py and simplifying the test setup in test_gemini_llm_connection.py with a helper function. These changes will make the code cleaner and easier to manage in the future.

Comment on lines +189 to +203
existing = self._input_transcription_text
# Insert a space when joining fragments except when the new
# chunk starts with a punctuation character that should attach
# to the previous token, or the existing text ends with an
# apostrophe.
conditional_space = (
    ' '
    if existing
    and not (
        new_input_transcription_chunk[0] in PUNCTUATION_CHARS
        or existing.endswith("'")
    )
    else ''
)
self._input_transcription_text = f'{existing}{conditional_space}{new_input_transcription_chunk.strip()}'.strip()
Contributor


Severity: medium

The logic for stitching transcription chunks is duplicated for both input_transcription (here) and output_transcription (lines 226-240). To improve maintainability and adhere to the Don't Repeat Yourself (DRY) principle, this logic should be extracted into a private helper method.

For example, you could create a method like this:

def _stitch_transcription_chunk(self, existing_text: str, new_chunk: str) -> str:
    if not new_chunk:
        return existing_text

    # Insert a space when joining fragments except when the new
    # chunk starts with a punctuation character that should attach
    # to the previous token, or the existing text ends with an
    # apostrophe.
    conditional_space = (
        ' '
        if existing_text
        and not (
            new_chunk[0] in PUNCTUATION_CHARS
            or existing_text.endswith("'")
        )
        else ''
    )
    return f'{existing_text}{conditional_space}{new_chunk.strip()}'.strip()

Then you could call it like so:
self._input_transcription_text = self._stitch_transcription_chunk(
    self._input_transcription_text, new_input_transcription_chunk
)
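
Presumably the output path (lines 226-240) then collapses to the symmetric call; the attribute names below are inferred by analogy and may differ:

# Hypothetical mirror of the input-path call; names inferred by analogy.
self._output_transcription_text = self._stitch_transcription_chunk(
    self._output_transcription_text, new_output_transcription_chunk
)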

Comment on lines +622 to +680
message1 = mock.Mock()
message1.usage_metadata = None
message1.server_content = mock.Mock()
message1.server_content.model_turn = None
message1.server_content.interrupted = False
message1.server_content.turn_complete = False
message1.server_content.generation_complete = False
message1.tool_call = None
message1.session_resumption_update = None
message1.server_content.input_transcription = (
    types.Transcription(text=fragment1, finished=False)
    if tx_direction == 'input'
    else None
)
message1.server_content.output_transcription = (
    types.Transcription(text=fragment1, finished=False)
    if tx_direction == 'output'
    else None
)

message2 = mock.Mock()
message2.usage_metadata = None
message2.server_content = mock.Mock()
message2.server_content.model_turn = None
message2.server_content.interrupted = False
message2.server_content.turn_complete = False
message2.server_content.generation_complete = False
message2.tool_call = None
message2.session_resumption_update = None
message2.server_content.input_transcription = (
    types.Transcription(text=fragment2, finished=False)
    if tx_direction == 'input'
    else None
)
message2.server_content.output_transcription = (
    types.Transcription(text=fragment2, finished=False)
    if tx_direction == 'output'
    else None
)

message3 = mock.Mock()
message3.usage_metadata = None
message3.server_content = mock.Mock()
message3.server_content.model_turn = None
message3.server_content.interrupted = False
message3.server_content.turn_complete = False
message3.server_content.generation_complete = False
message3.tool_call = None
message3.session_resumption_update = None
message3.server_content.input_transcription = (
    types.Transcription(text=None, finished=True)
    if tx_direction == 'input'
    else None
)
message3.server_content.output_transcription = (
    types.Transcription(text=None, finished=True)
    if tx_direction == 'output'
    else None
)
Contributor


Severity: medium

The setup for message1, message2, and message3 is quite verbose and contains a lot of repeated code. To improve readability and maintainability, consider extracting the message creation into a helper function.

For example, you could define a helper within the test file:

def _create_mock_transcription_message(text: str | None, finished: bool, direction: str) -> mock.Mock:
    msg = mock.Mock()
    msg.usage_metadata = None
    msg.server_content = mock.Mock()
    msg.server_content.model_turn = None
    msg.server_content.interrupted = False
    msg.server_content.turn_complete = False
    msg.server_content.generation_complete = False
    msg.tool_call = None
    msg.session_resumption_update = None

    transcription = types.Transcription(text=text, finished=finished)
    if direction == 'input':
        msg.server_content.input_transcription = transcription
        msg.server_content.output_transcription = None
    else:
        msg.server_content.input_transcription = None
        msg.server_content.output_transcription = transcription
    return msg

Then, you could simplify the test setup significantly:

message1 = _create_mock_transcription_message(fragment1, False, tx_direction)
message2 = _create_mock_transcription_message(fragment2, False, tx_direction)
message3 = _create_mock_transcription_message(None, True, tx_direction)

ryanaiagent self-assigned this on Dec 9, 2025