Fix Whisper seek behavior #1400

DePasqualeOrg · 2025-12-15T09:19:27Z

When transcribing short audio clips, the model may output a single_timestamp_ending pattern (text followed by a single timestamp and EOT). Previously, this advanced seek by the full 30-second segment size, which could skip the remaining audio content.

This fix advances seek to the timestamp position instead, allowing transcription to continue from that point.
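The difference between the two behaviors can be sketched as follows. This is a minimal illustration, not the code in this PR: the function `next_seek`, its parameters, and the `fixed` flag are hypothetical, and the constants assume the usual Whisper settings of a 3000-frame (30-second) window with 2 mel frames per timestamp step. The zero-position guard is an addition of the sketch to avoid stalling, not something stated in the description above.

```python
# Hypothetical constants, assumed to match common Whisper settings.
FRAMES_PER_SEGMENT = 3000   # 30-second window at 100 mel frames per second
FRAMES_PER_TIMESTAMP = 2    # one timestamp step (0.02 s) spans 2 mel frames


def next_seek(seek: int, last_timestamp_pos: int,
              single_timestamp_ending: bool, fixed: bool = True) -> int:
    """Return the next seek offset, in mel frames (illustrative only).

    last_timestamp_pos is the final timestamp token's offset from the
    first timestamp token, i.e. the number of 0.02 s steps into the window.
    """
    if single_timestamp_ending and not fixed:
        # Old behavior: treat the rest of the window as silence and skip it.
        return seek + FRAMES_PER_SEGMENT
    if last_timestamp_pos == 0:
        # Guard added for this sketch only: never return an unchanged seek.
        return seek + FRAMES_PER_SEGMENT
    # New behavior: resume decoding from the position of the timestamp.
    return seek + last_timestamp_pos * FRAMES_PER_TIMESTAMP


# A window that ends with a single timestamp at 5.0 s (250 steps of 0.02 s).
old = next_seek(0, 250, single_timestamp_ending=True, fixed=False)
new = next_seek(0, 250, single_timestamp_ending=True, fixed=True)
print(old, new)
```

Under these assumptions the old behavior jumps to frame 3000, past everything after the 5-second mark, while the new behavior resumes at frame 500, so audio after the timestamp is decoded in a second segment.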

To see the behavior before the fix, check out commit 9820718 and run whisper/demo_seek_fix.py. In my testing, only one model variant showed this behavior.

The test script can be deleted before merging.

Before:

Text: The examination and testimony of the experts enabled the commission to conclude that five
Segments: 1
Accuracy: 71%

After:

Text: The examination and testimony of the experts enabled the commission to conclude that five shots may have been fired.
Segments: 2
Accuracy: 100%

DePasqualeOrg added 2 commits December 15, 2025 10:01

Demo seek fix for Whisper

9820718

Fix Whisper seek behavior

f116b49
