Skip to content

fix(amd): defer SIP listening until answer#1639

Open
rosetta-livekit-bot[bot] wants to merge 1 commit into
mainfrom
fix-amd-sip-answer-gate
Open

fix(amd): defer SIP listening until answer#1639
rosetta-livekit-bot[bot] wants to merge 1 commit into
mainfrom
fix-amd-sip-answer-gate

Conversation

@rosetta-livekit-bot
Copy link
Copy Markdown
Contributor

@rosetta-livekit-bot rosetta-livekit-bot Bot commented May 29, 2026

  1. Timeouts

Previously AMD started all its timers as soon as the SIP audio track was subscribed. Because audio tracks can be published during ringing or carrier early media (before SIP ANSWER), this poisoned the classifier with pre-answer audio and burned the no-speech budget.

This PR correctly waits for SIP active state to start the no-speech timer. Detection timeout is still armed when track is subscribed.

Also adds a reusable wait_for_participant_attribute helper in utils.participant with a dedicated ParticipantAttributeWaitAborted exception, and tracks/cleans up the deferred setup task properly.

  1. No speech timeout

Now it maps to uncertain;

  1. EoT

Previously, verdict is emitted when the prediction arrives and if the silence threshold is satisfied, but it doesn't mean the user turn is ended. A new option wait_until_finished (default False) now make sure we wait for both EOT and silence threshold for machines.

This helps when previously, new normal generation can be triggered in parallel with a generate_reply call after AMD has interrupted any preemptive generations, emitted the verdict and released the hold for playout (the new generation is too late for AMD to interrupt, due to late STT for example).

If EOT is properly being waited, in addition to the silence threshold, then the parallel generation will be interrupted when interrupt_on_machine=True.

This flag still respects the overall timeout value if there is no speech or transcript arrives.

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented May 29, 2026

⚠️ No Changeset found

Latest commit: 775662d

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 2 potential issues.

View 4 additional findings in Devin Review.

Open in Devin Review

Comment thread agents/src/voice/amd.ts
Comment on lines +584 to +586
if (!participant) {
return;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 AMD never starts listening when participant resolution fails after track publication

In gateListening(), when participant is undefined at line 584 (e.g., the participant disconnected between track publication and the lookup, or publicationSid is falsy), the .then() handler returns without calling startListening(). This means the no-speech timer never starts, VAD events and transcripts are silently dropped (handleUserStateChanged and consumeTranscript both return early when this.listening is false), and AMD only settles when the detection timer fires after 20 seconds, producing a meaningless UNCERTAIN result.

This is a regression from the previous code which always called startNoSpeechTimer() after track publication. The .catch() handler at agents/src/voice/amd.ts:610-622 correctly falls back to startListening(), but the !participant branch in .then() does not.

Suggested change
if (!participant) {
return;
}
if (!participant) {
this.startListening();
return;
}
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment thread agents/src/utils.ts
Comment on lines +1120 to +1123
const current = room.remoteParticipants.get(identity);
if (current && current.attributes[attribute] === value) {
return;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 waitForParticipantAttribute hangs if participant disconnects between existence check and listener setup

In waitForParticipantAttribute, the participant existence is checked at line 1078 before event listeners are registered at lines 1114-1117. If the participant disconnects in that window, the ParticipantDisconnected event fires before our listener is registered (missed), and at line 1120 current is undefined (participant gone from map). The function falls through to await fut.await which can only resolve via room disconnect or signal abort — there's no path that detects the participant already left.

Comparison with existing waitForParticipant pattern

The sibling function waitForParticipant (agents/src/utils.ts:1028-1048) correctly avoids this by setting up listeners first and then checking existing participants, ensuring no events are missed. waitForParticipantAttribute should similarly check !current after listener registration and reject/throw.

In the AMD caller context this is mitigated by the trackGateAbort signal (eventually aborted by cleanup), but the utility function itself is incorrect for any caller without such a safety net.

Suggested change
const current = room.remoteParticipants.get(identity);
if (current && current.attributes[attribute] === value) {
return;
}
const current = room.remoteParticipants.get(identity);
if (!current) {
throw new Error(`Participant ${identity} is no longer in the room`);
}
if (current.attributes[attribute] === value) {
return;
}
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

0 participants