Why doesn’t MistralChatGenerator(streaming_callback=…) yield streaming chunks when used inside a Haystack Pipeline.run()? #9805
Hi all, I'm trying to use the MistralChatGenerator with a streaming_callback inside a Haystack Pipeline. According to the docs, MistralChatGenerator supports streaming tokens via the streaming_callback init parameter.

When I plug the MistralChatGenerator (with streaming_callback) into a pipeline and call pipeline.run(...), nothing is yielded in chunks: the entire response is only available once the generator finishes. It behaves synchronously, so I have to wait for the full output rather than receiving intermediate tokens/chunks.

What I've tried so far: the best workaround I found was using a queue.Queue plus threading to push tokens from the callback and then yield them manually. It does work, but it feels hacky and not very "Haystack-native". I was hoping there is a built-in way to use streaming_callback in pipelines, or at least an officially recommended pattern. Is this expected behavior, even though MistralChatGenerator supports streaming when used standalone?
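For reference, a minimal sketch of the queue.Queue + threading workaround described above. The names `stream_tokens` and `fake_pipeline_run` are hypothetical illustrations, not Haystack API; in real use, `run_blocking` would wrap `pipeline.run(...)` with the generator component's streaming_callback pushing chunk text into the queue.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream

def stream_tokens(run_blocking):
    """Run a blocking call (e.g. pipeline.run) in a worker thread and
    yield the tokens its streaming callback produces, as they arrive."""
    q = queue.Queue()

    def worker():
        try:
            # run_blocking receives the callback that the generator
            # component would invoke once per streamed chunk.
            run_blocking(q.put)
        finally:
            q.put(_DONE)  # always signal completion, even on error

    threading.Thread(target=worker, daemon=True).start()
    while (item := q.get()) is not _DONE:
        yield item

# Stand-in for a pipeline run: invokes the callback once per token.
def fake_pipeline_run(callback):
    for token in ["Hello", ", ", "world"]:
        callback(token)

print("".join(stream_tokens(fake_pipeline_run)))  # prints: Hello, world
```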
Replies: 2 comments
Hi @Hansehart, I tried to reproduce the issue you describe without success. I ran the pipeline we have in this example and got streaming outputs printed to the console: https://docs.haystack.deepset.ai/docs/mistralchatgenerator#in-a-pipeline
Thanks for your reply! You are right, it's a layer 8 problem. After my Haystack operations I ran some queries on my DB, which blocked the process. The real mistake, however, was setting my breakpoint on the pipeline.run() line, which made it look as if the whole response arrived at once. When I set the breakpoint inside the streaming callback instead, I correctly see every chunk. I highly appreciate your answer!
Could the issue be in the particular `streaming_callback` that you are trying to use? What if you use `print_streaming_chunk` with `from haystack.components.generators.utils import print_streaming_chunk`?
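To illustrate the callback contract being discussed: Haystack invokes the streaming_callback once per chunk with an object exposing a `.content` string, and `print_streaming_chunk` prints each piece as it arrives. Below is a simplified stand-in (not the real Haystack classes, which live in `haystack.dataclasses` and `haystack.components.generators.utils`) that sketches the behavior:

```python
from dataclasses import dataclass

@dataclass
class StreamingChunk:
    """Simplified stand-in for haystack.dataclasses.StreamingChunk."""
    content: str

def print_streaming_chunk(chunk: StreamingChunk) -> None:
    # Print each chunk's text immediately, without buffering a full line.
    print(chunk.content, end="", flush=True)

# Simulate three chunks arriving one at a time.
for chunk in [StreamingChunk("Hello"), StreamingChunk(", "), StreamingChunk("world")]:
    print_streaming_chunk(chunk)
print()  # final newline
```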