Why doesn’t MistralChatGenerator(streaming_callback=…) yield streaming chunks when used inside a Haystack Pipeline.run()? #9805
Hi all, I'm trying to use the MistralChatGenerator with a streaming_callback inside a Haystack Pipeline. According to the docs, MistralChatGenerator supports streaming tokens via the streaming_callback init parameter.

When I plug the MistralChatGenerator (with streaming_callback) into a pipeline and call pipeline.run(...), nothing is yielded in chunks: the entire response is only available once the generator finishes. It behaves synchronously, so I have to wait for the full output rather than receiving intermediate tokens/chunks.

What I've tried so far: the best workaround I found was using a queue.Queue plus threading to push tokens from the callback and then yield them manually. It does work, but it feels hacky and not very "Haystack-native". I was hoping there is a built-in way to use streaming_callback in pipelines, or at least an officially recommended pattern. Is this expected behavior, even though MistralChatGenerator supports streaming when used standalone?
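For reference, a minimal sketch of the queue.Queue + threading workaround described above. The names `stream_tokens` and `fake_pipeline_run` are hypothetical illustrations, not Haystack API; in real use, `run_blocking` would wrap `pipeline.run(...)` with the generator component's streaming_callback pushing chunk text into the queue.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream

def stream_tokens(run_blocking):
    """Run a blocking call (e.g. pipeline.run) in a worker thread and
    yield the tokens its streaming callback produces, as they arrive."""
    q = queue.Queue()

    def worker():
        try:
            # run_blocking receives the callback that the generator
            # component would invoke once per streamed chunk.
            run_blocking(q.put)
        finally:
            q.put(_DONE)  # always signal completion, even on error

    threading.Thread(target=worker, daemon=True).start()
    while (item := q.get()) is not _DONE:
        yield item

# Stand-in for a pipeline run: invokes the callback once per token.
def fake_pipeline_run(callback):
    for token in ["Hello", ", ", "world"]:
        callback(token)

print("".join(stream_tokens(fake_pipeline_run)))  # prints: Hello, world
```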
Replies: 2 comments
Hi @Hansehart, I tried to reproduce the issue you describe without success. I ran the pipeline we have in this example and got streaming outputs printed to the console: https://docs.haystack.deepset.ai/docs/mistralchatgenerator#in-a-pipeline
Thanks for your reply! You are right, it's a layer 8 problem. After my Haystack operations I ran some queries on my DB, which blocked the process. The real mistake, however, was setting my breakpoint on the pipeline.run() line, which made it look as if the whole response arrived at once. When I set the breakpoint inside the streaming callback instead, I correctly see every chunk. I highly appreciate your answer!
Could the issue be in the particular `streaming_callback` that you are trying to use? What if you use `print_streaming_chunk` with `from haystack.components.generators.utils import print_streaming_chunk`?
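To illustrate the callback contract being discussed: Haystack invokes the streaming_callback once per chunk with an object exposing a `.content` string, and `print_streaming_chunk` prints each piece as it arrives. Below is a simplified stand-in (not the real Haystack classes, which live in `haystack.dataclasses` and `haystack.components.generators.utils`) that sketches the behavior:

```python
from dataclasses import dataclass

@dataclass
class StreamingChunk:
    """Simplified stand-in for haystack.dataclasses.StreamingChunk."""
    content: str

def print_streaming_chunk(chunk: StreamingChunk) -> None:
    # Print each chunk's text immediately, without buffering a full line.
    print(chunk.content, end="", flush=True)

# Simulate three chunks arriving one at a time.
for chunk in [StreamingChunk("Hello"), StreamingChunk(", "), StreamingChunk("world")]:
    print_streaming_chunk(chunk)
print()  # final newline
```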