Skip to content

SnarfNL/HA_MistralAI

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

96 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Mistral AI Conversation

Mistral AI Conversation

Home Assistant custom integration — Mistral AI as conversation agent, Voxtral for speech-to-text, and Mistral TTS for text-to-speech.

⚠️ Please note this is not an officially supported integration and is not affiliated with Mistral AI in any way.

hacs_badge HA Version Mistral AI License: MIT


Table of Contents

  1. About
  2. Features
  3. Requirements
  4. Installation
  5. Configuration
  6. Options
  7. Controlling devices
  8. Using as a service action
  9. Speech recognition (STT)
  10. Text-to-speech (TTS)
  11. FAQ
  12. Release Notes
  13. License

About

This integration makes Mistral AI available as a fully-featured conversation agent inside Home Assistant's built-in Assist voice pipeline. It also registers Voxtral (Mistral's own speech-to-text model) as a native HA STT provider — creating two separate devices, one for conversation and one for transcription, just like the official Google Gemini integration.

(back to top)


Features

Feature Status Description
Conversation agent in HA Assist Selectable as agent in Voice Assistants
Smart home control Control lights, switches, covers, locks, etc.
Speech recognition (STT) Voxtral Mini via /v1/audio/transcriptions
Text-to-speech (TTS) Mistral TTS via /v1/audio/speech with multiple voices
Streaming TTS Low-latency SSE WAV with sentence-level pipelining (≥ v0.4.0)
Conversation memory Context kept per session until 5 min idle (HA timeout).
Jinja2 system prompt Templates with {{ now() }}, {{ ha_name }} etc.
Multilingual Responds in the user's language
Continue conversation Keeps microphone open after questions (Experimental)
Separate devices Conversation and STT appear as separate HA devices

(back to top)


Requirements

Requirement Minimum version
Home Assistant Core 2025.10
Python 3.13
Mistral AI account + API key

(back to top)


Installation

Via HACS (recommended)

  1. HACS → Integrations → ⋮ → Custom repositories
  2. URL: https://github.com/SnarfNL/HA_MistralAI — category: Integration
  3. Search "Mistral AI Conversation" → Download
  4. Fully restart Home Assistant

Manual

  1. Copy custom_components/mistral_conversation/ to /config/custom_components/
  2. Remove old __pycache__ directories if updating from a previous version:
    rm -rf /config/custom_components/mistral_conversation/__pycache__
  3. Fully restart Home Assistant (not just reload)

(back to top)


Configuration

Creating an API key

  1. Sign up at mistral.ai
  2. Go to console.mistral.ai/api-keys
  3. Click Create new key and copy it immediately

Setting up the integration

  1. Settings → Devices & Services → + Add Integration
  2. Search for Mistral AI Conversation
  3. Enter your API key → Submit

Selecting as voice assistant

  1. Settings → Voice Assistants → click your assistant
  2. Set Conversation agent to Mistral AI Conversation
  3. Optionally set Speech-to-text to Mistral AI STT (Voxtral)
  4. Save

(back to top)


Options

Click the integration → Configure to change settings.

Option Default Description
AI model ministral-8b-latest Which Mistral model to use
System prompt See below Jinja2 template with AI instructions
Temperature 0.7 Creativity: 0.0 = deterministic, 1.0 = creative
Max tokens 1024 Maximum response length
Control HA On Allow the AI to control exposed devices
Continue conversation Off Keep listening after questions (Experimental)
STT language Auto-detect Language for Voxtral transcription
TTS mode Streaming Streaming (SSE WAV with sentence-level pipelining) or Batch (single MP3 request)

Available models

Model Speed Cost Best for
ministral-8b-latest ★★★★★ $ Home automation commands — fast, accurate, cheap
ministral-3b-latest ★★★★★ $ Ultra-simple commands, lowest latency
mistral-small-latest ★★★★ $$ Balanced: quality and speed
mistral-large-latest ★★★ $$$$ Complex reasoning, long conversations
open-mistral-nemo ★★★★ $ Open-source alternative

Recommendation: Start with ministral-8b-latest. It has excellent instruction-following, handles structured JSON output reliably (needed for device control), and costs a fraction of larger models.

System prompt

The system prompt supports Jinja2 templates:

You are a helpful voice assistant for {{ ha_name }}.
Answer in the same language the user speaks.
Today is {{ now().strftime('%A, %B %d, %Y') }} and the time is {{ now().strftime('%H:%M') }}.
Be concise and friendly.
For straightforward home control commands, briefly confirm the action taken without asking follow-up questions.
Your responses are read aloud by text-to-speech, so reply in plain text.
Do not use markdown formatting that cannot be read aloud, such as asterisks for bold, underscores for italics, backticks, bullet lists, emojis, or headers.

Available template variables:

Variable Description
{{ ha_name }} Your Home Assistant location name
{{ now() }} Current datetime object
{{ now().strftime(…) }} Formatted date/time string

Continue conversation (Experimental)

When enabled, the assistant automatically keeps the microphone open after any response that contains a question (?). This is implemented using the native continue_conversation flag in HA's ConversationResult — no separate automation is needed.

Note: This feature requires a satellite device that supports assist_satellite.start_conversation. Behaviour may vary between satellite types.

(back to top)


Controlling devices

Enable Allow AI to control Home Assistant devices in the options, then expose the entities you want via Settings → Voice Assistants → Exposed devices.

Example commands

What you say What happens
"Turn off the kitchen light" light.turn_off
"Open the blinds" cover.open_cover
"Lock the front door" lock.lock
"Play something in the living room" media_player.media_play
"Activate the movie scene" scene.turn_on

Supported domains

light · switch · cover · media_player · fan · climate · lock · alarm_control_panel · scene · script · automation · homeassistant

(back to top)


Using as a service action

Use conversation.process in automations or scripts:

action: conversation.process
data:
  agent_id: conversation.mistral_ai_conversation
  text: "What is the temperature in the living room?"
response_variable: result

The response text is in result.response.speech.plain.speech.

Example: Smart doorbell notification

alias: Smart doorbell notification
sequence:
  - action: conversation.process
    data:
      agent_id: conversation.mistral_ai_conversation
      text: >
        The doorbell rang at {{ now().strftime('%H:%M') }}.
        Write a short, friendly notification message.
    response_variable: ai_result
  - action: notify.mobile_app
    data:
      title: "Doorbell 🔔"
      message: "{{ ai_result.response.speech.plain.speech }}"

(back to top)


Speech recognition (STT)

A stt.mistral_ai_stt_voxtral entity is registered automatically.

Voxtral specifications

Property Value
Model voxtral-mini-latest
Supported format WAV (16-bit, 16 kHz, mono PCM)
Languages 60+ with auto-detect
Pricing ~$0.003 per minute

Setting STT language

In the options, select a language from the dropdown for best accuracy, or leave it on Auto-detect.

(back to top)


Text-to-speech (TTS)

When the integration is installed, a Mistral AI TTS entity is registered automatically as a separate HA TTS provider. It uses Mistral's /v1/audio/speech endpoint and supports two operating modes (see TTS modes below): low-latency streaming WAV (default, since v0.4.0) and single-shot MP3.

TTS modes

Selectable via Settings → Devices & Services → Mistral AI Conversation → Configure → Text-to-speech mode:

Mode Behaviour First audio Best for
Streaming (default) Server-Sent Events with chunked WAV. Sentences are extracted from the LLM token stream and dispatched to Mistral in parallel (up to 5 concurrent), with audio reassembled in strict order. ~0.5–1 s HA Voice satellites (Voice PE, ESPHome)
Batch Single non-streaming POST returns the full MP3 in a JSON body before any audio is yielded. After full synthesis (~3–5 s for long replies) Direct tts.speak service calls; environments where chunked HTTP is problematic

Note: Direct tts.speak service calls always use the batch path regardless of this setting (the service expects a single audio file, not a stream).

Voxtral TTS specifications

Property Value
Model voxtral-mini-tts-2603
Stream container WAV (24 kHz, 16-bit, mono PCM)
Batch container MP3 (base64-wrapped JSON response)
Voices EN-US (Paul), GB (Jane, Oliver), FR (Marie) — emotion variants

Selecting a voice

In Settings → Devices & Services → Mistral AI Conversation → Configure, choose from the available voices. The available voices are retrieved dynamically. Currently there are only voices for EN, GB and FR available.

(back to top)

---

FAQ

Q: The integration does not appear in the Voice Assistants dropdown. A: Make sure you performed a full restart (not just reload) and cleared any __pycache__ directories.

Q: I get a 400 Bad Request error. A: Check the HA logs for the full error body. A common cause is an invalid model name or a temperature value outside 0.0–1.0.

Q: Can I use TTS with this integration? A: Now that Mistral has a TTS model, yes. Please refer to Text-to-speech (TTS) section for details.

Q: How much does it cost? A: With ministral-8b-latest and typical home use, expect less than €1–2 per month. Voxtral STT adds ~€0.003/minute. See mistral.ai/pricing.

Q: Does continue conversation work on all satellites? A: It requires a satellite that supports the assist_satellite integration and start_conversation. It has been tested with ESPHome voice satellites. Behaviour on other devices may vary.

Q: Are my conversations stored? A: Mistral AI processes requests via their servers. See their privacy policy for details.

(back to top)


Release Notes

2026.05 — 2026-05-04

  • Changed: Version numbering from digits to year.month.version format.
  • Added: New AI task entity that lets Mistral generate structured data, handle image attachments, and respect output schemas defined in HA automations.
  • Added: Streaming TTS via Mistral's SSE WAV endpoint (response_format=wav, stream=true). End-of-speech-to-first-audio drops from ~5 s to ~1 s for long responses — audio chunks arrive while synthesis is in progress instead of waiting for the full MP3.
  • Added: Sentence-level pipelining — incoming LLM tokens are segmented into complete sentences and dispatched to Mistral in parallel (up to 5 concurrent requests bounded by an asyncio.Semaphore). Audio is reassembled in strict order with the WAV header from sentence 0 followed by raw PCM samples from sentences 1..N.
  • Added: tts_mode integration option — choose between stream (default) and batch. Direct tts.speak service calls always use the batch path regardless of this setting.
  • Added: tests/ directory with 64 unit tests covering _streaming helpers (sentence segmenter, SSE parser), const sanity (defaults reference valid values, voice naming convention, WAV header pin), _pcm_to_wav (round-trips RIFF/WAVE, fmt subchunk, data subchunk), and _async_stream_delta (chat-completions SSE: text deltas, [DONE] termination, tool-call accumulation, frame-split tolerance, malformed JSON resilience). Run with python -m unittest discover -t . -s tests.
  • Added: MIT LICENSE file at the repo root.
  • Fixed: async_unload_entry now unloads platforms before clearing runtime data — previously the order was reversed, which could race with in-flight streaming TTS that resolves self._runtime lazily on each chunk.
  • Changed: Minimum Home Assistant version bumped to 2025.10.0 — required for async_stream_tts_audio (HA streaming TTS API, shipped 2025.7).

v0.3.6 — 2026-04-10

  • Changed :Setting default voices and language has been clarified

v0.3.5 — 2026-04-10

  • Fixed: : Update available TTS voices and set new default by @kalon33 in #12
  • Changed: :Translate in French by @kalon33 in #13

v0.3.1.2 — 2026-04-09

  • Fixed: TTS voice list corrected — previous lists contained non-existent voice IDs. Replaced with the complete official list of 20 voices from Mistral TTS documentation, covering English (casual, cheerful, neutral), French, Spanish, German, Italian, Portuguese, Dutch, Arabic, and Hindi. Default changed to neutral_female.

v0.3.1.1 — 2026-04-09

  • Fixed: Mistral TTS returned HTTP 400 Invalid model — corrected model name from mistral-tts-latest to voxtral-mini-tts-2603.
  • Fixed: TTS API returns base64-encoded audio in a JSON audio_data field, not raw bytes — the response is now decoded correctly via base64.b64decode().
  • Fixed: Voice parameter renamed from voice to voice_id to match the Mistral API spec.
  • Updated: Voice list replaced with actual Mistral TTS voices in language_name_style format (e.g. gb_oliver_excited). New default: s3_rachel. Added voices for EN-GB, EN-US, FR, DE, ES, NL, IT, PT.

v0.3.1 — 2026-04-09

  • Fixed (HA 2026.4): TypeError: can only concatenate str (not "list") to str — HA 2026.4 changed chat_log.async_add_delta_content_stream to expect the generator to yield plain types directly: str for text deltas, llm.ToolInput for completed tool calls. Our generator was still yielding wrapper dicts ({"content": ..., "tool_calls": [...]}), causing HA to attempt concatenating a list onto a string. Fixed _async_stream_delta to yield str and llm.ToolInput objects directly. Tool calls are still buffered until all arguments are streamed before being yielded.
  • Added: Text-to-speech (TTS) platform using Mistral TTS (mistral-tts-latest) via /v1/audio/speech. Returns MP3 audio. Registers as a third separate HA device alongside Conversation and STT.
  • Added: Six selectable TTS voices: nova (default), alloy, echo, fable, onyx, shimmer — all multilingual.
  • Added: TTS voice selector in the integration options (Settings → Configure).

v0.2.2.3 — 2026-03-05

  • Fixed: 422 Unprocessable Entity from Mistral API — HA tool parameters were being sent in HA's own intermediate list format [{"type": "string", "name": "area", ...}] instead of the OpenAI-compatible JSON Schema format Mistral requires ({"type": "object", "properties": {...}, "required": [...]}). Added _ha_params_to_json_schema() which performs the full conversion, including: string/integer/float/boolean primitives, selectenum, multi_select → array of enum, list → string array, dict → object. The required list is only populated for parameters that have required: true and no optional: true.

v0.2.2.2 — 2026-03-05

  • Fixed: TypeError: Type is not JSON serializable: function — voluptuous validators (str, int, bool, etc.) are Python callables and were ending up as values inside tool parameter schemas. Two-part fix:
    1. _format_tool now uses voluptuous_serialize.convert() with HA's cv.custom_serializer — the same approach used by HA's own OpenAI and Gemini integrations — to produce a proper JSON Schema dict from tool.parameters.
    2. _sanitize extended to handle non-serializable values (functions, types, voluptuous validators): anything that is not a JSON scalar, dict, or list is now converted to repr(obj) instead of being passed through, so a single unexpected value can never crash serialization.

v0.2.2.1 — 2026-03-05

Community contributions merged with priority fix applied.

  • Fixed (priority): TypeError: Dict key must be a type serializable with OPT_NON_STR_KEYS — root cause identified as voluptuous schema objects (vol.Required, vol.Optional) being used as dict keys in tool parameter schemas from HA's LLM API. A recursive _sanitize() helper now converts all dict keys to plain strings before any payload is passed to aiohttp. Applied to messages, tools, and all nested structures.
  • Fixed: _convert_chat_log_to_messages now explicitly casts all id, tool_name, content values to str, and tool_result/tool_args dicts are also sanitized before json.dumps.
  • Added (community): MistralRuntimeData dataclass in __init__.py — shared aiohttp.ClientSession and auth headers stored in hass.data, avoiding repeated header construction per request.
  • Added (community): Re-authentication flow (async_step_reauth) — when the API key becomes invalid, HA now shows a re-auth notification instead of leaving the integration broken.
  • Added (community): Native HA LLM API integration via CONF_LLM_HASS_API — replaces the custom CONF_CONTROL_HA approach. Device control now uses HA's standard Assist API, identical to how Google Gemini and OpenAI integrations work.
  • Added (community): Streaming responses via chat_log.async_add_delta_content_stream — words appear progressively in the HA UI.
  • Added (community): Tool-call loop (max 10 iterations) for multi-step HA device control commands.
  • Added (community): Web search option (Beta) — uses Mistral's Agents/Conversations API. Requires mistral-medium-latest or mistral-large-latest.
  • Added (community): STT now uses the shared runtime session from hass.data instead of creating a new client per request.
  • Kept: continue_conversation (Experimental) — re-integrated into the new streaming architecture. Reads the final speech text from ConversationResult and sets continue_conversation=True when a ? is detected.

v0.2.2 — 2026-03-05

  • Fixed: TypeError: Dict key must be a type serializable with OPT_NON_STR_KEYS — caused by a community contribution that passed HA ChatLog objects into the aiohttp JSON payload. The _async_handle_message method now intentionally ignores the chat_log argument and manages its own rolling history using _make_message(), which explicitly casts all keys and values to plain Python strings before serialization.
  • Fixed: service_data keys returned by the model are also explicitly cast to str as an additional safeguard against non-string keys in nested payload structures.

v0.2.1 — 2026-02-23

  • Fixed: Service confirmation responses are now fully dynamic and language-aware. The AI generates the confirmation text itself (in whatever language the user is speaking) via a "confirmation" field in the JSON action payload. The hardcoded English _SERVICE_PAST_TENSE dictionary has been removed entirely.
  • Fixed: volume_set service call was incorrectly blocked — added volume_set, volume_mute, select_source, select_sound_mode, media_next_track, media_previous_track to the media_player allowlist.
  • Fixed: Service calls with extra parameters (e.g. volume_level, temperature) now work correctly via a "service_data" field in the JSON payload.
  • Improved: Extended allowlist with cover.set_cover_position, fan.set_percentage, fan.set_preset_mode, climate.set_temperature, climate.set_hvac_mode, input_boolean, input_number, and number domains.

v0.2.0 — 2026-02-23

Breaking: Removed Agent mode — integration now uses Model mode only.

  • Removed: Agent mode and all Mistral Console agent configuration. All configuration is now done directly in Home Assistant.
  • Added: continue_conversation option — when enabled, the assistant automatically keeps the microphone open after responses containing a question. Implemented natively via HA's ConversationResult.continue_conversation flag (no external automation required). Labelled Experimental.
  • Updated: Model list — removed deprecated ministral-7b-latest and open-codestral-mamba. Added ministral-8b-latest (new default) and ministral-3b-latest. ministral-8b-latest is the recommended model for home automation: fast, cost-effective, and excellent at structured instruction-following.
  • Fixed: All hardcoded Dutch strings in Python code replaced with English fallbacks. UI labels remain available in both English and Dutch via translation files.
  • Fixed: Service confirmation messages no longer start with "Done!" / "Klaar!". Format is now e.g. "Kitchen light has been turned off."
  • Fixed: Wrong GitHub URL in documentation corrected from SnarfNL/mistral_conversation to SnarfNL/HA_MistralAI.
  • Fixed: Removed "(only in Model mode)" labels from all UI options since Agent mode no longer exists.
  • Optimised: _post_chat error handling consolidated; HomeAssistantError and aiohttp.ClientError caught in a single handler. Error messages are now in English.
  • Optimised: History trimming now preserves exactly the last 40 messages (20 turns) using a single slice operation.

v0.1.8 — 2026-02-21

  • Added: icon.png (128 px) and icon@2x.png (256 px) — Mistral M-logo on orange rounded-square background.
  • Added: images/ folder with 256 px and 512 px versions for submission to the home-assistant/brands repository.
  • Added: Comprehensive README.md modelled after the BlaXun integration.
  • Fixed: STT and conversation entities now have separate DeviceInfo with distinct identifiers, matching the pattern used by the Google Gemini integration.
  • Fixed: MistralSTTEntity was missing DeviceInfo entirely — caused WebSocket handler errors (Received binary message for non-existing handler).
  • Fixed: PCM-to-WAV wrapping now always applied regardless of metadata.format, fixing 400 errors on the Voxtral endpoint.
  • Fixed: Full HTTP response body now logged on any 4xx/5xx from the chat API, making debugging possible.

v0.1.7 — 2026-02-21

  • Fixed: STT 400 error: HA always delivers raw PCM bytes; the WAV wrapper was incorrectly skipped when metadata.format == WAV.
  • Fixed: Conversation 400 error: error response body was silently discarded; now logged at ERROR level.
  • Fixed: HomeAssistantError raised inside _post_chat was not caught by the aiohttp.ClientError handler — added combined except clause.
  • Fixed: DeviceInfo added to MistralSTTEntity to allow correct HA device registration.

v0.1.6 — 2026-02-21

  • Added: Speech-to-text (STT) platform using Mistral's Voxtral Mini (voxtral-mini-latest).
  • Added: Agent mode — use a pre-configured agent from Mistral Console via agent_id.
  • Added: STT language selector (dropdown with 60+ languages + Auto-detect).
  • Changed: Conversation and STT entities registered as separate HA devices.

v0.1.5 — 2026-02-21

  • Added: icon.png and icon@2x.png in the component directory.
  • Added: Full README.md with installation guide, option descriptions, automation examples and FAQ.

v0.1.4 — 2026-02-21

  • Fixed: Mistral API rejects temperature values above 1.0 — clamped to 0.0–1.0.
  • Fixed: Removed top_p from API payload (cannot be sent together with temperature).
  • Added: ConversationEntityFeature.CONTROL to enable device control.
  • Improved: JSON extraction from AI response now handles markdown code fences.

v0.1.3 — 2026-02-21

  • Fixed: MistralOptionsFlow.__init__ tried to set self.config_entry which is a read-only property in HA 2024.x — removed __init__.
  • Added: _async_handle_message (HA 2024.6+ API) with async_process fallback for older versions.

v0.1.2 — 2026-02-21

  • Fixed: Conversation agent did not appear in the Voice Assistants dropdown because entities were registered directly instead of via the conversation platform.
  • Changed: Switched to async_forward_entry_setups with PLATFORMS = ["conversation"].

v0.1.1 — 2026-02-21

  • Fixed: 500 error in config flow caused by incorrect OptionsFlow structure.
  • Changed: Deprecated conversation.async_set_agent() replaced by proper platform setup.

v0.1.0 — 2026-02-21

  • Initial release.
  • Mistral AI selectable as conversation agent in HA Assist.
  • Configurable model, system prompt, temperature and max tokens via the UI.
  • Home Assistant device control via spoken commands.
  • Conversation history per session.

(back to top)


License

Distributed under the MIT License. See LICENSE for more information.

(back to top)


Made with ❤️ for the Home Assistant community
Inspired by the work of BlaXun

About

Home Assistant custom integration — Mistral AI as conversation agent and Voxtral as speech-to-text engine. ⚠️ Please note this is not an officially supported integration and is not affiliated with Mistral AI in any way.

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages