Agents
Schakel uses four LLM-powered agents, each with a specific role. All system prompts are defined as constants in app/services/llm/router.py (source of truth).
All agents operate in Spanish. All agent output that reaches TTS must be natural spoken language -- no markdown, bullet lists, or code.
Intent Router
Prompt constant: ROUTER_SYSTEM_PROMPT
The intent router is the first step in the pipeline. It classifies user text into exactly one of three intents using structured JSON output with an enum constraint.
| Intent | Trigger | Example |
|---|---|---|
| DOMOTICA | Device control (lights, switches, climate, covers, timers, scenes) | "enciende la luz del salon" |
| MUSICA | Music playback (play, pause, skip, volume) | "pon Despacito" |
| GENERAL | Everything else (questions, conversation, information) | "que tiempo hace manana" |
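Output format: a JSON object carrying exactly one of the three intent values above (DOMOTICA, MUSICA, or GENERAL).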
Fallback: If classification fails (JSON parse error or unknown value), the router defaults to GENERAL. This ensures the user always gets a response even if the classifier produces unexpected output.
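The fallback rule can be sketched as a small parsing helper. This is a minimal illustration, not the code in router.py: the function name parse_intent and the JSON key "intent" are assumptions, since the exact output format is not shown here.

```python
import json

# The three intents listed in the table above.
VALID_INTENTS = {"DOMOTICA", "MUSICA", "GENERAL"}


def parse_intent(raw: str) -> str:
    """Return the classified intent, or GENERAL if the output is unusable.

    Assumes the classifier emits an object like {"intent": "MUSICA"};
    the key name "intent" is hypothetical.
    """
    try:
        payload = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        # JSON parse error: fall back so the user still gets a response.
        return "GENERAL"
    if not isinstance(payload, dict):
        return "GENERAL"
    intent = payload.get("intent")
    # Unknown or non-string value: fall back as well.
    if isinstance(intent, str) and intent in VALID_INTENTS:
        return intent
    return "GENERAL"
```

Because every failure path returns GENERAL, the caller never has to handle an exception from the classifier step.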
Domotica Agent
Prompt constant: DOMOTICA_SYSTEM_PROMPT
Translates natural language into a Home Assistant service call. The agent receives the current entity list as context so it only targets devices that actually exist in your HA instance.
Output schema (DOMOTICA_JSON_SCHEMA):

    {
      "domain": "light",
      "service": "turn_on",
      "target": {"entity_id": "light.salon"},
      "service_data": {},
      "confirmation": "Luz del salon encendida."
    }

Supported domains: light, switch, climate, cover, fan, lock, timer, scene, script, input_boolean.
Behavioral Rules
- Output is exclusively JSON -- never conversational text.
- The confirmation field is a single short spoken phrase (e.g. "Hecho.", "Temporizador puesto a 10 minutos.").
- If the user mentions a device not in the entity list, the agent returns domain: "unknown" with the confirmation "No he encontrado ese dispositivo."
- After parsing, the router executes ha_client.call_service() and returns the confirmation via TTS.
- If the HA service call fails, the router responds with "Ha habido un error al ejecutar la accion."
Execution Flow
- The router sends the user's text and the entity list to the domotica LLM.
- The LLM returns a JSON action with domain, service, target, and confirmation.
- The router parses the JSON and calls ha_client.call_service(domain, service, target, service_data).
- On success, the confirmation text is sent to TTS.
- On failure, a generic error message is sent to TTS instead.
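The flow above might look roughly like the following sketch. Several details are assumptions: the function name run_domotica_action and the HAClient protocol are hypothetical, call_service is treated as a synchronous call, a malformed JSON action is mapped to the same generic error message, and an "unknown" domain is assumed to skip the Home Assistant call entirely.

```python
import json
from typing import Any, Protocol

NOT_FOUND = "No he encontrado ese dispositivo."
HA_ERROR = "Ha habido un error al ejecutar la accion."


class HAClient(Protocol):
    """Hypothetical shape of the Home Assistant client used by the router."""

    def call_service(
        self, domain: str, service: str, target: dict, service_data: dict
    ) -> Any: ...


def run_domotica_action(raw: str, ha_client: HAClient) -> str:
    """Parse the LLM's JSON action, execute it, and return the text for TTS."""
    try:
        action = json.loads(raw)
        domain = action["domain"]
        if domain == "unknown":
            # Assumption: nothing is executed for an unknown device.
            return action.get("confirmation", NOT_FOUND)
        ha_client.call_service(
            domain,
            action["service"],
            action["target"],
            action.get("service_data", {}),
        )
        return action["confirmation"]
    except Exception:
        # Any parse or service failure becomes the generic spoken error.
        return HA_ERROR
```

Returning a string on every path keeps the TTS step simple: it always has something to say, whether the action succeeded or not.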
Musica Agent
Prompt constant: MUSICA_SYSTEM_PROMPT
Translates natural language into a Spotify playback command. The router dispatches the parsed action to SpotifyClient and uses the real Spotify result (track/artist names from search) as the TTS response -- not the LLM's guess.
Output schema (MUSICA_JSON_SCHEMA):

    {
      "action": "play_track",
      "query": "Despacito Luis Fonsi",
      "value": null,
      "confirmation": "Reproduciendo Despacito de Luis Fonsi."
    }

Available Actions
| Action | query | value | Description |
|---|---|---|---|
| play_track | search string | null | Search and play a track |
| play_artist | artist name | null | Play an artist's top tracks |
| play_album | album name | null | Play an album |
| play_playlist | playlist name | null | Play a playlist |
| pause | null | null | Pause playback |
| resume | null | null | Resume playback |
| next | null | null | Skip to next track |
| previous | null | null | Go to previous track |
| volume | null | 0-100 | Set volume percentage |
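The table can be read as a set of constraints on query and value per action. The check below is one way to express those constraints in code; the function is_valid_musica_action is hypothetical, and whether the router performs any such validation is not stated here.

```python
from typing import Optional

# Actions grouped by the arguments the table assigns to them.
SEARCH_ACTIONS = {"play_track", "play_artist", "play_album", "play_playlist"}
TRANSPORT_ACTIONS = {"pause", "resume", "next", "previous"}


def is_valid_musica_action(
    action: str, query: Optional[str], value: Optional[int]
) -> bool:
    """Check an action against the query/value columns of the table."""
    if action in SEARCH_ACTIONS:
        # Search actions need a non-empty query and no value.
        return isinstance(query, str) and query.strip() != "" and value is None
    if action in TRANSPORT_ACTIONS:
        # Transport actions take neither argument.
        return query is None and value is None
    if action == "volume":
        # Volume takes an integer percentage; bool is excluded explicitly
        # because it is a subclass of int in Python.
        return (
            query is None
            and isinstance(value, int)
            and not isinstance(value, bool)
            and 0 <= value <= 100
        )
    return False
```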
Behavioral Rules
- "Pon musica" (play music) without specifics maps to resume.
- If Spotify is not configured or has no cached token, the agent returns "La musica no esta disponible todavia." without crashing.
- Confirmation text comes from the actual Spotify response, not the LLM output. This ensures the assistant announces the correct track/artist name.
Execution Flow
- The router sends the user's text to the musica LLM.
- The LLM returns a JSON action with action type, query, and optional value.
- The router dispatches to the appropriate SpotifyClient method (e.g., play_track(query)).
- The Spotify client returns the real result (actual track name, artist).
- The real result is used as the TTS response, overriding the LLM's confirmation field.
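A dispatch step along these lines could be sketched as follows. Only play_track(query) is named in this page; the other method names are assumed to mirror the action names, each method is assumed to return a ready-to-speak string, and a missing client is modelled as None. The function name dispatch_musica is hypothetical.

```python
from typing import Any, Dict, Optional

UNAVAILABLE = "La musica no esta disponible todavia."

# Explicit allow-lists, so arbitrary LLM output is never used as a method name.
QUERY_ACTIONS = {"play_track", "play_artist", "play_album", "play_playlist"}
NO_ARG_ACTIONS = {"pause", "resume", "next", "previous"}


def dispatch_musica(action: Dict[str, Any], spotify: Optional[Any]) -> str:
    """Run the parsed action and return the Spotify result as TTS text."""
    if spotify is None:
        # Spotify not configured or no cached token.
        return UNAVAILABLE
    name = action.get("action")
    if name in QUERY_ACTIONS:
        return getattr(spotify, name)(action["query"])
    if name in NO_ARG_ACTIONS:
        return getattr(spotify, name)()
    if name == "volume":
        return spotify.volume(action["value"])
    # How unsupported actions are handled is not documented; raising is
    # one possible choice.
    raise ValueError(f"unsupported action: {name!r}")
```

Note that the action's confirmation field is never read: the returned text comes from the client, which matches the rule that the real Spotify result overrides the LLM's guess.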
General Agent
Prompt constant: GENERAL_SYSTEM_PROMPT
Conversational assistant for questions and general information. This is the catch-all agent for anything that isn't device control or music playback.
The general agent can use either the local (Ollama) or cloud (OpenAI-compatible) LLM backend, controlled by the use_cloud config flag.
Behavioral Rules
- Max 2-3 short sentences.
- Natural spoken language only -- no markdown, lists, code, asterisks, numbering, or special formatting.
- Concise and direct.
- If the LLM call fails, responds with "Lo siento, no puedo responder ahora mismo."
Backend Selection
When use_cloud: false (default), the general agent uses the local Ollama instance with the model specified in general_model.
When use_cloud: true, it uses the configured cloud provider (OpenAI, Anthropic, or Mistral) with the specified cloud_model.
The router, domotica, and musica agents always use the local Ollama backend regardless of the use_cloud setting, as they need fast, deterministic structured JSON responses.
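The selection rule can be summarised in a few lines. This is an illustrative sketch only: use_cloud, general_model, and cloud_model are the config names used above, while LLMConfig, select_backend, general_model_name, and the placeholder model strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LLMConfig:
    """Hypothetical container for the three settings named above."""

    use_cloud: bool = False
    general_model: str = "local-model-placeholder"
    cloud_model: str = "cloud-model-placeholder"


def select_backend(agent: str, cfg: LLMConfig) -> str:
    """Return "cloud" only for the general agent when use_cloud is set."""
    if agent == "general" and cfg.use_cloud:
        return "cloud"
    # Router, domotica, and musica always stay on the local Ollama backend.
    return "ollama"


def general_model_name(cfg: LLMConfig) -> str:
    """Pick the model the general agent would use under this config."""
    return cfg.cloud_model if cfg.use_cloud else cfg.general_model
```

With use_cloud enabled, select_backend("general", cfg) yields "cloud" while select_backend("domotica", cfg) still yields "ollama", which reflects the rule that the structured-output agents ignore the flag.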