Agents

Schakel uses four LLM-powered agents, each with a specific role. All system prompts are defined as constants in app/services/llm/router.py (source of truth).

All agents operate in Spanish. All agent output that reaches TTS must be natural spoken language -- no markdown, bullet lists, or code.

Intent Router

Prompt constant: ROUTER_SYSTEM_PROMPT

The intent router is the first step in the pipeline. It classifies user text into exactly one of three intents using structured JSON output with an enum constraint.

| Intent | Trigger | Example |
|---|---|---|
| DOMOTICA | Device control (lights, switches, climate, covers, timers, scenes) | "enciende la luz del salon" |
| MUSICA | Music playback (play, pause, skip, volume) | "pon Despacito" |
| GENERAL | Everything else (questions, conversation, information) | "que tiempo hace manana" |

Output format:

{"intent": "DOMOTICA"}

Fallback: If classification fails (JSON parse error or unknown value), the router defaults to GENERAL. This ensures the user always gets a response even if the classifier produces unexpected output.
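The fallback can be illustrated with a minimal sketch. The names VALID_INTENTS, ROUTER_JSON_SCHEMA, and parse_intent are hypothetical and only show one way an enum-constrained schema and a GENERAL fallback might fit together; the actual definitions live in app/services/llm/router.py.

```python
import json

# Hypothetical names: the real schema and parser are defined in router.py.
VALID_INTENTS = ("DOMOTICA", "MUSICA", "GENERAL")

# JSON Schema with an enum constraint, as one might pass to a
# structured-output LLM call.
ROUTER_JSON_SCHEMA = {
    "type": "object",
    "properties": {"intent": {"type": "string", "enum": list(VALID_INTENTS)}},
    "required": ["intent"],
}


def parse_intent(raw: str) -> str:
    """Return the classified intent, or GENERAL if the output is unusable."""
    try:
        intent = json.loads(raw)["intent"]
    except (json.JSONDecodeError, KeyError, TypeError):
        # Malformed JSON, missing key, or a non-object payload.
        return "GENERAL"
    # An unknown value also falls back to GENERAL.
    return intent if intent in VALID_INTENTS else "GENERAL"
```

With this sketch, parse_intent('{"intent": "MUSICA"}') yields MUSICA, while malformed JSON or an unlisted value yields GENERAL.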


Domotica Agent

Prompt constant: DOMOTICA_SYSTEM_PROMPT

Translates natural language into a Home Assistant service call. The agent receives the current entity list as context so it only targets devices that actually exist in your HA instance.

Output schema (DOMOTICA_JSON_SCHEMA):

{
  "domain": "light",
  "service": "turn_on",
  "target": {"entity_id": "light.salon"},
  "service_data": {},
  "confirmation": "Luz del salon encendida."
}

Supported domains: light, switch, climate, cover, fan, lock, timer, scene, script, input_boolean.

Behavioral Rules

  • Output is exclusively JSON -- never conversational text.
  • The confirmation field is a single short spoken phrase (e.g. "Hecho.", "Temporizador puesto a 10 minutos.").
  • If the user mentions a device that is not in the entity list, the agent returns domain: "unknown" with the confirmation "No he encontrado ese dispositivo."
  • After parsing, the router executes ha_client.call_service() and returns the confirmation via TTS.
  • If the HA service call fails, the spoken response is "Ha habido un error al ejecutar la accion."

Execution Flow

  1. The router sends the user's text + entity list to the domotica LLM.
  2. The LLM returns a JSON action with domain, service, target, and confirmation.
  3. The router parses the JSON and calls ha_client.call_service(domain, service, target, service_data).
  4. On success, the confirmation text is sent to TTS.
  5. On failure, a generic error message is sent to TTS instead.
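The flow above can be sketched roughly as follows. FakeHAClient and handle_domotica are hypothetical names used only for illustration, the call is shown as synchronous for brevity (the real client may well be async), and treating malformed JSON or an unsupported domain like a failed service call is an assumption rather than documented behavior.

```python
import json

SUPPORTED_DOMAINS = {
    "light", "switch", "climate", "cover", "fan",
    "lock", "timer", "scene", "script", "input_boolean",
}
HA_ERROR = "Ha habido un error al ejecutar la accion."


class FakeHAClient:
    """Hypothetical stand-in for ha_client; records calls instead of contacting HA."""

    def __init__(self, fail: bool = False):
        self.fail = fail
        self.calls = []

    def call_service(self, domain, service, target, service_data):
        if self.fail:
            raise RuntimeError("HA unreachable")
        self.calls.append((domain, service, target, service_data))


def handle_domotica(raw: str, ha_client) -> str:
    """Parse the LLM action, execute it, and return the text to speak."""
    try:
        action = json.loads(raw)
        domain = action["domain"]
        confirmation = action["confirmation"]
        if domain == "unknown":
            # Device not in the entity list: speak the confirmation, call nothing.
            return confirmation
        if domain not in SUPPORTED_DOMAINS:
            return HA_ERROR  # assumption: unsupported domains are treated as errors
        ha_client.call_service(
            domain, action["service"], action["target"], action.get("service_data", {})
        )
    except Exception:
        # Assumption: parse errors and HA failures share the generic error message.
        return HA_ERROR
    return confirmation
```

Passing the sample payload shown under "Output schema" to handle_domotica with a working client returns "Luz del salon encendida." and records a single light.turn_on call.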

Musica Agent

Prompt constant: MUSICA_SYSTEM_PROMPT

Translates natural language into a Spotify playback command. The router dispatches the parsed action to SpotifyClient and uses the real Spotify result (track/artist names from search) as the TTS response -- not the LLM's guess.

Output schema (MUSICA_JSON_SCHEMA):

{
  "action": "play_track",
  "query": "Despacito Luis Fonsi",
  "value": null,
  "confirmation": "Reproduciendo Despacito de Luis Fonsi."
}

Available Actions

| Action | query | value | Description |
|---|---|---|---|
| play_track | search string | null | Search and play a track |
| play_artist | artist name | null | Play an artist's top tracks |
| play_album | album name | null | Play an album |
| play_playlist | playlist name | null | Play a playlist |
| pause | null | null | Pause playback |
| resume | null | null | Resume playback |
| next | null | null | Skip to next track |
| previous | null | null | Go to previous track |
| volume | null | 0-100 | Set volume percentage |

Behavioral Rules

  • "Pon musica" (play music) without specifics maps to resume.
  • If Spotify is not configured or has no cached token, the assistant replies "La musica no esta disponible todavia." instead of crashing.
  • Confirmation text comes from the actual Spotify response, not the LLM output. This ensures the assistant announces the correct track/artist name.

Execution Flow

  1. The router sends the user's text to the musica LLM.
  2. The LLM returns a JSON action with action type, query, and optional value.
  3. The router dispatches to the appropriate SpotifyClient method (e.g., play_track(query)).
  4. The Spotify client returns the real result (actual track name, artist).
  5. The real result is used as the TTS response, overriding the LLM's confirmation field.
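A rough sketch of the dispatch step, under stated assumptions: FakeSpotifyClient, its pause and set_volume methods, its tiny catalog, and the strings it returns are invented for illustration (only play_track(query) is named in this page), and only three of the nine actions are wired up. Returning the "not available" message for an unrecognized action is also an assumption.

```python
from typing import Optional

UNAVAILABLE = "La musica no esta disponible todavia."


class FakeSpotifyClient:
    """Hypothetical stand-in for SpotifyClient with a one-entry catalog."""

    _CATALOG = {"despacito luis fonsi": ("Despacito", "Luis Fonsi")}

    def play_track(self, query: str) -> str:
        # A real client would search Spotify; here the lookup is a dict.
        track, artist = self._CATALOG.get(query.lower(), (query, "artista desconocido"))
        return f"Reproduciendo {track} de {artist}."

    def pause(self) -> str:
        return "Musica en pausa."

    def set_volume(self, value: int) -> str:
        return f"Volumen al {value} por ciento."


def handle_musica(action: dict, spotify: Optional[FakeSpotifyClient]) -> str:
    """Dispatch a parsed action and return the client's result as the TTS text."""
    if spotify is None:
        # Spotify not configured or no cached token.
        return UNAVAILABLE
    handlers = {
        "play_track": lambda: spotify.play_track(action["query"]),
        "pause": lambda: spotify.pause(),
        "volume": lambda: spotify.set_volume(int(action["value"])),
    }
    handler = handlers.get(action.get("action"))
    if handler is None:
        return UNAVAILABLE  # assumption: unknown actions are treated as unavailable
    # The LLM's "confirmation" field is deliberately ignored here.
    return handler()
```

Note that the action's confirmation field is never read: whatever the client reports is what gets spoken, which mirrors step 5 above.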

General Agent

Prompt constant: GENERAL_SYSTEM_PROMPT

Conversational assistant for questions and general information. This is the catch-all agent for anything that isn't device control or music playback.

The general agent can use either the local (Ollama) or cloud (OpenAI-compatible) LLM backend, controlled by the use_cloud config flag.

Behavioral Rules

  • Responses are limited to two or three short sentences.
  • Natural spoken language only -- no markdown, lists, code, asterisks, numbering, or special formatting.
  • Concise and direct.
  • If the LLM call fails, the spoken response is "Lo siento, no puedo responder ahora mismo."
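The failure rule can be sketched as a thin wrapper around the LLM call. The names answer_general and llm_call are hypothetical, and treating an empty reply as a failure is an assumption that this page does not confirm.

```python
from typing import Callable

GENERAL_FALLBACK = "Lo siento, no puedo responder ahora mismo."


def answer_general(user_text: str, llm_call: Callable[[str], str]) -> str:
    """Return the LLM reply, or the spoken fallback if the call fails."""
    try:
        reply = llm_call(user_text).strip()
    except Exception:
        # Network errors, timeouts, or backend failures all end up here.
        return GENERAL_FALLBACK
    # Assumption: an empty reply is also treated as a failure.
    return reply or GENERAL_FALLBACK
```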

Backend Selection

When use_cloud: false (default), the general agent uses the local Ollama instance with the model specified in general_model.

When use_cloud: true, it uses the configured cloud provider (OpenAI, Anthropic, or Mistral) with the specified cloud_model.

The router, domotica, and musica agents always use the local Ollama backend regardless of the use_cloud setting, as they need fast, deterministic structured JSON responses.
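The selection rule can be summarized in a few lines. This is a sketch only: LLMConfig, select_backend, and general_model_name are hypothetical names, the agent labels are plain strings chosen for illustration, and the default model values are placeholders rather than real model identifiers.

```python
from dataclasses import dataclass


@dataclass
class LLMConfig:
    """Hypothetical config holder mirroring the flags named on this page."""
    use_cloud: bool = False
    general_model: str = "local-model-placeholder"
    cloud_model: str = "cloud-model-placeholder"


def select_backend(agent: str, cfg: LLMConfig) -> str:
    """Only the general agent may go to the cloud; all others stay on Ollama."""
    if agent == "general" and cfg.use_cloud:
        return "cloud"
    return "ollama"


def general_model_name(cfg: LLMConfig) -> str:
    """Model used by the general agent for the selected backend."""
    return cfg.cloud_model if cfg.use_cloud else cfg.general_model
```

With use_cloud enabled, select_backend("general", cfg) returns "cloud", while select_backend("router", cfg) still returns "ollama".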