# model routing policy
## primary model switching
nicholai can switch the primary model by saying "switch to [alias]". available aliases (ordered by intelligence):
- `opus` → anthropic/claude-opus-4-5 (metered)
- `glm` → zai/glm-5 (free)
- `sonnet` → anthropic/claude-sonnet-4-5 (metered)
- `kimi` → opencode/openrouter/moonshotai/kimi-k2.5 (free-ish)
- `gemini-flash` → opencode/google/antigravity-gemini-3-flash (free)
- `gemini-pro` → opencode/google/antigravity-gemini-3-pro (free)
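the alias table above can be sketched as a simple lookup. a minimal sketch; `MODEL_ALIASES` and `resolve_alias` are illustrative names, not an existing API:

```python
# illustrative alias table mirroring the list above: alias -> (model id, cost tier)
MODEL_ALIASES = {
    "opus": ("anthropic/claude-opus-4-5", "metered"),
    "glm": ("zai/glm-5", "free"),
    "sonnet": ("anthropic/claude-sonnet-4-5", "metered"),
    "kimi": ("opencode/openrouter/moonshotai/kimi-k2.5", "free-ish"),
    "gemini-flash": ("opencode/google/antigravity-gemini-3-flash", "free"),
    "gemini-pro": ("opencode/google/antigravity-gemini-3-pro", "free"),
}

def resolve_alias(alias: str) -> str:
    """Resolve a "switch to [alias]" request to a full model id."""
    model_id, _tier = MODEL_ALIASES[alias]
    return model_id
```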
## sub-agent routing
### when anthropic weekly usage > 80%:
sub-agents MUST default to free models:
1. gemini-flash (preferred for lightweight tasks)
2. gemini-pro (for heavier reasoning)
3. glm-local (local fallback)
if ALL free models are unavailable:
- notify nicholai immediately
- ask: anthropic oauth or glm-local?
- do NOT auto-fall back to metered models
### when anthropic weekly usage <= 80%:
sub-agents can use any model as appropriate for the task
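the tiering rules above reduce to one routing function. a minimal sketch; the function name is illustrative, and 80 is the cutoff documented above:

```python
# free models in preference order, per the list above
FREE_FALLBACK_ORDER = ["gemini-flash", "gemini-pro", "glm-local"]

def pick_subagent_model(weekly_usage_pct: float, available: set, requested: str) -> str:
    """Route a sub-agent: free-only when anthropic weekly usage is above 80%."""
    if weekly_usage_pct <= 80:
        return requested  # at or below the cutoff, any model is fine
    for model in FREE_FALLBACK_ORDER:
        if model in available:
            return model
    # all free models down: escalate instead of silently burning metered quota
    raise RuntimeError("no free model available; ask nicholai: anthropic oauth or glm-local?")
```

note the deliberate asymmetry: the unavailable-everything case raises instead of returning a metered model, matching the "do NOT auto-fall back" rule.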
## kimi fallback chain
1. kimi via ollama (preferred, local)
2. kimi via openrouter (fallback, notify nicholai)
3. if both fail: notify nicholai for alternative
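the chain above is a try-local-then-remote pattern. a minimal sketch with the transports injected as callables, since the actual ollama/openrouter client calls depend on local setup:

```python
from typing import Callable

def call_with_fallback(prompt: str,
                       ollama_kimi: Callable[[str], str],
                       openrouter_kimi: Callable[[str], str],
                       notify: Callable[[str], None]) -> str:
    """Kimi fallback chain: local ollama first, then openrouter (with a
    heads-up to nicholai), then surface the failure for a decision."""
    try:
        return ollama_kimi(prompt)  # 1. preferred: local
    except Exception:
        notify("kimi: ollama unavailable, falling back to openrouter")
        try:
            return openrouter_kimi(prompt)  # 2. fallback: openrouter
        except Exception:
            notify("kimi unavailable on both routes; need an alternative")
            raise
```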
## checking usage
before spawning sub-agents, check usage via session_status
use the weekly usage percentage to pick the routing tier above
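the check reduces to a threshold test on the session_status output. a minimal sketch; the `weekly_usage_pct` field name is an assumption, not session_status's actual schema:

```python
def routing_tier(session_status: dict) -> str:
    """Pick the sub-agent routing tier from session_status output.

    "weekly_usage_pct" is a placeholder key; read whatever field
    session_status actually reports for anthropic weekly usage.
    """
    return "free-only" if session_status["weekly_usage_pct"] > 80 else "unrestricted"
```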