# model routing policy

## primary model switching

nicholai can request any model via "switch to [alias]":

- `opus` → anthropic/claude-opus-4-5 (metered)
- `sonnet` → anthropic/claude-sonnet-4-5 (metered)
- `kimi` → opencode/openrouter/moonshotai/kimi-k2.5 (free-ish)
- `gemini-flash` → opencode/google/antigravity-gemini-3-flash (free)
- `gemini-pro` → opencode/google/antigravity-gemini-3-pro (free)
- `glm-local` → opencode/ollama/glm-4.7-flash:latest (free, local)
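the alias table above can be kept as a plain lookup. a minimal sketch (the `resolve_alias` helper is illustrative, not an existing api; model ids are copied from the list):

```python
# alias → full model id, per the switching table above
MODEL_ALIASES = {
    "opus": "anthropic/claude-opus-4-5",                            # metered
    "sonnet": "anthropic/claude-sonnet-4-5",                        # metered
    "kimi": "opencode/openrouter/moonshotai/kimi-k2.5",             # free-ish
    "gemini-flash": "opencode/google/antigravity-gemini-3-flash",   # free
    "gemini-pro": "opencode/google/antigravity-gemini-3-pro",       # free
    "glm-local": "opencode/ollama/glm-4.7-flash:latest",            # free, local
}

def resolve_alias(alias: str) -> str:
    """map a "switch to [alias]" request to its full model id; unknown aliases are an error, not a silent fallback."""
    try:
        return MODEL_ALIASES[alias]
    except KeyError:
        raise ValueError(f"unknown model alias: {alias!r}")
```

keeping unknown aliases as a hard error avoids accidentally routing to a metered model on a typo.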
## sub-agent routing

### when anthropic weekly usage > 80%:

sub-agents MUST default to free models:

1. gemini-flash (preferred for lightweight tasks)
2. gemini-pro (for heavier reasoning)
3. glm-local (local fallback)
if ALL free models are unavailable:

- notify nicholai immediately
- ask: anthropic oauth or glm-local?
- do NOT auto-fall back to metered models
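the >80% rule above can be sketched as a single routing function. a sketch under stated assumptions: `is_available` is a hypothetical availability probe passed in by the caller, and the model names come from the priority list above:

```python
# free-model priority order, per the sub-agent routing policy
FREE_MODELS = ["gemini-flash", "gemini-pro", "glm-local"]

def pick_subagent_model(is_available) -> str:
    """return the first available free model; never auto-fall back to metered models."""
    for model in FREE_MODELS:
        if is_available(model):
            return model
    # all free models down: surface the decision instead of silently spending quota
    raise RuntimeError(
        "no free models available - ask nicholai: anthropic oauth or glm-local?"
    )
```

raising here (rather than returning a metered model) is the point: the choice to spend metered quota stays with nicholai.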
### when anthropic weekly usage ≤ 80%:

sub-agents can use any model as appropriate for the task
## kimi fallback chain

1. kimi via ollama (preferred, local)
2. kimi via openrouter (fallback, notify nicholai)
3. if both fail: notify nicholai for alternative
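the three-step chain above, as a sketch: `try_ollama`, `try_openrouter`, and `notify` are hypothetical callables supplied by the caller (the providers raise on failure), not real client apis:

```python
def get_kimi(try_ollama, try_openrouter, notify):
    """walk the kimi fallback chain: local ollama first, then openrouter, then give up loudly."""
    try:
        return try_ollama()  # step 1: preferred, local
    except Exception:
        pass
    try:
        notify("kimi: ollama unavailable, falling back to openrouter")  # step 2
        return try_openrouter()
    except Exception:
        notify("kimi: both providers failed, need an alternative")  # step 3
        raise
```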
## checking usage

before spawning sub-agents, check usage via session_status

look at the weekly usage percentage to determine the routing tier