# model routing policy

## primary model switching

nicholai can request any model via "switch to [alias]":

- `opus` → anthropic/claude-opus-4-5 (metered)
- `sonnet` → anthropic/claude-sonnet-4-5 (metered)
- `kimi` → opencode/openrouter/moonshotai/kimi-k2.5 (free-ish)
- `gemini-flash` → opencode/google/antigravity-gemini-3-flash (free)
- `gemini-pro` → opencode/google/antigravity-gemini-3-pro (free)
- `glm-local` → opencode/ollama/glm-4.7-flash:latest (free, local)
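the alias table above can be kept as a plain lookup. a minimal sketch (the `resolve_alias` helper is illustrative, not an existing api; model ids are copied from the list):

```python
# alias → full model id, per the switching table above
MODEL_ALIASES = {
    "opus": "anthropic/claude-opus-4-5",                            # metered
    "sonnet": "anthropic/claude-sonnet-4-5",                        # metered
    "kimi": "opencode/openrouter/moonshotai/kimi-k2.5",             # free-ish
    "gemini-flash": "opencode/google/antigravity-gemini-3-flash",   # free
    "gemini-pro": "opencode/google/antigravity-gemini-3-pro",       # free
    "glm-local": "opencode/ollama/glm-4.7-flash:latest",            # free, local
}

def resolve_alias(alias: str) -> str:
    """map a "switch to [alias]" request to its full model id; unknown aliases are an error, not a silent fallback."""
    try:
        return MODEL_ALIASES[alias]
    except KeyError:
        raise ValueError(f"unknown model alias: {alias!r}")
```

keeping unknown aliases as a hard error avoids accidentally routing to a metered model on a typo.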
## sub-agent routing

### when anthropic weekly usage > 80%:

sub-agents MUST default to free models:

1. gemini-flash (preferred for lightweight tasks)
2. gemini-pro (for heavier reasoning)
3. glm-local (local fallback)
if ALL free models are unavailable:

- notify nicholai immediately
- ask: anthropic oauth or glm-local?
- do NOT auto-fall back to metered models
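the >80% rule above can be sketched as a single routing function. a sketch under stated assumptions: `is_available` is a hypothetical availability probe passed in by the caller, and the model names come from the priority list above:

```python
# free-model priority order, per the sub-agent routing policy
FREE_MODELS = ["gemini-flash", "gemini-pro", "glm-local"]

def pick_subagent_model(is_available) -> str:
    """return the first available free model; never auto-fall back to metered models."""
    for model in FREE_MODELS:
        if is_available(model):
            return model
    # all free models down: surface the decision instead of silently spending quota
    raise RuntimeError(
        "no free models available - ask nicholai: anthropic oauth or glm-local?"
    )
```

raising here (rather than returning a metered model) is the point: the choice to spend metered quota stays with nicholai.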
### when anthropic weekly usage ≤ 80%:

sub-agents can use any model as appropriate for the task
## kimi fallback chain

1. kimi via ollama (preferred, local)
2. kimi via openrouter (fallback, notify nicholai)
3. if both fail: notify nicholai for alternative
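the three-step chain above, as a sketch: `try_ollama`, `try_openrouter`, and `notify` are hypothetical callables supplied by the caller (the providers raise on failure), not real client apis:

```python
def get_kimi(try_ollama, try_openrouter, notify):
    """walk the kimi fallback chain: local ollama first, then openrouter, then give up loudly."""
    try:
        return try_ollama()  # step 1: preferred, local
    except Exception:
        pass
    try:
        notify("kimi: ollama unavailable, falling back to openrouter")  # step 2
        return try_openrouter()
    except Exception:
        notify("kimi: both providers failed, need an alternative")  # step 3
        raise
```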
## checking usage

before spawning sub-agents, check usage via session_status

look at the weekly usage percentage to determine the routing tier