# MASTER PLAN: Interactive Agent Factory SaaS
## Codename: "GooseFactory" — Your AI Factory, Your Rules

> **Author:** Buba (synthesized from 4 specialized research agents)
> **Date:** 2026-02-06
> **Status:** PLAN — Awaiting Jake's Review
> **Supporting Research:** 4 docs, ~15,000 words, 60+ sources

---
## TL;DR — The 30-Second Pitch

Fork Goose (Block's open-source AI agent). Gut its chat UI. Wire in a **Factory Command Center** — a decision queue, pipeline kanban, and approval system that makes it painfully obvious when YOU are the bottleneck. The backend is an API + MCP server that exposes every factory operation as a conversational tool. You literally type "what needs my attention?" and get a prioritized list with one-click approve/reject. Everything you don't touch auto-advances. Everything that needs you screams at you until you act.

---
## 1. WHY THIS MATTERS

Right now the pipeline has ~64 MCP servers across 8 stages. The bottleneck isn't the AI — it's **you not knowing what's stuck on you**. The current system (Discord channels + cron heartbeats + manual checks) is passive. You have to go looking for what needs attention. That's backwards.

**The fix:** Build a system where decisions come to YOU, not the other way around. Make human-in-the-loop a first-class experience, not an afterthought.

---
## 2. ARCHITECTURE OVERVIEW

```
┌─────────────────────────────────────────────────────────────────┐
│                      YOUR INTERFACE LAYER                       │
│                                                                 │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐         │
│  │ GooseFactory │   │ Discord Bot  │   │ Mobile       │         │
│  │ Desktop App  │   │ (Buttons +   │   │ Push Notifs  │         │
│  │ (Forked      │   │  Embeds)     │   │ (Quick       │         │
│  │  Goose)      │   │              │   │  Approve)    │         │
│  └──────┬───────┘   └──────┬───────┘   └──────┬───────┘         │
│         │                  │                  │                 │
│         └──────────────────┼──────────────────┘                 │
│                            │                                    │
│  ┌─────────────────────────▼─────────────────────────────┐      │
│  │            MCP Server (Factory Operations)            │      │
│  │          11 Tools · 6 Resources · 4 Prompts           │      │
│  │ "what needs attention?" → prioritized decision queue  │      │
│  └─────────────────────────┬─────────────────────────────┘      │
└────────────────────────────┼────────────────────────────────────┘
                             │
┌────────────────────────────┼────────────────────────────────────┐
│                            ▼                                    │
│  ┌─────────────────────────────────────────────────────┐        │
│  │           Factory API (REST + WebSocket)            │        │
│  │ 30+ endpoints · Real-time events · GraphQL queries  │        │
│  └─────────────────────────┬───────────────────────────┘        │
│                            │                                    │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐         │
│  │Pipeline  │  │Task      │  │Notif +   │  │Audit     │         │
│  │Engine    │  │Queue     │  │Escalation│  │Logger    │         │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘         │
│                            │                                    │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐                       │
│  │PostgreSQL│  │Redis     │  │S3/R2     │                       │
│  │(State)   │  │(Events)  │  │(Assets)  │                       │
│  └──────────┘  └──────────┘  └──────────┘                       │
└─────────────────────────────────────────────────────────────────┘
```

---
## 3. THE GOOSE FORK — "GooseFactory"

### Why Goose?
- **Rust backend + Electron/React frontend** — production-grade, fast
- **Apache 2.0 license** — full commercial freedom, no copyleft
- **MCP-native** — already a first-class MCP host with dynamic extension discovery
- **Built-in permission system** — 4 modes including Smart Approval (risk-based)
- **Extension ecosystem** — thousands of MCP servers plug in immediately
- **Active community** — now governed under the Linux Foundation (AAIF), so governance is stable

### What We Change

| Component | Current Goose | GooseFactory |
|-----------|--------------|--------------|
| **Branding** | Goose logos, `goose://` protocol | Your brand, `factory://` protocol |
| **Default Extensions** | Developer, Memory, etc. | Factory MCP Server (built-in), Pipeline Manager |
| **Chat UI** | General-purpose assistant | Factory Command Center with decision queue sidebar |
| **Approval Flow** | Simple allow/deny on tool calls | Rich approval cards with context, diffs, metrics |
| **System Prompts** | Generic agent instructions | Factory operator mode — knows about pipeline stages, MCPs |
| **MCP UI Rendering** | Basic inline/sidecar (WIP) | Custom approval UIs, pipeline dashboards, code review panels |
| **Protocol Handler** | `goose://extension?...` | `factory://approve?task_id=...` deep links |

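As a rough illustration of the `factory://approve?task_id=...` deep link in the table above, the sketch below parses such a link with the standard `URL` class. The `FactoryDeepLink` type and `parseFactoryLink` function are hypothetical names introduced here, not part of Goose or of the plan, and only the `approve` action and `task_id` parameter come from the source.

```typescript
// Hypothetical shape of a parsed deep link; only "approve" and task_id appear in the plan.
type FactoryDeepLink = { action: string; taskId: string };

// Parse a link such as factory://approve?task_id=abc123.
// Returns null for other schemes, unparseable input, or a missing task_id.
function parseFactoryLink(raw: string): FactoryDeepLink | null {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return null; // not a parseable URL at all
  }
  if (url.protocol !== "factory:") return null;
  const taskId = url.searchParams.get("task_id");
  if (!taskId) return null;
  // For custom schemes the URL parser places "approve" in the host position.
  return { action: url.hostname, taskId };
}
```

A protocol handler in the desktop app could call this on each incoming link and ignore anything that returns `null`, which keeps malformed or foreign links from reaching the approval flow.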
### Fork Strategy

1. **Clone the repo** — `git clone https://github.com/block/goose GooseFactory`
2. **Rebrand** — `package.json`, `main.ts`, assets, protocol handler (~1-2 days)
3. **Add Factory MCP Server** as a built-in Rust extension in `crates/goose-mcp/`
4. **Customize the chat UI** — Add decision queue sidebar in React (the interesting part)
5. **Add MCP UI components** — Custom approval cards using `@mcp-ui/client`
6. **Configure Smart Approval** — Factory operations auto-classified by risk level

### ⚠️ Timing Risk
Goose is actively migrating to ACP (Agent Client Protocol), tracked in Issue #6642. The migration replaces the backend REST+SSE interface with JSON-RPC 2.0 and affects `goosed` ↔ desktop communication. **Recommendation:** Fork AFTER the ACP migration lands, or fork now and track upstream closely.

---
## 4. EVERY MOMENT YOU'RE NEEDED (Taxonomy)

Based on research across 10+ agent products and frameworks, here's every type of human-in-the-loop moment mapped to your factory:

### 🔴 CRITICAL — Always Need You

| Moment | Factory Example | UI Pattern |
|--------|----------------|------------|
| **Deploy to Production** | Promoting an MCP to live | Modal overlay with deploy checklist |
| **API Key Entry** | Configuring Stripe/GHL credentials | Secure input form in chat |
| **Client Communication** | Sending deliverables to the $20k client | Preview + approve before send |
| **Pricing/Positioning** | Setting MCP marketplace pricing | Multi-choice card with tradeoffs |
| **Legal/License Review** | Checking dependency licenses | Sidebar review panel |

### 🟡 HIGH VALUE — Usually Need You

| Moment | Factory Example | UI Pattern |
|--------|----------------|------------|
| **Design Review** | Approving UI/UX for MCP apps | Side-by-side mockup comparison |
| **Code Quality Gate** | Reviewing generated MCP server code | Diff view with inline annotations |
| **Naming/Branding** | Naming a new MCP server | A/B choice between options |
| **Test Failure Triage** | GHL's 42 failing tests — fix or skip? | Error cards with suggested actions |
| **Priority Decisions** | Which MCP to advance next? | Drag-and-drop priority list |

### 🟢 CONTEXTUAL — Sometimes Need You

| Moment | Factory Example | UI Pattern |
|--------|----------------|------------|
| **Routine Approvals** | Stage advances for passing servers | Batch approve with exceptions |
| **Parameter Tuning** | Adjusting test coverage thresholds | Slider controls |
| **Edge Cases** | AI hit a wall building a tool | Escalation card with context |
| **Delegation** | Route task to specialized agent | Dropdown assignment |

### Smart Routing (Confidence-Based)
Not everything needs to block on you:
- **>90% confidence** → Auto-execute, log for async review
- **60-90% confidence** → Queue for review, pipeline continues other work
- **<60% confidence** → Block and escalate immediately

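The three confidence bands above could be expressed as a small routing function. This is a minimal sketch, not the plan's implementation: the `Route` names and `routeByConfidence` are hypothetical, confidence is assumed to be a number in `[0, 1]`, and treating exactly 0.90 as "queue for review" is an assumption, since the bands above do not say which side the boundary falls on. Invalid scores fail closed and escalate.

```typescript
// Hypothetical route names for the three bands in the list above.
type Route = "auto_execute" | "queue_for_review" | "block_and_escalate";

// Map a confidence score in [0, 1] to a route.
// Assumption: exactly 0.90 is queued rather than auto-executed.
function routeByConfidence(confidence: number): Route {
  if (!Number.isFinite(confidence) || confidence < 0 || confidence > 1) {
    return "block_and_escalate"; // fail closed on invalid scores
  }
  if (confidence > 0.9) return "auto_execute"; // logged for async review
  if (confidence >= 0.6) return "queue_for_review"; // pipeline continues other work
  return "block_and_escalate";
}
```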

---

## 5. THE DECISION QUEUE — Your Mission Control

This is the centerpiece. A prioritized inbox of every decision the factory needs from you.

### Layout (In GooseFactory Desktop App)

```
┌─────────────────────────────────────────────────────────────┐
│ GooseFactory                                     [≡] [−] [×]│
├──────────────────┬──────────────────────────────────────────┤
│                  │                                          │
│  📥 DECISIONS (6)│  🔴 GHL MCP — Deploy to Production       │
│                  │                                          │
│  🔴 GHL Deploy   │  Pipeline: ghl-mcp-server                │
│  🟡 Stripe Review│  Stage: staging → production             │
│  🟡 3 Batch Items│  Tests: 47/47 ✅  Coverage: 94% ✅       │
│  🟢 2 FYI Items  │  Waiting: 2h 15m  SLA: ⚠️ 45m left       │
│                  │                                          │
│  ── Pipeline ──  │  Changes since last review:              │
│  [Kanban View]   │  + 12 files modified                     │
│                  │  + 3 new API endpoints                   │
│  ── Agents ──    │  + Edge case handling improved           │
│  🟢 Builder: OK  │                                          │
│  🟢 Tester: OK   │  [View Full Diff]  [Run Tests Again]     │
│  🟡 GHL: Waiting │                                          │
│                  │  ┌─────────┐ ┌─────────┐ ┌──────────┐    │
│  ── Stats ──     │  │✅ Deploy│ │❌ Reject│ │⏰ Defer  │    │
│  Today: 12 done  │  └─────────┘ └─────────┘ └──────────┘    │
│  Avg wait: 1.2h  │                                          │
│                  │  💬 Chat: "approve the GHL deploy"       │
│                  │  [________________________________] [⏎]  │
├──────────────────┴──────────────────────────────────────────┤
│ Chat: You can also just type naturally here...              │
│ > "what else needs my attention?"                           │
│ > "approve all low-risk items"                              │
│ > "show me the GHL test failures"                           │
└─────────────────────────────────────────────────────────────┘
```

### Key Features

1. **Left Sidebar: Decision Queue** — Priority-sorted, color-coded, with age timers
2. **Center: Context Panel** — Full details for the selected decision (diffs, metrics, history)
3. **Bottom: Chat** — Natural language interface to the factory ("approve all passing servers")
4. **One-Click Actions** — Approve, reject, defer, reassign, batch approve
5. **Keyboard Shortcuts** — `j/k` navigate, `a` approve, `r` reject, `d` defer
6. **SLA Indicators** — Glowing countdown timers, escalation warnings

---
## 6. MCP SERVER — The Brain

The Factory MCP Server is what makes the chat interface powerful. It exposes 11 tools, 6 resources, and 4 prompts.

### Tools (What You Can Do)

| Tool | What It Does | Example |
|------|-------------|---------|
| `factory_get_pending_tasks` | Your decision inbox | "what needs my attention?" |
| `factory_approve_task` | Approve and advance | "approve the GHL deploy" |
| `factory_reject_task` | Reject with feedback | "reject stripe review — needs more tests" |
| `factory_get_pipeline_status` | Pipeline overview | "show me all active pipelines" |
| `factory_advance_stage` | Manual stage advance | "move notion-mcp to testing" |
| `factory_assign_priority` | Set priority | "make GHL critical priority" |
| `factory_get_blockers` | What's stuck | "what's blocked and why?" |
| `factory_run_tests` | Trigger tests | "run tests on the stripe server" |
| `factory_deploy` | Deploy to env | "deploy freshdesk to staging" |
| `factory_search` | Search everything | "find all servers with auth issues" |
| `factory_create_pipeline` | New server pipeline | "start a new Zendesk MCP server" |

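To make the first three tools concrete, here is a minimal sketch of the logic they might wrap, written against an in-memory array rather than the real Factory API or the MCP SDK. Everything in it is an assumption for illustration: the `Task` fields, the three priority names, and the `getPendingTasks` and `resolveTask` helpers are hypothetical and do not come from the plan or the research docs.

```typescript
// Hypothetical task model; field names and priority levels are illustrative only.
type Priority = "critical" | "high" | "normal";
type TaskStatus = "pending" | "approved" | "rejected";

interface Task {
  id: string;
  title: string;
  priority: Priority;
  createdAt: number; // epoch milliseconds
  status: TaskStatus;
  feedback?: string;
}

const PRIORITY_ORDER: Record<Priority, number> = { critical: 0, high: 1, normal: 2 };

// Logic behind factory_get_pending_tasks: pending items, most urgent first,
// oldest first within the same priority. The input array is not mutated.
function getPendingTasks(tasks: Task[]): Task[] {
  return tasks
    .filter((t) => t.status === "pending")
    .sort(
      (a, b) =>
        PRIORITY_ORDER[a.priority] - PRIORITY_ORDER[b.priority] ||
        a.createdAt - b.createdAt,
    );
}

// Logic behind factory_approve_task / factory_reject_task: resolve one pending task.
function resolveTask(
  tasks: Task[],
  id: string,
  decision: "approved" | "rejected",
  feedback?: string,
): Task {
  const task = tasks.find((t) => t.id === id);
  if (!task) throw new Error(`unknown task: ${id}`);
  if (task.status !== "pending") throw new Error(`task ${id} is already ${task.status}`);
  task.status = decision;
  if (feedback !== undefined) task.feedback = feedback;
  return task;
}
```

Refusing to resolve a task twice is a deliberate choice in this sketch: a second "approve" arriving from Discord after the desktop app has already acted should surface as an error rather than silently overwrite the first decision.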
### Resources (What You Can Read)

| Resource | What It Provides |
|----------|-----------------|
| `factory://dashboard/summary` | High-level factory status |
| `factory://pipelines/{id}/state` | Specific pipeline details |
| `factory://servers/{name}/status` | Individual server health |
| `factory://pipelines/{id}/test-results` | Test results + coverage |
| `factory://pipelines/{id}/build-logs` | Build output |
| `factory://config/templates` | Available pipeline templates |

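The resource URIs above are templates with `{id}` and `{name}` placeholders. One way to resolve a concrete URI to its template and parameters is a plain segment-by-segment match, sketched below. The `matchResource` function and `ResourceMatch` type are hypothetical; a real server would more likely rely on whatever template support its MCP SDK provides.

```typescript
// Resource URI templates copied from the table above.
const RESOURCE_TEMPLATES: readonly string[] = [
  "factory://dashboard/summary",
  "factory://pipelines/{id}/state",
  "factory://servers/{name}/status",
  "factory://pipelines/{id}/test-results",
  "factory://pipelines/{id}/build-logs",
  "factory://config/templates",
];

interface ResourceMatch {
  template: string;
  params: Record<string, string>;
}

// Compare a concrete URI to each template one "/"-separated segment at a time.
function matchResource(uri: string): ResourceMatch | null {
  const uriParts = uri.split("/");
  for (const template of RESOURCE_TEMPLATES) {
    const tplParts = template.split("/");
    if (tplParts.length !== uriParts.length) continue;
    const params: Record<string, string> = {};
    const ok = tplParts.every((part, i) => {
      const value = uriParts[i];
      if (value === undefined) return false;
      const isPlaceholder = part.startsWith("{") && part.endsWith("}");
      if (!isPlaceholder) return part === value; // literal segments must match exactly
      if (value === "") return false; // placeholders need a non-empty value
      params[part.slice(1, -1)] = value;
      return true;
    });
    if (ok) return { template, params };
  }
  return null;
}
```

For example, `matchResource("factory://pipelines/ghl-mcp-server/state")` resolves to the `factory://pipelines/{id}/state` template with `id` set to `ghl-mcp-server`, and an unknown path returns `null`.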
### Prompts (Structured Conversations)

| Prompt | What It Sets Up |
|--------|----------------|
| `review_server` | Pull all context for a full MCP server review |
| `whats_needs_attention` | Prioritized summary of everything pending |
| `deploy_checklist` | Pre-deployment verification checklist |
| `pipeline_retrospective` | Post-completion analysis and lessons learned |

---
## 7. NOTIFICATION ESCALATION — No Decision Falls Through

This is critical. The whole point is that you CANNOT miss something.

```
T+0min     Task created   → Decision appears in GooseFactory queue
                          → Discord embed in #factory-tasks with buttons

T+30min    Reminder #1    → Discord DM + badge pulse in app
                          → "⏰ GHL deploy approval waiting 30m"

T+2h       Reminder #2    → Discord @mention + push notification
                          → "🟡 GHL deploy waiting 2h — SLA in 2h"

T+4h       SLA Warning    → Discord @here + sound alert in app
                          → "🔴 GHL deploy SLA breach imminent"

T+SLA      SLA Breach     → Auto-escalate: SMS + all channels
                          → "🚨 GHL deploy SLA BREACHED — action required"

T+SLA+2h   Critical       → Phone notification + auto-default to safest action
                          → Incident report logged
```

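The ladder above can be read as a pure function from "time since the task was created" to an escalation level. The sketch below is one possible encoding, under stated assumptions: the level names and `escalationLevel` are hypothetical, the SLA is passed in per task because the ladder only fixes the 30m, 2h, and 4h marks, and when a short SLA overlaps the fixed marks the most severe applicable level wins.

```typescript
// Hypothetical level names, one per rung of the ladder above.
type EscalationLevel =
  | "queued"
  | "reminder_1"
  | "reminder_2"
  | "sla_warning"
  | "sla_breach"
  | "critical";

const MINUTE_MS = 60_000;

// Map elapsed time since task creation to a ladder rung.
// Checks run from most to least severe so a short SLA overrides the fixed marks.
function escalationLevel(elapsedMs: number, slaMs: number): EscalationLevel {
  if (elapsedMs >= slaMs + 120 * MINUTE_MS) return "critical"; // T+SLA+2h
  if (elapsedMs >= slaMs) return "sla_breach"; // T+SLA
  if (elapsedMs >= 240 * MINUTE_MS) return "sla_warning"; // T+4h
  if (elapsedMs >= 120 * MINUTE_MS) return "reminder_2"; // T+2h
  if (elapsedMs >= 30 * MINUTE_MS) return "reminder_1"; // T+30min
  return "queued"; // T+0min
}
```

A scheduler could call this on every tick for each pending task and send a notification only when the returned level differs from the last level recorded for that task, which avoids repeating the same reminder.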
### Smart Batching
Instead of a separate ping for every decision, similar items are grouped into one message:
```
📋 5 servers ready for review:
  ✅ freshdesk (low risk, tests pass)   [Approve]
  ✅ helpscout (low risk, tests pass)   [Approve]
  ✅ close     (low risk, tests pass)   [Approve]
  ⚠️ stripe    (med risk, 1 warning)    [Review]
  ❌ ghl       (high risk, 42 failures) [Review Required]

  [Approve All Low-Risk (3)]  [Review All]
```

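The grouping behind the "Approve All Low-Risk" button in the example above might look like the sketch below. The `ReviewItem` shape, the `Batch` type, and `batchForReview` are hypothetical, and the rule that only low-risk items with passing tests are batch-approvable is an assumption read off the example, not a stated policy.

```typescript
// Hypothetical review item; risk labels follow the example message above.
type Risk = "low" | "med" | "high";

interface ReviewItem {
  server: string;
  risk: Risk;
  testsPass: boolean;
}

interface Batch {
  autoApprovable: ReviewItem[]; // covered by "Approve All Low-Risk"
  needsReview: ReviewItem[]; // must be opened individually
}

// Split items into a one-click batch and an individual-review list.
// Assumption: only low-risk items with passing tests may be batch-approved.
function batchForReview(items: ReviewItem[]): Batch {
  const batch: Batch = { autoApprovable: [], needsReview: [] };
  for (const item of items) {
    const safe = item.risk === "low" && item.testsPass;
    (safe ? batch.autoApprovable : batch.needsReview).push(item);
  }
  return batch;
}
```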

---

## 8. EVERY UI PATTERN MAPPED

Based on research across Devin, Cursor, GitHub Copilot Workspace, n8n, Retool, and 20+ other products:

### Pattern → When to Use

| Pattern | Best For | Our Implementation |
|---------|----------|-------------------|
| **Inline Chat Buttons** | Quick approve/reject | Approve/reject buttons in chat messages |
| **Modal Overlay** | Critical/irreversible actions | Production deploy confirmation (type "DEPLOY" to confirm) |
| **Sidebar Panel** | Code/asset review | Diff viewer alongside approval context |
| **Decision Queue** | Managing multiple pending items | Left sidebar in GooseFactory |
| **Kanban Board** | Pipeline stage visualization | Pipeline view tab |
| **Batch Processor** | Many similar decisions | "Approve all matching criteria" |
| **Progress Dashboard** | Long-running agent monitoring | Agent status panel |
| **Run Contract** | Pre-approving expensive operations | "This will use ~$50 in API calls, take ~4h" |
| **Mobile Quick Actions** | Approvals on the go | Push notification with swipe actions |
| **Discord Embeds** | Team visibility + async approval | Rich embeds with buttons in factory channels |
| **MCP Apps** | Complex interactive reviews | Custom HTML UIs rendered in chat (code review, forms) |

---

## 9. TECH STACK

| Layer | Technology | Why |
|-------|-----------|-----|
| **Desktop App** | Forked Goose (Electron + React 19 + Rust) | Best-in-class MCP host, extensible UI |
| **Backend API** | Node.js + Hono | Fast, lightweight, TypeScript-native |
| **Database** | PostgreSQL (Neon/Supabase) | Proven, JSONB support, great for state machines |
| **Cache/Events** | Redis (Upstash) | Pub/sub, streams, fast queue |
| **Object Storage** | Cloudflare R2 | S3-compatible, no egress fees |
| **MCP Server** | TypeScript + @modelcontextprotocol/sdk | Native MCP, stdio + SSE transport |
| **State Machine** | XState-inspired patterns | Explicit states, SLA timers, auto-escalation |
| **Orchestration** | Inngest (step.waitForEvent) | Durable execution, event correlation, timeouts |
| **Discord Bot** | discord.js | Buttons, embeds, modals, slash commands |
| **Auth** | JWT + API keys | Simple, stateless, scoped |
| **CI/CD** | GitHub Actions | Existing infra, dispatch triggers |

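The "XState-inspired patterns" row above suggests an explicit transition table rather than scattered status checks. The sketch below shows the idea in plain TypeScript without the XState library. The state and event names are hypothetical, since the plan does not list them; the point is only that any transition not in the table is rejected.

```typescript
// Hypothetical task states and events; the plan does not enumerate them.
type TaskState = "pending" | "approved" | "rejected" | "deferred" | "escalated";
type TaskEvent = "APPROVE" | "REJECT" | "DEFER" | "RESUME" | "SLA_BREACH";

// Explicit transition table: any (state, event) pair not listed is illegal.
const TRANSITIONS: Record<TaskState, Partial<Record<TaskEvent, TaskState>>> = {
  pending: { APPROVE: "approved", REJECT: "rejected", DEFER: "deferred", SLA_BREACH: "escalated" },
  deferred: { RESUME: "pending", SLA_BREACH: "escalated" },
  escalated: { APPROVE: "approved", REJECT: "rejected" },
  approved: {}, // terminal in this sketch
  rejected: {}, // terminal in this sketch
};

// Return the next state, or throw if the event is not allowed in the current state.
function transition(state: TaskState, event: TaskEvent): TaskState {
  const next = TRANSITIONS[state][event];
  if (next === undefined) {
    throw new Error(`illegal transition: ${state} + ${event}`);
  }
  return next;
}
```

Keeping the table as data makes it straightforward to persist the current state in PostgreSQL and to write every accepted transition to the audit log from a single place.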
### Why Inngest over Temporal?
- **Simpler** — No separate server cluster to manage
- **TypeScript-native** — Matches our stack
- **Event matching** — `waitForEvent` with correlation is exactly our approval pattern
- **Serverless** — Functions dehydrate while waiting, no resource consumption
- Temporal is more powerful but overkill for our scale right now. Can migrate later if needed.

---

## 10. DATABASE SCHEMA (Key Tables)

```
pipelines            — One per MCP server build
├── pipeline_stages  — Stage definitions + state machine
├── tasks            — Human decisions needed (the queue)
├── approvals        — Formal gate approvals
├── assets           — Generated code, configs, builds
└── audit_log        — Immutable event log

agents               — AI workers + build agents
notifications        — Multi-channel notification queue
```

8 tables total. Full SQL DDL in `research-factory-api-architecture.md`.

---
## 11. IMPLEMENTATION ROADMAP

### Phase 1: Foundation (Week 1-2) — "The Skeleton"
- [ ] Fork Goose, rebrand basics (name, logo, protocol)
- [ ] Set up PostgreSQL schema + migrations
- [ ] Core REST API (pipelines, tasks, approvals CRUD)
- [ ] JWT auth
- [ ] Basic audit logging
- [ ] **Deliverable:** API accepts requests, data persists

### Phase 2: MCP Server + Real-Time (Week 3-4) — "The Brain"
- [ ] Factory MCP server with core tools (get_pending, approve, reject, status)
- [ ] MCP resources (pipeline state, dashboard summary)
- [ ] WebSocket server for real-time dashboard updates
- [ ] Redis event bus with consumer groups
- [ ] Wire MCP server into GooseFactory as built-in extension
- [ ] **Deliverable:** "What needs my attention?" works in chat

### Phase 3: Decision Queue UI (Week 5-6) — "The Centerpiece"
- [ ] Decision queue sidebar in GooseFactory React UI
- [ ] Context panel with diffs, metrics, history
- [ ] One-click approve/reject/defer actions
- [ ] Keyboard shortcuts (j/k/a/r/d)
- [ ] Pipeline kanban view
- [ ] SLA countdown indicators
- [ ] **Deliverable:** Full Command Center in desktop app

### Phase 4: Notifications + Discord (Week 7-8) — "The Nagger"
- [ ] Discord bot bridge with rich embeds + buttons
- [ ] Escalation ladder (queue → DM → mention → SMS)
- [ ] Smart batching for similar decisions
- [ ] Mobile push notifications
- [ ] SLA monitoring and auto-escalation
- [ ] GitHub webhook integration
- [ ] **Deliverable:** Decisions come to you, not the other way around

### Phase 5: Advanced Features (Week 9-10) — "The Polish"
- [ ] MCP Apps for complex reviews (code diffs, forms in chat)
- [ ] Batch approval processor
- [ ] MCP prompts (review, deploy checklist, retrospective)
- [ ] Analytics dashboard (decision velocity, bottleneck analysis)
- [ ] Confidence-based auto-routing
- [ ] Undo/rollback for 24h post-approval
- [ ] **Deliverable:** Full SaaS-grade product

### Phase 6: SaaS-ify (Week 11-12) — "The Product"
- [ ] Multi-tenant support (separate factory instances)
- [ ] User management + team roles
- [ ] Billing integration
- [ ] Landing page + docs
- [ ] Onboarding flow
- [ ] **Deliverable:** Sellable product

---
## 12. WHAT MAKES THIS DIFFERENT FROM EXISTING TOOLS

| Tool | What It Does | What We Do Better |
|------|-------------|------------------|
| **Devin** | Autonomous coding agent | We're a factory MANAGER, not a single agent |
| **Cursor/Windsurf** | IDE with AI | We manage pipelines of 64+ servers, not single files |
| **n8n/Zapier** | Workflow automation | We're AI-agent-native with MCP, not just webhooks |
| **Linear/Jira** | Project management | We have AI agents doing the work, humans just decide |
| **Retool** | Internal tools | We're purpose-built for AI agent factories |
| **Goose (vanilla)** | General AI assistant | We're a specialized factory operator |

**The unique value:** None of the tools surveyed is a purpose-built human-in-the-loop command center for managing fleets of AI agents building MCP servers. As far as this research found, you'd be first.

---
## 13. IMMEDIATE NEXT STEPS

1. **Jake reviews this plan** — What's missing? What's wrong? What's the priority?
2. **Fork Goose** — Clone, rebrand, get the build running locally
3. **Spike the MCP Server** — Build the 3 most critical tools (get_pending, approve, reject) and test in Goose
4. **Spike the Decision Queue UI** — Mock up the sidebar in GooseFactory's React app
5. **Wire to existing pipeline** — Connect to `mcp-command-center/state.json` as initial data source

**The MVP is:** Type "what needs my attention?" in GooseFactory → get a prioritized list → approve/reject from chat. Everything else builds on that.

---
## SUPPORTING RESEARCH DOCS

| Doc | Words | Focus |
|-----|-------|-------|
| `research-goose-architecture.md` | ~3,000 | Goose codebase, fork strategy, MCP integration |
| `research-hitl-ux-patterns.md` | ~5,500 | Every HITL interaction type, UI patterns, 10 products analyzed |
| `research-factory-api-architecture.md` | ~4,000 | API design, MCP server spec, database schema, real-time events |
| `research-agent-orchestration-patterns.md` | ~3,500 | LangGraph, Temporal, Inngest, state machines, notification patterns |

---

*"The best interface for managing AI agents isn't more AI — it's making it painfully obvious when a human needs to do something, and making that something take one click."*