# 🦞 Clawdbot Architecture Deep Dive

> A comprehensive technical breakdown of Clawdbot's codebase, prompting system, and internal architecture.

---

## High-Level Overview

Clawdbot is a **TypeScript application** that runs on **Node.js** (v22.12+) and acts as a universal gateway between messaging platforms and AI agents. Think of it as a sophisticated message router with an embedded AI brain.

```
┌─────────────────────────────────────────────────────────────────┐
│                        MESSAGING CHANNELS                        │
│  Discord │ Telegram │ WhatsApp │ Signal │ iMessage │ Slack     │
└─────────────────────────────┬───────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                         GATEWAY SERVER                          │
│  - WebSocket control plane (ws://127.0.0.1:18789)               │
│  - HTTP server (control UI, Canvas, OpenAI-compat endpoints)   │
│  - Session management                                           │
│  - Cron scheduler                                               │
│  - Node pairing (iOS/Android/macOS)                             │
└─────────────────────────────┬───────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                         AGENT LAYER                             │
│  - Pi coding agent (embedded via @mariozechner packages)        │
│  - System prompt generation                                     │
│  - Tool definitions & policy enforcement                        │
│  - Sub-agent spawning                                           │
│  - Model routing (Anthropic, OpenAI, Gemini, Bedrock, etc.)    │
└─────────────────────────────────────────────────────────────────┘
```

---

## Directory Structure

```
/opt/homebrew/lib/node_modules/clawdbot/
├── dist/                    # Compiled JavaScript (~800 files)
│   ├── agents/              # Agent runtime, tools, system prompt
│   ├── gateway/             # Gateway server implementation
│   ├── channels/            # Channel plugin system
│   ├── cli/                 # CLI commands
│   ├── config/              # Configuration loading/validation
│   ├── browser/             # Playwright browser automation
│   ├── cron/                # Scheduled jobs
│   ├── memory/              # Semantic memory search
│   ├── sessions/            # Session management
│   ├── plugins/             # Plugin SDK
│   ├── infra/               # Infrastructure utilities
│   └── ...
├── docs/                    # Documentation (~50 files)
├── skills/                  # Built-in skills (~50 SKILL.md packages)
├── extensions/              # Channel extensions
├── assets/                  # Static assets
└── package.json             # Dependencies & scripts
```

---

## Core Components

### 1. Entry Point (`dist/entry.js`)

The CLI entry point does the following:
- Sets `process.title = "clawdbot"`
- Suppresses Node.js experimental warnings
- Handles Windows path normalization
- Loads CLI profiles
- Dispatches to `cli/run-main.js`

```javascript
#!/usr/bin/env node
process.title = "clawdbot";
installProcessWarningFilter();

// Handle profile args, then bootstrap CLI
import("./cli/run-main.js")
  .then(({ runCli }) => runCli(process.argv))
```
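
The warning suppression step can be sketched as a wrapper around `emitWarning`. This is a minimal illustration of one common approach, not the actual body of `installProcessWarningFilter()`. The names `WarningTarget`, `isExperimentalWarning`, and `installWarningFilter` are hypothetical, and the choice to drop only `ExperimentalWarning` is an assumption.

```typescript
// Hypothetical sketch: not Clawdbot's actual installProcessWarningFilter().
type WarningLike = string | { name?: string; message?: string };

// Minimal shape of the object being patched; Node's `process` has a
// compatible emitWarning method.
interface WarningTarget {
  emitWarning: (warning: WarningLike, ...rest: unknown[]) => void;
}

/** Returns true for warnings this sketch chooses to hide. */
function isExperimentalWarning(warning: WarningLike, ...rest: unknown[]): boolean {
  if (typeof warning !== "string") {
    return warning.name === "ExperimentalWarning";
  }
  // Covers emitWarning(message, type) and emitWarning(message, { type }).
  const [second] = rest;
  if (typeof second === "string") return second === "ExperimentalWarning";
  if (second && typeof second === "object") {
    return (second as { type?: string }).type === "ExperimentalWarning";
  }
  return false;
}

/** Wraps target.emitWarning so experimental warnings are dropped. */
function installWarningFilter(target: WarningTarget): void {
  const original = target.emitWarning.bind(target);
  target.emitWarning = (warning, ...rest) => {
    if (isExperimentalWarning(warning, ...rest)) return;
    original(warning, ...rest);
  };
}
```

Under these assumptions, an entry script would call `installWarningFilter(process)` before importing anything that triggers experimental APIs. All other warnings still reach the original handler.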

### 2. Gateway Server (`dist/gateway/server.impl.js`)

The heart of Clawdbot — a single long-running process that owns:

| Subsystem | Purpose |
|-----------|---------|
| Config loader | Reads/validates `~/.clawdbot/clawdbot.yaml` |
| Plugin registry | Loads channel & tool plugins |
| Channel manager | Manages Discord/Telegram/WhatsApp connections |
| Session manager | Isolates conversations, tracks history |
| Cron service | Scheduled jobs & reminders |
| Node registry | Mobile/desktop node pairing |
| TLS runtime | Secure connections |
| Control UI | Browser dashboard at `:18789` |
| Health monitor | Gateway health & presence |

**Key Gateway Files:**
- `server-channels.js` — Channel connection lifecycle
- `server-chat.js` — Message → Agent routing
- `server-cron.js` — Scheduled jobs & reminders
- `server-bridge-*.js` — WebSocket control plane methods
- `server-http.js` — HTTP endpoints
- `server-providers.js` — Model provider management

### 3. Agent System (`dist/agents/`)

This is where the AI "brain" lives.

#### System Prompt Generation (`system-prompt.js`)

The `buildAgentSystemPrompt()` function dynamically constructs the system prompt based on runtime context:

```typescript
export function buildAgentSystemPrompt(params) {
  // `contextFiles` and `runtimeInfo` are assumed to arrive on `params`.
  const { contextFiles = [], runtimeInfo } = params;

  // Sections built dynamically:
  const lines = [
    "You are a personal assistant running inside Clawdbot.",
    "",
    "## Tooling",
    // ... tool availability list
    "",
    "## Tool Call Style",
    // ... narration guidelines
    "",
    "## Clawdbot CLI Quick Reference",
    // ... CLI commands
    "",
    ...buildSkillsSection(params),      // Available skills
    ...buildMemorySection(params),       // Memory recall instructions
    ...buildDocsSection(params),         // Documentation paths
    ...buildMessagingSection(params),    // Messaging guidelines
    ...buildReplyTagsSection(params),    // [[reply_to_current]] etc.
    // ...
    "## Runtime",
    buildRuntimeLine(runtimeInfo),       // Model, channel, capabilities
  ];

  // Inject project context files
  for (const file of contextFiles) {
    lines.push(`## ${file.path}`, "", file.content, "");
  }

  // Note: filter(Boolean) also drops the "" spacer entries above.
  return lines.filter(Boolean).join("\n");
}
```

**System Prompt Sections:**

| Section | Purpose |
|---------|---------|
| Tooling | Lists available tools with descriptions |
| Tool Call Style | When to narrate vs. just call tools |
| CLI Quick Reference | Gateway management commands |
| Skills | Available SKILL.md files to read |
| Memory Recall | How to use memory_search/memory_get |
| Self-Update | Config/update restrictions |
| Model Aliases | opus, sonnet shortcuts |
| Workspace | Working directory info |
| Documentation | Docs paths |
| Reply Tags | Native reply/quote syntax |
| Messaging | Channel routing rules |
| Silent Replies | NO_REPLY handling |
| Heartbeats | HEARTBEAT_OK protocol |
| Runtime | Model, channel, capabilities |
| Project Context | AGENTS.md, SOUL.md, USER.md, etc. |

The final prompt is roughly **2,000 to 3,000 tokens**, depending on configuration.
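
Several sections in the table above define small reply conventions: reply tags such as `[[reply_to_current]]`, the `NO_REPLY` sentinel, and the `HEARTBEAT_OK` acknowledgement. The sketch below shows how a gateway might classify a raw agent reply under those conventions. The `ReplyAction` type, the `classifyReply` function, and the exact matching rules (whole-message match after trimming, tag stripped from the text) are assumptions for illustration, not the documented behavior.

```typescript
// Hypothetical sketch of reply classification; matching rules are assumed.
type ReplyAction =
  | { kind: "silent" }          // agent chose not to reply
  | { kind: "heartbeat_ack" }   // agent acknowledged a heartbeat
  | { kind: "send"; text: string; replyToCurrent: boolean };

const REPLY_TO_CURRENT_TAG = "[[reply_to_current]]";

function classifyReply(raw: string): ReplyAction {
  const trimmed = raw.trim();

  // Sentinels are only honored when they are the entire message.
  if (trimmed === "NO_REPLY") return { kind: "silent" };
  if (trimmed === "HEARTBEAT_OK") return { kind: "heartbeat_ack" };

  // Strip the reply tag from the visible text and remember that it was set.
  const replyToCurrent = trimmed.includes(REPLY_TO_CURRENT_TAG);
  const text = trimmed.split(REPLY_TO_CURRENT_TAG).join("").trim();
  return { kind: "send", text, replyToCurrent };
}
```

With this shape, the delivery code only has to switch on `kind`; a `silent` or `heartbeat_ack` result never reaches a channel plugin.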

#### Tools System (`dist/agents/tools/`)

Each tool is a separate module with schema definition and handler:

| Tool | File | Purpose |
|------|------|---------|
| `exec` | `bash-tools.exec.js` | Shell command execution (54KB!) |
| `process` | `bash-tools.process.js` | Background process management |
| `browser` | `browser-tool.js` | Playwright browser control |
| `canvas` | `canvas-tool.js` | Present/eval/snapshot Canvas |
| `cron` | `cron-tool.js` | Scheduled jobs & reminders |
| `gateway` | `gateway-tool.js` | Self-management (restart, update) |
| `message` | `message-tool.js` | Cross-channel messaging |
| `nodes` | `nodes-tool.js` | Mobile node camera/screen/location |
| `sessions_list` | `sessions-list-tool.js` | List sessions |
| `sessions_history` | `sessions-history-tool.js` | Fetch session history |
| `sessions_send` | `sessions-send-tool.js` | Send to another session |
| `sessions_spawn` | `sessions-spawn-tool.js` | Spawn sub-agent |
| `session_status` | `session-status-tool.js` | Usage/cost/model info |
| `agents_list` | `agents-list-tool.js` | List spawnable agents |
| `web_search` | `web-search.js` | Brave API search |
| `web_fetch` | `web-fetch.js` | URL content extraction |
| `image` | `image-tool.js` | Vision model analysis |
| `memory_search` | `memory-tool.js` | Semantic memory search |
| `memory_get` | `memory-tool.js` | Read memory snippets |
| `tts` | `tts-tool.js` | Text-to-speech |

**Tool Policy Enforcement (`pi-tools.policy.js`):**

Tools are filtered through multiple policy layers:
1. Global policy (`tools.policy` in config)
2. Provider-specific policy
3. Agent-specific policy
4. Group chat policy
5. Sandbox policy
6. Sub-agent policy

```typescript
const isAllowed = isToolAllowedByPolicies("exec", [
  profilePolicy,
  providerProfilePolicy,
  globalPolicy,
  globalProviderPolicy,
  agentPolicy,
  agentProviderPolicy,
  groupPolicy,
  sandbox?.tools,
  subagentPolicy,
]);
```
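
To make the layering concrete, here is a minimal sketch of how such a check could work. It is a hypothetical reimplementation, not the code in `pi-tools.policy.js`. The `ToolPolicy` shape (optional `allow` and `deny` lists) and the rule that an unset layer imposes no restriction are assumptions.

```typescript
// Hypothetical policy shape; the real policy objects may differ.
interface ToolPolicy {
  allow?: string[]; // when present, only these tools pass this layer
  deny?: string[];  // always wins over allow within the same layer
}

/** Checks one layer. An undefined layer places no restriction. */
function allowedByLayer(tool: string, policy: ToolPolicy | undefined): boolean {
  if (!policy) return true;
  if (policy.deny?.includes(tool)) return false;
  if (policy.allow && !policy.allow.includes(tool)) return false;
  return true;
}

/** A tool is usable only if every layer allows it. */
function allowedByAllLayers(
  tool: string,
  policies: Array<ToolPolicy | undefined>,
): boolean {
  return policies.every((policy) => allowedByLayer(tool, policy));
}
```

The design choice illustrated here is that layers can only narrow access: a later layer cannot re-enable a tool that an earlier layer denied.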

#### Pi Integration (`dist/agents/pi-*.js`)

Clawdbot embeds [Pi coding agent](https://github.com/badlogic/pi-mono) as its core AI runtime:

```jsonc
// Dependencies from package.json:
{
  "@mariozechner/pi-agent-core": "0.49.3",
  "@mariozechner/pi-ai": "0.49.3",
  "@mariozechner/pi-coding-agent": "0.49.3",
  "@mariozechner/pi-tui": "0.49.3"
}
```

**Key Pi Integration Files:**

| File | Purpose |
|------|---------|
| `pi-embedded-runner.js` | Spawns Pi agent sessions |
| `pi-embedded-subscribe.js` | Handles streaming responses |
| `pi-embedded-subscribe.handlers.*.js` | Message/tool event handlers |
| `pi-embedded-utils.js` | Utilities for Pi integration |
| `pi-tools.js` | Tool definition adapter |
| `pi-tools.policy.js` | Tool allowlist/denylist |
| `pi-tools.read.js` | Read tool customization |
| `pi-tools.schema.js` | Schema normalization |
| `pi-settings.js` | Pi agent settings |

### 4. Channel Plugins (`dist/channels/plugins/`)

Each messaging platform is a plugin:

```
channels/plugins/
├── discord/          # Discord.js integration
├── telegram/         # grammY framework
├── whatsapp/         # Baileys (WhatsApp Web protocol)
├── signal/           # Signal CLI bridge
├── imessage/         # macOS imsg CLI
├── bluebubbles/      # BlueBubbles API
├── slack/            # Slack Bolt
├── line/             # LINE Bot SDK
├── mattermost/       # WebSocket events
├── googlechat/       # Google Chat API
└── ...
```

**Channel Plugin Interface:**
Each plugin implements:
- `connect()` — Establish connection
- `disconnect()` — Clean shutdown
- `send()` — Deliver messages
- `onMessage()` — Handle incoming messages
- Channel-specific actions (reactions, polls, threads, etc.)

**Channel Registry (`dist/channels/registry.js`):**
```typescript
// Plugins register themselves
registerChannelPlugin({
  id: "discord",
  displayName: "Discord",
  connect: async (config) => { ... },
  send: async (message) => { ... },
  // ...
});
```
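
The plugin interface and registry described above could be typed roughly as follows. This is a sketch under stated assumptions: the message shapes (`InboundMessage`, `OutboundMessage`), the `getChannelPlugin` lookup, the duplicate-ID check, and the `createLoopbackPlugin` test helper are all hypothetical and not taken from `dist/channels/registry.js`.

```typescript
// Hypothetical message shapes for illustration only.
interface OutboundMessage { chatId: string; text: string; }
interface InboundMessage { chatId: string; senderId: string; text: string; }

interface ChannelPlugin {
  id: string;
  displayName: string;
  connect(config: Record<string, unknown>): Promise<void>;
  disconnect(): Promise<void>;
  send(message: OutboundMessage): Promise<void>;
  onMessage(handler: (message: InboundMessage) => void): void;
}

const channelPlugins = new Map<string, ChannelPlugin>();

/** Registers a plugin; rejecting duplicate IDs is an assumed safeguard. */
function registerChannelPlugin(plugin: ChannelPlugin): void {
  if (channelPlugins.has(plugin.id)) {
    throw new Error(`Channel plugin already registered: ${plugin.id}`);
  }
  channelPlugins.set(plugin.id, plugin);
}

function getChannelPlugin(id: string): ChannelPlugin | undefined {
  return channelPlugins.get(id);
}

/** In-memory plugin that records sends; useful for exercising the registry. */
function createLoopbackPlugin(id: string, displayName: string) {
  const sent: OutboundMessage[] = [];
  let handler: ((message: InboundMessage) => void) | undefined;
  const plugin: ChannelPlugin = {
    id,
    displayName,
    connect: async () => {},
    disconnect: async () => { handler = undefined; },
    send: async (message) => { sent.push(message); },
    onMessage: (h) => { handler = h; },
  };
  // `receive` simulates the platform delivering a message to the plugin.
  const receive = (message: InboundMessage) => handler?.(message);
  return { plugin, sent, receive };
}
```

A loopback plugin like this lets routing logic be tested without a live Discord or Telegram connection.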

### 5. Skills System (`skills/`)

Skills are self-contained instruction packages that teach the agent how to use external tools:

```
skills/
├── github/SKILL.md           # gh CLI usage
├── gog/SKILL.md              # Google Workspace CLI
├── spotify-player/SKILL.md   # Spotify control
├── weather/SKILL.md          # wttr.in integration
├── bear-notes/SKILL.md       # Bear notes via grizzly
├── apple-notes/SKILL.md      # memo CLI
├── apple-reminders/SKILL.md  # remindctl CLI
├── obsidian/SKILL.md         # Obsidian vault management
├── notion/SKILL.md           # Notion API
├── himalaya/SKILL.md         # Email via IMAP/SMTP
├── openhue/SKILL.md          # Philips Hue control
├── camsnap/SKILL.md          # RTSP camera capture
└── ...
```

**Skill Loading Flow:**
1. System prompt includes skill descriptions in `<available_skills>`
2. Agent scans descriptions to find matching skill
3. Agent calls `read` tool to load SKILL.md
4. Agent follows instructions in SKILL.md
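
Step 1 of the flow above can be sketched as a small section builder. The `<available_skills>` wrapper comes from the description above; the `SkillSummary` shape, the function name, and the one-line-per-skill entry format are assumptions for illustration.

```typescript
// Hypothetical summary of one skill; field names are assumed.
interface SkillSummary {
  name: string;        // e.g. "github"
  description: string; // short text the agent scans to pick a skill
  location: string;    // path the agent passes to the `read` tool
}

/** Returns prompt lines listing skills, or nothing when none are installed. */
function buildAvailableSkillsBlock(skills: SkillSummary[]): string[] {
  if (skills.length === 0) return [];
  const lines = ["<available_skills>"];
  for (const skill of skills) {
    lines.push(`- ${skill.name}: ${skill.description} (${skill.location})`);
  }
  lines.push("</available_skills>");
  return lines;
}
```

Only the descriptions go into the prompt. The full `SKILL.md` is loaded on demand in step 3, which keeps the prompt small even with dozens of skills installed.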

**Skill File Structure:**
````markdown
# SKILL.md - [Tool Name]

## When to use
Description of when this skill applies.

## Commands
```bash
tool-name command --flags
```

## Examples
...
````

### 6. Memory System (`dist/memory/`)

Semantic search over workspace memory files:

**Memory Files:**
- `MEMORY.md` — Root memory file
- `memory/*.md` — Dated logs, research intel, project notes

**Memory Tools:**
- `memory_search` — Semantic vector search using `sqlite-vec`
- `memory_get` — Read specific lines from memory files

```typescript
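// memory-search.js (simplified; assumes `db` and `embedText` are in scope)
// sqlite-vec has named exports; load(db) registers the extension on a connection.
import * as sqliteVec from "sqlite-vec";

interface SearchOptions {
  maxResults: number;
}

async function searchMemory(query: string, options: SearchOptions) {
  // Embed query
  const embedding = await embedText(query);

  // Vector similarity search (smaller cosine distance = closer match)
  const results = await db.query(`
    SELECT path, line_start, line_end, content,
           vec_distance_cosine(embedding, ?) as distance
    FROM memory_chunks
    ORDER BY distance
    LIMIT ?
  `, [embedding, options.maxResults]);

  return results;
}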
```
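
For readers unfamiliar with cosine distance, the ranking that the SQL query performs can be reproduced in plain TypeScript. This is an in-memory sketch of the same idea, not how Clawdbot stores or searches vectors. The `MemoryChunk` shape and the `rankChunks` helper are hypothetical.

```typescript
// Hypothetical in-memory representation of an indexed chunk.
interface MemoryChunk {
  path: string;
  lineStart: number;
  lineEnd: number;
  content: string;
  embedding: number[];
}

/** Cosine distance: 0 = same direction, 1 = orthogonal, 2 = opposite. */
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Convention chosen for this sketch: a zero vector is treated as unrelated.
  if (normA === 0 || normB === 0) return 1;
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

/** Returns the closest chunks first, like ORDER BY distance LIMIT n. */
function rankChunks(query: number[], chunks: MemoryChunk[], maxResults: number) {
  return chunks
    .map((chunk) => ({ chunk, distance: cosineDistance(query, chunk.embedding) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, maxResults);
}
```

A brute-force scan like this is linear in the number of chunks, which is one reason to push the distance computation into the database instead.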

### 7. Session Management (`dist/gateway/session-utils.js`)

Sessions isolate conversations and track state:

**Session Key Format:**
```
{channel}:{accountId}:{chatId}
discord:main:938238002528911400
telegram:main:123456789
whatsapp:main:1234567890@s.whatsapp.net
```

**Session State:**
- Conversation history
- Model override (if any)
- Reasoning level
- Active tool calls
- Sub-agent references

**Session Files:**
```
~/.clawdbot/sessions/
├── discord-main-938238002528911400.json
├── telegram-main-123456789.json
└── ...
```
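
The key format and file names above suggest a simple mapping between the two. The following sketch builds and parses keys and derives a file name by swapping the `:` separators for `-`, as in the listed examples. The function names are hypothetical, and how the real code handles chat IDs with special characters (such as the `@` in WhatsApp IDs) is not shown in the source, so this sketch does no sanitizing.

```typescript
interface SessionKeyParts {
  channel: string;
  accountId: string;
  chatId: string;
}

function buildSessionKey({ channel, accountId, chatId }: SessionKeyParts): string {
  return `${channel}:${accountId}:${chatId}`;
}

/** Splits on the first two colons so a chat ID may itself contain ':'. */
function parseSessionKey(key: string): SessionKeyParts {
  const [channel, accountId, ...rest] = key.split(":");
  const chatId = rest.join(":");
  if (!channel || !accountId || !chatId) {
    throw new Error(`Invalid session key: ${key}`);
  }
  return { channel, accountId, chatId };
}

/** Assumed mapping from key to file name, based on the examples above. */
function sessionFileName(key: string): string {
  const { channel, accountId, chatId } = parseSessionKey(key);
  return `${channel}-${accountId}-${chatId}.json`;
}
```

For example, `sessionFileName("discord:main:938238002528911400")` yields `discord-main-938238002528911400.json`, matching the first entry in the listing.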

---

## Message Flow

```
1. User sends message on Discord/Telegram/WhatsApp/etc.
           │
           ▼
2. Channel plugin receives message
   - Parses sender, chat ID, content
   - Handles media attachments
   - Checks mention gating (groups)
           │
           ▼
3. Gateway routes to session
   - Resolves session key
   - Loads/creates session
   - Checks activation rules
           │
           ▼
4. Session loads context:
   - System prompt (generated dynamically)
   - Project context files (AGENTS.md, SOUL.md, USER.md)
   - Conversation history
   - Tool availability
           │
           ▼
5. Pi agent processes with configured model
   - Anthropic (Claude)
   - OpenAI (GPT-4, o1, etc.)
   - Google (Gemini)
   - AWS Bedrock
   - Local (Ollama, llama.cpp)
           │
           ▼
6. Agent may call tools
   - Tool policy checked
   - Tool executed
   - Results fed back to agent
   - Loop until done
           │
           ▼
7. Response streamed/chunked back
   - Long responses chunked for Telegram
   - Markdown formatted per channel
   - Media attachments handled
           │
           ▼
8. Channel plugin delivers message
   - Native formatting applied
   - Reply threading if requested
   - Reactions/buttons if configured
```
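
Step 7 mentions chunking long responses for Telegram, whose Bot API caps a text message at 4096 characters. Below is a minimal sketch of one way to split text under a limit, preferring newline boundaries. It is not Clawdbot's chunker; the function name and the break strategy are assumptions, and a production version would also need to avoid splitting inside Markdown constructs or surrogate pairs.

```typescript
const TELEGRAM_TEXT_LIMIT = 4096;

/**
 * Splits text into pieces no longer than `limit` (measured with
 * String.length). Breaks at the last newline inside each window when
 * one exists, otherwise cuts at the limit.
 */
function chunkText(text: string, limit: number = TELEGRAM_TEXT_LIMIT): string[] {
  if (limit <= 0) throw new Error("limit must be positive");
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    const head = rest.slice(0, limit);
    const breakAt = head.lastIndexOf("\n");
    const cut = breakAt > 0 ? breakAt : limit;
    chunks.push(rest.slice(0, cut));
    // Drop the newline we broke on so the next chunk does not start with it.
    rest = rest.slice(cut).replace(/^\n/, "");
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}
```

Each returned chunk can then be handed to the channel plugin's `send()` in order.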

---

## Configuration

Config lives at `~/.clawdbot/clawdbot.yaml`:

```yaml
# Model Providers
providers:
  anthropic:
    key: "sk-ant-..."
  openai:
    key: "sk-..."
  google:
    key: "..."

# Default Model
defaultModel: "anthropic/claude-sonnet-4-5"

# Channel Configs
discord:
  token: "..."
  defaultModel: "anthropic/claude-opus-4-5"

telegram:
  token: "..."

whatsapp:
  enabled: true

# Agent Config
agent:
  workspaceDir: "~/.clawdbot/workspace"

agents:
  main:
    workspaceDir: "~/.clawdbot/workspace"

# Tool Policies
tools:
  exec:
    security: "full"  # full | allowlist | deny
    host: "sandbox"   # sandbox | gateway | node
  browser:
    profile: "clawd"

# Cron Jobs
cron:
  jobs:
    - id: "daily-standup"
      schedule: "0 9 * * *"
      text: "Good morning! What's on the agenda?"

# Gateway Settings
gateway:
  port: 18789
  token: "..."
```
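
The example config sets a global `defaultModel` and a per-channel `defaultModel` for Discord, and sessions can carry a model override. One plausible way to resolve the effective model is sketched below. The precedence order (session override, then channel default, then global default) is an assumption, as are the type and function names. Channel configs are gathered under a `channels` map here for simplicity, although the YAML above places them at the top level.

```typescript
// Hypothetical, simplified view of the model-related config.
interface ModelConfigSketch {
  defaultModel: string;
  channels: Record<string, { defaultModel?: string } | undefined>;
}

/** Resolves the model for a message, most specific setting first. */
function resolveModel(
  config: ModelConfigSketch,
  channel: string,
  sessionOverride?: string,
): string {
  return (
    sessionOverride ??
    config.channels[channel]?.defaultModel ??
    config.defaultModel
  );
}
```

With the values from the example config, a Discord message would resolve to `anthropic/claude-opus-4-5` and a Telegram message to `anthropic/claude-sonnet-4-5` under this assumed precedence.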

**Config Schema:**
Full schema at `dist/protocol.schema.json` (~80KB of JSON Schema)

---

## Key Dependencies

```jsonc
{
  // AI Agent Core
  "@mariozechner/pi-agent-core": "0.49.3",
  "@mariozechner/pi-ai": "0.49.3",
  "@mariozechner/pi-coding-agent": "0.49.3",

  // Messaging Channels
  "discord-api-types": "^0.38.37",
  "grammy": "^1.39.3",
  "@whiskeysockets/baileys": "7.0.0-rc.9",
  "@slack/bolt": "^4.6.0",
  "@line/bot-sdk": "^10.6.0",

  // Browser Automation
  "playwright-core": "1.58.0",
  "chromium-bidi": "13.0.1",

  // Vector Search
  "sqlite-vec": "0.1.7-alpha.2",

  // Image Processing
  "sharp": "^0.34.5",
  "@napi-rs/canvas": "^0.1.88",

  // TTS
  "node-edge-tts": "^1.2.9",

  // Scheduling
  "croner": "^9.1.0",

  // HTTP/WebSocket
  "hono": "4.11.4",
  "ws": "^8.19.0",
  "undici": "^7.19.0",

  // Schema Validation
  "zod": "^4.3.6",
  "@sinclair/typebox": "0.34.47",
  "ajv": "^8.17.1"
}
```

---

## Key Architectural Decisions

### 1. Single Gateway Process
One process owns all channel connections to avoid session conflicts. This matters most for WhatsApp Web, which allows only one active session.

### 2. Pi as Core Runtime
Clawdbot reuses capabilities that the Pi coding agent already provides:
- Tool streaming
- Context management
- Multi-provider support
- Structured output handling

### 3. Dynamic System Prompt
Built at runtime based on:
- Available tools (policy-filtered)
- Available skills
- Current channel
- User configuration
- Project context files

### 4. Plugin Architecture
Everything is pluggable:
- Channels (Discord, Telegram, etc.)
- Tools (exec, browser, etc.)
- Hooks (voice transcription, media processing)
- Skills (external instruction packages)

### 5. Session Isolation
Each conversation gets isolated:
- History
- Model settings
- Tool state
- Sub-agent references

### 6. Sub-agent Spawning
Complex tasks spawn isolated sub-agents that:
- Run in separate sessions
- Have restricted tool access
- Report back when complete
- Can be monitored/killed
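
The four properties above can be illustrated with a small registry that tracks sub-agents per parent session. This is a hypothetical sketch, not the logic behind `sessions_spawn`. The `SubagentRegistry` class, its method names, and the way tool access is restricted (parent tools minus a deny list) are all assumptions.

```typescript
type SubagentStatus = "running" | "completed" | "killed";

interface SubagentRecord {
  id: string;
  parentSessionKey: string;
  task: string;
  allowedTools: string[];
  status: SubagentStatus;
  result?: string;
}

class SubagentRegistry {
  private readonly records = new Map<string, SubagentRecord>();
  private nextId = 1;

  /** Creates a sub-agent whose tools are the parent's minus `deniedTools`. */
  spawn(
    parentSessionKey: string,
    task: string,
    parentTools: string[],
    deniedTools: string[],
  ): SubagentRecord {
    const record: SubagentRecord = {
      id: `subagent-${this.nextId++}`,
      parentSessionKey,
      task,
      allowedTools: parentTools.filter((tool) => !deniedTools.includes(tool)),
      status: "running",
    };
    this.records.set(record.id, record);
    return record;
  }

  /** Records the result that is reported back to the parent. */
  complete(id: string, result: string): void {
    const record = this.get(id);
    if (record.status !== "running") return; // ignore late reports
    record.status = "completed";
    record.result = result;
  }

  kill(id: string): void {
    const record = this.get(id);
    if (record.status === "running") record.status = "killed";
  }

  /** Lets a parent session monitor its own sub-agents. */
  listFor(parentSessionKey: string): SubagentRecord[] {
    return Array.from(this.records.values()).filter(
      (record) => record.parentSessionKey === parentSessionKey,
    );
  }

  private get(id: string): SubagentRecord {
    const record = this.records.get(id);
    if (!record) throw new Error(`Unknown sub-agent: ${id}`);
    return record;
  }
}
```

In this sketch, denying `sessions_spawn` to sub-agents would be one way to prevent unbounded recursive spawning.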

### 7. Multi-Layer Tool Policy
Defense in depth, applied in layers:
- Global policy
- Provider policy
- Agent policy
- Group policy
- Sandbox policy
- Sub-agent policy

---

## File Counts

| Category | Count |
|----------|-------|
| Compiled JS files | ~800 |
| Documentation files | ~50 |
| Skill packages | ~50 |
| Channel plugins | ~12 |
| Tool implementations | ~25 |

**Total package size:** ~1.5MB (minified JS + assets)

---

## Development

```bash
# Clone and install
git clone https://github.com/clawdbot/clawdbot
cd clawdbot
pnpm install

# Build
pnpm build

# Run dev gateway
pnpm gateway:dev

# Run tests
pnpm test

# Lint
pnpm lint
```

---

## Resources

- **GitHub:** https://github.com/clawdbot/clawdbot
- **Docs:** https://docs.clawd.bot
- **Discord:** https://discord.com/invite/clawd
- **Skills Marketplace:** https://clawdhub.com

---

*Generated by Buba, 2026-02-06*