# Architecture

Technical details of how the Clawdbot Memory System works.

---

## System Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                         CLAWDBOT AGENT                          │
│                                                                 │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────────┐     │
│  │ Chat Session │   │ Tool: write  │   │ Tool: memory_    │     │
│  │              │   │ (file ops)   │   │ search           │     │
│  └──────┬───────┘   └──────┬───────┘   └────────┬─────────┘     │
│         │                  │                    │               │
└─────────┼──────────────────┼────────────────────┼───────────────┘
          │                  │                    │
          │           ┌──────▼───────┐     ┌──────▼───────┐
          │           │ Markdown     │     │ SQLite +     │
          │           │ Files        │◄────│ sqlite-vec   │
          │           │ (source of   │index│ Vector Store │
          │           │ truth)       │────►│              │
          │           └──────────────┘     └──────────────┘
          │                  ▲
          │                  │
          └──────────────────┘
          Agent writes memories during session
```

---

## Write Flow

When the agent decides to store a memory:

```
Agent decides to remember something
         │
         ▼
┌─────────────────┐
│ Write to file   │
│ memory/YYYY-MM- │
│ DD.md           │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ File watcher    │  ← Clawdbot watches memory/ for changes
│ detects change  │    (debounced: waits for writes to settle)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Chunking        │  ← File split into meaningful chunks
│ (by section/    │    (headers, paragraphs, list items)
│ paragraph)      │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Embedding       │  ← Each chunk → embedding vector
│ Provider        │    (OpenAI / Gemini / Local GGUF)
│                 │
│ text-embedding- │
│ 3-small (1536d) │
│ or              │
│ gemini-embed-   │
│ ding-001        │
│ or              │
│ local GGUF model│
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ SQLite +        │  ← Vectors stored in sqlite-vec
│ sqlite-vec      │    alongside original text chunks
│                 │    and metadata (file, date, section)
│ memory.db       │
└─────────────────┘
```

---

## Search Flow

When the agent needs to recall something:

```
Agent: "What did we decide about the API rate limits?"
         │
         ▼
┌─────────────────┐
│ memory_search   │  ← Tool invoked automatically
│ tool called     │    (or agent calls it explicitly)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Query embedding │  ← Same provider as index
│ generated       │    "API rate limits decision"
└────────┬────────┘    → [0.23, -0.11, 0.87, ...]
         │
         ├─────────────────────────┐
         │                         │
         ▼                         ▼
┌─────────────────┐       ┌─────────────────┐
│ Vector search   │       │ Keyword search  │
│ (cosine sim)    │       │ (BM25 / FTS)    │
│                 │       │                 │
│ Finds semanti-  │       │ Finds exact     │
│ cally similar   │       │ keyword matches │
│ chunks          │       │                 │
└────────┬────────┘       └────────┬────────┘
         │                         │
         └────────────┬────────────┘
                      │
                      ▼
             ┌─────────────────┐
             │ Hybrid merge    │  ← Combines both result sets
             │ & ranking       │    Deduplicates, re-ranks
             └────────┬────────┘
                      │
                      ▼
             ┌─────────────────┐
             │ Top N chunks    │  ← Relevant memory fragments
             │ returned        │    injected into agent context
             └────────┬────────┘
                      │
                      ▼
      Agent has full context to answer the question 🎉
```

---

## Pre-Compaction Flush Flow

The safety net that prevents amnesia:

```
               Context Window
┌──────────────────────────────────────────┐
│ System prompt                            │
│ AGENTS.md                                │
│ Memory search results                    │
│ ─────────────────────────────────        │
│ Old messages  ← these get compacted      │
│ ...                                      │
│ ...                                      │
│ Recent messages                          │
│ ─────────────────────────────────        │
│ Reserve tokens (floor: 20,000)           │
└──────────────────────────────────────────┘
                     │
        Token count approaches limit
        (contextWindow - reserveTokensFloor
         - softThresholdTokens)
                     │
                     ▼
         ┌───────────────────────┐
         │ Clawdbot triggers     │
         │ memory flush          │
         │                       │
         │ Silent system prompt: │
         │ "Session nearing      │
         │ compaction. Store     │
         │ durable memories."    │
         │                       │
         │ Silent user prompt:   │
         │ "Write lasting notes  │
         │ to memory/; reply     │
         │ NO_REPLY if nothing   │
         │ to store."            │
         └───────────┬───────────┘
                     │
                     ▼
         ┌───────────────────────┐
         │ Agent writes to disk  │
         │                       │
         │ • Current work status │
         │ • Pending decisions   │
         │ • Important context   │
         │ • Where we left off   │
         └───────────┬───────────┘
                     │
                     ▼
         ┌───────────────────────┐
         │ File watcher triggers │
         │ re-index              │
         └───────────┬───────────┘
                     │
                     ▼
         ┌───────────────────────┐
         │ Compaction happens    │
         │ (old messages removed/│
         │ summarized)           │
         └───────────┬───────────┘
                     │
                     ▼
          Memories safe on disk ✅
          Indexed and searchable ✅
          Agent can recall later ✅
```

---

## Storage Layout

```
~/.clawdbot/
├── clawdbot.json                    ← Config with memorySearch settings
│
├── workspace/                       ← Agent workspace (configurable)
│   ├── AGENTS.md                    ← Agent instructions (with memory habits)
│   ├── MEMORY.md                    ← Curated long-term memory (optional)
│   │
│   ├── memory/                      ← Daily logs & research intel
│   │   ├── 2026-01-15.md            ← Daily log
│   │   ├── 2026-01-16.md
│   │   ├── 2026-02-10.md            ← Today
│   │   ├── project-x-research-intel.md
│   │   ├── TEMPLATE-daily.md        ← Reference template
│   │   ├── TEMPLATE-research-intel.md
│   │   └── TEMPLATE-project-tracking.md
│   │
│   └── ... (other workspace files)
│
└── agents/
    └── main/
        └── agent/
            └── memory/              ← Vector index (managed by Clawdbot)
                └── memory.db        ← SQLite + sqlite-vec database
```

---

## Config Structure

The memory system config lives in `clawdbot.json`, with search settings under `agents.defaults.memorySearch` and pre-compaction flush settings under `agents.defaults.compaction`:

```jsonc
{
  "agents": {
    "defaults": {
      // ... other config (model, workspace, etc.) ...

      "memorySearch": {
        // Embedding provider: "openai" | "gemini" | "local"
        "provider": "openai",

        // Model name (provider-specific)
        "model": "text-embedding-3-small",

        // Remote provider settings (OpenAI / Gemini)
        "remote": {
          "apiKey": "sk-...",   // Optional if using env var
          "baseUrl": "...",     // Optional custom endpoint
          "headers": {}         // Optional extra headers
        },

        // Additional paths to index (beyond memory/ and MEMORY.md)
        "extraPaths": ["../team-docs"],

        // Fallback provider if primary fails
        "fallback": "local"     // "openai" | "gemini" | "local" | "none"
      },

      // Pre-compaction memory flush (enabled by default)
      "compaction": {
        "reserveTokensFloor": 20000,
        "memoryFlush": {
          "enabled": true,
          "softThresholdTokens": 4000
        }
      }
    }
  }
}
```

---

## Data Flow Summary

```
WRITE PATH                          READ PATH
──────────                          ─────────
Agent writes note                   Agent needs context
       │                                   │
       ▼                                   ▼
memory/YYYY-MM-DD.md                memory_search("query")
       │                                   │
       ▼                                   ▼
File watcher                        Embed query
       │                                   │
       ▼                                   ▼
Chunk + embed                       Vector + keyword search
       │                                   │
       ▼                                   ▼
Store in SQLite                     Return top chunks
       │                                   │
       ▼                                   ▼
Index updated ✅                     Context restored ✅
```