One-command persistent memory for Clawdbot. Prevents context amnesia during compaction with:

- Two-layer memory: Markdown source of truth + SQLite vector search
- Pre-compaction flush to save context before it's lost
- Semantic search across all memory files
- Daily logs, research intel, and project tracking templates
- Interactive installer with dry-run and uninstall support
# Architecture

Technical details of how the Clawdbot Memory System works.

---

## System Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                         CLAWDBOT AGENT                          │
│                                                                 │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────────┐     │
│  │ Chat Session │   │ Tool: write  │   │ Tool: memory_    │     │
│  │              │   │ (file ops)   │   │ search           │     │
│  └───────┬──────┘   └──────┬───────┘   └────────┬─────────┘     │
│          │                 │                    │               │
└──────────┼─────────────────┼────────────────────┼───────────────┘
           │                 │                    │
           │          ┌──────▼───────┐       ┌────▼──────────┐
           │          │ Markdown     │       │ SQLite +      │
           │          │ Files        │◄──────│ sqlite-vec    │
           │          │ (source of   │ index │ Vector Store  │
           │          │ truth)       │──────►│               │
           │          └──────────────┘       └───────────────┘
           │                 ▲
           │                 │
           └─────────────────┘
           Agent writes memories
           during session
```

---

## Write Flow

When the agent decides to store a memory:

```
Agent decides to remember something
                 │
                 ▼
        ┌─────────────────┐
        │ Write to file   │
        │ memory/YYYY-MM- │
        │ DD.md           │
        └────────┬────────┘
                 │
                 ▼
        ┌─────────────────┐
        │ File watcher    │  ← Clawdbot watches memory/ for changes
        │ detects change  │    (debounced; waits for writes to settle)
        └────────┬────────┘
                 │
                 ▼
        ┌─────────────────┐
        │ Chunking        │  ← File split into meaningful chunks
        │ (by section/    │    (headers, paragraphs, list items)
        │ paragraph)      │
        └────────┬────────┘
                 │
                 ▼
        ┌─────────────────┐
        │ Embedding       │  ← Each chunk → embedding vector
        │ Provider        │    (OpenAI / Gemini / Local GGUF)
        │                 │
        │ text-embedding- │
        │ 3-small (1536d) │
        │ or              │
        │ gemini-embed-   │
        │ ding-001        │
        │ or              │
        │ local GGUF model│
        └────────┬────────┘
                 │
                 ▼
        ┌─────────────────┐
        │ SQLite +        │  ← Vectors stored in sqlite-vec
        │ sqlite-vec      │    alongside original text chunks
        │                 │    and metadata (file, date, section)
        │ memory.db       │
        └─────────────────┘
```
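
The chunking step above can be sketched in a few lines. This is a minimal illustration, not Clawdbot's actual chunker: the function name `chunk_markdown` and the exact splitting rules (blank lines separate blocks, each list item becomes its own chunk, headings only label the chunks that follow) are assumptions made for the example.

```python
import re


def chunk_markdown(text: str) -> list[dict]:
    """Split Markdown into chunks, one per paragraph or list item.

    Each chunk records the most recent heading as its "section".
    Hypothetical helper; the real chunking rules may differ.
    """
    chunks = []
    section = ""
    # Blank lines separate blocks (headings, paragraphs, lists).
    for block in re.split(r"\n\s*\n", text.strip()):
        buffer = []
        for line in block.splitlines():
            heading = re.match(r"^#{1,6}\s+(.*)$", line)
            is_item = re.match(r"^\s*([-*+]|\d+\.)\s+", line)
            # A heading or a new list item closes the chunk in progress.
            if (heading or is_item) and buffer:
                chunks.append({"section": section, "text": "\n".join(buffer)})
                buffer = []
            if heading:
                section = heading.group(1).strip()
                continue
            buffer.append(line.strip())
        if buffer:
            chunks.append({"section": section, "text": "\n".join(buffer)})
    return chunks


sample = (
    "# Daily Log\n\n"
    "Decided to cap API calls.\nSee ticket.\n\n"
    "## Tasks\n- ship installer\n- write docs\n"
)
for chunk in chunk_markdown(sample):
    print(chunk["section"], "|", chunk["text"].replace("\n", " "))
```

Keeping the section name with every chunk is what lets the index store the `section` metadata shown in the last box of the diagram.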

---

## Search Flow

When the agent needs to recall something:

```
Agent: "What did we decide about the API rate limits?"
                 │
                 ▼
        ┌─────────────────┐
        │ memory_search   │  ← Tool invoked automatically
        │ tool called     │    (or agent calls it explicitly)
        └────────┬────────┘
                 │
                 ▼
        ┌─────────────────┐
        │ Query embedding │  ← Same provider as index
        │ generated       │    "API rate limits decision"
        └────────┬────────┘    → [0.23, -0.11, 0.87, ...]
                 │
                 ├─────────────────────────┐
                 │                         │
                 ▼                         ▼
        ┌─────────────────┐       ┌─────────────────┐
        │ Vector search   │       │ Keyword search  │
        │ (cosine sim)    │       │ (BM25 / FTS)    │
        │                 │       │                 │
        │ Finds semanti-  │       │ Finds exact     │
        │ cally similar   │       │ keyword matches │
        │ chunks          │       │                 │
        └────────┬────────┘       └────────┬────────┘
                 │                         │
                 └────────────┬────────────┘
                              │
                              ▼
                     ┌─────────────────┐
                     │ Hybrid merge    │  ← Combines both result sets
                     │ & ranking       │    Deduplicates, re-ranks
                     └────────┬────────┘
                              │
                              ▼
                     ┌─────────────────┐
                     │ Top N chunks    │  ← Relevant memory fragments
                     │ returned        │    injected into agent context
                     └────────┬────────┘
                              │
                              ▼
                   Agent has full context
                   to answer the question 🎉
```
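
To make the hybrid merge step concrete, here is a minimal sketch. The document does not say how Clawdbot combines the two result sets, so the weighted sum, the `vector_weight` value, and the function names are all assumptions. The keyword score is a crude term-overlap stand-in, not real BM25.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query: str, text: str) -> float:
    """Fraction of query terms found in the text (stand-in for BM25)."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def hybrid_search(query, query_vec, chunks, top_n=3, vector_weight=0.7):
    """Rank chunks by a weighted sum of vector and keyword scores.

    chunks: dicts with "id", "text", and "vector" keys (hypothetical shape).
    Returns (id, score) pairs, best first.
    """
    scored = {}
    for chunk in chunks:
        score = (vector_weight * cosine(query_vec, chunk["vector"])
                 + (1 - vector_weight) * keyword_score(query, chunk["text"]))
        # Keying by id deduplicates a chunk that both searches would return.
        scored[chunk["id"]] = max(score, scored.get(chunk["id"], float("-inf")))
    ranked = sorted(scored.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]


chunks = [
    {"id": "a", "text": "api rate limits set to 100", "vector": [1.0, 0.0]},
    {"id": "b", "text": "lunch order", "vector": [0.0, 1.0]},
    {"id": "c", "text": "rate limits discussion", "vector": [0.6, 0.8]},
]
results = hybrid_search("rate limits", [1.0, 0.0], chunks, top_n=2)
print([chunk_id for chunk_id, _ in results])
```

The keyword side catches exact identifiers that embeddings can blur, while the vector side catches paraphrases. The relative weight is a tuning choice.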

---

## Pre-Compaction Flush Flow

The safety net that prevents amnesia:

```
Context Window
┌──────────────────────────────────────────┐
│ System prompt                            │
│ AGENTS.md                                │
│ Memory search results                    │
│ ─────────────────────────────────        │
│ Old messages   ← these get compacted     │
│ ...                                      │
│ ...                                      │
│ Recent messages                          │
│ ─────────────────────────────────        │
│ Reserve tokens (floor: 20,000)           │
└──────────────────────────────────────────┘
                      │
        Token count approaches limit
     (contextWindow - reserveTokensFloor
                    - softThresholdTokens)
                      │
                      ▼
          ┌───────────────────────┐
          │ Clawdbot triggers     │
          │ memory flush          │
          │                       │
          │ Silent system prompt: │
          │ "Session nearing      │
          │ compaction. Store     │
          │ durable memories."    │
          │                       │
          │ Silent user prompt:   │
          │ "Write lasting notes  │
          │ to memory/; reply     │
          │ NO_REPLY if nothing   │
          │ to store."            │
          └───────────┬───────────┘
                      │
                      ▼
          ┌───────────────────────┐
          │ Agent writes to disk  │
          │                       │
          │ • Current work status │
          │ • Pending decisions   │
          │ • Important context   │
          │ • Where we left off   │
          └───────────┬───────────┘
                      │
                      ▼
          ┌───────────────────────┐
          │ File watcher triggers │
          │ re-index              │
          └───────────┬───────────┘
                      │
                      ▼
          ┌───────────────────────┐
          │ Compaction happens    │
          │ (old messages removed/│
          │ summarized)           │
          └───────────┬───────────┘
                      │
                      ▼
           Memories safe on disk ✅
           Indexed and searchable ✅
           Agent can recall later ✅
```
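
The trigger point in the diagram is plain arithmetic, sketched below. The 200,000-token context window is an assumed value for illustration, and whether the real check uses `>=` or `>` is also an assumption. The two defaults come from the config shown in this document.

```python
def flush_threshold(context_window: int,
                    reserve_tokens_floor: int = 20_000,
                    soft_threshold_tokens: int = 4_000) -> int:
    """Token count at which the memory flush should fire."""
    return context_window - reserve_tokens_floor - soft_threshold_tokens


def should_flush(token_count: int, context_window: int) -> bool:
    """True once the session reaches the flush threshold (assumed >=)."""
    return token_count >= flush_threshold(context_window)


# Assumed 200,000-token window: 200,000 - 20,000 - 4,000 = 176,000.
print(flush_threshold(200_000))        # 176000
print(should_flush(150_000, 200_000))  # False
print(should_flush(176_000, 200_000))  # True
```

The soft threshold leaves the agent a margin of tokens to write its notes before compaction would otherwise begin.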

---

## Storage Layout

```
~/.clawdbot/
├── clawdbot.json              ← Config with memorySearch settings
│
├── workspace/                 ← Agent workspace (configurable)
│   ├── AGENTS.md              ← Agent instructions (with memory habits)
│   ├── MEMORY.md              ← Curated long-term memory (optional)
│   │
│   ├── memory/                ← Daily logs & research intel
│   │   ├── 2026-01-15.md      ← Daily log
│   │   ├── 2026-01-16.md
│   │   ├── 2026-02-10.md      ← Today
│   │   ├── project-x-research-intel.md
│   │   ├── TEMPLATE-daily.md  ← Reference template
│   │   ├── TEMPLATE-research-intel.md
│   │   └── TEMPLATE-project-tracking.md
│   │
│   └── ... (other workspace files)
│
└── agents/
    └── main/
        └── agent/
            └── memory/        ← Vector index (managed by Clawdbot)
                └── memory.db  ← SQLite + sqlite-vec database
```
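
The layout above maps directly onto path arithmetic. The helpers below are hypothetical (they are not part of Clawdbot) and only show how a daily log path and the index database path follow from the tree. The `agent_id` parameter assumes that `main` in the tree is an agent name.

```python
from datetime import date
from pathlib import Path


def daily_log_path(workspace: Path, day: date) -> Path:
    """Daily log file for a given day: <workspace>/memory/YYYY-MM-DD.md."""
    return workspace / "memory" / f"{day.isoformat()}.md"


def index_db_path(root: Path, agent_id: str = "main") -> Path:
    """Vector index location: <root>/agents/<agent_id>/agent/memory/memory.db."""
    return root / "agents" / agent_id / "agent" / "memory" / "memory.db"


root = Path.home() / ".clawdbot"
print(daily_log_path(root / "workspace", date(2026, 2, 10)))
print(index_db_path(root))
```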

---

## Config Structure

The memory system config lives in `clawdbot.json` under `agents.defaults`, with search settings in `memorySearch` and pre-compaction flush settings in `compaction`:

```jsonc
{
  "agents": {
    "defaults": {
      // ... other config (model, workspace, etc.) ...

      "memorySearch": {
        // Embedding provider: "openai" | "gemini" | "local"
        "provider": "openai",

        // Model name (provider-specific)
        "model": "text-embedding-3-small",

        // Remote provider settings (OpenAI / Gemini)
        "remote": {
          "apiKey": "sk-...",   // Optional if using env var
          "baseUrl": "...",     // Optional custom endpoint
          "headers": {}         // Optional extra headers
        },

        // Additional paths to index (beyond memory/ and MEMORY.md)
        "extraPaths": ["../team-docs"],

        // Fallback provider if primary fails
        "fallback": "local"     // "openai" | "gemini" | "local" | "none"
      },

      // Pre-compaction memory flush (enabled by default)
      "compaction": {
        "reserveTokensFloor": 20000,
        "memoryFlush": {
          "enabled": true,
          "softThresholdTokens": 4000
        }
      }
    }
  }
}
```
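
One way to read the `provider` and `fallback` settings is as an ordered list of providers to try. The sketch below assumes exactly that. The function names, the treatment of a missing `fallback` key as `"none"`, and the `embedders` mapping of provider names to callables are all hypothetical and not Clawdbot's actual API.

```python
def provider_chain(memory_search: dict) -> list[str]:
    """Providers to try in order: the primary, then the fallback if set."""
    chain = [memory_search["provider"]]
    # Assumption: a missing "fallback" key behaves like "none".
    fallback = memory_search.get("fallback", "none")
    if fallback != "none" and fallback not in chain:
        chain.append(fallback)
    return chain


def embed_with_fallback(text: str, chain: list[str], embedders: dict):
    """Return (provider_name, vector) from the first provider that succeeds."""
    for name in chain:
        try:
            return name, embedders[name](text)
        except Exception:
            continue  # try the next provider in the chain
    raise RuntimeError("all embedding providers failed")


def failing_remote(text: str) -> list[float]:
    raise ConnectionError("simulated outage")


config = {"provider": "openai", "fallback": "local"}
embedders = {"openai": failing_remote, "local": lambda text: [0.0, 1.0]}
print(embed_with_fallback("note", provider_chain(config), embedders))
```

Vectors from different embedding models generally differ in dimension and are not comparable, so a real fallback would likely need its own index or a re-index. The search flow notes that queries use the same provider as the index.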

---

## Data Flow Summary

```
WRITE PATH                          READ PATH
──────────                          ─────────

Agent writes note                   Agent needs context
       │                                   │
       ▼                                   ▼
memory/YYYY-MM-DD.md                memory_search("query")
       │                                   │
       ▼                                   ▼
File watcher                        Embed query
       │                                   │
       ▼                                   ▼
Chunk + embed                       Vector + keyword search
       │                                   │
       ▼                                   ▼
Store in SQLite                     Return top chunks
       │                                   │
       ▼                                   ▼
Index updated ✅                    Context restored ✅
```