# Memory System Comparison Matrix

Detailed comparison of Clawdbot's memory system vs. alternatives.
## Quick Comparison Table
| Feature | Clawdbot Memory | Long Context | RAG on Docs | Vector DB SaaS | Notion/Obsidian |
|---|---|---|---|---|---|
| Persistent across sessions | ✅ | ❌ | ✅ | ✅ | ✅ |
| Survives crashes | ✅ | ❌ | ✅ | ✅ | ✅ |
| Semantic search | ✅ | ❌ | ✅ | ✅ | ⚠️ (limited) |
| Human-editable | ✅ | ❌ | ⚠️ | ❌ | ✅ |
| Git-backed | ✅ | ❌ | ⚠️ | ❌ | ⚠️ |
| Free/Low Cost | ✅ (~$0.50/mo) | ❌ (token-heavy) | ✅ | ❌ ($50+/mo) | ⚠️ ($10/mo) |
| No cloud dependency | ✅ (local SQLite) | ✅ | ✅ | ❌ | ❌ |
| Agent can write | ✅ | ✅ | ❌ | ⚠️ | ✅ |
| Fast search (<100ms) | ✅ | ❌ | ✅ | ⚠️ (network) | ⚠️ |
| Data sovereignty | ✅ (your disk) | ✅ | ✅ | ❌ | ❌ |
| Hybrid search (semantic + keyword) | ✅ | ❌ | ⚠️ | ✅ | ⚠️ |
| Auto-indexing | ✅ | N/A | ⚠️ | ✅ | ⚠️ |
| Multi-agent support | ✅ | ⚠️ | ⚠️ | ✅ | ❌ |
**Legend:**
- ✅ = Full support, works well
- ⚠️ = Partial support or caveats
- ❌ = Not supported or poor fit
## Detailed Comparison
### 1. Clawdbot Memory System (This System)

**Architecture:** Markdown files + SQLite + vector embeddings

**Pros:**
- ✅ Agent actively curates its own memory
- ✅ Human-readable and editable (plain Markdown)
- ✅ Git-backed (full version history)
- ✅ Fast semantic search (<100ms)
- ✅ Hybrid search (semantic + keyword)
- ✅ Local storage (no cloud lock-in)
- ✅ Low running cost (~$0.50/month in embedding fees)
- ✅ Survives crashes and restarts
- ✅ Pre-compaction auto-flush
- ✅ Multi-session persistence
**Cons:**
- ⚠️ Requires API key for embeddings (or local setup)
- ⚠️ Initial indexing takes a few seconds
- ⚠️ Embedding costs scale with memory size (~$0.50/mo at 35 files)
**Best for:**
- Personal AI assistants
- Long-running projects
- Multi-session workflows
- Agents that need to "remember" decisions
**Cost:** ~$0.50/month (OpenAI Batch API)
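The "Markdown + SQLite + vector embeddings" architecture can be sketched in a few lines. This is a toy illustration, not the real implementation: the hash-based `embed` stand-in (in place of a real embedding API), the table layout, and the flat keyword bonus are all assumptions made to keep the sketch self-contained.

```python
import sqlite3
import hashlib

# Toy embedding: a deterministic hash-derived vector. The real system calls an
# embedding API (e.g. OpenAI Batch); this stand-in just keeps the sketch runnable.
def embed(text: str) -> list:
    digest = hashlib.sha256(text.lower().encode()).digest()
    return [b / 255 for b in digest[:8]]

def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, path TEXT, body TEXT)")
vectors = {}  # chunk id -> embedding (the real system stores these in SQLite too)

def index_chunk(path: str, body: str) -> None:
    cur = db.execute("INSERT INTO chunks (path, body) VALUES (?, ?)", (path, body))
    vectors[cur.lastrowid] = embed(body)

def hybrid_search(query: str, k: int = 3) -> list:
    # Keyword pass (the real system likely uses SQLite FTS; LIKE keeps this portable).
    keyword_hits = {row[0] for row in db.execute(
        "SELECT id FROM chunks WHERE body LIKE ?", (f"%{query}%",))}
    # Semantic pass: cosine similarity over all stored vectors, with a flat
    # bonus for chunks the keyword pass also matched.
    qv = embed(query)
    scored = sorted(
        ((cosine(qv, v) + (1.0 if cid in keyword_hits else 0.0), cid)
         for cid, v in vectors.items()),
        reverse=True)
    return [db.execute("SELECT path FROM chunks WHERE id = ?", (cid,)).fetchone()[0]
            for _, cid in scored[:k]]

# Index a couple of sample chunks (contents are placeholders):
index_chunk("memory/decisions.md", "we chose sqlite for local storage")
index_chunk("memory/log.md", "daily log of agent activity")
```

Because both passes run against a local database, search stays in the sub-100ms range with no per-query API cost.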
### 2. Long Context Windows (Claude 200K, GPT-4 128K)

**Architecture:** Everything in prompt context

**Pros:**
- ✅ Simple (no separate storage)
- ✅ Agent has "all" context available
- ✅ No indexing delay
**Cons:**
- ❌ Ephemeral (lost on crash/restart)
- ❌ Expensive at scale ($5-20 per long session)
- ❌ Degrades with very long contexts (needle-in-haystack)
- ❌ No semantic search (model must scan)
- ❌ Compaction loses old context
**Best for:**
- Single-session tasks
- One-off questions
- Contexts that fit in <50K tokens
**Cost:** $5-20 per session (for 100K+ token contexts)
### 3. RAG on External Docs

**Architecture:** Vector DB over static documentation

**Pros:**
- ✅ Good for large doc corpora
- ✅ Semantic search
- ✅ Persistent
**Cons:**
- ❌ Agent can't write/update docs (passive)
- ❌ Requires separate ingestion pipeline
- ⚠️ Human editing is indirect
- ⚠️ Git backing depends on doc format
- ❌ Agent doesn't "learn" (docs are static)
**Best for:**
- Technical documentation search
- Knowledge base Q&A
- Support chatbots
**Cost:** Varies (Pinecone: $70/mo; OpenAI embeddings: $0.50+/mo)
### 4. Vector DB SaaS (Pinecone, Weaviate, Qdrant Cloud)

**Architecture:** Cloud-hosted vector database

**Pros:**
- ✅ Fast semantic search
- ✅ Scalable (millions of vectors)
- ✅ Managed infrastructure
**Cons:**
- ❌ Expensive ($70+/mo for production tier)
- ❌ Cloud lock-in
- ❌ Network latency on every search
- ❌ Data lives on their servers
- ⚠️ Human editing requires API calls
- ❌ Not git-backed (proprietary storage)
**Best for:**
- Enterprise-scale deployments
- Multi-tenant apps
- High-throughput search
**Cost:** $70-500/month
### 5. Notion / Obsidian / Roam

**Architecture:** Note-taking app with API

**Pros:**
- ✅ Human-friendly UI
- ✅ Rich formatting
- ✅ Collaboration features (Notion)
- ✅ Agent can write via API
**Cons:**
- ❌ Not designed for AI memory (UI overhead)
- ⚠️ Search is UI-focused, not API-optimized
- ❌ Notion: cloud lock-in, $10/mo
- ⚠️ Obsidian: local but not structured for agents
- ❌ No vector search (keyword only)
- ⚠️ Git backing: manual or plugin-dependent
**Best for:**
- Human-first note-taking
- Team collaboration
- Visual knowledge graphs
**Cost:** $0-10/month
### 6. Pure Filesystem (No Search)

**Architecture:** Markdown files, no indexing

**Pros:**
- ✅ Simple
- ✅ Free
- ✅ Git-backed
- ✅ Human-editable
**Cons:**
- ❌ No semantic search (grep only)
- ❌ Slow to find info (must scan all files)
- ❌ Agent can't recall context efficiently
- ❌ No hybrid search
**Best for:**
- Very small memory footprints (<10 files)
- Temporary projects
- Humans who manually search
**Cost:** Free
## When to Choose Which
**Choose Clawdbot Memory if:**
- ✅ You want persistent, searchable memory
- ✅ Agent needs to write its own memory
- ✅ You value data sovereignty (local storage)
- ✅ Budget is <$5/month
- ✅ You want git-backed history
- ✅ Multi-session workflows
**Choose Long Context if:**
- ✅ Single-session tasks only
- ✅ Budget is flexible ($5-20/session OK)
- ✅ Context fits in <50K tokens
- ❌ Don't need persistence
**Choose RAG on Docs if:**
- ✅ Large existing doc corpus
- ✅ Docs rarely change
- ❌ Agent doesn't need to write
- ✅ Multiple agents share same knowledge
**Choose Vector DB SaaS if:**
- ✅ Enterprise scale (millions of vectors)
- ✅ Multi-tenant app
- ✅ Budget is $100+/month
- ❌ Data sovereignty isn't critical
**Choose Notion/Obsidian if:**
- ✅ Humans are primary users
- ✅ Visual knowledge graphs matter
- ✅ Collaboration is key
- ⚠️ Agent memory is secondary
**Choose Pure Filesystem if:**
- ✅ Tiny memory footprint (<10 files)
- ✅ Temporary project
- ❌ Search speed doesn't matter
## Hybrid Approaches
### Clawdbot Memory + Long Context
Best of both worlds:
- Use memory for durable facts/decisions
- Use context for current session detail
- Pre-compaction flush keeps memory updated
- This is what Jake's setup does
### Clawdbot Memory + RAG
For large doc sets:
- Memory: agent's personal notes
- RAG: external documentation
- Agent searches both as needed
### Clawdbot Memory + Notion
For team collaboration:
- Memory: agent's internal state
- Notion: shared team wiki
- Agent syncs key info to Notion
## Migration Paths
### From Long Context → Clawdbot Memory

- Extract key facts from long sessions
- Write to `memory/` files
- Index via `clawdbot memory index`
- Continue with hybrid approach
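The "write to memory, then re-index" step can be sketched as below. The file names and contents here are hypothetical placeholders; only the `memory/` directory and the `clawdbot memory index` command come from the migration steps themselves.

```python
from pathlib import Path

# Hypothetical facts distilled from a long session (placeholder content,
# not part of the real system).
facts = {
    "decisions.md": "# Decisions\n- Chose SQLite for local storage\n",
    "preferences.md": "# Preferences\n- Prefers concise replies\n",
}

# Write each fact file into the memory/ directory.
memory_dir = Path("memory")
memory_dir.mkdir(exist_ok=True)
for name, body in facts.items():
    (memory_dir / name).write_text(body)

# Final step, run from the shell so the new files become searchable:
#   clawdbot memory index
```

After the index run, the extracted facts are available to semantic search in later sessions.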
### From Notion → Clawdbot Memory

- Export Notion pages as Markdown
- Move to the `memory/` directory
- Index via `clawdbot memory index`
- Keep Notion for the team wiki, memory for agent state
### From Vector DB → Clawdbot Memory
- Export vectors (if possible) or re-embed
- Convert to Markdown + SQLite
- Index locally
- Optionally keep Vector DB for shared/production data
## Real-World Performance
### Jake's Production Stats (26 days, 35 files)
| Metric | Value |
|---|---|
| Files | 35 markdown files |
| Chunks | 121 |
| Memories | 116 |
| SQLite size | 15 MB |
| Search speed | <100ms |
| Embedding cost | ~$0.50/month |
| Crashes survived | 5+ |
| Data loss | Zero |
| Daily usage | 10-50 searches/day |
| Git commits | Daily (automated) |
### Scaling Projection
| Scale | Files | Chunks | SQLite Size | Search Speed | Monthly Cost |
|---|---|---|---|---|---|
| Small | 10-50 | 50-200 | 5-20 MB | <100ms | $0.50 |
| Medium | 50-200 | 200-1000 | 20-80 MB | <200ms | $2-5 |
| Large | 200-500 | 1000-2500 | 80-200 MB | <500ms | $10-20 |
| XL | 500-1000 | 2500-5000 | 200-500 MB | <1s | $30-50 |
| XXL | 1000+ | 5000+ | 500+ MB | Consider partitioning | $50+ |
Note: At 1000+ files, consider archiving old logs or partitioning by date/project.
## Cost Breakdown (OpenAI Batch API)
### Initial Indexing (35 files, 121 chunks)
- Tokens: ~50,000 (121 chunks × ~400 tokens avg)
- Embedding cost: $0.001 per 1K tokens (Batch API)
- Total: ~$0.05
### Daily Updates (3 files, ~10 chunks)
- Tokens: ~4,000/day
- Embedding cost: ~$0.004/day
- Monthly: ~$0.12
### Ongoing Search (100 searches/day)
- Search: Local SQLite (free)
- No per-query cost
**Total monthly:** ~$0.50 (a conservative ceiling; the breakdown above sums to roughly $0.12-0.17, leaving headroom for occasional full re-indexes)
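The arithmetic above can be reproduced in a few lines, using the rates and token counts quoted in the breakdown:

```python
# Rates and token counts taken from the cost breakdown above.
RATE_PER_TOKEN = 0.001 / 1000  # $0.001 per 1K tokens (Batch API)

initial_tokens = 121 * 400                        # 121 chunks at ~400 tokens each
initial_cost = initial_tokens * RATE_PER_TOKEN    # one-time indexing

daily_tokens = 4000                               # ~10 chunks of daily updates
monthly_cost = daily_tokens * RATE_PER_TOKEN * 30 # ongoing embedding spend

print(f"initial: ${initial_cost:.2f}, monthly: ${monthly_cost:.2f}")
# → initial: $0.05, monthly: $0.12
```

Searches add nothing on top, since they run against local SQLite: the steady-state spend sits comfortably under the ~$0.50/month budget quoted throughout this document.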
Compare to:
- Long context (100K tokens/session): $5-20/session
- Pinecone: $70/month (starter tier)
- Notion API: $10/month (plus rate limits)
## Feature Matrix Deep Dive
### Persistence
| System | Survives Crash | Survives Restart | Survives Power Loss |
|---|---|---|---|
| Clawdbot Memory | ✅ | ✅ | ✅ (if git pushed) |
| Long Context | ❌ | ❌ | ❌ |
| RAG | ✅ | ✅ | ✅ |
| Vector DB SaaS | ✅ | ✅ | ⚠️ (cloud dependent) |
| Notion | ✅ | ✅ | ✅ (cloud) |
### Search Quality
| System | Semantic | Keyword | Hybrid | Speed |
|---|---|---|---|---|
| Clawdbot Memory | ✅ | ✅ | ✅ | <100ms |
| Long Context | ⚠️ (model scan) | ⚠️ (model scan) | ❌ | Slow |
| RAG | ✅ | ⚠️ | ⚠️ | <200ms |
| Vector DB SaaS | ✅ | ❌ | ⚠️ | <300ms (network) |
| Notion | ❌ | ✅ | ❌ | Varies |
### Agent Control
| System | Agent Can Write | Agent Can Edit | Agent Can Delete | Auto-Index |
|---|---|---|---|---|
| Clawdbot Memory | ✅ | ✅ | ✅ | ✅ |
| Long Context | ✅ | ✅ | ✅ | N/A |
| RAG | ❌ | ❌ | ❌ | ⚠️ |
| Vector DB SaaS | ⚠️ (via API) | ⚠️ (via API) | ⚠️ (via API) | ⚠️ |
| Notion | ✅ (via API) | ✅ (via API) | ✅ (via API) | ❌ |
## Bottom Line
For personal AI assistants like Buba:
🥇 #1: Clawdbot Memory System
- Best balance of cost, control, persistence, and search
- Agent-friendly (write/edit/delete)
- Git-backed safety
- Local storage (data sovereignty)
🥈 #2: Clawdbot Memory + Long Context (Hybrid)
- Memory for durable facts
- Context for current session
- This is Jake's setup — it works great
🥉 #3: RAG on Docs
- If you have massive existing docs
- Agent doesn't need to write
❌ Avoid for personal assistants:
- Vector DB SaaS (overkill + expensive)
- Pure long context (not persistent)
- Notion/Obsidian (not optimized for AI)
END OF COMPARISON
ᕕ( ᐛ )ᕗ