# Clawdbot Memory System Migration Plan

**Created:** 2026-01-27
**Status:** Ready to Execute
**Risk Level:** Low (old system preserved, incremental migration)

---

## Current State Inventory

| Asset | Location | Size | Records |
|-------|----------|------|---------|
| Main SQLite | `~/.clawdbot/memory/main.sqlite` | 9.0 MB | 56 chunks |
| iMessage SQLite | `~/.clawdbot/memory/imessage.sqlite` | 8.1 MB | ~42 chunks |
| Markdown files | `~/.clawdbot/workspace/memory/*.md` | ~60 KB total | 17 files |
| INDEX.json | `~/.clawdbot/workspace/memory/INDEX.json` | 7.1 KB | 6 categories, 20 nodes |
| Session transcripts | `~/.clawdbot/agents/*/sessions/*.jsonl` | 23 files | 5,593 lines |
| New memories table | `~/.clawdbot/memory/main.sqlite` | - | 36 records (migrated) |

---

## Migration Phases

### Phase 0: Backup Everything (REQUIRED FIRST)

**Time:** 5 minutes
**Risk:** None

```bash
# Create timestamped backup directory
BACKUP_DIR=~/.clawdbot/backups/pre-migration-$(date +%Y%m%d-%H%M%S)
mkdir -p "$BACKUP_DIR"

# Backup SQLite databases
cp ~/.clawdbot/memory/main.sqlite "$BACKUP_DIR/"
cp ~/.clawdbot/memory/imessage.sqlite "$BACKUP_DIR/"

# Backup markdown memory files
cp -r ~/.clawdbot/workspace/memory "$BACKUP_DIR/memory-markdown"

# Backup session transcripts
cp -r ~/.clawdbot/agents "$BACKUP_DIR/agents"

# Backup config
cp ~/.clawdbot/clawdbot.json "$BACKUP_DIR/"

# Verify backup
echo "Backup created at: $BACKUP_DIR"
ls -la "$BACKUP_DIR"
```

**Checkpoint:** Verify the backup directory has all files before proceeding.

---

### Phase 1: Complete Markdown Migration

**Time:** 15 minutes
**Risk:** Low (additive only)

We already migrated CRITICAL-REFERENCE.md, Genre Universe, and Remix Sniper. Now migrate the remaining files.
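The migration functions in this phase all funnel through an `insert_memory` helper that the plan assumes already exists in `migrate-memories.py`. As a reference, here is a minimal sketch of what that helper might look like; the column names are guesses based on the Phase 2 SQL in this plan, not the real script, so adjust them to the actual schema:

```python
import sqlite3
import time


def insert_memory(db, content, memory_type, source_file, guild_id):
    """Insert one memory row and return its rowid.

    Hypothetical helper: column names (content, memory_type,
    source_file, guild_id, created_at) are assumptions taken from
    the Phase 2 SQL in this plan, not from migrate-memories.py.
    """
    cur = db.execute(
        "INSERT INTO memories (content, memory_type, source_file, guild_id, created_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (content, memory_type, source_file, guild_id, int(time.time())),
    )
    db.commit()
    return cur.lastrowid
```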
#### Files to Migrate:

| File | Content Type | Priority |
|------|-------------|----------|
| `2026-01-14.md` | Daily log - GOG setup | Medium |
| `2026-01-15.md` | Daily log - agent-browser, Reonomy | High |
| `2026-01-25.md` | Security incident - Reed breach | High |
| `2026-01-26.md` | Daily log - Reonomy v13 | Medium |
| `2026-01-19-backup-system.md` | Backup system setup | Medium |
| `2026-01-19-cloud-backup.md` | Cloud backup config | Medium |
| `burton-method-research-intel.md` | Competitor research | High |
| `contacts-leaf-gc.md` | Contact info | Medium |
| `contacts-skivals-gc.md` | Contact info | Medium |
| `imessage-rules.md` | Security rules | High |
| `imessage-security-rules.md` | Security rules | High |
| `remi-self-healing.md` | Remix Sniper healing | Medium |
| `voice-ai-comparison-2026.md` | Research | Low |
| `accounts.md` | Accounts | Low |

#### Migration Script Extension:

```python
# Add to migrate-memories.py

def migrate_daily_logs(db):
    """Migrate daily log files."""
    memories = []

    # 2026-01-14 - GOG setup
    memories.append((
        "GOG (Google Workspace CLI) configured with 3 accounts: jake@burtonmethod.com, jake@localbosses.org, jakeshore98@gmail.com",
        "fact", None, "2026-01-14.md"
    ))

    # 2026-01-15 - agent-browser
    memories.append((
        "agent-browser is Vercel Labs headless browser CLI with ref-based navigation, semantic locators, state persistence. Commands: open, snapshot -i, click @ref, type @ref 'text'",
        "fact", None, "2026-01-15.md"
    ))
    memories.append((
        "Reonomy scraper attempted with agent-browser. URL pattern discovered: ownership tab in search filters allows searching by Owner Contact Information.",
        "fact", None, "2026-01-15.md"
    ))

    # 2026-01-25 - Security incident
    memories.append((
        "SECURITY INCIDENT 2026-01-25: Reed breach. Contact memory poisoning. Password leaked. Rules updated. Rotate all passwords after breach.",
        "security", None, "2026-01-25.md"
    ))

    # ... continue for all files

    for content, mtype, guild_id, source in memories:
        insert_memory(db, content, mtype, source, guild_id)

    return len(memories)


def migrate_security_rules(db):
    """Migrate iMessage security rules."""
    memories = [
        ("iMessage password gating: Password JAJAJA2026 required. Mention gating (Buba). Never reveal password in any context.",
         "security", None),
        ("iMessage trust chain: Only trust Jake (914-500-9208). Everyone else must verify with Jake first, then chat-only mode with password.",
         "security", None),
    ]
    for content, mtype, guild_id in memories:
        insert_memory(db, content, mtype, "imessage-security-rules.md", guild_id)
    return len(memories)


def migrate_contacts(db):
    """Migrate contact information (non-sensitive parts only)."""
    memories = [
        ("Contact: Leaf GC - group chat contact for Leaf-related communications",
         "relationship", None),
        ("Contact: Skivals GC - group chat contact for Skivals-related communications",
         "relationship", None),
    ]
    for content, mtype, guild_id in memories:
        insert_memory(db, content, mtype, "contacts.md", guild_id)
    return len(memories)
```

**Checkpoint:** Run `python memory-retrieval.py stats` and verify the count increased.

---

### Phase 2: Migrate Existing Chunks (Vector Embeddings)

**Time:** 10 minutes
**Risk:** Low (copies data, doesn't delete)

The existing `chunks` table has 56 pre-embedded chunks. We should copy these to the memories table to preserve the embeddings.
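Before running the copy, it's worth a quick compatibility check: reusing embeddings only makes sense if both tables store vectors from the same model, which (for packed-float BLOBs) shows up as identical byte lengths. A minimal sketch, assuming embeddings are stored as float32 BLOBs and using the table/column names from this plan's SQL:

```python
import sqlite3


def check_embedding_compat(db_path):
    """Return True if chunks and memories embeddings have matching byte lengths.

    Assumes both tables store embeddings as BLOBs (e.g. packed float32).
    An empty or all-NULL memories.embedding column counts as compatible,
    since there is nothing to conflict with yet.
    """
    db = sqlite3.connect(db_path)
    chunk_len = db.execute(
        "SELECT length(embedding) FROM chunks WHERE embedding IS NOT NULL LIMIT 1"
    ).fetchone()
    mem_len = db.execute(
        "SELECT length(embedding) FROM memories WHERE embedding IS NOT NULL LIMIT 1"
    ).fetchone()
    db.close()
    if chunk_len is None or mem_len is None:
        return True  # nothing to compare yet
    return chunk_len[0] == mem_len[0]
```

If this returns False, the two tables were embedded with different models and the rows should be re-embedded rather than copied.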
```sql
-- Copy chunks to memories (preserving embeddings)
INSERT INTO memories (
    content,
    embedding,
    memory_type,
    source,
    source_file,
    created_at,
    confidence
)
SELECT
    text AS content,
    embedding,
    'fact' AS memory_type,
    'chunks_migration' AS source,
    path AS source_file,
    COALESCE(updated_at, unixepoch()) AS created_at,
    1.0 AS confidence
FROM chunks
WHERE NOT EXISTS (
    SELECT 1 FROM memories m
    WHERE m.source_file = chunks.path
      AND m.source = 'chunks_migration'
);
```

**Checkpoint:** Verify with `SELECT COUNT(*) FROM memories WHERE source = 'chunks_migration'`

---

### Phase 3: Session Transcript Indexing (Optional - Later)

**Time:** 30-60 minutes
**Risk:** Medium (large data volume)

Session transcripts contain conversation history. This is valuable but voluminous.

#### Strategy: Selective Indexing

Don't index every message. Index:

1. Messages where Clawdbot learned something (contains "I'll remember", "noted", "got it")
2. User corrections ("actually it's...", "no, the correct...")
3. Explicit requests ("remember that...", "don't forget...")

```python
import glob
import json
import os
import re


def extract_memorable_from_sessions():
    """Extract memorable moments from session transcripts."""
    session_files = glob.glob(os.path.expanduser(
        "~/.clawdbot/agents/*/sessions/*.jsonl"
    ))

    memorable_patterns = [
        r"I'll remember",
        r"I've noted",
        r"remember that",
        r"don't forget",
        r"actually it's",
        r"the correct",
        r"important:",
        r"key point:",
    ]
    pattern = re.compile("|".join(memorable_patterns), re.IGNORECASE)

    memories = []
    for fpath in session_files:
        with open(fpath) as f:
            for line in f:
                try:
                    entry = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip malformed lines
                # NOTE: "content" is an assumed field name for the
                # transcript message text; adjust to the actual schema.
                text = entry.get("content", "")
                if isinstance(text, str) and pattern.search(text):
                    memories.append((text, fpath))
    return memories
```

**Recommendation:** Skip this for now. The markdown files contain the curated important stuff. Sessions are a backup/audit trail.

---

### Phase 4: Wire Into Clawdbot Runtime

**Time:** 30-60 minutes
**Risk:** Medium (changes bot behavior)

This requires modifying Clawdbot's code to use the new memory system.
#### 4.1 Create Memory Interface Module

Location: `~/.clawdbot/workspace/memory_interface.py`

```python
"""
Memory interface for Clawdbot runtime.
Import this in your bot's message handler.
"""
# NOTE: memory-retrieval.py must be importable under this name
# (hyphens are invalid in Python module names), e.g. rename or
# symlink it to memory_retrieval.py.
from memory_retrieval import (
    search_memories,
    add_memory,
    get_recent_memories,
    supersede_memory,
)


def get_context_for_message(message, guild_id, channel_id, user_id):
    """
    Get relevant memory context for responding to a message.
    Call this before generating a response.
    """
    # Search for relevant memories
    results = search_memories(
        query=message,
        guild_id=guild_id,
        limit=5
    )

    if not results:
        # Fall back to recent memories for this guild
        results = get_recent_memories(guild_id=guild_id, limit=3)

    # Format for context injection
    context_lines = []
    for r in results:
        context_lines.append(f"[Memory] {r['content']}")

    return "\n".join(context_lines)


def should_remember(response_text):
    """
    Check if the bot's response indicates something should be remembered.
    """
    triggers = [
        "i'll remember",
        "i've noted",
        "got it",
        "noted",
        "understood",
    ]
    lower = response_text.lower()
    return any(t in lower for t in triggers)


def extract_and_store(message, response, guild_id, channel_id, user_id):
    """
    If the response indicates learning, extract and store the memory.
    """
    if not should_remember(response):
        return None

    # The message itself is what should be remembered
    memory_id = add_memory(
        content=message,
        memory_type="fact",
        guild_id=guild_id,
        channel_id=channel_id,
        user_id=user_id,
        source="conversation"
    )
    return memory_id
```

#### 4.2 Integration Points

In Clawdbot's message handler:

```python
# Before generating response:
memory_context = get_context_for_message(
    message=user_message,
    guild_id=str(message.guild.id) if message.guild else None,
    channel_id=str(message.channel.id),
    user_id=str(message.author.id)
)

# Inject into prompt:
system_prompt = f"""
{base_system_prompt}

Relevant memories:
{memory_context}
"""

# After generating response:
extract_and_store(
    message=user_message,
    response=bot_response,
    guild_id=...,
    channel_id=...,
    user_id=...
)
```

---

### Phase 5: Deprecate Old System

**Time:** 5 minutes
**Risk:** Low (keep files, just stop using them)

Once the new system is validated:

1. **Keep old files** - Don't delete the markdown files; they're a human-readable backup
2. **Stop writing to old locations** - New memories go to SQLite only
3. **Archive old chunks table** - Rename it to `chunks_archive`

```sql
-- Archive old chunks table (don't delete)
ALTER TABLE chunks RENAME TO chunks_archive;
ALTER TABLE chunks_fts RENAME TO chunks_fts_archive;
```

**DO NOT** delete the old files until you've run the new system for at least 2 weeks without issues.

---

## Validation Checkpoints

### After Each Phase:

| Check | Command | Expected |
|-------|---------|----------|
| Memory count | `python memory-retrieval.py stats` | Count increases |
| Search works | `python memory-retrieval.py search "Das"` | Returns results |
| Guild scoping | `python memory-retrieval.py search "remix" --guild 1449158500344270961` | Only The Hive results |
| FTS works | `sqlite3 ~/.clawdbot/memory/main.sqlite "SELECT COUNT(*) FROM memories_fts"` | Matches memories count |

### Integration Test (After Phase 4):

1. Send message to Clawdbot: "What do you know about Das?"
2. Verify the response includes Genre Universe info
3. Send message: "Remember that Das prefers releasing on Fridays"
4. Search: `python memory-retrieval.py search "Das Friday"`
5. Verify the new memory exists

---

## Rollback Plan

If anything goes wrong:

```bash
# 1. Restore from backup
BACKUP_DIR=~/.clawdbot/backups/pre-migration-YYYYMMDD-HHMMSS

# Restore databases
cp "$BACKUP_DIR/main.sqlite" ~/.clawdbot/memory/
cp "$BACKUP_DIR/imessage.sqlite" ~/.clawdbot/memory/

# Restore markdown (if needed)
cp -r "$BACKUP_DIR/memory-markdown/"* ~/.clawdbot/workspace/memory/

# 2. Drop new tables (if needed)
sqlite3 ~/.clawdbot/memory/main.sqlite "
DROP TABLE IF EXISTS memories;
DROP TABLE IF EXISTS memories_fts;
"

# 3. Restart Clawdbot
# (your restart command here)
```

---

## Timeline

| Phase | Duration | Dependency |
|-------|----------|------------|
| Phase 0: Backup | 5 min | None |
| Phase 1: Markdown migration | 15 min | Phase 0 |
| Phase 2: Chunks migration | 10 min | Phase 1 |
| Phase 3: Sessions (optional) | 30-60 min | Phase 2 |
| Phase 4: Runtime integration | 30-60 min | Phase 2 |
| Phase 5: Deprecate old | 5 min | Phase 4 validated |

**Total: 1-2 hours** (excluding Phase 3)

---

## Post-Migration Maintenance

### Weekly (Cron):

```bash
# Add to crontab: run maintenance every Sunday at 3 AM
0 3 * * 0 cd ~/.clawdbot/workspace && python3 memory-maintenance.py run >> ~/.clawdbot/logs/memory-maintenance.log 2>&1
```

### Monthly:

- Review `python memory-maintenance.py stats`
- Check for memories stuck at low confidence
- Verify per-guild counts are balanced

### Quarterly:

- Full backup
- Review whether session indexing is needed
- Consider re-embedding if switching embedding models

---

## Files Reference

| File | Purpose |
|------|---------|
| `migrate-memories.py` | One-time migration script |
| `memory-retrieval.py` | Search/add/supersede API |
| `memory-maintenance.py` | Decay/prune/limits |
| `memory_interface.py` | Runtime integration (create in Phase 4) |
| `MEMORY-MIGRATION-PLAN.md` | This document |

---

## Success Criteria

The migration is complete when:

1. ✅ All markdown files have been processed (key facts extracted)
2. ✅ Old chunks are copied to the memories table with embeddings
3. ✅ Search returns relevant results for test queries
4. ✅ Guild scoping works correctly
5. ✅ Clawdbot uses new memory in responses
6. ✅ "Remember this" creates new memories
7. ✅ Weekly maintenance cron is running
8. ✅ Old system files are preserved but not actively used

---

## Questions Before Starting

1. **Do you want to migrate session transcripts?** (Recommended: no, for now)
2. **Which guild should we test first?** (Recommended: Das server - most memories)
3. **When do you want to do the runtime integration?** (Requires a Clawdbot restart)