# Clawdbot Memory System Migration Plan

**Created:** 2026-01-27
**Status:** Ready to Execute
**Risk Level:** Low (old system preserved, incremental migration)
## Current State Inventory

| Asset | Location | Size | Records |
|---|---|---|---|
| Main SQLite | `~/.clawdbot/memory/main.sqlite` | 9.0 MB | 56 chunks |
| iMessage SQLite | `~/.clawdbot/memory/imessage.sqlite` | 8.1 MB | ~42 chunks |
| Markdown files | `~/.clawdbot/workspace/memory/*.md` | ~60 KB total | 17 files |
| INDEX.json | `~/.clawdbot/workspace/memory/INDEX.json` | 7.1 KB | 6 categories, 20 nodes |
| Session transcripts | `~/.clawdbot/agents/*/sessions/*.jsonl` | 23 files | 5,593 lines |
| New `memories` table | `~/.clawdbot/memory/main.sqlite` | - | 36 records (migrated) |
## Migration Phases

### Phase 0: Backup Everything (REQUIRED FIRST)

**Time:** 5 minutes · **Risk:** None
```bash
# Create timestamped backup directory
BACKUP_DIR=~/.clawdbot/backups/pre-migration-$(date +%Y%m%d-%H%M%S)
mkdir -p "$BACKUP_DIR"

# Backup SQLite databases
cp ~/.clawdbot/memory/main.sqlite "$BACKUP_DIR/"
cp ~/.clawdbot/memory/imessage.sqlite "$BACKUP_DIR/"

# Backup markdown memory files
cp -r ~/.clawdbot/workspace/memory "$BACKUP_DIR/memory-markdown"

# Backup session transcripts
cp -r ~/.clawdbot/agents "$BACKUP_DIR/agents"

# Backup config
cp ~/.clawdbot/clawdbot.json "$BACKUP_DIR/"

# Verify backup
echo "Backup created at: $BACKUP_DIR"
ls -la "$BACKUP_DIR"
```
**Checkpoint:** Verify the backup directory has all files before proceeding.
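The checkpoint can be scripted rather than eyeballed. A minimal sketch of a `verify_backup` helper (a hypothetical name, not part of the existing tooling) that checks for the five items the backup script copies:

```shell
# Sketch: spot-check that a backup directory contains the expected items.
# Usage after Phase 0:  verify_backup "$BACKUP_DIR"
verify_backup() {
  dir="$1"
  status=0
  for item in main.sqlite imessage.sqlite memory-markdown agents clawdbot.json; do
    if [ -e "$dir/$item" ]; then
      echo "OK: $item"
    else
      echo "MISSING: $item" >&2
      status=1
    fi
  done
  return $status
}
```

A nonzero exit means something didn't get copied; stop and re-run the backup before moving on.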
### Phase 1: Complete Markdown Migration

**Time:** 15 minutes · **Risk:** Low (additive only)

We already migrated CRITICAL-REFERENCE.md, Genre Universe, and Remix Sniper. Now migrate the remaining files.
**Files to Migrate:**

| File | Content Type | Priority |
|---|---|---|
| `2026-01-14.md` | Daily log - GOG setup | Medium |
| `2026-01-15.md` | Daily log - agent-browser, Reonomy | High |
| `2026-01-25.md` | Security incident - Reed breach | High |
| `2026-01-26.md` | Daily log - Reonomy v13 | Medium |
| `2026-01-19-backup-system.md` | Backup system setup | Medium |
| `2026-01-19-cloud-backup.md` | Cloud backup config | Medium |
| `burton-method-research-intel.md` | Competitor research | High |
| `contacts-leaf-gc.md` | Contact info | Medium |
| `contacts-skivals-gc.md` | Contact info | Medium |
| `imessage-rules.md` | Security rules | High |
| `imessage-security-rules.md` | Security rules | High |
| `remi-self-healing.md` | Remix Sniper healing | Medium |
| `voice-ai-comparison-2026.md` | Research | Low |
| `accounts.md` | Accounts | Low |
Migration Script Extension:
# Add to migrate-memories.py
def migrate_daily_logs(db):
"""Migrate daily log files."""
memories = []
# 2026-01-14 - GOG setup
memories.append((
"GOG (Google Workspace CLI) configured with 3 accounts: jake@burtonmethod.com, jake@localbosses.org, jakeshore98@gmail.com",
"fact", None, "2026-01-14.md"
))
# 2026-01-15 - agent-browser
memories.append((
"agent-browser is Vercel Labs headless browser CLI with ref-based navigation, semantic locators, state persistence. Commands: open, snapshot -i, click @ref, type @ref 'text'",
"fact", None, "2026-01-15.md"
))
memories.append((
"Reonomy scraper attempted with agent-browser. URL pattern discovered: ownership tab in search filters allows searching by Owner Contact Information.",
"fact", None, "2026-01-15.md"
))
# 2026-01-25 - Security incident
memories.append((
"SECURITY INCIDENT 2026-01-25: Reed breach. Contact memory poisoning. Password leaked. Rules updated. Rotate all passwords after breach.",
"security", None, "2026-01-25.md"
))
# ... continue for all files
for content, mtype, guild_id, source in memories:
insert_memory(db, content, mtype, source, guild_id)
return len(memories)
def migrate_security_rules(db):
"""Migrate iMessage security rules."""
memories = [
("iMessage password gating: Password JAJAJA2026 required. Mention gating (Buba). Never reveal password in any context.", "security", None),
("iMessage trust chain: Only trust Jake (914-500-9208). Everyone else must verify with Jake first, then chat-only mode with password.", "security", None),
]
for content, mtype, guild_id in memories:
insert_memory(db, content, mtype, "imessage-security-rules.md", guild_id)
return len(memories)
def migrate_contacts(db):
"""Migrate contact information (non-sensitive parts only)."""
memories = [
("Contact: Leaf GC - group chat contact for Leaf-related communications", "relationship", None),
("Contact: Skivals GC - group chat contact for Skivals-related communications", "relationship", None),
]
for content, mtype, guild_id in memories:
insert_memory(db, content, mtype, "contacts.md", guild_id)
return len(memories)
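The functions above call `insert_memory`, which lives in migrate-memories.py and is not shown here. A minimal sketch of what it might look like, assuming the `memories` columns used in the Phase 2 SQL (this is an illustration of the call shape, not the actual implementation; embeddings are left NULL to be backfilled later):

```python
import sqlite3
import time

def insert_memory(db, content, memory_type, source_file, guild_id,
                  source="markdown_migration", confidence=1.0):
    """Insert one memory row. `db` is an open sqlite3 connection.

    Embedding is intentionally left NULL here; a later pass can backfill it.
    """
    db.execute(
        """INSERT INTO memories
           (content, memory_type, source, source_file, guild_id, created_at, confidence)
           VALUES (?, ?, ?, ?, ?, ?, ?)""",
        (content, memory_type, source, source_file, guild_id,
         int(time.time()), confidence),
    )
    db.commit()
```

The parameter order matches how the migration functions call it: content, type, source file, then guild id.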
**Checkpoint:** Run `python memory-retrieval.py stats` and verify the count increased.
### Phase 2: Migrate Existing Chunks (Vector Embeddings)

**Time:** 10 minutes · **Risk:** Low (copies data, doesn't delete)

The existing `chunks` table holds 56 pre-embedded chunks. Copy them into the `memories` table to preserve the embeddings.
```sql
-- Copy chunks to memories (preserving embeddings)
INSERT INTO memories (
    content,
    embedding,
    memory_type,
    source,
    source_file,
    created_at,
    confidence
)
SELECT
    text AS content,
    embedding,
    'fact' AS memory_type,
    'chunks_migration' AS source,
    path AS source_file,
    COALESCE(updated_at, unixepoch()) AS created_at,
    1.0 AS confidence
FROM chunks
WHERE NOT EXISTS (
    SELECT 1 FROM memories m
    WHERE m.source_file = chunks.path
    AND m.source = 'chunks_migration'
);
```
**Checkpoint:** Verify with `SELECT COUNT(*) FROM memories WHERE source = 'chunks_migration';`
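If you'd rather drive the copy from Python than the `sqlite3` shell, a sketch wrapping the same statement (it substitutes `strftime('%s','now')` for `unixepoch()` so it also works on SQLite builds older than 3.38, and the `NOT EXISTS` guard makes it safe to re-run):

```python
import sqlite3

# Same copy as the SQL above; strftime('%s','now') is the pre-3.38
# spelling of unixepoch().
MIGRATION_SQL = """
INSERT INTO memories (content, embedding, memory_type, source,
                      source_file, created_at, confidence)
SELECT text, embedding, 'fact', 'chunks_migration', path,
       COALESCE(updated_at, strftime('%s', 'now')), 1.0
FROM chunks
WHERE NOT EXISTS (
    SELECT 1 FROM memories m
    WHERE m.source_file = chunks.path AND m.source = 'chunks_migration'
);
"""

def migrate_chunks(db_path):
    """Run the idempotent chunks -> memories copy; return rows copied."""
    db = sqlite3.connect(db_path)
    before = db.execute("SELECT COUNT(*) FROM memories").fetchone()[0]
    db.execute(MIGRATION_SQL)
    db.commit()
    after = db.execute("SELECT COUNT(*) FROM memories").fetchone()[0]
    db.close()
    return after - before
```

A second invocation should return 0, which doubles as the checkpoint.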
### Phase 3: Session Transcript Indexing (Optional - Later)

**Time:** 30-60 minutes · **Risk:** Medium (large data volume)

Session transcripts contain conversation history. This is valuable but voluminous.

**Strategy: Selective Indexing**

Don't index every message. Index only:

- Messages where Clawdbot learned something (contains "I'll remember", "noted", "got it")
- User corrections ("actually it's...", "no, the correct...")
- Explicit requests ("remember that...", "don't forget...")
```python
import glob
import json
import os
import re

MEMORABLE_PATTERNS = [
    r"I'll remember",
    r"I've noted",
    r"remember that",
    r"don't forget",
    r"actually it's",
    r"the correct",
    r"important:",
    r"key point:",
]
MEMORABLE_RE = re.compile("|".join(MEMORABLE_PATTERNS), re.IGNORECASE)


def extract_memorable_from_sessions():
    """Extract memorable moments from session transcripts."""
    session_files = glob.glob(os.path.expanduser(
        "~/.clawdbot/agents/*/sessions/*.jsonl"
    ))
    memories = []
    for fpath in session_files:
        with open(fpath) as f:
            for line in f:
                try:
                    entry = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip malformed lines
                # "text" is assumed to be the transcript's message field;
                # adjust to the actual JSONL schema.
                text = entry.get("text", "")
                if MEMORABLE_RE.search(text):
                    memories.append((text, fpath))
    return memories
```
**Recommendation:** Skip this for now. The markdown files already contain the curated, important material; sessions serve as a backup and audit trail.
### Phase 4: Wire Into Clawdbot Runtime

**Time:** 30-60 minutes · **Risk:** Medium (changes bot behavior)

This phase requires modifying Clawdbot's code to use the new memory system.

#### 4.1 Create Memory Interface Module

**Location:** `~/.clawdbot/workspace/memory_interface.py`
"""
Memory interface for Clawdbot runtime.
Import this in your bot's message handler.
"""
from memory_retrieval import (
search_memories,
add_memory,
get_recent_memories,
supersede_memory
)
def get_context_for_message(message, guild_id, channel_id, user_id):
"""
Get relevant memory context for responding to a message.
Call this before generating a response.
"""
# Search for relevant memories
results = search_memories(
query=message,
guild_id=guild_id,
limit=5
)
if not results:
# Fall back to recent memories for this guild
results = get_recent_memories(guild_id=guild_id, limit=3)
# Format for context injection
context_lines = []
for r in results:
context_lines.append(f"[Memory] {r['content']}")
return "\n".join(context_lines)
def should_remember(response_text):
"""
Check if the bot's response indicates something should be remembered.
"""
triggers = [
"i'll remember",
"i've noted",
"got it",
"noted",
"understood",
]
lower = response_text.lower()
return any(t in lower for t in triggers)
def extract_and_store(message, response, guild_id, channel_id, user_id):
"""
If the response indicates learning, extract and store the memory.
"""
if not should_remember(response):
return None
# The message itself is what should be remembered
memory_id = add_memory(
content=message,
memory_type="fact",
guild_id=guild_id,
channel_id=channel_id,
user_id=user_id,
source="conversation"
)
return memory_id
#### 4.2 Integration Points

In Clawdbot's message handler:

```python
# Before generating the response:
memory_context = get_context_for_message(
    message=user_message,
    guild_id=str(message.guild.id) if message.guild else None,
    channel_id=str(message.channel.id),
    user_id=str(message.author.id),
)

# Inject into the prompt:
system_prompt = f"""
{base_system_prompt}

Relevant memories:
{memory_context}
"""

# After generating the response:
extract_and_store(
    message=user_message,
    response=bot_response,
    guild_id=...,
    channel_id=...,
    user_id=...,
)
```
### Phase 5: Deprecate Old System

**Time:** 5 minutes · **Risk:** Low (keep the files, just stop using them)

Once the new system is validated:

- **Keep old files** - Don't delete the markdown files; they're a human-readable backup
- **Stop writing to old locations** - New memories go to SQLite only
- **Archive the old chunks table** - Rename it to `chunks_archive`:

```sql
-- Archive old chunks table (don't delete)
ALTER TABLE chunks RENAME TO chunks_archive;
ALTER TABLE chunks_fts RENAME TO chunks_fts_archive;
```

**DO NOT** delete the old files until you've run the new system for at least 2 weeks without issues.
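To confirm the rename actually took effect, you can list table names from `sqlite_master`. A small sketch (the `table_names` helper is illustrative, not part of the existing scripts):

```python
import sqlite3

def table_names(db_path):
    """Return the set of table names in a SQLite database."""
    db = sqlite3.connect(db_path)
    rows = db.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    db.close()
    return {name for (name,) in rows}

# Expected after Phase 5: 'chunks_archive' present, 'chunks' absent, e.g.
#   names = table_names(os.path.expanduser("~/.clawdbot/memory/main.sqlite"))
#   assert "chunks_archive" in names and "chunks" not in names
```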
## Validation Checkpoints

**After each phase:**

| Check | Command | Expected |
|---|---|---|
| Memory count | `python memory-retrieval.py stats` | Count increases |
| Search works | `python memory-retrieval.py search "Das"` | Returns results |
| Guild scoping | `python memory-retrieval.py search "remix" --guild 1449158500344270961` | Only The Hive results |
| FTS works | `sqlite3 ~/.clawdbot/memory/main.sqlite "SELECT COUNT(*) FROM memories_fts"` | Matches memories count |
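The FTS check in the table compares two counts by hand; it can be automated with a few lines. A sketch (assuming the `memories` and `memories_fts` table names above):

```python
import sqlite3

def fts_in_sync(db_path):
    """True if memories_fts has the same row count as memories."""
    db = sqlite3.connect(db_path)
    memories = db.execute("SELECT COUNT(*) FROM memories").fetchone()[0]
    fts = db.execute("SELECT COUNT(*) FROM memories_fts").fetchone()[0]
    db.close()
    return memories == fts
```

Run it after each phase; a mismatch usually means the FTS triggers (or a manual rebuild) were skipped.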
**Integration Test (After Phase 4):**

1. Send a message to Clawdbot: "What do you know about Das?"
2. Verify the response includes Genre Universe info
3. Send a message: "Remember that Das prefers releasing on Fridays"
4. Search with `python memory-retrieval.py search "Das Friday"` and verify the new memory exists
## Rollback Plan

If anything goes wrong:

```bash
# 1. Restore from backup
BACKUP_DIR=~/.clawdbot/backups/pre-migration-YYYYMMDD-HHMMSS

# Restore databases
cp "$BACKUP_DIR/main.sqlite" ~/.clawdbot/memory/
cp "$BACKUP_DIR/imessage.sqlite" ~/.clawdbot/memory/

# Restore markdown (if needed)
cp -r "$BACKUP_DIR/memory-markdown/"* ~/.clawdbot/workspace/memory/

# 2. Drop new tables (if needed)
sqlite3 ~/.clawdbot/memory/main.sqlite "
DROP TABLE IF EXISTS memories;
DROP TABLE IF EXISTS memories_fts;
"

# 3. Restart Clawdbot
# (your restart command here)
```
## Timeline
| Phase | Duration | Dependency |
|---|---|---|
| Phase 0: Backup | 5 min | None |
| Phase 1: Markdown migration | 15 min | Phase 0 |
| Phase 2: Chunks migration | 10 min | Phase 1 |
| Phase 3: Sessions (optional) | 30-60 min | Phase 2 |
| Phase 4: Runtime integration | 30-60 min | Phase 2 |
| Phase 5: Deprecate old | 5 min | Phase 4 validated |
**Total: 1-2 hours (excluding Phase 3)**
## Post-Migration Maintenance

**Weekly (cron):**

```bash
# Add to crontab
0 3 * * 0 cd ~/.clawdbot/workspace && python3 memory-maintenance.py run >> ~/.clawdbot/logs/memory-maintenance.log 2>&1
```
**Monthly:**

- Review `python memory-maintenance.py stats`
- Check for memories stuck at low confidence
- Verify per-guild counts are balanced

**Quarterly:**

- Full backup
- Review whether session indexing is needed
- Consider re-embedding if switching embedding models
## Files Reference

| File | Purpose |
|---|---|
| `migrate-memories.py` | One-time migration script |
| `memory-retrieval.py` | Search/add/supersede API |
| `memory-maintenance.py` | Decay/prune/limits |
| `memory_interface.py` | Runtime integration (create in Phase 4) |
| `MEMORY-MIGRATION-PLAN.md` | This document |
## Success Criteria

The migration is complete when:
- ✅ All markdown files have been processed (key facts extracted)
- ✅ Old chunks are copied to memories table with embeddings
- ✅ Search returns relevant results for test queries
- ✅ Guild scoping works correctly
- ✅ Clawdbot uses new memory in responses
- ✅ "Remember this" creates new memories
- ✅ Weekly maintenance cron is running
- ✅ Old system files are preserved but not actively used
## Questions Before Starting
- Do you want to migrate session transcripts? (Recommended: No, for now)
- Which guild should we test first? (Recommended: Das server - most memories)
- When do you want to do the runtime integration? (Requires Clawdbot restart)