v2.0: Interactive onboarding, organization system, auto-scaffold, Jake's choices, contacts template

This commit is contained in:
Jake Shore 2026-02-10 14:27:03 -05:00
parent cb28c2649f
commit 9d48022c50
3 changed files with 624 additions and 586 deletions

README.md

@@ -1,349 +1,236 @@
<![CDATA[# 🧠 Clawdbot Memory System

**Never lose context to compaction again.**

Your Clawdbot agent forgets everything after a long session because compaction summarizes and discards old messages. This system fixes that with persistent, searchable memory that survives crashes, restarts, and compaction.

> *"My buddies hate when their agent just gets amnesia."* — Jake

## One-Command Install

```bash
bash <(curl -sL https://raw.githubusercontent.com/BusyBee3333/clawdbot-memory-system/main/install.sh)
```
The installer walks you through an interactive setup (~2 minutes) with these choices:
| Step | What It Asks | Jake's Choice |
|------|--------------|---------------|
| Core Memory System | Install persistent memory + search? | ⭐ Yes |
| Embedding Provider | OpenAI / Gemini / Local? | ⭐ OpenAI |
| Organization System | Add project scaffolding templates? | ⭐ Yes |
| Auto-Scaffold | Auto-create files for new projects? | ⭐ Yes |
| Git Backup | Add backup instructions for agent? | ⭐ Yes |

**Flags:**

- `--dry-run` — Preview all changes without applying
- `--uninstall` — Cleanly remove config (preserves your memory files)
## What Gets Installed

### Core Memory System (always installed)

| Component | What It Does |
|-----------|--------------|
| `memory/` directory | Where all memory files live |
| Daily log system | Agent writes `memory/YYYY-MM-DD.md` each session |
| SQLite vector index | Semantic search over all memory files (<100ms) |
| Hybrid search | 70% vector similarity + 30% keyword matching |
| Pre-compaction flush | Auto-saves context BEFORE the session compacts |
| AGENTS.md patch | Tells your agent when and how to use memory |
| Config patch | Adds `memorySearch` to `clawdbot.json` (non-destructive) |

### Organization System (optional)

| Component | What It Does |
|-----------|--------------|
| Project Quickstart | Auto-creates structured files for new projects |
| Research Intel System | Weekly intel rotation with auto-compression |
| Project Tracking | Standardized stages, blockers, decisions |
| Contacts Tracker | People, roles, access levels per project |
| Tag Convention | Consistent naming so search groups by project |
## How It Works

```
┌─────────────────────────────────────────────────────────┐
│ YOUR AGENT CHAT SESSION                                 │
│ Decisions, preferences, facts, project progress...      │
└───────────────────────┬─────────────────────────────────┘
                        ▼ (agent writes throughout session)
┌─────────────────────────────────────────────────────────┐
│ MARKDOWN FILES (Source of Truth)                        │
│ memory/2026-02-10.md          (daily log)               │
│ memory/acme-progress.md       (project tracking)        │
│ memory/acme-research-intel.md (weekly intel)            │
└───────────────────────┬─────────────────────────────────┘
                        ▼ (file watcher, 1.5s debounce)
┌─────────────────────────────────────────────────────────┐
│ INDEXING PIPELINE (automatic)                           │
│ 1. Detect file changes                                  │
│ 2. Chunk text (~400 tokens, 80 overlap)                 │
│ 3. Generate embeddings (OpenAI/Gemini/local)            │
│ 4. Store in SQLite with vector + full-text indexes      │
└───────────────────────┬─────────────────────────────────┘
                        ▼
┌─────────────────────────────────────────────────────────┐
│ SQLite DATABASE (Search Engine)                         │
│ ~/.clawdbot/memory/main.sqlite                          │
│ • Vector similarity (semantic meaning)                  │
│ • BM25 full-text (exact keywords, IDs, code)            │
│ • Hybrid: 70% vector + 30% keyword                      │
│ • Search speed: <100ms                                  │
└───────────────────────┬─────────────────────────────────┘
                        ▼ (next session starts)
┌─────────────────────────────────────────────────────────┐
│ AGENT RECALLS CONTEXT                                   │
│ memory_search("what did we decide about X?")            │
│ → Returns relevant snippets + file paths + line nums    │
│ → Agent answers accurately from persistent memory       │
└─────────────────────────────────────────────────────────┘
```
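The chunking step in the pipeline above (~400-token windows with 80 tokens of overlap) can be sketched like this. It is an illustrative Python version that uses a plain token list as a stand-in for the real tokenizer:

```python
# Sketch of sliding-window chunking with overlap (sizes from the pipeline
# above; the real indexer tokenizes text rather than taking a list).

def chunk(tokens: list, size: int = 400, overlap: int = 80) -> list:
    """Split `tokens` into windows of `size`, each sharing `overlap`
    tokens with the previous window so a fact that straddles a chunk
    boundary is never lost to the index."""
    step = size - overlap  # advance 320 tokens per window
    return [tokens[i:i + size]
            for i in range(0, max(len(tokens) - overlap, 1), step)]

doc = [f"tok{i}" for i in range(1000)]
chunks = chunk(doc)  # 3 windows, starting at tokens 0, 320, and 640
```

The overlap is what makes retrieval robust: a decision recorded across a window boundary appears whole in at least one chunk.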
## Why Your Agent Gets Amnesia (and how this fixes it)

**The problem:** When a chat session gets too long, Clawdbot *compacts* it — summarizing old messages and discarding the originals. Important decisions, preferences, and context get lost in the summary. Your agent "forgets."

**The fix:** This system adds two mechanisms:

1. **Continuous writing** — Your agent writes important context to `memory/YYYY-MM-DD.md` files *throughout the session*, not just at the end. Decisions, preferences, project state — all captured on disk in real time.
2. **Pre-compaction flush** — When a session is ~4,000 tokens from the compaction threshold, Clawdbot triggers a *silent* reminder. The agent reviews what's in context, writes anything important that hasn't been saved yet, and then compaction proceeds safely. The agent responds `NO_REPLY`, so you never see this happen.

**Result:** Your agent's memory is on disk, indexed, and searchable. Compaction can't touch it. Crashes can't touch it. Restarts can't touch it.
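The flush trigger described above reduces to a simple headroom check. Here is a sketch; the function and threshold value are hypothetical, and only the ~4,000-token margin comes from this README:

```python
FLUSH_MARGIN = 4_000  # tokens of headroom before compaction, per the README

def should_flush(tokens_used: int, compaction_threshold: int,
                 margin: int = FLUSH_MARGIN) -> bool:
    """True once the session is within `margin` tokens of compacting:
    the moment the agent should write unsaved context to the daily log."""
    return tokens_used >= compaction_threshold - margin
```

With a hypothetical 180,000-token threshold, the silent reminder would fire once usage crosses 176,000 tokens, leaving the agent room to write its notes before compaction runs.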
## Embedding Providers

| Provider | Speed | Cost | Setup |
|----------|-------|------|-------|
| **OpenAI** ⭐ | ⚡ Fast | ~$0.02/million tokens (~$0.50/mo) | API key required |
| **Gemini** | ⚡ Fast | Free tier available | API key required |
| **Local** | 🐢 Slower first build | Free forever | Auto-downloads ~600MB model |

**Jake uses OpenAI** (`text-embedding-3-small`) — it's the fastest and most reliable. At ~$0.50/month it's basically free.

**Gemini** (`gemini-embedding-001`) works well and has a generous free tier.
**Local** uses `node-llama-cpp` with a GGUF model — fully offline, no API key, but the first index build is slower.

## Production Stats (Jake's Setup)

| Metric | Value |
|--------|-------|
| Files indexed | 35 |
| Chunks | 121 |
| Search speed | <100ms |
| SQLite size | 15 MB |
| Monthly cost | ~$0.50 (OpenAI) |
| Data loss incidents | **Zero** |
| Crashes survived | 5+ |
| Days in production | 26+ |
## Manual Setup

If you prefer to set things up yourself instead of using the installer:

### 1. Create the memory directory

```bash
cd ~/.clawdbot/workspace
mkdir -p memory
```
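Daily logs inside this directory follow the `memory/YYYY-MM-DD.md` naming used throughout this README. A small sketch of deriving today's path (a hypothetical helper, not part of the installer):

```python
from datetime import date
from pathlib import Path

def daily_log_path(workspace: Path, day: date = None) -> Path:
    """Return the daily-log file for `day` (default: today),
    e.g. memory/2026-02-10.md inside the given workspace."""
    day = day or date.today()
    # date.isoformat() already produces the YYYY-MM-DD form.
    return workspace / "memory" / f"{day.isoformat()}.md"
```

For example, `daily_log_path(Path.home() / ".clawdbot" / "workspace")` yields today's log under `~/.clawdbot/workspace/memory/`.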
### 2. Add to clawdbot.json

Open `~/.clawdbot/clawdbot.json` and add `memorySearch` inside `agents.defaults` (OpenAI shown; use `gemini` or `local` as the provider if you prefer):

```json
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "enabled": true,
        "provider": "openai",
        "model": "text-embedding-3-small",
        "query": {
          "hybrid": {
            "enabled": true,
            "vectorWeight": 0.7,
            "textWeight": 0.3,
            "candidateMultiplier": 4
          }
        }
      }
    }
  }
}
```

For OpenAI, set `OPENAI_API_KEY` in your environment or in `clawdbot.json` under `models.providers.openai.apiKey`; for Gemini, set `GEMINI_API_KEY` (or `models.providers.google.apiKey`).
### 3. Add to AGENTS.md

See `config/agents-memory-patch.md` for the exact text to append.
### 4. Restart and verify

```bash
clawdbot gateway restart
clawdbot memory index --verbose
clawdbot memory status --deep
```
## Troubleshooting

### "Memory search disabled"

**Cause:** No embedding provider configured, or API key missing.
**Fix:** Run the installer again, or add your API key to `clawdbot.json` manually.

### Agent still forgets after compaction

**Cause:** `AGENTS.md` may not have the memory instructions.
**Fix:** Check that `AGENTS.md` contains the "Memory System" section. Re-run the installer if needed.

### Search returns no results

Possible causes:

1. Index not built — run `clawdbot memory index --verbose`
2. No memory files yet — create your first daily log
3. Query too specific — try broader terms

### Rebuild index from scratch

```bash
rm ~/.clawdbot/memory/main.sqlite
clawdbot memory index --verbose
```

Markdown files are the source of truth — the SQLite index is always regenerable.
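Because the markdown layer is authoritative, a rebuild only has to rediscover the index inputs (`MEMORY.md` plus the `.md` files under `memory/`) and re-embed them. An illustrative sketch of that discovery step, using a hypothetical helper name:

```python
from pathlib import Path

def files_to_index(workspace: Path) -> list:
    """Collect the default index inputs: MEMORY.md plus every markdown
    file under memory/. Missing paths are simply skipped, which is why
    deleting the SQLite database is always safe to do."""
    candidates = [workspace / "MEMORY.md",
                  *sorted((workspace / "memory").glob("*.md"))]
    return [p for p in candidates if p.is_file()]
```

Everything the index needs is recoverable from these files, so the rebuild is just "walk, chunk, embed, insert."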
### Check system health

```bash
clawdbot memory status --deep
```
## FAQ

**Q: Will this slow down my agent?**
A: No. Search takes <100ms, indexing happens in the background, and writing to files takes milliseconds.

**Q: How much disk space does it use?**
A: ~15 MB for the SQLite index with 35 files. The markdown files themselves are tiny.

**Q: Can I edit memory files manually?**
A: Yes! They're plain Markdown. Edit them in any text editor; changes are auto-indexed.

**Q: What if I delete the SQLite database?**
A: Just run `clawdbot memory index --verbose` to rebuild it. Markdown files are the source of truth.

**Q: Does this work with OpenClaw / Moltbot?**
A: Yes. The installer auto-detects all three (Clawdbot, OpenClaw, Moltbot).

**Q: Can multiple agents share the same memory?**
A: Each agent gets its own SQLite index, but they can read the same markdown files if they share a workspace.

**Q: Is my data sent to the cloud?**
A: Only the text chunks are sent out to generate embeddings (OpenAI/Gemini); the memory files themselves stay on your disk. Use the `local` provider for fully offline operation.
## License

MIT — use it however you want.

## Credits

Built by [Buba](https://github.com/BusyBee3333) (Jake's Clawdbot agent) based on 26+ days of production use.

*Your agent will never forget again.* ᕕ( ᐛ )ᕗ
]]>

File diff suppressed because it is too large


@@ -0,0 +1,22 @@
# {Project Name} — Contacts
## Team / Key People
### {Person Name}
- **Role:**
- **Phone:**
- **Email:**
- **Platform:** (Discord/Slack/iMessage/etc.)
- **Access Level:** (full / chat-only / view-only)
- **Notes:**
### {Person Name}
- **Role:**
- **Phone:**
- **Email:**
- **Platform:**
- **Access Level:**
- **Notes:**
## Communication Log
- **{Date}:** {what was communicated, decisions made}