v2.0: Interactive onboarding, organization system, auto-scaffold, Jake's choices, contacts template

2026-02-10 14:27:03 -05:00 · 2026-02-10 14:27:03 -05:00 · 9d48022c50
commit 9d48022c50
parent cb28c2649f
3 changed files with 624 additions and 586 deletions
--- a/README.md
+++ b/README.md
@@ -1,349 +1,236 @@
-<![CDATA[# 🧠 Clawdbot Memory System
+# 🧠 Clawdbot Memory System

-**One-command persistent memory for Clawdbot — never lose context to compaction again.**
+**Never lose context to compaction again.**

-> "Why does my agent forget everything after a long session?"
+Your Clawdbot agent forgets everything after a long session because compaction summarizes and discards old messages. This system fixes that with persistent, searchable memory that survives crashes, restarts, and compaction.

-Because Clawdbot compacts old context to stay within its context window. Without a memory system, everything that was compacted is gone. This repo fixes that permanently.
+> *"My buddies hate when their agent just gets amnesia."* — Jake

---
-
-## What This Is
-
-A **two-layer memory system** for Clawdbot:
-
-1. **Markdown files** (source of truth) — Daily logs, research intel, project tracking, and durable notes your agent writes to disk
-2. **SQLite vector search** (retrieval layer) — Semantic search index that lets your agent find relevant memories even when wording differs
-
-Your agent writes memories to plain Markdown. Those files get indexed into a vector store. When the agent needs context, it searches semantically and finds what it needs — even across sessions, even after compaction.
-
-## Quick Install
+## One-Command Install

 ```bash
 bash <(curl -sL https://raw.githubusercontent.com/BusyBee3333/clawdbot-memory-system/main/install.sh)
 ```

-That's it. The installer will:
- ✅ Detect your Clawdbot installation
- ✅ Create the `memory/` directory with templates
- ✅ Patch your `clawdbot.json` with memory search config (without touching anything else)
- ✅ Add memory habits to your `AGENTS.md`
- ✅ Build the initial vector index
- ✅ Verify everything works
+The installer walks you through an interactive setup (~2 minutes) with these choices:

-### Preview First (Dry Run)
+| Step | What It Asks | Jake's Choice |
+|------|-------------|---------------|
+| Core Memory System | Install persistent memory + search? | ⭐ Yes |
+| Embedding Provider | OpenAI / Gemini / Local? | ⭐ OpenAI |
+| Organization System | Add project scaffolding templates? | ⭐ Yes |
+| Auto-Scaffold | Auto-create files for new projects? | ⭐ Yes |
+| Git Backup | Add backup instructions for agent? | ⭐ Yes |

-```bash
-bash <(curl -sL https://raw.githubusercontent.com/BusyBee3333/clawdbot-memory-system/main/install.sh) --dry-run
-```
+**Flags:**
+- `--dry-run` — Preview all changes without applying
+- `--uninstall` — Cleanly remove config (preserves your memory files)

-### Uninstall
+## What Gets Installed

-```bash
-bash <(curl -sL https://raw.githubusercontent.com/BusyBee3333/clawdbot-memory-system/main/install.sh) --uninstall
-```
+### Core Memory System (always installed)

---
+| Component | What It Does |
+|-----------|-------------|
+| `memory/` directory | Where all memory files live |
+| Daily log system | Agent writes `memory/YYYY-MM-DD.md` each session |
+| SQLite vector index | Semantic search over all memory files (<100ms) |
+| Hybrid search | 70% vector similarity + 30% keyword matching |
+| Pre-compaction flush | Auto-saves context BEFORE session compacts |
+| AGENTS.md patch | Tells your agent when and how to use memory |
+| Config patch | Adds memorySearch to clawdbot.json (non-destructive) |
+
+### Organization System (optional)
+
+| Component | What It Does |
+|-----------|-------------|
+| Project Quickstart | Auto-create structured files for new projects |
+| Research Intel System | Weekly intel rotation with auto-compression |
+| Project Tracking | Standardized stages, blockers, decisions |
+| Contacts Tracker | People, roles, access levels per project |
+| Tag Convention | Consistent naming so search groups by project |

 ## How It Works

 ```
 ┌─────────────────────────────────────────────────────────┐
-│                    YOUR AGENT SESSION                     │
-│                                                           │
-│  Agent writes notes ──→ memory/2026-02-10.md             │
-│  Agent stores facts ──→ MEMORY.md                        │
-│                          │                                │
-│                          ▼                                │
-│                   ┌──────────────┐                        │
-│                   │  File Watcher │ (debounced)           │
-│                   └──────┬───────┘                        │
-│                          │                                │
-│                          ▼                                │
-│              ┌───────────────────────┐                    │
-│              │   Embedding Provider   │                   │
-│              │  (OpenAI / Gemini /    │                   │
-│              │   Local GGUF)          │                   │
-│              └───────────┬───────────┘                    │
-│                          │                                │
-│                          ▼                                │
-│              ┌───────────────────────┐                    │
-│              │   SQLite + sqlite-vec  │                   │
-│              │   Vector Index          │                  │
-│              └───────────┬───────────┘                    │
-│                          │                                │
-│     Agent asks ──────────┤                                │
-│     "what did we decide  │                                │
-│      about the API?"     ▼                                │
-│              ┌───────────────────────┐                    │
-│              │   Hybrid Search        │                   │
-│              │   (semantic + keyword)  │                  │
-│              └───────────┬───────────┘                    │
-│                          │                                │
-│                          ▼                                │
-│              Relevant memory chunks                       │
-│              injected into context                        │
+│                YOUR AGENT CHAT SESSION                  │
+│  Decisions, preferences, facts, project progress...     │
+└───────────────────────┬─────────────────────────────────┘
+                        │
+                        ▼ (agent writes throughout session)
+┌─────────────────────────────────────────────────────────┐
+│              MARKDOWN FILES (Source of Truth)            │
+│  memory/2026-02-10.md          (daily log)              │
+│  memory/acme-progress.md       (project tracking)       │
+│  memory/acme-research-intel.md (weekly intel)            │
+└───────────────────────┬─────────────────────────────────┘
+                        │
+                        ▼ (file watcher, 1.5s debounce)
+┌─────────────────────────────────────────────────────────┐
+│              INDEXING PIPELINE (automatic)               │
+│  1. Detect file changes                                  │
+│  2. Chunk text (~400 tokens, 80 overlap)                 │
+│  3. Generate embeddings (OpenAI/Gemini/local)            │
+│  4. Store in SQLite with vector + full-text indexes      │
+└───────────────────────┬─────────────────────────────────┘
+                        │
+                        ▼
+┌─────────────────────────────────────────────────────────┐
+│          SQLite DATABASE (Search Engine)                 │
+│  ~/.clawdbot/memory/main.sqlite                         │
+│  • Vector similarity (semantic meaning)                  │
+│  • BM25 full-text (exact keywords, IDs, code)           │
+│  • Hybrid: 70% vector + 30% keyword                     │
+│  • Search speed: <100ms                                  │
+└───────────────────────┬─────────────────────────────────┘
+                        │
+                        ▼ (next session starts)
+┌─────────────────────────────────────────────────────────┐
+│              AGENT RECALLS CONTEXT                       │
+│  memory_search("what did we decide about X?")           │
+│  → Returns relevant snippets + file paths + line nums    │
+│  → Agent answers accurately from persistent memory       │
 └─────────────────────────────────────────────────────────┘
 ```
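
The chunking step in the indexing pipeline (~400 tokens with an 80-token overlap) can be pictured as a sliding window. This is a minimal sketch, not Clawdbot's actual implementation: `chunk_tokens` is a hypothetical helper, and plain list items stand in for real tokenizer tokens.

```python
def chunk_tokens(tokens, size=400, overlap=80):
    """Split a token list into windows of `size` that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap  # each new window starts this many tokens later
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reached the end of the input
    return chunks


# Assumption: whitespace-split words approximate tokens for this demo.
words = ("we decided to ship the API on Friday " * 125).split()
chunks = chunk_tokens(words)
print(len(words), "words ->", len(chunks), "chunks")
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which is the usual reason for overlapping windows in retrieval pipelines.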

-### Pre-Compaction Flush
+## Why Your Agent Gets Amnesia (and how this fixes it)

-This is the secret sauce. When your session nears its context limit:
+**The problem:** When a chat session gets too long, Clawdbot *compacts* it — summarizing old messages and discarding the originals. Important decisions, preferences, and context get lost in the summary. Your agent "forgets."

-```
-Session approaching limit
-         │
-         ▼
-┌─────────────────────┐
-│  Pre-compaction ping  │  ← Clawdbot silently triggers this
-│  "Store durable       │
-│   memories now"       │
-└──────────┬────────────┘
-           │
-           ▼
-   Agent writes lasting notes
-   to memory/YYYY-MM-DD.md
-           │
-           ▼
-   Context gets compacted
-   (old messages removed)
-           │
-           ▼
-   BUT memories are on disk
-   AND indexed for search
-           │
-           ▼
-   Agent can find them anytime 🎉
-```
+**The fix:** This system adds two mechanisms:

---
+1. **Continuous writing** — Your agent writes important context to `memory/YYYY-MM-DD.md` files *throughout the session*, not just at the end. Decisions, preferences, project state — all captured on disk in real-time.

-## Embedding Provider Options
+2. **Pre-compaction flush** — When a session is ~4,000 tokens from the compaction threshold, Clawdbot triggers a *silent* reminder. The agent reviews what's in context, writes anything important that hasn't been saved yet, and then compaction proceeds safely. The agent responds `NO_REPLY` so you never see this happen.

-The installer will ask which provider you want:
+**Result:** Your agent's memory is on disk, indexed, and searchable. Compaction can't touch it. Crashes can't touch it. Restarts can't touch it.
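
The flush trigger described above amounts to a threshold check. The sketch below is illustrative only: `should_flush`, its parameter names, and the 180,000-token compaction point are assumptions for the example, not Clawdbot's API or defaults. Only the ~4,000-token margin comes from the text.

```python
def should_flush(tokens_used, compaction_threshold, margin=4000):
    """Return True once the session is within `margin` tokens of compaction."""
    return tokens_used >= compaction_threshold - margin


# Hypothetical session that compacts at 180,000 tokens.
for used in (170_000, 176_500):
    state = "flush now" if should_flush(used, 180_000) else "keep going"
    print(f"{used} tokens used: {state}")
```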
+
+## Embedding Providers

 | Provider | Speed | Cost | Setup |
 |----------|-------|------|-------|
-| **OpenAI** (recommended) | ⚡ Fast | ~$0.02/million tokens | API key required |
+| **OpenAI** ⭐ | ⚡ Fast | ~$0.02/million tokens (~$0.50/mo) | API key required |
 | **Gemini** | ⚡ Fast | Free tier available | API key required |
-| **Local** | 🐢 Slower first run | Free | Downloads GGUF model (~100MB) |
+| **Local** | 🐢 Slower first build | Free forever | Auto-downloads ~600MB model |

-**OpenAI** (`text-embedding-3-small`) is recommended for the best experience. It's extremely cheap and fast.
+**Jake uses OpenAI** — it's the fastest and most reliable. At ~$0.50/month it's basically free.

-**Gemini** (`gemini-embedding-001`) works great and has a generous free tier.
+**Local** uses `node-llama-cpp` with a GGUF model — fully offline, no API key, but first index build is slower.

-**Local** uses `node-llama-cpp` with a GGUF model — fully offline, no API key needed, but the first index build is slower.
+## Production Stats (Jake's Setup)

---
+| Metric | Value |
+|--------|-------|
+| Files indexed | 35 |
+| Chunks | 121 |
+| Search speed | <100ms |
+| SQLite size | 15 MB |
+| Monthly cost | ~$0.50 (OpenAI) |
+| Data loss incidents | **Zero** |
+| Crashes survived | 5+ |
+| Days in production | 26+ |
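
For a rough sense of scale, the embedding cost can be estimated from the figures quoted in this README: ~$0.02 per million tokens, ~400 tokens per chunk, and 121 chunks. The helper name is hypothetical, and real usage also includes re-indexing and query embeddings, so treat this as a back-of-the-envelope sketch.

```python
PRICE_PER_MILLION_TOKENS = 0.02  # USD, the OpenAI rate quoted in the provider table


def embedding_cost(chunks, tokens_per_chunk=400, price_per_million=PRICE_PER_MILLION_TOKENS):
    """Estimate the USD cost of embedding `chunks` chunks once."""
    return chunks * tokens_per_chunk / 1_000_000 * price_per_million


full_index = embedding_cost(121)  # one full rebuild of a 121-chunk index
tokens_for_50_cents = 0.50 / PRICE_PER_MILLION_TOKENS * 1_000_000
print(f"one full index build: ${full_index:.6f}")
print(f"$0.50 covers about {tokens_for_50_cents:,.0f} tokens")
```

Under these assumptions a single full rebuild costs roughly a tenth of a cent, so the ~$0.50/month figure leaves ample headroom for repeated re-indexing and search queries.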

-## Manual Setup (Alternative)
+## Manual Setup

 If you prefer to set things up yourself instead of using the installer:

-### 1. Create the memory directory
-
+### 1. Create memory directory
 ```bash
-mkdir -p ~/.clawdbot/workspace/memory
+cd ~/.clawdbot/workspace
+mkdir -p memory
 ```

-### 2. Add memory search config to clawdbot.json
-
-Open `~/.clawdbot/clawdbot.json` and add `memorySearch` inside `agents.defaults`:
-
-**For OpenAI:**
+### 2. Add to clawdbot.json
 ```json
 {
  "agents": {
    "defaults": {
      "memorySearch": {
+        "enabled": true,
        "provider": "openai",
-        "model": "text-embedding-3-small"
+        "model": "text-embedding-3-small",
+        "query": {
+          "hybrid": {
+            "enabled": true,
+            "vectorWeight": 0.7,
+            "textWeight": 0.3,
+            "candidateMultiplier": 4
+          }
+        }
      }
    }
  }
 }
 ```
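
To see what `vectorWeight` and `textWeight` do, here is a minimal sketch of a weighted hybrid score. It assumes both scores are already normalized to the range 0 to 1. The function names and sample candidates are hypothetical, and Clawdbot's real scoring may differ in its details.

```python
def hybrid_score(vector_score, text_score, vector_weight=0.7, text_weight=0.3):
    """Weighted sum of a vector-similarity score and a keyword score, both in [0, 1]."""
    return vector_weight * vector_score + text_weight * text_score


def rank(candidates, limit=5):
    """Sort (chunk_id, vector_score, text_score) tuples by hybrid score, best first."""
    scored = [(cid, hybrid_score(v, t)) for cid, v, t in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Hypothetical candidates: (chunk id, vector score, keyword score).
candidates = [
    ("memory/2026-02-10.md#3", 0.82, 0.10),     # close in meaning, few shared words
    ("memory/acme-progress.md#1", 0.40, 0.95),  # exact keyword hit, weaker semantics
    ("memory/2026-02-08.md#7", 0.30, 0.20),
]
results = rank(candidates, limit=2)
for chunk_id, score in results:
    print(f"{score:.3f}  {chunk_id}")
```

With a 70/30 split, a strong semantic match can outrank an exact keyword hit, while the keyword score still lifts chunks that contain literal IDs or code.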

-**For Gemini:**
-```json
-{
-  "agents": {
-    "defaults": {
-      "memorySearch": {
-        "provider": "gemini",
-        "model": "gemini-embedding-001"
-      }
-    }
-  }
-}
-```
-
-**For Local:**
-```json
-{
-  "agents": {
-    "defaults": {
-      "memorySearch": {
-        "provider": "local"
-      }
-    }
-  }
-}
-```
-
-### 3. Set your API key (if using OpenAI or Gemini)
-
-For OpenAI, set `OPENAI_API_KEY` in your environment or in `clawdbot.json` under `models.providers.openai.apiKey`.
-
-For Gemini, set `GEMINI_API_KEY` in your environment or in `clawdbot.json` under `models.providers.google.apiKey`.
-
-### 4. Build the index
+### 3. Add to AGENTS.md
+See `config/agents-memory-patch.md` for the exact text to append.

+### 4. Restart and verify
 ```bash
+clawdbot gateway restart
+clawdbot memory index --verbose
+clawdbot memory status --deep
+```
+
+## Troubleshooting
+
+### "Memory search disabled"
+**Cause:** No embedding provider configured or API key missing.
+**Fix:** Run the installer again, or add your API key to clawdbot.json manually.
+
+### Agent still forgets after compaction
+**Cause:** AGENTS.md may not have the memory instructions.
+**Fix:** Check that AGENTS.md contains the "Memory System" section. Re-run installer if needed.
+
+### Search returns no results
+**Possible causes:**
+1. Index not built — run `clawdbot memory index --verbose`
+2. No memory files yet — create your first daily log
+3. Query too specific — try broader terms
+
+### Rebuild index from scratch
+```bash
+rm ~/.clawdbot/memory/main.sqlite
 clawdbot memory index --verbose
 ```
+Markdown files are the source of truth, so the SQLite index can always be rebuilt from them.

-### 5. Verify
-
+### Check system health
 ```bash
 clawdbot memory status --deep
 ```

-### 6. Restart the gateway
-
-```bash
-clawdbot gateway restart
-```
-
---
-
-## What Gets Indexed
-
-By default, Clawdbot indexes:
- `MEMORY.md` — Long-term curated memory
- `memory/*.md` — Daily logs and all memory files
-
-All files must be Markdown (`.md`). The index watches for changes and re-indexes automatically.
-
-### Adding Extra Paths
-
-Want to index files outside the default layout? Add `extraPaths`:
-
-```json
-{
-  "agents": {
-    "defaults": {
-      "memorySearch": {
-        "extraPaths": ["../team-docs", "/path/to/other/notes"]
-      }
-    }
-  }
-}
-```
-
---
-
-## Troubleshooting
-
-### "No API key found for provider openai/google"
-
-You need to set your embedding API key. Either:
- Set the environment variable (`OPENAI_API_KEY` or `GEMINI_API_KEY`)
- Or add it to `clawdbot.json` under `models.providers`
-
-### "Memory search stays disabled"
-
-Run `clawdbot memory status --deep` to see what's wrong. Common causes:
- No embedding provider configured
- API key missing or invalid
- No `.md` files in `memory/` directory
-
-### Index not updating
-
-Run a manual reindex:
-```bash
-clawdbot memory index --force --verbose
-```
-
-### Agent still seems to forget things
-
-Make sure your `AGENTS.md` includes memory instructions. The agent needs to be told to:
-1. Search memory before answering questions about prior work
-2. Write important things to daily logs
-3. Flush memories before compaction
-
-The installer handles this automatically.
-
-### Installer fails with "jq not found"
-
-The installer needs `jq` for safe JSON patching. Install it:
-```bash
-# macOS
-brew install jq
-
-# Ubuntu/Debian
-sudo apt-get install jq
-
-# Or download from https://jqlang.github.io/jq/
-```
-
---
-
 ## FAQ

-### Why does my agent forget everything?
+**Q: Will this slow down my agent?**
+A: No. Search takes <100ms. Indexing happens in the background. Writing to files takes milliseconds.

-Clawdbot uses a context window with a token limit. When a session gets long, old messages are **compacted** (summarized and removed) to make room. Without a memory system, the details in those old messages are lost forever.
+**Q: How much disk space does it use?**
+A: ~15MB for the SQLite index with 35 files. Markdown files themselves are tiny.

-This memory system solves it by:
-1. Writing important context to files on disk (survives any compaction)
-2. Indexing those files for semantic search (agent can find them later)
-3. Flushing memories right before compaction happens (nothing falls through the cracks)
+**Q: Can I edit memory files manually?**
+A: Yes! They're plain Markdown. Edit in any text editor. Changes are auto-indexed.

-### How is this different from just having MEMORY.md?
+**Q: What if I delete the SQLite database?**
+A: Just run `clawdbot memory index --verbose` to rebuild it. Markdown files are the source of truth.

-`MEMORY.md` alone is a single file that the agent reads at session start. It works for small amounts of info, but:
- It doesn't scale (gets too big to fit in context)
- It's not searchable (agent has to read the whole thing)
- Daily details get lost (you can't put everything in one file)
+**Q: Does this work with OpenClaw / Moltbot?**
+A: Yes. The installer auto-detects all three (Clawdbot, OpenClaw, Moltbot).

-This system adds **daily logs** (unlimited history) + **vector search** (find anything semantically) + **pre-compaction flush** (automatic safety net).
+**Q: Can multiple agents share the same memory?**
+A: Each agent gets its own SQLite index, but they can read the same Markdown files if they share a workspace.

-### Does this cost money?
-
- **Local embeddings**: Free (but slower)
- **OpenAI embeddings**: ~$0.02 per million tokens (essentially free for personal use)
- **Gemini embeddings**: Free tier available
-
-For reference, indexing 100 daily logs costs about $0.001 with OpenAI.
-
-### Can I use this with multiple agents?
-
-Yes. Each agent uses the same workspace `memory/` directory by default. You can scope with `--agent <id>` for commands.
-
-### Is my data sent to the cloud?
-
-Only if you use remote embeddings (OpenAI/Gemini). The embedding vectors are generated from your text, but they can't be reversed back to the original text. If you want full privacy, use `local` embeddings — everything stays on your machine.
-
-### Can I run the installer multiple times?
-
-Yes! It's idempotent. It checks for existing files and config before making changes, and backs up your config before patching.
-
---
-
-## Architecture
-
-See [ARCHITECTURE.md](ARCHITECTURE.md) for detailed diagrams.
-
-## Migrating from Another Setup
-
-See [MIGRATION.md](MIGRATION.md) for step-by-step migration guides.
+**Q: Is my data sent to the cloud?**
+A: With OpenAI or Gemini, the text of each chunk is sent to the provider to generate embeddings; the memory files and the SQLite index themselves stay on your disk. Use the `local` provider for fully offline operation.

 ## License

-MIT — see [LICENSE](LICENSE)
+MIT — use it however you want.

---
+## Credits

-**Built for the Clawdbot community** by people who got tired of explaining things to their agent twice.
-]]>
+Built by [Buba](https://github.com/BusyBee3333) (Jake's Clawdbot agent) based on 26+ days of production use.
+
+*Your agent will never forget again.* ᕕ( ᐛ )ᕗ
--- a/install.sh
+++ b/install.sh
--- a/templates/TEMPLATE-contacts.md
+++ b/templates/TEMPLATE-contacts.md
@@ -0,0 +1,22 @@
+# {Project Name} — Contacts
+
+## Team / Key People
+
+### {Person Name}
+- **Role:** 
+- **Phone:** 
+- **Email:** 
+- **Platform:** (Discord/Slack/iMessage/etc.)
+- **Access Level:** (full / chat-only / view-only)
+- **Notes:** 
+
+### {Person Name}
+- **Role:** 
+- **Phone:** 
+- **Email:** 
+- **Platform:** 
+- **Access Level:** 
+- **Notes:** 
+
+## Communication Log
+- **{Date}:** {what was communicated, decisions made}