v2.0: Interactive onboarding, organization system, auto-scaffold, Jake's choices, contacts template

This commit is contained in:
Jake Shore 2026-02-10 14:27:03 -05:00
parent cb28c2649f
commit 9d48022c50
3 changed files with 624 additions and 586 deletions

README.md

@@ -1,349 +1,236 @@
<![CDATA[# 🧠 Clawdbot Memory System

**Never lose context to compaction again.**

Your Clawdbot agent forgets everything after a long session because compaction summarizes and discards old messages. This system fixes that with persistent, searchable memory that survives crashes, restarts, and compaction.

> *"My buddies hate when their agent just gets amnesia."* — Jake

## One-Command Install

```bash
bash <(curl -sL https://raw.githubusercontent.com/BusyBee3333/clawdbot-memory-system/main/install.sh)
```
The installer walks you through an interactive setup (~2 minutes) with these choices:
| Step | What It Asks | Jake's Choice |
|------|--------------|---------------|
| Core Memory System | Install persistent memory + search? | ⭐ Yes |
| Embedding Provider | OpenAI / Gemini / Local? | ⭐ OpenAI |
| Organization System | Add project scaffolding templates? | ⭐ Yes |
| Auto-Scaffold | Auto-create files for new projects? | ⭐ Yes |
| Git Backup | Add backup instructions for agent? | ⭐ Yes |

**Flags:**

- `--dry-run` — Preview all changes without applying
- `--uninstall` — Cleanly remove config (preserves your memory files)
## What Gets Installed

### Core Memory System (always installed)

| Component | What It Does |
|-----------|--------------|
| `memory/` directory | Where all memory files live |
| Daily log system | Agent writes `memory/YYYY-MM-DD.md` each session |
| SQLite vector index | Semantic search over all memory files (<100ms) |
| Hybrid search | 70% vector similarity + 30% keyword matching |
| Pre-compaction flush | Auto-saves context BEFORE the session compacts |
| AGENTS.md patch | Tells your agent when and how to use memory |
| Config patch | Adds `memorySearch` to `clawdbot.json` (non-destructive) |

### Organization System (optional)

| Component | What It Does |
|-----------|--------------|
| Project Quickstart | Auto-creates structured files for new projects |
| Research Intel System | Weekly intel rotation with auto-compression |
| Project Tracking | Standardized stages, blockers, decisions |
| Contacts Tracker | People, roles, access levels per project |
| Tag Convention | Consistent naming so search groups by project |
## How It Works

```
┌─────────────────────────────────────────────────────────┐
│ YOUR AGENT CHAT SESSION                                 │
│ Decisions, preferences, facts, project progress...      │
└───────────────────────┬─────────────────────────────────┘
                        ▼ (agent writes throughout session)
┌─────────────────────────────────────────────────────────┐
│ MARKDOWN FILES (Source of Truth)                        │
│ memory/2026-02-10.md          (daily log)               │
│ memory/acme-progress.md       (project tracking)        │
│ memory/acme-research-intel.md (weekly intel)            │
└───────────────────────┬─────────────────────────────────┘
                        ▼ (file watcher, 1.5s debounce)
┌─────────────────────────────────────────────────────────┐
│ INDEXING PIPELINE (automatic)                           │
│ 1. Detect file changes                                  │
│ 2. Chunk text (~400 tokens, 80 overlap)                 │
│ 3. Generate embeddings (OpenAI/Gemini/local)            │
│ 4. Store in SQLite with vector + full-text indexes      │
└───────────────────────┬─────────────────────────────────┘
                        ▼
┌─────────────────────────────────────────────────────────┐
│ SQLite DATABASE (Search Engine)                         │
│ ~/.clawdbot/memory/main.sqlite                          │
│ • Vector similarity (semantic meaning)                  │
│ • BM25 full-text (exact keywords, IDs, code)            │
│ • Hybrid: 70% vector + 30% keyword                      │
│ • Search speed: <100ms                                  │
└───────────────────────┬─────────────────────────────────┘
                        ▼ (next session starts)
┌─────────────────────────────────────────────────────────┐
│ AGENT RECALLS CONTEXT                                   │
│ memory_search("what did we decide about X?")            │
│ → Returns relevant snippets + file paths + line nums    │
│ → Agent answers accurately from persistent memory       │
└─────────────────────────────────────────────────────────┘
```
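The chunking step in the pipeline above (~400-token windows with 80 tokens of overlap) can be sketched like this. It is an illustrative Python version that uses a plain token list as a stand-in for the real tokenizer:

```python
# Sketch of sliding-window chunking with overlap (sizes from the pipeline
# above; the real indexer tokenizes text rather than taking a list).

def chunk(tokens: list, size: int = 400, overlap: int = 80) -> list:
    """Split `tokens` into windows of `size`, each sharing `overlap`
    tokens with the previous window so a fact that straddles a chunk
    boundary is never lost to the index."""
    step = size - overlap  # advance 320 tokens per window
    return [tokens[i:i + size]
            for i in range(0, max(len(tokens) - overlap, 1), step)]

doc = [f"tok{i}" for i in range(1000)]
chunks = chunk(doc)  # 3 windows, starting at tokens 0, 320, and 640
```

The overlap is what makes retrieval robust: a decision recorded across a window boundary appears whole in at least one chunk.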
## Why Your Agent Gets Amnesia (and how this fixes it)

**The problem:** When a chat session gets too long, Clawdbot *compacts* it — summarizing old messages and discarding the originals. Important decisions, preferences, and context get lost in the summary. Your agent "forgets."

**The fix:** This system adds two mechanisms:

1. **Continuous writing** — Your agent writes important context to `memory/YYYY-MM-DD.md` files *throughout the session*, not just at the end. Decisions, preferences, project state — all captured on disk in real time.
2. **Pre-compaction flush** — When a session is ~4,000 tokens from the compaction threshold, Clawdbot triggers a *silent* reminder. The agent reviews what's in context, writes anything important that hasn't been saved yet, and then compaction proceeds safely. The agent responds `NO_REPLY`, so you never see this happen.

**Result:** Your agent's memory is on disk, indexed, and searchable. Compaction can't touch it. Crashes can't touch it. Restarts can't touch it.
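The flush trigger described above reduces to a simple headroom check. Here is a sketch; the function and threshold value are hypothetical, and only the ~4,000-token margin comes from this README:

```python
FLUSH_MARGIN = 4_000  # tokens of headroom before compaction, per the README

def should_flush(tokens_used: int, compaction_threshold: int,
                 margin: int = FLUSH_MARGIN) -> bool:
    """True once the session is within `margin` tokens of compacting:
    the moment the agent should write unsaved context to the daily log."""
    return tokens_used >= compaction_threshold - margin
```

With a hypothetical 180,000-token threshold, the silent reminder would fire once usage crosses 176,000 tokens, leaving the agent room to write its notes before compaction runs.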
## Embedding Providers

| Provider | Speed | Cost | Setup |
|----------|-------|------|-------|
| **OpenAI** ⭐ | ⚡ Fast | ~$0.02/million tokens (~$0.50/mo) | API key required |
| **Gemini** | ⚡ Fast | Free tier available | API key required |
| **Local** | 🐢 Slower first build | Free forever | Auto-downloads ~600MB model |

**Jake uses OpenAI** (`text-embedding-3-small`) — it's the fastest and most reliable. At ~$0.50/month it's basically free.

**Gemini** (`gemini-embedding-001`) works well and has a generous free tier.
**Local** uses `node-llama-cpp` with a GGUF model — fully offline, no API key, but the first index build is slower.

## Production Stats (Jake's Setup)

| Metric | Value |
|--------|-------|
| Files indexed | 35 |
| Chunks | 121 |
| Search speed | <100ms |
| SQLite size | 15 MB |
| Monthly cost | ~$0.50 (OpenAI) |
| Data loss incidents | **Zero** |
| Crashes survived | 5+ |
| Days in production | 26+ |
## Manual Setup

If you prefer to set things up yourself instead of using the installer:

### 1. Create the memory directory

```bash
cd ~/.clawdbot/workspace
mkdir -p memory
```
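Daily logs inside this directory follow the `memory/YYYY-MM-DD.md` naming used throughout this README. A small sketch of deriving today's path (a hypothetical helper, not part of the installer):

```python
from datetime import date
from pathlib import Path

def daily_log_path(workspace: Path, day: date = None) -> Path:
    """Return the daily-log file for `day` (default: today),
    e.g. memory/2026-02-10.md inside the given workspace."""
    day = day or date.today()
    # date.isoformat() already produces the YYYY-MM-DD form.
    return workspace / "memory" / f"{day.isoformat()}.md"
```

For example, `daily_log_path(Path.home() / ".clawdbot" / "workspace")` yields today's log under `~/.clawdbot/workspace/memory/`.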
### 2. Add to clawdbot.json

Open `~/.clawdbot/clawdbot.json` and add `memorySearch` inside `agents.defaults` (OpenAI shown; use `gemini` or `local` as the provider if you prefer):

```json
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "enabled": true,
        "provider": "openai",
        "model": "text-embedding-3-small",
        "query": {
          "hybrid": {
            "enabled": true,
            "vectorWeight": 0.7,
            "textWeight": 0.3,
            "candidateMultiplier": 4
          }
        }
      }
    }
  }
}
```

For OpenAI, set `OPENAI_API_KEY` in your environment or in `clawdbot.json` under `models.providers.openai.apiKey`; for Gemini, set `GEMINI_API_KEY` (or `models.providers.google.apiKey`).
### 3. Add to AGENTS.md

See `config/agents-memory-patch.md` for the exact text to append.
### 4. Restart and verify

```bash
clawdbot gateway restart
clawdbot memory index --verbose
clawdbot memory status --deep
```
## Troubleshooting

### "Memory search disabled"

**Cause:** No embedding provider configured, or API key missing.
**Fix:** Run the installer again, or add your API key to `clawdbot.json` manually.

### Agent still forgets after compaction

**Cause:** `AGENTS.md` may not have the memory instructions.
**Fix:** Check that `AGENTS.md` contains the "Memory System" section. Re-run the installer if needed.

### Search returns no results

Possible causes:

1. Index not built — run `clawdbot memory index --verbose`
2. No memory files yet — create your first daily log
3. Query too specific — try broader terms

### Rebuild index from scratch

```bash
rm ~/.clawdbot/memory/main.sqlite
clawdbot memory index --verbose
```

Markdown files are the source of truth — the SQLite index is always regenerable.
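Because the markdown layer is authoritative, a rebuild only has to rediscover the index inputs (`MEMORY.md` plus the `.md` files under `memory/`) and re-embed them. An illustrative sketch of that discovery step, using a hypothetical helper name:

```python
from pathlib import Path

def files_to_index(workspace: Path) -> list:
    """Collect the default index inputs: MEMORY.md plus every markdown
    file under memory/. Missing paths are simply skipped, which is why
    deleting the SQLite database is always safe to do."""
    candidates = [workspace / "MEMORY.md",
                  *sorted((workspace / "memory").glob("*.md"))]
    return [p for p in candidates if p.is_file()]
```

Everything the index needs is recoverable from these files, so the rebuild is just "walk, chunk, embed, insert."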
### Check system health

```bash
clawdbot memory status --deep
```
## FAQ

**Q: Will this slow down my agent?**
A: No. Search takes <100ms, indexing happens in the background, and writing to files takes milliseconds.

**Q: How much disk space does it use?**
A: ~15 MB for the SQLite index with 35 files. The markdown files themselves are tiny.

**Q: Can I edit memory files manually?**
A: Yes! They're plain Markdown. Edit them in any text editor; changes are auto-indexed.

**Q: What if I delete the SQLite database?**
A: Just run `clawdbot memory index --verbose` to rebuild it. Markdown files are the source of truth.

**Q: Does this work with OpenClaw / Moltbot?**
A: Yes. The installer auto-detects all three (Clawdbot, OpenClaw, Moltbot).

**Q: Can multiple agents share the same memory?**
A: Each agent gets its own SQLite index, but they can read the same markdown files if they share a workspace.

**Q: Is my data sent to the cloud?**
A: Only the text chunks are sent out to generate embeddings (OpenAI/Gemini); the memory files themselves stay on your disk. Use the `local` provider for fully offline operation.
## License

MIT — use it however you want.

## Credits

Built by [Buba](https://github.com/BusyBee3333) (Jake's Clawdbot agent) based on 26+ days of production use.

*Your agent will never forget again.* ᕕ( ᐛ )ᕗ
]]>

File diff suppressed because it is too large


@@ -0,0 +1,22 @@
# {Project Name} — Contacts
## Team / Key People
### {Person Name}
- **Role:**
- **Phone:**
- **Email:**
- **Platform:** (Discord/Slack/iMessage/etc.)
- **Access Level:** (full / chat-only / view-only)
- **Notes:**
### {Person Name}
- **Role:**
- **Phone:**
- **Email:**
- **Platform:**
- **Access Level:**
- **Notes:**
## Communication Log
- **{Date}:** {what was communicated, decisions made}