v2.0: Interactive onboarding, organization system, auto-scaffold, Jake's choices, contacts template
parent cb28c2649f
commit 9d48022c50
433
README.md
@ -1,349 +1,236 @@
|
||||
<![CDATA[# 🧠 Clawdbot Memory System
|
||||
# 🧠 Clawdbot Memory System
|
||||
|
||||
**One-command persistent memory for Clawdbot — never lose context to compaction again.**
|
||||
**Never lose context to compaction again.**
|
||||
|
||||
> "Why does my agent forget everything after a long session?"
|
||||
Your Clawdbot agent forgets everything after a long session because compaction summarizes and discards old messages. This system fixes that with persistent, searchable memory that survives crashes, restarts, and compaction.
|
||||
|
||||
Because Clawdbot compacts old context to stay within its context window. Without a memory system, everything that was compacted is gone. This repo fixes that permanently.
|
||||
> *"My buddies hate when their agent just gets amnesia."* — Jake
|
||||
|
||||
---
|
||||
|
||||
## What This Is
|
||||
|
||||
A **two-layer memory system** for Clawdbot:
|
||||
|
||||
1. **Markdown files** (source of truth) — Daily logs, research intel, project tracking, and durable notes your agent writes to disk
|
||||
2. **SQLite vector search** (retrieval layer) — Semantic search index that lets your agent find relevant memories even when wording differs
|
||||
|
||||
Your agent writes memories to plain Markdown. Those files get indexed into a vector store. When the agent needs context, it searches semantically and finds what it needs — even across sessions, even after compaction.
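As a rough illustration of the "write to plain Markdown" half of that loop, here is a minimal Python sketch that appends a note to a `memory/YYYY-MM-DD.md` daily log. The function name `append_daily_note` and the header line it writes are assumptions made for this example; Clawdbot's agent writes these files itself and may lay them out differently.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def append_daily_note(memory_dir: Path, note: str, today: Optional[date] = None) -> Path:
    """Append one bullet to memory/YYYY-MM-DD.md, creating the file if needed."""
    today = today or date.today()
    memory_dir.mkdir(parents=True, exist_ok=True)
    log_path = memory_dir / f"{today.isoformat()}.md"
    is_new = not log_path.exists()
    with log_path.open("a", encoding="utf-8") as fh:
        if is_new:
            # Hypothetical header layout, chosen only for this sketch.
            fh.write(f"# {today.isoformat()}\n\n")
        fh.write(f"- {note}\n")
    return log_path
```

Because the log is append-only plain text, it can be read, edited, or re-indexed at any time without special tooling.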
|
||||
|
||||
## Quick Install
|
||||
## One-Command Install
|
||||
|
||||
```bash
bash <(curl -sL https://raw.githubusercontent.com/BusyBee3333/clawdbot-memory-system/main/install.sh)
```
|
||||
|
||||
That's it. The installer will:
|
||||
- ✅ Detect your Clawdbot installation
|
||||
- ✅ Create the `memory/` directory with templates
|
||||
- ✅ Patch your `clawdbot.json` with memory search config (without touching anything else)
|
||||
- ✅ Add memory habits to your `AGENTS.md`
|
||||
- ✅ Build the initial vector index
|
||||
- ✅ Verify everything works
|
||||
The installer walks you through an interactive setup (~2 minutes) with these choices:
|
||||
|
||||
### Preview First (Dry Run)
|
||||
| Step | What It Asks | Jake's Choice |
|------|-------------|---------------|
| Core Memory System | Install persistent memory + search? | ⭐ Yes |
| Embedding Provider | OpenAI / Gemini / Local? | ⭐ OpenAI |
| Organization System | Add project scaffolding templates? | ⭐ Yes |
| Auto-Scaffold | Auto-create files for new projects? | ⭐ Yes |
| Git Backup | Add backup instructions for agent? | ⭐ Yes |
|
||||
|
||||
```bash
bash <(curl -sL https://raw.githubusercontent.com/BusyBee3333/clawdbot-memory-system/main/install.sh) --dry-run
```
|
||||
**Flags:**
|
||||
- `--dry-run` — Preview all changes without applying
|
||||
- `--uninstall` — Cleanly remove config (preserves your memory files)
|
||||
|
||||
### Uninstall
|
||||
## What Gets Installed
|
||||
|
||||
```bash
bash <(curl -sL https://raw.githubusercontent.com/BusyBee3333/clawdbot-memory-system/main/install.sh) --uninstall
```
|
||||
### Core Memory System (always installed)
|
||||
|
||||
---
|
||||
| Component | What It Does |
|-----------|-------------|
| `memory/` directory | Where all memory files live |
| Daily log system | Agent writes `memory/YYYY-MM-DD.md` each session |
| SQLite vector index | Semantic search over all memory files (<100ms) |
| Hybrid search | 70% vector similarity + 30% keyword matching |
| Pre-compaction flush | Auto-saves context BEFORE session compacts |
| AGENTS.md patch | Tells your agent when and how to use memory |
| Config patch | Adds memorySearch to clawdbot.json (non-destructive) |
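To make the "70% vector similarity + 30% keyword matching" row concrete, the sketch below blends a cosine similarity with a keyword score using those weights. It is a simplified illustration, not Clawdbot's implementation: `keyword_score` is a plain term-overlap stand-in for real full-text ranking, and every function name here is hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def keyword_score(query, text):
    """Fraction of query terms present in the text (a crude stand-in for BM25)."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def hybrid_score(vector_sim, text_sim, vector_weight=0.7, text_weight=0.3):
    """Weighted blend of the semantic and keyword signals."""
    return vector_weight * vector_sim + text_weight * text_sim


def rank(query, query_vec, chunks):
    """chunks: list of (text, embedding) pairs. Returns texts, best match first."""
    scored = [
        (hybrid_score(cosine_similarity(query_vec, vec), keyword_score(query, text)), text)
        for text, vec in chunks
    ]
    return [text for _, text in sorted(scored, key=lambda pair: pair[0], reverse=True)]
```

The keyword term is what helps exact strings such as IDs or code identifiers surface even when their embeddings are not especially close to the query.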
|
||||
|
||||
### Organization System (optional)
|
||||
|
||||
| Component | What It Does |
|-----------|-------------|
| Project Quickstart | Auto-create structured files for new projects |
| Research Intel System | Weekly intel rotation with auto-compression |
| Project Tracking | Standardized stages, blockers, decisions |
| Contacts Tracker | People, roles, access levels per project |
| Tag Convention | Consistent naming so search groups by project |
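The scaffolding idea can be sketched in a few lines of Python. This is an assumption-laden illustration, not the installer's logic: the `progress` and `research-intel` suffixes follow file names used elsewhere in this README (`memory/acme-progress.md`, `memory/acme-research-intel.md`), while the `contacts` suffix, the placeholder file contents, and the function name `scaffold_project` are hypothetical.

```python
from pathlib import Path
from typing import List

# "progress" and "research-intel" mirror names shown in this README;
# "contacts" is an assumed suffix for the contacts tracker.
SCAFFOLD_SUFFIXES = ("progress", "research-intel", "contacts")


def scaffold_project(memory_dir: Path, project: str) -> List[Path]:
    """Create any missing <project>-<suffix>.md files; never overwrite existing ones."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    slug = project.strip().lower().replace(" ", "-")
    created = []
    for suffix in SCAFFOLD_SUFFIXES:
        path = memory_dir / f"{slug}-{suffix}.md"
        if not path.exists():
            # Placeholder heading only; real templates would carry more structure.
            path.write_text(f"# {project}: {suffix}\n", encoding="utf-8")
            created.append(path)
    return created
```

Sharing one project prefix across files is what lets a search for the project name pull its progress, research, and contacts notes together.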
|
||||
|
||||
## How It Works
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ YOUR AGENT SESSION │
|
||||
│ │
|
||||
│ Agent writes notes ──→ memory/2026-02-10.md │
|
||||
│ Agent stores facts ──→ MEMORY.md │
|
||||
│ │ │
|
||||
│ ▼ │
|
||||
│ ┌──────────────┐ │
|
||||
│ │ File Watcher │ (debounced) │
|
||||
│ └──────┬───────┘ │
|
||||
│ │ │
|
||||
│ ▼ │
|
||||
│ ┌───────────────────────┐ │
|
||||
│ │ Embedding Provider │ │
|
||||
│ │ (OpenAI / Gemini / │ │
|
||||
│ │ Local GGUF) │ │
|
||||
│ └───────────┬───────────┘ │
|
||||
│ │ │
|
||||
│ ▼ │
|
||||
│ ┌───────────────────────┐ │
|
||||
│ │ SQLite + sqlite-vec │ │
|
||||
│ │ Vector Index │ │
|
||||
│ └───────────┬───────────┘ │
|
||||
│ │ │
|
||||
│ Agent asks ──────────┤ │
|
||||
│ "what did we decide │ │
|
||||
│ about the API?" ▼ │
|
||||
│ ┌───────────────────────┐ │
|
||||
│ │ Hybrid Search │ │
|
||||
│ │ (semantic + keyword) │ │
|
||||
│ └───────────┬───────────┘ │
|
||||
│ │ │
|
||||
│ ▼ │
|
||||
│ Relevant memory chunks │
|
||||
│ injected into context │
|
||||
│ YOUR AGENT CHAT SESSION │
|
||||
│ Decisions, preferences, facts, project progress... │
|
||||
└───────────────────────┬─────────────────────────────────┘
|
||||
│
|
||||
▼ (agent writes throughout session)
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ MARKDOWN FILES (Source of Truth) │
|
||||
│ memory/2026-02-10.md (daily log) │
|
||||
│ memory/acme-progress.md (project tracking) │
|
||||
│ memory/acme-research-intel.md (weekly intel) │
|
||||
└───────────────────────┬─────────────────────────────────┘
|
||||
│
|
||||
▼ (file watcher, 1.5s debounce)
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ INDEXING PIPELINE (automatic) │
|
||||
│ 1. Detect file changes │
|
||||
│ 2. Chunk text (~400 tokens, 80 overlap) │
|
||||
│ 3. Generate embeddings (OpenAI/Gemini/local) │
|
||||
│ 4. Store in SQLite with vector + full-text indexes │
|
||||
└───────────────────────┬─────────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ SQLite DATABASE (Search Engine) │
|
||||
│ ~/.clawdbot/memory/main.sqlite │
|
||||
│ • Vector similarity (semantic meaning) │
|
||||
│ • BM25 full-text (exact keywords, IDs, code) │
|
||||
│ • Hybrid: 70% vector + 30% keyword │
|
||||
│ • Search speed: <100ms │
|
||||
└───────────────────────┬─────────────────────────────────┘
|
||||
│
|
||||
▼ (next session starts)
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ AGENT RECALLS CONTEXT │
|
||||
│ memory_search("what did we decide about X?") │
|
||||
│ → Returns relevant snippets + file paths + line nums │
|
||||
│ → Agent answers accurately from persistent memory │
|
||||
└─────────────────────────────────────────────────────────┘
|
||||
```
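The chunking step in the pipeline above ("~400 tokens, 80 overlap") amounts to a sliding window. The sketch below shows one way to do it in Python, assuming the input is already a list of tokens; how Clawdbot actually tokenizes text is not stated here, and `chunk_tokens` is a hypothetical name.

```python
def chunk_tokens(tokens, chunk_size=400, overlap=80):
    """Split a token list into windows of chunk_size that share `overlap` tokens."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # each new chunk starts this many tokens later
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end of the input
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.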
|
||||
|
||||
### Pre-Compaction Flush
|
||||
## Why Your Agent Gets Amnesia (and how this fixes it)
|
||||
|
||||
This is the secret sauce. When your session nears its context limit:
|
||||
**The problem:** When a chat session gets too long, Clawdbot *compacts* it — summarizing old messages and discarding the originals. Important decisions, preferences, and context get lost in the summary. Your agent "forgets."
|
||||
|
||||
```
Session approaching limit
           │
           ▼
┌─────────────────────┐
│ Pre-compaction ping │ ← Clawdbot silently triggers this
│ "Store durable      │
│  memories now"      │
└──────────┬──────────┘
           │
           ▼
Agent writes lasting notes
to memory/YYYY-MM-DD.md
           │
           ▼
Context gets compacted
(old messages removed)
           │
           ▼
BUT memories are on disk
AND indexed for search
           │
           ▼
Agent can find them anytime 🎉
```
|
||||
**The fix:** This system adds two mechanisms:
|
||||
|
||||
---
|
||||
1. **Continuous writing** — Your agent writes important context to `memory/YYYY-MM-DD.md` files *throughout the session*, not just at the end. Decisions, preferences, project state — all captured on disk in real-time.
|
||||
|
||||
## Embedding Provider Options
|
||||
2. **Pre-compaction flush** — When a session is ~4,000 tokens from the compaction threshold, Clawdbot triggers a *silent* reminder. The agent reviews what's in context, writes anything important that hasn't been saved yet, and then compaction proceeds safely. The agent responds `NO_REPLY` so you never see this happen.
|
||||
|
||||
The installer will ask which provider you want:
|
||||
**Result:** Your agent's memory is on disk, indexed, and searchable. Compaction can't touch it. Crashes can't touch it. Restarts can't touch it.
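The flush trigger described above boils down to a threshold check. Here is a minimal Python sketch, assuming a 4,000-token margin as stated; the class name `FlushGuard`, the "fire at most once per compaction cycle" behavior, and the example threshold are assumptions for illustration rather than Clawdbot internals.

```python
class FlushGuard:
    """Signals when the session is within `margin` tokens of the compaction threshold."""

    def __init__(self, compaction_threshold: int, margin: int = 4000):
        self.compaction_threshold = compaction_threshold
        self.margin = margin
        self.flushed = False  # assumed: remind only once per compaction cycle

    def check(self, used_tokens: int) -> bool:
        """Return True exactly once when the flush reminder should fire."""
        if self.flushed:
            return False
        if used_tokens >= self.compaction_threshold - self.margin:
            self.flushed = True
            return True
        return False

    def reset(self) -> None:
        """Call after compaction so the next cycle can trigger again."""
        self.flushed = False
```

With a hypothetical threshold of 100,000 tokens, `check(90_000)` returns `False` and `check(96_000)` returns `True`, leaving the agent room to write notes before compaction runs.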
|
||||
|
||||
## Embedding Providers
|
||||
|
||||
| Provider | Speed | Cost | Setup |
|
||||
|----------|-------|------|-------|
|
||||
| **OpenAI** (recommended) | ⚡ Fast | ~$0.02/million tokens | API key required |
|
||||
| **OpenAI** ⭐ | ⚡ Fast | ~$0.02/million tokens (~$0.50/mo) | API key required |
|
||||
| **Gemini** | ⚡ Fast | Free tier available | API key required |
|
||||
| **Local** | 🐢 Slower first run | Free | Downloads GGUF model (~100MB) |
|
||||
| **Local** | 🐢 Slower first build | Free forever | Auto-downloads ~600MB model |
|
||||
|
||||
**OpenAI** (`text-embedding-3-small`) is recommended for the best experience. It's extremely cheap and fast.
|
||||
**Jake uses OpenAI** — it's the fastest and most reliable. At ~$0.50/month it's basically free.
|
||||
|
||||
**Gemini** (`gemini-embedding-001`) works great and has a generous free tier.
|
||||
**Local** uses `node-llama-cpp` with a GGUF model — fully offline, no API key, but first index build is slower.
|
||||
|
||||
**Local** uses `node-llama-cpp` with a GGUF model — fully offline, no API key needed, but the first index build is slower.
|
||||
## Production Stats (Jake's Setup)
|
||||
|
||||
---
|
||||
| Metric | Value |
|--------|-------|
| Files indexed | 35 |
| Chunks | 121 |
| Search speed | <100ms |
| SQLite size | 15 MB |
| Monthly cost | ~$0.50 (OpenAI) |
| Data loss incidents | **Zero** |
| Crashes survived | 5+ |
| Days in production | 26+ |
|
||||
|
||||
## Manual Setup (Alternative)
|
||||
## Manual Setup
|
||||
|
||||
If you prefer to set things up yourself instead of using the installer:
|
||||
|
||||
### 1. Create the memory directory
|
||||
|
||||
### 1. Create memory directory
|
||||
```bash
mkdir -p ~/.clawdbot/workspace/memory
cd ~/.clawdbot/workspace
mkdir -p memory
```
|
||||
|
||||
### 2. Add memory search config to clawdbot.json
|
||||
|
||||
Open `~/.clawdbot/clawdbot.json` and add `memorySearch` inside `agents.defaults`:
|
||||
|
||||
**For OpenAI:**
|
||||
### 2. Add to clawdbot.json
|
||||
```json
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "enabled": true,
        "provider": "openai",
        "model": "text-embedding-3-small",
        "query": {
          "hybrid": {
            "enabled": true,
            "vectorWeight": 0.7,
            "textWeight": 0.3,
            "candidateMultiplier": 4
          }
        }
      }
    }
  }
}
```
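If you would rather script this step than edit the file by hand, the sketch below shows the general shape of a non-destructive patch in Python. It is not the installer's code (the installer relies on `jq`); `patch_config` is a hypothetical helper, and it assumes that values already present under `memorySearch` should win over the defaults being added.

```python
import copy

MEMORY_SEARCH = {
    "enabled": True,
    "provider": "openai",
    "model": "text-embedding-3-small",
}


def patch_config(config: dict, memory_search: dict) -> dict:
    """Return a copy of config with memorySearch added under agents.defaults.

    Keys outside agents.defaults.memorySearch are left untouched, and any
    memorySearch values the user already set take precedence (shallow merge).
    """
    patched = copy.deepcopy(config)  # never mutate the caller's config
    defaults = patched.setdefault("agents", {}).setdefault("defaults", {})
    existing = defaults.get("memorySearch", {})
    defaults["memorySearch"] = {**memory_search, **existing}
    return patched
```

To apply it, load `clawdbot.json` with `json.load`, keep a backup copy of the original file, and write the patched dictionary back with `json.dump`. Running the patch twice yields the same result, so it is safe to repeat.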
|
||||
|
||||
**For Gemini:**
|
||||
```json
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "provider": "gemini",
        "model": "gemini-embedding-001"
      }
    }
  }
}
```
|
||||
|
||||
**For Local:**
|
||||
```json
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "provider": "local"
      }
    }
  }
}
```
|
||||
|
||||
### 3. Set your API key (if using OpenAI or Gemini)
|
||||
|
||||
For OpenAI, set `OPENAI_API_KEY` in your environment or in `clawdbot.json` under `models.providers.openai.apiKey`.
|
||||
|
||||
For Gemini, set `GEMINI_API_KEY` in your environment or in `clawdbot.json` under `models.providers.google.apiKey`.
|
||||
|
||||
### 4. Build the index
|
||||
### 3. Add to AGENTS.md
|
||||
See `config/agents-memory-patch.md` for the exact text to append.
|
||||
|
||||
### 4. Restart and verify
|
||||
```bash
clawdbot gateway restart
clawdbot memory index --verbose
clawdbot memory status --deep
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### "Memory search disabled"
|
||||
**Cause:** No embedding provider configured or API key missing.
|
||||
**Fix:** Run the installer again, or add your API key to clawdbot.json manually.
|
||||
|
||||
### Agent still forgets after compaction
|
||||
**Cause:** AGENTS.md may not have the memory instructions.
|
||||
**Fix:** Check that AGENTS.md contains the "Memory System" section. Re-run installer if needed.
|
||||
|
||||
### Search returns no results
|
||||
**Possible causes:**
|
||||
1. Index not built — run `clawdbot memory index --verbose`
|
||||
2. No memory files yet — create your first daily log
|
||||
3. Query too specific — try broader terms
|
||||
|
||||
### Rebuild index from scratch
|
||||
```bash
rm ~/.clawdbot/memory/main.sqlite
clawdbot memory index --verbose
```
|
||||
Markdown files are the source of truth — SQLite is always regenerable.
|
||||
|
||||
### 5. Verify
|
||||
|
||||
### Check system health
|
||||
```bash
clawdbot memory status --deep
```
|
||||
|
||||
### 6. Restart the gateway
|
||||
|
||||
```bash
clawdbot gateway restart
```
|
||||
|
||||
---
|
||||
|
||||
## What Gets Indexed
|
||||
|
||||
By default, Clawdbot indexes:
|
||||
- `MEMORY.md` — Long-term curated memory
|
||||
- `memory/*.md` — Daily logs and all memory files
|
||||
|
||||
All files must be Markdown (`.md`). The index watches for changes and re-indexes automatically.
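The automatic re-indexing is debounced (the pipeline diagram mentions a 1.5 s debounce), so a burst of saves triggers one re-index rather than many. The Python sketch below models that behavior on a list of timestamps. It assumes a trailing-edge debounce, where the re-index fires once the file has been quiet for the full window; `debounced_fire_times` is a hypothetical name, not a Clawdbot function.

```python
def debounced_fire_times(event_times, debounce=1.5):
    """Given sorted file-change timestamps (seconds), return when a re-index fires.

    A re-index fires `debounce` seconds after the last change in each burst.
    """
    fire_times = []
    pending = None  # timestamp of the latest change not yet followed by a re-index
    for t in event_times:
        if pending is not None and t - pending >= debounce:
            # The quiet period elapsed before this change arrived.
            fire_times.append(pending + debounce)
        pending = t
    if pending is not None:
        fire_times.append(pending + debounce)
    return fire_times
```

For example, changes at 0.0, 0.5, and 1.0 seconds collapse into a single re-index, while a later change at 5.0 seconds triggers a second one.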
|
||||
|
||||
### Adding Extra Paths
|
||||
|
||||
Want to index files outside the default layout? Add `extraPaths`:
|
||||
|
||||
```json
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "extraPaths": ["../team-docs", "/path/to/other/notes"]
      }
    }
  }
}
```
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### "No API key found for provider openai/google"
|
||||
|
||||
You need to set your embedding API key. Either:
|
||||
- Set the environment variable (`OPENAI_API_KEY` or `GEMINI_API_KEY`)
|
||||
- Or add it to `clawdbot.json` under `models.providers`
|
||||
|
||||
### "Memory search stays disabled"
|
||||
|
||||
Run `clawdbot memory status --deep` to see what's wrong. Common causes:
|
||||
- No embedding provider configured
|
||||
- API key missing or invalid
|
||||
- No `.md` files in `memory/` directory
|
||||
|
||||
### Index not updating
|
||||
|
||||
Run a manual reindex:
|
||||
```bash
clawdbot memory index --force --verbose
```
|
||||
|
||||
### Agent still seems to forget things
|
||||
|
||||
Make sure your `AGENTS.md` includes memory instructions. The agent needs to be told to:
|
||||
1. Search memory before answering questions about prior work
|
||||
2. Write important things to daily logs
|
||||
3. Flush memories before compaction
|
||||
|
||||
The installer handles this automatically.
|
||||
|
||||
### Installer fails with "jq not found"
|
||||
|
||||
The installer needs `jq` for safe JSON patching. Install it:
|
||||
```bash
# macOS
brew install jq

# Ubuntu/Debian
sudo apt-get install jq

# Or download from https://jqlang.github.io/jq/
```
|
||||
|
||||
---
|
||||
|
||||
## FAQ
|
||||
|
||||
### Why does my agent forget everything?
|
||||
**Q: Will this slow down my agent?**
|
||||
A: No. Search takes <100ms. Indexing happens in the background. Writing to files takes milliseconds.
|
||||
|
||||
Clawdbot uses a context window with a token limit. When a session gets long, old messages are **compacted** (summarized and removed) to make room. Without a memory system, the details in those old messages are lost forever.
|
||||
**Q: How much disk space does it use?**
|
||||
A: ~15MB for the SQLite index with 35 files. Markdown files themselves are tiny.
|
||||
|
||||
This memory system solves it by:
|
||||
1. Writing important context to files on disk (survives any compaction)
|
||||
2. Indexing those files for semantic search (agent can find them later)
|
||||
3. Flushing memories right before compaction happens (nothing falls through the cracks)
|
||||
**Q: Can I edit memory files manually?**
|
||||
A: Yes! They're plain Markdown. Edit in any text editor. Changes are auto-indexed.
|
||||
|
||||
### How is this different from just having MEMORY.md?
|
||||
**Q: What if I delete the SQLite database?**
|
||||
A: Just run `clawdbot memory index --verbose` to rebuild it. Markdown files are the source of truth.
|
||||
|
||||
`MEMORY.md` alone is a single file that the agent reads at session start. It works for small amounts of info, but:
|
||||
- It doesn't scale (gets too big to fit in context)
|
||||
- It's not searchable (agent has to read the whole thing)
|
||||
- Daily details get lost (you can't put everything in one file)
|
||||
**Q: Does this work with OpenClaw / Moltbot?**
|
||||
A: Yes. The installer auto-detects all three (Clawdbot, OpenClaw, Moltbot).
|
||||
|
||||
This system adds **daily logs** (unlimited history) + **vector search** (find anything semantically) + **pre-compaction flush** (automatic safety net).
|
||||
**Q: Can multiple agents share the same memory?**
|
||||
A: Each agent gets its own SQLite index, but they can read the same markdown files if they share a workspace.
|
||||
|
||||
### Does this cost money?
|
||||
|
||||
- **Local embeddings**: Free (but slower)
|
||||
- **OpenAI embeddings**: ~$0.02 per million tokens (essentially free for personal use)
|
||||
- **Gemini embeddings**: Free tier available
|
||||
|
||||
For reference, indexing 100 daily logs costs about $0.001 with OpenAI.
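That figure can be checked with simple arithmetic. The sketch below assumes an average of about 500 tokens per daily log, which is an illustrative guess rather than a measured value, and uses the ~$0.02 per million tokens price quoted above.

```python
def embedding_cost_usd(total_tokens: int, usd_per_million_tokens: float = 0.02) -> float:
    """Cost of embedding total_tokens at a flat per-million-token price."""
    return total_tokens / 1_000_000 * usd_per_million_tokens


logs = 100
tokens_per_log = 500  # assumption: average size of one daily log
cost = embedding_cost_usd(logs * tokens_per_log)
print(f"{logs} logs x {tokens_per_log} tokens costs about ${cost:.4f}")
```

Under those assumptions the total is 50,000 tokens, or roughly a tenth of a cent. Longer logs scale the cost linearly.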
|
||||
|
||||
### Can I use this with multiple agents?
|
||||
|
||||
Yes. Each agent uses the same workspace `memory/` directory by default. You can scope with `--agent <id>` for commands.
|
||||
|
||||
### Is my data sent to the cloud?
|
||||
|
||||
Only if you use remote embeddings (OpenAI/Gemini). In that case the text of each chunk is sent to the provider so it can generate embedding vectors; the memory files and the SQLite index stay on your disk. If you want full privacy, use `local` embeddings, where everything stays on your machine.
|
||||
|
||||
### Can I run the installer multiple times?
|
||||
|
||||
Yes! It's idempotent. It checks for existing files and config before making changes, and backs up your config before patching.
|
||||
|
||||
---
|
||||
|
||||
## Architecture
|
||||
|
||||
See [ARCHITECTURE.md](ARCHITECTURE.md) for detailed diagrams.
|
||||
|
||||
## Migrating from Another Setup
|
||||
|
||||
See [MIGRATION.md](MIGRATION.md) for step-by-step migration guides.
|
||||
**Q: Is my data sent to the cloud?**
|
||||
A: With a remote provider (OpenAI/Gemini), text chunks are sent to that provider to generate embeddings; the memory files themselves stay on your disk. Use the `local` provider for fully offline operation.
|
||||
|
||||
## License
|
||||
|
||||
MIT — see [LICENSE](LICENSE)
|
||||
MIT — use it however you want.
|
||||
|
||||
---
|
||||
## Credits
|
||||
|
||||
**Built for the Clawdbot community** by people who got tired of explaining things to their agent twice.
|
||||
]]>
|
||||
Built by [Buba](https://github.com/BusyBee3333) (Jake's Clawdbot agent) based on 26+ days of production use.
|
||||
|
||||
*Your agent will never forget again.* ᕕ( ᐛ )ᕗ
|
||||
|
||||
743
install.sh
File diff suppressed because it is too large
22
templates/TEMPLATE-contacts.md
Normal file
@ -0,0 +1,22 @@
|
||||
# {Project Name} — Contacts

## Team / Key People

### {Person Name}
- **Role:**
- **Phone:**
- **Email:**
- **Platform:** (Discord/Slack/iMessage/etc.)
- **Access Level:** (full / chat-only / view-only)
- **Notes:**

### {Person Name}
- **Role:**
- **Phone:**
- **Email:**
- **Platform:**
- **Access Level:**
- **Notes:**

## Communication Log
- **{Date}:** {what was communicated, decisions made}
|
||||