One-command persistent memory for Clawdbot. Prevents context amnesia during compaction with:

- Two-layer memory: Markdown source of truth + SQLite vector search
- Pre-compaction flush to save context before it's lost
- Semantic search across all memory files
- Daily logs, research intel, and project tracking templates
- Interactive installer with dry-run and uninstall support
<![CDATA[# 🧠 Clawdbot Memory System

**One-command persistent memory for Clawdbot: never lose context to compaction again.**

> "Why does my agent forget everything after a long session?"

Because Clawdbot compacts old context to stay within its context window. Without a memory system, the details that were compacted away are gone. This repo fixes that by writing them to disk before compaction and indexing them for search.

---

## What This Is

A **two-layer memory system** for Clawdbot:

1. **Markdown files** (source of truth): daily logs, research intel, project tracking, and durable notes your agent writes to disk
2. **SQLite vector search** (retrieval layer): a semantic search index that lets your agent find relevant memories even when the wording differs

Your agent writes memories to plain Markdown. Those files get indexed into a vector store. When the agent needs context, it searches semantically and finds what it needs, even across sessions and after compaction.

## Quick Install

```bash
bash <(curl -sL https://raw.githubusercontent.com/BusyBee3333/clawdbot-memory-system/main/install.sh)
```

That's it. The installer will:
- ✅ Detect your Clawdbot installation
- ✅ Create the `memory/` directory with templates
- ✅ Patch your `clawdbot.json` with memory search config (without touching anything else)
- ✅ Add memory habits to your `AGENTS.md`
- ✅ Build the initial vector index
- ✅ Verify everything works

### Preview First (Dry Run)

```bash
bash <(curl -sL https://raw.githubusercontent.com/BusyBee3333/clawdbot-memory-system/main/install.sh) --dry-run
```

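
If you would rather read the script before anything runs, one option is to download it to disk first. The sketch below is not part of the installer: `fetch_installer` and the local file name are hypothetical. It saves the script and uses `bash -n` to confirm the file parses, without executing it.

```bash
#!/usr/bin/env bash
# Sketch (hypothetical helper): save an installer to disk so it can be
# reviewed before it is run.
# usage: fetch_installer <url> <destination-file>
fetch_installer() {
  local url="$1" dest="$2"
  # -f: fail on HTTP errors, -sS: quiet but still report errors, -L: follow redirects
  curl -fsSL "$url" -o "$dest" || return 1
  # Parse-only check: confirms the download is valid bash without executing it
  bash -n "$dest"
}
```

After `fetch_installer <install.sh URL from above> ./install.sh`, you can open `./install.sh` in an editor and then run `bash ./install.sh --dry-run`.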
### Uninstall

```bash
bash <(curl -sL https://raw.githubusercontent.com/BusyBee3333/clawdbot-memory-system/main/install.sh) --uninstall
```

---

## How It Works

```
┌─────────────────────────────────────────────────────────┐
│                   YOUR AGENT SESSION                    │
│                                                         │
│  Agent writes notes ──→ memory/2026-02-10.md            │
│  Agent stores facts ──→ MEMORY.md                       │
│                       │                                 │
│                       ▼                                 │
│                ┌──────────────┐                         │
│                │ File Watcher │  (debounced)            │
│                └──────┬───────┘                         │
│                       │                                 │
│                       ▼                                 │
│           ┌───────────────────────┐                     │
│           │  Embedding Provider   │                     │
│           │  (OpenAI / Gemini /   │                     │
│           │   Local GGUF)         │                     │
│           └───────────┬───────────┘                     │
│                       │                                 │
│                       ▼                                 │
│           ┌───────────────────────┐                     │
│           │  SQLite + sqlite-vec  │                     │
│           │  Vector Index         │                     │
│           └───────────┬───────────┘                     │
│                       │                                 │
│  Agent asks ──────────┤                                 │
│  "what did we decide  │                                 │
│   about the API?"     ▼                                 │
│           ┌───────────────────────┐                     │
│           │  Hybrid Search        │                     │
│           │  (semantic + keyword) │                     │
│           └───────────┬───────────┘                     │
│                       │                                 │
│                       ▼                                 │
│            Relevant memory chunks                       │
│             injected into context                       │
└─────────────────────────────────────────────────────────┘
```

### Pre-Compaction Flush

This is the secret sauce. When your session nears its context limit:

```
 Session approaching limit
             │
             ▼
  ┌─────────────────────┐
  │ Pre-compaction ping │  ← Clawdbot silently triggers this
  │ "Store durable      │
  │  memories now"      │
  └──────────┬──────────┘
             │
             ▼
Agent writes lasting notes
  to memory/YYYY-MM-DD.md
             │
             ▼
  Context gets compacted
  (old messages removed)
             │
             ▼
 BUT memories are on disk
  AND indexed for search
             │
             ▼
Agent can find them anytime 🎉
```

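
In practice the flush ends with the agent appending to that day's log. A minimal sketch of such an append follows, assuming the workspace path used in the manual setup section (`~/.clawdbot/workspace/memory`). The function name `append_memory_note`, the `MEMORY_DIR` override, and the note layout are illustrative and not something Clawdbot defines. Calling `append_memory_note "Decided to version the API under /v2"` would add one timestamped bullet to today's file.

```bash
#!/usr/bin/env bash
# Sketch (hypothetical helper): append one timestamped note to today's daily log.
# MEMORY_DIR and the bullet layout are assumptions made for illustration.
append_memory_note() {
  local note="$1"
  local dir="${MEMORY_DIR:-$HOME/.clawdbot/workspace/memory}"
  local file
  file="$dir/$(date +%F).md"          # e.g. memory/2026-02-10.md

  mkdir -p "$dir"
  # Start the file with a date heading the first time it is written
  [ -f "$file" ] || printf '# %s\n\n' "$(date +%F)" > "$file"
  # Append, never overwrite, so earlier notes from the same day survive
  printf -- '- %s %s\n' "$(date +%H:%M)" "$note" >> "$file"
}
```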
---

## Embedding Provider Options

The installer will ask which provider you want:

| Provider | Speed | Cost | Setup |
|----------|-------|------|-------|
| **OpenAI** (recommended) | ⚡ Fast | ~$0.02 per million tokens | API key required |
| **Gemini** | ⚡ Fast | Free tier available | API key required |
| **Local** | 🐢 Slower first run | Free | Downloads GGUF model (~100MB) |

**OpenAI** (`text-embedding-3-small`) is recommended for the best experience. It's extremely cheap and fast.

**Gemini** (`gemini-embedding-001`) works great and has a generous free tier.

**Local** uses `node-llama-cpp` with a GGUF model. It runs fully offline with no API key, but the first index build is slower.

---

## Manual Setup (Alternative)

If you prefer to set things up yourself instead of using the installer:

### 1. Create the memory directory

```bash
mkdir -p ~/.clawdbot/workspace/memory
```

### 2. Add memory search config to clawdbot.json

Open `~/.clawdbot/clawdbot.json` and add `memorySearch` inside `agents.defaults`:

**For OpenAI:**
```json
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "provider": "openai",
        "model": "text-embedding-3-small"
      }
    }
  }
}
```

**For Gemini:**
```json
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "provider": "gemini",
        "model": "gemini-embedding-001"
      }
    }
  }
}
```

**For Local:**
```json
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "provider": "local"
      }
    }
  }
}
```

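
The same change can be scripted. The sketch below uses `jq` (which the installer also relies on) to set only `provider` and `model` under `agents.defaults.memorySearch`, leaving every other key as it was. `patch_memory_search` and the `.bak` backup name are assumptions made for illustration; this is not the installer's actual code.

```bash
#!/usr/bin/env bash
# Sketch (hypothetical helper): set agents.defaults.memorySearch with jq.
# usage: patch_memory_search <config.json> <provider> [model]
patch_memory_search() {
  local config="$1" provider="$2" model="${3:-}"
  local tmp

  cp "$config" "$config.bak" || return 1   # keep a backup (file name is an assumption)
  tmp="$(mktemp)" || return 1

  # Touch only provider/model; all other keys pass through unchanged.
  # With no model argument (e.g. "local"), any existing model key is removed.
  jq --arg provider "$provider" --arg model "$model" '
    .agents.defaults.memorySearch.provider = $provider
    | if $model == ""
      then del(.agents.defaults.memorySearch.model)
      else .agents.defaults.memorySearch.model = $model
      end
  ' "$config" > "$tmp" && mv "$tmp" "$config"
}
```

For example, `patch_memory_search ~/.clawdbot/clawdbot.json gemini gemini-embedding-001` would yield the Gemini config shown above. Writing to a temporary file and moving it into place means a failed `jq` run never leaves a half-written config behind.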
### 3. Set your API key (if using OpenAI or Gemini)

For OpenAI, set `OPENAI_API_KEY` in your environment or in `clawdbot.json` under `models.providers.openai.apiKey`.

For Gemini, set `GEMINI_API_KEY` in your environment or in `clawdbot.json` under `models.providers.google.apiKey`.

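
A quick way to confirm the environment route is set up is to check the variable for your provider before building the index. This is a sketch only: `check_embedding_key` is a hypothetical helper, and it looks at the environment alone, so a key stored in `clawdbot.json` will not be detected by it.

```bash
#!/usr/bin/env bash
# Sketch (hypothetical helper): verify the API key variable for a provider is set.
# Only checks the environment, not keys stored in clawdbot.json.
check_embedding_key() {
  local provider="$1" var
  case "$provider" in
    openai) var="OPENAI_API_KEY" ;;
    gemini) var="GEMINI_API_KEY" ;;
    local)  return 0 ;;                 # local embeddings need no key
    *)      echo "unknown provider: $provider" >&2; return 2 ;;
  esac
  # ${!var} is bash indirect expansion: the value of the variable named by $var
  if [ -z "${!var:-}" ]; then
    echo "$var is not set" >&2
    return 1
  fi
}
```

`check_embedding_key openai && echo "key found"` prints `key found` only when `OPENAI_API_KEY` is non-empty in the current shell.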
### 4. Build the index

```bash
clawdbot memory index --verbose
```

### 5. Verify

```bash
clawdbot memory status --deep
```

### 6. Restart the gateway

```bash
clawdbot gateway restart
```

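
Steps 4 to 6 can also be chained so that a failure stops the sequence before the gateway is restarted. The wrapper below is a sketch: `finish_memory_setup` is a hypothetical name, and it assumes each `clawdbot` command exits with a non-zero status when it fails, which this README does not confirm.

```bash
#!/usr/bin/env bash
# Sketch (hypothetical helper): run steps 4-6 in order, stopping at the first failure.
# Assumes the clawdbot commands return a non-zero exit status on error.
finish_memory_setup() {
  clawdbot memory index --verbose || { echo "index build failed" >&2; return 1; }
  clawdbot memory status --deep   || { echo "status check failed" >&2; return 1; }
  clawdbot gateway restart        || { echo "gateway restart failed" >&2; return 1; }
}
```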
---

## What Gets Indexed

By default, Clawdbot indexes:
- `MEMORY.md`: long-term curated memory
- `memory/*.md`: daily logs and all other memory files

All files must be Markdown (`.md`). A file watcher picks up changes and re-indexes them automatically.

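
To see which files fall under that default layout on your machine, a small listing helper is enough. This sketch only approximates the rule described above and is not how Clawdbot itself resolves files. `list_memory_files` is a hypothetical name, and placing `MEMORY.md` at the workspace root next to `memory/` is an assumption.

```bash
#!/usr/bin/env bash
# Sketch (hypothetical helper): list the Markdown files covered by the default layout.
# Assumes MEMORY.md sits at the workspace root, next to the memory/ directory.
list_memory_files() {
  local workspace="${1:-$HOME/.clawdbot/workspace}"
  local f

  [ -f "$workspace/MEMORY.md" ] && printf '%s\n' "$workspace/MEMORY.md"
  for f in "$workspace"/memory/*.md; do
    # With no matches the glob stays literal, so confirm the file exists
    [ -f "$f" ] && printf '%s\n' "$f"
  done
  return 0
}
```

An empty result from `list_memory_files` is a quick hint that there is nothing for the indexer to pick up yet.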
### Adding Extra Paths

Want to index files outside the default layout? Add `extraPaths`:

```json
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "extraPaths": ["../team-docs", "/path/to/other/notes"]
      }
    }
  }
}
```

---

## Troubleshooting

### "No API key found for provider openai/google"

You need to set your embedding API key. Either:
- Set the environment variable (`OPENAI_API_KEY` or `GEMINI_API_KEY`)
- Or add it to `clawdbot.json` under `models.providers`

### "Memory search stays disabled"

Run `clawdbot memory status --deep` to see what's wrong. Common causes:
- No embedding provider configured
- API key missing or invalid
- No `.md` files in the `memory/` directory

### Index not updating

Run a manual reindex:
```bash
clawdbot memory index --force --verbose
```

### Agent still seems to forget things

Make sure your `AGENTS.md` includes memory instructions. The agent needs to be told to:
1. Search memory before answering questions about prior work
2. Write important things to daily logs
3. Flush memories before compaction

The installer handles this automatically.

### Installer fails with "jq not found"

The installer needs `jq` for safe JSON patching. Install it:
```bash
# macOS
brew install jq

# Ubuntu/Debian
sudo apt-get install jq

# Or download from https://jqlang.github.io/jq/
```

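
To check for `jq` up front instead of waiting for the installer to fail, `command -v` is enough. A small sketch, where `require_cmd` is a hypothetical helper name; `require_cmd jq` returns success when the tool is on your `PATH` and prints a message otherwise.

```bash
#!/usr/bin/env bash
# Sketch (hypothetical helper): fail early with a clear message when a tool is missing.
require_cmd() {
  # command -v prints the resolved path and succeeds only if the command exists
  command -v "$1" >/dev/null 2>&1 && return 0
  echo "$1 not found; install it and re-run" >&2
  return 1
}
```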
---

## FAQ

### Why does my agent forget everything?

Clawdbot uses a context window with a token limit. When a session gets long, old messages are **compacted** (summarized and removed) to make room. Without a memory system, the details in those old messages are lost forever.

This memory system solves it by:
1. Writing important context to files on disk (survives any compaction)
2. Indexing those files for semantic search (agent can find them later)
3. Flushing memories right before compaction happens (nothing falls through the cracks)

### How is this different from just having MEMORY.md?

`MEMORY.md` alone is a single file that the agent reads at session start. It works for small amounts of info, but:
- It doesn't scale (gets too big to fit in context)
- It's not searchable (agent has to read the whole thing)
- Daily details get lost (you can't put everything in one file)

This system adds **daily logs** (unlimited history) + **vector search** (find anything semantically) + **pre-compaction flush** (automatic safety net).

### Does this cost money?

- **Local embeddings**: Free (but slower)
- **OpenAI embeddings**: ~$0.02 per million tokens (essentially free for personal use)
- **Gemini embeddings**: Free tier available

For reference, indexing 100 daily logs costs about $0.001 with OpenAI, which works out to roughly 500 tokens per log at that rate.

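
The estimate is just files × average tokens per file × price per token. The sketch below reproduces it with `awk`; `embedding_cost` is a hypothetical helper, the $0.02 price comes from the table in this README, and the 500-token average is an assumption chosen to match the $0.001 figure, so substitute your own numbers.

```bash
#!/usr/bin/env bash
# Sketch (hypothetical helper): estimate embedding cost in US dollars.
# usage: embedding_cost <files> <avg_tokens_per_file> <usd_per_million_tokens>
embedding_cost() {
  awk -v n="$1" -v t="$2" -v p="$3" \
    'BEGIN { printf "%.4f\n", n * t * p / 1000000 }'
}

# 100 logs at an assumed 500 tokens each, priced at $0.02 per million tokens
embedding_cost 100 500 0.02   # prints 0.0010
```

Re-indexing re-embeds text, so a forced full reindex costs about the same again.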
### Can I use this with multiple agents?

Yes. Each agent uses the same workspace `memory/` directory by default. You can scope with `--agent <id>` for commands.

### Is my data sent to the cloud?

Only if you use remote embeddings (OpenAI/Gemini). In that case the text of your memory files is sent to the provider's API to be embedded, and the resulting vectors are stored in the SQLite index on your machine. The vectors are not a readable copy of your text, but treat them as sensitive anyway. If you want full privacy, use `local` embeddings so that everything stays on your machine.

### Can I run the installer multiple times?

Yes! It's idempotent. It checks for existing files and config before making changes, and backs up your config before patching.

---

## Architecture

See [ARCHITECTURE.md](ARCHITECTURE.md) for detailed diagrams.

## Migrating from Another Setup

See [MIGRATION.md](MIGRATION.md) for step-by-step migration guides.

## License

MIT (see [LICENSE](LICENSE))

---

**Built for the Clawdbot community** by people who got tired of explaining things to their agent twice.
]]> |