You've spent hours building context with your AI agent. You've explained your project architecture, your preferences, your entire business setup. Then the next morning, you say "Hey, remember that API integration we worked on?" and get back: "I don't have any context about a previous API integration. Could you tell me more?"
This isn't a bug — it's how AI agents are designed. When your conversation gets too long, the system runs compaction: it summarizes the conversation to free up context space. The summary keeps the gist, but the details — the exact config values, the specific decisions you made, the nuances of your preferences — those get compressed into oblivion. Your agent essentially gets amnesia.
The problem is structural. Every AI assistant today — Claude, ChatGPT, Gemini, all of them — faces this same constraint. Context windows are finite. Compaction is inevitable. And without a system to persist knowledge outside the chat window, everything you build is written in sand.
"My buddies hate when their agent just gets amnesia." — Jake
The Clawdbot Memory System solves this completely. It gives your agent a persistent memory layer — a real filesystem-backed brain that survives compaction, survives restarts, survives everything. Your agent writes knowledge to disk continuously and reads it back automatically. No more amnesia. No more re-explaining yourself.
Agent amnesia isn't one problem — it's three separate failure modes stacked on top of each other. We engineered a fix for each one.
Everything your agent knows lives exclusively in the chat context window. When compaction fires, the system generates a brief summary and discards the full conversation. Specific details — the exact configuration values you discussed, the precise reasoning behind a decision, your stated preferences — all get lost in summarization.
Fix
AGENTS.md instructs the agent to write to memory/YYYY-MM-DD.md throughout the session — not at the end, but continuously. Every decision, preference, project update, and notable piece of context gets written to disk in real-time. Even if compaction fires immediately after, the knowledge is already safely persisted.
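The write-as-you-go habit is easy to picture in code. Here's a minimal Python sketch of what appending to a daily log amounts to (the function name, tag format, and entry layout are illustrative, not part of the actual installer):

```python
from datetime import date
from pathlib import Path

def log_memory(entry: str, tag: str = "#decision", memory_dir: str = "memory") -> Path:
    """Append one tagged entry to today's log file, memory/YYYY-MM-DD.md."""
    log_dir = Path(memory_dir)
    log_dir.mkdir(exist_ok=True)
    log_file = log_dir / f"{date.today().isoformat()}.md"
    with log_file.open("a", encoding="utf-8") as f:
        f.write(f"- {tag} {entry}\n")
    return log_file
```

Because every call appends and returns immediately, a compaction that fires a moment later loses nothing that was already logged.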
Even with continuous writing, there's a race condition. The session hits its token limit, compaction fires immediately, and any context accumulated since the last write is gone. The agent doesn't get a chance to save before the rug is pulled out.
Fix
Pre-compaction flush. When the session is approximately 4,000 tokens from the compaction threshold, Clawdbot triggers a silent turn. The agent receives a system-level instruction to flush all unsaved context to disk immediately. It saves everything, responds with NO_REPLY (the user never sees this turn), and then compaction proceeds safely. Zero data loss.
Even if memories exist on disk, a fresh session starts with a clean slate. The agent has no idea those memory files are there unless something tells it to look. So it starts from zero every time — the knowledge exists but is never retrieved.
Fix
Mandatory Memory Recall rule baked into AGENTS.md: before answering any question about prior work, the agent MUST run memory_search first. Additionally, on every session start, the agent automatically reads today's and yesterday's logs. The result: the agent always knows what you've been working on.
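The session-start recall step can be sketched the same way: read today's and yesterday's logs and hand them to the agent as context. This is a simplified illustration; in practice the AGENTS.md rule drives the agent itself rather than a script:

```python
from datetime import date, timedelta
from pathlib import Path

def load_recent_logs(memory_dir: str = "memory") -> str:
    """Return the contents of today's and yesterday's daily logs, if present."""
    sections = []
    for day in (date.today(), date.today() - timedelta(days=1)):
        log_file = Path(memory_dir) / f"{day.isoformat()}.md"
        if log_file.exists():
            sections.append(f"## {log_file.name}\n{log_file.read_text(encoding='utf-8')}")
    return "\n\n".join(sections)
```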
The system uses a two-layer architecture designed for both human readability and machine-speed retrieval.
All memories are stored as plain Markdown files in the memory/ directory. Daily logs follow the YYYY-MM-DD.md convention. Project files, research intel, and contacts all live as readable .md files. This layer is human-readable, git-backed, and editable. You can open any memory file in your text editor and read exactly what your agent knows. This is the source of truth — everything else is derived from it.
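As a concrete picture, a memory/ directory might look something like this (the dated logs follow the convention above; the other file names are hypothetical examples, not a fixed manifest):

```
memory/
├── 2026-02-03.md        # yesterday's daily log
├── 2026-02-04.md        # today's daily log
├── project-stripe.md    # hypothetical project file
├── research-intel.md
└── contacts.md
```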
When memory files are created or updated, they're automatically chunked and indexed into a local SQLite database with vector embeddings. This gives the agent semantic search — it can find relevant memories even when the exact keywords don't match. Searching for "that API we integrated" will find the entry about "REST endpoint configuration for Stripe webhooks" because the vectors capture meaning, not just words.
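A stripped-down sketch of that indexing step, assuming paragraph-level chunking and a small fixed-size vector. The `embed` function here is a toy bag-of-words stand-in for a real embedding model, and the schema is illustrative, not Clawdbot's actual one:

```python
import math
import sqlite3
import struct

def embed(text: str) -> list[float]:
    """Toy stand-in: hash tokens into a 64-dim unit vector. A real system
    would call the configured embedding model here."""
    vec = [0.0] * 64
    for token in text.lower().split():
        vec[hash(token) % 64] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def index_memory(db: sqlite3.Connection, path: str, text: str) -> int:
    """Split a memory file into paragraph chunks; store each with its vector."""
    db.execute("CREATE TABLE IF NOT EXISTS chunks (path TEXT, chunk TEXT, embedding BLOB)")
    chunks = [c.strip() for c in text.split("\n\n") if c.strip()]
    for chunk in chunks:
        vec = embed(chunk)
        blob = struct.pack(f"{len(vec)}f", *vec)  # serialize floats for SQLite
        db.execute("INSERT INTO chunks VALUES (?, ?, ?)", (path, chunk, blob))
    db.commit()
    return len(chunks)
```

The part that matters is the shape: one row per chunk, with its vector stored alongside the readable text it came from.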
Search results are ranked using a hybrid scoring algorithm: 70% vector similarity (semantic meaning) blended with 30% BM25 keyword matching (exact term relevance). This means the system excels at both fuzzy conceptual queries and precise keyword lookups. Every search completes in under 100 milliseconds.
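In code, the blend reduces to a weighted sum. One detail left open above is how the BM25 score gets squashed into the same 0-to-1 range as cosine similarity; the sketch below normalizes against the batch maximum, which is one common choice rather than necessarily Clawdbot's:

```python
def hybrid_score(vector_sim: float, bm25: float, bm25_max: float,
                 w_vec: float = 0.7, w_kw: float = 0.3) -> float:
    """Blend cosine similarity (already 0..1) with batch-normalized BM25."""
    keyword = bm25 / bm25_max if bm25_max > 0 else 0.0
    return w_vec * vector_sim + w_kw * keyword

def rank(candidates):
    """candidates: list of (id, vector_sim, bm25). Returns ids, best first."""
    bm25_max = max((c[2] for c in candidates), default=0.0)
    return [c[0] for c in sorted(
        candidates,
        key=lambda c: hybrid_score(c[1], c[2], bm25_max),
        reverse=True)]
```

A result that is merely decent on both axes can outrank one that is strong on only one, which is exactly the behavior you want from hybrid retrieval.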
Clawdbot monitors token usage throughout the session. When the conversation reaches approximately 4,000 tokens below the compaction threshold, a silent flush turn is injected. The agent writes all accumulated context to memory files, responds with NO_REPLY, and the user never sees this happen. Compaction then proceeds with all data safely persisted. It's the safety net that catches everything.
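The trigger condition itself is a one-line check against the running token count. A sketch, where the 4,000-token margin comes from the text but the threshold value depends on the model's context window:

```python
FLUSH_MARGIN_TOKENS = 4_000  # start flushing ~4k tokens before compaction fires

def should_flush(tokens_used: int, compaction_threshold: int,
                 margin: int = FLUSH_MARGIN_TOKENS) -> bool:
    """True once the session is within `margin` tokens of compaction."""
    return tokens_used >= compaction_threshold - margin
```

When this flips to True, the silent flush turn is injected: the agent writes everything to disk, replies NO_REPLY, and compaction proceeds.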
Measured in production on Jake's Clawdbot instance. Zero data loss across months of daily use.
The installer is modular — you choose what you want. The core memory system is always installed; everything else is optional and additive.
- memory/ directory with date-based templates for daily logging
- AGENTS.md — teaches the agent how to use memory properly
- Tags (#project/name, #decision, #preference) for reliable search

When you start a new project, the agent automatically creates structured memory files — project tracker, research intel, decision log — from templates. You just say "let's start working on X" and the scaffolding appears.
Adds a daily backup habit to the agent's workflow. At the end of each session, the agent commits all memory changes to git and pushes to your private repo. Full version history of everything your agent has ever learned.
Open your terminal and run:
```bash
bash <(curl -sL https://raw.githubusercontent.com/BusyBee3333/clawdbot-memory-system/main/install.sh)
```
The installer is interactive and walks you through five setup questions.
Restart the Clawdbot gateway to pick up the new configuration:
```bash
clawdbot gateway restart
```
Then test it — just ask your agent:
"What did we work on today?"
If you've been chatting before the install, the agent will start building memory from this point forward. Within one session, it'll be writing to disk automatically.
| Provider | Model | Quality | Speed | Cost |
|---|---|---|---|---|
| OpenAI ⭐ | text-embedding-3-small | Excellent | ~50ms | ~$0.50/mo |
| Gemini | text-embedding-004 | Very Good | ~80ms | Free tier available |
| Local | all-MiniLM-L6-v2 | Good | ~20ms | Free (CPU only) |
Cost estimates based on typical usage (~100 memory operations/day). Local option requires no API key but has slightly lower semantic accuracy.
The best part of the memory system is that you don't have to do anything. From your perspective, you just chat normally. Everything happens in the background.
We believe in being honest about limitations. Here's where you might notice rough edges:
Your agent should remember you.
Now it does.