diff --git a/memory/2026-03-01-incremental-embedding-refresh-tracker-implementati.md b/memory/2026-03-01-incremental-embedding-refresh-tracker-implementati.md
new file mode 100644
index 000000000..7397accd6
--- /dev/null
+++ b/memory/2026-03-01-incremental-embedding-refresh-tracker-implementati.md
@@ -0,0 +1,40 @@
+# 2026-03-01 Session Notes
+
+## Incremental Embedding Refresh Tracker Implementation Plan
+
+Received detailed plan for implementing a background polling loop to detect and refresh stale/missing embeddings in the Signet memory pipeline. The tracker runs independently and processes embeddings in small batches to avoid overwhelming the system.
+
+### Architecture Decisions
+
+The tracker uses a **setTimeout chain** instead of setInterval for natural backpressure — each cycle schedules the next after current processing completes, rather than on a fixed timer. This prevents queue buildup if embedding fetches slow down.
+
+### Core Mechanism
+
+Polling loop:
+1. Check embedding provider health (uses existing 30s cache)
+2. Query stale embeddings: missing embeddings, content hash mismatches, or model drift
+3. Fetch embeddings sequentially with 30s timeout per request
+4. Batch write successful results in single transaction
+5. Idempotent via `ON CONFLICT(content_hash) DO UPDATE`
+
+### Configuration
+
+Three parameters in `PipelineEmbeddingTrackerConfig`:
+- `enabled` (boolean, default true)
+- `pollMs` (5000ms default, clamped 1000–60000ms)
+- `batchSize` (8 default, clamped 1–20)
+
+### Integration Points
+
+1. **types.ts** — Add `PipelineEmbeddingTrackerConfig` interface to `PipelineV2Config`
+2. **memory-config.ts** — Parse tracker config using existing `clampPositive` pattern
+3. **embedding-tracker.ts** (new ~200 LOC) — Core polling module with graceful shutdown
+4. **daemon.ts** — Start tracker after pipeline init, stop before DB cleanup, enhance `/api/embeddings/status` endpoint
+
+### Edge Cases Handled
+
+Race conditions on concurrent remember/update (idempotent write), provider downtime (retries next cycle), model switching (old embeddings deleted by hash), large backlogs (intentional backpressure at ~100/min), empty DB (returns immediately).
+
+### Next Steps
+
+Implementation in order: types → memory-config → new embedding-tracker module → daemon wiring. Verification via build, typecheck, lint, then manual testing with stale embeddings.
\ No newline at end of file
diff --git a/memory/memories.db-shm b/memory/memories.db-shm
index f7910f3f4..379fb0d5a 100644
Binary files a/memory/memories.db-shm and b/memory/memories.db-shm differ
diff --git a/memory/memories.db-wal b/memory/memories.db-wal
index 2e8751dab..e05a7e1c1 100644
Binary files a/memory/memories.db-wal and b/memory/memories.db-wal differ
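
The session notes added in this diff describe a setTimeout chain with clamped configuration but include no code. The following is a minimal sketch of how those two ideas might fit together, not the actual `embedding-tracker.ts`. Only `enabled`, `pollMs`, `batchSize`, and their defaults and clamp ranges come from the notes. `TrackerConfig`, `clampInt`, `resolveTrackerConfig`, `startTracker`, and the `runCycle` callback are hypothetical names, and `runCycle` stands in for the health check, stale query, sequential fetch, and batch write steps.

```typescript
// Hypothetical sketch. Field names and ranges follow the notes; everything else is assumed.
interface TrackerConfig {
  enabled: boolean;
  pollMs: number;
  batchSize: number;
}

// Clamp a possibly missing or invalid number into [min, max], using a default when absent.
function clampInt(value: unknown, fallback: number, min: number, max: number): number {
  const n = typeof value === "number" && Number.isFinite(value) ? Math.floor(value) : fallback;
  return Math.min(max, Math.max(min, n));
}

// Apply the defaults and clamp ranges listed under "Configuration".
function resolveTrackerConfig(
  raw: Partial<Record<keyof TrackerConfig, unknown>> = {},
): TrackerConfig {
  return {
    enabled: typeof raw.enabled === "boolean" ? raw.enabled : true,
    pollMs: clampInt(raw.pollMs, 5000, 1000, 60000),
    batchSize: clampInt(raw.batchSize, 8, 1, 20),
  };
}

// runCycle is a placeholder for one full polling cycle (assumed signature).
function startTracker(
  config: TrackerConfig,
  runCycle: (batchSize: number) => Promise<void>,
): { stop(): Promise<void> } {
  let timer: ReturnType<typeof setTimeout> | null = null;
  let stopped = !config.enabled;
  let inFlight: Promise<void> = Promise.resolve();

  const schedule = (): void => {
    if (stopped) return;
    timer = setTimeout(tick, config.pollMs);
  };

  const tick = (): void => {
    timer = null;
    inFlight = Promise.resolve()
      .then(() => runCycle(config.batchSize))
      .catch(() => {
        // A failed cycle (for example, provider downtime) is left for the next cycle.
      })
      .then(schedule); // The next cycle is scheduled only after this one settles.
  };

  schedule();

  return {
    // Graceful shutdown: cancel any pending timer, then wait for an in-flight cycle.
    async stop(): Promise<void> {
      stopped = true;
      if (timer !== null) clearTimeout(timer);
      timer = null;
      await inFlight;
    },
  };
}
```

Because each cycle is scheduled only after the previous one settles, a slow provider stretches the interval between cycles instead of stacking overlapping work. With the default values, 8 embeddings per 5000 ms cycle gives at most 96 embeddings per minute, which is consistent with the "~100/min" backpressure figure in the notes.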