.agents/memory/2026-02-28-predictive-memory-scorer-phase-2-planning.md


2026-02-28 Session Notes

Predictive Memory Scorer Phase 2 Planning

Nicholai presented a detailed specification for Phase 2 of the predictive memory scorer project. Phases 0 and 1 are complete: the daemon has session_memories tracking, FTS hit logging, and enhanced continuity scoring. Phase 2 fills the critical gap in packages/predictor/src/data.rs, which is currently a 50-line stub that returns empty vectors.

The implementation plan rewrites data.rs to read from SQLite directly, enabling the predictor to autonomously assemble training batches without serializing embeddings over JSON-RPC for each session.

Key Specifications

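Data Loading Architecture: Two-query pattern. The first query reads scored sessions from session_scores with confidence >= 0.6; the second reads the candidates for each session by joining session_memories → memories → embeddings. The embeddings join is a LEFT JOIN because embedding blobs are optional, so memories without an embedding are still returned.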
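
A minimal sketch of the two queries as Rust string constants. Only the table names and the confidence column come from the notes above; every other column name (session_id, score, memory_id, id, importance, vector) and both constant names are hypothetical placeholders, and the real schema may differ.

```rust
/// Query 1: sessions whose scorer confidence meets the threshold.
/// `?1` is the minimum confidence (0.6 in the notes).
/// Column names other than `confidence` are assumptions.
pub const SCORED_SESSIONS_SQL: &str = "\
SELECT session_id, score, confidence
FROM session_scores
WHERE confidence >= ?1";

/// Query 2: candidate memories for one session (`?1` is the session id).
/// The LEFT JOIN keeps memories that have no embedding row; their
/// `vector` column comes back as NULL. Join keys are assumptions.
pub const SESSION_CANDIDATES_SQL: &str = "\
SELECT sm.memory_id, m.importance, e.vector
FROM session_memories AS sm
JOIN memories AS m ON m.id = sm.memory_id
LEFT JOIN embeddings AS e ON e.memory_id = m.id
WHERE sm.session_id = ?1";
```

Running the second query once per scored session keeps each result set small, at the cost of one round trip per session.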

Feature Engineering: 12-dimensional vectors per candidate combining recency (log age in days), importance, access frequency (log count), cyclical encodings for time-of-day/day-of-week/month (sin/cos pairs), session gap hours, embedded flag, and deleted flag.
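
The 12 dimensions break down as 3 scalar features, 3 sin/cos pairs, and 3 trailing values. A sketch of one possible layout follows; the struct, its field names, the feature order, and the use of ln(1 + x) for the log terms are assumptions, not part of the specification.

```rust
use std::f32::consts::TAU;

/// Hypothetical raw inputs for one candidate memory.
pub struct CandidateRaw {
    pub age_days: f32,
    pub importance: f32,
    pub access_count: u32,
    pub hour_of_day: f32, // 0.0..24.0
    pub day_of_week: u32, // 0..7
    pub month: u32,       // 1..=12
    pub session_gap_hours: f32,
    pub has_embedding: bool,
    pub is_deleted: bool,
}

/// Map a periodic value onto the unit circle so that the end of a
/// period sits next to its start (hour 23 is close to hour 0).
fn cyclical(value: f32, period: f32) -> (f32, f32) {
    let angle = TAU * value / period;
    (angle.sin(), angle.cos())
}

/// Build the 12-dimensional feature vector (ordering is an assumption).
pub fn build_features(c: &CandidateRaw) -> [f32; 12] {
    let (hour_sin, hour_cos) = cyclical(c.hour_of_day, 24.0);
    let (dow_sin, dow_cos) = cyclical(c.day_of_week as f32, 7.0);
    let (month_sin, month_cos) = cyclical(c.month.saturating_sub(1) as f32, 12.0);
    [
        c.age_days.max(0.0).ln_1p(),    // recency: log age in days
        c.importance,                   // importance
        (c.access_count as f32).ln_1p(), // access frequency: log count
        hour_sin,
        hour_cos,
        dow_sin,
        dow_cos,
        month_sin,
        month_cos,
        c.session_gap_hours,
        if c.has_embedding { 1.0 } else { 0.0 },
        if c.is_deleted { 1.0 } else { 0.0 },
    ]
}
```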

Label Construction: Blends the continuity scorer's per-memory relevance scores with FTS behavioral signals. Deleted memories get a strongly negative label (-0.3). Injected memories use their relevance score, falling back to the session score when none is available. Non-injected memories use FTS hits as miss signals: 0.6 for 2+ hits, 0.3 for 1 hit, 0.0 for no matches. The session-level novel_context_count refines labels further.
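
The label rules can be sketched as a single function. The struct and field names are hypothetical, and checking the deleted flag before the injected flag is an assumed precedence. The novel_context_count refinement is left out because the notes do not say how it is applied.

```rust
/// Hypothetical per-memory signals used to derive a training label.
pub struct LabelInput {
    pub is_deleted: bool,
    pub was_injected: bool,
    /// Per-memory relevance from the continuity scorer, if one exists.
    pub relevance: Option<f32>,
    /// Session-level score, used as the fallback for injected memories.
    pub session_score: f32,
    /// Number of FTS hits logged for this memory during the session.
    pub fts_hits: u32,
}

/// Derive the label for one candidate using the values from the notes.
pub fn label_for(m: &LabelInput) -> f32 {
    if m.is_deleted {
        // Deleted memories are strongly negative.
        return -0.3;
    }
    if m.was_injected {
        // Injected: relevance score, else fall back to the session score.
        return m.relevance.unwrap_or(m.session_score);
    }
    // Not injected: FTS hits suggest the memory was wanted but missed.
    match m.fts_hits {
        0 => 0.0,
        1 => 0.3,
        _ => 0.6,
    }
}
```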

Embedding Handling: The query embedding is computed synthetically as the mean of the injected embeddings (768 dims, Float32 LE format). Blob parsing checks each blob's dimension and skips entries that do not match.
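
A sketch of the blob parsing and the synthetic query embedding, using only the standard library. The function names and the EMBED_DIM constant are illustrative; returning None when no valid embedding exists is an assumption about how the empty case should be handled.

```rust
/// Expected embedding width from the notes (768 little-endian f32 values).
pub const EMBED_DIM: usize = 768;

/// Parse a little-endian Float32 blob. Returns None when the byte length
/// does not match `dim * 4`, so the caller can skip the entry.
pub fn parse_embedding(blob: &[u8], dim: usize) -> Option<Vec<f32>> {
    if blob.len() != dim * 4 {
        return None;
    }
    Some(
        blob.chunks_exact(4)
            .map(|b| f32::from_le_bytes([b[0], b[1], b[2], b[3]]))
            .collect(),
    )
}

/// Element-wise mean of the injected embeddings. Vectors whose length
/// differs from `dim` are skipped; None means nothing usable was found.
pub fn mean_embedding(embeddings: &[Vec<f32>], dim: usize) -> Option<Vec<f32>> {
    let mut sum = vec![0.0f32; dim];
    let mut count = 0usize;
    for e in embeddings.iter().filter(|e| e.len() == dim) {
        for (s, v) in sum.iter_mut().zip(e) {
            *s += *v;
        }
        count += 1;
    }
    if count == 0 {
        return None;
    }
    for s in &mut sum {
        *s /= count as f32;
    }
    Some(sum)
}
```

Passing the dimension as a parameter instead of reading EMBED_DIM directly keeps both functions easy to test with short vectors.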

Implementation Notes: Manual ISO 8601 timestamp parsing (no chrono dependency), Zeller's formula for day-of-week, and read-only SQLite mode. loss_temperature and min_scorer_confidence are not hardcoded; both are configurable via the DataConfig struct.
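
A sketch of chrono-free timestamp handling. It assumes timestamps begin with a fixed-width `YYYY-MM-DDTHH:MM:SS` prefix and ignores any fractional seconds or timezone suffix; the type and function names are hypothetical. The day-of-week function uses the Gregorian form of Zeller's congruence, which returns 0 for Saturday, 1 for Sunday, and so on through 6 for Friday.

```rust
#[derive(Debug, PartialEq)]
pub struct Timestamp {
    pub year: i32,
    pub month: u32,
    pub day: u32,
    pub hour: u32,
    pub minute: u32,
    pub second: u32,
}

/// Read `len` ASCII digits starting at byte offset `start`.
fn digits(s: &str, start: usize, len: usize) -> Option<u32> {
    let part = s.get(start..start + len)?;
    if !part.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    part.parse().ok()
}

/// Parse the fixed-width prefix of an ISO 8601 timestamp.
/// Anything after the seconds field is ignored (an assumption).
pub fn parse_iso8601(s: &str) -> Option<Timestamp> {
    let b = s.as_bytes();
    if b.len() < 19 || b[4] != b'-' || b[7] != b'-' || b[13] != b':' || b[16] != b':' {
        return None;
    }
    if b[10] != b'T' && b[10] != b' ' {
        return None;
    }
    let ts = Timestamp {
        year: digits(s, 0, 4)? as i32,
        month: digits(s, 5, 2)?,
        day: digits(s, 8, 2)?,
        hour: digits(s, 11, 2)?,
        minute: digits(s, 14, 2)?,
        second: digits(s, 17, 2)?,
    };
    let valid = (1..=12).contains(&ts.month)
        && (1..=31).contains(&ts.day)
        && ts.hour < 24
        && ts.minute < 60
        && ts.second <= 60; // 60 allows a leap second
    if valid { Some(ts) } else { None }
}

/// Zeller's congruence (Gregorian). Assumes year >= 1.
/// Returns 0 = Saturday, 1 = Sunday, ..., 6 = Friday.
pub fn zeller_day_of_week(year: i32, month: u32, day: u32) -> u32 {
    // January and February count as months 13 and 14 of the previous year.
    let (y, m) = if month < 3 {
        (year - 1, month as i32 + 12)
    } else {
        (year, month as i32)
    };
    let k = y % 100; // year within the century
    let j = y / 100; // zero-based century
    let h = day as i32 + 13 * (m + 1) / 5 + k + k / 4 + j / 4 + 5 * j;
    h.rem_euclid(7) as u32
}
```

If a Monday-based index is more convenient for the cyclical day-of-week feature, `(h + 5) % 7` maps Zeller's result to 0 = Monday.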

Next Steps

Implement the data.rs rewrite following the specification. The message was truncated; the missing portion may contain additional context on error handling or optimization strategies.