2026-02-25 Session Notes
Predictive Memory Scorer Spec Review & Feedback Integration
Nicholai received detailed external code review feedback on the predictive memory scorer specification (docs/wip/predictive-memory-scorer.md). The reviewer provided high-level validation of the overall architecture—calling it "exceptionally well-thought-out" and praising the North Star vision ("difference between a tool that remembers and a mind that persists")—while identifying five concrete technical refinements.
Feedback Highlights
The reviewer validated three core design strengths: (1) dynamic baseline weighting via Reciprocal Rank Fusion (RRF) with an exponential moving average (EMA) of success rate ensures graceful degradation; (2) the zero-dependency Rust sidecar keeps binary size negligible and inference sub-millisecond; (3) outcome-driven labels from the continuity scorer avoid hand-labeling overhead.
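A minimal sketch of the first strength, dynamic baseline weighting via RRF with an EMA success rate. All names (`Retriever`, `observe`, `weighted_rrf`) and the exact update rule are illustrative assumptions, not the spec's actual implementation:

```rust
use std::collections::HashMap;

// Illustrative retriever whose fusion weight tracks an EMA of how
// often its results actually proved useful (outcome in [0, 1]).
struct Retriever {
    ema_success: f64,
}

impl Retriever {
    // Update the EMA after observing an outcome; alpha controls how
    // quickly the weight adapts. A degrading retriever fades out
    // gracefully instead of being cut off.
    fn observe(&mut self, outcome: f64, alpha: f64) {
        self.ema_success = alpha * outcome + (1.0 - alpha) * self.ema_success;
    }
}

// Success-weighted Reciprocal Rank Fusion: each retriever's standard
// RRF term 1/(k + rank) is scaled by its EMA weight before summing.
// `rankings[i]` is retriever i's ranked list of memory ids (best first).
fn weighted_rrf(rankings: &[Vec<u32>], weights: &[f64], k: f64) -> Vec<(u32, f64)> {
    let mut scores: HashMap<u32, f64> = HashMap::new();
    for (list, &w) in rankings.iter().zip(weights) {
        for (rank, &id) in list.iter().enumerate() {
            // rank is 0-based; RRF conventionally uses 1-based ranks.
            *scores.entry(id).or_insert(0.0) += w / (k + rank as f64 + 1.0);
        }
    }
    let mut fused: Vec<(u32, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}
```

Here a retriever that keeps missing sees its `ema_success` decay toward zero, so its votes count for less in the fused ranking without any hard switch.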
Five Technical Refinements Identified
- HashTrick bucket count: Bump from 4,096 to 16,384 buckets to reduce collisions in code-heavy memories; the model stays at roughly 1.1M parameters and still fits in L2 cache
- Listwise loss temperature: Use T < 1.0 to sharpen the soft label distributions; the continuity scorer's soft labels yield a flat P_true distribution that needs aggressive sharpening
- Negative sample filtering: Apply a cosine similarity filter before assigning strict 0.0 labels, so the model is not trained to replicate the baseline's mistakes
- Drift reset strategy: Use a replay buffer (80% recent samples + 20% historical samples selected by continuity score) instead of doubling the learning rate, to avoid catastrophic forgetting
- RRF constant k: Drop from 60 to 10–15 for ~50 candidate memories; k=60 compresses rank variance when the candidate list is that small
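The negative-filtering refinement amounts to dropping would-be negatives that sit too close to a known positive in embedding space. A sketch under assumed names (`cosine`, `filter_negatives`) and an illustrative 0.8 threshold, neither of which comes from the spec:

```rust
// Cosine similarity between two embedding vectors.
fn cosine(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f64 = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let nb: f64 = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    dot / (na * nb)
}

// Keep only negatives that are clearly dissimilar from every positive;
// anything above the threshold might be a baseline false negative, and
// labeling it a strict 0.0 would teach the model the baseline's mistake.
fn filter_negatives<'a>(
    negatives: &'a [Vec<f64>],
    positives: &[Vec<f64>],
    threshold: f64,
) -> Vec<&'a Vec<f64>> {
    negatives
        .iter()
        .filter(|n| positives.iter().all(|p| cosine(n, p) < threshold))
        .collect()
}
```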
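The replay-buffer reset could look something like the following deterministic sketch: 80% of a batch from the most recent samples, 20% from historical samples ranked by continuity score. The `Sample` struct, the take-most-recent policy, and the absence of random sampling are all simplifying assumptions:

```rust
// Illustrative training sample: an id plus its continuity score.
struct Sample {
    id: u32,
    continuity: f64,
}

// Build a drift-reset batch: ~80% recent samples plus ~20% historical
// samples with the highest continuity scores, so old but proven
// memories keep appearing and catastrophic forgetting is avoided.
fn replay_batch(recent: &[Sample], historical: &[Sample], batch: usize) -> Vec<u32> {
    let n_recent = (batch * 4) / 5; // 80% recent
    let n_hist = batch - n_recent;  // 20% historical
    // Newest recent samples first (assumes `recent` is in arrival order).
    let mut out: Vec<u32> = recent.iter().rev().take(n_recent).map(|s| s.id).collect();
    // Historical samples chosen by continuity score, highest first.
    let mut hist: Vec<&Sample> = historical.iter().collect();
    hist.sort_by(|a, b| b.continuity.partial_cmp(&a.continuity).unwrap());
    out.extend(hist.iter().take(n_hist).map(|s| s.id));
    out
}
```

A real implementation would likely sample both pools stochastically; the fixed 80/20 split is the part that carries over from the recommendation.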
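Why the k recommendation matters can be checked arithmetically with the standard RRF term (this is just the textbook formula, not project code):

```rust
// Standard RRF contribution for a document at a given 1-based rank.
// With k=60, ranks 1 and 10 score 1/61 vs 1/70 -- barely different.
// With k around 10-15, the same ranks score 1/11 vs 1/20, preserving
// a meaningful gap across a ~50-candidate list.
fn rrf_term(k: f64, rank: f64) -> f64 {
    1.0 / (k + rank)
}
```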
All five recommendations are sound and likely to be incorporated into the specification.