DEV Community
Follow
I Built a Memory System for AI Agents That Actually Forgets
Existing AI agent memory systems suffer from perpetual accumulation, leading to degraded performance and accuracy over time as old facts pollute retrieval. This problem arises because more tokens directly correlate with worse results and a slower, dumber agent. To address this, recall-sqlite was developed as a memory system that actively forgets. Its core principle is tiered storage, where memories are automatically moved between tiers based on their access frequency. The hot tier holds frequently accessed memories, utilizing ANN and keywords for fast retrieval. The warm tier stores less accessed memories, relying only on keywords and FTS5 for significantly reduced compute. The cold tier can store an unlimited number of memories with zero compute, automatically promoted when needed. Key design choices include zero LLM use at query time, relying solely on a small local embedding model. It avoids traditional vector databases, using SQLite with sqlite-vec instead. The system supports graceful degradation, falling back to keyword and FTS5 search when offline. Automatic schema migration simplifies updates, and it offers a single pip install without API keys or Docker. After six months of daily use with nearly 1500 memories, latency remains around 80ms with a fixed memory footprint of about 1.5MB. recall-sqlite has now been integrated into the Hermes Agent ecosystem.