DMF: A Deterministic Memory Framework for Conversational AI Agents

2026-06-02

Artificial Intelligence, Computation and Language
AI summary

The authors propose a way to manage memory in conversational AI that avoids using large language models to summarize conversations. Their system, the Deterministic Memory Framework (DMF), uses fixed, rule-based scoring instead of AI-generated summaries to decide which parts of a conversation to keep or discard. This approach reduces token costs and makes memory management predictable and transparent. They tested DMF against Mem0, an existing memory layer for AI agents, on the LoCoMo and LongMemEval datasets and found comparable accuracy with 5x to 242x fewer tokens over the entire conversation.

deterministic memory, conversational AI, large language models, memory compression, token costs, logistic projection, vector geometry, pruning, content relevance decay, benchmark datasets
Authors
Matteo Stabile, Enrico Zimuel
Abstract
Conversational AI agents require memory systems that are both scalable and semantically coherent across long interaction horizons. Existing approaches rely predominantly on large language model (LLM)-based summarisation at write time, which introduces non-determinism, escalating token costs, and opacity in pruning decisions. We present the Deterministic Memory Framework (DMF), a CPU-first approach that replaces generative memory compression with a fully deterministic pipeline grounded in classical NLP analysis, vector geometry, and mathematical scoring. DMF assigns each conversational interaction a Survival Score $\Omega$ computed from deterministic content signals, conversational cues, and structured provenance, combined through a logistic projection. An interaction-count decay law, denoted as $\Omega_{\mathrm{eff}}(\Delta n)$, governs how relevance evolves as new turns arrive, where $\Delta n$ is the number of newer interactions rather than wall-clock time, preserving full determinism. We present the mathematical formulation of DMF, its structured recall pipeline, the pruning decision procedure, and the evaluation protocol. Experiments are conducted on a purpose-built benchmark using the LoCoMo and LongMemEval datasets. We compare DMF against Mem0, a popular memory layer for AI agents. DMF achieves comparable accuracy while using zero tokens to prepare the memory context and 5x to 242x fewer tokens over the entire conversation. These results show that it is possible to eliminate LLM calls from the memory-management loop, reducing token costs to nearly zero and enabling deterministic memory systems for conversational AI agents.
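
To make the scoring idea concrete, the following is a minimal sketch of a Survival Score obtained through a logistic projection and then decayed by interaction count. The abstract does not state the exact formulas, so everything specific here is an assumption made for illustration: the three signal fields, the weights and bias, the exponential form of the decay, the decay rate, and the pruning threshold are all hypothetical and are not taken from DMF itself.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    """Hypothetical per-interaction signals, each assumed to lie in [0, 1]."""
    content: float      # deterministic content signal (assumed)
    cue: float          # conversational cue signal (assumed)
    provenance: float   # structured provenance signal (assumed)


# Hypothetical constants; the paper's actual values are not given in the abstract.
WEIGHTS = (2.0, 1.5, 1.0)
BIAS = -2.0
DECAY_RATE = 0.05
PRUNE_THRESHOLD = 0.2


def survival_score(s: Signals) -> float:
    """Combine the signals linearly, then project into (0, 1) with a logistic."""
    z = BIAS + WEIGHTS[0] * s.content + WEIGHTS[1] * s.cue + WEIGHTS[2] * s.provenance
    return 1.0 / (1.0 + math.exp(-z))


def effective_score(omega: float, delta_n: int) -> float:
    """Decay the score by the number of newer interactions, not wall-clock time.

    An exponential law is assumed here purely for illustration.
    """
    if delta_n < 0:
        raise ValueError("delta_n must be non-negative")
    return omega * math.exp(-DECAY_RATE * delta_n)


def should_prune(omega: float, delta_n: int) -> bool:
    """Prune once the decayed score falls below the (hypothetical) threshold."""
    return effective_score(omega, delta_n) < PRUNE_THRESHOLD


if __name__ == "__main__":
    omega = survival_score(Signals(content=1.0, cue=1.0, provenance=1.0))
    for delta_n in (0, 10, 100):
        print(delta_n, round(effective_score(omega, delta_n), 4), should_prune(omega, delta_n))
```

Because the score depends only on the signals and on the count of newer interactions, the same conversation always yields the same pruning decisions, which is the property the abstract emphasises. No LLM call or clock read appears anywhere in the sketch.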