PRIME: Training Free Proactive Reasoning via Iterative Memory Evolution for User-Centric Agent
2026-04-08 • Artificial Intelligence
AI summary
The authors developed PRIME, a new way for AI agents to learn how to use tools while working with humans over many interactions. Instead of heavy training with lots of computing power, PRIME helps the agent learn by remembering and organizing past experiences into clear categories like what worked, what failed, and what the user prefers. This organized memory guides the AI’s future actions without needing complicated training. Their tests show PRIME matches traditional methods in performance but is cheaper and easier to understand.
autonomous agents, tool-use agents, human-AI interaction, reinforcement learning, gradient-free learning, experience accumulation, credit assignment, retrieval-augmented generation, meta-level operations
Authors
Prince Zizhuang Wang, Shuli Jiang
Abstract
The development of autonomous tool-use agents that collaborate with human users on complex, long-horizon tasks has become a frontier of agentic research. During multi-turn Human-AI interactions, the dynamic and uncertain nature of user demands poses a significant challenge: agents must not only invoke tools but also iteratively refine their understanding of user intent through effective communication. While recent advances in reinforcement learning offer a path to more capable tool-use agents, existing approaches incur high training costs and struggle with turn-level credit assignment across extended interaction horizons. To address this, we introduce PRIME (Proactive Reasoning via Iterative Memory Evolution), a gradient-free learning framework that enables continual agent improvement through explicit experience accumulation rather than expensive parameter optimization. PRIME distills multi-turn interaction trajectories into structured, human-readable experiences organized across three semantic zones: successful strategies, failure patterns, and user preferences. These experiences evolve through meta-level operations and guide future agent behavior via retrieval-augmented generation. Our experiments across several diverse user-centric environments demonstrate that PRIME achieves performance competitive with gradient-based methods while offering cost-efficiency and interpretability. Overall, PRIME presents a practical paradigm for building proactive, collaborative agents that learn from Human-AI interaction without the computational burden of gradient-based training.
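To make the idea of a three-zone experience memory more concrete, the following is a minimal Python sketch and not the authors' implementation. The abstract does not specify data structures, the meta-level operations, or the retrieval method, so everything here is an assumption made for illustration: the names `Experience`, `ExperienceMemory`, `add`, `retrieve`, and `build_prompt` are hypothetical, exact-duplicate skipping stands in for a meta-level operation, and simple word overlap stands in for whatever retriever the paper actually uses.

```python
from dataclasses import dataclass, field

# The three semantic zones named in the abstract.
ZONES = ("successful_strategies", "failure_patterns", "user_preferences")


@dataclass
class Experience:
    zone: str      # one of ZONES
    text: str      # human-readable lesson distilled from a trajectory
    uses: int = 0  # how often this experience has been retrieved


@dataclass
class ExperienceMemory:
    items: list = field(default_factory=list)

    def add(self, zone, text):
        """Store an experience; skip exact duplicates (assumed stand-in
        for a meta-level operation, not taken from the paper)."""
        if zone not in ZONES:
            raise ValueError(f"unknown zone: {zone}")
        for item in self.items:
            if item.zone == zone and item.text == text:
                return item
        exp = Experience(zone, text)
        self.items.append(exp)
        return exp

    def retrieve(self, query, k=2):
        """Return up to k experiences ranked by word overlap with the query
        (a toy retriever chosen only to keep the sketch dependency-free)."""
        q = set(query.lower().split())
        scored = []
        for item in self.items:
            overlap = len(q & set(item.text.lower().split()))
            if overlap > 0:
                scored.append((overlap, item))
        # sort() is stable, so ties keep insertion order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        top = [item for _, item in scored[:k]]
        for item in top:
            item.uses += 1
        return top

    def build_prompt(self, query, k=2):
        """Prepend retrieved experiences to the user request."""
        lines = [f"[{e.zone}] {e.text}" for e in self.retrieve(query, k)]
        return ("Relevant experiences:\n" + "\n".join(lines)
                + f"\n\nUser request: {query}")


if __name__ == "__main__":
    memory = ExperienceMemory()
    memory.add("successful_strategies",
               "confirm travel dates before calling the flight search tool")
    memory.add("failure_patterns",
               "booking hotel without asking about budget led to rejection")
    memory.add("user_preferences", "user prefers morning flight departures")
    print(memory.build_prompt("book morning flight"))
```

In this sketch the agent's parameters are never updated; behavior changes only because the retrieved, human-readable entries placed in the prompt change, which is the sense in which such a memory can be gradient-free and inspectable.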