PASK: Toward Intent-Aware Proactive Agents with Long-Term Memory

2026-04-09, Artificial Intelligence

Artificial Intelligence, Computation and Language, Computer Vision and Pattern Recognition, Human-Computer Interaction, Multiagent Systems
AI summary

The authors address the challenge of building proactive AI agents that work in complex real-world settings, where inferring latent user needs and maintaining ongoing context is difficult. They propose a paradigm called DD-MM-PAS that combines demand detection, memory modeling, and a proactive agent system in a closed loop. They instantiate it in a prototype named Pask, which uses a streaming IntentFlow model and a hybrid memory to handle real-time and long-horizon tasks, and they introduce LatentNeeds-Bench, a benchmark built from user-consented real-world data. Their experiments show that IntentFlow matches leading Gemini3-Flash models under latency constraints while identifying deeper user intent.

Proactive AI, Demand Detection, Memory Modeling, IntentFlow, Real-time Constraints, User Intent, Long-term Memory, Streaming AI Agents, LatentNeeds-Bench, Benchmarking
Authors
Zhifei Xie, Zongzheng Hu, Fangda Ye, Xin Zhang, Haobo Chai, Zihang Liu, Pengcheng Wu, Guibin Zhang, Yue Liao, Xiaobin Hu, Deheng Ye, Chunyan Miao, Shuicheng Yan
Abstract
Proactivity is a core expectation for AGI. Prior work remains largely confined to laboratory settings, leaving a clear gap for real-world proactive agents along several dimensions: depth, complexity, ambiguity, precision, and real-time constraints. We study this setting, where useful intervention requires inferring latent needs from ongoing context and grounding actions in evolving user memory under latency and long-horizon constraints. We first propose DD-MM-PAS (Demand Detection, Memory Modeling, Proactive Agent System) as a general paradigm for streaming proactive AI agents. We instantiate this paradigm in Pask, with a streaming IntentFlow model for DD, a hybrid memory (workspace, user, global) for long-term MM, and a PAS infrastructure framework, and we describe how these components form a closed loop. We also introduce LatentNeeds-Bench, a real-world benchmark built from user-consented data and refined through thousands of rounds of human editing. Experiments show that IntentFlow matches leading Gemini3-Flash models under latency constraints, while identifying deeper user intent.
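
To make the closed loop described in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. Every name in it (`HybridMemory`, `detect_demand`, `act`, `run_loop`) is hypothetical, and the keyword rule is only a toy stand-in for the IntentFlow model, whose interface the abstract does not specify. It shows one plausible reading of the loop: each streamed item is checked for a latent need against memory, the agent may act, and the action writes back to memory so that later detection changes.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HybridMemory:
    """Hypothetical three-tier store mirroring the workspace/user/global split."""
    workspace: list = field(default_factory=list)          # recent stream items for the current task
    user: dict = field(default_factory=dict)               # long-lived facts about this user
    global_knowledge: dict = field(default_factory=dict)   # knowledge shared across users


def detect_demand(utterance: str, memory: HybridMemory) -> Optional[str]:
    """Toy stand-in for demand detection (NOT the IntentFlow model).

    Returns a latent-need label, or None when the agent should stay silent.
    """
    text = utterance.lower()
    # Assumed rule: only offer a reminder if none has been offered before.
    if "deadline" in text and "calendar" not in memory.user:
        return "offer_reminder"
    return None


def act(demand: str, memory: HybridMemory) -> str:
    """Toy proactive action that also writes its outcome back to user memory."""
    if demand == "offer_reminder":
        memory.user["calendar"] = "reminder_offered"
        return "Would you like a reminder for that deadline?"
    return ""


def run_loop(stream, memory: HybridMemory) -> list:
    """Process a stream one item at a time: record, detect, act, update memory."""
    outputs = []
    for utterance in stream:
        memory.workspace.append(utterance)
        demand = detect_demand(utterance, memory)
        if demand is not None:
            outputs.append(act(demand, memory))
    return outputs
```

Calling `run_loop(["hello", "the deadline is Friday", "that deadline again"], HybridMemory())` produces a single suggestion: the second mention of the deadline is suppressed because the first action already updated user memory, which is the feedback path that closes the loop in this sketch.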