LongSeeker: Elastic Context Orchestration for Long-Horizon Search Agents

2026-05-06 · Artificial Intelligence

AI summary

The authors address the challenge of managing a rapidly growing body of intermediate information when AI agents work on tasks that span many steps. They propose a paradigm called Context-ReAct, which helps agents retain only the important details through five operations that skip, summarize, roll back, excerpt, or delete parts of their working context. This makes the agent's reasoning more efficient and less error-prone. Building on this paradigm, they develop LongSeeker, an agent that outperforms prior agents on several long-horizon search benchmarks. The work shows that deliberate context management helps AI agents reason more reliably over long tasks.

long-horizon search, context management, reasoning agent, tool use, memory compression, adaptive context, hallucination risk, fine-tuning, benchmark, knowledge summarization
Authors
Yijun Lu, Rui Ye, Yuwen Du, Jiajun Wang, Songhua Liu, Siheng Chen
Abstract
Long-horizon search agents must manage a rapidly growing working context as they reason, call tools, and observe information. Naively accumulating all intermediate content can overwhelm the agent, increasing costs and the risk of errors. We propose that effective context management should be adaptive: parts of the agent's trajectory are maintained at different levels of detail depending on their current relevance to the task. To operationalize this principle, we introduce Context-ReAct, a general agentic paradigm for elastic context orchestration that integrates reasoning, context management, and tool use in a unified loop. Context-ReAct provides five atomic operations: Skip, Compress, Rollback, Snippet and Delete, which allow the agent to dynamically reshape its working context, preserving important evidence, summarizing resolved information, discarding unhelpful branches, and controlling context size. We prove that the Compress operator is expressively complete, while the other specialized operators provide efficiency and fidelity guarantees that reduce generation cost and hallucination risk. Building on this paradigm, we develop LongSeeker, a long-horizon search agent fine-tuned from Qwen3-30B-A3B on 10k synthesized trajectories. Across four representative search benchmarks, LongSeeker achieves 61.5% on BrowseComp and 62.5% on BrowseComp-ZH, substantially outperforming Tongyi DeepResearch (43.2% and 46.7%) and AgentFold (36.2% and 47.3%). These results highlight the potential of adaptive context management, showing that agents can achieve more reliable and efficient long-horizon reasoning by actively shaping their working memory.