HyperTool: Beyond Step-Wise Tool Calls for Tool-Augmented Agents
2026-06-11 • Computation and Language
AI summary
The authors identify a problem with step-by-step tool use: every small tool call becomes a separate model decision, which consumes context and forces the model to track low-level data flow itself. They propose HyperTool, an interface that bundles multiple tool calls into a single step by letting the model write a small program that calls tools internally. Models trained to use HyperTool perform better on complex tasks requiring multiple tools: on MCP-Universe, average accuracy rises from 15.69% to 35.29% for Qwen3-32B and from 9.93% to 33.33% for Qwen3-8B.
LLM agents, tool calls, execution granularity, HyperTool, MCP interface, multi-step tool use, code block invocation, model training, Qwen3, MCP-Universe
Authors
Yaxin Du, Yifan Zhou, Yujie Ge, Jiajun Wang, Xianghe Pang, Shuo Tang, Tuney Zheng, Bryan Dai, Jian Yang, Siheng Chen
Abstract
Tool-augmented LLM agents commonly rely on step-wise atomic tool calls, where each invocation, observation, and value transfer is exposed in the main reasoning trace. This creates an execution-granularity mismatch: locally deterministic tool workflows are unfolded into repeated model-visible decisions, consuming context and forcing the model to manage low-level dataflow in the trace. We introduce HyperTool, a unified executable MCP-style tool interface that changes the model-visible unit of tool execution. A model invokes HyperTool with a code block that can call existing tools through their original schemas, manipulate returned values, and pass intermediate results locally, folding deterministic tool subroutines into a single outer call. To train models to use this interface, we synthesize HyperTool-format trajectories from cross-tool compositional tasks and verify them in real MCP environments. On MCP-Universe, HyperTool improves average accuracy from 15.69% to 35.29% on Qwen3-32B and from 9.93% to 33.33% on Qwen3-8B, and surpasses GPT-OSS and Kimi-k2.5 on average accuracy, showing that HyperTool can substantially improve multi-step tool use.
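The idea of folding a deterministic tool subroutine into one outer call can be illustrated with a minimal Python sketch. This is not the authors' implementation: the tools `search_flights` and `get_baggage_fee`, the `hypertool` function, and the convention of returning a variable named `result` are all hypothetical, chosen only to show the shape of the interface. A real system would dispatch to MCP tools and sandbox the code rather than call `exec` directly.

```python
from typing import Any, Callable, Dict, List, Tuple


# Hypothetical stand-ins for existing tools (assumption: real ones are MCP calls).
def search_flights(origin: str, dest: str) -> List[Dict[str, Any]]:
    return [{"id": "F1", "price": 320}, {"id": "F2", "price": 210}]


def get_baggage_fee(flight_id: str) -> int:
    return {"F1": 0, "F2": 45}[flight_id]


TOOLS: Dict[str, Callable[..., Any]] = {
    "search_flights": search_flights,
    "get_baggage_fee": get_baggage_fee,
}


def hypertool(code: str, tools: Dict[str, Callable[..., Any]]) -> Tuple[Any, List[str]]:
    """Run a model-written code block with the tools in scope.

    Only the final `result` variable is returned to the caller; intermediate
    values stay local to the block. Inner tool calls are logged for inspection.
    NOTE: plain exec() is unsafe for untrusted code; this is illustration only.
    """
    inner_calls: List[str] = []

    def logged(name: str, fn: Callable[..., Any]) -> Callable[..., Any]:
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            inner_calls.append(name)
            return fn(*args, **kwargs)
        return wrapper

    scope: Dict[str, Any] = {name: logged(name, fn) for name, fn in tools.items()}
    exec(code, scope)
    return scope.get("result"), inner_calls


# The code block a model might emit as the argument of a single outer call.
CODE_BLOCK = """
flights = search_flights("SFO", "JFK")
totals = {f["id"]: f["price"] + get_baggage_fee(f["id"]) for f in flights}
result = min(totals, key=totals.get)
"""

result, inner_calls = hypertool(CODE_BLOCK, TOOLS)
print(result, len(inner_calls))
```

In a step-wise setup, the three inner tool calls here would each appear as a separate invocation and observation in the reasoning trace, with the model copying flight IDs and prices between steps. In this sketch the model sees one outer call and one returned value.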