Autonomous LLM Agent Worms: Cross-Platform Propagation, Automated Discovery and Temporal Re-Entry Defense
2026-05-04 • Cryptography and Security
AI summary
The authors study how autonomous AI agents built on large language models (LLMs) can unintentionally let harmful content persist and spread through their saved files and memory. They build tools that automatically discover and test how such worm-like behaviors can move across agents and systems without human involvement. Their experiments show that certain prompt types and file read operations are especially risky, and they propose a defense, RTW-A, that stops these persistent threats while keeping normal agent functions working. This research helps make multi-agent AI systems safer from hidden attacks. The specific systems tested remain unnamed pending coordinated disclosure.
Autonomous LLM Agents • Persistent Workspace • Data Flow Analysis • Worm Propagation • Payload Optimization • Privilege Escalation • Data Exfiltration • Prompt Engineering • Security Defenses • Capability Attenuation
Authors
Mingming Zha, Xiaofeng Wang
Abstract
Autonomous LLM agents operate as long-running processes with persistent workspaces, memory files, scheduled task state, and messaging integrations. These features create a new propagation risk: attacker-influenced content can be written into persistent agent state, re-enter the LLM decision context through scheduled autoloading, and drive high-risk actions including configuration changes and cross-agent transmission. We present the first systematic framework for automated analysis of persistent worm propagation in file-backed multi-agent LLM ecosystems. SSCGV, our automated source-code graph analyzer, traces data flow from file I/O to LLM context injection points and ranks carriers by context injection position without manual analysis. SRPO, our summary-resilient payload optimizer, generates worm payloads robust to LLM-mediated summarization and paraphrasing across multi-hop communication. Evaluated on three production agent frameworks, we demonstrate zero-click autonomous propagation, 3-hop cross-platform transmission without platform-specific adaptation, inter-agent privilege escalation, and data exfiltration. We report two empirical insights: user prompt carriers achieve higher attack compliance than system prompt carriers, and read operations represent the primary integrity threat in LLM-mediated systems. To defend against this class of attacks, we develop RTW-A, whose guarantees we prove as a formal No Persistent Worm Propagation theorem. RTW-A blocks write-before-exposed-read re-entry; sealed configuration protects static files; typed memory promotion prevents untrusted summaries from entering trusted memory; and capability attenuation limits high-risk actions after external reads. These mechanisms break the persistence-re-entry-action chain while preserving ordinary workflows. Affected systems are anonymized pending coordinated disclosure.
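To make the defense idea concrete, here is a minimal, hypothetical sketch of capability attenuation combined with quarantined memory writes, in the spirit of RTW-A. All class, method, and action names below are illustrative assumptions, not the paper's actual implementation: after any read of externally influenced content, high-risk actions are denied and writes to persistent memory are quarantined instead of being autoloaded into trusted context.

```python
# Illustrative sketch (not the paper's implementation) of RTW-A-style
# capability attenuation: once the agent reads external/untrusted content,
# high-risk actions are blocked and memory writes are quarantined until
# a typed promotion step vets them.

HIGH_RISK = {"update_config", "send_to_agent", "schedule_task"}  # assumed action names

class AttenuatingGuard:
    def __init__(self):
        self.tainted = False   # has an external/untrusted read occurred this session?
        self.quarantine = []   # writes held out of the trusted, autoloaded context

    def read_external(self, content):
        # Any external read attenuates the agent's capabilities for the session.
        self.tainted = True
        return content

    def act(self, action, payload):
        if self.tainted and action in HIGH_RISK:
            raise PermissionError(f"{action} blocked after external read")
        if action == "write_memory" and self.tainted:
            # Untrusted-derived content never enters trusted memory directly;
            # it waits in quarantine for explicit typed promotion.
            self.quarantine.append(payload)
            return "quarantined"
        return "allowed"

guard = AttenuatingGuard()
print(guard.act("update_config", {"model": "x"}))  # prints "allowed": no taint yet
guard.read_external("attacker-influenced message")
print(guard.act("write_memory", "summary text"))   # prints "quarantined"
try:
    guard.act("send_to_agent", "payload")
except PermissionError as e:
    print(e)  # prints "send_to_agent blocked after external read"
```

The key design choice this sketch mirrors is ordering: the guard checks taint state *before* dispatching any action, so a worm payload that re-enters via a read can never reach a cross-agent send or configuration write in the same session.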