Agent-Native Immune System: Architecture, Taxonomy, and Engineering
2026-06-26 • Artificial Intelligence
Artificial Intelligence, Multiagent Systems
AI summary
The authors explain that as AI agents become more complex and autonomous, current security methods fall short because they operate outside the agent's internal reasoning process. They propose a new defense system called the Agent-Native Immune System (ANIS) that works inside the agent itself, much like a biological immune system. This system has multiple layers of protection, a way to categorize threats and defenses, and a self-learning mechanism that adapts to new attacks at runtime. They also clarify the difference between training-based alignment and real-time immunity in AI agents and discuss future challenges for this approach.
Autonomous agents, Memory poisoning, Tool-chain manipulation, Multi-agent collaboration, Agent alignment, Immune system analogy, Continual immune learning, Meta-cognition, Runtime security, Autoimmunity rate
Authors
Bo Shen, Lifeng Chang, Tianyuan Wei, Yunpeng Li, Feng Shi, Yichen Han, Peijie Gao, Shiyi Kuang, Xin Chang, Dehui Li
Abstract
The transition from static chatbots to autonomous agents (equipped with persistent memory, tool-use protocols, and multi-agent collaboration) has fundamentally expanded the AI threat landscape. Current defense mechanisms, such as perimeter security and training-time alignment, remain external to the agent's active reasoning loop. Consequently, they fall short: a fully aligned agent remains highly vulnerable to runtime hijacking via memory poisoning, tool-chain manipulation, or multi-agent protocol attacks. To address this critical gap, we introduce the Agent-Native Immune System (ANIS), the first biologically inspired, endogenous defense architecture embedded directly within the agent's cognitive loop. Our framework presents four primary contributions. First, we design a six-layer Immune Tower (L0-L5), distinctly incorporating Barrier Immunity (L1) as a non-cognitive, physical-and-logical isolation layer. Second, we establish a unified taxonomy of Agent Viruses and Agent Vaccines, formalizing the critical distinction between superficial non-parametric defenses and robust parametric vaccines. Third, we conceptualize the Harness Triad (Meta, Self, and Auto), a self-monitoring, meta-cognitive automation backbone that drives Continual Immune Learning (CIL), enabling vaccines to dynamically adapt to novel threats. Finally, we establish a rigorous theoretical demarcation between model alignment and agent immunity: while alignment provides a static "constitutional" value foundation during training, ANIS serves as the dynamic "law enforcement" mechanism during runtime. We conclude by framing open challenges for the field, including immune protocol standardization, novel evaluation metrics such as the Autoimmunity Rate (false-positive intervention rate), and the co-evolutionary dynamics between pathogens and vaccines within collective intelligence ecosystems.
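The abstract characterizes the Autoimmunity Rate only as a false-positive intervention rate. A minimal sketch of one plausible reading follows: the fraction of benign agent actions on which the immune layer intervened anyway. The `Episode` record, its field names, and the `autoimmunity_rate` function are hypothetical names chosen for illustration and are not taken from the paper; the exact definition the authors use may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    """One logged agent action (hypothetical record, not from the paper)."""
    benign: bool      # ground-truth label: True if the action was harmless
    intervened: bool  # True if the immune layer blocked or altered the action


def autoimmunity_rate(episodes):
    """Share of benign actions that still triggered an intervention.

    Assumed reading of "false-positive intervention rate":
    false positives / all benign actions.
    """
    benign = [e for e in episodes if e.benign]
    if not benign:
        raise ValueError("need at least one benign episode")
    return sum(e.intervened for e in benign) / len(benign)


# Four benign actions (one wrongly blocked) and one real attack (correctly blocked).
log = [
    Episode(benign=True, intervened=False),
    Episode(benign=True, intervened=True),
    Episode(benign=True, intervened=False),
    Episode(benign=True, intervened=False),
    Episode(benign=False, intervened=True),
]

print(autoimmunity_rate(log))  # 0.25
```

The correctly blocked attack does not enter the calculation: under this reading the metric measures only how often the defense disrupts legitimate behavior, so it would need to be reported alongside a detection rate to describe the defense as a whole.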