FlowGuard: Flow Matching for Identity-Independent Detection of Data-Free Model Stealing Attacks on Energy System Intrusion Detection Systems
2026-06-02 • Cryptography and Security
Cryptography and Security, Artificial Intelligence
AI summary
The authors discuss a way to protect energy system defenses that use AI from attacks where hackers try to copy the AI model to fool it. Existing protections either track who is asking (which fails if there are many fake users) or change the AI’s responses (which doesn’t work for certain systems). They propose FlowGuard, which checks if incoming data looks unusual before the AI processes it, based on patterns found in real traffic versus fake queries. Their tests show FlowGuard works better than previous methods even when many fake users are involved, and it doesn’t rely on identifying the user.
Intrusion Detection System, Model Theft Attacks, Flow Matching, Out-of-Distribution Detection, Continuous Normalizing Flow, Sybil Attack, Soft-label Perturbation, Energy Infrastructure Security, Data-free Model Stealing
Authors
Maxime Schwarzer, Laurin Holz, Tobias Huerten, Johannes Loevenich, Thies Moehlenhof, Roberto Rigolin F. Lopes, Veit Hagenmeyer
Abstract
Artificial Intelligence (AI)-based Intrusion Detection Systems (IDS) deployed in energy infrastructure are vulnerable to model theft attacks, which allow adversaries to craft evasive traffic offline. Current defences against model extraction rely either on identity-bound query monitoring, which is ineffective against distributed (Sybil) attackers, or on prediction poisoning through soft-label perturbation, which is inapplicable to hard-label IDS deployments. We therefore propose FlowGuard, an identity-independent defence based on flow matching that flags out-of-distribution (OOD) queries before they reach the IDS. The approach exploits the fact that queries generated synthetically for data-free model stealing attacks occupy a lower-dimensional manifold than real network traffic, which results in measurably lower log-likelihoods under a Continuous Normalizing Flow trained on legitimate data. We evaluate our method against PRADA and FDINet using MAZE and DisGUIDE attacks in single-client and distributed (100-client Sybil) settings. While PRADA's detection rate dropped to 0% in the distributed setting, our defence maintained a stable detection rate across both settings without relying on identity information. We discuss the scope and limitations of the approach and outline potential applications to data-dependent attacks.
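The gating idea described in the abstract (score each query by its log-likelihood under a density model fitted to legitimate traffic, and flag low-scoring queries before the IDS sees them) can be illustrated with a minimal sketch. This is not the authors' implementation: a diagonal Gaussian stands in for the Continuous Normalizing Flow, the traffic is toy data, and every name below (`fit_diag_gaussian`, `calibrate_threshold`, `is_ood`, the 5% calibration rate, the 8-feature layout) is a hypothetical choice made for illustration. The toy attack queries are deliberately built to vary along only two latent directions at a larger scale, so the likelihood gap here is a property of the construction, not evidence about real attacks.

```python
import math
import random

# ASSUMPTION: a diagonal Gaussian replaces the trained CNF as the density model.
def fit_diag_gaussian(rows):
    """Estimate per-feature mean and variance from legitimate samples."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    variances = [
        max(sum((r[j] - means[j]) ** 2 for r in rows) / n, 1e-9)
        for j in range(d)
    ]
    return means, variances

def log_likelihood(x, means, variances):
    """Log-density of one query under the fitted diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xj - m) ** 2 / v)
        for xj, m, v in zip(x, means, variances)
    )

def calibrate_threshold(scores, target_fpr=0.05):
    """Pick the score below which roughly target_fpr of legitimate data falls."""
    ordered = sorted(scores)
    return ordered[int(target_fpr * len(ordered))]

def is_ood(x, means, variances, threshold):
    """Per-query decision; note that no client identifier is involved."""
    return log_likelihood(x, means, variances) < threshold

rng = random.Random(0)
DIM = 8  # hypothetical number of traffic features

def legit_sample():
    # Toy legitimate traffic: independent variation in all 8 features.
    return [rng.gauss(0.0, 1.0) for _ in range(DIM)]

def synthetic_sample():
    # Toy attack query: only 2 latent degrees of freedom spread over 8 features.
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return [3.0 * z1] * 4 + [3.0 * z2] * 4

train = [legit_sample() for _ in range(2000)]
calib = [legit_sample() for _ in range(2000)]
means, variances = fit_diag_gaussian(train)
threshold = calibrate_threshold(
    [log_likelihood(x, means, variances) for x in calib]
)

legit_test = [legit_sample() for _ in range(1000)]
attack_test = [synthetic_sample() for _ in range(1000)]
false_positive_rate = sum(
    is_ood(x, means, variances, threshold) for x in legit_test
) / len(legit_test)
detection_rate = sum(
    is_ood(x, means, variances, threshold) for x in attack_test
) / len(attack_test)

print(f"false positive rate: {false_positive_rate:.3f}")
print(f"detection rate:      {detection_rate:.3f}")
```

Because `is_ood` looks only at the query itself, splitting the same queries across many client identities does not change any individual decision, which is the property the abstract calls identity independence. A real deployment would replace the Gaussian with the trained flow's log-likelihood and calibrate the threshold on held-out legitimate traffic.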