Tracing the Chain: Deep Learning for Stepping-Stone Intrusion Detection

2026-04-09 · Cryptography and Security

Cryptography and Security, Machine Learning
AI summary

The authors study stepping-stone intrusions, in which attackers hide their origin by routing activity through a chain of compromised computers. They apply a deep learning model named ESPRESSO, which identifies these attacks by correlating the network flows entering and leaving each relay more accurately than earlier methods. To train and test the model, they built a tool that generates synthetic but realistic traffic across several tunneling protocols. The model detects attacks at very low false positive rates and can also estimate how many hops the attacker's chain contains, which helps distinguish malicious from benign pivoting. They also found that the model is most vulnerable when attackers perturb timing patterns.

Stepping-stone intrusion, Network flow correlation, Deep learning, Transformer, Triplet metric learning, Tunneling protocols, SSH, SOCAT, False positive rate, Pivoting
Authors
Nate Mathews, Nicholas Hopper, Matthew Wright
Abstract
Stepping-stone intrusions (SSIs) are a prevalent network evasion technique in which attackers route sessions through chains of compromised intermediate hosts to obscure their origin. Effective SSI detection requires correlating the incoming and outgoing flows at each relay host at extremely low false positive rates -- a stringent requirement that renders classical statistical methods inadequate in operational settings. We apply ESPRESSO, a deep learning flow correlation model combining a transformer-based feature extraction network, time-aligned multi-channel interval features, and online triplet metric learning, to the problem of stepping-stone intrusion detection. To support training and evaluation, we develop a synthetic data collection tool that generates realistic stepping-stone traffic across five tunneling protocols: SSH, SOCAT, ICMP, DNS, and mixed multi-protocol chains. Across all five protocols and in both host-mode and network-mode detection scenarios, ESPRESSO substantially outperforms the state-of-the-art DeepCoFFEA baseline, achieving a true positive rate exceeding 0.99 at a false positive rate of $10^{-3}$ for standard bursty protocols in network-mode. We further demonstrate chain length prediction as a tool for distinguishing malicious from benign pivoting, and conduct a systematic robustness analysis revealing that timing-based perturbations are the primary vulnerability of correlation-based stepping-stone detectors.
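To make two ideas from the abstract concrete, the following is a minimal sketch of a triplet loss over flow embeddings and of reading off the true positive rate at a fixed false positive rate. It is not the authors' implementation: the abstract does not specify the similarity measure, margin, or thresholding procedure, so the use of cosine similarity, the `margin` value, and the helper names `triplet_loss` and `tpr_at_fpr` are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)


def triplet_loss(anchor, positive, negative, margin: float = 0.1) -> float:
    """Hinge-style triplet loss (assumed form, not taken from the paper).

    anchor/positive: embeddings of the incoming and outgoing flow of the
    same relayed session; negative: embedding of an unrelated flow.
    The loss is zero once the correlated pair is more similar than the
    uncorrelated pair by at least `margin`.
    """
    sim_pos = cosine_similarity(anchor, positive)
    sim_neg = cosine_similarity(anchor, negative)
    return max(0.0, sim_neg - sim_pos + margin)


def tpr_at_fpr(pos_scores: Sequence[float],
               neg_scores: Sequence[float],
               target_fpr: float) -> Tuple[float, float]:
    """Return (threshold, TPR) with the FPR on neg_scores kept <= target_fpr.

    A pair is flagged as correlated when its score is strictly greater
    than the threshold.
    """
    allowed = int(target_fpr * len(neg_scores))  # false positives tolerated
    ranked = sorted(neg_scores, reverse=True)
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    tpr = sum(s > threshold for s in pos_scores) / len(pos_scores)
    return threshold, tpr


if __name__ == "__main__":
    # Toy scores only; these are not results from the paper.
    threshold, tpr = tpr_at_fpr([0.95, 0.8, 0.25, 0.5],
                                [0.1, 0.2, 0.3, 0.9],
                                target_fpr=0.25)
    print(threshold, tpr)
```

In practice, evaluating at a false positive rate of $10^{-3}$ requires many more uncorrelated pairs than correlated ones, since with fewer than 1,000 negative scores the tolerated number of false positives in this sketch rounds down to zero.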