High-Precision APT Malware Attribution with Out-of-Scope Resilience
2026-06-02 • Cryptography and Security
Cryptography and Security, Artificial Intelligence, Machine Learning
AI summary
The authors developed a method for identifying which hacking group created a piece of malicious software, focusing on advanced persistent threats (APTs). Instead of forcing every sample onto a fixed list of known groups, their approach trains binary (yes/no) classifiers for each group, applies them in ranked order, and attributes a sample only when a classifier is confident; otherwise the method abstains. They tested this on datasets containing many groups absent from training and showed that the method abstained on most such samples while remaining accurate on the samples it did attribute. This reduces the risk of attributing malware to the wrong group in real-world use.
Keywords: Advanced Persistent Threat (APT), malware attribution, binary classifiers, closed-set classification, out-of-scope samples, precision, selective accuracy, machine learning, abstention
Authors
Peter Williams, Adam Sobey, Erisa Karafili
Abstract
Early attribution of Advanced Persistent Threat (APT) activity can help defenders prioritise investigation, select countermeasures, and reduce the impact of an intrusion. Malware provides useful attribution evidence, but automated APT malware attribution remains difficult in practice. Existing approaches are typically trained and evaluated as closed-set classifiers over a limited number of known APT groups. In operational environments, however, classifiers are likely to encounter samples from groups not represented during training. Closed-set classifiers are then forced to assign such samples to known groups, producing unsupported and potentially misleading attributions. We present a high-precision APT malware attribution method based on ranked binary classifiers with explicit abstention. Rather than training a single multi-class classifier, our approach trains and tunes two binary classifiers per APT group, ranks the classifiers by validation performance, and applies them sequentially. A sample is attributed only when a classifier provides sufficient evidence; otherwise, the method abstains. We evaluate the method on the APT Malware dataset and on a larger combined dataset designed to stress-test out-of-scope behaviour. On the APT Malware dataset, the method achieves higher precision than previously published results on the same dataset. In the most challenging setting, where 87% of test samples came from 60 APT groups excluded from training, the method abstained on 94% of out-of-scope samples while maintaining 92% precision and 95% selective accuracy on the samples it classified.
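The ranked, sequential decision rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `BinaryAttributor` wrapper, the `threshold` and `val_score` fields, the toy scoring functions, and the metric definitions in `selective_report` are all assumptions made for illustration; the abstract does not specify the features, the models, the ranking metric, or exactly how precision and selective accuracy are computed. For brevity the sketch uses one classifier per group, whereas the abstract describes two; the list could equally hold several entries for the same group.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class BinaryAttributor:
    """Hypothetical wrapper around one tuned binary classifier."""
    group: str                                  # APT group this classifier votes for
    score: Callable[[Sequence[float]], float]   # returns a confidence in [0, 1]
    threshold: float                            # assumed per-classifier cut-off
    val_score: float                            # assumed validation metric used for ranking


def attribute(sample: Sequence[float],
              classifiers: List[BinaryAttributor]) -> Optional[str]:
    """Apply classifiers in ranked order; return a group, or None to abstain."""
    ranked = sorted(classifiers, key=lambda c: c.val_score, reverse=True)
    for clf in ranked:
        if clf.score(sample) >= clf.threshold:
            return clf.group      # first sufficiently confident classifier wins
    return None                   # no classifier was confident enough


def selective_report(y_true: List[str], y_pred: List[Optional[str]],
                     known_groups: set) -> dict:
    """Assumed definitions: abstention rate on out-of-scope samples and
    accuracy over the samples that received an attribution."""
    oos = [p for t, p in zip(y_true, y_pred) if t not in known_groups]
    attributed = [(t, p) for t, p in zip(y_true, y_pred) if p is not None]
    return {
        "oos_abstention": sum(p is None for p in oos) / len(oos) if oos else None,
        "selective_accuracy": (sum(t == p for t, p in attributed) / len(attributed)
                               if attributed else None),
    }


# Toy classifiers over a 2-feature sample; the scoring rules are invented.
classifiers = [
    BinaryAttributor("GroupA", lambda x: x[0], threshold=0.9, val_score=0.97),
    BinaryAttributor("GroupB", lambda x: x[1], threshold=0.8, val_score=0.99),
]

samples = [[0.95, 0.10], [0.20, 0.85], [0.50, 0.50], [0.95, 0.90]]
y_true = ["GroupA", "GroupB", "GroupZ", "GroupB"]   # GroupZ is out of scope
y_pred = [attribute(s, classifiers) for s in samples]
report = selective_report(y_true, y_pred, {"GroupA", "GroupB"})
```

In this toy run the third sample, which belongs to a group no classifier was trained on, falls below every threshold and is left unattributed rather than being forced onto a known group. The fourth sample passes both thresholds and is assigned by the higher-ranked classifier, which is where the ranking order matters.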