AFGNN: API Misuse Detection using Graph Neural Networks and Clustering
2026-04-09 • Software Engineering
AI summary
The authors developed AFGNN, a new tool that uses graph neural networks to find mistakes in how Java APIs are used in code. Their method represents the code as an API Flow Graph to understand the sequence and flow of API calls better. They trained the model to recognize different usage patterns, including errors, even for APIs it hasn't seen before. Tests showed AFGNN performs better than existing models and tools at spotting API misuse.
API, Java standard library, Graph Neural Network, API misuse detection, API Flow Graph, self-supervised learning, embedding, software bugs, control flow, data flow
Authors
Ponnampalam Pirapuraj, Tamal Mondal, Sharanya Gupta, Akash Lal, Somak Aditya, Jyothi Vedurada
Abstract
Application Programming Interfaces (APIs) are crucial to software development, enabling integration of existing systems with new applications by reusing tried and tested code, saving development time and increasing software safety. In particular, the Java standard library APIs, along with numerous third-party APIs, are extensively used in the development of enterprise application software. However, their misuse remains a significant source of bugs and vulnerabilities. Furthermore, because the official API documentation offers limited examples, developers often rely on online portals and generative AI models to learn unfamiliar APIs, but using such examples may introduce unintentional errors into the software. In this paper, we present AFGNN, a novel Graph Neural Network (GNN)-based framework for efficiently detecting API misuses in Java code. AFGNN uses a novel API Flow Graph (AFG) representation that captures the API execution sequence, data flow, and control flow information present in the code to model API usage patterns. AFGNN uses self-supervised pre-training with the AFG representation to effectively compute embeddings for unknown API usage examples and cluster them to identify different usage patterns. Experiments on popular API usage datasets show that AFGNN significantly outperforms state-of-the-art small language models and API misuse detectors.
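To make the idea of a graph that records API execution sequence, data flow, and control flow more concrete, here is a minimal toy sketch in Python. It is not the authors' AFG construction, which the abstract does not specify. The class `ToyApiFlowGraph`, the three edge labels, and the hand-built `java.io.FileReader` example are all hypothetical, chosen only to illustrate how one API usage could be stored as a graph with typed edges.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical edge labels mirroring the three kinds of information
# the abstract says the graph captures.
EDGE_KINDS = ("sequence", "data", "control")


@dataclass
class ToyApiFlowGraph:
    """Toy graph: nodes are API calls, edges are typed by EDGE_KINDS."""
    nodes: list = field(default_factory=list)
    edges: dict = field(default_factory=lambda: defaultdict(set))

    def add_call(self, api_name):
        """Add an API call node and return its integer id."""
        self.nodes.append(api_name)
        return len(self.nodes) - 1

    def add_edge(self, kind, src, dst):
        """Add a typed edge between two existing node ids."""
        if kind not in EDGE_KINDS:
            raise ValueError(f"unknown edge kind: {kind}")
        self.edges[kind].add((src, dst))

    def successors(self, kind, src):
        """Names of nodes reachable from src over one edge of this kind."""
        return sorted(self.nodes[d] for s, d in self.edges[kind] if s == src)


def build_example():
    """Hand-built graph for this illustrative Java usage:

        FileReader r = new FileReader(path);
        if (r.ready()) { r.read(); }
        r.close();
    """
    g = ToyApiFlowGraph()
    init = g.add_call("FileReader.<init>")
    ready = g.add_call("FileReader.ready")
    read = g.add_call("FileReader.read")
    close = g.add_call("FileReader.close")

    # Possible execution orders, including the path that skips read().
    for src, dst in [(init, ready), (ready, read), (read, close), (ready, close)]:
        g.add_edge("sequence", src, dst)
    # The constructed object r flows into every later call.
    for dst in (ready, read, close):
        g.add_edge("data", init, dst)
    # read() only executes when ready() returns true.
    g.add_edge("control", ready, read)
    return g


graph = build_example()
print(graph.successors("control", 1))
```

In this toy form, a usage that omits the `FileReader.close` node would yield a visibly different graph, which is the kind of structural difference a learned embedding could separate into distinct clusters.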