Security Concerns in Generative AI Coding Assistants: Insights from Online Discussions on GitHub Copilot

2026-04-09 · Software Engineering

Software Engineering, Cryptography and Security, Human-Computer Interaction
AI summary

The authors examined what worries software developers when they use AI tools such as GitHub Copilot to help write code. They studied online discussions and found four main security concerns: leakage of private data, unclear code ownership and licensing rules, adversarial attacks that manipulate the AI (such as prompt injection), and suggestions that introduce insecure code. The study clarifies these security issues so that future AI coding tools can be made safer and more trustworthy.

Generative AI, GitHub Copilot, code security, data leakage, code licensing, adversarial attacks, prompt injection, software engineering, BERTopic, thematic analysis
Authors
Nicolás E. Díaz Ferreyra, Monika Swetha Gurupathi, Zadia Codabux, Nalin Arachchilage, Riccardo Scandariato
Abstract
Generative Artificial Intelligence (GenAI) has become a central component of many development tools (e.g., GitHub Copilot) that support software practitioners across multiple programming tasks, including code completion, documentation, and bug detection. However, current research has identified significant limitations and open issues in GenAI, including reliability, non-determinism, bias, and copyright infringement. While prior work has primarily focused on assessing the technical performance of these technologies for code generation, less attention has been paid to the emerging concerns of software developers, particularly in the security realm. OBJECTIVE: This work explores security concerns regarding the use of GenAI-based coding assistants by analyzing challenges voiced by developers and software enthusiasts in public online forums. METHOD: We retrieved posts, comments, and discussion threads addressing security issues in GitHub Copilot from three popular platforms, namely Stack Overflow, Reddit, and Hacker News. These discussions were clustered using BERTopic and then synthesized using thematic analysis to identify distinct categories of security concerns. RESULTS: Four major concern areas were identified, namely potential data leakage, code licensing, adversarial attacks (e.g., prompt injection), and insecure code suggestions, underscoring critical reflections on the limitations and trade-offs of GenAI in software engineering. IMPLICATIONS: Our findings contribute to a broader understanding of how developers perceive and engage with GenAI-based coding assistants, while highlighting key areas for improving their built-in security features.