A Survey of Bugs in AI-Generated Code

Published: arXiv: 2512.05239v1
Authors

Ruofan Gao, Amjed Tahir, Peng Liang, Teo Susnjak, Foutse Khomh

Abstract

Developers are widely using AI code-generation models, aiming to increase productivity and efficiency. However, there are also quality concerns regarding AI-generated code. The generated code is produced by models trained on publicly available code, which is known to contain bugs and quality issues. These issues can cause trust and maintenance challenges during the development process. Several quality issues associated with AI-generated code have been reported, including bugs and defects. However, these findings are often scattered and lack a systematic summary. A comprehensive review is currently lacking to reveal the types and distribution of these errors, possible remediation strategies, as well as their correlation with specific models. In this paper, we systematically analyze the existing AI-generated code literature to establish an overall understanding of bugs and defects in generated code, providing a reference for future model improvement and quality assessment. We aim to understand the nature and extent of bugs in AI-generated code, and provide a classification of bug types and patterns present in code generated by different models. We also discuss possible fixes and mitigation strategies adopted to eliminate bugs from the generated code.

Paper Summary

Problem
Artificial intelligence (AI) code generation tools have revolutionized software development by automating coding tasks and suggesting code snippets. However, these tools are not perfect and often produce buggy code, which can lead to errors, security vulnerabilities, and maintenance issues. Researchers are concerned about the quality and reliability of AI-generated code, but a comprehensive review of the existing literature on this topic is lacking.
Key Innovation
This research paper aims to fill this gap by systematically analyzing the existing literature on AI-generated code and identifying the types and distribution of bugs, as well as possible remediation strategies. The authors propose a taxonomy of bugs in AI-generated code, which includes functional bugs, syntax bugs, semantic bugs, and logical bugs. They also analyze the frequency and distribution of each bug category and discuss possible fixes and mitigation strategies.
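The four bug categories can be made concrete with small illustrations. The snippets below are our own hypothetical examples, not drawn from the survey; the function names (`mean_logical_bug`, `mean_functional_bug`, `mean_fixed`) are invented for this sketch.

```python
# Hypothetical illustrations of the four bug categories in the taxonomy.
# (Examples are ours, not taken from the surveyed papers.)

# Syntax bug: code that fails to parse at all, e.g. a missing colon.
#   def mean(xs)              # SyntaxError: expected ':'
#       return sum(xs) / len(xs)

# Semantic bug: parses, but misuses the language's semantics,
# e.g. dividing a number by a list.
#   def mean(xs):
#       return sum(xs) / xs   # TypeError at runtime

def mean_logical_bug(xs):
    # Logical bug: flawed reasoning; the wrong denominator
    # silently produces an incorrect result.
    return sum(xs) / (len(xs) - 1)

def mean_functional_bug(xs):
    # Functional bug: fails to meet the (assumed) specification
    # that an empty input should be handled gracefully.
    return sum(xs) / len(xs)  # ZeroDivisionError when xs == []

def mean_fixed(xs):
    # Corrected version that matches the intended specification.
    if not xs:
        return 0.0
    return sum(xs) / len(xs)

print(mean_logical_bug([1, 2, 3]))  # 3.0, but the true mean is 2.0
print(mean_fixed([1, 2, 3]))        # 2.0
print(mean_fixed([]))               # 0.0
```

Note that only the syntax and semantic bugs surface as hard failures; the logical and functional bugs run without error, which is why the paper argues they are harder to detect in AI-generated code.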
Practical Impact
The findings of this research have significant practical implications for software developers, quality assurance teams, and researchers. By understanding the types and distribution of bugs in AI-generated code, developers can take steps to improve the quality of the code, reduce errors, and ensure the reliability of their software systems. Additionally, this research can inform the development of more robust and reliable AI code generation tools, which can further enhance productivity and efficiency in software development.
Analogy / Intuitive Explanation
Imagine a skilled writer who can generate high-quality text, but occasionally makes grammatical errors or uses incorrect vocabulary. Similarly, AI code generation tools can produce high-quality code, but may also introduce bugs or errors. The goal of this research is to understand the types and distribution of these errors, so that developers can identify and fix them, and ensure that the generated code meets the required standards of quality and reliability.
Paper Information
Categories:
cs.SE cs.AI