Defect Prediction with Content-based Features
Pham, Hung Viet, Nguyen, Tung Thanh
arXiv.org Artificial Intelligence
Traditional defect prediction approaches often use metrics that measure the complexity of a software system's design or implementation, such as the number of lines of code in a source file. In this paper, we explore a different approach based on the content of source code. Our key assumption is that the source code of a software system contains information about its technical aspects, and that those aspects might have different levels of defect-proneness. Thus, content-based features such as words, topics, data types, and package names extracted from a source code file could be used to predict its defects. We have performed an extensive empirical evaluation and found that: i) such content-based features have higher predictive power than code complexity metrics, and ii) the use of feature selection, reduction, and combination further improves prediction performance.
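To make the idea of content-based features concrete, here is a minimal sketch of how two of the feature types named above (words and package names) might be pulled from a Java-style source file. The abstract does not describe the authors' extraction pipeline, so the function names `extract_word_features` and `extract_package_features`, the identifier-splitting rules, and the sample file are all assumptions made for illustration.

```python
import re
from collections import Counter


# Hypothetical helper: the tokenization rules here are an assumption,
# not the procedure used in the paper.
def extract_word_features(source: str) -> Counter:
    """Count lowercase words found inside the identifiers of a source file."""
    words = []
    for identifier in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source):
        # Split snake_case first, then camelCase, into individual words.
        for part in identifier.split("_"):
            pieces = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", part)
            words.extend(piece.lower() for piece in pieces)
    return Counter(words)


# Hypothetical helper: assumes Java-style "import a.b.ClassName;" statements.
def extract_package_features(source: str) -> set:
    """Collect the package part of each import statement."""
    pattern = r"^\s*import\s+([\w.]+)\.\w+\s*;"
    return set(re.findall(pattern, source, flags=re.MULTILINE))


# Illustrative input only; not taken from the paper's dataset.
sample = """
import java.sql.Connection;
import java.util.List;

class UserDao {
    Connection openConnection() { return null; }
}
"""

word_counts = extract_word_features(sample)
packages = extract_package_features(sample)
```

Under the paper's assumption, a file whose words and imports point to database access (here `connection`, `java.sql`) could carry a different defect risk than one dealing with, say, string formatting. Counts like these would then be passed to a classifier, optionally after feature selection or reduction.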
Sep-26-2024
- Country:
- North America > United States (0.46)
- Genre:
- Research Report
- Experimental Study (0.68)
- New Finding (0.46)