How Many Software Metrics Should be Selected for Defect Prediction?
Wang, Huanjing (Western Kentucky University) | Khoshgoftaar, Taghi M. (Florida Atlantic University) | Seliya, Naeem (University of Michigan, Dearborn)
A question of practical interest to software practitioners is: “For a given project, what is the minimum number of software metrics that should be considered for building an effective defect prediction model?” During the development life cycle, various software metrics are collected for different reasons. In the case of a metrics-based defect prediction model, an intelligent selection of software metrics prior to building defect predictors is likely to improve model performance. This study uses the proposed threshold-based feature selection technique to remove irrelevant and redundant software metrics (a.k.a. features or attributes). A comparative investigation is presented for evaluating the size of the selected feature subsets. The case study is based on software measurement data obtained from a real-world project, and the defect predictors are trained using three commonly used classifiers. The empirical results demonstrate that an effective defect predictor can be built with only three metrics; moreover, model performance improved when over 98.5% of the software metrics were eliminated.
May-18-2011
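The abstract names a threshold-based feature selection technique but does not spell out its details. The following is only a minimal sketch of one way such a ranker could work, under assumptions that are not taken from the paper: each metric is min-max normalized to [0, 1], a set of evenly spaced thresholds is swept in both directions, and the metric is scored by the best geometric mean of true positive rate and true negative rate. The function names `threshold_score` and `select_top_k`, the choice of scoring measure, and the default of 21 thresholds are all hypothetical.

```python
import numpy as np


def threshold_score(x, y, n_thresholds=21):
    """Score one metric by its best sqrt(TPR * TNR) over a sweep of thresholds.

    Assumes y is a 0/1 array (1 = defective) containing both classes.
    The scoring measure is an illustrative choice, not the paper's.
    """
    span = x.max() - x.min()
    if span == 0:
        return 0.0  # a constant metric carries no information
    x_norm = (x - x.min()) / span  # min-max normalize to [0, 1]
    pos, neg = (y == 1), (y == 0)
    best = 0.0
    # Try both directions: high values indicating defects, and low values.
    for values in (x_norm, 1.0 - x_norm):
        for t in np.linspace(0.0, 1.0, n_thresholds):
            pred = values > t
            tpr = np.mean(pred[pos])    # defective modules flagged
            tnr = np.mean(~pred[neg])   # clean modules left unflagged
            best = max(best, float(np.sqrt(tpr * tnr)))
    return best


def select_top_k(X, y, k=3):
    """Rank the columns of X by threshold_score and return the top-k indices."""
    scores = np.array([threshold_score(X[:, j], y) for j in range(X.shape[1])])
    order = np.argsort(-scores, kind="stable")  # highest score first
    return order[:k].tolist(), scores
```

With `k=3`, the call `select_top_k(X, y, k=3)` would return the indices of three metrics, mirroring the subset size reported in the abstract; the classifier trained on those columns is a separate step not shown here.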