Abstract
During software development, predicting the number of faults in each software module can be more helpful than predicting only whether a module is faulty or non-faulty. Such an approach may enable a more focused software testing process and may enhance the reliability of the software system. Most earlier work on software fault prediction has used classification techniques to label software modules as faulty or non-faulty. Techniques such as Poisson regression, negative binomial regression, genetic programming, decision tree regression, and multilayer perceptron can instead be used to predict the number of faults. In this paper, we present an experimental study that evaluates and compares the capability of six fault prediction techniques, namely genetic programming, multilayer perceptron, linear regression, decision tree regression, zero-inflated Poisson regression, and negative binomial regression, for predicting the number of faults. The experimental investigation is carried out on eighteen software project datasets collected from the PROMISE data repository. The results are evaluated using average absolute error, average relative error, measure of completeness, and prediction at level l. We also perform the Kruskal–Wallis test and Dunn’s multiple comparison test to compare the relative performance of the considered fault prediction techniques.
1 Introduction
From a software development perspective, dealing with software faults is a vital task. The presence of faults not only degrades the quality of the software but also increases its development and maintenance cost (Menzies et al. 2010). Therefore, identifying which software modules are likely to be fault-prone during the early phases of software development may help improve the quality of the software system. By predicting the number of faults in software modules, we can guide software testers to focus on faulty modules first.
7 Conclusions and future work
This paper evaluated and compared the performance of six fault prediction techniques for predicting the number of faults in given software modules. We used four known fault prediction techniques, i.e., linear regression (LR), multilayer perceptron (MLP), negative binomial regression (NBR), and zero-inflated Poisson regression (ZIP). In addition, we investigated two techniques, i.e., genetic programming (GP) and decision tree regression (DTR), which until now have not been fully explored for predicting the number of faults. The experiments were performed on eighteen publicly available software project datasets. Average absolute error (AAE), average relative error (ARE), measure of completeness, and prediction at level l were used to evaluate the fault prediction models. In addition, the Kruskal–Wallis test and Dunn’s multiple comparison test were performed to assess the relative performance of the techniques. The results showed that DTR, GP, MLP, and LR demonstrated better prediction performance for all the datasets under consideration. The analysis of the Kruskal–Wallis test and Dunn’s multiple comparison test suggested that all techniques except NBR and ZIP achieved significantly better accuracy in predicting the number of faults. NBR and ZIP generally produced the worst prediction accuracy. Thus, the count models (NBR and ZIP) generally underperformed compared with the other fault prediction techniques considered. In future work, ensembles of these methods could be investigated and evaluated for predicting the number of faults, to overcome the limitations of individual techniques.
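To make the four evaluation measures concrete, the following is a minimal sketch of how they might be computed for one dataset. The exact formulas are not given in this text, so the definitions below are assumptions based on common usage: in particular, the `+ 1` in the relative-error denominator (to avoid division by zero for fault-free modules), the default level `l = 0.3`, and all function names are illustrative.

```python
from typing import Sequence


def average_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """AAE: mean of |predicted - actual| over all modules."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def relative_errors(actual: Sequence[float], predicted: Sequence[float]) -> list:
    """Per-module relative error; the +1 in the denominator is an assumed
    convention so that modules with zero actual faults do not divide by zero."""
    return [abs(p - a) / (a + 1) for a, p in zip(actual, predicted)]


def average_relative_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """ARE: mean of the per-module relative errors."""
    errors = relative_errors(actual, predicted)
    return sum(errors) / len(errors)


def measure_of_completeness(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Assumed definition: total predicted faults divided by total actual faults."""
    return sum(predicted) / sum(actual)


def prediction_at_level(actual: Sequence[float], predicted: Sequence[float],
                        level: float = 0.3) -> float:
    """pred(l): fraction of modules whose relative error is at most `level`."""
    errors = relative_errors(actual, predicted)
    return sum(1 for e in errors if e <= level) / len(errors)


# Hypothetical fault counts for four modules (illustrative data only).
actual_faults = [0, 2, 1, 4]
predicted_faults = [0, 1, 1, 6]

aae = average_absolute_error(actual_faults, predicted_faults)
are = average_relative_error(actual_faults, predicted_faults)
moc = measure_of_completeness(actual_faults, predicted_faults)
pred_l = prediction_at_level(actual_faults, predicted_faults, level=0.3)
```

Under these assumed definitions, lower AAE and ARE indicate more accurate predictions, a measure of completeness close to 1 indicates that the predicted fault total matches the actual total, and a higher pred(l) indicates that more modules were predicted within the tolerated relative error.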