A hybrid approach based on k-nearest neighbors and decision tree for software fault prediction
Software testing is a very important part of the software development life cycle to develop reliable and bug-free software but it consumes a lot of resources like development time, cost, and effort. Researchers have developed many techniques to get prior knowledge of fault-prone modules so that testing time and cost can be reduced. In this research article, a hybrid approach of distance-based pruned classification and regression tree (CART) and k- nearest neighbors is proposed to improve the performance of software fault prediction. The proposed technique is tested on eleven medium to large scale software fault prediction datasets and performance is compared with decision tree classifier, SVM and its three variations, random forest, KNN, and classification and regression tree. Four performance metrics are used for comparison purposes that are accuracy, precision, recall, and f1-score. Results show that our proposed approach gives better performance for accuracy, precision, and f1-score performance metrics. The second experiment shows a significant amount of running time improvement over the standard k-nearest neighbor algorithm.