Hellinger Net: A Hybrid Imbalance Learning Model to Improve Software Defect Prediction
IEEE Transactions on Reliability
Software defect prediction (SDP) is a convenient way to identify defects in the early phases of the software development life cycle. This early warning system can help remove software defects and yield cost-effective, good-quality software products. A wide range of statistical and machine learning models have been employed to predict defects in software modules. However, handling the imbalanced nature of SDP datasets is pivotal for the successful development of a defect prediction model. Imbalanced software datasets have nonuniform class distributions, with far fewer instances belonging to one class than to the other. This article proposes a novel hybrid methodology, namely the Hellinger net model, for imbalanced learning to improve defect prediction for software modules. Hellinger net, a tree-to-network mapped model, is a deep feedforward neural network with a built-in hierarchy, just like decision trees. Hellinger net also utilizes the strength of a skew-insensitive distance measure, namely Hellinger distance, in handling class imbalance problems. On the theoretical side, this article proves the consistency of the proposed model. A thorough experiment was conducted over ten NASA SDP datasets to show the superiority of the proposed method.
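To illustrate why Hellinger distance is described as skew insensitive, the following minimal sketch computes it for a single binary split in the two-class case. It is not taken from the article: the function name `hellinger_split_value` and its parameters are hypothetical, and the formula is the standard two-branch Hellinger distance between class-conditional distributions, which may differ in detail from the criterion used inside Hellinger net.

```python
import math


def hellinger_split_value(pos_left, pos_total, neg_left, neg_total):
    """Hellinger distance between the two class-conditional distributions
    over the branches (left, right) of a binary split.

    Hypothetical helper for illustration only.
    pos_left / neg_left: defective / non-defective modules sent to the left branch.
    pos_total / neg_total: total defective / non-defective modules at the node.
    """
    if pos_total <= 0 or neg_total <= 0:
        raise ValueError("both classes must be present at the node")

    # Fraction of each class that falls in the left branch.
    # Only within-class fractions are used, never the class priors.
    pos_frac_left = pos_left / pos_total
    neg_frac_left = neg_left / neg_total

    left_term = (math.sqrt(pos_frac_left) - math.sqrt(neg_frac_left)) ** 2
    right_term = (math.sqrt(1 - pos_frac_left) - math.sqrt(1 - neg_frac_left)) ** 2
    return math.sqrt(left_term + right_term)


# Same within-class fractions (80% vs 20% on the left) at two imbalance ratios.
mild = hellinger_split_value(8, 10, 20, 100)
severe = hellinger_split_value(8, 10, 200, 1000)
```

Because the value depends only on the fraction of each class sent to each branch, `mild` and `severe` agree even though the majority class is ten times larger in the second case. The value ranges from 0 for a split that separates nothing to the square root of 2 for a split that separates the classes perfectly.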
Chakraborty, Tanujit and Chakraborty, Ashis Kumar, "Hellinger Net: A Hybrid Imbalance Learning Model to Improve Software Defect Prediction" (2021). Journal Articles. 1919.