Improving Defect Localization by Classifying the Affected Asset using Machine Learning
A vital part of defect resolution is defect localization, the task of finding the exact location of a defect in the system. The defect report, and in particular its asset attribute, helps the person assigned to the problem limit the search space when investigating the exact location of the defect. However, research has shown that reporters often initially assign incorrect values to these attributes. In this paper, we propose and evaluate a way of automatically identifying the location of a defect by using machine learning to classify the source asset. By training a Support Vector Machine (SVM) classifier with features constructed from both the categorical and the textual attributes of defect reports, we achieved an accuracy of 58.52% in predicting the source asset. However, when we trained an SVM to provide a list of recommendations rather than a single prediction, recall increased to as much as 92.34%. Given these results, we conclude that software development teams can use these algorithms to predict up to ten potential locations, and that even three predicted locations already give useful results, with an accuracy of over 70%.
Authors: Sam Halali, Miroslaw Staron, Miroslaw Ochodek, Wilhelm Meding
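
To make the difference between a single prediction and a list of recommendations concrete, the following minimal sketch ranks candidate assets by classifier score and measures how often the true asset appears among the top k. It is an illustration under stated assumptions, not the authors' implementation: the function names, the asset labels, and the scores are hypothetical, and the per-class scores are assumed to come from any classifier that outputs one score per asset (for example, the decision values of an SVM). The abstract does not say exactly how its recall figure is computed, so the hit rate at k used here is only one plausible reading.

```python
from typing import Dict, List, Sequence


def top_k_assets(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k asset labels with the highest classifier scores."""
    # Sort by descending score; break ties alphabetically so results are deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [label for label, _ in ranked[:k]]


def hit_rate_at_k(
    all_scores: Sequence[Dict[str, float]],
    true_assets: Sequence[str],
    k: int,
) -> float:
    """Fraction of reports whose true asset is among the top-k recommendations."""
    if len(all_scores) != len(true_assets):
        raise ValueError("all_scores and true_assets must have the same length")
    if not true_assets:
        return 0.0
    hits = sum(
        1
        for scores, truth in zip(all_scores, true_assets)
        if truth in top_k_assets(scores, k)
    )
    return hits / len(true_assets)


# Hypothetical per-asset scores for four defect reports (invented for illustration).
reports = [
    {"ui": 0.9, "db": 0.3, "network": -0.2},
    {"ui": 0.4, "db": 0.5, "network": 0.1},
    {"ui": -0.1, "db": 0.2, "network": 0.0},
    {"ui": 0.1, "db": 0.8, "network": 0.3},
]
truth = ["ui", "ui", "ui", "db"]

for k in (1, 2, 3):
    print(k, hit_rate_at_k(reports, truth, k))
```

With k = 1 this reduces to ordinary single-prediction accuracy. Increasing k can only keep or raise the hit rate, which is consistent with the abstract's observation that a recommendation list scores higher than a single prediction.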