Abstract
Just-in-time software defect prediction (JIT-SDP) is a defect prediction technique that operates on code changes and offers fine granularity, immediacy, and traceability. Effort-aware JIT-SDP additionally accounts for the effort of code inspection and aims to identify more defective changes within a limited inspection effort. Although many effort-aware JIT-SDP methods have been proposed, most of them optimize only the classification model algorithm. To improve the performance and generalizability of effort-aware JIT-SDP, this paper approaches the problem from the feature engineering side for the first time and proposes EEF, an evolutionary feature construction method for effort-aware scenarios. EEF first represents features as genetic programming trees and, considering both classification performance and effort-aware performance, obtains new feature transformations through evolutionary feature construction based on multi-objective optimization. It then uses the obtained transformations to construct a new feature set, on which the classification model is trained and tested. To verify the effectiveness of EEF, experiments are conducted on six open-source projects under three different evaluation schemes. The results show that EEF improves the performance of classification models in effort-aware scenarios and outperforms other feature engineering methods. Moreover, provided that the diversity of the selected features is ensured, EEF based on a single model can also improve the performance of other models.
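The idea summarized in the abstract (features represented as expression trees, scored on a classification objective and an effort-aware objective, with non-dominated candidates kept) can be illustrated with a minimal sketch. This is not the authors' EEF implementation. The toy change data, the operator set, the use of AUC as the classification objective, the use of recall within 20% of total effort as the effort-aware objective, and all function names are assumptions made for illustration; the abstract does not specify any of them. Random sampling stands in for a real evolutionary search with crossover and mutation.

```python
import random

# Toy code changes: ((lines_added, lines_deleted, files_touched), label).
# Label 1 means defective. Values are invented for illustration only.
CHANGES = [
    ((120, 30, 4), 1), ((5, 1, 1), 0), ((60, 10, 2), 1),
    ((300, 200, 9), 0), ((12, 3, 1), 1), ((8, 8, 2), 0),
]

OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    # Protected division: return 1.0 instead of dividing by (near) zero.
    "div": lambda a, b: a / b if abs(b) > 1e-9 else 1.0,
}

def evaluate(tree, x):
    """Evaluate a tree on feature vector x.
    A tree is an int (index into x) or a tuple (op, left, right)."""
    if isinstance(tree, int):
        return float(x[tree])
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def random_tree(rng, n_features, depth=2):
    """Build a random expression tree of at most the given depth."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randrange(n_features)
    op = rng.choice(sorted(OPS))
    return (op, random_tree(rng, n_features, depth - 1),
            random_tree(rng, n_features, depth - 1))

def auc(scores, labels):
    """Assumed classification objective: pairwise AUC of the feature."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def recall_at_effort(scores, labels, efforts, budget=0.2):
    """Assumed effort-aware objective: share of defects found when changes
    are inspected in descending score/effort order within the budget."""
    order = sorted(range(len(scores)),
                   key=lambda i: scores[i] / efforts[i], reverse=True)
    limit = budget * sum(efforts)
    spent, found = 0.0, 0
    for i in order:
        if spent + efforts[i] > limit:
            break
        spent += efforts[i]
        found += labels[i]
    return found / sum(labels)

def dominates(a, b):
    """True if objective vector a is no worse everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(candidates):
    """Keep (tree, objectives) pairs not dominated by any other candidate."""
    return [c for c in candidates
            if not any(dominates(o[1], c[1]) for o in candidates)]

def demo(seed=0, n_candidates=30):
    rng = random.Random(seed)
    xs = [x for x, _ in CHANGES]
    labels = [y for _, y in CHANGES]
    efforts = [x[0] + x[1] for x in xs]  # effort proxy: churned lines
    candidates = []
    for _ in range(n_candidates):
        tree = random_tree(rng, n_features=3)
        scores = [evaluate(tree, x) for x in xs]
        candidates.append((tree, (auc(scores, labels),
                                  recall_at_effort(scores, labels, efforts))))
    return pareto_front(candidates)

if __name__ == "__main__":
    for tree, objs in demo():
        print(tree, objs)
```

Each tree on the resulting front defines one candidate feature transformation; in the workflow the abstract describes, such transformations would be applied to the original features to build the new feature set used for training and testing the classifier.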
Authors
ZHAO Chenyang (赵晨阳)
LIU Lei (刘磊)
JIANG He (江贺)
School of Software, Dalian University of Technology, Dalian, Liaoning 116600, China
Source
Computer Science (《计算机科学》)
Peking University Core Journal (北大核心)
2025, Issue 1, pp. 232-241 (10 pages)
Keywords
Just-in-time defect prediction
Effort-aware
Evolutionary feature construction
Multi-objective optimization
Feature engineering