Feature selection method for software defect number prediction based on maximum information coefficient

Cited by: 2
Abstract: Traditional feature selection methods consider only linear relationships between variables and ignore nonlinear ones, which makes it difficult to select effective feature subsets and lowers the performance of models that predict the number of defects in software modules. To address this problem, a feature selection method based on the maximum information coefficient (MIC) is proposed. The method accounts for both linear and nonlinear relationships among features and between features and the number of defects, and it separates redundancy analysis and relevance analysis into two stages. In the redundancy analysis stage, an agglomerative hierarchical clustering algorithm, driven by the correlation between features, groups redundant features into the same cluster. In the relevance analysis stage, the features in each cluster are sorted in descending order of their correlation with the number of software defects, and the top-ranked features of each cluster are selected to form the feature subset. Experimental results show that the proposed method selects effective feature subsets and improves the predictive performance of software defect number prediction models by removing redundant and irrelevant features.
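The two stages described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `mic_like` is a crude stand-in for MIC that only tries equal-frequency grids (true MIC also optimizes where the grid lines fall), and the average-linkage rule, the `n_clusters` and `top_per_cluster` parameters, and the toy data are all assumptions introduced here.

```python
import math
from bisect import bisect_left
from collections import Counter


def _equal_freq_bins(values, k):
    """Map each value to one of k rank-based bins; equal values share a bin."""
    ordered = sorted(values)
    n = len(values)
    return [bisect_left(ordered, v) * k // n for v in values]


def _mutual_info(a, b):
    """Mutual information (in nats) between two discrete label sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(c / n * math.log(c * n / (pa[i] * pb[j]))
               for (i, j), c in pab.items())


def mic_like(x, y, alpha=0.6):
    """Crude MIC approximation (assumption): equal-frequency grids only."""
    budget = max(4, int(len(x) ** alpha))  # cap on grid size kx * ky
    best = 0.0
    for kx in range(2, budget // 2 + 1):
        bx = _equal_freq_bins(x, kx)
        for ky in range(2, budget // kx + 1):
            by = _equal_freq_bins(y, ky)
            best = max(best, _mutual_info(bx, by) / math.log(min(kx, ky)))
    return min(best, 1.0)


def cluster_features(features, n_clusters):
    """Stage 1: average-linkage agglomerative clustering on 1 - mic_like."""
    names = list(features)
    sim = {(a, b): mic_like(features[a], features[b])
           for a in names for b in names if a < b}

    def dist(a, b):
        return 1.0 - sim[(a, b) if a < b else (b, a)]

    clusters = [[name] for name in names]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                d = sum(dist(a, b) for a, b in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # merge the two closest clusters
    return clusters


def select_features(features, defects, n_clusters, top_per_cluster=1):
    """Stage 2: rank each cluster by relevance to the defect count."""
    relevance = {name: mic_like(col, defects) for name, col in features.items()}
    selected = []
    for cluster in cluster_features(features, n_clusters):
        ranked = sorted(cluster, key=relevance.get, reverse=True)
        selected.extend(ranked[:top_per_cluster])
    return selected


# Toy data (hypothetical): loc_dup is a monotone copy of loc, hence redundant.
loc = list(range(1, 21))
features = {
    "loc": loc,
    "loc_dup": [2 * v + 1 for v in loc],
    "complexity": [(7 * v) % 20 for v in loc],
}
defects = [(c - 10) ** 2 // 20 for c in features["complexity"]]
selected = select_features(features, defects, n_clusters=2)
print(selected)
```

On this toy data the two redundant size metrics fall into one cluster, so only one of them survives alongside `complexity`. In practice, a dedicated MIC implementation would replace `mic_like`, and the number of clusters and features kept per cluster would need tuning.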
Authors: LIU Guoqing; WANG Xingqi; WEI Dan; FANG Jinglong; SHAO Yanli (School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China)
Source: Telecommunications Science (电信科学), 2021, No. 5, pp. 133-147 (15 pages)
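Funding: Zhejiang Provincial Natural Science Foundation of China (No. LY20F020015, No. LY21F020015); National Natural Science Foundation of China (No. 61702517, No. 61972121, No. 61702146); National Defense Basic Scientific Research Program of China (No. JCKY2019415C001).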
Keywords: software defect number prediction; feature selection; maximum information coefficient
