A Feature Selection Method Based on F-Score (cited by: 3)

The Research and Application of Feature Selection Method Based on F-Score
Abstract: Redundant and irrelevant features in raw data increase the complexity of the learned model and degrade its performance. To address this, a two-stage feature selection method that combines the Filter and Wrapper approaches is proposed. First, the F-Score statistic of each feature in the raw data is used as prior knowledge. Then a sequential forward search strategy searches for an optimized feature subset, and during the search each candidate feature combination is evaluated by the performance of the classification algorithm. The method was tested with ten-fold cross-validation, and comparative experiments were run with SVM, Logistic Regression, and Adaboost classification models. The results show that the algorithm effectively reduces the feature dimensionality and further improves performance.
Authors: QIN Caijie; GUAN Qiang (College of Information Engineering, Sanming University, Sanming, Fujian 365004, China)
Source: Journal of Yibin University, 2018, No. 6, pp. 4-8 (5 pages)
Funding: National Natural Science Foundation of China (11401341); Natural Science Foundation of Fujian Province (2017J01779)
Keywords: feature selection; F-Score; ten-fold cross-validation
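
The abstract gives no formulas, so the following Python sketch is only one plausible reading of the two-stage procedure, not the authors' implementation. It assumes the widely used two-class F-Score definition (between-class spread of the feature means divided by the sum of the within-class sample variances), a greedy forward search that visits features in descending F-Score order and keeps a feature only when ten-fold cross-validation accuracy improves, and a nearest-centroid classifier as a dependency-free stand-in for the SVM, Logistic Regression, and Adaboost models named in the abstract. All function names and the toy data are hypothetical.

```python
import random
from statistics import mean, variance


def f_score(column, labels):
    """Two-class F-Score of one feature (assumed definition, not from the abstract)."""
    pos = [x for x, y in zip(column, labels) if y == 1]
    neg = [x for x, y in zip(column, labels) if y == 0]
    overall = mean(column)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)  # sample variances (n - 1 denominator)
    return between / within if within > 0 else 0.0


def cv_accuracy(X, y, feats, k=10, seed=0):
    """Mean k-fold accuracy of a nearest-centroid classifier restricted to feats."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    fold_scores = []
    for fold in (idx[i::k] for i in range(k)):
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        # Class centroids are computed from the training folds only.
        centroids = {
            label: [mean(X[i][f] for i in train if y[i] == label) for f in feats]
            for label in {y[i] for i in train}
        }
        hits = 0
        for i in fold:
            pred = min(
                centroids,
                key=lambda c: sum((X[i][f] - m) ** 2 for f, m in zip(feats, centroids[c])),
            )
            hits += pred == y[i]
        fold_scores.append(hits / len(fold))
    return mean(fold_scores)


def select_features(X, y, k=10):
    """Stage 1: rank features by F-Score. Stage 2: greedy forward search."""
    n_features = len(X[0])
    scores = [f_score([row[j] for row in X], y) for j in range(n_features)]
    ranking = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    selected, best_acc = [], 0.0
    for j in ranking:
        acc = cv_accuracy(X, y, selected + [j], k)
        if acc > best_acc:  # keep the feature only if CV accuracy improves
            selected, best_acc = selected + [j], acc
    return selected, best_acc, scores


def make_toy_data(n_per_class=20, seed=42):
    """Hypothetical data: feature 0 separates the classes, features 1 and 2 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(6.0 * label, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])
            y.append(label)
    return X, y


X, y = make_toy_data()
selected, best_acc, scores = select_features(X, y)
print("F-Scores:", [round(s, 3) for s in scores])
print("selected features:", selected, "CV accuracy:", round(best_acc, 3))
```

In this sketch the Filter stage only fixes the order in which the Wrapper stage tries features, so the number of cross-validation runs grows linearly with the number of features. Swapping the stand-in classifier for one of the models named in the abstract would only change `cv_accuracy`.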