Abstract
This paper examines two families of feature selection methods: large-margin methods, which select features based on the similarity between samples, and information-entropy methods, which select features based on the correlation between features. It then proposes a feature selection algorithm that fuses the two. The Relief algorithm is first used to produce a feature ranking, and the ranked list is partitioned into several sections. A sampling rate is set for each section, and the feature subset in each local random subspace is obtained using symmetric uncertainty as the heuristic factor. The feature subsets obtained from all sections are merged to form the final selection result. Experimental results show that the proposed method outperforms several commonly used feature selection algorithms.
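The abstract does not give the algorithm's details, so the following is only a rough sketch of the pipeline it describes, under stated assumptions. Symmetric uncertainty is computed with its standard definition, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), for discrete variables. The function `select_from_sections`, the threshold `su_min`, and the rule "keep sampled features whose SU with the class label reaches `su_min`" are hypothetical choices made for illustration, not the authors' method. The Relief ranking is assumed to be computed elsewhere and passed in as `ranked`.

```python
import math
import random
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def select_from_sections(ranked, su_scores, n_sections, rates, su_min=0.0, seed=0):
    """Hypothetical sketch: split a ranked feature list into sections,
    draw a random subspace from each section at that section's sampling
    rate, and keep sampled features whose SU score is at least su_min.

    ranked    -- feature ids, best first (e.g. from a Relief ranking)
    su_scores -- dict: feature id -> SU between that feature and the class
    rates     -- one sampling rate in (0, 1] per section
    """
    rng = random.Random(seed)
    size = math.ceil(len(ranked) / n_sections)
    selected = []
    for s in range(n_sections):
        section = ranked[s * size:(s + 1) * size]
        if not section:
            continue
        k = min(len(section), max(1, round(rates[s] * len(section))))
        subspace = set(rng.sample(section, k))  # local random subspace
        # Keep ranking order; the SU threshold rule is an assumption.
        selected.extend(f for f in section
                        if f in subspace and su_scores[f] >= su_min)
    return selected
```

In this sketch, giving higher sampling rates to the earlier sections would favour features that Relief ranks highly, while the SU filter removes sampled features that carry little information about the class label.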
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journal list (北大核心)
2016, No. 2, pp. 170-174, 185 (6 pages)
Funding
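National Natural Science Foundation of China (No. 61303131, No. 61379021)
Natural Science Foundation of Fujian Province (No. 2013J01028)
Zhangzhou Science and Technology Project (No. ZZ2013J04)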
Keywords
feature selection
large margin
symmetric uncertainty
local random subspace