Abstract
To address the "nesting effect" in the traditional sequential forward feature selection algorithm, a new suboptimal feature subset search strategy is proposed. The main improvement is that each forward search step also identifies the features that are highly correlated with the newly selected feature; subsequent searches ignore these features to avoid excessive redundancy in the feature subset. In other words, a better feature subset is obtained by reducing the correlation among the selected features. Experimental results on two different data sets show that the correlation-based sequential forward feature selection algorithm outperforms the traditional feature selection algorithm and yields better feature subsets, especially when a small feature subset is required.
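The search strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the correlation measure, the cut-off, or the selection criterion, so the use of Pearson correlation, the `threshold` value of 0.9, the `relevance` criterion, and all function and variable names below are assumptions made only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def correlation_based_sfs(columns, score, k, threshold=0.9):
    """Forward selection that skips features highly correlated with a chosen one.

    columns   -- list of feature columns (each a list of numbers)
    score     -- callable mapping a list of feature indices to a number (higher is better)
    k         -- maximum number of features to select
    threshold -- assumed cut-off on |Pearson correlation| for treating a feature as redundant
    """
    remaining = list(range(len(columns)))
    selected = []
    while remaining and len(selected) < k:
        # Forward step: add the candidate that maximizes the criterion.
        best = max(remaining, key=lambda j: score(selected + [j]))
        selected.append(best)
        # Drop the chosen feature and every candidate highly correlated with it,
        # so that later steps ignore them.
        remaining = [
            j for j in remaining
            if j != best and abs(pearson(columns[j], columns[best])) < threshold
        ]
    return selected


# Toy data (hypothetical): feature 1 is nearly a rescaled copy of feature 0.
columns = [
    [2, 4, 6, 8, 11],
    [1, 2, 3, 4, 5],
    [1, 0, 1, 0, 1],
]
target = [1, 2, 3, 4, 6]


def relevance(subset):
    # Assumed filter-style criterion: total |correlation| with the target.
    return sum(abs(pearson(columns[j], target)) for j in subset)


with_exclusion = correlation_based_sfs(columns, relevance, k=2)
without_exclusion = correlation_based_sfs(columns, relevance, k=2, threshold=1.1)
```

On this toy data, `with_exclusion` is `[0, 2]`, while `without_exclusion` (a threshold above 1 disables the exclusion step, giving plain forward selection) is `[0, 1]`, which keeps the redundant pair.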
Authors
李三川
吴丽丽
LI San-chuan; WU Li-li (China Mobile Information Technology, Shenzhen, Guangdong 518048, China)
Source
《通信技术》
2018, No. 12, pp. 2920-2924 (5 pages)
Communications Technology
Keywords
feature selection
correlation search
sequence search
dimensionality reduction