KNN Missing Data Imputation Algorithm Based on Neighbor Noise Handling (cited by: 29)

Predicting Missing Values with KNN Based on the Elimination of Neighbor Noise
Abstract: In research on optimization algorithms, the quality of KNN imputation of missing data is severely degraded by noise in the original data. To address this problem, this paper proposes a new missing data imputation algorithm, ENN-KNN (Eliminate Neighbor Noise k-Nearest Neighbor), based on the neighbor relationships of the nearest neighbors of the record to be imputed. By comparing the true degree of nearness of each nearest neighbor of that record, the algorithm can effectively identify potential noise nearest neighbors. The record is then imputed using all of its non-noise nearest neighbors, which eliminates the influence of noise nearest neighbors on the imputation result. Simulation results on four UCI data sets show that the imputation accuracy of ENN-KNN is, overall, better than that of KNN.
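The idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not define how the "true degree of nearness" is measured, so the criterion below is an assumption made purely for illustration: a neighbor is kept only if at least `min_shared` of its own k nearest neighbors (measured over all attributes) also belong to the incomplete record's neighbor set. The function names, the `min_shared` parameter, the use of Euclidean distance, and mean imputation are all hypothetical choices, not the paper's method.

```python
import math


def _dist(a, b, cols):
    """Euclidean distance between rows a and b restricted to columns cols."""
    return math.sqrt(sum((a[c] - b[c]) ** 2 for c in cols))


def knn_indices(data, query, cols, k, exclude=()):
    """Indices of the k rows of data closest to query on columns cols."""
    candidates = [i for i in range(len(data)) if i not in exclude]
    candidates.sort(key=lambda i: _dist(data[i], query, cols))
    return candidates[:k]


def knn_impute(complete_rows, record, missing_col, k=4):
    """Plain KNN baseline: mean of the target attribute over all k neighbors."""
    cols = [c for c in range(len(record)) if c != missing_col]
    neighbors = knn_indices(complete_rows, record, cols, k)
    return sum(complete_rows[i][missing_col] for i in neighbors) / len(neighbors)


def enn_knn_impute(complete_rows, record, missing_col, k=4, min_shared=2):
    """Mean of the target attribute over neighbors not flagged as noise.

    ASSUMPTION (not from the paper): a neighbor is noise when fewer than
    min_shared of its own k nearest neighbors, computed on all attributes,
    are also neighbors of the incomplete record.
    """
    observed = [c for c in range(len(record)) if c != missing_col]
    all_cols = list(range(len(record)))
    neighbors = knn_indices(complete_rows, record, observed, k)
    kept = []
    for i in neighbors:
        own = knn_indices(complete_rows, complete_rows[i], all_cols, k, exclude={i})
        shared = len(set(own) & (set(neighbors) - {i}))
        if shared >= min_shared:
            kept.append(i)
    if not kept:
        # Fall back to plain KNN if every neighbor was flagged.
        kept = neighbors
    return sum(complete_rows[i][missing_col] for i in kept) / len(kept)


# Toy data (invented for this sketch): row 3 sits in the first cluster on the
# observed attributes but carries a target value from the second cluster.
rows = [
    [1.0, 1.0, 5.0],
    [1.2, 0.9, 5.2],
    [0.9, 1.1, 4.8],
    [1.1, 1.0, 50.0],
    [10.0, 10.0, 50.0],
    [10.2, 9.9, 51.0],
    [9.8, 10.1, 49.0],
]
incomplete = [1.0, 1.05, None]  # third attribute is missing

print("KNN    :", knn_impute(rows, incomplete, missing_col=2))
print("ENN-KNN:", enn_knn_impute(rows, incomplete, missing_col=2))
```

On this toy data the plain KNN estimate is pulled toward the corrupted row, while the filtered estimate stays close to the values of the clean neighbors. This only demonstrates the general mechanism; it says nothing about the accuracy figures reported in the paper.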
Source: Computer Simulation (《计算机仿真》), CSCD, Peking University Core Journal, 2014, Issue 7, pp. 264-268 (5 pages)
Funding: Beijing Natural Science Foundation (7110001)
Keywords: missing data imputation; nearest neighbors; noise nearest neighbor