Abstract
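Based on the idea of granular computing, and set against the approximation space of rough sets, this paper studies a new feature selection algorithm. The algorithm uses a tolerance function to generate the granules used during feature selection, can distinguish noisy data from inconsistent data, performs feature selection on decision information tables, and can effectively handle feature selection on large-scale data sets. This opens new room for the further development of feature selection in pattern recognition and for its application in knowledge discovery and data mining.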
Feature selection has become a bottleneck in high-dimensional data processing. Over the past decade, research on feature selection has therefore moved beyond traditional algorithms and ideas and has increasingly been combined with new mathematical tools. This trend opens new room both for applications in data mining and knowledge discovery and for further development in pattern recognition. Granular computing, a new approach to intelligent information processing, has begun to take shape and creates the conditions for applying feature selection in data mining. Based on granular computing, with the rough set approximation space as the background, this paper proposes a new feature selection algorithm. The algorithm generates tolerance granules by means of a tolerance function, distinguishes noisy data from inconsistent data, performs feature selection on decision information tables, and is suited to large-scale data sets.
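The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea it names: granules generated by a tolerance function, and an entropy measure (per the keywords) used to pick features. Everything specific here is an assumption for illustration and not the paper's method: the threshold-based `tolerant` function, the particular conditional-entropy formula, the greedy forward search, and all function names. The step that separates noisy data from inconsistent data is not modelled.

```python
import math

def tolerant(x, y, features, eps):
    """Assumed tolerance function: x and y differ by at most eps on every chosen feature."""
    return all(abs(x[f] - y[f]) <= eps for f in features)

def granules(rows, features, eps):
    """Tolerance granule of each object: indices of all objects tolerant with it."""
    return [
        {j for j, y in enumerate(rows) if tolerant(x, y, features, eps)}
        for x in rows
    ]

def conditional_entropy(rows, labels, features, eps):
    """Assumed measure: average of -log2(share of each granule that has the object's own label)."""
    total = 0.0
    for i, g in enumerate(granules(rows, features, eps)):
        same = sum(1 for j in g if labels[j] == labels[i])
        total -= math.log2(same / len(g))  # g always contains i, so same >= 1
    return total / len(rows)

def select_features(rows, labels, eps=0.0):
    """Greedy forward selection until the entropy matches that of the full feature set."""
    all_features = list(range(len(rows[0])))
    target = conditional_entropy(rows, labels, all_features, eps)
    selected, remaining = [], set(all_features)
    while remaining:
        if conditional_entropy(rows, labels, selected, eps) <= target + 1e-12:
            break
        # Pick the feature whose addition gives the lowest entropy (ties: lowest index).
        best = min(
            sorted(remaining),
            key=lambda f: conditional_entropy(rows, labels, selected + [f], eps),
        )
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy decision table (hypothetical data): only feature 0 determines the decision.
rows = [(0, 1, 0), (0, 0, 1), (1, 1, 0), (1, 0, 1)]
labels = [0, 0, 1, 1]
print(select_features(rows, labels))
```

With `eps=0.0` the tolerance relation reduces to exact equality on the chosen features; a larger `eps` merges nearby objects into the same granule, which is one simple way a tolerance relation can absorb small perturbations in numeric data.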
Source
《宿州学院学报》
2011, No. 2, pp. 18-20 (3 pages in total)
Journal of Suzhou University
Funding
Anhui Provincial Fund for Outstanding Young Talents in Universities (2009SQRZ171)
Keywords
tolerance granule (相容粒)
feature selection (特征选择)
information entropy (信息熵)