Abstract
Uncertain data falls into two main types: uncertainty about whether a tuple exists and uncertainty about attribute values. For the latter type, a k-nearest neighbor classification algorithm is proposed. Object attributes in the algorithm are discrete, and the uncertainty of each attribute value is described by a probability distribution vector. Semantic distances between attribute component values are first computed from a concept hierarchy tree, and from these the expected semantic distances between attributes and between objects are computed. The classification accuracy of the algorithm was validated experimentally, and the results indicate that it classifies uncertain data with high accuracy.
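The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the sketch assumes that the semantic distance between two values is the number of edges between them in the concept hierarchy tree, that the expected distance between two uncertain attribute values is the probability-weighted sum over all value pairs, and that the object distance is the sum over attributes. The tree `PARENT`, the labels, and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical concept hierarchy tree: child -> parent (root maps to None).
PARENT = {
    "any": None,
    "warm": "any", "cool": "any",
    "red": "warm", "orange": "warm",
    "blue": "cool", "green": "cool",
}

def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path

def semantic_distance(a, b):
    """Assumed measure: number of tree edges between a and b."""
    depth_from_a = {n: i for i, n in enumerate(path_to_root(a))}
    for j, n in enumerate(path_to_root(b)):
        if n in depth_from_a:  # first common ancestor reached from b
            return depth_from_a[n] + j
    raise ValueError("values are not in the same tree")

def expected_attr_distance(p, q):
    """Expected semantic distance between two uncertain attribute values.

    p and q map each possible value to its probability.
    """
    return sum(pa * qb * semantic_distance(a, b)
               for a, pa in p.items() for b, qb in q.items())

def object_distance(x, y):
    """Assumed aggregation: sum of expected distances over all attributes."""
    return sum(expected_attr_distance(p, q) for p, q in zip(x, y))

def knn_classify(query, training, k=3):
    """Majority vote among the k training objects nearest to `query`."""
    nearest = sorted(training, key=lambda item: object_distance(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data: each object has one uncertain attribute.
training = [
    ([{"red": 0.8, "orange": 0.2}], "warm"),
    ([{"orange": 1.0}], "warm"),
    ([{"blue": 0.7, "green": 0.3}], "cool"),
    ([{"green": 1.0}], "cool"),
]
query = [{"red": 0.6, "blue": 0.4}]
print(knn_classify(query, training, k=3))
```

Under these assumptions, an uncertain value generally has a nonzero expected distance to itself, because its probability mass is spread over several tree nodes. Whether the paper normalizes or otherwise corrects for this cannot be determined from the abstract.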
Source
《大理大学学报》
CAS
2017, No. 12, pp. 16-20 (5 pages)
Journal of Dali University
Keywords
classification
KNN classifier
uncertain data
expected semantic distance