Abstract
Clustering, a form of unsupervised learning, automatically groups data objects according to their similarity. This paper proposes a new intersection-based clustering combination method that borrows the idea of election voting. Given several different clustering results for the same data set, the algorithm first determines the correspondence between the clusters of the different results and then computes the intersection of corresponding clusters. The remaining disputed objects are put to a vote, and any object whose membership is still undecided after voting is assigned to the cluster of its nearest object. Alternatively, the disputed objects can be assigned directly to the cluster of their nearest object without voting. Experiments show that both variants clearly improve clustering quality, and that the results with voting are slightly better than those without.
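The steps named in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not say how cluster correspondences are found, how a vote is decided, or which distance is used, so the brute-force label alignment against the first clustering, the strict-majority rule, and the Euclidean distance are all assumptions, and the names `align` and `combine` are hypothetical.

```python
from collections import Counter
from itertools import permutations
from math import dist


def align(reference, labels, k):
    """Relabel `labels` so its clusters overlap those of `reference` as much
    as possible. Assumption: exhaustive search over the k! label
    permutations, which is practical only for small k."""
    best_perm, best_score = None, -1
    for perm in permutations(range(k)):
        score = sum(1 for r, l in zip(reference, labels) if perm[l] == r)
        if score > best_score:
            best_perm, best_score = perm, score
    return [best_perm[l] for l in labels]


def combine(points, labelings, k, use_voting=True):
    """Combine several clusterings of the same points into one labeling."""
    # Step 1: bring every clustering into the label space of the first one.
    reference = labelings[0]
    aligned = [list(reference)] + [align(reference, lab, k) for lab in labelings[1:]]

    n = len(points)
    final = [None] * n
    for i in range(n):
        votes = [lab[i] for lab in aligned]
        if len(set(votes)) == 1:
            # Step 2: the object lies in the intersection of corresponding clusters.
            final[i] = votes[0]
        elif use_voting:
            # Step 3: disputed object; accept a label only on a strict majority
            # (assumed rule), otherwise leave it undecided.
            label, count = Counter(votes).most_common(1)[0]
            if count * 2 > len(votes):
                final[i] = label

    # Step 4: undecided objects take the label of the nearest decided object
    # (Euclidean distance is an assumption).
    settled = [i for i in range(n) if final[i] is not None]
    if not settled:
        raise ValueError("no object was assigned by intersection or voting")
    for i in range(n):
        if final[i] is None:
            nearest = min(settled, key=lambda j: dist(points[i], points[j]))
            final[i] = final[nearest]
    return final
```

Calling `combine(points, labelings, k)` runs the variant with voting, while `use_voting=False` sends every disputed object straight to the nearest-object step, which corresponds to the second variant described in the abstract. Undecided objects are compared only against objects settled by intersection or voting, which is one possible reading of "nearest object".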
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journals (北大核心)
2007, No. 2, pp. 177-179, 243 (4 pages)
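Funding
Sub-project of the Sichuan Province Major Basic Research Project (04JY029-001-4)
Southwest Jiaotong University Science and Technology Development Fund (A2004015)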
Keywords
clustering
clustering combination
intersection
vote