Abstract
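Although the K-prototypes algorithm, which combines K-means and K-modes, can handle categorical data effectively, representing the mean of a cluster's data by the cluster's categorical modes causes substantial information loss. To address this, this paper proposes a structural fuzzy K-prototypes algorithm (SFKP) suited to mixed-type data, which improves clustering ability without increasing time or space overhead. Experimental results on real data sets show that SFKP clusters more effectively.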
Although the K-prototypes algorithm, which integrates the K-means and K-modes algorithms, removes the numeric-only limitation of K-means and can efficiently cluster large categorical data sets, replacing the means of clusters with frequency-based modes causes a loss of information within clusters. This paper presents a structural fuzzy K-prototypes algorithm for clustering mixed-type databases that enhances clustering ability without increasing computational cost or memory usage. Experiments on several real databases show that the structural K-prototypes algorithm yields better clustering results than the corresponding non-structural algorithm.
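The information loss mentioned in the abstract can be illustrated with a small sketch. It is not the SFKP algorithm, whose details the abstract does not give. It only contrasts a mode-based summary of one categorical attribute with a frequency-based summary. The function names, the toy clusters, and the "one minus relative frequency" distance are assumptions chosen for illustration.

```python
from collections import Counter


def mode_prototype(values):
    """Summarise a categorical attribute by its single most frequent category."""
    return Counter(values).most_common(1)[0][0]


def frequency_prototype(values):
    """Summarise a categorical attribute by the relative frequency of each category.

    Hypothetical alternative used only for contrast; not taken from the paper.
    """
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def mismatch_to_mode(value, mode):
    """Simple matching dissimilarity: 0 if the value equals the mode, else 1."""
    return 0.0 if value == mode else 1.0


def distance_to_frequencies(value, freqs):
    """Illustrative dissimilarity: 1 minus the value's relative frequency in the cluster."""
    return 1.0 - freqs.get(value, 0.0)


# Two toy clusters with the same mode but very different category distributions.
cluster_a = ["red"] * 5 + ["blue"] * 4 + ["green"]
cluster_b = ["red"] * 9 + ["blue"]

mode_a, mode_b = mode_prototype(cluster_a), mode_prototype(cluster_b)
freq_a, freq_b = frequency_prototype(cluster_a), frequency_prototype(cluster_b)

# The mode cannot tell the clusters apart for a "blue" object ...
same_by_mode = mismatch_to_mode("blue", mode_a) == mismatch_to_mode("blue", mode_b)
# ... whereas the frequency summary still separates them.
blue_to_a = distance_to_frequencies("blue", freq_a)
blue_to_b = distance_to_frequencies("blue", freq_b)
print(mode_a, mode_b, same_by_mode, blue_to_a < blue_to_b)
```

Both toy clusters share the same mode, so a "blue" object looks equally far from each under simple matching, even though "blue" is far more common in the first cluster. The frequency summary keeps that distinction.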
Source
《计算机科学》
CSCD
PKU Core Journals (北大核心)
2005, No. 5, pp. 155-158 (4 pages)
Computer Science
Funding
Natural Science Research Program of Jiangsu Higher Education Institutions (Grant No. 03KJB520054)