Research on Gauss Weighed Reorganization K-NN (高斯加权的重构性K-NN算法研究)

Cited by: 1
Abstract: This paper proposes a K-NN text clustering algorithm based on Gaussian-weighted distance and a cluster reorganization mechanism. It introduces the concept of a K-NN nearest domain, elaborates the nearest domain rules, and performs K-NN clustering with a Gaussian-weighted nearest-domain algorithm. A Gaussian function assigns each sample a weight according to its distance from the cluster center, and these weights are used to compute the distances between clusters. The resulting clusters are then reorganized on the basis of nearest-domain weights and cluster density, so that the number of clusters adapts automatically. A splitting operator splits sparse clusters and reassigns abnormal samples, while a merging operator merges similar clusters. Experiments show that the reorganization mechanism effectively improves clustering accuracy and recall and increases cluster density, yielding more reasonable clustering results.
Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journal, 2015, No. 5, pp. 112-116 (5 pages)
Funding: National Natural Science Foundation of China (61363028)
Keywords: text clustering; K-NN; Gauss weighing; nearest domain rule; cluster reorganization
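
The abstract describes weighting each sample by a Gaussian function of its distance from the cluster center and then using those weights when computing cluster distances. The paper's exact formulas are not given in this record, so the following is only a minimal sketch under stated assumptions: the kernel form exp(-d^2 / (2 sigma^2)), the Euclidean metric, the `sigma` parameter, and the function names `gaussian_weight` and `weighted_cluster_distance` are all hypothetical choices made for illustration, not the authors' method.

```python
import math


def gaussian_weight(distance, sigma=1.0):
    """Assumed Gaussian kernel: 1.0 at distance 0, decaying toward 0."""
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(samples):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def weighted_cluster_distance(point, cluster, sigma=1.0):
    """Weighted mean distance from `point` to the members of `cluster`.

    Each member is weighted by a Gaussian of its own distance to the
    cluster centroid, so members far from the center contribute less.
    This aggregation rule is an illustrative assumption.
    """
    center = centroid(cluster)
    weights = [gaussian_weight(euclidean(s, center), sigma) for s in cluster]
    dists = [euclidean(point, s) for s in cluster]
    return sum(w * d for w, d in zip(weights, dists)) / sum(weights)
```

With this kind of weighting, a member far from the cluster center has little influence on the distance, which is one plausible reading of how Gaussian weighting could reduce the effect of abnormal samples. If `sigma` is very small relative to the spread of the cluster, the weights can underflow to zero, so a practical version would need to guard the division.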