
KNN classification algorithm based on improved K-modes clustering

Cited by: 23
Abstract: The K-modes algorithm has a high error rate when it initializes its k clusters, and the KNN (K-nearest neighbor) algorithm classifies inaccurately and runs inefficiently on large sample sets. To address these problems, the full course of the traditional K-modes algorithm, from the initialization of the k clusters until the cluster centers no longer change, and the inefficiency of KNN on large sample data are analyzed, and an improved K-modes-KNN algorithm is proposed. A string kernel function is used to initialize the k clusters and then to iteratively compute the distance from each sample to the cluster centers, so that the centers are updated dynamically. The improved K-modes algorithm partitions the data set into clusters, and a KNN classification model is then built within each sub-cluster. Experiments on real data from a research institute show that the proposed algorithm outperforms comparable classification algorithms to some extent.
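The cluster-then-classify structure described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the string kernel function, so a simple matching dissimilarity (the count of differing attributes) stands in for it, and the initial modes are simply the first k distinct records. All function names (`mismatch`, `k_modes`, `fit`, `predict`) are hypothetical.

```python
from collections import Counter


def mismatch(a, b):
    """Simple matching dissimilarity: number of attributes that differ.
    Assumption: stands in for the paper's string kernel distance."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Attribute-wise most frequent value of a non-empty list of records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def nearest(centers, x):
    """Index of the center closest to x (ties go to the lowest index)."""
    return min(range(len(centers)), key=lambda j: mismatch(x, centers[j]))


def k_modes(X, k, max_iter=20):
    """Plain K-modes over tuples of categorical values.
    Assumption: the first k distinct records serve as initial modes."""
    centers = list(dict.fromkeys(X))[:k]
    for _ in range(max_iter):
        assign = [nearest(centers, x) for x in X]
        new_centers = []
        for j, old in enumerate(centers):
            members = [x for x, a in zip(X, assign) if a == j]
            # Keep the old mode if a cluster ends up empty.
            new_centers.append(column_modes(members) if members else old)
        if new_centers == centers:  # centers no longer change
            break
        centers = new_centers
    return centers, [nearest(centers, x) for x in X]


def fit(X, y, k):
    """Cluster the training set, then store labeled records per cluster."""
    centers, assign = k_modes(X, k)
    buckets = [[] for _ in centers]
    for x, label, j in zip(X, y, assign):
        buckets[j].append((x, label))
    return centers, buckets


def predict(model, x, n_neighbors=3):
    """Route x to its nearest cluster, then vote among neighbors there."""
    centers, buckets = model
    # Fall back to all training records if the chosen cluster is empty.
    bucket = buckets[nearest(centers, x)] or [p for b in buckets for p in b]
    neighbours = sorted(bucket, key=lambda item: mismatch(x, item[0]))
    votes = Counter(label for _, label in neighbours[:n_neighbors])
    return votes.most_common(1)[0][0]


# Toy categorical data, invented purely for illustration.
X = [
    ("red", "round", "sweet"),
    ("green", "long", "bland"),
    ("red", "round", "sour"),
    ("green", "long", "bitter"),
    ("red", "oval", "sweet"),
    ("yellow", "long", "bland"),
]
y = ["fruit", "vegetable", "fruit", "vegetable", "fruit", "vegetable"]

model = fit(X, y, k=2)
print(predict(model, ("red", "oval", "sour")))
```

Because each query is compared only against the records in one cluster rather than the whole training set, the neighbor search touches fewer samples, which is the efficiency argument the abstract makes for combining the two algorithms.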
Authors: WANG Zhi-hua (王志华); LIU Shao-ting (刘绍廷); LUO Qi (罗齐) (School of Software and Applied Science and Technology, Zhengzhou University, Zhengzhou 450002, China)
Source: Computer Engineering and Design (《计算机工程与设计》), PKU Core Journal, 2019, No. 8, pp. 2228-2234 (7 pages)
Funding: National Social Science Fund of China (15BTQ064); Henan Province Science and Technology Research Fund Project (182102210007)
Keywords: K-modes algorithm; KNN algorithm; classification; cluster center; K-modes-KNN algorithm; string kernel function