Abstract
Given the importance of handwriting recognition in today's society, this paper uses the open dataset "Optical Recognition of Handwritten Digits" from the University of California to conduct a prospective analysis, in the hope of providing a reference for future research. First, a model based on the K-means clustering algorithm is built to cluster the handwritten digits of the different groups. The clustering quality is then evaluated with two measures, the adjusted Rand index (ARI) and the Silhouette coefficient, to verify the validity and reliability of the clustering results. The experimental results confirm that K-means performs relatively stably on this kind of data. If the sample size continues to grow and the extracted feature points can be organized into multi-dimensional matrices, K-means is expected to achieve better results.
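The workflow summarized in the abstract (K-means clustering followed by ARI and Silhouette evaluation) can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes scikit-learn is available and uses `load_digits`, which ships a copy of the test split of the UCI "Optical Recognition of Handwritten Digits" dataset (1,797 samples of 8x8 images). The function name `cluster_digits`, the choice of 10 clusters, and the random seed are assumptions made for the example; the abstract does not state the paper's preprocessing or parameter settings.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.metrics import adjusted_rand_score, silhouette_score


def cluster_digits(n_clusters=10, seed=0):
    """Cluster the digits data with K-means and score the result.

    n_clusters and seed are illustrative defaults, not values from the paper.
    """
    digits = load_digits()
    X, y = digits.data, digits.target  # X: (1797, 64) pixel features, y: true digit labels

    # n_init is set explicitly so the behavior is the same across scikit-learn versions.
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(X)

    # ARI compares the cluster assignment with the known digit labels (external measure).
    ari = adjusted_rand_score(y, labels)
    # The Silhouette coefficient uses only the features and the assignment (internal measure).
    sil = silhouette_score(X, labels)
    return labels, ari, sil


if __name__ == "__main__":
    labels, ari, sil = cluster_digits()
    print(f"ARI: {ari:.3f}")
    print(f"Silhouette coefficient: {sil:.3f}")
```

The two scores answer different questions: ARI needs the ground-truth labels and measures agreement with them, while the Silhouette coefficient measures how compact and well separated the clusters are without any labels. Reporting both, as the abstract describes, guards against a clustering that looks tight geometrically but does not correspond to the actual digit classes.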
Author
CAI Kun-peng (蔡鲲鹏), School of Computer and Information Engineering, Fuyang Normal University, Fuyang, Anhui 236041, China
Source
Journal of Xinjiang Normal University (Natural Sciences Edition) (《新疆师范大学学报(自然科学版)》), 2022, Issue 3, pp. 64-72 (9 pages)
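Funding
Key Natural Science Research Project of Anhui Higher Education Institutions (KJ2019A0541)
Key Natural Science Research Project of Fuyang Normal University (2020FSKJ08ZD)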