Applying Mahalanobis Distance-Based Text Clustering Algorithm in Automatic Paper Marking System

Cited by: 6
Abstract: Traditional fuzzy partition clustering algorithms based on Euclidean distance are best suited to clusters with a spherical structure. When they are applied to high-dimensional text clustering, both accuracy and efficiency decline. To address this problem, we propose a text clustering algorithm based on Mahalanobis distance. The algorithm can detect clusters with non-spherical structure and obtains the clustering result through mathematical iteration alone, without requiring prior knowledge. In view of the wide use of paperless examination systems, we apply the algorithm to an automatic marking system for subjective questions. Simulation experiments on a variety of subjective questions demonstrate that, compared with the c-means and FCM algorithms, the proposed algorithm achieves higher accuracy and also converges faster.
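The contrast the abstract draws between Euclidean and Mahalanobis distance can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give their update rules, and the cluster center, covariance matrix, and test points below are hypothetical values chosen only to show why a covariance-aware distance suits elongated (non-spherical) clusters.

```python
import numpy as np

def euclidean_sq(x, center):
    """Squared Euclidean distance between x and a cluster center."""
    diff = np.asarray(x, dtype=float) - np.asarray(center, dtype=float)
    return float(diff @ diff)

def mahalanobis_sq(x, center, cov):
    """Squared Mahalanobis distance (x - c)^T cov^{-1} (x - c)."""
    diff = np.asarray(x, dtype=float) - np.asarray(center, dtype=float)
    # Solve cov @ y = diff rather than forming the explicit inverse.
    return float(diff @ np.linalg.solve(cov, diff))

# Hypothetical elongated cluster: variance 9 along axis 0, variance 1 along axis 1.
center = np.array([0.0, 0.0])
cov = np.array([[9.0, 0.0],
                [0.0, 1.0]])

along = np.array([3.0, 0.0])   # displaced along the long axis of the cluster
across = np.array([0.0, 3.0])  # displaced across the short axis

# Euclidean distance treats both points as equally far from the center.
d_e_along = euclidean_sq(along, center)
d_e_across = euclidean_sq(across, center)

# Mahalanobis distance scales each direction by the cluster's spread,
# so the point along the long axis counts as much closer.
d_m_along = mahalanobis_sq(along, center, cov)
d_m_across = mahalanobis_sq(across, center, cov)

print(d_e_along, d_e_across)
print(d_m_along, d_m_across)
```

In a fuzzy clustering loop, a distance of this kind would typically replace the Euclidean term when memberships are computed, with each cluster's covariance re-estimated at every iteration. That is a general pattern assumed here, not a detail confirmed by the abstract.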
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, 2015, No. 4, pp. 80-82, 86 (4 pages)
Funding: Natural Science Research Program of the Education Department of Henan Province (2011C510002)
Keywords: clustering; text clustering; fuzzy c-means (FCM); Euclidean distance; Mahalanobis distance; automatic paper marking