Abstract
To address the problem of similar questions in large question banks, this paper proposes a method for identifying them automatically. The word semantic similarity model of HowNet is improved by introducing domain vocabulary, and a question de-duplication model is built on it to compute the similarity between questions, so that identical and similar questions in the bank are identified automatically. This improves identification accuracy compared with other methods. By combining the merits of random selection with those of trial-and-backtrack selection, a paper generation algorithm based on similar question identification is proposed, which improves the quality of the generated papers. Experiments show that the accuracy of question similarity identification reaches 96%, very close to that of manual judgment. The method eliminates similar questions not only within a single question type but also across different types. It has been applied to on-line computer-based examinations for the C programming language.
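The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of random selection with backtracking under a similarity constraint. Everything in it is an assumption made for illustration: the function names `jaccard` and `pick_questions`, the 0.5 threshold, and above all the token-overlap (Jaccard) measure, which stands in for the paper's HowNet-based semantic similarity and is not the authors' model.

```python
import random


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1].

    Assumption: a simple stand-in for a semantic similarity measure,
    not the HowNet-based model described in the abstract.
    """
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def pick_questions(pool, count, threshold=0.5, seed=None):
    """Pick `count` mutually dissimilar questions, or None if impossible.

    Candidates are visited in random order (random selection); when a
    partial choice cannot be completed, the last pick is undone and the
    next candidate is tried (backtracking).
    """
    order = list(pool)
    random.Random(seed).shuffle(order)
    chosen = []

    def extend(start):
        if len(chosen) == count:
            return True
        for i in range(start, len(order)):
            cand = order[i]
            # Accept a candidate only if it is dissimilar to every pick so far.
            if all(jaccard(cand, q) < threshold for q in chosen):
                chosen.append(cand)
                if extend(i + 1):
                    return True
                chosen.pop()  # backtrack: undo this pick and try the next one
        return False

    return chosen if extend(0) else None


if __name__ == "__main__":
    bank = [
        "compute the sum of an integer array",
        "compute the sum of an integer list",
        "reverse a string in place",
    ]
    print(pick_questions(bank, 2, seed=1))
```

In this sketch the first two questions in `bank` share most of their tokens, so at most one of them can appear in a selected pair, and asking for three questions returns `None`.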
Source
Journal of Harbin Institute of Technology
Indexed in: EI, CAS, CSCD, PKU Core
2009, No. 1, pp. 85-88 (4 pages)
Funding
Supported by the National Natural Science Foundation of China (Grant No. 60673035)
Keywords
similar question identification
automatic test paper generation
difficulty level
question bank system