Abstract
Computing the semantic similarity of Chinese words is a key problem in Chinese information processing. Working within HowNet, this paper analyzes how the structure of the concept hierarchy tree affects word similarity results and proposes a sememe similarity calculation method that jointly considers several factors, including the depth and density of the hierarchy tree and the semantic path. The method is then applied to word similarity calculation. Experimental results show that it yields more reasonable word similarity scores, that the great majority of results agree better with everyday human intuition, and that it effectively improves the precision and accuracy of word similarity calculation.
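The abstract names three structural factors (depth, density, and semantic path) but does not state the proposed formula, so the paper's method is not reproduced here. As a minimal sketch of what those factors measure, the Python below computes them on a toy hierarchy and evaluates two well-known generic measures: the path-based score `alpha / (d + alpha)` and a Wu-Palmer style depth ratio. The node names, the `PARENT` table, the helper functions, and the value `alpha=1.6` are all assumptions made for illustration; none of them come from HowNet data or from the paper.

```python
# Toy child -> parent table. Hypothetical node names, NOT HowNet's real taxonomy.
PARENT = {
    "entity": None,
    "thing": "entity",
    "animate": "thing",
    "inanimate": "thing",
    "human": "animate",
    "animal": "animate",
    "artifact": "inanimate",
}


def ancestors(node):
    """Return the chain from node up to the root, starting with node itself."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def depth(node):
    """Depth of a node in the tree, counting the root as depth 1."""
    return len(ancestors(node))


def lowest_common_ancestor(a, b):
    """Deepest node that lies on both root paths."""
    seen = set(ancestors(a))
    for node in ancestors(b):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")


def path_length(a, b):
    """Number of edges on the path between a and b through their common ancestor."""
    lca = lowest_common_ancestor(a, b)
    return depth(a) + depth(b) - 2 * depth(lca)


def density(node):
    """Local density, taken here as the number of direct children of node."""
    return sum(1 for parent in PARENT.values() if parent == node)


def path_similarity(a, b, alpha=1.6):
    """Generic path-based score alpha / (d + alpha); alpha is an assumed tunable constant."""
    return alpha / (path_length(a, b) + alpha)


def depth_similarity(a, b):
    """Wu-Palmer style score: deeper common ancestors give higher similarity."""
    lca = lowest_common_ancestor(a, b)
    return 2 * depth(lca) / (depth(a) + depth(b))


if __name__ == "__main__":
    for pair in [("human", "animal"), ("human", "artifact")]:
        print(pair, path_length(*pair), path_similarity(*pair), depth_similarity(*pair))
```

In this toy tree, `human` and `animal` share a deeper common ancestor than `human` and `artifact`, so both measures rank the first pair as more similar. A multi-factor method of the kind the abstract describes would combine such quantities into a single score; how the paper weights them is not stated in this record.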
Source
Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition)
Peking University Core Journal
2009, No. 4, pp. 533-537 (5 pages)
Funding
Key Project of the Natural Science Foundation of Chongqing (2008BA2017)
Chongqing Special Fund for Information Industry Development (200811004)
Keywords
HowNet
semantics
word similarity
sememe