Abstract
This paper applies entity search to the abstract text of Chinese patents and uses entity disambiguation during the search to retrieve the entity-related information that users care about most. Based on a thorough analysis of the characteristics of patent abstract text, it proposes a disambiguation method for patent entities. Word vectors are built with a representation based on the International Patent Classification (IPC) and the vector space model, and are then combined with hierarchical agglomerative clustering (HAC) to produce the patent entity disambiguation results. Comparative experiments show that the method disambiguates entities accurately, with an evaluation score as high as 78.9%.
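The pipeline named in the abstract (vector space representation enriched with IPC information, followed by HAC) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the paper's stated configuration: the toy mentions, the `ipc_weight` parameter, cosine similarity, average linkage, and the stopping threshold are all hypothetical choices.

```python
import math
from collections import Counter


def vectorize(tokens, ipc_code, ipc_weight=2.0):
    """Bag-of-words counts plus one weighted feature for the IPC code.

    The weighting scheme is an assumption; the abstract does not specify it.
    """
    vec = Counter(tokens)
    vec["IPC:" + ipc_code] += ipc_weight
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hac(vectors, threshold):
    """Average-linkage agglomerative clustering.

    Repeatedly merges the two most similar clusters and stops once the
    best available similarity falls below `threshold` (assumed criterion).
    """
    clusters = [[i] for i in range(len(vectors))]

    def avg_sim(c1, c2):
        total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = avg_sim(clusters[x], clusters[y])
                if s > best:
                    best, pair = s, (x, y)
        if best < threshold:
            break
        x, y = pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return sorted(sorted(c) for c in clusters)


# Hypothetical mentions of the ambiguous entity "chip": (context tokens, IPC code).
mentions = [
    (["chip", "circuit", "semiconductor", "wafer"], "H01L"),
    (["chip", "circuit", "transistor", "wafer"], "H01L"),
    (["chip", "potato", "frying", "snack"], "A23L"),
    (["chip", "potato", "snack", "salt"], "A23L"),
]
vectors = [vectorize(tokens, ipc) for tokens, ipc in mentions]
result = hac(vectors, threshold=0.5)
print(result)
```

In this toy setup the shared IPC feature pulls mentions from the same technical field together, so the two senses of the ambiguous term end up in separate clusters. The quadratic merge search is kept for readability and would be too slow for a large patent corpus.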
Source
Journal of Shenyang Aerospace University (《沈阳航空航天大学学报》)
2015, No. 1, pp. 77-83 (7 pages)
Funding
National Natural Science Foundation of China (grant no. 2012BAH14F00)