
Semi-supervised Entity Disambiguation Method Research Based on Biterm Topic Model (cited by: 6)
Abstract: To address topic drift in the contextual information of entities, this paper proposes an entity disambiguation method based on the biterm topic model. The method builds on two observations: an entity takes on different topics in different semantic environments, and the other entities that appear in the same document can help, to some extent, determine what the entity to be disambiguated refers to. Following the idea of constructing biterms from named entities, the method incorporates co-occurring entity relationships into the topic model and, on this basis, performs semi-supervised disambiguation using the Wikipedia knowledge base. Experiments on web text data verify the effectiveness of the proposed algorithm and show that the method improves the precision of entity disambiguation.
Authors: ZHANG Xiong (张雄); CHEN Fu-cai (陈福才); HUANG Rui-yang (黄瑞阳) (National Digital Switching System Engineering and Technological R&D Center, Zhengzhou, Henan 450001, China)
Source: Acta Electronica Sinica (《电子学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2018, No. 3, pp. 607-613 (7 pages)
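Funding: National Natural Science Foundation of China (No. 61171108); National Basic Research Program of China ("973" Program) (No. 2012CB315901, No. 2012CB315905); National Key Technology R&D Program of China (No. 2014BAH30B01)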
Keywords: entity disambiguation; Wikipedia; biterm topic model
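
The abstract describes building biterms from the named entities that co-occur in a document, but it does not give the exact construction. The following is a minimal sketch of one plausible reading, assuming a biterm is an unordered pair of distinct entities found in the same document. The function names `entity_biterms` and `count_biterms` and the sample documents are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def entity_biterms(entities):
    """Return unordered pairs of distinct entities from one document.

    Assumption: a biterm is any pair of different named entities that
    co-occur in the same document; the paper may define it differently.
    """
    unique = sorted(set(entities))  # deduplicate and fix a canonical order
    return list(combinations(unique, 2))


def count_biterms(documents):
    """Count how often each entity biterm occurs across all documents."""
    counts = Counter()
    for entities in documents:
        counts.update(entity_biterms(entities))
    return counts


# Hypothetical input: each document is already reduced to its entity mentions.
docs = [
    ["Michael Jordan", "Chicago Bulls", "NBA"],
    ["Michael Jordan", "Berkeley", "machine learning"],
    ["Chicago Bulls", "NBA"],
]
biterm_counts = count_biterms(docs)
```

In a pipeline of this kind, such corpus-level biterm counts would be the input to topic inference. The co-occurring entities around an ambiguous mention such as "Michael Jordan" are what separate its candidate referents.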
