Abstract
To address topic drift in entity context information, this paper proposes an entity disambiguation method based on the biterm topic model. The method rests on two observations: an entity takes on different topics in different semantic contexts, and other entities that co-occur in the same document help, to some extent, determine what the entity to be disambiguated refers to. Following the idea of constructing biterms from named entities, the method incorporates collaborative (co-occurring) entity relations into the topic model and, on this basis, performs semi-supervised disambiguation using the Wikipedia knowledge base. Experiments on web text data verify the effectiveness of the proposed algorithm and show that the method improves the precision of entity disambiguation.
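The abstract does not spell out how biterms are built from named entities, so the following is only a minimal sketch of one plausible reading: every unordered pair of distinct entities that co-occur in a document is treated as one biterm. The function names `entity_biterms` and `count_biterms`, the de-duplication of repeated mentions, and the toy corpus are all assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter
from itertools import combinations


def entity_biterms(doc_entities):
    """Return unordered pairs of distinct entities co-occurring in one document.

    Assumption: repeated mentions are collapsed, and each pair is stored in
    sorted order so that (a, b) and (b, a) count as the same biterm.
    """
    unique = sorted(set(doc_entities))
    return list(combinations(unique, 2))


def count_biterms(corpus):
    """Aggregate entity-biterm counts over a corpus (a list of entity lists)."""
    counts = Counter()
    for doc_entities in corpus:
        counts.update(entity_biterms(doc_entities))
    return counts


# Hypothetical toy corpus: each document is already reduced to its entity mentions.
corpus = [
    ["Michael Jordan", "Chicago Bulls", "NBA"],
    ["Michael Jordan", "Berkeley", "machine learning"],
    ["Chicago Bulls", "NBA"],
]
biterm_counts = count_biterms(corpus)
```

In a biterm topic model, counts of this kind would feed the inference step that assigns a topic to each biterm. That step, and the semi-supervised use of Wikipedia, are not shown here.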
Authors
ZHANG Xiong (张雄)
CHEN Fu-cai (陈福才)
HUANG Rui-yang (黄瑞阳)
National Digital Switching System Engineering and Technological R&D Center, Zhengzhou, Henan 450001, China
Source
Acta Electronica Sinica (《电子学报》)
EI
CAS
CSCD
PKU Core (北大核心)
2018, No. 3, pp. 607-613 (7 pages)
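Funding
National Natural Science Foundation of China (No. 61171108)
National Basic Research Program of China ("973" Program) (No. 2012CB315901, No. 2012CB315905)
National Key Technology R&D Program of China (No. 2014BAH30B01)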
Keywords
entity disambiguation
Wikipedia
biterm topic model