Abstract
This paper proposes an entity discovery and linking method based on random walk with restart. A random walk over a graph built from a subset of the knowledge base's entities yields a unified semantic representation, namely a probability distribution, for both entities and mentions. Using this distributed representation, the entity most similar to a mention is selected as its linking result. On the entity linking portion of the TAC 2015 Tri-Lingual Entity Discovery and Linking (TEDL) task, the method achieved an F-score of 0.665, higher than the other participating systems. The experimental results indicate that the method effectively overcomes the feature sparsity problem and reduces the effect of entity popularity differences on the results.
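The abstract does not give the algorithm in detail, so the following is only a minimal sketch of the general technique, not the authors' implementation. It assumes a small weighted adjacency matrix, a restart probability chosen for illustration, and cosine similarity as the comparison measure; the names `rwr`, `cosine`, `link`, and `ADJ` are hypothetical.

```python
import math

def rwr(adj, seed, restart=0.15, tol=1e-12, max_iter=10_000):
    """Stationary distribution of a random walk with restart over `adj`."""
    n = len(adj)
    total_seed = sum(seed)
    seed = [x / total_seed for x in seed]          # normalise the restart vector
    trans = []
    for row in adj:
        total = sum(row)
        # Row-normalise edge weights; a node with no edges jumps back to the seed.
        trans.append([w / total for w in row] if total else list(seed))
    r = list(seed)
    for _ in range(max_iter):
        nxt = [
            restart * seed[j]
            + (1 - restart) * sum(r[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, r)) < tol:
            return nxt
        r = nxt
    return r

def cosine(u, v):
    """Cosine similarity (an assumed measure; the abstract names none)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def link(adj, context_seed, candidates, restart=0.15):
    """Pick the candidate whose RWR distribution best matches the mention's."""
    mention_vec = rwr(adj, context_seed, restart)

    def score(candidate):
        one_hot = [1.0 if i == candidate else 0.0 for i in range(len(adj))]
        return cosine(mention_vec, rwr(adj, one_hot, restart))

    return max(candidates, key=score)

# Toy entity graph (illustrative only): nodes 0-2 are tightly connected,
# nodes 3-4 hang off node 2.
ADJ = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]

# The mention's context contains entities 0 and 1; candidates are 2 and 4.
best = link(ADJ, [1, 1, 0, 0, 0], candidates=[2, 4], restart=0.3)
print(best)
```

Because both the mention and each candidate are represented as distributions over the same graph nodes, they can be compared directly even when few surface features are shared, which is the property the abstract credits for handling feature sparsity.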
Source
Journal of Beijing University of Posts and Telecommunications (《北京邮电大学学报》)
Indexed in: EI, CAS, CSCD, PKU Core
2017, No. 6, pp. 115-119 (5 pages)
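Funding
Open Project of the Beijing Key Laboratory of Network Culture and Digital Communication (ICDD201703)
National Natural Science Foundation of China, General Program (61671070)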
Keywords
entity linking
semantic relatedness
random walk