Abstract
Traditional entity alignment methods fail to capture latent semantic information. To address this, the paper proposes an optimized approach that improves alignment quality. A latent Dirichlet allocation (LDA) model is used to model the unstructured data of online encyclopedias, and an improved belief propagation (BP) algorithm is used to infer the hidden parameters of the LDA model. The resulting entity feature vectors are compared by similarity, and the similarity score determines whether two entities can be aligned. Experiments comparing the proposed algorithm with three traditional algorithms show improvements in precision, recall, and F-score. For online encyclopedia entities that have descriptive text, the algorithm effectively improves entity alignment.
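The final step described in the abstract, comparing entity feature vectors and deciding whether two entities align, can be illustrated with a minimal sketch. The abstract does not say which similarity measure or decision rule the authors use, so cosine similarity and the fixed threshold of 0.8 below are assumptions, as are the function names and the example topic distributions. The LDA inference with improved BP is not reproduced here; the vectors are taken as given.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector carries no topic information; treat as dissimilar.
        return 0.0
    return dot / (norm_u * norm_v)


def is_aligned(theta_a, theta_b, threshold=0.8):
    """Decide alignment by thresholding similarity.

    The threshold value is an assumption for illustration only.
    """
    return cosine_similarity(theta_a, theta_b) >= threshold


# Hypothetical per-entity topic distributions (e.g. as produced by an LDA model).
theta_entity_a = [0.70, 0.20, 0.10]
theta_entity_b = [0.65, 0.25, 0.10]  # similar topic mix to entity A
theta_entity_c = [0.05, 0.15, 0.80]  # dominated by a different topic

print(is_aligned(theta_entity_a, theta_entity_b))
print(is_aligned(theta_entity_a, theta_entity_c))
```

In this sketch, entities A and B share a dominant topic and score well above the threshold, while A and C do not. In practice the threshold would need to be tuned on labeled entity pairs.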
Authors
Liu Zhenpeng (刘振鹏)
He Mengjie (贺梦洁)
Zhang Bin (张彬)
Dong Jing (董静)
Xu Jianmin (徐建民)
Affiliations: College of Electronic Information Engineering, Hebei University, Baoding, Hebei 071002, China; Information Technology Center, Hebei University, Baoding, Hebei 071002, China; School of Cyber Security & Computer, Hebei University, Baoding, Hebei 071002, China
Source
Application Research of Computers (《计算机应用研究》)
CSCD
Peking University Core Journal (北大核心)
2019, No. 11, pp. 3286-3289, 3343 (5 pages)
Funding
Natural Science Foundation of Hebei Province (Grant No. 2015201142)