Abstract
Hypernymy relationship extraction is a key step in knowledge graph construction. The commonly used pattern-based and distributional methods suffer from poor portability and low recall. To address these problems, a hypernymy relationship extraction method based on multi-channel feature fusion is proposed, in which the model encoder is built from three channels: pre-trained word embeddings, a bidirectional LSTM, and an encoding of the dependency syntax tree. First, an overall framework for hypernymy relationship extraction is presented, comprising a data mining and annotation module, a feature extraction module, a candidate sentence scoring module, and a result ranking module. Then, for the feature extraction module, an adaptive encoding method is proposed that fuses syntactic dependency relations, contextual features, and pre-trained features; for the sentence scoring module, a network model with an encoder-decoder structure is proposed. Finally, ablation experiments evaluated on accuracy, recall, and other metrics indicate that the proposed model is effective and offers better interpretability.
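The abstract does not say how the three channels are combined, so the following is only a minimal sketch of one common way to fuse channel features adaptively: a softmax-weighted sum over equal-length channel vectors. The function names `softmax` and `fuse_channels`, the gate scores, and the toy vectors are all hypothetical and are not taken from the paper; in a trained model the gate scores would be learned rather than fixed.

```python
import math


def softmax(scores):
    """Convert raw gate scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_channels(channels, gate_scores):
    """Fuse equal-length channel vectors by a softmax-weighted sum.

    channels    -- list of feature vectors, one per channel (assumed layout)
    gate_scores -- one raw score per channel (hypothetical; learned in practice)
    Returns the fused vector and the channel weights that produced it.
    """
    if len(channels) != len(gate_scores):
        raise ValueError("need exactly one gate score per channel")
    dim = len(channels[0])
    if any(len(c) != dim for c in channels):
        raise ValueError("all channel vectors must have the same length")
    weights = softmax(gate_scores)
    fused = [
        sum(w * c[i] for w, c in zip(weights, channels))
        for i in range(dim)
    ]
    return fused, weights


# Toy vectors standing in for the three channels named in the abstract.
pretrained = [0.2, 0.4, 0.1]  # pre-trained word embedding channel
context = [0.5, 0.1, 0.3]     # bidirectional LSTM (contextual) channel
syntax = [0.0, 0.6, 0.9]      # dependency syntax tree channel

# Equal gate scores give equal weights, so the result is a plain average.
fused, weights = fuse_channels([pretrained, context, syntax], [0.0, 0.0, 0.0])
```

Returning the weights alongside the fused vector makes it possible to inspect how much each channel contributes to a given input, which is one simple way such a design could support interpretability.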
Authors
Jing Qidong (靖琦东)
Zhai Zhichu (翟值楚)
Zhou Zailong (周在龙)
Yang Songbai (杨松柏)
CEC Industrial Internet Co., Ltd., Changsha, Hunan 410000, China
Source
Communications Technology (《通信技术》)
2023, No. 6, pp. 744-749 (6 pages)
Keywords
hypernymy relationship extraction
multi-channel feature fusion
graph convolutional network
dependency syntax tree