Abstract
To address the incomplete semantic information carried by word vectors and the polysemy problem that arises during text feature extraction, a Twice Attention weighting algorithm for Relation Extraction (TARE) based on BERT (Bidirectional Encoder Representations from Transformers) word vectors is proposed. First, in the word-vector encoding stage, a self-attention dynamic encoding algorithm constructs Q, K and V matrices so that the vector of the current word captures the semantic information of the words before and after it in the text. Second, after the model outputs the sentence-level feature vectors, a locator is used to extract the corresponding parameters of the fully connected layer and build a relation attention matrix. Finally, a sentence-level attention mechanism assigns a different attention score to each sentence-level feature vector, improving the noise resistance of the sentence-level features. Experimental results show that, on the NYT-10m dataset, TARE improves the F1 value by 4.0 percentage points over the contrastive-learning-based CIL (Contrastive Instance Learning) algorithm, and improves P@M, the mean of Precision@N over the top 100, 200 and 300 predictions ranked by descending confidence, by 11.3 percentage points. On the NYT-10d dataset, compared with the attention-based PCNN-ATT (Piecewise Convolutional Neural Network algorithm based on ATTention mechanism), the area under the precision-recall curve (AUC) increases by 4.8 percentage points and P@M increases by 2.1 percentage points. On mainstream Distantly Supervised Relation Extraction (DSRE) tasks, TARE effectively improves the model's ability to learn data features.
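The sentence-level weighting step and the P@M metric described in the abstract can be illustrated with a minimal sketch. This record does not give the paper's exact formulation, so the sketch makes assumptions: the relation query is taken to be the row of the fully connected layer's weight matrix that corresponds to the relation, the score is a plain dot product, and the function names (`bag_attention`, `precision_at_n`, `p_at_m`) are hypothetical.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bag_attention(sentence_vecs, fc_weight, relation_id):
    """Weight the sentence vectors of one bag by their match to a relation.

    sentence_vecs: list of sentence-level feature vectors (same length each).
    fc_weight:     list of weight rows, one per relation (assumed layout).
    relation_id:   index of the relation whose weight row acts as the query.
    """
    query = fc_weight[relation_id]
    # Dot-product score between each sentence vector and the relation query.
    scores = [sum(s * q for s, q in zip(vec, query)) for vec in sentence_vecs]
    alpha = softmax(scores)  # attention scores, summing to 1
    # Bag representation: attention-weighted sum of the sentence vectors.
    bag_vec = [
        sum(a * vec[d] for a, vec in zip(alpha, sentence_vecs))
        for d in range(len(query))
    ]
    return alpha, bag_vec


def precision_at_n(confidences, is_correct, n):
    # Precision among the n predictions with the highest confidence.
    ranked = sorted(zip(confidences, is_correct),
                    key=lambda pair: pair[0], reverse=True)[:n]
    return sum(1 for _, ok in ranked if ok) / len(ranked)


def p_at_m(confidences, is_correct, ns=(100, 200, 300)):
    # Mean of Precision@N over the given cut-offs.
    values = [precision_at_n(confidences, is_correct, n) for n in ns]
    return sum(values) / len(values)
```

Under these assumptions, a sentence whose features align poorly with the relation query receives a small attention score, which is one way a noisy sentence in a distantly supervised bag can be down-weighted.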
Authors
袁泉
陈昌平
陈泽
詹林峰
YUAN Quan; CHEN Changping; CHEN Ze; ZHAN Linfeng (School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; Research Center of New Communication Technology Applications, Chongqing University of Posts and Telecommunications, Chongqing 400065, China)
Source
《计算机应用》
CSCD
PKU Core Journals
2024, No. 4, pp. 1080-1085 (6 pages)
Journal of Computer Applications
Keywords
distant supervision
relation extraction
attention mechanism
word embedding feature
fully connected layer