
Semantic Textual Similarity Evaluation Method Using Cross Self-attention and a Pre-trained Model
Abstract: Evaluating the semantic similarity of two sentences is an important component of natural language processing (NLP) tasks such as text retrieval and text summarization. Researchers typically employ deep neural networks for this task, but those networks depend on context-independent word embeddings, which leads to poor performance. To alleviate this drawback, this paper replaces traditional word embeddings with the pre-trained model BERT (Bidirectional Encoder Representations from Transformers) and proposes a cross self-attention mechanism, combined with BERT, to enhance the semantics of a sentence pair. Because BERT's outputs for the two sentences are not aligned, cross self-attention cannot be applied directly, so an alignment method for the vectors is designed. Finally, the output is fed into a bidirectional recurrent neural network (bi-RNN) to stabilize the results and dampen the volatility of BERT itself. In the experiments, three open corpora, DBMI2019, CDD-ref, and CDD-ful, are used to assess the full model. The results show that, owing to the contextual word embeddings produced by BERT, the proposed model consistently outperforms existing methods. The semantic interaction realized by cross self-attention lets the two sentences enrich each other's semantics, reducing the semantic difference between similar sentence pairs and enlarging it between dissimilar pairs, which improves the performance of similarity evaluation. The proposed model ultimately achieves Pearson correlation coefficients of 0.846, 0.849, and 0.845 on DBMI2019, CDD-ref, and CDD-ful, respectively, surpassing the method that directly uses the [CLS] output vector for evaluation.
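The pipeline described above (BERT token embeddings → alignment → cross self-attention between the two sentences → bi-RNN → similarity score) can be illustrated with a short sketch. The abstract does not give the exact alignment or attention formulas, so everything below is an assumption made for illustration: scaled dot-product cross-attention, zero-padding as the alignment step, a BiGRU as the bi-RNN, and mean pooling with cosine similarity as the score. All names (CrossSelfAttentionScorer, bert-base-uncased) are hypothetical, attention masks are omitted for brevity, and this is a minimal sketch, not the authors' implementation:

```python
# Minimal sketch of the hybrid architecture in the abstract. Assumptions
# (not specified by the paper): scaled dot-product cross self-attention,
# zero-padding as the alignment step, a BiGRU as the bi-RNN, cosine
# similarity as the final score; attention masks omitted for brevity.
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import BertModel, BertTokenizer


class CrossSelfAttentionScorer(nn.Module):  # hypothetical name
    def __init__(self, bert_name="bert-base-uncased", hidden=128):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        dim = self.bert.config.hidden_size
        # Bidirectional RNN intended to stabilize BERT's token outputs.
        self.rnn = nn.GRU(dim, hidden, bidirectional=True, batch_first=True)

    def cross_attend(self, a, b):
        # Queries from sentence a, keys/values from sentence b: each token
        # of a is re-expressed as a mixture of b's tokens.
        scores = a @ b.transpose(1, 2) / a.size(-1) ** 0.5
        return F.softmax(scores, dim=-1) @ b

    def forward(self, enc_a, enc_b):
        ha = self.bert(**enc_a).last_hidden_state  # (batch, len_a, dim)
        hb = self.bert(**enc_b).last_hidden_state  # (batch, len_b, dim)
        # "Alignment" step (assumed): zero-pad the shorter sequence so the
        # two token matrices have the same length.
        n = max(ha.size(1), hb.size(1))
        ha = F.pad(ha, (0, 0, 0, n - ha.size(1)))
        hb = F.pad(hb, (0, 0, 0, n - hb.size(1)))
        # Semantic interaction: enrich each sentence with the other.
        ea = ha + self.cross_attend(ha, hb)
        eb = hb + self.cross_attend(hb, ha)
        # Mean-pool the BiGRU outputs into sentence vectors and score.
        va = self.rnn(ea)[0].mean(dim=1)
        vb = self.rnn(eb)[0].mean(dim=1)
        return F.cosine_similarity(va, vb)


tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = CrossSelfAttentionScorer()
a = tokenizer("The cat sat on the mat.", return_tensors="pt")
b = tokenizer("A cat is sitting on a mat.", return_tensors="pt")
with torch.no_grad():
    print(model(a, b).item())  # similarity score in [-1, 1]
```

In use, the scores predicted for a labeled test set would be compared against the gold similarity ratings, e.g. with scipy.stats.pearsonr, to obtain Pearson correlation coefficients of the kind reported in the abstract.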
Authors: LI Zheng-guang; CHEN Heng; LI Yuan-gang (Research Center for Language Intelligence, Dalian University of Foreign Languages, Dalian 116044, China; Institute of Belt-Road Urban & Regional Development, Dalian University of Foreign Languages, Dalian 116044, China; Faculty of Business Information, Shanghai Business School, Shanghai 200235, China)
Source: Mathematics in Practice and Theory, 2022, Issue 7, pp. 165-175 (11 pages)
Funding: Liaoning Province Program for Innovative Talents in Higher Education, 2019 (WR2019005); Liaoning Province Education Science "13th Five-Year Plan" Project, 2020 (JG20DB120)
Keywords: semantic textual similarity; cross self-attention; pre-trained model; semantic interaction