
Agricultural Named Entity Recognition Method Based on XLNet
Abstract: As artificial intelligence research in agriculture deepens, named entity recognition (NER) in agricultural texts has become one of the foundations for downstream tasks. Because the agricultural domain lacks public corpora, this study constructed its own annotated corpus of agricultural texts. To address insufficient semantic representation of text, missing contextual features, and the difficulty of representing diverse word senses in word vectors, this paper proposes XLNet-IDCNN-CRF, an agricultural NER model based on XLNet (Generalized Autoregressive Pretraining for Language Understanding). The XLNet embedding layer produces vector representations of the input text, enriching its semantic information and alleviating polysemy. The encoding layer, an Iterated Dilated Convolutional Neural Network (IDCNN), extracts text features and, because it computes in parallel, reduces training time. These features are fed to the output layer, a Conditional Random Field (CRF), which identifies the label information and outputs the optimal label sequence. On the self-built corpus the model achieved a precision of 95.58%, a recall of 92.36%, and an F1 score of 93.91%, outperforming the other models compared. The experimental results show that XLNet-IDCNN-CRF handles the agricultural named entity recognition task well.
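The encoding and output stages described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it uses plain Python instead of a deep learning framework, omits the XLNet embedding layer entirely, and every concrete value in it is an assumption made for illustration (the tag names `O`, `B-CROP`, `I-CROP`, the emission and transition scores, the kernel width of 3, and the dilations 1, 2, 4). It shows how stacked dilated convolutions widen the context each token sees, and how Viterbi decoding over CRF-style scores picks the best label sequence.

```python
from typing import Dict, List, Sequence


def dilated_conv1d(xs: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """Same-length 1D dilated convolution (cross-correlation form) with zero padding.

    Assumes an odd kernel width so the kernel is centred on each position.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(xs)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * dilation  # taps sit `dilation` positions apart
            if 0 <= j < len(xs):           # positions outside the input count as zero
                acc += w * xs[j]
        out.append(acc)
    return out


def receptive_field(kernel_width: int, dilations: Sequence[int]) -> int:
    """Input positions visible to one output position after stacking the layers."""
    return 1 + sum((kernel_width - 1) * d for d in dilations)


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
) -> List[str]:
    """Return the highest-scoring label sequence.

    emissions[t][y] is the score of label y at token t;
    transitions[prev][cur] is the score of moving from label prev to label cur.
    """
    labels = list(emissions[0])
    score = {y: emissions[0][y] for y in labels}
    backpointers = []
    for em in emissions[1:]:
        new_score, ptr = {}, {}
        for cur in labels:
            # Best previous label for ending at `cur` on this token.
            prev = max(labels, key=lambda p: score[p] + transitions[p][cur])
            new_score[cur] = score[prev] + transitions[prev][cur] + em[cur]
            ptr[cur] = prev
        score = new_score
        backpointers.append(ptr)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: score[y])]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Hypothetical tag set and scores, for illustration only.
LABELS = ["O", "B-CROP", "I-CROP"]
transitions = {p: {c: 0.0 for c in LABELS} for p in LABELS}
transitions["O"]["I-CROP"] = -10.0      # an entity cannot start with an I- tag
transitions["B-CROP"]["I-CROP"] = 0.5
transitions["I-CROP"]["I-CROP"] = 0.5

emissions = [
    {"O": 1.0, "B-CROP": 0.8, "I-CROP": 0.1},
    {"O": 0.2, "B-CROP": 0.3, "I-CROP": 1.5},
    {"O": 1.0, "B-CROP": 0.1, "I-CROP": 0.2},
]

best_path = viterbi_decode(emissions, transitions)
field = receptive_field(3, [1, 2, 4])
smoothed = dilated_conv1d([1, 2, 3, 4, 5], [1, 1, 1], dilation=2)
```

With these made-up scores, picking the top label per token independently would start with `O` and then jump to `I-CROP`, which is not a valid tag sequence. The transition penalty steers the decoder to `B-CROP`, `I-CROP`, `O` instead, which is the kind of label consistency a CRF output layer is meant to provide. The three stacked layers see 15 input positions, while three undilated layers of the same width would see only 7.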
Authors: CHEN Ming; GU Fan (School of Information, Shanghai Ocean University, Shanghai 201306, China; Key Laboratory of Fishery Information, Ministry of Agriculture, Shanghai 201306, China)
Source: Journal of Hebei Agricultural University (indexed in CAS, CSCD, Peking University Core), 2023, No. 4, pp. 111-117 (7 pages)
Funding: Jiangsu Modern Agricultural Industry Key Technology Innovation Project (CX(20)2028).
Keywords: agricultural text; named entity recognition; XLNet model; pretrained language model; iterative dilated convolution