Research on Entity Recognition of Electronic Medical Record Based on BERT-CRF Model (基于BERT-CRF模型的电子病历实体识别研究) Cited by: 8
Abstract Entity recognition in electronic medical records is an important foundational task in intelligent medical services. Hospitals currently rely on manual analysis of medical record text during diagnosis and treatment, which is inefficient and prone to omitting key information. To address this, an entity recognition model combining BERT with a conditional random field (CRF) is proposed. The model uses the Chinese pre-trained BERT model, which is based on bidirectionally trained Transformers, and fine-tunes its parameters on a manually annotated corpus that follows the BIOES tagging standard. BERT learns the state features of the character sequence, and the resulting sequence state scores are fed into a CRF layer, which applies constrained optimization to the state transitions of the sequence. BERT's large number of parameters, strong feature extraction capability, and multi-dimensional semantic representation of entities effectively improve entity extraction. Experimental results show that the proposed model achieves an entity recognition F1 score above 88%, significantly outperforming traditional recurrent neural network and convolutional neural network models.
Authors HE Tao (何涛); CHEN Jian (陈剑); WEN Yingyou (闻英友) (Neusoft Research, Northeastern University, Shenyang 110169; Research Center of Safety Engineering Technology in Industrial Control of Liaoning Province, Shenyang 110169)
Source Computer & Digital Engineering (《计算机与数字工程》), 2022, No. 3, pp. 639-643 (5 pages)
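Funding Supported by the National Key R&D Program of China (No. 2018YFC0830601), the Liaoning Provincial Key R&D Program (No. 2019JH2/10100027), the Fundamental Research Funds Project of the Ministry of Education (No. N171802001), and the Liaoning Province "Xingliao Talents Program" (No. XLYC1802100).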
Keywords deep learning; BERT; conditional random field; named entity recognition; electronic medical records
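
The abstract describes per-character state scores from BERT being passed to a CRF layer that constrains tag transitions under the BIOES scheme. The following is a minimal sketch of that decoding step only, not the authors' implementation. It makes several assumptions: the emission scores are invented numbers standing in for BERT outputs, the entity type `DIS` (disease) and the example sentence are hypothetical, and the learned CRF transition scores are replaced by hard constraints (0 for a legal BIOES transition, negative infinity otherwise).

```python
from typing import List, Sequence, Tuple

NEG_INF = float("-inf")


def allowed(prev: str, curr: str) -> bool:
    """Return True if tag `curr` may follow tag `prev` under BIOES."""
    if prev[0] in "BI":  # an entity is still open after `prev`
        return curr[0] in "IE" and curr[2:] == prev[2:]
    return curr[0] in "OBS"  # after O, E or S only O or a new entity may start


def greedy_decode(emissions: Sequence[Sequence[float]], labels: List[str]) -> List[str]:
    """Pick the highest-scoring tag per character, ignoring transitions."""
    return [labels[max(range(len(labels)), key=row.__getitem__)] for row in emissions]


def viterbi_decode(emissions: Sequence[Sequence[float]], labels: List[str]) -> List[str]:
    """Best tag path under hard BIOES constraints (assumed, not learned)."""
    n = len(labels)
    trans = [[0.0 if allowed(p, c) else NEG_INF for c in labels] for p in labels]
    # A sequence may only start with O, B-* or S-*.
    score = [emissions[0][j] if labels[j][0] in "OBS" else NEG_INF for j in range(n)]
    backpointers = []
    for emit in emissions[1:]:
        new_score, ptr = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + trans[i][j])
            new_score.append(score[best_i] + trans[best_i][j] + emit[j])
            ptr.append(best_i)
        score = new_score
        backpointers.append(ptr)
    # A sequence may only end with O, E-* or S-*.
    last = max((j for j in range(n) if labels[j][0] in "OES"), key=lambda j: score[j])
    path = [last]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    path.reverse()
    return [labels[j] for j in path]


def extract_entities(chars: Sequence[str], tags: Sequence[str]) -> List[Tuple[str, str]]:
    """Collect (text, type) pairs from a valid BIOES tag sequence."""
    entities, start = [], None
    for i, tag in enumerate(tags):
        if tag[0] == "S":
            entities.append((chars[i], tag[2:]))
        elif tag[0] == "B":
            start = i
        elif tag[0] == "E" and start is not None:
            entities.append(("".join(chars[start:i + 1]), tag[2:]))
            start = None
    return entities


# Hypothetical example: "患者有高血压" ("the patient has hypertension").
LABELS = ["O", "B-DIS", "I-DIS", "E-DIS", "S-DIS"]
CHARS = list("患者有高血压")
EMISSIONS = [  # made-up scores, one row per character, one column per label
    [5.0, 0.0, 0.0, 0.0, 0.0],
    [5.0, 0.0, 0.0, 0.0, 0.0],
    [5.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 4.0, 0.0, 0.0, 1.0],
    [2.0, 0.0, 1.5, 0.0, 0.0],
    [1.0, 0.0, 0.0, 4.0, 0.0],
]

greedy_tags = greedy_decode(EMISSIONS, LABELS)
crf_tags = viterbi_decode(EMISSIONS, LABELS)
print(greedy_tags)  # per-character argmax, may violate BIOES
print(crf_tags)     # constrained path
print(extract_entities(CHARS, crf_tags))
```

In this toy input the per-character argmax tags the fifth character as `O`, which leaves the `B-DIS` opened on the fourth character unclosed. The constrained decoder instead selects `I-DIS` there, because no legal path can follow `B-DIS` with `O`. A trained CRF layer would learn soft transition scores rather than use the fixed 0 / negative-infinity values assumed here.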