Medical Named Entity Recognition with Domain Knowledge (original title: 融合领域知识的医学命名实体识别研究) Cited by: 5
Abstract [Objective] This paper builds a graph neural network model that integrates medical domain knowledge (GraphModel-Dict) to recognize named entities in medical texts. [Methods] First, we used a graph structure to integrate domain knowledge, representing the raw text data and the domain dictionary as nodes of different categories, and updated the nodes with a Gated Recurrent Unit (GRU) to obtain semantic representations of the raw text nodes that incorporate domain knowledge. Second, we fed the final representations of the text nodes into a Bidirectional Long Short-Term Memory network (BiLSTM). Then, we predicted the labels and output the recognized sequence with a Conditional Random Field (CRF). Finally, we evaluated the model's performance on two datasets. [Results] On a manually annotated dataset of 3,100 Chinese breast cancer ultrasound examination reports, GraphModel-Dict reached a precision, recall, and F1-score of 96.91%, 97.52%, and 97.22%, respectively, for entity recognition. In the per-type evaluation, GraphModel-Dict showed better recognition performance for entity types with few samples or diverse expressions. On the CCKS2020 medical dataset, the F1-score of GraphModel-Dict was at least 1.39% higher than that of the baseline model. [Limitations] The experiments were conducted only on medical datasets; the model's effectiveness in other fields needs further study. [Conclusions] Using domain knowledge effectively can improve named entity recognition, which can benefit medical information mining and clinical research.
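The graph-construction and GRU node-update steps described in [Methods] can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the node granularity, edge scheme, or dimensions, so character-level text nodes, substring dictionary matching, mean aggregation over neighbours, random weights, and every name below (`build_graph`, `gru_cell`, `propagate`, the sample phrase, and the two dictionary terms) are assumptions made for illustration. The BiLSTM and CRF stages are omitted.

```python
import math
import random

def build_graph(text, dictionary):
    """Assumed scheme: one node per character plus one node per dictionary match."""
    nodes = [("char", ch) for ch in text]
    # Link adjacent characters so context flows along the sentence.
    edges = [(i, i + 1) for i in range(len(text) - 1)]
    for term in dictionary:
        start = text.find(term)
        while start != -1:
            term_id = len(nodes)
            nodes.append(("dict", term))
            # Link the dictionary node to every character it covers.
            edges.extend((term_id, i) for i in range(start, start + len(term)))
            start = text.find(term, start + 1)
    return nodes, edges

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_cell(h, m, p):
    """Standard GRU gates (one common sign convention): h is the node state, m the message."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], m), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], m), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b) for a, b in zip(matvec(p["Wh"], m), matvec(p["Uh"], rh))]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def propagate(nodes, edges, states, params, steps=2):
    """Synchronously update every node from the mean of its neighbours' states."""
    neighbours = {i: [] for i in range(len(nodes))}
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    dim = len(states[0])
    for _ in range(steps):
        new_states = []
        for i, h in enumerate(states):
            nbrs = neighbours[i]
            if nbrs:
                m = [sum(states[j][k] for j in nbrs) / len(nbrs) for k in range(dim)]
            else:
                m = [0.0] * dim
            new_states.append(gru_cell(h, m, params))
        states = new_states
    return states

random.seed(0)
DIM = 4  # toy hidden size, chosen only for the example

def rand_matrix():
    return [[random.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(DIM)]

params = {name: rand_matrix() for name in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}

text = "左乳低回声结节"            # hypothetical report phrase
dictionary = ["低回声", "结节"]     # hypothetical domain-dictionary entries

nodes, edges = build_graph(text, dictionary)
states = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in nodes]
updated = propagate(nodes, edges, states, params)

# Only the text (character) node states would be passed on to the BiLSTM.
char_states = updated[:len(text)]
```

In this sketch, dictionary knowledge reaches a character only through the edges from a matching dictionary node, so after a couple of propagation steps each character state mixes its sentence context with the terms that cover it.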
Authors Pei Wei (裴伟); Sun Shuifa (孙水发); Li Xiaolong (李小龙); Lu Ji (鲁际); Yang Liu (杨柳); Wu Yirong (吴义熔) (Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydropower Engineering, China Three Gorges University, Yichang 443002, China; Yichang Key Laboratory of Intelligent Medicine, Yichang 443002, China; College of Computer and Information Technology, China Three Gorges University, Yichang 443002, China; College of Economics & Management, China Three Gorges University, Yichang 443002, China; Faculty of Psychology, Beijing Normal University, Zhuhai 519087, China; Institute of Advanced Studies in Humanities and Social Sciences, Beijing Normal University, Zhuhai 519087, China)
Source Data Analysis and Knowledge Discovery (《数据分析与知识发现》); indexed in CSSCI, CSCD, and the Peking University Core Journals list; 2023, Issue 3, pp. 142-154 (13 pages)
Funding This work is one of the research outputs of a National Social Science Fund of China project (Grant No. 20BTQ066).
Keywords Medical Named Entity Recognition; Graph Neural Network; Domain Knowledge Dictionary; Breast Cancer Ultrasound Examination Reports