Abstract
[Objective] This paper studies how to structure medical imaging diagnosis reports so that information can be extracted from these free-text reports accurately and efficiently. [Methods] We analyzed the textual characteristics of medical imaging diagnosis reports and proposed a structuring method that combines entity recognition with rule-based extraction. We then annotated 800 reports to build a dataset for experimental evaluation. [Results] The proposed method reached a recognition precision of 0.87 for every entity type in the reports. Compared with BERT-BiLSTM-CRF, precision improved by 4.03 percentage points and recall by 2.81 percentage points. Compared with a structuring method based on dependency analysis, recognition precision for examination items and examination results improved by 5.62 and 2.31 percentage points, respectively. [Limitations] The study used PET-CT imaging diagnosis reports from a single hospital, so the data source is limited. [Conclusions] The method converts medical imaging diagnosis reports from free text to structured data, which improves the classification, retrieval, and storage of these reports and provides data support for subsequent research in medical imaging.
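The results above are reported as entity-level precision and recall, with gains measured in percentage points (absolute differences, not relative percentages). The following is a minimal sketch of how such figures are typically computed, assuming exact span-and-label matching. The matching criterion, the label names `ITEM` and `RESULT`, and all toy data are illustrative assumptions and are not taken from the paper.

```python
from typing import Set, Tuple

# An entity is (start_offset, end_offset, label).
# All entities below are hypothetical toy data, not results from the paper.
Entity = Tuple[int, int, str]


def precision_recall(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float]:
    """Entity-level precision and recall under exact span-and-label matching."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def percentage_point_gain(new_score: float, baseline_score: float) -> float:
    """Absolute difference between two scores in [0, 1], in percentage points."""
    return (new_score - baseline_score) * 100


# Hypothetical gold annotations and two hypothetical system outputs.
gold = {(0, 2, "ITEM"), (3, 7, "RESULT"), (8, 10, "ITEM"), (11, 15, "RESULT")}
baseline = {(0, 2, "ITEM"), (3, 7, "ITEM"), (8, 10, "ITEM"), (11, 14, "RESULT")}
improved = {(0, 2, "ITEM"), (3, 7, "RESULT"), (8, 10, "ITEM"), (11, 14, "RESULT")}

p_base, r_base = precision_recall(gold, baseline)
p_new, r_new = precision_recall(gold, improved)

print(f"precision gain: {percentage_point_gain(p_new, p_base):.2f} percentage points")
print(f"recall gain:    {percentage_point_gain(r_new, r_base):.2f} percentage points")
```

In this toy case precision moves from 0.50 to 0.75, a gain of 25 percentage points, which corresponds to a 50% relative increase. The two ways of stating a gain differ substantially, which is why the abstract's figures are best read as percentage points.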
Authors
盛羽
胡慧荣
王聪聪
杨晟艺
Sheng Yu; Hu Huirong; Wang Congcong; Yang Shengyi (School of Computer Science and Engineering, Central South University, Changsha 410083, China)
Source
《数据分析与知识发现》
CSSCI
CSCD
PKU Core Journals
2022, No. 10, pp. 46-56 (11 pages)
Data Analysis and Knowledge Discovery
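Funding
This work is one of the research outcomes of a General Program project of the National Natural Science Foundation of China (Grant No. 61877059).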
Keywords
Medical Imaging Diagnosis Report
Entity Recognition
Rule Extraction
Structuring