Abstract
Thanks to the feature-extraction capability of deep neural networks, computer-aided diagnosis (CAD) systems built on them have achieved great success in many areas of medical image analysis. Most CAD systems are built with supervised learning, and training a supervised system requires a large amount of manually labeled data, which is time-consuming and laborious. Ultrasound images are often used as a basis for diagnosis and also serve as training data for models. Clinically, however, ultrasound images are not accurate; the pathology report is the gold standard, and it determines whether the corresponding patient's ultrasound image is positive or negative. An ultrasound image paired with the label (negative or positive) derived from the same patient's pathology report forms one training sample. This paper proposes an automatic labeling model consisting of four steps: text detection, text recognition, sentence-vector encoding, and binary classification. Given a pathology report as input, the model outputs the label, so large numbers of specialist physicians are no longer needed for manual annotation.
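The four-step flow named in the abstract (text detection, text recognition, sentence-vector encoding, binary classification) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the detector and recognizer are placeholder stubs operating on a hypothetical `FakeScan` structure, the sentence vector is a toy hashed character-bigram count, the classifier is a plain logistic regression, and the report snippets are made up. The abstract does not specify which models the paper uses for any stage.

```python
import math
from typing import List, Tuple

Box = Tuple[int, int, int, int]       # hypothetical (x, y, width, height) region
FakeScan = List[Tuple[Box, str]]      # stand-in for a scanned report image


def detect_text(scan: FakeScan) -> List[Box]:
    """Placeholder text detector: returns the regions stored in the fake scan."""
    return [box for box, _ in scan]


def recognize_text(scan: FakeScan, box: Box) -> str:
    """Placeholder recognizer: looks up the string stored for a region."""
    return next(text for b, text in scan if b == box)


def embed(sentence: str, dim: int = 128) -> List[float]:
    """Toy sentence vector: hashed character-bigram counts, L2-normalized."""
    vec = [0.0] * dim
    for a, b in zip(sentence, sentence[1:]):
        vec[(ord(a) * 31 + ord(b)) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def train(X: List[List[float]], y: List[int],
          epochs: int = 300, lr: float = 0.5) -> Tuple[List[float], float]:
    """Logistic regression fitted with plain stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            grad = 1.0 / (1.0 + math.exp(-z)) - target
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b


def label_report(scan: FakeScan, w: List[float], b: float) -> int:
    """Detection -> recognition -> sentence vector -> label (1 = positive)."""
    text = " ".join(recognize_text(scan, box) for box in detect_text(scan))
    x = embed(text)
    return int(sum(wi * xi for wi, xi in zip(w, x)) + b > 0)


# Made-up report snippets, for illustration only.
train_texts = ["invasive ductal carcinoma", "malignant cells present",
               "benign fibroadenoma", "normal tissue only"]
train_labels = [1, 1, 0, 0]
w, b = train([embed(t) for t in train_texts], train_labels)

# A fake two-line "scan"; a real system would start from image pixels.
scan = [((0, 0, 200, 20), "invasive ductal"), ((0, 25, 200, 20), "carcinoma")]
print(label_report(scan, w, b))
```

In a real system each stub would be replaced by a trained model, and the label produced for a report would be attached to the same patient's ultrasound image to form a training sample.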
Authors
CAO Yan-ge (曹晏阁); WANG Li-tuan (王利团)
College of Computer Science, Sichuan University, Chengdu 610065
Source
Modern Computer (《现代计算机》), 2020, No. 31, pp. 3-13 (11 pages)
Funding
National Natural Science Foundation of China (No. 61772353).
Keywords
Deep Learning
Text Detection
Text Recognition
Sentence Embedding
Automatic Labeling