Abstract
Text in natural scenes carries highly condensed, high-level semantic information. Wujin-style (Uchen) Tibetan text in natural scenes therefore has considerable research and practical value, and can support work on Tibetan scene-text understanding. Few studies so far address the detection and recognition of Wujin-style Tibetan script in natural scenes. Using a manually collected image dataset of Wujin-style Tibetan script in natural scenes, this study compares the detection performance of common text detection algorithms on such script. It also compares the recognition accuracy of the sequence-based text recognition algorithm CRNN on the same dataset under different feature extraction networks, and analyzes notable recognition failures on 314 real natural-scene images. Experiments show that DBNet, the differentiable binarization network used in the text detection stage, has better detection performance on the test set, with precision, recall, and F1 score of 0.89, 0.59, and 0.71, respectively. In the text recognition stage, CRNN reaches its highest recognition accuracy on the test set, 0.4365, when MobileNetV3 Large is used as the feature extraction network.
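The three detection scores reported for DBNet can be checked against each other, reading the first value as precision (the usual companion of recall in text detection). The sketch below is not taken from the paper; the function name `f1_score` is a hypothetical helper, and it assumes the standard definition of F1 as the harmonic mean of precision and recall.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        # Avoid division by zero when both scores are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for DBNet on the test set.
precision, recall = 0.89, 0.59
f1 = f1_score(precision, recall)
print(round(f1, 2))  # 0.71
```

Under that assumption, the reported F1 of 0.71 is consistent with a precision of 0.89 and a recall of 0.59.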
Authors
洪松
高定国
三排才让
取次
HONG Song; GAO Ding-Guo; SAMPEL Tsering; QU Ci (Information Science and Technology Academy, Tibet University, Lhasa 850000, China)
Source
《计算机系统应用》
2021, No. 12, pp. 332-338 (7 pages)
Computer Systems & Applications
Funding
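Tibet University High-Level Postgraduate Talent Training Program (2018-GSP-020)
Open Project of the Qinghai Provincial Key Laboratory of Tibetan Information Processing and Machine Translation / Key Laboratory of Tibetan Information Processing, Ministry of Education (2020Z001)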
Keywords
natural scene
Wujin Tibetan script
detection
recognition