Abstract
The quality of natural scene images is easily affected by illumination, capture devices, and shooting conditions, and their complex backgrounds interfere strongly with detection. In addition, text in scene images varies in color, font, size, orientation, and language. For these reasons, natural scene text detection remains a challenging research topic. This paper proposes an end-to-end text detector based on a fully convolutional network. The work focuses on the design of the network structure and the loss function: an enhanced receptive field module is added, and Focal loss and GIoU loss are introduced for pixel classification and text bounding-box regression, respectively, yielding a more stable and accurate multi-oriented text detector. Experimental results show that the method achieves performance comparable to recent state-of-the-art methods on both multi-oriented and horizontal scene text datasets.
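As a rough illustration of the two losses named in the abstract, the sketch below computes a per-pixel binary focal loss and a GIoU loss for a pair of boxes. It is not the paper's implementation. The function names, the defaults `alpha=0.25` and `gamma=2.0` (the values commonly used with focal loss), and the restriction to axis-aligned boxes with positive area are assumptions made for this example. The paper regresses multi-oriented boxes, which this sketch does not handle.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss for one pixel (hypothetical helper).

    p: predicted probability that the pixel is text; y: label, 0 or 1.
    """
    p = min(max(p, eps), 1.0 - eps)            # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p             # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma down-weights pixels that are already classified well
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def giou_loss(a, b):
    """1 - GIoU for two axis-aligned boxes (x1, y1, x2, y2) (hypothetical helper).

    Assumes both boxes have positive area.
    """
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    # Intersection rectangle; width or height is 0 when the boxes are disjoint
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = area_a + area_b - inter
    # Smallest axis-aligned box enclosing both inputs
    hull = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    giou = inter / union - (hull - union) / hull
    return 1.0 - giou
```

Unlike a plain IoU loss, the GIoU term still varies with the distance between the boxes when they do not overlap, so disjoint predictions receive a useful regression signal.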
Authors
李晓玉
宋永红
余涛
LI Xiao-Yu; SONG Yong-Hong; YU Tao (School of Software Engineering, Xi'an Jiaotong University, Xi'an 710049; College of Artificial Intelligence, Xi'an Jiaotong University, Xi'an 710049)
Source
《自动化学报》
EI
CAS
CSCD
PKU Core (北大核心)
2022, Issue 3, pp. 797-807 (11 pages)
Acta Automatica Sinica
Funding
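Supported by the Natural Science Basic Research Program of Shaanxi Province (2018JM6104)
and the National Key Research and Development Program of China (017YFB1301101).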