Abstract
Many scene text detection methods based on convolutional neural networks tend to detect dense, long text incompletely and generalize poorly. To address these problems, a bottom-up scene text detection method is proposed. An adaptive channel attention (ACA) mechanism obtains more representative text features through local cross-channel interaction, improving the performance of the deep convolutional neural network. A feature enhancement pyramid module (FPEM) fuses low-level and high-level information to further enhance features at different scales. To handle the scale variation of long text, a weighted aware loss (WAL) is proposed, which improves robustness by adjusting the weights of text instances of different sizes. Experiments on the standard CTW1500 and MSRA-TD500 datasets verify the superiority of the method.
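The abstract describes channel attention through local cross-channel interaction but gives no formulas. As a rough illustration only, the sketch below follows the widely used ECA-style recipe: global average pooling per channel, a 1-D convolution across neighbouring channels with a kernel size adapted to the channel count, and a sigmoid gate. That the paper's ACA works this way is an assumption. The function names, the fixed averaging kernel, and the defaults `gamma=2`, `b=1` are hypothetical choices for the example, not details from the paper.

```python
import math


def adaptive_kernel_size(channels, gamma=2, b=1):
    """Odd 1-D kernel size that grows with log2 of the channel count.

    Assumption: this is the ECA-style rule, not a formula from the paper.
    """
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


def channel_attention(feature_map):
    """Reweight channels using only their k nearest channel neighbours.

    feature_map: list of C channels, each a list of H rows of W floats.
    Returns (reweighted feature map, per-channel weights).
    """
    c = len(feature_map)
    # Squeeze: one global-average descriptor per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    k = adaptive_kernel_size(c)
    pad = k // 2
    # Hypothetical fixed kernel; a trained model would learn these k weights.
    kernel = [1.0 / k] * k
    weights = []
    for i in range(c):
        acc = 0.0
        for j in range(k):
            idx = i + j - pad
            if 0 <= idx < c:  # zero padding at the channel borders
                acc += kernel[j] * pooled[idx]
        weights.append(1.0 / (1.0 + math.exp(-acc)))  # sigmoid gate
    # Excite: scale every value in a channel by that channel's weight.
    scaled = [
        [[v * w for v in row] for row in ch]
        for ch, w in zip(feature_map, weights)
    ]
    return scaled, weights
```

Because each weight depends on only `k` neighbouring channel descriptors, the interaction stays local and adds very few parameters compared with a fully connected squeeze-and-excitation layer.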
Authors
Liu Qian (刘倩)
Yang Peng (杨鹏)
Mao Hongmei (毛红梅)
LIU Qian; YANG Peng; MAO Hong-mei (School of Information Engineering, Nanjing Audit University, Nanjing 211815, China; School of Information Engineering, Nanchang Hangkong University, Nanchang 330063, China)
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal (北大核心)
2023, No. 3, pp. 901-907 (7 pages)
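Funding
National Natural Science Foundation of China (62172229)
Natural Science Foundation of Jiangsu Province (SBK2021020091)
Jiangsu Province Graduate Training Innovation Project Fund (KYCX21_1950)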