
A fast binarization method for dark and uneven illumination document images (cited by: 7)
Abstract: Binarization is an important step in optical character recognition (OCR) and directly affects the OCR success rate. Local binarization algorithms based on luminance segmentation currently give good results, but they are complex and time-consuming. Fast binarization algorithms have a simple workflow but are sensitive to noise. Low-luminance images generally contain non-negligible noise, and their text has low contrast. To capture low-contrast text, a fast binarization algorithm must be sensitive to luminance gradients, but that sensitivity also causes broken or missing text and heavy background noise in the result. To achieve high-quality fast binarization, this paper uses non-local means filtering to suppress noise while avoiding over-smoothing the image. An improved Bradley algorithm extracts the low-contrast text and resolves problems such as broken text. Finally, dilation and erosion are applied to suppress binarization noise. The method is suitable for unevenly illuminated images at both low and high luminance. Experimental results show that under uneven high luminance the method performs the same as other fast binarization algorithms, while under uneven low luminance it extracts more text with less text breakage and less noise. The OCR recall rate on the binarization results of this method reaches 93.5%.
Authors: WANG Kang-wei; ZHAO Lei; HUANG Xin-yan; PENG Yu-fa; MA Si-yuan; FAN Hong-bo (School of Applied Sciences, Harbin University of Science and Technology, Harbin, Heilongjiang Province 150000, China)
Source: Journal of Optoelectronics·Laser (《光电子.激光》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2020, No. 12, pp. 1333-1340 (8 pages)
Funding: Supported by the College Students' Innovation and Entrepreneurship Training Program (201810214035).
Keywords: pattern recognition; binarization; document image; uneven illumination; Bradley algorithm; non-local means filter
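The abstract describes a pipeline of denoising, Bradley-style adaptive thresholding, and morphological cleanup. The sketch below illustrates only the last two stages, in pure Python. It implements the standard Bradley-Roth threshold on an integral image, not the paper's improved variant, whose details the abstract does not give. The function names, the window size of 15, the sensitivity `t = 0.15`, the 3x3 structuring element, and the erode-then-dilate order are all assumptions chosen for illustration. The non-local means denoising step that would come first is omitted.

```python
def integral_image(img):
    """Summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def bradley_threshold(img, window=15, t=0.15):
    """Standard Bradley-Roth adaptive threshold (assumed parameters).

    img is a list of rows of grayscale values. Returns a mask with 1 for
    text (dark) pixels and 0 for background.
    """
    h, w = len(img), len(img[0])
    sat = integral_image(img)
    half = window // 2
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - half), min(h, y + half + 1)
        for x in range(w):
            x0, x1 = max(0, x - half), min(w, x + half + 1)
            count = (y1 - y0) * (x1 - x0)
            # Sum of the window in O(1) from the summed-area table.
            total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
            # Text if the pixel is more than fraction t below the local mean.
            if img[y][x] * count < total * (1.0 - t):
                mask[y][x] = 1
    return mask


def _morph(mask, require_all):
    """3x3 erosion (require_all=True) or dilation (require_all=False).

    Neighbours outside the image are ignored.
    """
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [mask[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = int(all(vals)) if require_all else int(any(vals))
    return out


def erode(mask):
    return _morph(mask, True)


def dilate(mask):
    return _morph(mask, False)


def suppress_specks(mask):
    """Erode then dilate (a morphological opening; the order is an assumption)."""
    return dilate(erode(mask))
```

Calling `suppress_specks(bradley_threshold(page))` on a grayscale page gives a text mask in which isolated single-pixel detections are removed. The trade-off is that strokes thinner than the 3x3 element are removed as well, so a real document would need a structuring element matched to its stroke width.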