Abstract
Segmenting handwritten numeral strings is an essential component of handwritten-digit OCR systems. In many applications the digits are written inside preprinted form frames, which constrain where each digit is written and make segmentation comparatively easy; without such frames, segmentation becomes a difficult problem. To address unconstrained handwritten numeral strings, a new method for segmenting touching digit strings is proposed. The method first uses principal curves to extract the strokes of character templates, then processes the strokes according to their fuzzy features, and finally completes the segmentation by grouping strokes according to the confidence reported by the character recognizer. The method was evaluated, qualitatively and quantitatively, on 3000 real checks collected on site from banks, of which 363 contained touching digits; the segmentation accuracy was 89.68%. The results indicate that the algorithm can effectively segment handwritten numeral strings in which multiple digits touch.
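The abstract names the final step, grouping strokes by recognizer confidence, without giving the procedure. The following is a minimal sketch of one common way to do such grouping, not the authors' algorithm. It assumes the strokes are already ordered left to right, that only adjacent strokes are merged into one digit, and that a recognizer returns a label and a confidence in (0, 1]. The names `segment_by_confidence`, `toy_classify`, and `max_group`, and the toy confidence values, are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

# A group is reported as (label, start_index, end_index_exclusive).
Group = Tuple[str, int, int]


def segment_by_confidence(
    strokes: Sequence[str],
    classify: Callable[[Sequence[str]], Tuple[str, float]],
    max_group: int = 3,
) -> List[Group]:
    """Split ordered strokes into consecutive groups, one per digit.

    Dynamic programming over prefixes: best[i] holds the highest total
    log-confidence for the first i strokes and the grouping achieving it.
    Summing log-confidences is an assumption of this sketch; it tends to
    favour fewer, more confident groups.
    """
    n = len(strokes)
    best: List[Tuple[float, List[Group]]] = [(-math.inf, [])] * (n + 1)
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        for start in range(max(0, end - max_group), end):
            prev_score, prev_groups = best[start]
            if prev_score == -math.inf:
                continue  # prefix cannot be segmented
            label, conf = classify(strokes[start:end])
            if conf <= 0.0:
                continue  # recognizer rejects this group
            score = prev_score + math.log(conf)
            if score > best[end][0]:
                best[end] = (score, prev_groups + [(label, start, end)])
    return best[n][1]


# Hypothetical stand-in for a real digit recognizer: strokes are plain
# string ids and the confidences below are made up for illustration.
_TOY_TABLE = {
    ("a",): ("1", 0.6),
    ("b",): ("7", 0.3),
    ("a", "b"): ("4", 0.9),
    ("c",): ("0", 0.95),
}


def toy_classify(group: Sequence[str]) -> Tuple[str, float]:
    return _TOY_TABLE.get(tuple(group), ("?", 0.01))


result = segment_by_confidence(["a", "b", "c"], toy_classify)
print(result)
```

With the toy table, strokes `a` and `b` are recognized more confidently together (as "4") than apart, so the sketch merges them and leaves `c` as its own digit. A real system would replace `toy_classify` with a trained recognizer and would derive the strokes from the principal-curve extraction step.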
Source
《中国图象图形学报》
CSCD
Peking University Core Journals
2009, No. 11, pp. 2292-2298 (7 pages)
Journal of Image and Graphics
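Funding
Key Program of the National Natural Science Foundation of China (60632050)
National High Technology Research and Development Program of China (863 Program) (2006AAO12119)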
Keywords
principal curve
fuzzy feature
segmentation of numeral strings
stroke grouping