Abstract
Traditional optical character recognition (OCR) methods generally extract only image intensity features, so recognition accuracy drops when the scanned image is severely degraded. To address this problem, this paper proposes a new feature extraction method for scanned characters. In addition to the intensity of each channel, it extracts the pixel position and the first- and second-order derivatives of intensity to form a feature image, and weights each feature according to its contribution to the image. The covariance matrix of the feature-image patch in a local region centered on the current pixel serves as that pixel's descriptor, and characters are then classified in Riemannian space. Experimental results show that, with a typical structured classifier, the proposed feature extraction method achieves higher character recognition accuracy than the traditional method and is more robust.
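The pipeline described in the abstract (feature image, local covariance descriptor, comparison on the manifold of covariance matrices) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the paper's implementation: the abstract does not give the feature weights, the patch size, the metric, or the classifier, so the function names, the `weights` and `radius` parameters, the log-Euclidean distance, and the nearest-template rule (standing in for the structured classifier) are all hypothetical choices. A single-channel image is assumed for brevity.

```python
import numpy as np

def feature_image(img, weights=None):
    """Stack per-pixel features x, y, I, |Ix|, |Iy|, |Ixx|, |Iyy| (assumed feature set)."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)   # pixel coordinates
    iy, ix = np.gradient(img)                   # first derivatives (rows, columns)
    iyy = np.gradient(iy, axis=0)               # second derivatives
    ixx = np.gradient(ix, axis=1)
    feats = np.stack(
        [xs, ys, img, np.abs(ix), np.abs(iy), np.abs(ixx), np.abs(iyy)], axis=-1
    )
    if weights is not None:
        # Hypothetical per-feature weights; the paper's weighting rule is not given.
        feats = feats * np.asarray(weights, dtype=float)
    return feats

def region_covariance(feats, row, col, radius, eps=1e-6):
    """Covariance of the feature vectors in a square patch centered on (row, col)."""
    h, w, d = feats.shape
    r0, r1 = max(row - radius, 0), min(row + radius + 1, h)
    c0, c1 = max(col - radius, 0), min(col + radius + 1, w)
    patch = feats[r0:r1, c0:c1].reshape(-1, d)
    cov = np.cov(patch, rowvar=False)
    return cov + eps * np.eye(d)                # small ridge keeps the matrix positive definite

def spd_log(c):
    """Matrix logarithm of a symmetric positive definite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(c)
    return (vecs * np.log(vals)) @ vecs.T

def log_euclidean_distance(a, b):
    """Log-Euclidean distance between two SPD matrices (assumed metric)."""
    return float(np.linalg.norm(spd_log(a) - spd_log(b), ord="fro"))

def classify(cov, templates):
    """Return the label of the nearest template covariance (stand-in classifier)."""
    return min(templates, key=lambda k: log_euclidean_distance(cov, templates[k]))
```

Covariance matrices are symmetric positive definite and do not form a vector space, which is why the sketch compares them through the matrix logarithm instead of subtracting them directly. In use, one would compute `region_covariance` for each character sample and pass the result to `classify` together with a dictionary of labeled template covariances.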
Source
《智能计算机与应用》
2012, No. 2, pp. 24-26, 29 (4 pages in total)
Intelligent Computer and Applications
Keywords
Optical Character Recognition (OCR)
Covariance Matrix
Feature Extraction
Riemannian Manifold