Abstract
Preprocessing is an important part of any character recognition system, and its quality directly affects the performance of the system as a whole. Based on the glyph shapes and writing conventions of Tibetan script, we implement a preprocessing technique suited to Tibetan character recognition. The full procedure comprises binarization, page layout analysis, skew correction, character segmentation, and normalization. During preprocessing we also extract some basic features of Tibetan character units (字丁). These features reflect the characteristics of Tibetan script, are stable, and can be used for coarse classification and postprocessing in the recognition system.
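The abstract names the preprocessing stages but does not give the algorithms behind them. As a rough illustration of three of those stages (binarization, character segmentation, normalization), here is a minimal pure-Python sketch. Everything in it is an assumption made for illustration: Otsu thresholding, splitting on empty columns of the vertical projection, and nearest-neighbour rescaling are generic textbook choices, not the paper's method, and the function names are hypothetical. Layout analysis and skew correction are omitted, and real Tibetan text would need more than column splitting because its components stack vertically.

```python
# Illustrative sketch only; the paper's actual algorithms are not in the abstract.
from typing import List, Tuple

Image = List[List[int]]  # rows of grayscale pixels, 0 (black) to 255 (white)


def otsu_threshold(gray: Image) -> int:
    """Return the threshold that maximizes between-class variance (Otsu)."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # intensity sum of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def binarize(gray: Image) -> Image:
    """Map dark (ink) pixels to 1 and background pixels to 0."""
    t = otsu_threshold(gray)
    return [[1 if v <= t else 0 for v in row] for row in gray]


def segment_columns(binary: Image) -> List[Tuple[int, int]]:
    """Split one text line at empty columns of its vertical projection.

    Returns half-open (start, end) column ranges, one per run of ink.
    """
    if not binary:
        return []
    width = len(binary[0])
    profile = [sum(row[x] for row in binary) for x in range(width)]
    spans: List[Tuple[int, int]] = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x
        elif count == 0 and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, width))
    return spans


def normalize(glyph: Image, size: int = 16) -> Image:
    """Rescale a glyph to size x size by nearest-neighbour sampling."""
    h, w = len(glyph), len(glyph[0])
    return [[glyph[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]


# Tiny synthetic "scan": two ink blobs separated by blank columns.
gray = [
    [255, 0, 0, 255, 255, 0, 255],
    [255, 0, 255, 255, 255, 0, 255],
    [255, 0, 0, 255, 255, 0, 255],
]
binary = binarize(gray)
spans = segment_columns(binary)  # [(1, 3), (5, 6)]
glyphs = [normalize([row[a:b] for row in binary], size=4) for a, b in spans]
```

Each stage takes and returns plain nested lists, so a skew-correction or layout-analysis step could be slotted in between `binarize` and `segment_columns` without changing the others.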
Source
《计算机工程》
CAS
CSCD
PKU Core (Peking University Core Journals)
2001, Issue 9, pp. 93-96 (4 pages)
Computer Engineering
Keywords
Character recognition system
Computer
Tibetan character recognition
Preprocessing
Skew correction
Character segmentation
Normalization