
Application of Italic Character Recognition Technology in Knowledge Acquisition from Agricultural Literature (cited by: 2)

Research on detection method of English italic characters in agricultural knowledge acquisition
Abstract: Conventional optical character recognition (OCR) systems are trained on samples that contain no italic characters, so they fail to recognize italic text correctly, which hampers knowledge acquisition from agricultural literature. Adding italic characters to the training samples would increase sample complexity and could also affect the recognition of upright (regular) characters. To address this problem, this paper proposes a method for detecting and correcting English italic characters. Each text line is first split into words, and each word is further divided into individual characters. The morphological features of each character are then examined and used to decide whether the word is italic. Finally, all words detected as italic are collected and used to calculate the precise slant angle of the italic characters, which is then corrected. Results from practical knowledge acquisition on agricultural literature show that the method achieves good detection and correction performance.
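Affiliation: Hebei Agricultural University
Source: Journal of Hebei Agricultural University (《河北农业大学学报》), indexed in CAS, CSCD, and the Peking University Core list (北大核心), 2015, Issue 6, pp. 124-128 (5 pages)
Funding: Supported by the Youth Fund for Science and Technology Research of Higher Education Institutions of Hebei Province (Z2012142), the Baoding Science and Technology Research and Development Guidance Program (13ZN025, 13ZF098), the Natural Science Project of the Baoding Association for Science and Technology (KX2013A20), and the Science and Engineering Fund of Hebei Agricultural University (LG20120604)
Keywords: OCR; italic detection; italic correction; agricultural knowledge acquisition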
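The pipeline outlined in the abstract (judge each word, pool the words judged italic, compute one slant angle, then correct) can be illustrated with a minimal sketch. The abstract does not say which morphological features are used or how the angle is calculated, so this sketch swaps in a generic technique: a shear search that maximizes the sharpness of the vertical projection profile. All function names, the 5-degree threshold, and the median pooling are assumptions made for illustration and are not taken from the paper. Word images are assumed to be already segmented binary grids (lists of rows of 0/1).

```python
import math
from statistics import median


def shear_rows(img, angle_deg):
    """Shift each row left by d * tan(angle), where d is the row's height
    above the bottom row. A positive angle undoes a rightward slant."""
    h, w = len(img), len(img[0])
    t = math.tan(math.radians(angle_deg))
    pad = int(math.ceil(abs(t) * (h - 1)))  # room for the largest shift
    out = [[0] * (w + 2 * pad) for _ in range(h)]
    for y, row in enumerate(img):
        shift = int(round((h - 1 - y) * t))
        for x, v in enumerate(row):
            if v:
                out[y][x + pad - shift] = 1
    return out


def verticality(img):
    """Sum of squared column counts; largest when strokes are upright."""
    return sum(sum(col) ** 2 for col in zip(*img))


def estimate_slant(img, max_angle=30, step=1):
    """Return the candidate angle (degrees) whose de-shear gives the
    sharpest column profile; ties go to the smaller absolute angle."""
    angles = range(-max_angle, max_angle + 1, step)
    return max(angles,
               key=lambda a: (verticality(shear_rows(img, a)), -abs(a)))


def correct_italics(word_images, threshold=5):
    """Flag words whose estimated slant reaches the (assumed) threshold,
    pool their angles with a median, and de-shear only the flagged words."""
    angles = [estimate_slant(w) for w in word_images]
    italic = [a for a in angles if a >= threshold]
    if not italic:
        return list(word_images), None
    pooled = median(italic)
    fixed = [shear_rows(w, pooled) if a >= threshold else w
             for w, a in zip(word_images, angles)]
    return fixed, pooled
```

Calling `correct_italics(words)` on a list of word grids returns the corrected grids together with the pooled angle, or `None` when no word is flagged. Pooling across words follows the abstract's idea that a single angle computed from all italic words is more reliable than a per-word estimate; the median is only one plausible way to do that pooling.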

