Abstract
The segmentation of touching characters has long been an important problem in mathematical formula recognition, and it is one of the main causes of recognition errors. To address this, the paper applies deep learning to segment touching characters. A Faster R-CNN network is trained with supervision on a dataset that covers a range of touching configurations, including horizontal, vertical, and diagonal touching; special cases of touching are handled separately. Experiments show that this method significantly improves the segmentation of touching characters and also raises the accuracy of mathematical symbol recognition.
Authors
郭蓉蓉
李涛
魏琦
GUO Rongrong; LI Tao; WEI Qi (College of Computer, Xi'an University of Posts and Telecommunications, Xi'an 710100)
Source
Computer & Digital Engineering (《计算机与数字工程》)
2019, No. 10, pp. 2579-2584 (6 pages)
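Funding
Scientific Research Project of the Shaanxi Provincial Department of Education (No. 2050205)
Key Program of the National Natural Science Foundation of China (No. 61136002)
Science and Technology Cooperation Project between Changzhou Zhonglou Development Zone and Xi'an University of Posts and Telecommunications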
Keywords
mathematical formula recognition
deep learning
touching characters
segmentation