Abstract
In recent years, great progress in handwritten mathematical expression recognition has been made with encoder-decoder models. However, these models still have two shortcomings: the encoder extracts insufficient image feature information, and the decoder is inefficient at processing long sequences. To address these shortcomings, this paper proposes an improved encoder-decoder model. The model uses a multi-scale densely connected convolutional network (DenseNet) as the encoder to extract multi-resolution features from images of handwritten mathematical expressions, and a Transformer model based entirely on the attention mechanism as the decoder, which converts the two-dimensional handwritten expression into a one-dimensional LaTeX sequence according to the image features. Image position information and LaTeX symbol position information are embedded through relative position encoding. Experimental results show that the model achieves excellent performance on the official CROHME 2014 dataset, improving expression recognition accuracy by 3.55% and reducing the word error rate by 1.41% compared with current state-of-the-art methods.
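To make the described pipeline concrete, the following is a minimal PyTorch sketch of the kind of architecture the abstract outlines: a multi-scale DenseNet encoder feeding a Transformer decoder that emits LaTeX tokens. The DenseNet-121 backbone, the feature tap points, the vocabulary size, and the use of learned absolute position embeddings (in place of the paper's relative position encoding) are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only; layer choices and sizes are assumptions.
import torch
import torch.nn as nn
import torchvision


class MultiScaleDenseNetEncoder(nn.Module):
    """Produce 1/16- and 1/32-scale feature maps from a formula image."""

    def __init__(self, d_model=256):
        super().__init__()
        blocks = list(torchvision.models.densenet121(weights=None).features.children())
        self.stem = nn.Sequential(*blocks[:-3])    # up to dense block 3: 1024 ch, 1/16 scale
        self.tail = nn.Sequential(*blocks[-3:-1])  # transition 3 + dense block 4: 1024 ch, 1/32 scale
        self.proj_high = nn.Conv2d(1024, d_model, kernel_size=1)
        self.proj_low = nn.Conv2d(1024, d_model, kernel_size=1)

    def forward(self, image):                      # image: (B, 3, H, W)
        high = self.stem(image)                    # (B, 1024, H/16, W/16)
        low = self.tail(high)                      # (B, 1024, H/32, W/32)
        return self.proj_high(high), self.proj_low(low)


class LatexTransformerDecoder(nn.Module):
    """Decode flattened multi-scale image features into a LaTeX token sequence."""

    def __init__(self, vocab_size=120, d_model=256, nhead=8, num_layers=3, max_len=200):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)  # simplification: absolute positions
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, tokens, memory):             # tokens: (B, T), memory: (B, S, d_model)
        t = tokens.size(1)
        positions = torch.arange(t, device=tokens.device)
        tgt = self.token_emb(tokens) + self.pos_emb(positions)
        # Causal mask so each LaTeX token attends only to earlier tokens.
        mask = torch.triu(torch.full((t, t), float("-inf"), device=tokens.device), diagonal=1)
        return self.out(self.decoder(tgt, memory, tgt_mask=mask))


def flatten(feature_map):
    """Turn a (B, C, H, W) feature map into a (B, H*W, C) memory sequence."""
    return feature_map.flatten(2).transpose(1, 2)


# Usage: concatenate both scales into one memory sequence for the decoder.
encoder = MultiScaleDenseNetEncoder()
decoder = LatexTransformerDecoder()
image = torch.randn(1, 3, 128, 512)                # dummy formula image
high, low = encoder(image)
memory = torch.cat([flatten(high), flatten(low)], dim=1)
tokens = torch.zeros(1, 10, dtype=torch.long)      # dummy LaTeX token ids
logits = decoder(tokens, memory)                   # (1, 10, vocab_size)
```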
Authors
DU Yongtao; YU Yuanhui (College of Computer Engineering, Jimei University, Xiamen 361021, China)
Source
Journal of Jimei University (Natural Science), 2022, No. 6, pp. 570-576 (7 pages)
Funding
Xiamen Science and Technology Subsidy Project (2022CXY0301).