Offline Handwritten Mathematical Expression Recognition Based on Encoder-Decoder
Abstract In recent years, great progress has been made in handwritten mathematical expression recognition by using Encoder-Decoder models. However, these models still have two shortcomings: the image feature information extracted by the encoder is insufficient, and the decoder is inefficient at processing long sequences. To address these shortcomings, this paper proposes an improved Encoder-Decoder model. The model uses a multi-scale Densely Connected Convolutional Network as the encoder to extract multi-resolution features from handwritten mathematical expression images. A Transformer model, based entirely on the attention mechanism, serves as the decoder and, guided by the image features, decodes two-dimensional handwritten mathematical expressions into one-dimensional LaTeX sequences. Image position information and LaTeX symbol position information are embedded through relative position encoding. Experimental results show that the model achieves excellent performance on the official CROHME 2014 dataset: compared with current state-of-the-art methods, it improves expression recognition accuracy by 3.55% and reduces word error rate by 1.41%.
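The abstract reports two evaluation metrics, expression recognition accuracy and word error rate, without defining them. The sketch below shows one common way such metrics are computed over tokenized LaTeX sequences. It assumes that word error rate is the token-level edit distance divided by the total number of reference tokens, and that accuracy is the fraction of exact sequence matches; the function names and sample expressions are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Token-level Levenshtein distance (substitution, insertion, deletion)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference token
                            curr[j - 1] + 1,      # insert a hypothesis token
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def word_error_rate(refs: List[List[str]], hyps: List[List[str]]) -> float:
    """Assumed definition: total edits / total reference tokens."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_tokens = sum(len(r) for r in refs)
    return total_edits / total_tokens


def expression_accuracy(refs: List[List[str]], hyps: List[List[str]]) -> float:
    """Assumed definition: fraction of predictions that match exactly."""
    exact = sum(1 for r, h in zip(refs, hyps) if r == h)
    return exact / len(refs)


# Hypothetical ground-truth and predicted LaTeX token sequences.
refs = [r"x ^ { 2 } + 1".split(), r"\frac { a } { b }".split()]
hyps = [r"x ^ { 2 } + 1".split(), r"\frac { a } { d }".split()]

wer = word_error_rate(refs, hyps)
acc = expression_accuracy(refs, hyps)
```

In this toy example one of the 14 reference tokens is wrong, so a single symbol error costs a full expression under the accuracy metric but only a small fraction under word error rate, which is why the two numbers are usually reported together.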
Authors DU Yongtao; YU Yuanhui (College of Computer Engineering, Jimei University, Xiamen 361021, China)
Source Journal of Jimei University (Natural Science), CAS, 2022, No. 6, pp. 570-576 (7 pages)
Funding Xiamen Science and Technology Subsidy Project (2022CXY0301).
Keywords Encoder-Decoder; offline handwritten mathematical expression recognition; multi-scale Densely Connected Convolutional Network; Transformer model; relative position encoding