
Application of Neural Network Word Slicing Method in Mongolian-Chinese Machine Translation (cited by: 1)
Abstract: Mongolian word segmentation is a main task in Mongolian information processing. Accurate and well-motivated segmentation not only alleviates data sparsity but also directly affects downstream Mongolian information processing tasks. To address the agglutinative nature of Mongolian and its rich morphological variation, the paper first presents a neural-network-based word segmentation method as a preprocessing step; experimental results show that this method reaches an accuracy of 97.37%. Second, a Transformer-based Mongolian-Chinese neural machine translation model built on Mongolian word segmentation is constructed. Finally, different Mongolian word segmentation methods are compared within the Transformer Mongolian-Chinese machine translation model. The results show that the BiLSTM-CNN-CRF neural word segmentation method, which follows Mongolian grammar rules in treating connecting vowel letters and the unstable "N" as stop words and filtering them out, achieves a BLEU score of 73.30% on the translated output, improving machine translation quality to a certain extent.
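The abstract reports translation quality as a BLEU score. For readers unfamiliar with the metric, the following is a minimal sketch of corpus-level BLEU on a 0-100 scale. It assumes one reference per hypothesis, uniform weights over 1- to 4-grams, and no smoothing. The function names `ngrams` and `corpus_bleu` are illustrative, and nothing here is taken from the paper: the abstract does not state the authors' tokenization or BLEU configuration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU (0-100), one reference per hypothesis, no smoothing.

    Assumption: inputs are already tokenized lists of strings; the
    tokenization scheme is up to the caller.
    """
    matches = [0] * max_n  # clipped n-gram matches, per order
    totals = [0] * max_n   # hypothesis n-gram counts, per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives BLEU = 0
    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty punishes hypotheses shorter than their references.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * brevity_penalty * math.exp(log_precision)
```

Published scores are normally computed with an established tool instead of hand-written code, since tokenization and smoothing choices change the number noticeably.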
Authors: HE Wuyun (何乌云); WANG Siriguleng (王斯日古楞) (Inner Mongolia Normal University, Hohhot 011500, China)
Source: Journal of Minzu University of China (Natural Sciences Edition), 2022, No. 4, pp. 36-46 (11 pages)
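Funding: National Natural Science Foundation of China (61762072); Inner Mongolia Autonomous Region Science and Technology Plan Project (2021GG0139); Inner Mongolia Normal University Graduate Research Innovation Fund (GXJJS20129).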
Keywords: word slicing; neural networks; Mongolian-Chinese machine translation