Reordering table reconstruction model for Chinese-Uyghur machine translation (cited by: 4)
Abstract: To address the context independence and sparsity problems of lexicalized reordering models in machine translation, a reordering table reconstruction model is proposed that predicts reordering orientation and probability from semantic content. First, a continuous distributed representation approach is used to obtain feature vectors for the reordering rules. Second, a Recurrent Neural Network (RNN) predicts the reordering orientation and probability of each rule represented as a dense vector. Finally, the original reordering table is filtered and reconstructed, assigning the original rules a more reasonable reordering probability distribution; this improves the accuracy of the reordering information in the reordering model and reduces the size of the reordering table, which speeds up subsequent decoding. Experimental results show that applying the reordering table reconstruction model to the Chinese-Uyghur machine translation task improves the BLEU score by 0.39.
Authors: PAN Yirong (潘一荣); LI Xiao (李晓); YANG Yating (杨雅婷); MI Chenggang (米成刚); DONG Rui (董瑞) (Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi, Xinjiang 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi, Xinjiang 830011, China)
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2018, No. 5, pp. 1283-1288 (6 pages)
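Funding: "West Light" Program of the Chinese Academy of Sciences (2015-XBQN-B-10); Major Science and Technology Special Project of Xinjiang Autonomous Region (2016A03007-3); Open Project of the Key Laboratory of Xinjiang Autonomous Region (2015KL031); Natural Science Foundation of Xinjiang Uygur Autonomous Region (2015211B034)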
Keywords: Chinese-Uyghur machine translation; reordering table reconstruction model; lexicalized reordering; semantic content; continuous distributed representation; Recurrent Neural Network (RNN)
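
The three steps named in the abstract (vectorize each reordering rule, predict an orientation distribution with an RNN, then filter and rebuild the table) can be pictured with the rough Python sketch below. It is not the authors' implementation. Everything in it is an assumption made for illustration: the three-way orientation label set, the `embed` stand-in for a learned distributed representation, the untrained random weights in `TinyRNN`, the confidence `threshold` used as the filtering criterion, and the placeholder rule tokens.

```python
import math
import random

ORIENTATIONS = ("monotone", "swap", "discontinuous")  # assumed label set


def embed(word, dim=8):
    """Hypothetical stand-in for a learned distributed representation."""
    rng = random.Random(word)  # deterministic vector per word
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class TinyRNN:
    """Elman-style RNN with untrained random weights (illustration only)."""

    def __init__(self, dim=8, hidden=6, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.w_in = matrix(hidden, dim)
        self.w_rec = matrix(hidden, hidden)
        self.w_out = matrix(len(ORIENTATIONS), hidden)
        self.hidden = hidden

    def predict(self, vectors):
        """Return a probability distribution over ORIENTATIONS."""
        state = [0.0] * self.hidden
        for x in vectors:
            # New hidden state from the current input and the previous state.
            state = [
                math.tanh(
                    sum(w * v for w, v in zip(self.w_in[j], x))
                    + sum(w * s for w, s in zip(self.w_rec[j], state))
                )
                for j in range(self.hidden)
            ]
        return softmax([sum(w * s for w, s in zip(row, state)) for row in self.w_out])


def reconstruct_table(table, model, threshold):
    """Re-score every rule; keep it only if the top orientation is confident enough."""
    rebuilt = {}
    for source, target in table:
        tokens = source.split() + target.split()
        probs = model.predict([embed(t) for t in tokens])
        if max(probs) >= threshold:  # assumed filtering criterion
            rebuilt[(source, target)] = dict(zip(ORIENTATIONS, probs))
    return rebuilt


# Placeholder rules; the tokens are not real Chinese or Uyghur data.
original = {
    ("src_a src_b", "tgt_b tgt_a"): {"monotone": 0.2, "swap": 0.7, "discontinuous": 0.1},
    ("src_c", "tgt_c"): {"monotone": 0.9, "swap": 0.05, "discontinuous": 0.05},
}
model = TinyRNN()
rebuilt = reconstruct_table(original, model, threshold=0.0)
```

With `threshold=0.0` every rule is kept and only its probabilities are replaced; raising the threshold shrinks the table, which is the kind of size reduction the abstract credits with faster decoding. A real system would train the network on orientation labels extracted from word-aligned parallel text rather than use random weights.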