Morphology Processing in Chinese-Uyghur Statistical Machine Translation (汉维统计机器翻译中的形态学处理) Cited by: 5
Abstract: Chinese and Uyghur differ substantially both in word order (Chinese is subject-verb-object, whereas Uyghur is subject-object-verb) and in morphology. To address the word-order difference, reordering rules are written that rearrange Chinese sentences into subject-object-verb order. To address the morphological difference, Uyghur words are split into smaller morpheme units such as stems and affixes before the statistical model is trained, and the effect of segmentation granularity on translation performance is tested. Experimental results show that both reordering the Chinese syntactic structure and training on stems, affixes, and other sub-word morphemes effectively improve translation quality.
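The two preprocessing steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the role tags (`S`, `V`, `O`), the granularity names, the `TOY_ANALYSES` table, and the hand-segmented romanized word are all assumptions introduced here for illustration, and the abstract does not specify the actual rules or segmenter.

```python
from typing import Dict, List, Tuple

# Hypothetical hand-written analysis table: surface word -> (stem, affixes).
# A real system would obtain this from a morphological analyzer or segmenter.
TOY_ANALYSES: Dict[str, Tuple[str, List[str]]] = {
    "kitablirimni": ("kitab", ["lir", "im", "ni"]),
}

GRANULARITIES = ("word", "stem+affixes", "full")


def segment(word: str, granularity: str = "full") -> List[str]:
    """Split a word into morpheme tokens at the requested granularity."""
    if granularity not in GRANULARITIES:
        raise ValueError(f"unknown granularity: {granularity}")
    if granularity == "word" or word not in TOY_ANALYSES:
        return [word]  # leave the surface form untouched
    stem, affixes = TOY_ANALYSES[word]
    if granularity == "stem+affixes":
        # Coarse split: the stem plus the whole affix chain as one token.
        return [stem, "+" + "".join(affixes)]
    # Fine split: the stem plus one token per affix.
    return [stem] + ["+" + affix for affix in affixes]


def reorder_svo_to_sov(tagged: List[Tuple[str, str]]) -> List[str]:
    """Reorder a flat clause tagged with roles S, V, O into S-O-V order."""
    order = {"S": 0, "O": 1, "V": 2}
    unknown = [role for _, role in tagged if role not in order]
    if unknown:
        raise ValueError(f"unsupported roles: {unknown}")
    # sorted() is stable, so tokens within one role keep their original order.
    return [token for token, _ in sorted(tagged, key=lambda pair: order[pair[1]])]


if __name__ == "__main__":
    print(reorder_svo_to_sov([("我", "S"), ("读", "V"), ("书", "O")]))
    for g in GRANULARITIES:
        print(g, segment("kitablirimni", g))
```

Varying the `granularity` argument mirrors, in toy form, the comparison of segmentation granularities that the abstract says was tested; the marker `+` on affix tokens is one common way to keep affixes distinguishable from stems in the training data.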
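Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list (北大核心), 2011, No. 12, pp. 150-152 (3 pages)
Funding: High-Tech Fund of the Western Action Plan of the Chinese Academy of Sciences (KGCX2-YN-507)
Keywords: Chinese-Uyghur; statistical machine translation; morpheme; reordering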
References (8)

  • [1] Arianna B, Marcello F. Morphological Pre-processing for Turkish to English Statistical Machine Translation[C]//Proc. of IWSLT’09. Tokyo, Japan: [s. n.], 2009. Cited by: 1
  • [2] Durgar E K, Oflazer K. Initial Explorations in English to Turkish Statistical Machine Translation[C]//Proc. of IEEE Int’l Conf. on Statistical Machine Translation. New York, USA: [s. n.], 2006. Cited by: 1
  • [3] Oflazer K, Durgar E K. Exploring Different Representational Units in English-to-Turkish Statistical Machine Translation[C]//Proc. of the 2nd Workshop on Statistical Machine Translation. Prague, Czech Republic: [s. n.], 2007. Cited by: 1
  • [4] Habash N, Sadat F. Arabic Preprocessing Schemes for Statistical Machine Translation[C]//Proc. of the Human Language Technology Conference. [S. l.]: IEEE Press, 2006. Cited by: 1
  • [5] Zollmann A, Venugopal A, Vogel S. Bridging the Inflection Morphology Gap for Arabic Statistical Machine Translation[C]//Proc. of the Human Language Technology Conference. New York, USA: [s. n.], 2006. Cited by: 1
  • [6] Li Guochen, Meng Jing. Identifying the Predicate Head Using the Syntactic Relation Between Subject and Predicate[D]. Taiyuan: Shanxi University, 2005. (In Chinese.) Cited by: 1
  • [7] Mathias C, Krista L. Unsupervised Morpheme Segmentation and Morphology Induction from Text Corpora Using Morfessor 1.0. Publications[EB/OL]. (2005-07-12). http://www.cis.hut.fi/projects/morpho/. Cited by: 1
  • [8] Dong Xinghua, Zhou Junlin, Guo Shusheng, Turghun Osman (吐尔洪·吾司曼). Phrase-Based Chinese-Uyghur/Uyghur-Chinese Statistical Machine Translation[J]. Computer Engineering, 2011, 37(9): 16-18. (In Chinese.) Cited by: 15

Secondary references: 7

Co-citing documents: 14

Co-cited documents: 56

Citing documents: 5

Secondary citing documents: 29
