Filtration and Optimization for Hierarchical Phrase-based Model with Forced Alignment (cited by: 1)

Abstract: This paper proposes a method for filtering and optimizing the hierarchical phrase-based (HPB) model. Starting from the HPB rules obtained with the traditional training method, forced alignment is used to build derivation trees for the source and target sentences simultaneously. The aligned HPB rules are then filtered and extracted from these trees, and the translation probabilities of the translation model are re-estimated from the extracted rules. The method requires no linguistic knowledge and is suitable for training on large-scale corpora. On large-scale Chinese-English translation evaluation tasks, a model trained this way filters out about 50% of the original HPB rules and improves translation performance by 0.8 to 1.2 BLEU on the test sets compared with the traditional HPB model.
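The filtering and re-estimation steps described in the abstract can be illustrated with a small sketch. The abstract does not state which estimator the authors use, so the relative-frequency estimate below is an assumption, as are the function names `reestimate` and `filter_rules` and the toy rules. The sketch takes the rules that appear in forced-alignment derivations as given and does not implement forced alignment itself.

```python
from collections import Counter, defaultdict


def reestimate(derivations):
    """Re-estimate rule probabilities by relative frequency (an assumption).

    derivations: iterable of lists; each list holds the (source, target)
    rules used in the forced-alignment derivation of one sentence pair.
    Returns {(source, target): (p_target_given_source, p_source_given_target)}.
    """
    pair_counts = Counter()
    for rules in derivations:
        pair_counts.update(rules)

    # Marginal counts for each source side and each target side.
    src_totals = defaultdict(int)
    tgt_totals = defaultdict(int)
    for (src, tgt), count in pair_counts.items():
        src_totals[src] += count
        tgt_totals[tgt] += count

    return {
        (src, tgt): (count / src_totals[src], count / tgt_totals[tgt])
        for (src, tgt), count in pair_counts.items()
    }


def filter_rules(original_rules, table):
    """Keep only the original rules that occurred in some derivation."""
    return [rule for rule in original_rules if rule in table]


# Hypothetical toy data: three sentence pairs and the rules their
# derivations used (romanized placeholders, not taken from the paper).
derivations = [
    [("X1 de X2", "X2 of X1"), ("zhongguo", "China")],
    [("X1 de X2", "X1 's X2"), ("zhongguo", "China")],
    [("X1 de X2", "X2 of X1")],
]
original_rules = [
    ("X1 de X2", "X2 of X1"),
    ("X1 de X2", "X1 's X2"),
    ("X1 de X2", "X1 X2"),  # never used in a derivation, so filtered out
    ("zhongguo", "China"),
]

table = reestimate(derivations)
kept = filter_rules(original_rules, table)
```

In this toy setting one of the four original rules is discarded because no forced-alignment derivation used it, and the remaining rules receive probabilities based only on how often the derivations used them.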

Authors: Fu Xiaoyin (付晓寅), Wei Wei (魏玮), Lu Shixiang (卢世祥), Xu Bo (徐波)

Affiliation: Digital Content Technology and Services Research Center, Institute of Automation, Chinese Academy of Sciences

Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2013, No. 6, pp. 134-138, 150 (6 pages)

Funding: National High Technology Research and Development Program of China (863 Program), grant 2011AA01A207

Keywords: statistical machine translation; hierarchical phrase-based model; forced alignment; model training

Classification: TP391 [Automation and Computer Technology, Computer Application Technology]
