Multi-Layer Filtering Based Statistical Machine Translation (基于多层过滤的统计机器翻译)

Cited by: 3

Abstract: This paper proposes a multi-layer filtering algorithm that automatically extracts and aligns bilingual chunks from sentence-aligned Chinese-English parallel text. Different layers handle chunks according to the different features those chunks exhibit in the bilingual corpus. Unlike conventional approaches, the algorithm requires no tagging, syntactic parsing, or lexical analysis of the sentences, and it does not even require word segmentation of the Chinese side. Preliminary experiments show good performance: the F-measure reaches 0.70 for chunk extraction and 0.80 for chunk alignment. When the aligned bilingual chunks are used in a statistical machine translation system, the chunk-based system clearly outperforms a word-based baseline system, with an improvement of nearly 10%.
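The abstract reports results as F-measures. As a reminder of what that number summarizes, here is a minimal sketch of the standard balanced F-measure (the harmonic mean of precision and recall) computed over sets of extracted chunks. The function name `f_measure`, the representation of chunks as `(start, end)` spans, and the toy data are all assumptions made for illustration; the record does not describe the paper's actual evaluation protocol.

```python
def f_measure(predicted, gold):
    """Return (precision, recall, F1) for two collections of hashable items.

    Illustrative helper only; not taken from the paper.
    """
    predicted, gold = set(predicted), set(gold)
    if not predicted or not gold:
        return 0.0, 0.0, 0.0
    correct = len(predicted & gold)          # items found in both sets
    precision = correct / len(predicted)     # share of predictions that are right
    recall = correct / len(gold)             # share of gold items that were found
    if correct == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy data: chunks represented as (start, end) character spans.
gold_chunks = {(0, 2), (2, 5), (5, 9), (9, 12)}
predicted_chunks = {(0, 2), (2, 5), (5, 8), (9, 12), (12, 14)}

p, r, f = f_measure(predicted_chunks, gold_chunks)
print(f"precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

With these toy sets, three of the five predicted spans match the four gold spans, so precision and recall differ and the F-measure falls between them.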

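Authors: Zhou Yu (周玉), Zong Chengqing (宗成庆), Xu Bo (徐波)

Affiliation: National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences

Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2005, Issue 3, pp. 54-60 (7 pages)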

Funding: National Natural Science Foundation of China (60272041, 60121302)

Keywords: artificial intelligence; machine translation; multi-layer filtering; bilingual chunk identification and alignment

Classification: TP391.2 [Automation and Computer Technology: Computer Application Technology]

Citation network

References (9)

1. Brown P. F., et al. The Mathematics of Statistical Machine Translation: Parameter Estimation [J]. Computational Linguistics, 1993, 19(2): 263-311. Cited by: 1
2. Vogel S., Ney H., Tillmann C. HMM-Based Word Alignment in Statistical Translation [A]. In: Proceedings of the Seventeenth International Conference on Computational Linguistics: COLING-96 [C], 836-841, Copenhagen, Denmark, 1996. Cited by: 1
3. Wang Y. Y. Grammar Inference and Statistical Machine Translation [D]. PhD thesis, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, 1998. Cited by: 1
4. Och F. J., Tillmann C., Ney H. Improved Alignment Models for Statistical Machine Translation [A]. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP) [C], 20-28, College Park, Maryland, USA, 1999. Cited by: 1
5. Vogel S., et al. The CMU Statistical Machine Translation System [A]. In: Proceedings of MT Summit [C], 110-117, New Orleans, Louisiana, September 2003. Cited by: 1
6. Yamada K., Knight K. A Syntax-Based Statistical Translation Model [A]. In: Annual Meeting of the Association for Computational Linguistics [C], 523-530, Toulouse, France, July 2001. Cited by: 1
7. Roberts A. Automatic Acquisition of Word Classification Using Distribution Analysis of Content Words with Respect to Function Words [J]. Computer Science, 2000. Cited by: 1
8. Papineni K., et al. BLEU: A Method for Automatic Evaluation of MT [R]. Research Report, Computer Science RC22176 (W0109022), IBM, September 17, 2001. Cited by: 1
9. Doddington. Automatic Evaluation of Machine Translation Quality Using N-gram Co-Occurrence Statistics [R]. NIST Research Report, 2002. Cited by: 1

Co-cited literature (15)

1. Li Rui. Domestication and foreignization theory in film title translation [J]. 电影评介, 2007(15): 56-57. Cited by: 8
2. Lu Jingju. English film title translation from the perspective of sociosemiotic translation theory [J]. 电影文学, 2007(17): 74-75. Cited by: 12
3. Chen Xiu. On translator intervention [J]. 中国翻译, 2002, 23(1): 19-22. Cited by: 70
4. He Ying. Theory and methods of film title translation [J]. 外语教学, 2001, 22(1): 56-60. Cited by: 449
5. Tao Youlan, Huang Jin. On the design of exercises for translation textbooks in light of cognitive schemata [J]. 上海翻译, 2005(1): 35-39. Cited by: 21
6. Wang Dongfeng. The conflict of consciousness between translator and author: a thought-provoking phenomenon in literary translation [J]. 中国翻译, 2001, 22(5): 43-48. Cited by: 122
7. Brown P. F. The Mathematics of Statistical Machine Translation: Parameter Estimation [J]. Computational Linguistics, 1993, 19(2): 263-311. Cited by: 1
8. Frantzi K., Ananiadou S., Tsujii J. The C-value/NC-value Method of Automatic Recognition for Multi-Word Terms [C]. In: Proceedings of the Second European Conference on Research and Advanced Technology for Digital Libraries. Springer-Verlag, 1998. Cited by: 1
9. Och F. J., Ney H. Discriminative Training and Maximum Entropy Models for Statistical Machine Translation [C]. ACL, 2002. Cited by: 1
10. Och F. J. Minimum Error Rate Training for Statistical Machine Translation [C]. In: Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL), Sapporo, Japan, July 2003. Cited by: 1

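Citing documents (3)

1. Di Ping, Zhou Youliang, Gong Zhengxian, Zhou Guodong. Phrase table filtering in phrase-based statistical machine translation [J]. 计算机应用与软件, 2011, 28(5): 28-30. Cited by: 1
2. Zhang Yi. A study of translation non-equivalence in Western literature [J]. 时代文学（下半月）, 2012(5): 106-107.
3. Dai Hongbo. An exploration of translation strategies for modern Chinese literary works [J]. 作家, 2013(07X): 153-154.

Second-level citing documents (1)

1. Kong Jinying, Li Xiao, Wang Lei, Yang Yating, Luo Yangen. Research on deep filtering of reordering rule tables [J]. 计算机科学与探索, 2017, 11(5): 785-793. Cited by: 4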
