Abstract
Example-based machine translation (EBMT) that relies on exact matching has translation coverage too low for large-scale practical use. This paper proposes a generalized matching algorithm for translation examples that aims to improve the coverage of EBMT: guided by the input sentence to be translated, candidate translation examples are generalized in real time and in a targeted way. The algorithm thereby meets the speed requirements of real-time document translation while making full use of the translation knowledge that users add and revise as they use the system, which raises the system's overall translation coverage and translation quality. Experiments show that, with a corpus of 160,000 sentence pairs, the system achieves translation coverage of about 75%, which supports the effectiveness of the algorithm.
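The abstract does not spell out the matching procedure, so the following is only a minimal sketch of the general idea of input-guided generalization, not the paper's algorithm. It assumes a hypothetical toy word-class table (`WORD_CLASS`), equal-length token sequences, and a simplified notion of coverage (the share of input sentences for which some example matches); all names are illustrative.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy word classes (an assumption for illustration only);
# a real system would draw these from a lexicon or thesaurus.
WORD_CLASS: Dict[str, str] = {
    "apples": "FOOD", "pears": "FOOD",
    "monday": "DAY", "friday": "DAY",
}

Substitution = Tuple[int, str, str]  # (position, example word, input word)


def generalized_match(
    input_tokens: List[str], example_tokens: List[str]
) -> Optional[List[Substitution]]:
    """Match an example against the input, generalizing only where they differ.

    A differing position is treated as a variable slot if both words share a
    word class. Returns the list of substitutions, or None if no match.
    An empty list means the match is exact.
    """
    if len(input_tokens) != len(example_tokens):
        return None
    substitutions: List[Substitution] = []
    for i, (inp, ex) in enumerate(zip(input_tokens, example_tokens)):
        if inp == ex:
            continue
        cls = WORD_CLASS.get(inp)
        if cls is None or cls != WORD_CLASS.get(ex):
            return None  # words differ and cannot be generalized to one class
        substitutions.append((i, ex, inp))
    return substitutions


def coverage(inputs: List[str], examples: List[str], exact_only: bool = False) -> float:
    """Fraction of input sentences matched by at least one example."""
    covered = 0
    for sentence in inputs:
        tokens = sentence.lower().split()
        for example in examples:
            subs = generalized_match(tokens, example.lower().split())
            # With exact_only, accept a match only if it needed no substitutions.
            if subs is not None and (not exact_only or not subs):
                covered += 1
                break
    return covered / len(inputs) if inputs else 0.0


EXAMPLES = ["i eat apples on monday"]
INPUTS = ["i eat apples on monday", "i eat pears on friday", "she sleeps"]

if __name__ == "__main__":
    print("exact coverage:      ", coverage(INPUTS, EXAMPLES, exact_only=True))
    print("generalized coverage:", coverage(INPUTS, EXAMPLES))
```

In this toy setting the example base is generalized only at the positions where the input differs from a stored example, so newly added examples become usable immediately without an offline generalization pass. That is the property the abstract attributes to real-time generalization; the matching criterion itself is an assumption here.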
Source
Journal of Chinese Information Processing (《中文信息学报》)
CSCD
Peking University Core Journals (北大核心)
2005, Issue 4, pp. 1-9 (9 pages)
Funding
National Natural Science Foundation of China (6027088)
National 863 Program of China (2002AA117010-02)
Keywords
artificial intelligence
machine translation
example-based machine translation
generalized matching
translation coverage