
On the GIZA++-Based Manual Chinese-English Word Alignment Method (浅议基于GIZA++的汉英手动词对齐法) Cited by: 2
Abstract: Automatic word alignment based on statistics and computation has one main advantage: it infers word correspondences from the frequency and distribution of words, so it needs only a large corpus, and no machine-readable dictionary or linguistic knowledge, to find correspondences between sentences. Its drawback is that accuracy is strongly affected by factors such as frequency, language family, genre, and style. To address this weakness, the paper proposes a GIZA++-based manual Chinese-English word alignment method: the text is first pre-aligned with the GIZA++ tool, and the result is then manually edited and aligned. According to the paper's experiments, speed is greatly improved compared with purely unsupervised alignment, and accuracy is somewhat improved compared with other purely automatic word alignment methods.
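Author: Xie Gengquan (谢庚全)
Source: Journal of Hainan Radio & TV University (《海南广播电视大学学报》), 2017, No. 4, pp. 7-11 (5 pages)
Funding: One of the outputs of the 2016 Hainan Provincial Natural Science Foundation project "Research on a Chinese-English automatic word alignment system fusing multiple remappings based on multiple preprocessing mechanisms: taking the creation of an online parallel corpus of Chinese-English translations of Hainan tourism texts as an example" (No. 20167238)
Keywords: automatic word alignment; GIZA++; manual word alignment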
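
The two-stage idea in the abstract (pre-align automatically, then correct by hand) can be illustrated with a small sketch. This is not the author's tool. It assumes the pre-alignment is available as one alignment line in the style of GIZA++'s A3 output, where each token is followed by the 1-based positions it is linked to, written as `token ({ 1 2 })`, and where the first token is `NULL`. The function names `parse_a3_line` and `apply_manual_edits`, the example sentence, and the chosen corrections are all hypothetical.

```python
import re

# One "token ({ i j ... })" group in an A3-style alignment line (assumed format).
_GROUP = re.compile(r"(\S+) \(\{([\d ]*)\}\)")


def parse_a3_line(line):
    """Return alignment links as a set of (token_index, linked_position) pairs.

    token_index counts the groups on the line, with 0 being the NULL token;
    linked_position is the 1-based position listed inside the braces.
    """
    links = set()
    for idx, match in enumerate(_GROUP.finditer(line)):
        for pos in match.group(2).split():
            links.add((idx, int(pos)))
    return links


def apply_manual_edits(links, add=(), remove=()):
    """Apply an annotator's corrections to a pre-alignment (hypothetical helper)."""
    return (set(links) - set(remove)) | set(add)


# Hypothetical pre-alignment for "我 喜欢 喝 茶" / "I like drinking tea",
# in which position 3 (喝) was wrongly attached to "like".
pre = parse_a3_line("NULL ({ }) I ({ 1 }) like ({ 2 3 }) drinking ({ }) tea ({ 4 })")

# The annotator moves position 3 from "like" (index 2) to "drinking" (index 3).
fixed = apply_manual_edits(pre, add=[(3, 3)], remove=[(2, 3)])
print(sorted(fixed))
```

Keeping the links as a set of pairs makes each manual correction a plain add or remove, so the annotator only touches the links the automatic stage got wrong.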
