Abstract
[Purpose/significance] This article examines whether integrating different kinds of semantic knowledge into a machine translation model improves translation quality, and which kind of semantic knowledge contributes most. The aim is to support machine translation research and the inheritance and dissemination of China's fine traditional culture. [Method/process] The study uses 300,000 pairs of meticulously processed "Ancient Chinese-Modern Chinese" parallel sentences from the Twenty-Four Histories as experimental data. Based on the neural machine translation model OpenNMT, it integrates word boundary knowledge, part-of-speech knowledge, entity knowledge, and dependency syntax knowledge separately into the training process of the translation model, using three different feature fusion methods. [Result/conclusion] Integrating different kinds of semantic knowledge into the model affects the translation of classical texts differently. Word boundary knowledge, part-of-speech knowledge, and entity knowledge each contribute to the machine translation task, with entity knowledge contributing the most, while dependency syntax knowledge has no obvious effect.
Authors
Wu Mengcheng (吴梦成)
Lin Litao (林立涛)
Wu Na (吴娜)
Xu Qiankun (许乾坤)
Wang Dongbo (王东波)
Affiliations: College of Information Management, Nanjing Agricultural University, Jiangsu, 210095; Research Center for Humanities and Social Computing, Nanjing Agricultural University, Jiangsu, 210095; Research Center for Correlation of Domain Knowledge, Nanjing Agricultural University, Jiangsu, 210095; School of Information Management, Nanjing University, Jiangsu, 210023
Source
《情报资料工作》
Peking University Core Journal (北大核心)
2024, Issue 2, pp. 97-104 (8 pages)
Information and Documentation Services
Funding
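One of the research outcomes of the National Social Science Fund of China Major Project "Construction and Application of a Cross-Language Knowledge Base of Ancient Chinese Classics" (中国古代典籍跨语言知识库构建及应用研究; Grant No. 21&ZD331).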
Keywords
ancient books
semantic knowledge
Twenty-Four Histories
machine translation