Abstract
Because neural machine translation (NMT) models have large-scale parameters and their performance depends heavily on large-scale, high-quality parallel corpora, a model is prone to over-fitting and insufficient generalization when the training data size is small relative to the model complexity. To address this problem, we study a word-level regularization technique that randomly perturbs the words in the model's input sentences, thereby reducing the specificity of the data, restraining the model from over-learning it, preventing over-fitting, and improving the model's generalization ability. Experiments with the Transformer model on a standard-scale Chinese-English dataset and small- to medium-scale English-Turkish datasets show that word-level regularization makes the model more stable after convergence and less prone to over-fitting, and that translation quality is also significantly improved.
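The word-level perturbation described above can be sketched as follows. This is a minimal illustrative sketch, not the authors' exact scheme: it assumes tokenized input, a hypothetical perturbation probability `p`, and a simple policy of replacing a selected token with either an `<unk>` symbol or a random vocabulary word.

```python
import random

def perturb_tokens(tokens, vocab, p=0.1, unk="<unk>", seed=None):
    """Word-level regularization sketch: each input token is independently
    perturbed with probability p, weakening data specificity so the model
    cannot over-learn the exact training sentences."""
    rng = random.Random(seed)
    out = []
    for tok in tokens:
        if rng.random() < p:
            # Half the time drop the word to <unk>,
            # half the time swap in a random vocabulary word.
            out.append(unk if rng.random() < 0.5 else rng.choice(vocab))
        else:
            out.append(tok)
    return out

vocab = ["the", "cat", "sat", "on", "mat"]
sentence = ["the", "cat", "sat", "on", "the", "mat"]
print(perturb_tokens(sentence, vocab, p=0.3, seed=0))
```

Applied afresh at every training epoch, such perturbation acts as a data-dependent regularizer, analogous to dropout but operating on input words rather than hidden units.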
Authors
QIU Shigui; ZHANG Huaao; DUAN Xiangyu; ZHANG Min (School of Computer Science and Technology, Soochow University, Suzhou 215006, China)
Source
Journal of Xiamen University (Natural Science)
Indexed in: CAS; CSCD; Peking University Core Journals
2021, No. 4, pp. 662-669 (8 pages)
Funding
National Natural Science Foundation of China (61673289).
Keywords
neural machine translation
generalization
over-fitting
regularization