Data Augmentation for Language Models via Adversarial Training (基于对抗训练策略的语言模型数据增强技术)

Cited by: 19
Abstract: Data augmentation for language models (LMs) based on maximum likelihood estimation (MLE) suffers from exposure bias, so the sampled text lacks long-term semantic information. We propose a data augmentation approach based on adversarial training, in which an auxiliary convolutional neural network discriminator judges whether generated data are real or fake and thereby guides a recurrent neural network generator toward the distribution of the real data. Data augmentation for language models is essentially a discrete sequence generation problem. When the generator's outputs are discrete, the error from the discriminator cannot be propagated back to the generator by backpropagation. To address this, we formulate discrete sequence generation as a reinforcement learning problem, treating the generator as a stochastic policy and optimizing it with the discriminator's output as the reward. Because the discriminator can only evaluate complete sequences, we evaluate the intermediate states of a generated sequence by Monte Carlo search. Experiments on n-best list rescoring for speech recognition show that, with limited text data, the proposed approach further reduces the character error rate (CER) as the amount of training data grows and consistently outperforms MLE-based augmentation. With a training corpus of 6 million tokens, the proposed approach achieves a relative CER reduction of 5.0% on the THCHS-30 dataset and 7.1% on the AISHELL dataset compared with the baseline system.
Authors: ZHANG Yi-Ke (张一珂), ZHANG Peng-Yuan (张鹏远), YAN Yong-Hong (颜永红). Affiliations: Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100049; Xinjiang Laboratory of Minority Speech and Language Information Processing, Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumchi 830011
Source: Acta Automatica Sinica (《自动化学报》); indexed in EI, CSCD, and the Peking University Core Journals list; 2018, Issue 5, pp. 891-900 (10 pages)
Funding: Supported by the National Natural Science Foundation of China (11590770-4, U1536117, 11504406, 11461141004), the National Key Research and Development Program of China (2016YFB0801203, 2016YFB0801200), and the Major Science and Technology Project of Xinjiang Uygur Autonomous Region (2016A03007-1)
Keywords: data augmentation, language modeling, generative adversarial nets (GAN), reinforcement learning, speech recognition
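
The abstract notes that the discriminator can only score complete sequences, so intermediate states are evaluated by Monte Carlo search. The following is a minimal sketch of that idea only, not the authors' implementation. The toy vocabulary, the sequence length, and the functions `generator_step`, `discriminator`, and `rollout_reward` are all hypothetical stand-ins: the paper uses a recurrent neural network generator and a convolutional neural network discriminator, which are replaced here by simple hand-written rules so that the example runs with the standard library alone.

```python
import random

VOCAB = ["a", "b", "c"]  # toy vocabulary (assumption, not from the paper)
SEQ_LEN = 6              # length of a complete sequence (assumption)


def generator_step(prefix, rng):
    """Hypothetical stand-in for the RNN generator: sample the next token.

    This toy policy repeats the previous token with probability 0.7 and
    otherwise draws a token uniformly from the vocabulary.
    """
    if prefix and rng.random() < 0.7:
        return prefix[-1]
    return rng.choice(VOCAB)


def discriminator(sequence):
    """Hypothetical stand-in for the CNN discriminator.

    Scores a COMPLETE sequence in [0, 1]; here the score is the fraction
    of adjacent token pairs that are equal.
    """
    assert len(sequence) == SEQ_LEN
    same = sum(1 for x, y in zip(sequence, sequence[1:]) if x == y)
    return same / (SEQ_LEN - 1)


def rollout_reward(prefix, n_rollouts, rng):
    """Estimate the reward of a partial sequence by Monte Carlo search.

    The prefix is completed n_rollouts times with the generator, each
    completion is scored by the discriminator, and the scores are averaged.
    """
    if len(prefix) == SEQ_LEN:
        return discriminator(prefix)
    total = 0.0
    for _ in range(n_rollouts):
        seq = list(prefix)
        while len(seq) < SEQ_LEN:
            seq.append(generator_step(seq, rng))
        total += discriminator(seq)
    return total / n_rollouts


rng = random.Random(0)
good = rollout_reward(["a", "a", "a"], 200, rng)
bad = rollout_reward(["a", "b", "c"], 200, rng)
print(f"prefix aaa: {good:.3f}, prefix abc: {bad:.3f}")
```

In a policy-gradient setup of the kind the abstract describes, an estimate like `rollout_reward(prefix, ...)` would weight the gradient of the log-probability of the token that produced `prefix`, so that tokens leading to sequences the discriminator rates highly become more likely. That update step is omitted here because it depends on the generator's parameterization.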