
Trainable prosodic model for standard Chinese Text-to-Speech system

Abstract: Putonghua prosody is characterized by a hierarchical structure that is shaped by the linguistic environment. Based on this, a neural network with specially weighted factors and optimized outputs is described and applied to construct the Putonghua prosodic model in a Text-to-Speech (TTS) system. Extensive tests show that the structure of the neural network characterizes Putonghua prosody more accurately than traditional models. Learning is sped up and computational precision is improved, which makes the whole prosodic model more efficient. Furthermore, the paper stylizes Putonghua syllable pitch contours with SPiS parameters (Syllable Pitch Stylized Parameters) and analyzes their use in adjusting syllable pitch. The results show that the SPiS parameters effectively characterize Putonghua syllable pitch contours and facilitate both the construction of the network model and prosodic control.
Source: Chinese Journal of Acoustics (声学学报(英文版)), 2001, No. 3, pp. 257-265 (9 pages)
Funding: This work was supported by the National Natural Science Foundation of China (69875008) and the 863 National High Technology Project.

