Abstract
In text-to-speech (TTS) systems, automatic prediction of intonational phrases is one of the key factors affecting the naturalness and intelligibility of synthetic speech. This paper uses a Maximum Entropy (ME) model to predict intonational phrases from unrestricted text and proposes a hierarchical clustering algorithm that generates feature templates automatically, reducing the human involvement needed during ME model training. Comparative experiments show that, for intonational phrase prediction, the ME model clearly outperforms Classification And Regression Trees (CART). Compared with manually compiled feature templates, the automatically generated templates not only improve the F-score of ME-based intonational phrase prediction by 3.18 but also reduce the size of the ME model by up to 78.38%.
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
PKU Core Journal (北大核心)
2011, No. 16, pp. 19-21, 34 (4 pages)
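Funding
National Natural Science Foundation of China (No. 61011140075)
Hunan Provincial Science and Technology Program (No. 2010FJ4131)
Scientific Research Project of the Hunan Provincial Department of Education (No. 10C0955)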