Abstract
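Fast speaker adaptation has attracted wide attention in speech recognition. The maximum likelihood model interpolation (MLMI) algorithm is an effective fast adaptation algorithm; its main drawback is that a large number of speaker-dependent models must be stored. To overcome this drawback, this paper proposes an improved method, the matrix linear interpolation adaptation algorithm. The algorithm replaces the speaker-dependent models of MLMI with matrices that represent speaker characteristics and interpolates them linearly. The interpolation coefficients are determined from speech data supplied by the test speaker, according to the maximum likelihood criterion. The interpolated matrix is then applied to the speaker-independent model to obtain the final speaker-adapted model. The algorithm greatly reduces the storage required, while its adaptation performance stays essentially in line with ML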
Fast speaker adaptation techniques for speech recognition are of great interest. Maximum likelihood model interpolation (MLMI) is an effective fast speaker adaptation method. Its main shortcoming is the large amount of memory needed to store the speaker-dependent (SD) models. This paper proposes a modified method, the matrix linear interpolation adaptation method, to overcome this memory limitation. The method represents speaker characteristics with matrices instead of the SD models used in MLMI, and interpolates these matrices linearly. The interpolation coefficients are estimated to maximize the likelihood of the adaptation data. The interpolated matrix is then used to transform the speaker-independent model into the speaker-adapted model. The method greatly reduces the memory requirement while maintaining the adaptation performance of MLMI.
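The interpolation step described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes identity-covariance Gaussians, a fixed frame-to-Gaussian alignment, and bias-free square matrices, so that maximizing the likelihood reduces to a least-squares problem for the weights. All function names (`ml_weights`, `interpolate`, and the helpers) and the toy data are hypothetical.

```python
def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ml_weights(matrices, si_means, frames, alignment):
    """Weights maximizing the likelihood of the adaptation frames.

    Assumption: unit-covariance Gaussians and a known alignment
    (frame t belongs to Gaussian alignment[t]), so the maximum
    likelihood solution is given by the normal equations A w = b.
    """
    # m[k][g]: speaker-independent mean g transformed by speaker matrix k
    m = [[matvec(W, mu) for mu in si_means] for W in matrices]
    K = len(matrices)
    A = [[sum(dot(m[j][g], m[k][g]) for g in alignment) for k in range(K)]
         for j in range(K)]
    b = [sum(dot(m[j][g], o) for o, g in zip(frames, alignment))
         for j in range(K)]
    return solve(A, b)


def interpolate(matrices, weights):
    """Linear interpolation of the speaker matrices."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(lam * Wk[i][j] for lam, Wk in zip(weights, matrices))
             for j in range(cols)] for i in range(rows)]


# Toy data (hypothetical): two speaker matrices, two SI Gaussian means.
speaker_matrices = [[[1.0, 0.0], [0.0, 1.0]],
                    [[2.0, 0.0], [0.0, 0.5]]]
si_means = [[1.0, 2.0], [3.0, 1.0]]
# Adaptation frames generated with weights (0.25, 0.75), one per Gaussian.
frames = [[1.75, 1.25], [5.25, 0.625]]
alignment = [0, 1]

weights = ml_weights(speaker_matrices, si_means, frames, alignment)
W_new = interpolate(speaker_matrices, weights)
adapted_means = [matvec(W_new, mu) for mu in si_means]
print(weights, adapted_means)
```

In this sketch only the small matrices and the speaker-independent means are kept in memory, which mirrors the storage argument of the abstract; a full system would also need covariances, mixture weights, and an alignment obtained from decoding, all of which are omitted here.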
Source
《清华大学学报(自然科学版)》
EI
CAS
CSCD
Peking University Core Journals
2002, No. 1, pp. 27-29 (3 pages)
Journal of Tsinghua University (Science and Technology)
Funding
Tsinghua University "985" Major Project (985校-22-攻关-06)