Abstract
Eigenvoice adaptation is a rapid speaker adaptation technique that guides parameter updates with an eigen-analysis of the full speaker vector space. This article describes an eigenvoice adaptation algorithm based on subspace analysis: eigenvoices are restricted to clustered correlation subspaces, and the maximum-likelihood (ML) criterion used in conventional eigenvoice adaptation is replaced with the maximum a posteriori (MAP) criterion for better parameter estimation. Experiments show that with only one sentence of adaptation data, the algorithm achieves a 6.45% relative reduction in recognition error rate. It adapts much faster than the traditional MAP method and avoids the degradation in recognition performance that the ML linear-regression method shows with very little data. The results also indicate that the method does not depend strongly on the number of correlation subspaces, which makes it a robust adaptation algorithm.
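The idea summarized in the abstract, a MAP estimate of eigenvoice weights computed separately in each subspace, can be illustrated with a minimal sketch. This is not the authors' implementation: the zero-mean Gaussian prior on the weights, the diagonal variances, the per-dimension occupation counts, and every function and parameter name below are assumptions made for illustration only.

```python
import numpy as np

def map_eigenvoice_weights(E, m, obs_mean, counts, var, prior_var):
    """MAP eigenvoice weights for one subspace (illustrative sketch).

    E         : (D, K) eigenvoice basis for this subspace
    m         : (D,)   speaker-independent mean vector
    obs_mean  : (D,)   sample mean of the adaptation data
    counts    : (D,)   occupation count per dimension
    var       : (D,)   diagonal model variances
    prior_var : (K,)   assumed prior variance of each weight
    """
    precision = counts / var                      # data precision per dimension
    # Normal equations: data term plus the prior term (the MAP part).
    A = E.T @ (precision[:, None] * E) + np.diag(1.0 / prior_var)
    b = E.T @ (precision * (obs_mean - m))
    return np.linalg.solve(A, b)

def adapt_supervector(m, bases, index_sets, obs_mean, counts, var, prior_vars):
    """Adapt each subspace independently and write it back into the supervector."""
    adapted = m.copy()
    for idx, E, pv in zip(index_sets, bases, prior_vars):
        w = map_eigenvoice_weights(E, m[idx], obs_mean[idx],
                                   counts[idx], var[idx], pv)
        adapted[idx] = m[idx] + E @ w
    return adapted
```

Under these assumptions the prior term keeps the weights near zero when the counts are small, so the adapted means stay close to the speaker-independent model instead of being driven by a handful of frames. Dropping the `np.diag(1.0 / prior_var)` term would give the corresponding ML estimate.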
Source
《清华大学学报(自然科学版)》
EI
CAS
CSCD
Peking University Core Journals
2004, Issue 6, pp. 829-832 (4 pages)
Journal of Tsinghua University (Science and Technology)
Funding
National "863" High-Tech Program (863-306-ZD03-01-2)
Keywords
information processing
speech recognition
fast adaptation
eigenvoice
maximum likelihood (ML)
maximum a posteriori (MAP)
correlation subspaces