Abstract
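Starting from the direct combination of the Chinese and English phone sets, this paper studies three approaches to Chinese-English bilingual acoustic modeling: (1) directly combining the two phone sets for acoustic modeling; (2) a unified acoustic representation based on IPA mapping; and (3) automatic merging and clustering of the Chinese and English phones. Experimental results show that approach (1) yields a fairly robust acoustic model but uses the most modeling units, so the model is not compact; approach (2) yields a compact model but is less robust; approach (3) models both languages with fewer phones, keeping the compactness of (2) while essentially matching the recognition rate of (1). In particular, when the acoustic likelihood criterion is used, the English recognition rate even exceeds that of approach (1).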
In this paper, three approaches to Chinese-English bilingual acoustic modeling are investigated and compared. The first approach simply combines the Chinese and English phone inventories without sharing any phones across the two languages. The second maps language-dependent phones to the inventory of the International Phonetic Association (IPA), using phonetic knowledge to construct the bilingual phone inventory. The third merges the language-dependent phone models with a hierarchical phone clustering algorithm to obtain a compact bilingual inventory. Experimental results show that the phone clustering approach outperforms the IPA-based phone mapping approach, and that it achieves performance comparable to the simple combination of language-dependent phone inventories with fewer model parameters, especially when an acoustic likelihood measure is used.
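The third approach can be illustrated with a minimal sketch of bottom-up phone merging. This is not the authors' implementation: it assumes each phone is summarized by a single diagonal Gaussian (frame count, mean, variance), and it uses the loss in training log-likelihood caused by a merge as the distance, which is one common likelihood-based choice. The names `merge_loss` and `cluster_phones`, the phone labels, and the toy statistics are all hypothetical.

```python
import math
from itertools import combinations


def log_likelihood(n, var):
    """Log-likelihood of n frames under their own ML diagonal Gaussian."""
    return -0.5 * n * sum(math.log(2 * math.pi * math.e * v) for v in var)


def merge(a, b):
    """Pool the sufficient statistics of two phone clusters."""
    n = a["n"] + b["n"]
    mean = [(a["n"] * ma + b["n"] * mb) / n
            for ma, mb in zip(a["mean"], b["mean"])]
    var = [(a["n"] * (va + ma * ma) + b["n"] * (vb + mb * mb)) / n - m * m
           for va, ma, vb, mb, m
           in zip(a["var"], a["mean"], b["var"], b["mean"], mean)]
    return {"names": a["names"] + b["names"], "n": n, "mean": mean, "var": var}


def merge_loss(a, b):
    """Drop in log-likelihood when a and b share one Gaussian (assumed measure)."""
    m = merge(a, b)
    return (log_likelihood(a["n"], a["var"]) + log_likelihood(b["n"], b["var"])
            - log_likelihood(m["n"], m["var"]))


def cluster_phones(phones, target_size):
    """Greedily merge the cheapest pair until target_size clusters remain."""
    clusters = [dict(p) for p in phones]
    while len(clusters) > max(1, target_size):
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: merge_loss(clusters[ij[0]], clusters[ij[1]]))
        merged = merge(clusters[i], clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Toy one-dimensional statistics for two Chinese and two English phones.
toy = [
    {"names": ["zh_a"], "n": 100, "mean": [0.0], "var": [1.0]},
    {"names": ["en_aa"], "n": 100, "mean": [0.1], "var": [1.0]},
    {"names": ["zh_i"], "n": 100, "mean": [5.0], "var": [1.0]},
    {"names": ["en_iy"], "n": 100, "mean": [5.2], "var": [1.1]},
]
result = sorted(sorted(c["names"]) for c in cluster_phones(toy, 2))
print(result)
```

On this toy data the acoustically close cross-language pairs end up sharing a unit, which is the effect the clustering approach relies on to shrink the bilingual inventory. A real system would work with multi-state, multi-mixture phone models, and may restrict or weight merges differently.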
Source
《中文信息学报》
CSCD
Peking University Core Journals
2004, Issue 5, pp. 78-84 (7 pages)
Journal of Chinese Information Processing
Funding
National 863 Program of China (002AA117010)
Beijing Digital Olympics funded project (H030130050430)
Keywords
computer application
Chinese information processing
speech recognition
acoustic modeling
Chinese-English bilingual
hierarchical clustering
likelihood