Abstract
Continuous speech recognition draws on acoustics, phonetics, and linguistics, and is currently an active topic in artificial intelligence research. Segmenting continuous speech is an important foundation for speech recognition. Traditional approaches such as double-threshold endpoint detection and model-based endpoint detection give unsatisfactory results for speech segmentation. To address this problem, the paper analyzes the phonetic structure and pronunciation characteristics of Chinese and studies a multi-level segmentation method for continuous Chinese speech. Segmentation of continuous Chinese speech is achieved by combining techniques including double-threshold endpoint detection, cepstrum-based endpoint detection, and coherence analysis.
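For readers unfamiliar with the first technique named in the abstract, the following is a minimal sketch of energy-based double-threshold endpoint detection. It is not the paper's implementation: the function names, frame sizes, and threshold values are illustrative assumptions, and the zero-crossing-rate check that is commonly paired with the energy thresholds is omitted for brevity.

```python
def frame_energies(samples, frame_len, hop):
    """Short-time energy (sum of squared samples) of each full frame."""
    return [
        sum(x * x for x in samples[start:start + frame_len])
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]


def double_threshold_segments(energies, high, low):
    """Return inclusive (first_frame, last_frame) pairs of detected speech.

    A segment is seeded by a frame whose energy exceeds `high`, then
    extended backward and forward while the energy stays above `low`.
    Runs that rise above `low` but never above `high` are rejected.
    Both thresholds are hypothetical tuning parameters.
    """
    if high < low:
        raise ValueError("high threshold must not be below low threshold")
    segments = []
    n = len(energies)
    i = 0
    while i < n:
        if energies[i] > high:
            start = i
            while start > 0 and energies[start - 1] > low:
                start -= 1
            end = i
            while end + 1 < n and energies[end + 1] > low:
                end += 1
            segments.append((start, end))
            i = end + 1  # resume scanning after this segment
        else:
            i += 1
    return segments


if __name__ == "__main__":
    # Hand-made per-frame energies: one strong burst, one weak run, one spike.
    demo = [0, 0, 2, 6, 7, 3, 0, 0, 2, 2, 0, 8, 0]
    print(double_threshold_segments(demo, high=5, low=1))
```

In this sketch the weak run at frames 8 and 9 is discarded because it never crosses the high threshold, which is the behaviour that distinguishes a double threshold from a single one. In practice the thresholds would typically be derived from an estimate of the background noise level rather than fixed by hand.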
Authors
CAO Guanbin (曹冠彬)
ZHANG Erhua (张二华)
WANG Kailong (王凯龙)
School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094
Source
Computer & Digital Engineering (《计算机与数字工程》)
2019, No. 7, pp. 1667-1671, 1712 (6 pages in total)
Keywords
endpoint detection
cepstrum
spectrogram
coherence analysis