Abstract
Speech recognition has been applied to the interaction interface of mobile devices for agricultural price acquisition, to make up for the lack of a voice interface in traditional devices. However, environmental noise often reduces the recognition rate sharply, so speech enhancement is commonly used as a front-end processing step for speech recognition, because it can improve the signal-to-noise ratio (SNR) of the input signal. This paper studies speech signals in an agricultural market environment. To obtain better recognition performance, four speech enhancement methods were compared using two different speech quality evaluation methods: spectral subtraction (SS), multi-band spectral subtraction (MB), minimum mean square error (MMSE), and logarithmic minimum mean square error (log MMSE). With segmental SNR as the evaluation criterion, SS performed best, followed by MMSE and log MMSE, while MB performed comparatively poorly. Because segmental SNR does not take speech intelligibility into account, the four methods were also scored with the perceptual evaluation of speech quality (PESQ) measure; MB obtained the highest score, followed by log MMSE and SS, with MMSE scoring lowest. The PESQ ranking was therefore not consistent with the segmental SNR ranking. The study also found that an improvement in SNR and an increase in intelligibility usually cannot be achieved at the same time. Given the complexity of speech recognition, the enhancement algorithm should be chosen according to the specific application. The study provides a reference for selecting speech enhancement algorithms in agricultural market environments.
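The abstract does not give the formula behind its first evaluation criterion. As a rough illustration only, the sketch below computes segmental SNR in one commonly used form: the SNR in dB is computed per frame between a clean reference and the enhanced signal, then averaged. The function name `segmental_snr`, the frame length of 256 samples, the non-overlapping frames, and the clamping range of -10 to 35 dB are all assumptions made for this example and are not taken from the paper.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Mean per-frame SNR (dB) between a clean reference and a processed signal.

    frame_len, lo_db and hi_db are illustrative defaults, not values from the paper.
    """
    if len(clean) != len(enhanced):
        raise ValueError("signals must have the same length")
    eps = 1e-12  # guards against division by zero and log(0)
    frame_snrs = []
    # Non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        ref = clean[start:start + frame_len]
        out = enhanced[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        noise_energy = sum((x - y) ** 2 for x, y in zip(ref, out))
        snr_db = 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))
        # Clamp each frame so silent or near-perfect frames do not dominate the mean.
        frame_snrs.append(min(max(snr_db, lo_db), hi_db))
    if not frame_snrs:
        raise ValueError("signal shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)


if __name__ == "__main__":
    clean = [math.sin(2 * math.pi * 5 * n / 512) for n in range(1024)]
    enhanced = [0.9 * x for x in clean]  # stand-in for an enhancement output
    print(round(segmental_snr(clean, enhanced), 2))
```

A measure of this kind rewards waveform fidelity frame by frame, which is why its ranking can differ from that of a perceptual measure such as PESQ.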
Source
Guangdong Agricultural Sciences (《广东农业科学》)
CAS
2015, Issue 10, pp. 166-172 (7 pages)
Funding
National Natural Science Foundation of China (61271364)
Keywords
speech enhancement
segmental signal-to-noise ratio
PESQ
agricultural product prices
speech recognition