Perceptual auditory filter banks such as the Bark-scale filter bank are widely used as front-end processing in speech recognition systems. However, the design of optimized filter banks that provide higher recognition accuracy remains an open problem. Based on the spectral analysis used in feature extraction, an adaptive bands filter bank (ABFB) is presented. The design adopts flexible bandwidths and center frequencies for the frequency responses of the filters and uses a genetic algorithm (GA) to optimize the design parameters. The optimization is realized by combining the front-end filter bank with the back-end recognition network in the performance evaluation loop. Deploying ABFB together with the zero-crossing peak amplitude (ZCPA) feature as the front end of a radial basis function (RBF) system shows a significant improvement in robustness compared with the Bark-scale filter bank. In ABFB, several sub-bands are still concentrated toward lower frequencies, but their exact locations are determined by recognition performance rather than by perceptual criteria. For ease of optimization, only symmetrical bands are considered here, which still provide satisfactory results.
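The GA-driven search over band centers and bandwidths described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the number of bands, the frequency range, the population settings, and the genetic operators are all hypothetical, and `toy_fitness` is a placeholder for the recognition-performance score that the paper obtains by running the ZCPA front end and RBF network inside the evaluation loop.

```python
import random

def random_band(f_lo, f_hi, rng):
    """One symmetric band as (center_hz, bandwidth_hz), kept inside [f_lo, f_hi]."""
    center = rng.uniform(f_lo, f_hi)
    max_bw = 2 * min(center - f_lo, f_hi - center)
    return (center, rng.uniform(0.0, max_bw))

def mutate(band, f_lo, f_hi, rng, rate=0.2):
    """With probability `rate`, replace a band by a fresh random one."""
    return random_band(f_lo, f_hi, rng) if rng.random() < rate else band

def crossover(a, b, rng):
    """Single-point crossover on two equal-length lists of bands."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def optimize_bands(fitness, n_bands=16, f_lo=0.0, f_hi=4000.0,
                   pop_size=20, generations=30, seed=0):
    """Elitist GA: keep the better half, refill with mutated crossovers.

    All default values are assumptions chosen for illustration only.
    Returns the best filter bank found and the best fitness per generation.
    """
    rng = random.Random(seed)
    pop = [[random_band(f_lo, f_hi, rng) for _ in range(n_bands)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            child = crossover(p1, p2, rng)
            children.append([mutate(b, f_lo, f_hi, rng) for b in child])
        pop = elite + children
    return max(pop, key=fitness), history

def toy_fitness(bands):
    """Placeholder score (NOT from the paper): favors low center frequencies.

    In the setting described above, this would instead be the recognition
    performance measured with the candidate filter bank in the loop.
    """
    return -sum(center for center, _ in bands)

best_bank, best_per_generation = optimize_bands(toy_fitness, n_bands=4)
```

Because the better half of the population is carried over unchanged, the best score never decreases from one generation to the next; swapping `toy_fitness` for a real recognition-accuracy measurement is where the front end and back end would be coupled.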
We have implemented a watermark embedding system based on audio perceptual masking and propose a watermark detection system that uses pre-processing. With this method, the watermark can be detected from watermarked audio without the original audio. The results indicate that the embedding and detection method is robust: without affecting perceived audio quality, it resists attacks such as MPEG compression, filtering, and added white noise.
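This paper describes an effective coding and decoding method for obtaining high-quality wideband speech. CELP (Code Excited Linear Prediction) already achieves high speech quality in narrowband speech coding, but when applied directly to wideband speech signals it cannot effectively maintain that quality, so additional methods must be added to the CELP model for it to achieve high quality on wideband signals as well. The technique discussed here for improving the performance of the CELP model is the perceptual weighting filter, which is also used in the AMR-WB (Adaptive Multi-Rate Wideband) [1][2] vocoder selected by 3GPP.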
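As a rough illustration of the perceptual weighting filter mentioned in this abstract, the sketch below applies the classic CELP form W(z) = A(z/γ1) / A(z/γ2), where A(z) = 1 + Σ a_k z^(-k) is the LPC analysis filter. It is a minimal sketch under assumptions: the function names are hypothetical, the default values γ1 = 0.9 and γ2 = 0.6 are illustrative only, and the wideband-specific form used in the paper or in AMR-WB may differ.

```python
def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma): a[k] * gamma**k, assuming a[0] == 1."""
    return [ak * gamma ** k for k, ak in enumerate(a)]

def weighting_filter(x, a, gamma1=0.9, gamma2=0.6):
    """Filter x through W(z) = A(z/gamma1) / A(z/gamma2) in direct form.

    `a` holds the LPC coefficients [1, a_1, ..., a_p]; the gamma defaults
    are illustrative assumptions, not values taken from the paper.
    """
    num = bandwidth_expand(a, gamma1)  # zeros: A(z/gamma1)
    den = bandwidth_expand(a, gamma2)  # poles: A(z/gamma2)
    y = []
    for n in range(len(x)):
        acc = sum(num[k] * x[n - k] for k in range(len(num)) if n - k >= 0)
        acc -= sum(den[k] * y[n - k] for k in range(1, len(den)) if n - k >= 0)
        y.append(acc)
    return y

# Impulse response of W(z) for a hypothetical first-order predictor.
impulse_response = weighting_filter([1.0, 0.0, 0.0, 0.0], [1.0, -0.9])
```

Scaling each coefficient by γ^k moves the roots of A(z) toward the origin, which broadens the formant peaks; the ratio of two such expanded filters de-emphasizes the error in formant regions, where the speech itself masks coding noise.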
Funding: Project (61072087) supported by the National Natural Science Foundation of China; Project (20093048) supported by the Shanxi Provincial Graduate Innovation Fund of China.