Abstract
To improve the accuracy and robustness of command word recognition under non-user speech and noise interference, a command word recognition system is proposed that combines a multi-channel microphone array and power-normalized cepstral coefficients (PNCC) with a residual neural network. First, residual units are used to build a standard ResNet-CW-15 multi-task model and a low-power ResNet-CW-6 multi-task model. When the model judges that a command word was spoken by the user, command word recognition is executed; when it judges the speaker to be a non-user, recognition is not executed. Second, a multi-channel microphone array is used to collect a speech command word data set that contains spatial feature information. Finally, PNCC, which offers a degree of robustness to noise, is used as the feature of the command word data set to train the network. The standard ResNet-CW-15 model performs well in both command word recognition rate and user/non-user judgment under noisy conditions. The low-power ResNet-CW-6 model has a somewhat lower overall command word recognition rate and user judgment accuracy, but its parameter count is greatly reduced, which substantially lowers system power consumption and makes it better suited to wide deployment on small, low-power smart devices.
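The user/non-user gating described in the abstract can be illustrated with a minimal sketch. It assumes the multi-task model exposes two outputs, a user probability and a vector of command scores. The label set `COMMANDS`, the threshold `USER_THRESHOLD`, and the function names are hypothetical and are not taken from the paper.

```python
from typing import Optional, Sequence

# Hypothetical label set and threshold; the abstract specifies neither.
COMMANDS = ["on", "off", "up", "down"]
USER_THRESHOLD = 0.5


def argmax(scores: Sequence[float]) -> int:
    """Return the index of the largest score."""
    return max(range(len(scores)), key=lambda i: scores[i])


def gated_decision(user_prob: float,
                   command_scores: Sequence[float]) -> Optional[str]:
    """Return a command only when the speaker is judged to be the user."""
    if user_prob < USER_THRESHOLD:
        # Judged as non-user speech: command recognition is not executed.
        return None
    # Judged as the user: report the highest-scoring command word.
    return COMMANDS[argmax(command_scores)]
```

In this sketch a non-user utterance yields `None` regardless of the command scores, which mirrors the behaviour stated in the abstract; how the actual model shares layers between the two tasks is not described here.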
Authors
ZHANG Shuo (张硕)
ZENG Qingning (曾庆宁)
ZHENG Zhanheng (郑展恒)
BU Yuting (卜玉婷)
School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China
Source
Modern Electronics Technique (《现代电子技术》)
2022, No. 21, pp. 37-42 (6 pages)
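Funding
National Natural Science Foundation of China (61961009)
Key Project of the Guangxi Natural Science Foundation (2016GXNSFDA380018)
Fund of the Guangxi Key Laboratory of Wireless Wideband Communication and Signal Processing (GXKL06200107)
Director's Fund of the Key Laboratory of Cognitive Radio and Information Processing, Ministry of Education (CRKL200110)
Innovation Project of Graduate Education, Guilin University of Electronic Technology (2021YCXS028)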