Funding: Supported by the National Natural Science Foundation of China (Nos. 61571044, 11590772, and 61473041).
Abstract: With the development of multichannel audio systems, the corresponding audio quality assessment techniques, especially objective prediction models, have received increasing attention. Existing methods, such as PEAQ (Perceptual Evaluation of Audio Quality) recommended by the ITU, usually perform poorly on multichannel audio, producing scores that correlate only weakly with subjective scores. In this paper, a novel two-layer model based on Multiple Linear Regression (MLR) and a Neural Network (NN) is proposed. The first layer derives two indicators of multichannel audio, the Audio Quality Score (AQS) and the Spatial Perception Score (SPS), and the second layer outputs the overall score. The final results show that this model not only improves the correlation with the subjective test score by 30.7% and decreases the Root Mean Square Error (RMSE) by 44.6%, but also adds two new indicators, AQS and SPS, which help reflect multichannel audio quality more clearly.
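The two-layer structure described in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the input features, the synthetic data, the split of features between AQS and SPS, the network size, and all function names (`fit_mlr`, `train_nn`, and so on) are hypothetical. Layer 1 fits two linear regressions that yield AQS and SPS estimates; layer 2 is a small one-hidden-layer network that maps the two indicators to an overall score. Correlation and RMSE are computed at the end because those are the metrics the abstract reports.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares fit of y ~ X plus an intercept; returns the coefficients."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Apply a fitted linear regression (intercept is the last coefficient)."""
    return np.column_stack([X, np.ones(len(X))]) @ coef

def nn_forward(params, Z):
    """One tanh hidden layer followed by a linear output unit."""
    H = np.tanh(Z @ params["W1"] + params["b1"])
    return H, H @ params["W2"] + params["b2"]

def train_nn(Z, y, hidden=4, lr=0.05, steps=2000, seed=0):
    """Full-batch gradient descent on the mean squared error."""
    rng = np.random.default_rng(seed)
    params = {
        "W1": 0.5 * rng.normal(size=(Z.shape[1], hidden)),
        "b1": np.zeros(hidden),
        "W2": 0.5 * rng.normal(size=hidden),
        "b2": 0.0,
    }
    losses = []
    n = len(y)
    for _ in range(steps):
        H, pred = nn_forward(params, Z)
        err = pred - y
        losses.append(float(np.mean(err ** 2)))
        g = 2.0 * err / n                                  # dLoss/dpred
        dH = np.outer(g, params["W2"]) * (1.0 - H ** 2)    # backprop through tanh
        params["W2"] = params["W2"] - lr * (H.T @ g)
        params["b2"] = params["b2"] - lr * g.sum()
        params["W1"] = params["W1"] - lr * (Z.T @ dH)
        params["b1"] = params["b1"] - lr * dH.sum(axis=0)
    return params, losses

# Synthetic stand-in data (an assumption for illustration, not from the paper).
rng = np.random.default_rng(0)
n = 200
X_quality = rng.normal(size=(n, 3))   # hypothetical timbre/quality features
X_spatial = rng.normal(size=(n, 3))   # hypothetical spatial features
aqs_subj = X_quality @ np.array([0.5, -0.3, 0.2]) + 0.05 * rng.normal(size=n)
sps_subj = X_spatial @ np.array([0.4, 0.1, -0.6]) + 0.05 * rng.normal(size=n)
overall_subj = 0.6 * aqs_subj + 0.4 * sps_subj - 0.1 * aqs_subj * sps_subj

# Layer 1: MLR produces the two intermediate indicators.
aqs_hat = predict_mlr(fit_mlr(X_quality, aqs_subj), X_quality)
sps_hat = predict_mlr(fit_mlr(X_spatial, sps_subj), X_spatial)
indicators = np.column_stack([aqs_hat, sps_hat])

# Layer 2: a small NN maps (AQS, SPS) to the overall score.
params, losses = train_nn(indicators, overall_subj)
_, overall_pred = nn_forward(params, indicators)

correlation = float(np.corrcoef(overall_pred, overall_subj)[0, 1])
rmse = float(np.sqrt(np.mean((overall_pred - overall_subj) ** 2)))
print(f"correlation={correlation:.3f}  RMSE={rmse:.3f}")
```

Keeping AQS and SPS as explicit intermediate outputs is what lets the model report them as separate indicators alongside the overall score.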
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF2015R1D1A1A01059804); by the MSIP (Ministry of Science, ICT and Future Planning), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2016-R2718-16-0011) supervised by the IITP (Institute for Information & communications Technology Promotion); and by the Research Grant of Kwangwoon University in 2017.
Abstract: In this paper, we present an approach to improve the accuracy of environmental sound event detection in a wireless acoustic sensor network for home monitoring. Wireless acoustic sensor nodes capture sounds in the home and simultaneously deliver them to a sink node for sound event detection. The proposed approach is composed of three modules: signal estimation, reliable sensor channel selection, and sound event detection. During signal estimation, lost packets are recovered to improve the signal quality. Next, reliable channels are selected using a multi-channel cross-correlation coefficient to improve the computational efficiency of distant sound event detection without sacrificing performance. Finally, the signals of the two selected channels are used for environmental sound event detection based on bidirectional gated recurrent neural networks using two-channel audio features. Experiments show that the proposed approach achieves superior performance compared to the baseline.
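The channel selection step in the abstract above can be illustrated with a small sketch. It assumes one common definition of the squared multi-channel cross-correlation coefficient, namely one minus the determinant of the normalized correlation matrix of the channels, evaluated at zero lag; the paper may use a different formulation (for example with time-delay alignment). The function names and the synthetic four-node signals are hypothetical.

```python
import numpy as np
from itertools import combinations

def mccc_squared(signals):
    """Squared MCCC at zero lag: 1 - det of the normalized correlation matrix.

    `signals` has shape (channels, samples). For two channels this reduces
    to the squared Pearson correlation coefficient.
    """
    R = np.corrcoef(signals)
    return float(1.0 - np.linalg.det(R))

def select_channel_pair(signals):
    """Return the pair of channel indices with the highest squared MCCC."""
    pairs = combinations(range(signals.shape[0]), 2)
    return max(pairs, key=lambda p: mccc_squared(signals[list(p)]))

# Synthetic example (an assumption): four sensor nodes observe one source.
# Nodes 0 and 2 are clean; nodes 1 and 3 are heavily corrupted.
rng = np.random.default_rng(0)
source = rng.normal(size=2000)
noise_levels = [0.1, 2.0, 0.1, 2.0]
signals = np.stack([source + lvl * rng.normal(size=source.size)
                    for lvl in noise_levels])

best_pair = select_channel_pair(signals)
print("selected channels:", best_pair)
```

Only the selected pair would then be passed on to feature extraction and the detector, which is where the computational saving mentioned in the abstract comes from.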
Funding: Partially supported by the National Natural Science Foundation of China under Grants No. 61571044, No. 61620106002, No. 61473041, No. 11590772, and No. 61640012, and by the Inner Mongolia Natural Science Foundation under Grant No. 2017MS(LH)0602.
Abstract: The quality of a multichannel audio signal may be reduced by missing data, which must be recovered before use. Multichannel audio data sets can be quite large and have more than two axes of variation, such as channel, frame, and feature. To recover missing audio data, we propose a low-rank tensor completion method that is a high-order generalization of matrix completion. First, a multichannel audio signal with missing data is modeled as a third-order tensor. Next, tensor completion is formulated as a convex optimization problem by defining the trace norm of the tensor, and an augmented Lagrange multiplier method is used to solve the constrained optimization problem. Finally, the missing data is filled in by alternating iterations of tensor computations. Experiments were conducted on 5.1-channel audio data to evaluate the effectiveness of the method. The results show that the proposed method outperforms state-of-the-art methods. Moreover, subjective listening tests with MUSHRA (Multiple Stimuli with Hidden Reference and Anchor) indicate that better audio effects were obtained by tensor completion.
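The steps in the abstract above (third-order tensor model, trace-norm objective, augmented Lagrangian solver, alternating updates) can be sketched as follows. This is a generic ADMM-style scheme for trace-norm tensor completion, not necessarily the authors' exact algorithm: the tensor trace norm is taken as the equally weighted sum of the nuclear norms of the three mode unfoldings, and the penalty `rho`, the iteration count, the tensor dimensions, and the synthetic rank-2 data are all assumptions chosen for illustration.

```python
import numpy as np

def unfold(T, mode):
    """Matricize a tensor along `mode` (rows index that mode)."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def fold(M, mode, shape):
    """Inverse of `unfold` for a tensor of the given shape."""
    rest = [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(M.reshape([shape[mode]] + rest), 0, mode)

def svt(M, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def complete_tensor(T, observed, rho=0.1, steps=200):
    """Fill the unobserved entries of T; only T[observed] is ever used."""
    N = T.ndim
    alpha = 1.0 / N                                # equal weight per mode
    X = np.where(observed, T, 0.0)                 # zero-fill missing entries
    Y = [np.zeros_like(X) for _ in range(N)]       # Lagrange multipliers
    for _ in range(steps):
        # Update one auxiliary low-rank tensor per mode.
        M = [fold(svt(unfold(X + Y[i] / rho, i), alpha / rho), i, T.shape)
             for i in range(N)]
        # Update the estimate, then re-impose the observed entries.
        X = sum(M[i] - Y[i] / rho for i in range(N)) / N
        X[observed] = T[observed]
        # Update the multipliers.
        Y = [Y[i] - rho * (M[i] - X) for i in range(N)]
    return X

# Synthetic rank-2 tensor standing in for (channel, frame, feature) data.
rng = np.random.default_rng(0)
A, B, C = rng.normal(size=(6, 2)), rng.normal(size=(20, 2)), rng.normal(size=(8, 2))
T_true = np.einsum("ir,jr,kr->ijk", A, B, C)
observed = rng.random(T_true.shape) > 0.3          # roughly 30% missing

X_hat = complete_tensor(T_true, observed)
missing = ~observed
rel_err = float(np.linalg.norm((X_hat - T_true)[missing])
                / np.linalg.norm(T_true[missing]))
print(f"relative error on missing entries: {rel_err:.3f}")
```

Zero-filling the missing entries would give a relative error of exactly 1 on this measure, so any value well below 1 indicates that the low-rank structure is being exploited.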