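Abstract: Three-dimensional (3D) multimedia technology has attracted wide attention, especially 3D audio, which still lags behind 3D video. Current 3D audio research falls into two categories: multichannel audio based on physical sound field reconstruction, and multichannel audio based on perceptual reconstruction of the sound scene. The main representatives of physical sound field reconstruction are sound reproduction based on spherical harmonic decomposition and wave field synthesis (WFS). Perceptual sound scene reconstruction mainly comprises amplitude panning (AP) and binaural reconstruction based on the head-related transfer function (HRTF). This paper introduces and compares these four classes of 3D audio techniques and their typical systems, and reviews the current state and frontier of three major research topics in 3D audio: spatial hearing mechanisms, 3D audio compression coding, and the simplification of 3D audio systems.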
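To make the amplitude panning idea named in this abstract concrete, the following is a minimal sketch for a single stereo loudspeaker pair. It assumes the classical tangent panning law with constant-power normalization; the abstract itself does not say which panning law the surveyed systems use, and the function name `tangent_law_gains`, the 30 degree half-span, and the sign convention (positive angles toward the left loudspeaker) are illustrative choices only.

```python
import math


def tangent_law_gains(source_deg, half_span_deg=30.0):
    """Return (g_left, g_right) for a phantom source between two loudspeakers.

    Assumption: loudspeakers sit at +half_span_deg (left) and
    -half_span_deg (right); positive source angles point toward the left.
    """
    if not -half_span_deg <= source_deg <= half_span_deg:
        raise ValueError("source angle must lie between the loudspeakers")
    # Tangent law: tan(source) / tan(half_span) = (gL - gR) / (gL + gR)
    ratio = math.tan(math.radians(source_deg)) / math.tan(math.radians(half_span_deg))
    g_left, g_right = 1.0 + ratio, 1.0 - ratio
    # Normalize so that gL^2 + gR^2 = 1 (constant total power)
    norm = math.hypot(g_left, g_right)
    return g_left / norm, g_right / norm


if __name__ == "__main__":
    for angle in (-30.0, -15.0, 0.0, 15.0, 30.0):
        gl, gr = tangent_law_gains(angle)
        print(f"{angle:6.1f} deg  gL={gl:.3f}  gR={gr:.3f}")
```

A centered source receives equal gains on both channels, and a source at either loudspeaker position is fed to that loudspeaker alone. Multichannel 3D layouts generalize the same idea to loudspeaker triplets, which this sketch does not cover.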
Funding: Supported by the National Natural Science Foundation of China (11174087, 10774049) and the State Key Laboratory of Subtropical Building Science, South China University of Technology.
Abstract: This paper reports recent work and progress on a PC-based virtual auditory environment (VAE) system platform written in C++. By tracking the instantaneous position and orientation of the listener's head and dynamically simulating the acoustic propagation from the sound source to the two ears, the system can recreate free-field virtual sources at various directions and distances, as well as auditory perception in a reflective environment, via headphone presentation. Schemes for improving VAE performance are proposed, including near-field virtual source synthesis based on principal components analysis (PCA) and simulation of six degrees of freedom of head movement. In particular, the PCA-based scheme greatly reduces the computational cost of synthesizing multiple virtual sources. Tests demonstrate that the system performs better than some existing systems: it can simultaneously render up to 280 virtual sources using the conventional scheme, and 4500 virtual sources using the PCA-based scheme. A set of psychoacoustic experiments also validates the performance of the system and provides some preliminary results for research on binaural hearing. The functions of the VAE system are being extended, and the system serves as a flexible and powerful platform for future binaural hearing research and virtual reality applications.
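The abstract does not describe how the PCA-based scheme works internally. One common reading, assumed here, is that each head-related impulse response (HRIR) is written as a mean filter plus a weighted sum of a few shared basis filters. Because convolution is linear, the source signals can then be mixed with their weights first and passed through the shared filters once, so the number of convolutions depends on the number of basis filters rather than the number of sources. The sketch below checks that equivalence for one ear; all names (`render_direct`, `render_pca`, `mean`, `basis`, `weights`) are hypothetical and the filters are toy values, not measured data.

```python
def convolve(x, h):
    """Plain full linear convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def render_direct(signals, weights, mean, basis):
    """Conventional scheme: rebuild each source's HRIR, one convolution per source."""
    out = [0.0] * (len(signals[0]) + len(mean) - 1)
    for x, w in zip(signals, weights):
        hrir = [m + sum(wq * b[k] for wq, b in zip(w, basis))
                for k, m in enumerate(mean)]
        for k, v in enumerate(convolve(x, hrir)):
            out[k] += v
    return out


def render_pca(signals, weights, mean, basis):
    """Assumed PCA-style scheme: mix weighted signals, then filter once per basis."""
    n = len(signals[0])  # all source signals are assumed to have equal length
    mixed_for_mean = [sum(x[t] for x in signals) for t in range(n)]
    out = convolve(mixed_for_mean, mean)
    for q, b in enumerate(basis):
        mixed = [sum(w[q] * x[t] for x, w in zip(signals, weights))
                 for t in range(n)]
        for k, v in enumerate(convolve(mixed, b)):
            out[k] += v
    return out


if __name__ == "__main__":
    signals = [[1.0, 0.0, 2.0], [0.5, -1.0, 0.0]]   # two toy source signals
    weights = [[0.3, -0.2], [1.0, 0.5]]             # per-source PCA weights
    mean, basis = [1.0, 0.5], [[0.2, -0.1], [0.0, 0.4]]
    print(render_direct(signals, weights, mean, basis))
    print(render_pca(signals, weights, mean, basis))
```

With N sources and Q basis filters, the direct path needs N convolutions per ear while the mixed path needs Q + 1, which is consistent with the large gap between the two source counts quoted in the abstract, though the actual system may differ in detail.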
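Abstract: This paper designs and implements a deep neural network model that reconstructs individualized head-related transfer functions (HRTFs) from anthropometric parameters and angle information; a single training run yields predicted HRTFs for all directions. The model consists of a deep neural network that takes the anthropometric parameters as input, an expansion layer that takes the angle information as input, and a deep neural network that takes the outputs of the first two as input. Finally, the overall performance of the proposed method is evaluated objectively.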
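The three-part structure described in this abstract can be sketched as a forward pass. This is only an illustration of the data flow: the weights are random and untrained, the layer sizes are arbitrary, and the behavior of the expansion layer is not given in the abstract, so it is assumed here to map azimuth and elevation to sine and cosine features. The names `HrtfNet`, `expand_angles`, and `dense` are hypothetical.

```python
import math
import random


def dense(x, weights, biases, relu=True):
    """Fully connected layer, optionally followed by ReLU."""
    out = [sum(w * v for w, v in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [max(0.0, o) for o in out] if relu else out


def init_layer(n_in, n_out, rng):
    """Random (untrained) weights and zero biases, for illustration only."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def expand_angles(azimuth_deg, elevation_deg):
    """Assumed expansion layer: encode the direction as sin/cos features."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    return [math.sin(az), math.cos(az), math.sin(el), math.cos(el)]


class HrtfNet:
    def __init__(self, n_anthro=8, n_hidden=16, n_bins=32, seed=0):
        rng = random.Random(seed)
        self.anthro_layer = init_layer(n_anthro, n_hidden, rng)      # sub-network 1
        self.hidden_layer = init_layer(n_hidden + 4, n_hidden, rng)  # sub-network 2
        self.output_layer = init_layer(n_hidden, n_bins, rng)

    def predict(self, anthro, azimuth_deg, elevation_deg):
        features = dense(anthro, *self.anthro_layer)
        merged = features + expand_angles(azimuth_deg, elevation_deg)
        hidden = dense(merged, *self.hidden_layer)
        # One value per frequency bin, e.g. an HRTF log-magnitude
        return dense(hidden, *self.output_layer, relu=False)


if __name__ == "__main__":
    net = HrtfNet()
    spectrum = net.predict([0.5] * 8, azimuth_deg=30.0, elevation_deg=0.0)
    print(len(spectrum))
```

Because the direction enters as an input rather than selecting a separate model, one trained network can be queried for any direction, which matches the abstract's claim that a single training run covers all directions.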
Funding: This work was supported by the National Natural Science Foundation of China (No. 10374031) and the Natural Science Foundation of the South China University of Technology (No. 123-E4050600).
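Abstract: To address the low accuracy and high computational complexity of existing sound source localization algorithms in digital hearing aids, a new binaural sound source localization algorithm is proposed. First, the captured binaural signals are decomposed into a number of subband signals by a Gammatone filterbank, and the data are compressed according to subband energy. Then, the binaural cues contained in the head-related transfer function (HRTF), namely the interaural time difference, the interaural level difference, and the interaural coherence, are used to extract features of the source position. Finally, the source position is identified by a Gaussian mixture model (GMM) classifier. Experimental results show that the proposed algorithm achieves high accuracy, low complexity, and strong robustness.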
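Two of the binaural cues named in this abstract, the interaural time difference (ITD) and the interaural level difference (ILD), can be estimated from a single pair of ear signals as sketched below. This is a simplified illustration, not the paper's algorithm: it works on one band rather than on Gammatone subbands, omits the interaural coherence and the GMM classifier, and assumes the ITD is taken as the lag of the cross-correlation peak. The function names are hypothetical.

```python
import math


def estimate_itd_samples(left, right, max_lag):
    """Lag (in samples) that maximizes the cross-correlation.

    A positive result means the right-ear signal lags the left-ear signal.
    """
    best_lag, best_val = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        val = sum(left[n] * right[n + lag]
                  for n in range(len(left))
                  if 0 <= n + lag < len(right))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


def estimate_ild_db(left, right):
    """Energy ratio in dB; positive means the left ear is louder."""
    e_left = sum(v * v for v in left)
    e_right = sum(v * v for v in right)
    if e_left == 0.0 or e_right == 0.0:
        raise ValueError("ILD is undefined for a silent channel")
    return 10.0 * math.log10(e_left / e_right)


if __name__ == "__main__":
    # Toy pair: the right ear receives a delayed, attenuated copy of a click
    left = [0.0] * 32
    right = [0.0] * 32
    left[10] = 1.0
    right[13] = 0.5
    print(estimate_itd_samples(left, right, max_lag=8))
    print(round(estimate_ild_db(left, right), 2))
```

In a full localizer of the kind the abstract describes, such cues would be computed per subband and passed as a feature vector to the classifier.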